pybedtools.bedtool.BedTool.parallel_apply

BedTool.parallel_apply(iterations, func, func_args, func_kwargs, processes=1, _orig_pool=None)

Generalized method for applying a function in parallel.

Typically used to run many random shufflings.

func_args and func_kwargs are passed to func on each of the iterations calls, and those calls are split across the number of worker processes given by processes.
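The calling convention described above can be sketched without pybedtools. This is a minimal serial stand-in, assuming each iteration amounts to func(*func_args, **func_kwargs); apply_serially and scaled_sum are hypothetical names used only for illustration and are not part of the library.

```python
def apply_serially(iterations, func, func_args, func_kwargs):
    """Hypothetical stand-in (not pybedtools code): call func once per iteration."""
    return [func(*func_args, **func_kwargs) for _ in range(iterations)]


def scaled_sum(a, b, scale=1):
    """Toy function standing in for real per-iteration work."""
    return (a + b) * scale


# func_args is a sequence of positional arguments, func_kwargs a dict of
# keyword arguments; both are reused unchanged on every iteration.
results = apply_serially(3, scaled_sum, (1, 2), {"scale": 10})
print(results)  # [30, 30, 30]
```

Because every call receives identical arguments, any variation between results has to come from inside func, for example from a random shuffle that is not given a fixed seed.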

Notes on the function, func:

  • The function should manually remove any tempfiles it creates. The BedTool.TEMPFILES list of auto-created tempfiles does not share state across processes, so tempfiles are not cleaned up automatically as they are in a single-process pybedtools session.

  • This includes deleting any “chromsizes” or genome files. It is generally best to require a genome filename in func_kwargs if func uses any BedTool methods that accept the g kwarg.

  • The function should be a module-level function (rather than a class method) so that it can be pickled across process boundaries.

  • The function can have any signature and any return value.
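The notes above can be sketched as a worker function. This is one possible shape, not the library's prescribed pattern: remove_files and count_shuffled_hits are hypothetical names, and the three filename parameters are assumptions about what a caller would supply. The worker is defined at module level, takes the genome file as an argument rather than creating one, and deletes the tempfiles behind the BedTool objects it creates.

```python
import os


def remove_files(*paths):
    """Delete each path, ignoring files that are already gone."""
    for path in paths:
        try:
            os.unlink(path)
        except FileNotFoundError:
            pass


def count_shuffled_hits(bed_fn, other_fn, genome_fn):
    """Hypothetical worker: shuffle bed_fn once, count features hitting other_fn.

    genome_fn is a caller-supplied chromsizes file, so this function never
    creates a genome tempfile of its own.
    """
    # Imported inside the function so this module loads without pybedtools.
    import pybedtools

    created = []
    try:
        shuffled = pybedtools.BedTool(bed_fn).shuffle(g=genome_fn)
        created.append(shuffled.fn)
        hits = shuffled.intersect(other_fn, u=True)
        created.append(hits.fn)
        return hits.count()
    finally:
        # The parent's BedTool.TEMPFILES list does not track these files,
        # so the worker removes them itself.
        remove_files(*created)
```

A call might then look like a.parallel_apply(iterations=100, func=count_shuffled_hits, func_args=("a.bed", "b.bed"), func_kwargs={"genome_fn": "hg19.genome"}, processes=4), where the filenames are placeholders.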

_orig_pool can be a previously created multiprocessing.Pool instance; otherwise, a new Pool is created with the number of worker processes given by processes.