Each¶

Similar to BedTool.filter(), which applies a function to return True or False given an Interval, the BedTool.each() method applies a function to return a new, possibly modified Interval.

The BedTool.each() method applies a function to every feature. Like BedTool.filter(), you can use your own function or some pre-defined ones in the featurefuncs module. Also like filter(), *args and **kwargs are sent to the function.

>>> a = pybedtools.example_bedtool('a.bed')
>>> b = pybedtools.example_bedtool('b.bed')

>>> # The results of an "intersect" with c=True will return features
>>> # with an additional field representing the counts.
>>> with_counts = a.intersect(b, c=True)

Let’s define a function that will take the number of counts in each feature as calculated above and divide by the number of bases in that feature. We can also supply an optional scalar, like 0.001, to get the results in “number of intersections per kb”. We then insert that value into the score field of the feature. Here’s the function:

>>> def normalize_count(feature, scalar=0.001):
...     """
...     assume feature's last field is the count
...     """
...     counts = float(feature[-1])
...     normalized = round(counts / (len(feature) * scalar), 2)
...
...     # need to convert back to string to insert into feature
...     feature.score = str(normalized)
...     return feature

And we apply it like this:

>>> normalized = with_counts.each(normalize_count)
>>> print(normalized)
chr1        1       100     feature1        0.0     +       0
chr1        100     200     feature2        10.0    +       1
chr1        150     500     feature3        2.86    -       1
chr1        900     950     feature4        20.0    +       1

Similar to BedTool.filter(), we could have used the Python built-in function map to map a function to each Interval. In fact, this can still be useful if you don’t want a BedTool object as a result. For example:

>>> feature_lengths = map(len, a)

However, the BedTool.each() method returns a BedTool object, which can be used in a chain of commands, e.g.,

>>> a.intersect(b).each(normalize_count).filter(lamda x: float(x[4]) < 1e-5)

Each¶

pybedtools

Navigation

Related Topics