Intersections¶
One common use of BEDTools and pybedtools
is to perform
intersections.
First, let’s create some example BedTool
instances:
>>> a = pybedtools.example_bedtool('a.bed')
>>> b = pybedtools.example_bedtool('b.bed')
Then do the intersection with the BedTool.intersect()
method:
>>> a_and_b = a.intersect(b)
a_and_b
is a new BedTool
instance. It now points to a temp file
on disk, which is stored in the attribute a_and_b.fn
; this temp file contains
the intersection of a
and b
.
We can either print the new BedTool
(which will show ALL features
– use with caution if you have huge files!) or use the
BedTool.head()
method to show up to the first N lines (10 by
default). Here’s what a
, b
, and a_and_b
look like:
>>> a.head()
chr1 1 100 feature1 0 +
chr1 100 200 feature2 0 +
chr1 150 500 feature3 0 -
chr1 900 950 feature4 0 +
>>> b.head()
chr1 155 200 feature5 0 -
chr1 800 901 feature6 0 +
>>> a_and_b.head()
chr1 155 200 feature2 0 +
chr1 155 200 feature3 0 -
chr1 900 901 feature4 0 +
The BedTool.intersect()
method simply wraps the BEDTools program
intersectBed
. This means that we can pass BedTool.intersect()
any
arguments that intersectBed
accepts. For example, if we want to use the
intersectBed
switch -u
(which, according to the BEDTools documentation,
acts as a True/False switch to indicate that we want to see the features in a
that overlapped something in b
), then we can use the keyword argument
u=True
, like this:
>>> # Intersection using the -u switch
>>> a_with_b = a.intersect(b, u=True)
>>> a_with_b.head()
chr1 100 200 feature2 0 +
chr1 150 500 feature3 0 -
chr1 900 950 feature4 0 +
This time, a_with_b
is another BedTool
object that points to a
different temp file whose name is stored in a_with_b.fn
. You can read
more about the use of temp files in Principle 1: Temporary files are created (and deleted) automatically. More on
arguments that you can pass to BedTool
objects in a moment, but
first, some info about saving files.