Creating a BedTool
To create a BedTool, first you need to import the pybedtools module. For these examples, I’m assuming you have already done the following:
>>> import pybedtools
>>> from pybedtools import BedTool
Next, you need a BED file to work with. If you already have one, then great: move on to the next section. If not, pybedtools comes with some example BED files used for testing. You can list the example files that ship with pybedtools using the list_example_files() function:
>>> # list the example bed files
>>> files = pybedtools.list_example_files()
Once you decide on a file to use, pass your choice to the example_filename() function to get the full path:
>>> # get the full path to an example bed file
>>> bedfn = pybedtools.example_filename('a.bed')
The full path stored in bedfn will depend on your installation (this is similar to the data() function in R, if you’re familiar with that).
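As a rough sketch of how these two functions fit together, the snippet below narrows a list of names to those ending in .bed and, if pybedtools can be imported, resolves the first one to a full path. The helper only_bed and the variable names are hypothetical, introduced here only for illustration. The exact set of example files may differ between pybedtools versions.

```python
def only_bed(names):
    """Keep just the names that end in .bed (hypothetical helper)."""
    return [name for name in names if name.endswith(".bed")]


try:
    import pybedtools
except ImportError:
    # pybedtools is not installed; the helper above still works on any list.
    pybedtools = None

if pybedtools is not None:
    bed_names = only_bed(pybedtools.list_example_files())
    if bed_names:
        # Resolve the short name to a full path inside the installation.
        print(pybedtools.example_filename(bed_names[0]))
```

The printed path is deliberately not shown here, since it depends on where pybedtools is installed.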
Now that you have a filename, either one of the example files or your own, you create a new BedTool simply by pointing it to that filename:
>>> # create a new BedTool from the example bed file
>>> myBedTool = BedTool(bedfn)
Alternatively, you can construct a BedTool from scratch by passing the BED contents as a string and setting the from_string keyword argument to True. This can be useful if you want to create de novo BED files on the fly. However, all runs of spaces will be converted to tabs using this method, so you’ll have to be careful if you add “name” columns: a name that contains spaces will be split across several columns.
>>> # an "inline" example:
>>> fromscratch1 = pybedtools.BedTool('chrX 1 100', from_string=True)
>>> print(fromscratch1)
chrX 1 100
>>> # using a longer string to make a bed file. Note that
>>> # newlines don't matter, and one or more consecutive
>>> # spaces will be converted to a tab character.
>>> larger_string = """
... chrX 1 100 feature1 0 +
... chrX 50 350 feature2 0 -
... chr2 5000 10000 another_feature 0 +
... """
>>> fromscratch2 = BedTool(larger_string, from_string=True)
>>> print(fromscratch2)
chrX 1 100 feature1 0 +
chrX 50 350 feature2 0 -
chr2 5000 10000 another_feature 0 +
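To see why “name” columns need care, here is a pure-Python approximation of the whitespace handling described above. It is only a sketch of the documented behavior (blank lines dropped, runs of whitespace turned into single tabs), not the code pybedtools itself runs. spaces_to_tabs is a hypothetical helper.

```python
def spaces_to_tabs(text):
    """Approximate the from_string whitespace handling (illustrative only)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines, such as those around a triple-quoted string
        # str.split() with no argument splits on any run of whitespace.
        rows.append("\t".join(line.split()))
    return "\n".join(rows)


# A name with a space in it ends up as two separate columns.
risky = spaces_to_tabs("chrX 1 100 my feature 0 +")
print(len(risky.split("\t")))  # 7 columns instead of the intended 6

# Using an underscore keeps the name in a single column.
safe = spaces_to_tabs("chrX 1 100 my_feature 0 +")
print(len(safe.split("\t")))  # 6 columns
```

Under this assumption, the simplest fix is to avoid spaces inside any single field when building a BedTool from a string.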
Of course, you’ll usually be using your own BED files, ones that have some biological importance for your work and are saved wherever is convenient for you. For example:
>>> a = BedTool('/data/sample1/peaks.bed')
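If your features start out in Python rather than in a file, one option is to write the tab-delimited file yourself and then point a BedTool at it, which keeps the tabs under your control. The following is a minimal sketch under that assumption. write_bed, the rows, and the temporary path are all hypothetical, and the BedTool step is skipped if pybedtools is not installed.

```python
import os
import tempfile


def write_bed(path, rows):
    """Write each row as one tab-separated line (hypothetical helper)."""
    with open(path, "w") as handle:
        for row in rows:
            handle.write("\t".join(str(field) for field in row) + "\n")


rows = [
    ("chrX", 1, 100, "feature1", 0, "+"),
    ("chrX", 50, 350, "feature2", 0, "-"),
]

# A throwaway location for the sketch; use any path that suits your project.
bed_path = os.path.join(tempfile.mkdtemp(), "my_features.bed")
write_bed(bed_path, rows)

try:
    from pybedtools import BedTool
    my_features = BedTool(bed_path)  # same call as with any other filename
except ImportError:
    my_features = None  # pybedtools not installed; the file is still written
```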