pybedtools.bedtool.BedTool

class pybedtools.bedtool.BedTool(fn=None, from_string=False, remote=False)
__init__(fn=None, from_string=False, remote=False)

Wrapper around Aaron Quinlan’s BEDtools suite of programs (https://github.com/arq5x/bedtools); also contains many useful methods for more detailed work with BED files.

fn is typically the name of a BED-like file, but can also be one of the following:

  • a string filename

  • another BedTool object

  • an iterable of Interval objects

  • an open file object

  • a “file contents” string (see below)

If from_string is True, whatever you pass as fn is treated as the contents of the BED file rather than as a filename. All spaces are treated as TABs, empty lines are stripped, and the result is written to a temporary file.

Typical usage is to point to an existing file:

a = BedTool('a.bed')

But you can also create one from scratch from a string:

>>> s = '''
... chrX  1  100
... chrX 25  800
... '''
>>> a = BedTool(s, from_string=True)

Or use examples that come with pybedtools:

>>> example_files = pybedtools.list_example_files()
>>> assert 'a.bed' in example_files
>>> a = pybedtools.example_bedtool('a.bed')
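
The clean-up that from_string=True applies to a "file contents" string can be approximated in plain Python. This is only a sketch of the behavior described above (spaces become TABs, empty lines are dropped). The helper name normalize_bed_string is hypothetical and is not part of pybedtools, and the real constructor additionally writes the result to a temporary file.

```python
def normalize_bed_string(contents):
    """Approximate the from_string clean-up (hypothetical helper, not pybedtools API).

    Runs of whitespace within a line become a single TAB; blank lines are dropped.
    """
    cleaned = []
    for line in contents.splitlines():
        fields = line.split()
        if fields:  # skip empty or whitespace-only lines
            cleaned.append("\t".join(fields) + "\n")
    return "".join(cleaned)


s = '''
chrX  1  100
chrX 25  800
'''
print(normalize_bed_string(s), end="")
```

The printed text has two lines, each with three TAB-separated fields, which is the form a BED parser expects.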

Methods

__init__([fn, from_string, remote])

Wrapper around Aaron Quinlan's BEDtools suite of programs (https://github.com/arq5x/bedtools); also contains many useful methods for more detailed work with BED files.

absolute_distance(other[, closest_kwargs, ...])

Returns an iterator of the absolute distances between features in self and other.

all_hits(interval[, same_strand, overlap])

Return all intervals that overlap interval.

annotate(*args, **kwargs)

Wraps bedtools annotate.

any_hits(interval[, same_strand, overlap])

Return whether or not any intervals overlap interval.

as_intervalfile()

Returns an IntervalFile of this BedTool for low-level interface.

at(inds)

Returns a new BedTool with only intervals at lines inds

bam_to_bed(*args, **kwargs)

Wraps bedtools bamtobed.

bam_to_fastq(*args, **kwargs)

Wraps bedtools bamtofastq.

bamtobed(*args, **kwargs)

Wraps bedtools bamtobed.

bamtofastq(*args, **kwargs)

Wraps bedtools bamtofastq.

bed12tobed6(*args, **kwargs)

Wraps bedtools bed12tobed6.

bed6(*args, **kwargs)

Wraps bedtools bed12tobed6.

bedpe_to_bam(*args, **kwargs)

Wraps bedtools bedpetobam.

bedpetobam(*args, **kwargs)

Wraps bedtools bedpetobam.

bedtobam(*args, **kwargs)

Wraps bedtools bedtobam

bgzip([in_place, force, is_sorted])

Helper function for more control over "tabixed" BedTools.

cat(*args, **kwargs)

Concatenate interval files together.

check_genome(**kwargs)

Handles the different ways of specifying a genome in kwargs:

closest(*args, **kwargs)

Wraps bedtools closest.

cluster(*args, **kwargs)

Wraps bedtools cluster.

colormap_normalize([vmin, vmax, percentile, log])

Returns a normalization instance for use by featurefuncs.add_color().

complement(*args, **kwargs)

Wraps bedtools complement.

count()

Count the number of features in this BedTool.

count_hits(interval[, same_strand, overlap])

Return the number of intervals that overlap interval.

coverage(*args, **kwargs)

Wraps bedtools coverage.

cut(indexes[, stream])

Analogous to Unix cut.

delete_temporary_history([ask, raw_input_func])

Use at your own risk! This method will delete temp files.

each(func, *args, **kwargs)

Modify each feature with a user-defined function.

expand(*args, **kwargs)

Wraps bedtools expand

features()

Returns an iterable of features

field_count([n])

Number of fields in each line of this BedTool (checks n lines)

filter(func, *args, **kwargs)

Filter features by user-defined function.

fisher(*args, **kwargs)

Wraps 'fisher'. Returns an object representing the output.

flank(*args, **kwargs)

Wraps bedtools flank.

from_dataframe(df[, outfile, sep, header, ...])

Creates a BedTool from a pandas.DataFrame.

genome_coverage(*args, **kwargs)

Wraps bedtools genomecov.

genomecov(*args, **kwargs)

Wraps bedtools genomecov.

getfasta(*args, **kwargs)

Wraps bedtools getfasta.

groupby(*args, **kwargs)

Wraps bedtools groupby.

handle_kwargs(prog, arg_order, **kwargs)

Handle most cases of BEDTool program calls, but leave the specifics up to individual methods.

head([n, as_string])

Prints the first n lines or returns them if as_string is True

igv(*args, **kwargs)

Wraps bedtools igv.

intersect(*args, **kwargs)

Wraps bedtools intersect.

introns([gene, exon])

Create intron features (requires specific input format).

jaccard(*args, **kwargs)

Returns a dictionary with keys (intersection, union, jaccard).

liftover(chainfile[, unmapped, liftover_args])

Returns a new BedTool of the liftedOver features, saving the unmapped ones as unmapped.

links(*args, **kwargs)

Wraps linksBed.

makewindows(*args, **kwargs)

Wraps bedtools makewindows.

map(*args, **kwargs)

Wraps bedtools map; See also BedTool.each().

mask_fasta(*args, **kwargs)

Wraps bedtools maskfasta.

maskfasta(*args, **kwargs)

Wraps bedtools maskfasta.

merge(*args, **kwargs)

Wraps bedtools merge.

moveto(*args, **kwargs)

Move to a new filename (can be much quicker than BedTool.saveas())

multi_bam_coverage(*args, **kwargs)

Wraps bedtools multicov.

multi_intersect(*args, **kwargs)

Wraps bedtools multiintersect.

multicov(*args, **kwargs)

Wraps bedtools multicov.

multiinter(*args, **kwargs)

Wraps bedtools multiintersect.

nuc(*args, **kwargs)

Wraps bedtools nuc.

nucleotide_content(*args, **kwargs)

Wraps bedtools nuc.

overlap(*args, **kwargs)

Wraps bedtools overlap.

pair_to_bed(*args, **kwargs)

Wraps bedtools pairtobed.

pair_to_pair(*args, **kwargs)

Wraps bedtools pairtopair.

pairtobed(*args, **kwargs)

Wraps bedtools pairtobed.

pairtopair(*args, **kwargs)

Wraps bedtools pairtopair.

parallel_apply(iterations, func, func_args, ...)

Generalized method for applying a function in parallel.

print_sequence()

Print the sequence that was retrieved by BedTool.sequence.

random(*args, **kwargs)

Wraps bedtools random.

random_jaccard(other[, genome_fn, ...])

Computes the naive Jaccard statistic (intersection divided by union).

random_op(*args, **kwargs)

For backwards compatibility; see BedTool.parallel_apply instead.

random_subset(*args, **kwargs)

Return a BedTool containing a random subset.

randomintersection(other, iterations[, ...])

Perform iterations shufflings, each time intersecting with other.

randomintersection_bp(other, iterations, ...)

Like randomintersection, but return the bp overlap instead of the number of intersecting intervals.

randomstats(other, iterations[, new, ...])

Dictionary of results from many randomly shuffled intersections.

relative_distance(other[, genome, g])

Returns an iterator of relative distances between features in self and other.

reldist(*args, **kwargs)

If detail=False, then return a dictionary with keys (reldist, count,

remove_invalid(*args, **kwargs)

Remove invalid features that may break BEDTools programs.

sample(*args, **kwargs)

Wraps 'sample'.

save_seqs(fn)

Save sequences, after calling BedTool.sequence.

saveas(*args, **kwargs)

Make a copy of the BedTool.

seq(loc, fasta)

Return just the sequence from a region string or a single location.

>>> fn = pybedtools.example_filename('test.fa')
>>> BedTool.seq('chr1:2-10', fn)
'GATGAGTCT'
>>> BedTool.seq(('chr1', 1, 10), fn)
'GATGAGTCT'

sequence(*args, **kwargs)

Wraps bedtools getfasta.

set_chromsizes(chromsizes)

Prepare BedTool for operations that require chromosome coords.

shift(*args, **kwargs)

Wraps bedtools shift.

shuffle(*args, **kwargs)

Wraps bedtools shuffle.

slop(*args, **kwargs)

Wraps bedtools slop.

sort(*args, **kwargs)

Wraps bedtools sort.

spacing(*args, **kwargs)

Wraps bedtools spacing

split(func, *args, **kwargs)

Split each feature using a user-defined function.

splitbed(*args, **kwargs)

Wraps 'bedtools split'.

subtract(*args, **kwargs)

Wraps bedtools subtract.

tabix([in_place, force, is_sorted])

Prepare a BedTool for use with Tabix.

tabix_contigs()

Returns a list of contigs from the tabix index.

tabix_intervals(interval_or_string[, ...])

Retrieve all intervals within coordinates from a "tabixed" BedTool.

tag(*args, **kwargs)

Wraps bedtools tag.

tag_bam(*args, **kwargs)

Wraps bedtools tag.

tail([lines, as_string])

Like head, but prints last 10 lines of the file by default.

to_bam(*args, **kwargs)

Wraps bedtools bedtobam

to_dataframe([disable_auto_names])

Create a pandas.DataFrame, passing args and kwargs to pandas.read_csv. The separator kwarg sep defaults to a tab (\t).

total_coverage()

Return the total number of bases covered by this interval file.

truncate_to_chrom(genome)

Ensure all features fall within chromosome limits.

union_bedgraphs(*args, **kwargs)

Wraps bedtools unionbedg.

unionbedg(*args, **kwargs)

Wraps bedtools unionbedg.

window(*args, **kwargs)

Wraps bedtools window.

window_maker(*args, **kwargs)

Wraps bedtools makewindows.

with_attrs(*args, **kwargs)

Helper method for adding attributes in the middle of a pipeline.
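
To make the total_coverage() entry above concrete, the following plain-Python sketch shows one way to count "bases covered" for a set of intervals: overlapping intervals on the same chromosome are merged first so that shared bases are counted once. It assumes BED-style coordinates (0-based start, exclusive end) and works on plain (chrom, start, end) tuples. It illustrates the idea only and is not the pybedtools implementation.

```python
def total_coverage(intervals):
    """Count bases covered by at least one (chrom, start, end) interval.

    Assumes BED-style half-open coordinates, so an interval spans end - start bases.
    """
    covered = 0
    current = None  # (chrom, start, end) of the merged run being built
    for chrom, start, end in sorted(intervals):
        if current and chrom == current[0] and start <= current[2]:
            # Overlaps or touches the current run: extend it.
            current = (chrom, current[1], max(current[2], end))
        else:
            if current:
                covered += current[2] - current[1]
            current = (chrom, start, end)
    if current:
        covered += current[2] - current[1]
    return covered


features = [("chrX", 1, 100), ("chrX", 25, 800), ("chrY", 10, 20)]
print(total_coverage(features))  # 809: chrX 1-800 (799 bases) plus chrY 10-20 (10 bases)
```

Summing the raw interval lengths instead would give 884, because the bases in chrX 25-100 would be counted twice.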

Attributes

TEMPFILES

file_type

Return the type of the current file.

intervals