pybedtools.bedtool.BedTool
- class pybedtools.bedtool.BedTool(fn=None, from_string=False, remote=False)
- __init__(fn=None, from_string=False, remote=False)
Wrapper around Aaron Quinlan’s BEDtools suite of programs (https://github.com/arq5x/bedtools); also contains many useful methods for more detailed work with BED files.
fn is typically the name of a BED-like file, but can also be one of the following:
a string filename
another BedTool object
an iterable of Interval objects
an open file object
a “file contents” string (see below)
If from_string is True, whatever you pass as fn is treated as the contents of the BED file rather than as a filename. All spaces are treated as TABs, empty lines are stripped, and the result is written to a temporary file.
Typical usage is to point to an existing file:
a = BedTool('a.bed')
But you can also create one from scratch from a string:
>>> s = '''
... chrX 1 100
... chrX 25 800
... '''
>>> a = BedTool(s, from_string=True)
Or use examples that come with pybedtools:
>>> example_files = pybedtools.list_example_files()
>>> assert 'a.bed' in example_files
>>> a = pybedtools.example_bedtool('a.bed')
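The from_string behaviour described above (spaces treated as TABs, empty lines stripped, result written to a temporary file) can be sketched in plain Python. This is only an illustration under stated assumptions: bed_string_to_tempfile is a hypothetical helper, not part of pybedtools, and the library's real implementation may differ in details such as how runs of spaces are handled.

```python
import os
import tempfile


def bed_string_to_tempfile(contents):
    """Hypothetical helper (not pybedtools code): normalize a
    "file contents" string and save it to a temporary file."""
    lines = []
    for raw in contents.splitlines():
        fields = raw.split()  # split on any run of whitespace
        if fields:  # drop empty or whitespace-only lines
            lines.append("\t".join(fields))
    # delete=False keeps the file on disk after the handle is closed
    with tempfile.NamedTemporaryFile("w", suffix=".bed", delete=False) as tmp:
        tmp.write("".join(line + "\n" for line in lines))
    return tmp.name


s = """
chrX 1 100
chrX 25 800
"""
path = bed_string_to_tempfile(s)
with open(path) as handle:
    normalized = handle.read()
os.remove(path)  # clean up the temporary file
```

The string s is the same one used in the from_string example; after normalization each line holds three TAB-separated fields and the leading blank line is gone.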
Methods
__init__([fn, from_string, remote]): Wrapper around Aaron Quinlan's BEDtools suite of programs (https://github.com/arq5x/bedtools); also contains many useful methods for more detailed work with BED files.
absolute_distance(other[, closest_kwargs, ...]): Returns an iterator of the absolute distances between features in self and other.
all_hits(interval[, same_strand, overlap]): Return all intervals that overlap interval.
annotate(*args, **kwargs): Wraps bedtools annotate.
any_hits(interval[, same_strand, overlap]): Return whether or not any intervals overlap interval.
Returns an IntervalFile of this BedTool for low-level interface.
at(inds): Returns a new BedTool with only intervals at lines inds.
bam_to_bed(*args, **kwargs): Wraps bedtools bamtobed.
bam_to_fastq(*args, **kwargs): Wraps bedtools bamtofastq.
bamtobed(*args, **kwargs): Wraps bedtools bamtobed.
bamtofastq(*args, **kwargs): Wraps bedtools bamtofastq.
bed12tobed6(*args, **kwargs): Wraps bedtools bed12tobed6.
bed6(*args, **kwargs): Wraps bedtools bed12tobed6.
bedpe_to_bam(*args, **kwargs): Wraps bedtools bedpetobam.
bedpetobam(*args, **kwargs): Wraps bedtools bedpetobam.
bedtobam(*args, **kwargs): Wraps bedtools bedtobam.
bgzip([in_place, force, is_sorted]): Helper function for more control over "tabixed" BedTools.
cat(*args, **kwargs): Concatenate interval files together.
check_genome(**kwargs): Handles the different ways of specifying a genome in kwargs:
closest(*args, **kwargs): Wraps bedtools closest.
cluster(*args, **kwargs): Wraps bedtools cluster.
colormap_normalize([vmin, vmax, percentile, log]): Returns a normalization instance for use by featurefuncs.add_color().
complement(*args, **kwargs): Wraps bedtools complement.
count(): Count the number of features in this BedTool.
count_hits(interval[, same_strand, overlap]): Return the number of intervals that overlap interval.
coverage(*args, **kwargs): Wraps bedtools coverage.
cut(indexes[, stream]): Analogous to unix cut.
delete_temporary_history([ask, raw_input_func]): Use at your own risk! This method will delete temp files.
each(func, *args, **kwargs): Modify each feature with a user-defined function.
expand(*args, **kwargs): Wraps bedtools expand.
features(): Returns an iterable of features.
field_count([n]): Number of fields in each line of this BedTool (checks n lines).
filter(func, *args, **kwargs): Filter features by user-defined function.
fisher(*args, **kwargs): Wraps 'fisher'. Returns an object representing the output.
flank(*args, **kwargs): Wraps bedtools flank.
from_dataframe(df[, outfile, sep, header, ...]): Creates a BedTool from a pandas.DataFrame.
genome_coverage(*args, **kwargs): Wraps bedtools genomecov.
genomecov(*args, **kwargs): Wraps bedtools genomecov.
getfasta(*args, **kwargs): Wraps bedtools getfasta.
groupby(*args, **kwargs): Wraps bedtools groupby.
handle_kwargs(prog, arg_order, **kwargs): Handle most cases of BEDTool program calls, but leave the specifics up to individual methods.
head([n, as_string]): Prints the first n lines or returns them if as_string is True.
igv(*args, **kwargs): Wraps bedtools igv.
intersect(*args, **kwargs): Wraps bedtools intersect.
introns([gene, exon]): Create intron features (requires specific input format).
jaccard(*args, **kwargs): Returns a dictionary with keys (intersection, union, jaccard).
liftover(chainfile[, unmapped, liftover_args]): Returns a new BedTool of the liftedOver features, saving the unmapped ones as unmapped.
links(*args, **kwargs): Wraps linksBed.
makewindows(*args, **kwargs): Wraps bedtools makewindows.
map(*args, **kwargs): Wraps bedtools map; see also BedTool.each().
mask_fasta(*args, **kwargs): Wraps bedtools maskfasta.
maskfasta(*args, **kwargs): Wraps bedtools maskfasta.
merge(*args, **kwargs): Wraps bedtools merge.
moveto(*args, **kwargs): Move to a new filename (can be much quicker than BedTool.saveas()).
multi_bam_coverage(*args, **kwargs): Wraps bedtools multicov.
multi_intersect(*args, **kwargs): Wraps bedtools multiintersect.
multicov(*args, **kwargs): Wraps bedtools multicov.
multiinter(*args, **kwargs): Wraps bedtools multiintersect.
nuc(*args, **kwargs): Wraps bedtools nuc.
nucleotide_content(*args, **kwargs): Wraps bedtools nuc.
overlap(*args, **kwargs): Wraps bedtools overlap.
pair_to_bed(*args, **kwargs): Wraps bedtools pairtobed.
pair_to_pair(*args, **kwargs): Wraps bedtools pairtopair.
pairtobed(*args, **kwargs): Wraps bedtools pairtobed.
pairtopair(*args, **kwargs): Wraps bedtools pairtopair.
parallel_apply(iterations, func, func_args, ...): Generalized method for applying a function in parallel.
Print the sequence that was retrieved by BedTool.sequence.
random(*args, **kwargs): Wraps bedtools random.
random_jaccard(other[, genome_fn, ...]): Computes the naive Jaccard statistic (intersection divided by union).
random_op(*args, **kwargs): For backwards compatibility; see BedTool.parallel_apply instead.
random_subset(*args, **kwargs): Return a BedTool containing a random subset.
randomintersection(other, iterations[, ...]): Perform iterations shufflings, each time intersecting with other.
randomintersection_bp(other, iterations, ...): Like randomintersection, but return the bp overlap instead of the number of intersecting intervals.
randomstats(other, iterations[, new, ...]): Dictionary of results from many randomly shuffled intersections.
relative_distance(other[, genome, g]): Returns an iterator of relative distances between features in self and other.
reldist(*args, **kwargs): If detail=False, then return a dictionary with keys (reldist, count,
remove_invalid(*args, **kwargs): Remove invalid features that may break BEDTools programs.
sample(*args, **kwargs): Wraps 'sample'.
save_seqs(fn): Save sequences, after calling BedTool.sequence.
saveas(*args, **kwargs): Make a copy of the BedTool.
seq(loc, fasta): Return just the sequence from a region string or a single location.
>>> fn = pybedtools.example_filename('test.fa')
>>> BedTool.seq('chr1:2-10', fn)
'GATGAGTCT'
>>> BedTool.seq(('chr1', 1, 10), fn)
'GATGAGTCT'
sequence(*args, **kwargs): Wraps bedtools getfasta.
set_chromsizes(chromsizes): Prepare BedTool for operations that require chromosome coords.
shift(*args, **kwargs): Wraps bedtools shift.
shuffle(*args, **kwargs): Wraps bedtools shuffle.
slop(*args, **kwargs): Wraps bedtools slop.
sort(*args, **kwargs): Wraps bedtools sort.
spacing(*args, **kwargs): Wraps bedtools spacing.
split(func, *args, **kwargs): Split each feature using a user-defined function.
splitbed(*args, **kwargs): Wraps 'bedtools split'.
subtract(*args, **kwargs): Wraps bedtools subtract.
tabix([in_place, force, is_sorted]): Prepare a BedTool for use with Tabix.
tabix_contigs(): Returns a list of contigs from the tabix index.
tabix_intervals(interval_or_string[, ...]): Retrieve all intervals within coordinates from a "tabixed" BedTool.
tag(*args, **kwargs): Wraps bedtools tag.
tag_bam(*args, **kwargs): Wraps bedtools tag.
tail([lines, as_string]): Like head, but prints last 10 lines of the file by default.
to_bam(*args, **kwargs): Wraps bedtools bedtobam.
to_dataframe([disable_auto_names]): Create a pandas.DataFrame, passing args and kwargs to pandas.read_csv. The separator kwarg sep is given a tab (\t) as value by default.
Return the total number of bases covered by this interval file.
truncate_to_chrom(genome): Ensure all features fall within chromosome limits.
union_bedgraphs(*args, **kwargs): Wraps bedtools unionbedg.
unionbedg(*args, **kwargs): Wraps bedtools unionbedg.
window(*args, **kwargs): Wraps bedtools window.
window_maker(*args, **kwargs): Wraps bedtools makewindows.
with_attrs(*args, **kwargs): Helper method for adding attributes in the middle of a pipeline.
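In the seq entry above, the region string 'chr1:2-10' and the tuple ('chr1', 1, 10) return the same nine bases. That suggests the region string is read as 1-based and inclusive, while the tuple uses BED-style 0-based, half-open coordinates. The sketch below shows that conversion in plain Python; to_zero_based is a hypothetical helper written for illustration, not a pybedtools function, and the coordinate interpretation is inferred from the example rather than stated in the source.

```python
import re

# "chrom:start-end", e.g. "chr1:2-10"
_REGION = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def to_zero_based(loc):
    """Hypothetical helper: return (chrom, start, end) as 0-based, half-open."""
    if isinstance(loc, str):
        match = _REGION.match(loc)
        if match is None:
            raise ValueError(f"not a region string: {loc!r}")
        # Assumed convention: region strings are 1-based and inclusive,
        # so only the start coordinate shifts down by one.
        return match["chrom"], int(match["start"]) - 1, int(match["end"])
    # Tuples are assumed to already be 0-based, half-open.
    chrom, start, end = loc
    return chrom, int(start), int(end)


region = to_zero_based("chr1:2-10")
location = to_zero_based(("chr1", 1, 10))
length = region[2] - region[1]  # number of bases spanned
```

Under these assumptions both inputs resolve to the same interval, and its length matches the nine characters of 'GATGAGTCT' in the seq example.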
Attributes
TEMPFILES
Return the type of the current file.
intervals