pybedtools.bedtool.BedTool.intersect

BedTool.intersect(*args, **kwargs)

Wraps bedtools intersect.

For convenience, the file or stream this BedTool points to is implicitly passed as the -a argument to intersectBed.
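
As a rough illustration of how keyword arguments correspond to the command-line options listed in the help text, the sketch below builds an argument list for bedtools intersect. The helper name build_intersect_cmd is hypothetical and is not part of pybedtools; the real wrapper also handles streams, BedTool objects, and temporary files, none of which are modeled here. It assumes that a keyword set to True becomes a bare flag and that any other value is passed after the flag.

```python
def build_intersect_cmd(a, b, **kwargs):
    """Build an argv list for `bedtools intersect` (illustrative only).

    `a` plays the role of the file the BedTool points to and is always
    passed as -a; `b` is passed as -b. This is a hypothetical helper,
    not the pybedtools implementation.
    """
    cmd = ["bedtools", "intersect", "-a", a, "-b", b]
    for flag, value in kwargs.items():
        # Assumption of this sketch: False/None means "option not set".
        if value is False or value is None:
            continue
        cmd.append(f"-{flag}")
        # True means a bare switch such as -u or -wa; anything else
        # (for example f=0.5) is appended as the option's argument.
        if value is not True:
            cmd.append(str(value))
    return cmd


if __name__ == "__main__":
    print(" ".join(build_intersect_cmd("a.bed", "b.bed", wa=True, f=0.5)))
```

With pybedtools itself, the corresponding call is written as pybedtools.BedTool("a.bed").intersect("b.bed", u=True), where "a.bed" and "b.bed" are placeholder file names.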

Original BEDTools help::

Tool:    bedtools intersect (aka intersectBed)
Version: v2.31.1
Summary: Report overlaps between two feature files.

Usage:   bedtools intersect [OPTIONS] -a <bed/gff/vcf/bam> -b <bed/gff/vcf/bam>

        Note: -b may be followed with multiple databases and/or 
        wildcard (*) character(s). 
Options: 
        -wa     Write the original entry in A for each overlap.

        -wb     Write the original entry in B for each overlap.
                - Useful for knowing **what** A overlaps. Restricted by -f and -r.

        -loj    Perform a "left outer join". That is, for each feature in A
                report each overlap with B.  If no overlaps are found, 
                report a NULL feature for B.

        -wo     Write the original A and B entries plus the number of base
                pairs of overlap between the two features.
                - Overlaps restricted by -f and -r.
                  Only A features with overlap are reported.

        -wao    Write the original A and B entries plus the number of base
                pairs of overlap between the two features.
                - Overlapping features restricted by -f and -r.
                  However, A features w/o overlap are also reported
                  with a NULL B feature and overlap = 0.

        -u      Write the original A entry **once** if **any** overlaps found in B.
                - In other words, just report the fact >=1 hit was found.
                - Overlaps restricted by -f and -r.

        -c      For each entry in A, report the number of overlaps with B.
                - Reports 0 for A entries that have no overlap with B.
                - Overlaps restricted by -f, -F, -r, and -s.

        -C      For each entry in A, separately report the number of
                overlaps with each B file on a distinct line.
                - Reports 0 for A entries that have no overlap with B.
                - Overlaps restricted by -f, -F, -r, and -s.

        -v      Only report those entries in A that have **no overlaps** with B.
                - Similar to "grep -v" (an homage).

        -ubam   Write uncompressed BAM output. Default writes compressed BAM.

        -s      Require same strandedness.  That is, only report hits in B
                that overlap A on the **same** strand.
                - By default, overlaps are reported without respect to strand.

        -S      Require different strandedness.  That is, only report hits in B
                that overlap A on the **opposite** strand.
                - By default, overlaps are reported without respect to strand.

        -f      Minimum overlap required as a fraction of A.
                - Default is 1E-9 (i.e., 1bp).
                - FLOAT (e.g. 0.50)

        -F      Minimum overlap required as a fraction of B.
                - Default is 1E-9 (i.e., 1bp).
                - FLOAT (e.g. 0.50)

        -r      Require that the fraction overlap be reciprocal for A AND B.
                - In other words, if -f is 0.90 and -r is used, this requires
                  that B overlap 90% of A and A **also** overlaps 90% of B.

        -e      Require that the minimum fraction be satisfied for A OR B.
                - In other words, if -e is used with -f 0.90 and -F 0.10 this requires
                  that either 90% of A is covered OR 10% of  B is covered.
                  Without -e, both fractions would have to be satisfied.

        -split  Treat "split" BAM or BED12 entries as distinct BED intervals.

        -g      Provide a genome file to enforce consistent chromosome sort order
                across input files. Only applies when used with -sorted option.

        -nonamecheck    For sorted data, don't throw an error if the file has different naming conventions
                        for the same chromosome. ex. "chr1" vs "chr01".

        -sorted Use the "chromsweep" algorithm for sorted (-k1,1 -k2,2n) input.

        -names  When using multiple databases, provide an alias for each that
                will appear instead of a fileId when also printing the DB record.

        -filenames      When using multiple databases, show each complete filename
                        instead of a fileId when also printing the DB record.

        -sortout        When using multiple databases, sort the output DB hits
                        for each record.

        -bed    If using BAM input, write output as BED.

        -header Print the header from the A file prior to results.

        -nobuf  Disable buffered output. Using this option will cause each line
                of output to be printed as it is generated, rather than saved
                in a buffer. This will make printing large output files 
                noticeably slower, but can be useful in conjunction with
                other software tools and scripts that need to process one
                line of bedtools output at a time.

        -iobuf  Specify amount of memory to use for input buffer.
                Takes an integer argument. Optional suffixes K/M/G supported.
                Note: currently has no effect with compressed files.

Notes: 
        (1) When a BAM file is used for the A file, the alignment is retained if overlaps exist,
        and excluded if an overlap cannot be found.  If multiple overlaps exist, they are not
        reported, as we are only testing for one or more overlaps.
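
The interaction of -f, -F, -r, and -e described in the help text can be easier to follow as code. The following is a minimal pure-Python sketch of that logic, assuming 0-based half-open BED coordinates; the names Interval, overlap_bp, passes, and count_hits are hypothetical and belong to neither bedtools nor pybedtools. Strand options (-s, -S) and split features are not modeled.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlap_bp(a, b):
    """Number of base pairs shared by a and b (0 if none)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def passes(a, b, f=1e-9, F=1e-9, r=False, e=False):
    """Return True if the a/b overlap satisfies the fraction options.

    f: minimum overlap as a fraction of A (-f)
    F: minimum overlap as a fraction of B (-F)
    r: reciprocal, so the -f threshold also applies to B (-r)
    e: either threshold is sufficient instead of both (-e)
    """
    bp = overlap_bp(a, b)
    if bp == 0:
        return False
    if r:
        F = f
    ok_a = bp / (a.end - a.start) >= f
    ok_b = bp / (b.end - b.start) >= F
    return (ok_a or ok_b) if e else (ok_a and ok_b)


def count_hits(a, bs, **opts):
    """Like -c: number of B features that overlap a under the options."""
    return sum(passes(a, b, **opts) for b in bs)


if __name__ == "__main__":
    a = Interval("chr1", 100, 200)   # 100 bp
    b = Interval("chr1", 150, 400)   # 250 bp, shares 50 bp with a
    print(passes(a, b, f=0.5))            # 50% of A is covered
    print(passes(a, b, f=0.5, r=True))    # only 20% of B is covered
    print(passes(a, b, f=0.9, F=0.1, e=True))
```

In this sketch, -u corresponds to count_hits(...) > 0 and -v to count_hits(...) == 0, which matches the help text's description of reporting an A entry once if any overlap exists, or only if none does.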