pybedtools.bedtool.BedTool.window_maker

BedTool.window_maker(*args, **kwargs)

Wraps bedtools makewindows.

There are two alternatives for supplying a genome. Use g="genome.filename" if you have a genome’s chrom sizes saved as a file. This is what BEDTools expects when it is used from the command line. Alternatively, use genome="assembly.name" (for example, genome="hg19") to use chrom sizes for that assembly without having to manage a separate file. The genome argument triggers a call to pybedtools.chromsizes, so see that function for more details.
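As a rough illustration of the two options, the sketch below writes a small chrom sizes file and passes it with g=. The chromosome names and sizes are invented for the example, and the call is skipped when pybedtools or the bedtools binary cannot be found. The genome= form is shown only as a comment, since it relies on the assembly lookup described above.

```python
import shutil
import tempfile

# Hypothetical chrom sizes file: <chromName><TAB><chromSize>, one per line.
with tempfile.NamedTemporaryFile("w", suffix=".genome", delete=False) as fh:
    fh.write("chr1\t5000\nchr2\t3000\n")
    genome_fn = fh.name

try:
    import pybedtools
except ImportError:
    pybedtools = None

windows = None
if pybedtools is not None and shutil.which("bedtools"):
    # g= takes the path to a chrom sizes file, as on the command line.
    windows = pybedtools.BedTool().window_maker(g=genome_fn, w=1000)
    print(windows)

    # genome= takes an assembly name instead of a file (not run here):
    # windows = pybedtools.BedTool().window_maker(genome="hg19", w=1000000)
```

Both forms end up supplying the same kind of chrom sizes information to bedtools makewindows; the choice is mostly about whether you want to manage the file yourself.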

Original BEDTools help::

Tool: bedtools makewindows
Version: v2.31.1
Summary: Makes adjacent or sliding windows across a genome or BED file.

Usage: bedtools makewindows [OPTIONS] [-g <genome> OR -b <bed>]
 [ -w <window_size> OR -n <number of windows> ]

Input Options: 
        -g <genome>
                Genome file size (see notes below).
                Windows will be created for each chromosome in the file.

        -b <bed>
                BED file (with chrom,start,end fields).
                Windows will be created for each interval in the file.

Windows Output Options: 
        -w <window_size>
                Divide each input interval (either a chromosome or a BED interval)
                to fixed-sized windows (i.e. same number of nucleotide in each window).
                Can be combined with -s <step_size>

        -s <step_size>
                Step size: i.e., how many base pairs to step before
                creating a new window. Used to create "sliding" windows.
                - Defaults to window size (non-sliding windows).

        -n <number_of_windows>
                Divide each input interval (either a chromosome or a BED interval)
                to fixed number of windows (i.e. same number of windows, with
                varying window sizes).

        -reverse
                 Reverse numbering of windows in the output, i.e. report 
                 windows in decreasing order

ID Naming Options: 
        -i src|winnum|srcwinnum
                The default output is 3 columns: chrom, start, end .
                With this option, a name column will be added.
                 "-i src" - use the source interval's name.
                 "-i winnum" - use the window number as the ID (e.g. 1,2,3,4...).
                 "-i srcwinnum" - use the source interval's name with the window number.
                See below for usage examples.

Notes: 
        (1) The genome file should be tab delimited and structured as follows:
         <chromName><TAB><chromSize>

        For example, Human (hg19):
        chr1    249250621
        chr2    243199373
        ...
        chr18_gl000207_random 4262

Tip 1. Use samtools faidx to create a genome file from a FASTA: 
        One can use the samtools faidx command to index a FASTA file.
        The resulting .fai index is suitable as a genome file, 
        as bedtools will only look at the first two, relevant columns
        of the .fai file.

        For example:
        samtools faidx GRCh38.fa
        bedtools makewindows -w 100 -g GRCh38.fa.fai

Tip 2. Use UCSC Table Browser to create a genome file: 
        One can use the UCSC Genome Browser's MySQL database to extract
        chromosome sizes. For example, H. sapiens:

        mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e \
        "select chrom, size from hg19.chromInfo"  > hg19.genome

Examples: 
 # Divide the human genome into windows of 1MB:
 $ bedtools makewindows -g hg19.txt -w 1000000
 chr1 0 1000000
 chr1 1000000 2000000
 chr1 2000000 3000000
 chr1 3000000 4000000
 chr1 4000000 5000000
 ...

 # Divide the human genome into sliding (=overlapping) windows of 1MB, with 500KB overlap:
 $ bedtools makewindows -g hg19.txt -w 1000000 -s 500000
 chr1 0 1000000
 chr1 500000 1500000
 chr1 1000000 2000000
 chr1 1500000 2500000
 chr1 2000000 3000000
 ...

 # Divide each chromosome in human genome to 1000 windows of equal size:
 $ bedtools makewindows -g hg19.txt -n 1000
 chr1 0 249251
 chr1 249251 498502
 chr1 498502 747753
 chr1 747753 997004
 chr1 997004 1246255
 ...

 # Divide each interval in the given BED file into 10 equal-sized windows:
 $ cat input.bed
 chr5 60000 70000
 chr5 73000 90000
 chr5 100000 101000
 $ bedtools makewindows -b input.bed -n 10
 chr5 60000 61000
 chr5 61000 62000
 chr5 62000 63000
 chr5 63000 64000
 chr5 64000 65000
 ...

 # Add a name column, based on the window number: 
 $ cat input.bed
 chr5  60000  70000 AAA
 chr5  73000  90000 BBB
 chr5 100000 101000 CCC
 $ bedtools makewindows -b input.bed -n 3 -i winnum
 chr5        60000   63334   1
 chr5        63334   66668   2
 chr5        66668   70000   3
 chr5        73000   78667   1
 chr5        78667   84334   2
 chr5        84334   90000   3
 chr5        100000  100334  1
 chr5        100334  100668  2
 chr5        100668  101000  3
 ...

 # Reverse window numbers: 
 $ cat input.bed
 chr5  60000  70000 AAA
 chr5  73000  90000 BBB
 chr5 100000 101000 CCC
 $ bedtools makewindows -b input.bed -n 3 -i winnum -reverse
 chr5        60000   63334   3
 chr5        63334   66668   2
 chr5        66668   70000   1
 chr5        73000   78667   3
 chr5        78667   84334   2
 chr5        84334   90000   1
 chr5        100000  100334  3
 chr5        100334  100668  2
 chr5        100668  101000  1
 ...

 # Add a name column, based on the source ID + window number: 
 $ cat input.bed
 chr5  60000  70000 AAA
 chr5  73000  90000 BBB
 chr5 100000 101000 CCC
 $ bedtools makewindows -b input.bed -n 3 -i srcwinnum
 chr5        60000   63334   AAA_1
 chr5        63334   66668   AAA_2
 chr5        66668   70000   AAA_3
 chr5        73000   78667   BBB_1
 chr5        78667   84334   BBB_2
 chr5        84334   90000   BBB_3
 chr5        100000  100334  CCC_1
 chr5        100334  100668  CCC_2
 chr5        100668  101000  CCC_3
 ...
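
The window arithmetic behind the examples above can be sketched in plain Python. This is not the bedtools implementation. The function names are hypothetical, and the window size for -n is assumed to be the length divided by n, rounded up, because that reproduces the coordinates shown in the help text. How bedtools treats windows at the very end of an interval is not covered here.

```python
from itertools import islice


def fixed_size_windows(chrom, start, end, w, s=None):
    """Yield (chrom, win_start, win_end) for -w style windows.

    s defaults to w, which gives adjacent (non-sliding) windows.
    """
    step = w if s is None else s
    for win_start in range(start, end, step):
        # Clip each window at the end of the source interval.
        yield chrom, win_start, min(win_start + w, end)


def fixed_number_windows(chrom, start, end, n, name=None, reverse=False):
    """Yield (chrom, win_start, win_end, label) for -n style windows.

    Assumption: window size is ceil(length / n); the last window is clipped.
    """
    size = (end - start + n - 1) // n  # integer ceiling division
    spans = [(ws, min(ws + size, end)) for ws in range(start, end, size)]
    numbers = range(1, len(spans) + 1)
    if reverse:
        numbers = reversed(numbers)
    for (win_start, win_end), num in zip(spans, numbers):
        # With a source name, mimic "srcwinnum"; otherwise mimic "winnum".
        label = f"{name}_{num}" if name else str(num)
        yield chrom, win_start, win_end, label


# 1 Mb windows sliding by 500 kb across hg19 chr1 (first three only).
for row in islice(fixed_size_windows("chr1", 0, 249250621, 1000000, 500000), 3):
    print(*row, sep="\t")

# Three windows over chr5:60000-70000, labelled from the source name.
for row in fixed_number_windows("chr5", 60000, 70000, 3, name="AAA"):
    print(*row, sep="\t")
```

Passing reverse=True to the second function keeps the coordinates in the same order and only reverses the numbering, which mirrors the -reverse example.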