pybedtools.bedtool.BedTool.window_maker
- BedTool.window_maker(*args, **kwargs)
Wraps bedtools makewindows.

There are two alternatives for supplying a genome. Use g="genome.filename" if you have a genome’s chrom sizes saved as a file. This is what BEDTools expects when using it from the command line. Alternatively, use genome="assembly.name" (for example, genome="hg19") to use chrom sizes for that assembly without having to manage a separate file. The genome argument triggers a call to pybedtools.chromsizes, so see that method for more details.

Original BEDTools help::

    Tool:    bedtools makewindows
    Version: v2.31.1
    Summary: Makes adjacent or sliding windows across a genome or BED file.

    Usage: bedtools makewindows [OPTIONS] [-g <genome> OR -b <bed>]
     [ -w <window_size> OR -n <number of windows> ]

    Input Options:
        -g <genome>
            Genome file size (see notes below).
            Windows will be created for each chromosome in the file.

        -b <bed>
            BED file (with chrom,start,end fields).
            Windows will be created for each interval in the file.

    Windows Output Options:
        -w <window_size>
            Divide each input interval (either a chromosome or a BED interval)
            to fixed-sized windows (i.e. same number of nucleotide in each window).
            Can be combined with -s <step_size>

        -s <step_size>
            Step size: i.e., how many base pairs to step before
            creating a new window. Used to create "sliding" windows.
            - Defaults to window size (non-sliding windows).

        -n <number_of_windows>
            Divide each input interval (either a chromosome or a BED interval)
            to fixed number of windows (i.e. same number of windows, with
            varying window sizes).

        -reverse
            Reverse numbering of windows in the output, i.e. report
            windows in decreasing order

    ID Naming Options:
        -i src|winnum|srcwinnum
            The default output is 3 columns: chrom, start, end .
            With this option, a name column will be added.
             "-i src" - use the source interval's name.
             "-i winnum" - use the window number as the ID (e.g. 1,2,3,4...).
             "-i srcwinnum" - use the source interval's name with the window number.
            See below for usage examples.

    Notes:
        (1) The genome file should tab delimited and structured as follows:
         <chromName><TAB><chromSize>

        For example, Human (hg19):
        chr1    249250621
        chr2    243199373
        ...
        chr18_gl000207_random   4262

    Tip 1. Use samtools faidx to create a genome file from a FASTA:
        One can the samtools faidx command to index a FASTA file.
        The resulting .fai index is suitable as a genome file,
        as bedtools will only look at the first two, relevant columns
        of the .fai file.

        For example:
        samtools faidx GRCh38.fa
        bedtools makewindows -w 100 -g GRCh38.fa.fai

    Tip 2. Use UCSC Table Browser to create a genome file:
        One can use the UCSC Genome Browser's MySQL database to extract
        chromosome sizes. For example, H. sapiens:

        mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e \
        "select chrom, size from hg19.chromInfo" > hg19.genome

    Examples:
     # Divide the human genome into windows of 1MB:
     $ bedtools makewindows -g hg19.txt -w 1000000
     chr1 0 1000000
     chr1 1000000 2000000
     chr1 2000000 3000000
     chr1 3000000 4000000
     chr1 4000000 5000000
     ...

     # Divide the human genome into sliding (=overlapping) windows of 1MB, with 500KB overlap:
     $ bedtools makewindows -g hg19.txt -w 1000000 -s 500000
     chr1 0 1000000
     chr1 500000 1500000
     chr1 1000000 2000000
     chr1 1500000 2500000
     chr1 2000000 3000000
     ...

     # Divide each chromosome in human genome to 1000 windows of equal size:
     $ bedtools makewindows -g hg19.txt -n 1000
     chr1 0 249251
     chr1 249251 498502
     chr1 498502 747753
     chr1 747753 997004
     chr1 997004 1246255
     ...

     # Divide each interval in the given BED file into 10 equal-sized windows:
     $ cat input.bed
     chr5 60000 70000
     chr5 73000 90000
     chr5 100000 101000
     $ bedtools makewindows -b input.bed -n 10
     chr5 60000 61000
     chr5 61000 62000
     chr5 62000 63000
     chr5 63000 64000
     chr5 64000 65000
     ...

     # Add a name column, based on the window number:
     $ cat input.bed
     chr5 60000 70000 AAA
     chr5 73000 90000 BBB
     chr5 100000 101000 CCC
     $ bedtools makewindows -b input.bed -n 3 -i winnum
     chr5 60000 63334 1
     chr5 63334 66668 2
     chr5 66668 70000 3
     chr5 73000 78667 1
     chr5 78667 84334 2
     chr5 84334 90000 3
     chr5 100000 100334 1
     chr5 100334 100668 2
     chr5 100668 101000 3
     ...

     # Reverse window numbers:
     $ cat input.bed
     chr5 60000 70000 AAA
     chr5 73000 90000 BBB
     chr5 100000 101000 CCC
     $ bedtools makewindows -b input.bed -n 3 -i winnum -reverse
     chr5 60000 63334 3
     chr5 63334 66668 2
     chr5 66668 70000 1
     chr5 73000 78667 3
     chr5 78667 84334 2
     chr5 84334 90000 1
     chr5 100000 100334 3
     chr5 100334 100668 2
     chr5 100668 101000 1
     ...

     # Add a name column, based on the source ID + window number:
     $ cat input.bed
     chr5 60000 70000 AAA
     chr5 73000 90000 BBB
     chr5 100000 101000 CCC
     $ bedtools makewindows -b input.bed -n 3 -i srcwinnum
     chr5 60000 63334 AAA_1
     chr5 63334 66668 AAA_2
     chr5 66668 70000 AAA_3
     chr5 73000 78667 BBB_1
     chr5 78667 84334 BBB_2
     chr5 84334 90000 BBB_3
     chr5 100000 100334 CCC_1
     chr5 100334 100668 CCC_2
     chr5 100668 101000 CCC_3
     ...
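The windowing arithmetic shown in the help examples can be sketched in plain Python. This is a minimal illustration, not the BEDTools implementation: the function names sliding_windows and fixed_count_windows are hypothetical, and two behaviours are assumptions inferred from the example output rather than from the BEDTools source, namely that a window running past the end of its interval is clipped, and that -n uses a window size of ceil(length / n).

```python
from itertools import islice
from typing import Iterator, List, Optional, Tuple

# (chrom, start, end), 0-based and half-open like BED coordinates.
Window = Tuple[str, int, int]


def sliding_windows(chrom: str, start: int, end: int,
                    size: int, step: Optional[int] = None) -> Iterator[Window]:
    """Yield windows as for -w <size> [-s <step>] over a single interval."""
    if size <= 0:
        raise ValueError("size must be positive")
    step = size if step is None else step  # -s defaults to the window size
    if step <= 0:
        raise ValueError("step must be positive")
    pos = start
    while pos < end:
        # Assumption: a window that would overrun the interval is clipped.
        yield (chrom, pos, min(pos + size, end))
        pos += step


def fixed_count_windows(chrom: str, start: int, end: int, n: int) -> List[Window]:
    """Return windows as for -n <n> over a single interval.

    Assumption: the window size is ceil(length / n), so very short
    intervals may produce fewer than n windows in this sketch.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    length = end - start
    if length <= 0:
        return []
    size = (length + n - 1) // n  # integer ceiling of length / n
    return list(sliding_windows(chrom, start, end, size))


# First three 1 Mb windows with a 500 kb step on hg19 chr1 (249250621 bp).
for window in islice(sliding_windows("chr1", 0, 249250621, 1_000_000, 500_000), 3):
    print(*window, sep="\t")

# The chr5 60000-70000 interval from the help text, split three ways.
for window in fixed_count_windows("chr5", 60000, 70000, 3):
    print(*window, sep="\t")
```

With pybedtools and BEDTools installed, the first case presumably corresponds to a call such as pybedtools.BedTool().window_maker(genome="hg19", w=1000000, s=500000), assuming the keyword names mirror the command-line flags; the sketch above only models the coordinates, not the -i naming or -reverse options.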