metaseq.integration.chipseq.Chipseq
class metaseq.integration.chipseq.Chipseq(ip_bam, control_bam, dbfn=None)
Bases: object
Class for visualizing and interactively exploring ChIP-seq data.
Requires two BAM files (IP and control). A gffutils database filename (dbfn) is also needed if gene models are to be displayed.
Typical usage is to create a normalized array of signal over each feature with the diff_array method, and then plot with the plot method.
The resulting figure has the matrix as a heatmap, the average signal over features, and a panel with points that can be zoomed and clicked, spawning a mini-browser window for the corresponding feature.
Configuration can be done by adjusting the following attributes after creating a Chipseq instance:
- _strip_kwargs (the style for the dots in the left panel)
- browser_plotting_kwargs (style of signal lines in the mini-browser)
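The typical usage described above can be sketched as a small helper. This is only an illustration built from the signatures documented on this page: the helper name explore_chipseq, its bins and halfwidth arguments, and the assumption that the features are windows of plus or minus halfwidth around each TSS are hypothetical. The metaseq import is done inside the function so the sketch can be loaded without metaseq installed.

```python
import numpy as np


def explore_chipseq(ip_bam, control_bam, dbfn, features, bins=100, halfwidth=1000):
    """Build a Chipseq object, compute the normalized array, and plot it.

    `features` is assumed to be a list of pybedtools.Interval objects, each
    spanning +/- `halfwidth` bp around a TSS (an assumption for this sketch).
    """
    # Imported lazily so this sketch can be loaded without metaseq installed.
    from metaseq.integration.chipseq import Chipseq

    chip = Chipseq(ip_bam, control_bam, dbfn=dbfn)

    # `bins` is one of the array_kwargs the documentation says is typical.
    chip.diff_array(features, array_kwargs=dict(bins=bins))

    # One x value per bin, matching the example given for the `x` parameter.
    x = np.linspace(-halfwidth, halfwidth, bins)
    chip.plot(x)
    return chip
```

Returning the instance leaves room to adjust the configuration attributes listed above and plot again.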
Methods
- __init__(ip_bam, control_bam[, dbfn]) – Set up a Chipseq object.
- callback(event) – Callback function to spawn a mini-browser when a feature is clicked.
- diff_array(features[, force, func, ...]) – Scales the control and IP data to million mapped reads, then subtracts scaled control from scaled IP, applies func(diffed) to the diffed array, and finally sets self.diffed_array to be the result.
- plot(x[, row_order, imshow_kwargs, strip]) – Plot the scaled ChIP-seq data.
__init__(ip_bam, control_bam, dbfn=None)
Set up a Chipseq object.
Parameters:
- ip_bam – filename of BAM file for ChIP data
- control_bam – filename of BAM file for control data
- dbfn – filename of gffutils database
diff_array(features, force=True, func=None, array_kwargs={}, cache=None)
Scales the control and IP data to million mapped reads, then subtracts scaled control from scaled IP, applies func(diffed) to the diffed array, and finally sets self.diffed_array to be the result.
Arrays self.ip and self.control are set as well. If force=False, previously created arrays are reused instead of being recalculated, which is useful for trying several func functions without recomputing the data.
Another side effect is that self.features is set so that it can be accessed by other methods.
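The reuse of cached arrays with force=False can be sketched as a loop over candidate transforms. The helper name compare_transforms is hypothetical. It relies only on the diff_array signature and the diffed_array attribute documented here, and it assumes each func returns a new array rather than modifying its input in place.

```python
def compare_transforms(chip, features, funcs, array_kwargs=None):
    """Apply each function in `funcs` to the same underlying arrays.

    `chip` is assumed to be a Chipseq instance (or any object with a
    compatible diff_array method and diffed_array attribute).
    """
    array_kwargs = array_kwargs or {}
    results = []
    for i, func in enumerate(funcs):
        # Only the first call forces a calculation; later calls pass
        # force=False so the previously created arrays are reused.
        chip.diff_array(features, force=(i == 0), func=func,
                        array_kwargs=array_kwargs)
        results.append(chip.diffed_array)
    return results
```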
Parameters:
- features – a list of pybedtools.Interval objects
- array_kwargs – extra keyword args passed to genomic_signal.array; typically this will include bins, processes, and chunksize arguments.
- func – a function to apply to the diffed arrays. By default this is metaseq.plotutils.nice_log(); another option might be lambda x: x, or lambda x: 1e6*x
- force – Force a re-calculation of the arrays; otherwise uses cached values
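The arithmetic that diff_array describes (scale each array to million mapped reads, subtract control from IP, then apply func) can be illustrated with plain NumPy. This is a sketch under assumptions, not metaseq's implementation: scaled_difference and its mapped-read arguments are hypothetical, and signed_log2 is a stand-in transform that is not claimed to match metaseq.plotutils.nice_log.

```python
import numpy as np


def signed_log2(x):
    """Hypothetical stand-in transform: log2(|x| + 1) with the sign of x restored."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.log2(np.abs(x) + 1)


def scaled_difference(ip_counts, control_counts, ip_mapped, control_mapped, func=None):
    """Scale both arrays to reads per million mapped reads, subtract, apply func."""
    # Divide by the library size expressed in millions of mapped reads.
    ip = np.asarray(ip_counts, dtype=float) / (ip_mapped / 1e6)
    control = np.asarray(control_counts, dtype=float) / (control_mapped / 1e6)

    diffed = ip - control
    if func is None:
        func = signed_log2  # assumption: any elementwise transform works here
    return func(diffed)
```

Passing func=lambda x: x returns the untransformed difference, which is a convenient way to inspect the scaled values.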
plot(x, row_order=None, imshow_kwargs=None, strip=True)
Plot the scaled ChIP-seq data.
Parameters:
- x – X-axis to use (e.g., for TSS +/- 1 kb with 100 bins, this would be np.linspace(-1000, 1000, 100))
- row_order – Array-like object containing row order, typically the result of an np.argsort call.
- strip – Include axes along the left side with points that can be clicked to spawn a mini-browser for that feature.
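The x and row_order arguments can be prepared with NumPy alone. The sketch below uses hypothetical helper names (tss_axis, order_rows_by_mean). Ordering rows by their mean signal is just one possible choice for the array passed to np.argsort.

```python
import numpy as np


def tss_axis(upstream=1000, downstream=1000, bins=100):
    """X-axis with one value per bin, e.g. TSS +/- 1 kb with 100 bins."""
    return np.linspace(-upstream, downstream, bins)


def order_rows_by_mean(arr):
    """Row indices sorted by ascending mean signal across each row."""
    return np.argsort(np.asarray(arr, dtype=float).mean(axis=1))
```

With a Chipseq instance named chip (hypothetical), these could be combined as chip.plot(tss_axis(), row_order=order_rows_by_mean(chip.diffed_array)).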