API docs

genomic_signal

The classes in this module enable random access to a variety of file formats (BAM, bigWig, bigBed, BED) using a uniform syntax, and allow you to compute coverage across many features in parallel or just a single feature.

Using classes in the metaseq.integration and metaseq.minibrowser modules, you can connect these objects to matplotlib figures that show a window into the data, making exploration easy and interactive.

Generally, the genomic_signal() function is all you need – just provide a filename and the format and it will take care of the rest, returning a genomic signal of the proper type.

Adding support for a new format is straightforward:

  • Write a new adapter for the format in metaseq.filetype_adapters
  • Subclass one of the existing classes below, setting the adapter attribute to be an instance of this new adapter
  • Add the new class to the _registry dictionary to enable support for the file format.
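
The three steps above can be sketched without metaseq installed. The following is a minimal, self-contained illustration of the adapter, subclass, and registry pattern. The "wig" format, `WigAdapter`, `WigSignal`, and the simplified `BaseSignal`, `_registry`, `supported_formats`, and `genomic_signal` defined here are hypothetical stand-ins that only mimic the shape described above; they are not metaseq's actual implementation.

```python
# Stand-alone sketch of the adapter + subclass + registry pattern.
# Every class and function here is a simplified stand-in, not metaseq code.


class WigAdapter(object):
    """Step 1 (hypothetical): adapter for an imaginary 'wig' format."""

    def __init__(self, fn):
        self.fn = fn


class BaseSignal(object):
    """Stand-in base class that only remembers the filename."""

    def __init__(self, fn):
        self.fn = fn


class WigSignal(BaseSignal):
    """Step 2: subclass an existing class and attach the new adapter."""

    def __init__(self, fn):
        super(WigSignal, self).__init__(fn)
        self.adapter = WigAdapter(fn)


# Step 3: register the class under its format name.
_registry = {"wig": WigSignal}


def supported_formats():
    """Return the registered format names in sorted order."""
    return sorted(_registry)


def genomic_signal(fn, kind):
    """Factory: look up the class registered for `kind` and instantiate it."""
    try:
        klass = _registry[kind.lower()]
    except KeyError:
        raise ValueError(
            "No support for %r; choose from %s" % (kind, supported_formats())
        )
    return klass(fn)


signal = genomic_signal("example.wig", "wig")
```

With a dictionary lookup at the center, the factory itself never needs to change when a format is added; only the registry grows.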

Note that to support parallel processing and to avoid repeating code, these classes delegate their local_coverage methods to the metaseq.array_helpers._local_coverage() function.
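
A likely reason this kind of delegation helps is that a single module-level function can be called directly for one feature and also handed to any map-like callable for many features. The sketch below illustrates the idea using only the standard library. `_coverage_sketch`, `ToySignal`, `coverage_array`, and the tuple-based features are hypothetical; a thread pool is used here only to keep the example portable, and none of this is metaseq's actual code.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def _coverage_sketch(positions, feature, bins):
    """Module-level worker: count positions per bin within one feature.

    `positions` is a list of integer coordinates and `feature` is a
    (start, stop) tuple; both are simplified stand-ins for real file
    access and real interval objects.
    """
    start, stop = feature
    width = (stop - start) / float(bins)
    counts = [0] * bins
    for pos in positions:
        if start <= pos < stop:
            counts[min(int((pos - start) / width), bins - 1)] += 1
    return counts


class ToySignal(object):
    """Hypothetical signal class that delegates instead of repeating code."""

    def __init__(self, positions):
        self.positions = positions

    def local_coverage(self, feature, bins=4):
        # Single feature: call the shared function directly.
        return _coverage_sketch(self.positions, feature, bins)

    def coverage_array(self, features, bins=4, mapper=map):
        # Many features: pass the same function to any map-like callable.
        func = partial(_coverage_sketch, self.positions, bins=bins)
        return list(mapper(func, features))


signal = ToySignal(positions=[1, 2, 2, 5, 9, 12, 15])
features = [(0, 8), (8, 16)]

serial = signal.coverage_array(features)
with ThreadPoolExecutor(max_workers=2) as pool:
    threaded = signal.coverage_array(features, mapper=pool.map)
```

Because the worker lives at module level rather than inside a class, it can also be pickled by reference, which is what process-based pools require of the callables they receive.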


Functions

metaseq._genomic_signal.genomic_signal Factory function that makes the right class for the file format.
metaseq._genomic_signal.supported_formats Returns list of formats supported by metaseq’s genomic signal objects.

Classes

metaseq._genomic_signal.BaseSignal Base class to represent objects from which genomic signal can be calculated/extracted.
metaseq._genomic_signal.IntervalSignal Abstract class for BED, BAM and bigBed files.
metaseq._genomic_signal.BigWigSignal Class for operating on bigWig files.
metaseq._genomic_signal.BamSignal Class for operating on BAM files.
metaseq._genomic_signal.BigBedSignal Class for operating on bigBed files.
metaseq._genomic_signal.BedSignal Class for operating on BED files.

metaseq.results_table

Classes

metaseq.results_table.ResultsTable Wrapper around a pandas.DataFrame that adds additional functionality.
metaseq.results_table.DESeqResults Class for working with results from DESeq.
metaseq.results_table.DESeq2Results Class for working with results from DESeq2.
metaseq.results_table.EdgeRResults Class for working with results from edgeR.
metaseq.results_table.LazyDict Dictionary-like object that lazily-loads ResultsTable objects.

metaseq.integration

Module that ties together various parts of metaseq


Classes

metaseq.integration.chipseq.Chipseq Class for visualizing and interactively exploring ChIP-seq data.

Functions

metaseq.integration.signal_comparison.compare Compares two genomic signal objects and outputs results as a bedGraph file.

metaseq.plotutils

Module with handy utilities for plotting genomic signal

Functions

metaseq.plotutils.imshow Do-it-all function to help with plotting heatmaps
metaseq.plotutils.add_labels_to_subsets Helper function for adding labels to subsets within a heatmap.
metaseq.plotutils.clustered_sortind Uses MiniBatch k-means clustering to cluster matrix into groups.
metaseq.plotutils.calculate_limits Calculate limits for a group of arrays in a flexible manner.
metaseq.plotutils.ci_plot Plots the mean and 95% CI for the given array on the given axes.
metaseq.plotutils.ci Column-wise confidence interval.
metaseq.plotutils.tip_zscores Calculates the “target identification from profiles” (TIP) zscores from Cheng et al.
metaseq.plotutils.tip_fdr Returns adjusted TIP p-values for a particular alpha.
metaseq.plotutils.nice_log Uses a log scale but with negative numbers.
metaseq.plotutils.prepare_logged Transform x and y to a log scale while dealing with zeros.
metaseq.plotutils.matrix_and_line_shell Helper function to construct an empty figure that has space for a matrix, a summary line plot directly below it, a colorbar axis, and an optional “strip” axis that parallels the matrix (and shares its y-axis) where data can be added to create callbacks.
metaseq.plotutils.input_ip_plots All-in-one plotting function to make a 5-panel figure.

Classes

metaseq.plotutils.MarginalHistScatter Class to enable incremental appending of scatterplots, each of which generates additional marginal histograms.

metaseq.colormap_adjust

Module to handle custom colormaps.

cmap_powerlaw_adjust, cmap_center_adjust, and cmap_center_point_adjust are from https://sites.google.com/site/theodoregoetz/notes/matplotlib_colormapadjust

Functions

metaseq.colormap_adjust.color_test Figure filled in with color; useful for troubleshooting or experimenting
metaseq.colormap_adjust.smart_colormap Creates a “smart” colormap that is centered on zero, and accounts for asymmetrical vmin and vmax by matching saturation/value of high and low colors.
metaseq.colormap_adjust.cmap_discretize
metaseq.colormap_adjust.cmap_powerlaw_adjust Returns a new colormap based on the one given but adjusted via power-law, newcmap = oldcmap**a.
metaseq.colormap_adjust.cmap_center_adjust Returns a new colormap based on the one given
metaseq.colormap_adjust.cmap_center_point_adjust Converts center to a ratio between 0 and 1 of the range given and calls cmap_center_adjust().

metaseq.minibrowser

Module for spawning mini genome browsers using a plugin structure, making it possible to build rather complex mini-browsers. The goal is to point the mini-browser to some data, and call its plot() method with a feature. This will spawn a new figure showing the data for that interval.

MiniBrowser classes are just a general way of mapping data-manipulation or data-visualization methods to an Axes on which the data should be displayed.

To make a new subclass:

  1. Create one or more methods that accept an Axes object and a pybedtools Interval object and return a feature. The simplest do-nothing method would be:

    def my_panel(self, ax, feature):
        return feature
    

    A more useful method might be one that plots genomic signal over the region:

    def my_panel(self, ax, feature):
        # For simplicity, use only the first genomic_signal object.
        gs = self.genomic_signal_objs[0]
        # Binned coverage across the feature: x positions and y values.
        x, y = gs.local_coverage(feature, bins=100)
        ax.plot(x, y)
        ax.axis('tight')
        return feature
    
  2. Then, override the panels() method. This method:

    • Creates Axes as needed; assumes that self.make_fig() has already been called so that self.fig is available.
    • Returns a list of (ax, method) tuples. This list maps created Axes to methods that should operate on them (like my_panel method above).

    For example:

    def panels(self):
        ax = self.fig.add_subplot(111)
        return [(ax, self.my_panel)]
    

A figure is spawned by calling the plot() method with a pybedtools Interval, e.g.:

s = SignalMiniBrowser([ip, control])
s.plot(feature)
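
The workflow above (make the figure, ask panels() for (ax, method) pairs, call each method with the feature) can be sketched without matplotlib or pybedtools. In the illustration below, `StubAxes`, `StubFigure`, `SketchMiniBrowser`, and `TwoPanelBrowser` are hypothetical, features are plain (start, stop) tuples, and the sketch assumes that plot() hands each method's return value to the next method, which would explain why panel methods return a feature. It is not metaseq's actual implementation.

```python
# Dependency-free sketch: stub objects stand in for matplotlib so the
# dispatch logic can be read (and run) on its own.


class StubAxes(object):
    """Stand-in for a matplotlib Axes; it only records what was plotted."""

    def __init__(self):
        self.lines = []

    def plot(self, x, y):
        self.lines.append((list(x), list(y)))


class StubFigure(object):
    """Stand-in for a matplotlib Figure."""

    def __init__(self):
        self.axes = []

    def add_subplot(self, *args):
        ax = StubAxes()
        self.axes.append(ax)
        return ax


class SketchMiniBrowser(object):
    """Hypothetical base class mimicking the workflow described above."""

    def make_fig(self):
        self.fig = StubFigure()

    def panels(self):
        raise NotImplementedError("subclasses must override panels()")

    def plot(self, feature):
        self.make_fig()
        # Assumption: each method's return value is passed to the next one.
        for ax, method in self.panels():
            feature = method(ax, feature)
        return self.fig


class TwoPanelBrowser(SketchMiniBrowser):
    def padded_panel(self, ax, feature):
        # Widen the (start, stop) tuple by 10 bp on each side, then draw it.
        start, stop = feature
        feature = (max(start - 10, 0), stop + 10)
        ax.plot(feature, [1, 1])
        return feature

    def span_panel(self, ax, feature):
        # Receives whatever the previous panel returned.
        ax.plot(feature, [0, 0])
        return feature

    def panels(self):
        top = self.fig.add_subplot(211)
        bottom = self.fig.add_subplot(212)
        return [(top, self.padded_panel), (bottom, self.span_panel)]


fig = TwoPanelBrowser().plot((100, 200))
```

Under that assumption, an early panel can adjust the region (here, by padding it) and every later panel draws the adjusted region without extra bookkeeping.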

Classes

metaseq.minibrowser.BaseMiniBrowser Base class for plotting a genomic region.
metaseq.minibrowser.SignalMiniBrowser Base class for plotting genomic signal.
metaseq.minibrowser.GeneModelMiniBrowser Mini-browser to show a signal panel on top and gene models on the bottom.

metaseq.filetype_adapters

This module provides classes that make a file format conform to a uniform API. End users generally do not need these classes directly; they are used internally by higher-level code such as metaseq.genomic_signal.

File-type adapters accept a filename of the appropriate format (which is not checked) as the only argument to their constructor.

Subclasses must define __getitem__ to accept a pybedtools.Interval and return an iterator of pybedtools.Interval objects.

Subclasses must define make_fileobj(), which returns an object to be iterated over in __getitem__.
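
The adapter contract described above can be illustrated with a small stand-alone sketch. Here `Interval` is a namedtuple standing in for pybedtools.Interval, and `SketchBaseAdapter` and `PlainBedAdapter` are hypothetical classes that scan an uncompressed BED file line by line. A real adapter would use an index for random access, so treat this only as an illustration of the two required methods.

```python
import os
import tempfile
from collections import namedtuple

# Stand-in for pybedtools.Interval (assumption: only these fields are needed).
Interval = namedtuple("Interval", ["chrom", "start", "stop"])


class SketchBaseAdapter(object):
    """Hypothetical base class: the constructor takes only a filename."""

    def __init__(self, fn):
        self.fn = fn

    def make_fileobj(self):
        raise NotImplementedError

    def __getitem__(self, key):
        raise NotImplementedError


class PlainBedAdapter(SketchBaseAdapter):
    """Hypothetical adapter that scans a plain-text BED file linearly."""

    def make_fileobj(self):
        # The object returned here is what __getitem__ iterates over.
        return open(self.fn)

    def __getitem__(self, key):
        # Yield every interval in the file that overlaps `key`.
        with self.make_fileobj() as handle:
            for line in handle:
                fields = line.rstrip("\n").split("\t")
                if len(fields) < 3:
                    continue
                chrom, start, stop = fields[0], int(fields[1]), int(fields[2])
                if chrom == key.chrom and start < key.stop and stop > key.start:
                    yield Interval(chrom, start, stop)


# Write a tiny BED file, query it, then clean up.
with tempfile.NamedTemporaryFile("w", suffix=".bed", delete=False) as tmp:
    tmp.write("chr1\t10\t20\nchr1\t50\t60\nchr2\t10\t20\n")

adapter = PlainBedAdapter(tmp.name)
hits = list(adapter[Interval("chr1", 15, 55)])
os.remove(tmp.name)
```

Because every adapter answers the same `adapter[interval]` query, code built on top of adapters does not need to know which file format it is reading.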

Classes

metaseq.filetype_adapters.BaseAdapter Base class for filetype adapters
metaseq.filetype_adapters.BamAdapter Adapter that provides random access to BAM objects using Pysam
metaseq.filetype_adapters.BedAdapter Adapter that provides random access to BED files via Tabix
metaseq.filetype_adapters.BigBedAdapter Adapter that provides random access to bigBed files via bx-python
metaseq.filetype_adapters.BigWigAdapter Adapter that provides random access to bigWig files via bx-python

metaseq.array_helpers

Functions

_local_count The count of genomic signal (typically BED features) found within an interval.
_local_coverage Returns a binned vector of coverage.
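
To make the difference between a count and a binned coverage vector concrete, here is a small pure-Python sketch. `count_sketch` and `coverage_sketch` are hypothetical helpers that operate on half-open (start, stop) tuples; their signatures, the choice of bin centers for x, and the use of the per-bin mean are assumptions made for illustration and may differ from what the functions above actually do.

```python
def count_sketch(intervals, region):
    """Number of intervals that overlap `region` (half-open coordinates)."""
    start, stop = region
    return sum(1 for s, e in intervals if s < stop and e > start)


def coverage_sketch(intervals, region, bins):
    """Per-base coverage over `region`, averaged into `bins` bins.

    Returns (x, y): x holds the bin centers and y the mean coverage per bin.
    """
    start, stop = region
    n = stop - start
    if not 0 < bins <= n:
        raise ValueError("bins must be between 1 and the region length")

    # Per-base profile: how many intervals cover each position.
    profile = [0] * n
    for s, e in intervals:
        for pos in range(max(s, start), min(e, stop)):
            profile[pos - start] += 1

    # Integer bin edges that split the profile as evenly as possible.
    edges = [i * n // bins for i in range(bins + 1)]
    x, y = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        chunk = profile[lo:hi]
        x.append(start + (lo + hi) / 2.0)
        y.append(sum(chunk) / float(len(chunk)))
    return x, y


intervals = [(0, 4), (2, 6), (5, 8)]
n_features = count_sketch(intervals, (0, 8))
x, y = coverage_sketch(intervals, (0, 8), bins=4)
```

The count collapses the region to a single number, while the binned vector keeps the shape of the signal across the region at a fixed length, which is what makes features of different sizes comparable row by row.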