metaseq.results_table.LazyDict

class metaseq.results_table.LazyDict(fn_dict, dbfn, index_file, extra=None, cls=<class 'metaseq.results_table.DESeqResults'>, modifier=None)

Bases: object

Dictionary-like object that lazily-loads ResultsTable objects.

Parameters:

fn_dict : dict

Keys of fn_dict will be the keys of this LazyDict object. Values should be filenames, each of which will be loaded into a ResultsTable object the first time its key is accessed.

index_file : str

Path to a file that contains one ID per line. This file is used to ensure all ResultsTable objects are aligned to the same index.

dbfn : str

Filename of a gffutils database. This enables gene info to be attached to the dataframe.

extra : pandas.DataFrame

This dataframe will be merged into the data in each file. This is useful for attaching things like gene lengths, alternative names, etc. For the merge to work, this dataframe must be indexed the same way the ResultsTable files are indexed.

cls : ResultsTable class or subclass

Each filename in fn_dict will be converted using this class.

modifier : callable

Upon first access, each newly-constructed ResultsTable will first have the extra data attached, and then will be provided as this callable’s only argument. The callable can make any modifications to the ResultsTable and should return the modified version, which is what later accesses of the same key will receive. For example, exonic bp data can be provided as part of the extra dataframe, and then the modifier can be a function that adds an RPKM column.
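The interplay between extra and modifier can be sketched with plain pandas. This is only an illustration under stated assumptions: the two DataFrames below are made-up stand-ins for the contents of one results file and for the extra dataframe, the column names count and exonic_bp are hypothetical, and add_rpkm is a hypothetical modifier. A real ResultsTable may expose its data differently, so a real modifier would need to be adapted accordingly.

```python
import pandas as pd

# Hypothetical stand-in for the data in one results file, indexed by gene ID.
results = pd.DataFrame(
    {"count": [100, 400]},
    index=["geneA", "geneB"],
)

# Hypothetical `extra` dataframe. It shares the same index, so rows line up.
extra = pd.DataFrame(
    {"exonic_bp": [1000, 2000]},
    index=["geneA", "geneB"],
)


def add_rpkm(df):
    """Modifier-style callable: takes one table, returns the modified table."""
    df = df.copy()
    total = df["count"].sum()
    # RPKM = reads * 1e9 / (total reads * exonic length in bp)
    df["rpkm"] = df["count"] * 1e9 / (total * df["exonic_bp"])
    return df


# Step 1: attach the extra data (index-aligned join).
merged = results.join(extra)
# Step 2: hand the extended table to the modifier.
modified = add_rpkm(merged)
```

Because the join is aligned on the index, a mismatch between the index of extra and the index of the results would leave missing values in the attached columns, which is why the shared indexing requirement matters.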

Notes

When a key is accessed for the first time, the workflow is ResultsTable(fn, **kwargs) -> attach extra -> send to modifier -> return extended and modified ResultsTable. Subsequent accesses of the same key immediately return the already extended-and-modified ResultsTable.
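The load-once workflow described above can be sketched as a small dictionary-like class. This is a minimal illustration of the lazy-loading pattern, not the metaseq implementation: LazySketch, loader, attach_extra, and fake_loader are all hypothetical names, and the assumption that values() and items() trigger loading is made here for simplicity.

```python
class LazySketch(object):
    """Illustrative lazy-loading mapping; not the metaseq implementation."""

    def __init__(self, fn_dict, loader, attach_extra=None, modifier=None):
        self.fn_dict = fn_dict
        self.loader = loader              # builds an object from a filename
        self.attach_extra = attach_extra  # optional step run after loading
        self.modifier = modifier          # optional final transformation
        self._cache = {}

    def __getitem__(self, key):
        # Only the first access pays the cost of load -> extra -> modifier.
        if key not in self._cache:
            obj = self.loader(self.fn_dict[key])
            if self.attach_extra is not None:
                obj = self.attach_extra(obj)
            if self.modifier is not None:
                obj = self.modifier(obj)
            self._cache[key] = obj
        return self._cache[key]

    def keys(self):
        return list(self.fn_dict.keys())

    def values(self):
        # Assumption: asking for values loads every entry.
        return [self[k] for k in self.keys()]

    def items(self):
        return [(k, self[k]) for k in self.keys()]


calls = []


def fake_loader(fn):
    """Hypothetical loader that records each filename it is asked to read."""
    calls.append(fn)
    return {"source": fn}


d = LazySketch(
    {"a": "a.txt", "b": "b.txt"},
    loader=fake_loader,
    modifier=lambda table: dict(table, modified=True),
)
first = d["a"]
second = d["a"]  # served from the cache; fake_loader is not called again
```

In this sketch the key "b" is never loaded, since it is never accessed, and both accesses of "a" return the same cached object.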

Methods

__init__(fn_dict, dbfn, index_file[, extra, ...]) Dictionary-like object that lazily-loads ResultsTable objects.
items()
keys()
values()