gffutils.interface.FeatureDB

class gffutils.interface.FeatureDB(dbfn, default_encoding='utf-8', keep_order=False, pragmas={'journal_mode': 'MEMORY', 'main.cache_size': 10000, 'main.page_size': 4096, 'synchronous': 'NORMAL'}, sort_attribute_values=False, text_factory=<class 'str'>)[source]
__init__(dbfn, default_encoding='utf-8', keep_order=False, pragmas={'journal_mode': 'MEMORY', 'main.cache_size': 10000, 'main.page_size': 4096, 'synchronous': 'NORMAL'}, sort_attribute_values=False, text_factory=<class 'str'>)[source]

Connect to a database created by gffutils.create_db().

Parameters:
  • dbfn (str) – Path to a database created by gffutils.create_db().

  • text_factory (callable) – Optionally set the way sqlite3 handles strings. Default is str.

  • default_encoding (str) – When non-ASCII characters are encountered, assume they are in this encoding.

  • keep_order (bool) –

    If True, all features returned from this instance will have the order of their attributes maintained. This can be turned on or off database-wide by setting the keep_order attribute or with this kwarg, or on a feature-by-feature basis by setting the keep_order attribute of an individual feature.

    Default is False, since this includes a sorting step that can get time-consuming for many features.

  • sort_attribute_values (bool) – If True, then whenever an attribute has multiple values, ensure they appear in the same order every time. This is typically only used for testing, in cases where it is important to have consistent ordering.

  • pragmas (dict) – Dictionary of pragmas to use when connecting to the database. See http://www.sqlite.org/pragma.html for the full list of possibilities, and constants.default_pragmas for the defaults. These can be changed later using the FeatureDB.set_pragmas() method.
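
To make the pragmas and text_factory parameters more concrete, here is a minimal sketch using only the standard-library sqlite3 module. It is not the gffutils implementation: apply_pragmas is a hypothetical helper written for illustration, and the in-memory connections stand in for a real database file. The dictionary uses the same keys and values as the default shown in the signature above.

```python
import sqlite3

# Same keys and values as the default `pragmas` argument in the signature.
pragmas = {
    "journal_mode": "MEMORY",
    "main.cache_size": 10000,
    "main.page_size": 4096,
    "synchronous": "NORMAL",
}


def apply_pragmas(conn, pragmas):
    """Hypothetical helper: issue one PRAGMA statement per dictionary entry."""
    for name, value in pragmas.items():
        # PRAGMA statements do not take bound parameters, so the statement
        # is built as text. Only pass trusted names and values here.
        conn.execute(f"PRAGMA {name}={value}")


conn = sqlite3.connect(":memory:")
apply_pragmas(conn, pragmas)

# Read one setting back to confirm it took effect on this connection.
cache_size = conn.execute("PRAGMA main.cache_size").fetchone()[0]

# text_factory controls the Python type returned for TEXT values.
raw_conn = sqlite3.connect(":memory:")
raw_conn.text_factory = bytes
raw_value = raw_conn.execute("SELECT 'gene'").fetchone()[0]
```

Pragmas apply per connection, which is presumably why they are supplied when connecting and can be changed afterwards through FeatureDB.set_pragmas().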

Notes

dbfn can also be an instance of a _DBCreator subclass, which is useful when gffutils.create_db() is given the dbfn=":memory:" kwarg.

Methods

__init__(dbfn[, default_encoding, ...])

Connect to a database created by gffutils.create_db().

add_relation(parent, child, level[, ...])

Manually add relations to the database.

all_features([limit, strand, featuretype, ...])

Iterate through the entire database.

analyze()

Runs the sqlite ANALYZE command to potentially speed up queries dramatically.

bed12(feature[, block_featuretype, ...])

Converts feature into BED12 format.

children(id[, level, featuretype, order_by, ...])

Return children of feature id.

children_bp(feature[, child_featuretype, ...])

Total bp of all children of a featuretype.

count_features_of_type([featuretype])

Simple count of features.

create_introns([exon_featuretype, ...])

Create introns from existing annotations.

create_splice_sites([exon_featuretype, ...])

Create splice sites from existing annotations.

delete(features[, make_backup])

Delete features from database.

execute(query)

Execute arbitrary queries on the db.

features_of_type(featuretype[, limit, ...])

Returns an iterator of gffutils.Feature objects.

featuretypes()

Iterate over feature types found in the database.

interfeatures(features[, new_featuretype, ...])

Construct new features representing the space between features.

iter_by_parent_childs([featuretype, level, ...])

For each parent of type featuretype, yield a list L of that parent and all of its children ([parent] + list(children)).

merge(features[, merge_criteria, multiline])

Merge features matching criteria together.

merge_all([merge_order, merge_criteria, ...])

Merge all features in database according to criteria.

method([limit, strand, featuretype, ...])

Iterate through the entire database.

parents(id[, level, featuretype, order_by, ...])

Return parents of feature id.

region([region, seqid, start, end, strand, ...])

Return features within specified genomic coordinates.

schema()

Returns the database schema as a string.

seqids()

Yield the unique sequence IDs (chromosomes, contigs) observed in the database.

set_pragmas(pragmas)

Set pragmas for the current database connection.

update(data[, make_backup])

Update the on-disk database with features in data.