gffutils.interface.FeatureDB
- class gffutils.interface.FeatureDB(dbfn, default_encoding='utf-8', keep_order=False, pragmas={'journal_mode': 'MEMORY', 'main.cache_size': 10000, 'main.page_size': 4096, 'synchronous': 'NORMAL'}, sort_attribute_values=False, text_factory=<class 'str'>)
- __init__(dbfn, default_encoding='utf-8', keep_order=False, pragmas={'journal_mode': 'MEMORY', 'main.cache_size': 10000, 'main.page_size': 4096, 'synchronous': 'NORMAL'}, sort_attribute_values=False, text_factory=<class 'str'>)
Connect to a database created by gffutils.create_db().
Parameters:
dbfn (str) – Path to a database created by gffutils.create_db().
text_factory (callable) – Optionally set the way sqlite3 handles strings. Default is str.
default_encoding (str) – When non-ASCII characters are encountered, assume they are in this encoding.
keep_order (bool) – If True, all features returned from this instance will have the order of their attributes maintained. This can be turned on or off database-wide by setting the keep_order attribute or with this kwarg, or on a feature-by-feature basis by setting the keep_order attribute of an individual feature. Default is False, since this includes a sorting step that can get time-consuming for many features.
sort_attribute_values (bool) – If True, then whenever an attribute has multiple values, ensure they appear in the same order every time. This is typically only used for testing, in cases where it is important to have consistent ordering.
pragmas (dict) – Dictionary of pragmas to use when connecting to the database. See http://www.sqlite.org/pragma.html for the full list of possibilities, and constants.default_pragmas for the defaults. These can be changed later using the FeatureDB.set_pragmas() method.
Notes
dbfn can also be a subclass of _DBCreator, useful for when gffutils.create_db() is provided the dbfn=":memory:" kwarg.
Methods
__init__(dbfn[, default_encoding, ...]) – Connect to a database created by gffutils.create_db().
add_relation(parent, child, level[, ...]) – Manually add relations to the database.
all_features([limit, strand, featuretype, ...]) – Iterate through the entire database.
analyze() – Runs the sqlite ANALYZE command to potentially speed up queries dramatically.
bed12(feature[, block_featuretype, ...]) – Converts feature into a BED12 format.
children(id[, level, featuretype, order_by, ...]) – Return children of feature id.
children_bp(feature[, child_featuretype, ...]) – Total bp of all children of a featuretype.
count_features_of_type([featuretype]) – Simple count of features.
create_introns([exon_featuretype, ...]) – Create introns from existing annotations.
create_splice_sites([exon_featuretype, ...]) – Create splice sites from existing annotations.
delete(features[, make_backup]) – Delete features from database.
execute(query) – Execute arbitrary queries on the db.
features_of_type(featuretype[, limit, ...]) – Returns an iterator of gffutils.Feature objects.
featuretypes() – Iterate over feature types found in the database.
interfeatures(features[, new_featuretype, ...]) – Construct new features representing the space between features.
iter_by_parent_childs([featuretype, level, ...]) – For each parent of type featuretype, yield a list L of that parent and all of its children ([parent] + list(children)).
merge(features[, merge_criteria, multiline]) – Merge features matching criteria together.
merge_all([merge_order, merge_criteria, ...]) – Merge all features in database according to criteria.
method([limit, strand, featuretype, ...]) – Iterate through the entire database.
parents(id[, level, featuretype, order_by, ...]) – Return parents of feature id.
region([region, seqid, start, end, strand, ...]) – Return features within specified genomic coordinates.
schema() – Returns the database schema as a string.
seqids() – Yield the unique sequence IDs (chromosomes, contigs) observed in the database.
set_pragmas(pragmas) – Set pragmas for the current database connection.
update(data[, make_backup]) – Update the on-disk database with features in data.