API

gffutils

Create a database

create_db

Create a database from a GFF or GTF file.

Interact with a database

First, connect to an existing database:

FeatureDB

Then, use the methods of FeatureDB to interact:

FeatureDB.children

Return children of feature id.

FeatureDB.parents

Return parents of feature id.

FeatureDB.schema

Returns the database schema as a string.

FeatureDB.features_of_type

Returns an iterator of gffutils.Feature objects.

FeatureDB.count_features_of_type

Simple count of features.

FeatureDB.all_features

Iterate through the entire database.

FeatureDB.execute

Execute arbitrary queries on the db.

FeatureDB.featuretypes

Iterate over feature types found in the database.

FeatureDB.region

Return features within specified genomic coordinates.

FeatureDB.iter_by_parent_childs

For each parent of type featuretype, yield a list L of that parent and all of its children ([parent] + list(children)).

Modify a FeatureDB:

FeatureDB.update

Update the on-disk database with features in data.

FeatureDB.delete

Delete features from database.

FeatureDB.add_relation

Manually add relations to the database.

FeatureDB.set_pragmas

Set pragmas for the current database connection.

Operate on features:

FeatureDB.interfeatures

Construct new features representing the space between features.

FeatureDB.children_bp

Total bp of all children of a featuretype.

FeatureDB.merge

Merge features matching criteria together.

FeatureDB.create_introns

Create introns from existing annotations.

FeatureDB.bed12

Converts a feature into BED12 format.

Feature objects

Most FeatureDB methods return Feature objects:

Feature

You can extract the sequence for a feature:

Feature.sequence

Retrieves the sequence of this feature as a string.

Creating a Feature object:

feature_from_line

Given a line from a GFF file, return a Feature object.

Integration with other tools

gffutils.biopython_integration.to_seqfeature

Converts a gffutils.Feature object to a Bio.SeqFeature object.

gffutils.biopython_integration.from_seqfeature

Converts a Bio.SeqFeature object to a gffutils.Feature object.

gffutils.pybedtools_integration.tsses

Create 1-bp transcription start sites for all transcripts in the database and return as a sorted pybedtools.BedTool object pointing to a temporary file.

gffutils.pybedtools_integration.to_bedtool

Convert any iterator into a pybedtools.BedTool object.

Utilities

gffutils.helpers.asinterval

Converts a gffutils.Feature to a pybedtools.Interval.

gffutils.helpers.merge_attributes

Merges two attribute dictionaries into a single dictionary.

gffutils.helpers.sanitize_gff_db

Sanitize given GFF db.

gffutils.helpers.annotate_gff_db

Annotate a GFF file by cross-referencing it with another GFF file, e.g. one containing gene models.

gffutils.helpers.infer_dialect

Infer the dialect based on the attributes.

gffutils.helpers.example_filename

Return the full path of a data file that ships with gffutils.

gffutils.inspect.inspect

Inspect a GFF or GTF data source.