gffutils.inspect.inspect
- gffutils.inspect.inspect(data, look_for=['featuretype', 'chrom', 'attribute_keys', 'feature_count'], limit=None, verbose=True)
Inspect a GFF or GTF data source.
This function is useful for figuring out the different featuretypes found in a file (for potential removal before creating a FeatureDB).
Returns a dictionary with a key for each item in look_for and a corresponding value that is a dictionary of how many of each unique item were found.
There will always be a feature_count key, indicating how many features were looked at (if limit is provided, then feature_count will be the same as limit).
For example, if look_for is ['chrom', 'featuretype'], then the result will be a dictionary like:
{
    'chrom': {
        'chr1': 500,
        'chr2': 435,
        'chr3': 200,
        ...
    },
    'featuretype': {
        'gene': 150,
        'exon': 324,
        ...
    },
    'feature_count': 5000
}
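The shape of the returned dictionary can be illustrated with a small stand-alone sketch. This is not gffutils' implementation: sketch_inspect is a hypothetical helper, and the features are plain dicts standing in for Feature objects, so the example runs without the library.

```python
from collections import Counter
from itertools import islice


def sketch_inspect(features, look_for=("featuretype", "chrom"), limit=None):
    """Hypothetical stand-in that mimics the documented return shape."""
    # 'feature_count' is always reported, so it is not counted per value.
    keys = [key for key in look_for if key != "feature_count"]
    counts = {key: Counter() for key in keys}
    feature_count = 0
    # islice(..., None) iterates over everything, i.e. no limit.
    for feature in islice(features, limit):
        feature_count += 1
        for key in keys:
            if key == "attribute_keys":
                # Count each attribute key present on this feature.
                counts[key].update(feature["attributes"].keys())
            else:
                counts[key][feature[key]] += 1
    result = {key: dict(counter) for key, counter in counts.items()}
    result["feature_count"] = feature_count
    return result


# Plain dicts used as assumed stand-ins for Feature objects.
features = [
    {"chrom": "chr1", "featuretype": "gene", "attributes": {"ID": ["g1"]}},
    {"chrom": "chr1", "featuretype": "exon",
     "attributes": {"ID": ["e1"], "Parent": ["g1"]}},
    {"chrom": "chr2", "featuretype": "exon",
     "attributes": {"ID": ["e2"], "Parent": ["g2"]}},
]

summary = sketch_inspect(
    features, look_for=["chrom", "featuretype", "attribute_keys"]
)
limited = sketch_inspect(features, look_for=["featuretype"], limit=2)
```

Here summary holds one sub-dictionary per requested item plus feature_count, and limited stops after the first two features.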
- Parameters:
data (str, FeatureDB instance, or iterator of Features) – If data is a string, assume it's a GFF or GTF filename. If it's a FeatureDB instance, then its all_features() method will be automatically called. Otherwise, assume it's an iterable of Feature objects.
look_for (list) – List of things to keep track of. Options are:
  - any attribute of a Feature object, such as chrom, source, start, stop, strand.
  - "attribute_keys", which will look at all the individual attribute keys of each feature
limit (int) – Number of features to look at. Default is no limit.
verbose (bool) – Report how many features have been processed.
- Return type:
dict
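As a rough sketch of how the result might be used when deciding which featuretypes to drop before building a FeatureDB, the snippet below works on a hand-written dictionary shaped like the one described above. The counts, the "chromosome" featuretype, the filename in the comment, and the min_count threshold are all assumptions made for illustration; which featuretypes are actually worth removing is a judgment call.

```python
# Hypothetical result, shaped like the documented return value. In practice
# it would come from a call such as:
#   result = gffutils.inspect.inspect("annotation.gff", verbose=False)
# where "annotation.gff" is a placeholder filename.
result = {
    "featuretype": {"gene": 150, "exon": 324, "chromosome": 3},
    "feature_count": 477,
}

min_count = 10  # arbitrary threshold chosen for illustration

featuretype_counts = result["featuretype"]

# Featuretypes seen fewer than min_count times: candidates to review.
rare = sorted(
    name for name, n in featuretype_counts.items() if n < min_count
)

# All featuretypes, most frequent first, for a quick overview.
by_frequency = sorted(
    featuretype_counts.items(), key=lambda item: item[1], reverse=True
)

for name, n in by_frequency:
    print(f"{name}\t{n}")
```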