Track hub from Excel

If you are an avid user of the UCSC Genome Browser and the trackhub package, you might find it tedious to write a script for every hub you create. The trackhub_from_excel command-line tool uses this package to further automate the track hub making process. It also makes building track hubs easier if you are not familiar with Python. This guide explains how to use the tool and how to fill out an Excel workbook to make simple or complex visualizations on the UCSC Genome Browser.

1. Create a template

The Excel file must have a specific format to be parsed correctly.

To create a template, run:

trackhub_from_excel --template

This will create a file template.xlsx in the current directory, with the correct sheets that you can fill out with the instructions below.

Alternatively, you can get a working example to study and test:

trackhub_from_excel --create-example my_example.xlsx

2. Fill out the Excel workbook

Using the template just created, fill out the sheets with the data you’d like to visualize.

hub and genome sheets

The following special sheet names are used for configuring the hub and the genome (for assembly tracks):

hub – This sheet is necessary for all track hubs. It defines the hub name, labels, and genome.

genome – This sheet is only necessary when using a genome assembly. It points to the 2bit file and gives the genome a name and label.

Example `hub` sheet

hub	examplehub
short_label	Example hub
long_label	Example track hub from Excel
email	user@example.com
genome	hg38
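The hub sheet uses a two-column, key-value layout. As a minimal sketch (not the tool's actual parser), rows like the ones above could be turned into a dictionary as follows. The function name `hub_sheet_to_dict` and the in-memory list of rows are assumptions made for illustration only.

```python
def hub_sheet_to_dict(rows):
    """Convert two-column (key, value) rows into a dict, skipping blank rows.

    Hypothetical helper for illustration; not part of trackhub.
    """
    config = {}
    for row in rows:
        # Ignore empty rows and rows whose first cell (the key) is blank
        if not row or row[0] is None or str(row[0]).strip() == "":
            continue
        key = str(row[0]).strip()
        value = row[1] if len(row) > 1 else None
        config[key] = value
    return config


# Rows as they appear in the example hub sheet
hub_rows = [
    ("hub", "examplehub"),
    ("short_label", "Example hub"),
    ("long_label", "Example track hub from Excel"),
    ("email", "user@example.com"),
    ("genome", "hg38"),
]

hub_config = hub_sheet_to_dict(hub_rows)
print(hub_config["hub"], hub_config["genome"])
```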

Container track sheets

Container tracks must be configured in their own sheet.

The following special sheet names are required when using the corresponding container track. These sheets are created in the template: aggregate_config, view_config, super_config, and composite_config.

Each type of container track must be on its own sheet. For example, the view_config sheet can define several view tracks, but no other type of container track can be defined on that sheet. This applies to all container track types.

Columns in these sheets correspond to valid track parameters for the respective track type. There are also some special fields for container track configuration:

Extra aggregate_config field: An aggregate track can be placed in a super track. In this case, add the super track in super_config, then add a column labeled super to aggregate_config and fill in the name of the super track.

Extra view_config field: Views must be placed inside of a composite track. Configure this by adding the name of the composite track in the column labeled composite.

Extra composite_config field: A composite track can be placed in a super track. In this case, include the name of the super track in the column labeled super, as described above for aggregate_config.

Super tracks sit directly within the track hub and therefore do not need special fields.

Note

Subgroups are not specified in the composite config. Rather, they are automatically inferred and created based on the subgroups assigned to individual tracks (and which composites those tracks are assigned to). This makes it much more convenient to organize your tracks. See the tracks section below for details.

Example `view_config` sheet

name	view	short_label	long_label	visibility	tracktype	composite
signal	signalview	Genomic signal	Genomic signal (CPM)	full	bigWig	experiment1
peaks	peaksview	Peaks	Called peaks (macs2)	dense	bigBed	experiment1

Example `composite_config` sheet

name	short_label	long_label	tracktype	super
experiment1	experiment 1	Experiment 1	bigWig	supertrack1
experiment2	experiment 2	Experiment 2	bigWig

Example `super_config` sheet

name	short_label	long_label
supertrack1	Super track	Super track
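The relationships between the container sheets (views point to composites, composites optionally point to super tracks) can be checked before running the tool. The following is a rough sketch under the assumption that each sheet has already been read into a list of dictionaries keyed by column name. The function `check_container_references` is hypothetical and is not part of trackhub.

```python
def check_container_references(view_rows, composite_rows, super_rows):
    """Return a list of problems found in container cross-references.

    Hypothetical helper for illustration; not part of trackhub.
    """
    composite_names = {row["name"] for row in composite_rows}
    super_names = {row["name"] for row in super_rows}
    problems = []

    # Every view must name an existing composite in its "composite" column
    for row in view_rows:
        composite = row.get("composite")
        if not composite:
            problems.append(f"view {row['name']!r} has no composite")
        elif composite not in composite_names:
            problems.append(
                f"view {row['name']!r} refers to unknown composite {composite!r}"
            )

    # A composite may name a super track; if it does, the super track must exist
    for row in composite_rows:
        parent = row.get("super")
        if parent and parent not in super_names:
            problems.append(
                f"composite {row['name']!r} refers to unknown super track {parent!r}"
            )
    return problems


# Values taken from the example sheets above (only the relevant columns)
view_rows = [
    {"name": "signal", "view": "signalview", "composite": "experiment1"},
    {"name": "peaks", "view": "peaksview", "composite": "experiment1"},
]
composite_rows = [
    {"name": "experiment1", "super": "supertrack1"},
    {"name": "experiment2", "super": None},
]
super_rows = [{"name": "supertrack1"}]

problems = check_container_references(view_rows, composite_rows, super_rows)
print(problems)
```

An empty list means every view has a composite and every named super track is defined.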

Tracks

All other sheets that do not have the special names indicated above are assumed to configure tracks.

Each row defines a track and must have values in the name and tracktype columns, plus either the source or the bigDataUrl column. Use source when the file is on disk and bigDataUrl when the file is publicly hosted. You can add more columns for any other fields that are valid for the specific track type.

Different track types can be listed on the same sheet. Tracks in different containers can be listed on the same sheet and tracks in the same containers can be listed on different sheets.

Leave a cell blank to omit that field for that track. The program will leave the field out of that track's configuration.
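The rules above (required columns, and blank cells meaning "omit this field") can be sketched as follows. This illustrates the described behavior and is not the tool's own code. The function name `clean_track_row` and the row-as-dictionary representation are assumptions.

```python
REQUIRED_FIELDS = ("name", "tracktype")


def clean_track_row(row):
    """Drop blank cells from a track row and check the required fields.

    Hypothetical helper for illustration; not part of trackhub.
    """
    # Treat None and whitespace-only cells as blank, and omit them
    cleaned = {
        key: value
        for key, value in row.items()
        if value is not None and str(value).strip() != ""
    }

    missing = [field for field in REQUIRED_FIELDS if field not in cleaned]
    # A track needs a local file (source) or a hosted file (bigDataUrl)
    if "source" not in cleaned and "bigDataUrl" not in cleaned:
        missing.append("source or bigDataUrl")
    if missing:
        raise ValueError(f"track row is missing: {', '.join(missing)}")
    return cleaned


row = {
    "name": "k562_wt",
    "tracktype": "bigWig",
    "source": "data/kwt.bigwig",
    "visibility": "full",
    "color": "",  # blank cell: the field is omitted for this track
}
track = clean_track_row(row)
print(sorted(track))
```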

To use container tracks, be sure to define the container and use the container and container_type fields for the track.

For example, to place a track in a view track, you first need to add a row for the view in the view_config sheet that includes a name field. In another sheet (one containing tracks, so you can name it whatever you want), fill out a row for the track, including the container_type and container fields in addition to the required fields described above. For the container_type column, fill in “view”, and for the container column, fill in the same name that is in the view_config sheet.

To add a subgroup to a track, make a column with the prefix subgroup_. The text after the underscore becomes the name of the subgroup. In each row, fill in the group that the track's data file belongs to.

For example, to make subgroups based on genotype, you might label the column subgroup_genotype and fill in the rows with “WT” or “KO”. You can make as many subgroups as you need.
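To make the subgroup_ column convention concrete, here is a small sketch of how subgroup names and values could be read from track rows and then collected across tracks. It only illustrates the naming convention described here. The function names and the row-as-dictionary representation are assumptions, and the tool's own inference logic may differ.

```python
SUBGROUP_PREFIX = "subgroup_"


def row_subgroups(row):
    """Return {subgroup_name: value} for the subgroup_ columns of one row.

    Hypothetical helper for illustration; not part of trackhub.
    """
    subgroups = {}
    for column, value in row.items():
        # Blank cells mean the track is not assigned to that subgroup
        if column.startswith(SUBGROUP_PREFIX) and value not in (None, ""):
            subgroups[column[len(SUBGROUP_PREFIX):]] = value
    return subgroups


def collect_subgroup_values(rows):
    """Gather every value seen for each subgroup across all rows."""
    collected = {}
    for row in rows:
        for name, value in row_subgroups(row).items():
            collected.setdefault(name, set()).add(value)
    return collected


rows = [
    {"name": "sample1", "tracktype": "bigWig", "subgroup_genotype": "WT"},
    {"name": "sample2", "tracktype": "bigWig", "subgroup_genotype": "KO"},
]

print(row_subgroups(rows[0]))
print(collect_subgroup_values(rows))
```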

Example `tracks` sheet

name	short_label	long_label	tracktype	source	visibility	color	container	container_type	dimensions	subgroup_celltype	subgroup_genotype
k562_wt	K562 WT signal	K562 cells, WT signal	bigWig	data/kwt.bigwig	full	120,51,154	signalview	view	dimX=genotype dimY=celltype	k562	wt
k562_wt_pk	K562 WT peaks	K562 cells, WT peaks	bigBed	data/kwt.bigbed	dense	120,51,154	peaksview	view	dimX=genotype dimY=celltype	k562	wt

3. Run the script

Run the script on your workbook. By default, the track hub directory is named “staging”:

python trackhub_from_excel.py --excel_file experiment.xlsx

Use the --staging flag to specify a different directory name:

python trackhub_from_excel.py --excel_file experiment.xlsx --staging experiment

The output directory will then be ready for uploading to a host.
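If you build several hubs from a pipeline, the same command can be assembled from Python. This is only a convenience sketch: the helper `build_command` is hypothetical, and it uses just the --excel_file and --staging options shown above.

```python
import subprocess


def build_command(excel_file, staging=None):
    """Assemble the command line shown above as a list of arguments.

    Hypothetical helper for illustration; not part of trackhub.
    """
    cmd = ["python", "trackhub_from_excel.py", "--excel_file", excel_file]
    if staging is not None:
        # Only pass --staging when a non-default directory name is wanted
        cmd += ["--staging", staging]
    return cmd


cmd = build_command("experiment.xlsx", staging="experiment")
print(" ".join(cmd))

# To actually run it (requires the script and the workbook to be present):
# subprocess.run(cmd, check=True)
```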
