Mar 10, 2020

Public workspaceVisualizing Calling Card Data on the WashU Epigenome Browser

  • 1Washington University, Saint Louis
  • Transposon Calling Cards
Icon indicating open access to content
QR code linking to this content
Protocol Citation: Arnav Moudgil, Rob Mitra 2020. Visualizing Calling Card Data on the WashU Epigenome Browser. protocols.io https://dx.doi.org/10.17504/protocols.io.bca8ishw
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License,  which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Protocol status: Working
We use this protocol and it's working
Created: February 07, 2020
Last Modified: March 10, 2020
Protocol Integer ID: 32832
Abstract
This document explains how to visualize calling card insertions as wells as density and peak tracks on the WashU Epigenome Browser.
Materials
If you are hosting files on a server, you will require the following software package:
Before start
Please make sure you have installed the required software and packages (see Materials section).

This protocol describes how to visualize calling card data, such as raw insertions (in a CCF file), insertion densities (bedgraph file), and peaks (BED file). Ideally, you will be able to store files on a publicly-accessible webserver. This requires enabling Cross-Origin Resource Sharing (CORS). If this is not possible, data files can also be directly uploaded from your computer to the browser. This approach is described in the second half of the protocol.
Introduction
Introduction
During the course of a calling card experiment, or after all the analysis has been said and done, you may want to visually inspect the data. These can be the raw insertions (stored as a CCF file), insertion densities (as a bedgraph file), or the peak boundaries (as a BED file). Instructions for generating these kinds of files can be found in the following protocols:
The WashU Epigenome Browser natively supports visualizing CCF files as a calling card track. In addition, it supports standard genomic file formats such as BED, bedgraph, bigWig, and HiC. This allows us to compare and contrast calling card data to popular genomic assays like ChIP-seq, DNase-seq, ATAC-seq, and Hi-C.

There are two ways load data onto the browser:
  1. Hosting your data on a publicly-accessible server and letting the browser fetch the data.
  2. Directly uploading text files to the browser.

The first method is preferred: track information and preferences can be stored in a simple text file, which can then be shared with collaborators. This saves the effort and cost of moving around potentially large datasets. Instead, the data remains in one place and can be accessed by anyone from anywhere.

The second option can be used if a public server is not available to you, or if you want to rapidly look at files that are already on your computer.
For this protocol, we will use the following files:
  • HCT-116_PBase.ccf, a file of raw calling card insertions
  • HCT-116_PBase.bedgraph, a file of calling card insertion densities across the genome
  • HCT-116_PBase_peaks.bed, a file of peaks inferred from calling card data
Uploading Data from an External Server
Uploading Data from an External Server
BED, bedgraph, and CCF files are all text-based formats. Before being hosted on an external server, these files must be compressed and indexed. This allows for fast, random-access to the data. For this step, make sure you have installed htslib (see Materials).

To compress the file:
bgzip HCT-116_PBase.ccf
This will compress the original file, creating HCT-116_PBase.ccf.gz

To index the compressed file:
tabix -p bed HCT-116_PBase.ccf.gz
This will create an index file, HCT-116_PBase.ccf.gz.tbi

Both HCT-116_PBase.ccf.gz and HCT-116_PBase.ccf.gz.tbi should be copied to a publicly-accessible webserver. These steps also apply to bedgraph and BED files.
Next, we create a JSON file. JSON is a generic standard describing objects and their properties. Here, we use it to describe each track, its data, and any options we may want to customize. The JSON file is plain text; it is recommended that you use a text editor (e.g. Sublime, Atom, Visual Studio Code, vim/emacs, etc.) instead of a word processor (e.g. Microsoft Word) to create this file.

We will create HCT-116_PBase.json for our data. Here is the structure of a JSON file for a single track depicting calling card data:

[ { "type":"callingcard", "url":"https://htcf.wustl.edu/files/xXl8ZAXy/HCT-116_PBase.ccf.gz", "name":"piggyBac insertions", "showOnHubLoad":"true", "options":{ "color":"#3182bd", "height":100, "logScale":"log10",
"markerSize":3,
"opacity":[100],
"show":"all", "sampleSize":1000,
},
},
]

All tracks in the JSON file must be between the square brackets [...]. Curly braces {...} and commas separate invidual tracks, and within a track, another set of curly braces may be used to specify grouped options. Let's consider each of these entries in turn:

"type":"callingcard",
This specifies the track type, which in this case is a calling card track.

"url":"https://htcf.wustl.edu/files/xXl8ZAXy/HCT-116_PBase.ccf.gz",
This points to the web address where your data are stored. Note that this must point to the compressed file. The browser will automatically located the index file. (If the index file is not present, it will throw an error).

"name":"piggyBac insertions",
This is the track label.

"showOnHubLoad":"true",
This specifies whether the track should be display immediately after the JSON file has been uploaded. The default value is "false."

There are also a number of options that can be specified to customize the track appearance. While all of them can be changed after the tracks have been loaded, specifying them in the JSON file helps to record and reproduce settings. Note that these only need to be set if you wish to override defaults.

"color":"#f4916c",
This sets the color of the individual calling card markers. The defaul is blue.

"height":100,
This sets the height of the track in pixels. The default is 40.

"logScale":"log10",
This transforms the y-axis from a linear scale to a logarithmic scale. By default, the track uses a linear scale, but for calling card experiments we recommed using a logarithmic scale.

"markerSize":3,
This is the radius of the marker in pixels. The default value is 3.

"opacity":[100],
This specifies the opacity of the markers, which can help emphasize data-dense regions. This takes an integer value between 0 and 100. The value supplied must be in square brackets. The default value is 100.

"show":"all", "sampleSize":1000,
This pair of options specify whether the browser should display all data points, or simply a random subsample. If there are many (e.g. thousands) of data points in view, it can slow down your web browser. Subsampling can help preserve a global representation of your data while still remaining responsive and memory-friendly. The "show" option can take two values: "all" (default) or "sample." The former draws all data points. If the latter is specified, then the "sampleSize" option is applied. This number determines how many points to subsample and is set to 1000 by default.
We can also add the bedgraph and BED files to the JSON:
[ { "type":"callingcard", "url":"https://htcf.wustl.edu/files/xXl8ZAXy/HCT-116_PBase.ccf.gz", "name":"piggyBac insertions", "showOnHubLoad":"true", "options":{ "color":"#3182bd", "height":100, "logScale":"log10",
"markerSize":3,
"opacity":[100],
"show":"all", "sampleSize":1000,
},
},
{ "type":"bedgraph", "url":"https://htcf.wustl.edu/files/xXl8ZAXy/HCT-116_PBase.bedgraph", "name":"HCT-116 PBase density", "showOnHubLoad":"true", "options":{ "color":"#3182bd", "height":50,
},
},
{ "type":"bed", "url":"https://htcf.wustl.edu/files/xXl8ZAXy/HCT-116_PBase.bed", "name":"HCT-116 PBase peaks", "showOnHubLoad":"true", "options": { "color":"#3182bd",
"height":20, "displayMode":"density",
}
},
]
Many options have the same names and settings as the calling card track. A complete list of options for bedgraph and BED tracks can be found here.
Now we can upload this file to the browser and visualize it. Go to the homepage and select the appropriate genome:



Once the genome has loaded, click on the blue "Tracks" button and select "Custom tracks."



Select the second tab, "Add Custom Data Hub."


Clicking on the second dialog box ("Choose datahub file") will open a filesystem navigator window. Find the JSON file on your system and upload it.

Upon successful upload, you should see a table listing the uploaded tracks:



After closing the custom track pane (with the red X in the upper right corner), all three tracks should be visible:


Note that the gencodeV29 track has been removed for this screenshot.

Uploading Local Data Files
Uploading Local Data Files
As an alternative to hosting your data externally, you can directly upload text-based data files, including CCF, bedgraph, and BED. If you choose this option, your files should not be compressed. Also, for very large files, this approach may slow down your browser.
Click on the blue "Tracks" button and select "Text track."



Select "callingcard" from the dropdown menu, then click on the "Choose Files" button. Select one or more CCF files to upload. The browser will then load the tracks; this may take a few moments, depending on the size of the file(s).



After the tracks have been loaded, close the pane by clicking on the red X in the top right corner. When directly uploading text files, tracks will be rendered with default options.

These steps can be repeated for BED and bedgraph files.

Interacting with the Calling Card Track
Interacting with the Calling Card Track
The calling card track was designed for interactive exploration. This is chiefly accomplished by a rollover box that appears as the mouse cursor nears a data point. An approximate location is at the top of the pane, while the bottom lists the read count, strand, and barcode for each insertion:



Right clicking on the calling card track will bring up a preference pane, where tracks can be dynamically customized:


SImilar preference panes exist for the BED and bedgraph tracks.

This example showcases all the different options available for calling card tracks, such as color, marker size, opacity, and subsampling: