Jul 19, 2022

Quality control analysis for 10X snRNA-seq V.2

  • University of California, San Diego
Protocol Citation: Daniel Jacobsen, Dinh H Diep 2022. Quality control analysis for 10X snRNA-seq. protocols.io https://dx.doi.org/10.17504/protocols.io.261genbqjg47/v2
Version created by Dinh H Diep
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: Working
We use this protocol and it’s working
Created: July 19, 2022
Last Modified: July 19, 2022
Protocol Integer ID: 67087
Abstract
Here we describe a computational protocol for performing quality control analysis on shallow sequencing data obtained from 10X snRNA-seq experiments. The workflow starts with raw MiSeq run folders and uses cellranger to generate count matrices. The raw count matrices are analyzed and sequencing saturation plots are generated. The saturation plots are then compared against plots from a reference set of libraries of varying quality (bad, fair, good, great), which makes it possible to determine sequencing requirements and to assess the overall quality of each 10X snRNA-seq experiment.
Materials
cellranger software
bcl2fastq software

Note
Make sure that cellranger is in the environment's path; otherwise, modify the commands to include the full path to cellranger.

Download the tar.gz file from this protocol.
Extract the tar.gz file from this protocol.
<FILENAME> is the name of the downloaded tar.gz file.
Command
This command extracts a tar.gz file (linux)
tar -xzf <FILENAME>

Install the Anaconda or Miniconda Python distribution following its installation instructions.
Install preseq using the instructions at http://smithlabresearch.org/software/preseq/.


Note
preseq must be in the environment's path.

Preseq requires the GSL libraries. Install GSL using the instructions from https://www.gnu.org/software/gsl/.

Create a symbolic link so that preseq can find the required gsl library.
Command
This command creates a symbolic link to a gsl library for preseq (linux)
sudo ln -s /usr/local/lib/libgsl.so /usr/lib/libgsl.so.0
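
One way to confirm that the symbolic link resolved the dependency is to scan the output of ldd for libraries marked "not found". The sketch below is an illustration only: the helper name report_missing_libs is hypothetical, and it assumes a Linux system where ldd is available and preseq is a dynamically linked binary.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the protocol's package).
# Reads `ldd` output on stdin and prints the name of every shared
# library that the dynamic linker reports as "not found".
report_missing_libs() {
    awk '/not found/ { print $1 }'
}

# Example call, assuming preseq is already in the environment's path:
# ldd "$(command -v preseq)" | report_missing_libs
```

If the example call prints nothing, every shared library that preseq links against, including GSL, was found.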

Install samtools using the instructions at http://www.htslib.org/download/.
Use conda to install bcl2fastq with the following terminal command:

Command
This command installs bcl2fastq for demultiplexing Illumina sequencing runs. (linux)
conda install -c dranew bcl2fastq

Use conda to install required python packages with the following terminal command:

Command
This command installs the Python packages that the 10X_snRNA_preseq_analysis package requires to generate preseq plots (linux)
conda install -c conda-forge numpy seaborn matplotlib pandas
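
Since several steps depend on tools being in the environment's path, a quick check before running the pipeline can save a failed run. The following is a minimal sketch; the function name check_tools is hypothetical, and the tool list in the example call is taken from the installation steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the protocol's package).
# Prints each command that cannot be found in the path, one per line,
# and returns 1 if at least one command is missing.
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call:
# check_tools cellranger bcl2fastq preseq samtools
```

Any name printed by the example call still needs to be installed or added to the path.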

Run cellranger mkfastq to generate fastq files. Make sure that the following placeholders are set to the correct paths and desired names.
<FASTQ_OUT> is the name of the output folder
<RUN> is the path to the MiSeq run folder
<CSV> is the path to the sample-sheet.csv file

Command
This command generates cellranger mkfastq results. (linux)
cellranger mkfastq --id=<FASTQ_OUT> --run=<RUN> --sample-sheet=<CSV>

Run cellranger count. Make sure that the following placeholders are set to the correct paths and desired names.
<ID> is the name of the output folder for the sample
<SAMPLE> is the sample name used for the sample in the sample-sheet.csv file
<REF> is the path to the cellranger reference data folder
<NUM> is the number of expected cells from the experiment


Command
This command generates cellranger count results. (linux)
cellranger count --id <ID> --fastqs <FASTQ_OUT> --sample <SAMPLE> --transcriptome <REF> --include-introns --expect-cells <NUM>
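
Because the <FASTQ_OUT> placeholder is shared between the two cellranger steps, it can help to fill in all placeholders in one place and review the resulting commands before running them. The sketch below only prints the two commands exactly as given above; the function name print_cellranger_cmds is hypothetical, and it assumes that none of the values contain whitespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the protocol's package).
# Prints the mkfastq and count commands from this protocol with the
# placeholders filled in. Nothing is executed.
# Arguments, in order: FASTQ_OUT RUN CSV ID SAMPLE REF NUM
# Assumption: no argument contains whitespace.
print_cellranger_cmds() {
    if [ "$#" -ne 7 ]; then
        echo "usage: print_cellranger_cmds FASTQ_OUT RUN CSV ID SAMPLE REF NUM" >&2
        return 1
    fi
    printf 'cellranger mkfastq --id=%s --run=%s --sample-sheet=%s\n' "$1" "$2" "$3"
    printf 'cellranger count --id %s --fastqs %s --sample %s --transcriptome %s --include-introns --expect-cells %s\n' \
        "$4" "$1" "$5" "$6" "$7"
}

# Example call with made-up values:
# print_cellranger_cmds fq_out /data/run1 sheet.csv s1_out S1 /refs/ref 5000
```

The printed lines can be checked by eye and then pasted into the terminal one at a time.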


Run the preseq script in the folder downloaded from this protocol. Make sure that the following placeholders are set to the correct paths and names.

<PATH_TO_FOLDER> is the path to the folder that was extracted from the tar.gz file.
<ID> is the name of the output folder for the sample generated with cellranger.
Command
This command generates the preseq results for 10X snRNA experiments. (linux)
<PATH_TO_FOLDER>/scripts/loop.preseq.r.sh <ID>
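
When several samples were processed with cellranger count, the script can be called once per sample. The loop below is a sketch under that assumption: the function name run_preseq_for_samples is hypothetical, and it assumes the script takes a single <ID> argument as shown above and is run from the directory that contains the <ID> folders.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the protocol's package).
# Runs <PATH_TO_FOLDER>/scripts/loop.preseq.r.sh once per sample ID.
# First argument: PATH_TO_FOLDER; remaining arguments: one or more IDs.
# Keeps going after a failure and returns 1 if any sample failed.
run_preseq_for_samples() {
    local folder="$1" id failed=0
    shift
    for id in "$@"; do
        if ! "$folder/scripts/loop.preseq.r.sh" "$id"; then
            echo "preseq step failed for $id" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Example call with made-up sample IDs:
# run_preseq_for_samples /path/to/extracted_folder sampleA sampleB
```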

View outputs.

<ID>.lc_extrap_log.txt contains the preseq statistics
<ID>.lc_extrap_output.png contains the sequencing saturation plots
<ID>/outs/web_summary.html contains the cellranger analyses
<ID>/outs/metrics_summary.csv contains the quality statistics generated by cellranger
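
Before opening the outputs, it may be worth confirming that each expected file was written and is not empty. The helper below is a sketch; the name report_missing_files is hypothetical, and the paths in the example call assume that the preseq outputs are written to the current working directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the protocol's package).
# Prints a message for every given path that does not exist or is empty,
# and returns 1 if at least one path fails the check.
report_missing_files() {
    local path status=0
    for path in "$@"; do
        if [ ! -s "$path" ]; then
            echo "missing or empty: $path"
            status=1
        fi
    done
    return "$status"
}

# Example call, assuming ID holds the sample's output folder name:
# report_missing_files "$ID.lc_extrap_log.txt" "$ID.lc_extrap_output.png" "$ID/outs/web_summary.html"
```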