Apr 27, 2020

PrimalSeq: Generation of tiled virus amplicons for MiSeq sequencing

  • 1Scripps Research, Department of Immunology and Microbiology;
  • 2Department of Epidemiology of Microbial Diseases, Yale School of Public Health;
  • 3University of Birmingham, Institute of Microbiology and Infection
Protocol Citation: Nate Matteson, Nathan D Grubaugh, Karthik Gangavarapu, Josh Quick, Nick Loman, Kristian Andersen 2020. PrimalSeq: Generation of tiled virus amplicons for MiSeq sequencing. protocols.io https://dx.doi.org/10.17504/protocols.io.bez7jf9n
Manuscript citation:
Grubaugh, ND. et al. An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biology 20,8 (2019) https://genomebiology.biomedcentral.com/articles/10.1186/s13059-018-1618-7
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: Working
We use this protocol readily in our lab. It is successful for a wide range of virus samples.
Created: April 13, 2020
Last Modified: April 27, 2020
Protocol Integer ID: 35615
Abstract
Generated in collaboration by the Loman, Andersen, and Grubaugh labs.

For general use of the protocol and primer design, please cite:
Quick, J. et al. Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples. Nature Protocols 12 (6), 1261-1276 (2017) https://www.nature.com/nprot/journal/v12/n6/abs/nprot.2017.066.html

For measuring intrahost virus genetic diversity and calling variants using iVar, please cite:
Grubaugh, ND. et al. An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biology 20,8 (2019) https://genomebiology.biomedcentral.com/articles/10.1186/s13059-018-1618-7

The general approach to this protocol is to amplify the virus genome in small (~400 bp) overlapping fragments using two highly multiplexed PCR reactions (where the overlapping segments are in separate reactions). The amplicons are combined after PCR and are the correct size for library preparation and paired-end 250 nt sequencing using the Illumina MiSeq.
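The two-pool layout can be sketched in a few lines of Python. This is an illustration only: the amplicon names and coordinates are hypothetical, and real schemes should come from the primer spreadsheet or Primal Scheme. Alternating neighbouring amplicons between pools keeps overlapping segments in separate reactions.

```python
def assign_pools(amplicons):
    """Alternate amplicons (sorted by start position) between pool 1 and pool 2.

    amplicons: list of (name, start, end) tuples, 1-based inclusive coordinates.
    """
    ordered = sorted(amplicons, key=lambda amplicon: amplicon[1])
    pools = {1: [], 2: []}
    for index, amplicon in enumerate(ordered):
        pools[1 if index % 2 == 0 else 2].append(amplicon)
    return pools


def overlaps_within_pool(pool):
    """Return True if any two neighbouring amplicons in the same pool overlap."""
    return any(prev[2] >= nxt[1] for prev, nxt in zip(pool, pool[1:]))


# Hypothetical scheme: 400 bp amplicons starting every 300 bp (100 bp overlap).
scheme = [(f"amp_{i + 1}", 1 + 300 * i, 400 + 300 * i) for i in range(6)]
pools = assign_pools(scheme)
for pool_id, members in pools.items():
    names = [name for name, _, _ in members]
    print(pool_id, names, "overlap:", overlaps_within_pool(members))
```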

Version 4 notes: This protocol has been updated to include considerations for measuring intra-batch contamination through the introduction of sample-specific barcoded spike-ins.

All of the primers are now listed in a separate spreadsheet. We currently have 400 bp amplicon schemes for Zika virus, West Nile virus (North America lineage I genotype), Usutu virus, and chikungunya virus (ECSA genotype). More will be made available soon. You can build your own primer sets by using Primal Scheme.

Overview of tiled virus amplicon sequencing protocol


Guidelines
Considerations for measuring intrahost genetic diversity using this amplicon-based protocol:
  1. Requires at least 1000 virus RNA copies going into cDNA synthesis. More is better. Try to normalize virus RNA copies between samples to make comparisons easier.
  2. Process each RNA sample twice through the protocol to sequence as technical replicates. Calling only variants present in both replicates reduces the number of false positives (mainly from sequencing errors) and increases the accuracy of variant frequency measurements.
  3. Obtain at least 400x coverage of each nucleotide position. Because amplicons differ in amplification efficiency, this typically means that ~1M 250 nt paired-end reads are needed. Amplification from high input virus concentrations (>10,000 virus RNA copies) is more even and requires fewer total reads.
  4. During our validation process, the lowest intrahost variant frequency that we could accurately and consistently measure was 3%. Measuring lower than this requires additional input copies, coverage depth, and validation.
  5. Beware of intrahost virus variants that exist within primer binding sites as they can decrease the amplification efficiency of that particular virus haplotype. Because the primer sites are trimmed and are covered by an overlapping amplicon, the variants within the primer sites can be accurately measured. The measured frequencies of all other variants within an amplicon affected by a primer mismatch, however, can be significantly altered. This is the major limitation with any PCR protocol for virus population diversity analysis.
  6. Use our data pipeline, iVar (intrahost variant analysis from replicates) to process and analyze the data. It will align to the reference (or call a consensus), trim primers, call variants, compare variants between replicates, and flag variants within primer sites.
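As an illustration of how the replicate, coverage, and frequency considerations combine, here is a minimal Python sketch. It is not iVar's implementation, which should be used for real data. The dictionary layout, function name, and example variants are assumptions made for this example; the 3% and 400x thresholds come from the guidelines above.

```python
MIN_FREQUENCY = 0.03  # lowest frequency measured accurately during validation
MIN_DEPTH = 400       # minimum coverage per nucleotide position


def concordant_variants(rep1, rep2, min_freq=MIN_FREQUENCY, min_depth=MIN_DEPTH):
    """Keep variants that pass both thresholds in both technical replicates.

    rep1, rep2: dict mapping (position, ref, alt) -> (frequency, depth).
    Returns a dict mapping each kept variant to its mean frequency.
    """
    kept = {}
    for variant in rep1.keys() & rep2.keys():
        (freq1, depth1), (freq2, depth2) = rep1[variant], rep2[variant]
        if min(freq1, freq2) >= min_freq and min(depth1, depth2) >= min_depth:
            kept[variant] = (freq1 + freq2) / 2
    return kept


# Hypothetical variant calls from two replicates of one sample.
replicate_1 = {
    (1045, "A", "G"): (0.12, 850),
    (2210, "C", "T"): (0.04, 900),  # absent from replicate 2
    (3301, "G", "A"): (0.05, 350),  # depth below 400x
}
replicate_2 = {
    (1045, "A", "G"): (0.10, 900),
    (3301, "G", "A"): (0.06, 500),
}
print(concordant_variants(replicate_1, replicate_2))
```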

Considerations for estimating intra-batch contamination using this amplicon-based protocol:
  1. Requires a suitable amount of barcode reads. We aim to get 1% barcode reads for each sample which, given our recommendation of ~1M reads per sample, amounts to ~10,000 barcode reads. Less than this and small amounts of spillover have exaggerated effects. For RNA extracted from mosquito pools, we find an addition of 10 fg of spike-in to be suitable for most samples.
  2. Limit of detection at 1% barcode reads was found to be ~0.05% contaminating reads if spiked-in barcode transcripts have completely different barcode regions and ~0.1% if barcode transcripts share one barcode region.
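The arithmetic behind these two considerations can be sketched as follows. This is a simplified illustration, not the method used in the example pipeline: it assumes barcode reads have already been counted per forward/reverse barcode pair, and it expresses contamination as the share of barcode reads carrying an unexpected pair. The barcode labels and counts are hypothetical.

```python
def barcode_read_fraction(barcode_reads, total_reads):
    """Fraction of all reads in a sample that map to the barcode spike-in."""
    return barcode_reads / total_reads


def contamination_fraction(barcode_counts, expected_pair):
    """Fraction of barcode reads that do not carry the expected barcode pair.

    barcode_counts: dict mapping (forward_barcode, reverse_barcode) -> read count.
    """
    total = sum(barcode_counts.values())
    if total == 0:
        raise ValueError("no barcode reads found for this sample")
    foreign = total - barcode_counts.get(expected_pair, 0)
    return foreign / total


# ~10,000 barcode reads out of ~1M reads gives the 1% target.
print(barcode_read_fraction(10_000, 1_000_000))

# Hypothetical counts: the sample was spiked with pair (F1, R1).
counts = {("F1", "R1"): 9_990, ("F2", "R3"): 10}
print(contamination_fraction(counts, ("F1", "R1")))
```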
Materials
  • Q5 High-Fidelity 2X Master Mix - 100 rxns, New England Biolabs, Catalog #M0492S
  • Agencourt AMPure XP, Beckman Coulter, Catalog #A63880
  • KAPA Hyper Prep Kits (24 rxns with amplification), Kapa Biosystems, Catalog #07962347001
  • Agilent High Sensitivity DNA Kit, Agilent Technologies, Catalog #5067-4626
  • Qubit 1X dsDNA High Sensitivity Assay Kit, Thermo Fisher Scientific, Catalog #Q33230
  • KAPA Library Quantification Kit for Illumina® Platforms, Kapa Biosystems, Catalog #KK4835
  • SuperScript™ IV VILO™ Master Mix, Thermo Fisher, Catalog #11756500
  • Custom primers
  • BIOO Scientific NEXTflex Dual-indexed DNA Barcodes
Generation of barcode spike-ins
Barcode spike-ins are generated with a step-out PCR using a common insert and variable forward and reverse primers. The insert we use and the structures of the variable primers are shown below. The x's mark the barcode region. We generated 8 forward barcodes and 12 reverse barcodes, each 20 bp long and with an edit distance of at least 10 nucleotides from all other generated barcodes. By mixing and matching primers, a total of 96 unique barcode transcripts can be generated.
> Zea mays alcohol dehydrogenase 1 (adh1), mRNA (XM_008650471.2)
AGGGTCTCGGAGTGGATTGATTTGGGATTCTGTTCGAAGATTTGCGGAGGGGGGCAATGGCGACCGCGGGGAAGGTGATCAAGTGCAAAGCTGCGGTGGCATGGGAGTCCATGCAAGCCACTGTCGATCGAGGAAGTGGAGGTAGCGCCTTCGCAGGCCATGGAGGTGCGCGTCAAGATCCTCTTCACCTCGCTCTGCCACACCGACGTCTACTTCTGGGAGGCCAAGGGGCAGACTCCCGTGTTCCCTCGGATCTTTGGCCACGAGGCTGGAGGTATCATAGAGAGTGTTGGAGAGGGTGTGACTGACGTAGCTC

> Barcode transcript step-out forward primer (structure)
AATGTCGCAGGCACTTGTCCxxxxxxxxxxxxxxxxxxxxAGGGTCTCGGAGTGGATTGA

> Barcode transcript step-out reverse primer (structure)
GGTCAGAGCTGTCTCCTGCTxxxxxxxxxxxxxxxxxxxxGAGCTACGTCAGTCACACCC
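A short Python sketch of how a barcode set could be checked against the edit-distance criterion and assembled into step-out primers. The flanking sequences are copied from the primer structures above; the two example barcodes are hypothetical placeholders, not the barcodes used in this protocol, and Levenshtein distance is assumed as the meaning of "edit distance".

```python
from itertools import combinations, product

FWD_PREFIX, FWD_SUFFIX = "AATGTCGCAGGCACTTGTCC", "AGGGTCTCGGAGTGGATTGA"
REV_PREFIX, REV_SUFFIX = "GGTCAGAGCTGTCTCCTGCT", "GAGCTACGTCAGTCACACCC"
BARCODE_LENGTH = 20


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                        # deletion
                               current[j - 1] + 1,                     # insertion
                               previous[j - 1] + (char_a != char_b)))  # substitution
        previous = current
    return previous[-1]


def min_pairwise_distance(barcodes):
    """Smallest edit distance between any two barcodes in the set."""
    return min(edit_distance(a, b) for a, b in combinations(barcodes, 2))


def step_out_primer(prefix, barcode, suffix):
    """Insert a 20 nt barcode between the common prefix and the insert-specific suffix."""
    if len(barcode) != BARCODE_LENGTH:
        raise ValueError("barcode must be 20 nt long")
    return prefix + barcode + suffix


# Hypothetical barcodes, for illustration only.
example_barcodes = ["ACGTACGTACGTACGTACGT", "TTGCAAGGCCTTAACCGGTA"]
print(min_pairwise_distance(example_barcodes) >= 10)
print(step_out_primer(FWD_PREFIX, example_barcodes[0], FWD_SUFFIX))

# 8 forward x 12 reverse primers give 96 unique forward/reverse combinations.
print(len(list(product(range(8), range(12)))))
```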


Prepare a PCR reaction for each combination of forward and reverse primers. A master mix can be created by combining all components except the forward and reverse primers.

| Component | Volume in 25 µL reaction | Final concentration |
| Q5 Reaction Buffer (5x) | 5 µL | 1x |
| dNTPs (10 mM) | 0.5 µL | 200 µM |
| Forward Primer (10 µM) | 1 µL | 400 nM |
| Reverse Primer (10 µM) | 1 µL | 400 nM |
| Q5 Polymerase | 0.25 µL | 0.02 U/µL |
| Water | 16.25 µL | - |
| Template DNA (adh1; 0.1 ng/µL) | 1 µL | <1000 ng |
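The master mix for a full set of 96 primer combinations can be scaled with a small helper. This is a sketch under stated assumptions: the 10% pipetting overage is not part of the protocol, and the function names are illustrative. The listed component volumes sum to 25 µL per reaction, which is also the volume at which the stated final concentrations work out (for example, 1 µL of 10 µM primer in 25 µL gives 400 nM).

```python
# Per-reaction volumes (µL) of everything except the forward and reverse primers.
PER_REACTION_UL = {
    "Q5 Reaction Buffer (5x)": 5.0,
    "dNTPs (10 mM)": 0.5,
    "Q5 Polymerase": 0.25,
    "Water": 16.25,
    "Template DNA (adh1; 0.1 ng/µL)": 1.0,
}
PRIMER_UL = 1.0  # added separately: 1 µL forward and 1 µL reverse per reaction


def master_mix(n_reactions, overage=0.10):
    """Scale the primer-free components for n reactions plus pipetting overage."""
    scale = n_reactions * (1 + overage)
    return {name: round(volume * scale, 2) for name, volume in PER_REACTION_UL.items()}


def final_concentration(stock, volume_ul, total_ul):
    """C1 * V1 / V2, in the same units as the stock concentration."""
    return stock * volume_ul / total_ul


aliquot_ul = sum(PER_REACTION_UL.values())   # master mix per tube
total_ul = aliquot_ul + 2 * PRIMER_UL        # full reaction volume
print(aliquot_ul, total_ul)
print(final_concentration(10.0, PRIMER_UL, total_ul), "µM primer")
print(master_mix(96))
```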

Run the following cycles on a thermocycler:
| Cycles | Temperature | Time |
| 1 | 98°C | 30 seconds |
| 10 | 98°C | 10 seconds |
|    | 68°C | 10 seconds |
|    | 72°C | 20 minutes |
| 1 | 72°C | 2 minutes |
| 1 | 4°C | hold |

Proceed immediately to cleanup
Post PCR cleanup
Allow Mag-Bind TotalPure NGS beads to equilibrate to room temperature, then vortex until homogeneous.
Bring PCR product volume up to 25 µL with water (if not at volume already).
Add 50 µL of beads to 25 µL of PCR product, mix well, and incubate at room temperature for 10 minutes.
Place tubes on a magnetic stand and incubate until solution appears clear.
Discard supernatant without disturbing the beads.
While tubes are on the magnet, add 200 µL of 80% EtOH, incubate for 30 seconds, and discard the EtOH wash.
Repeat previous 80% EtOH wash and remove as much EtOH as possible.
Leave tubes on magnet and air dry for 5 minutes.
Remove tubes from magnet and add 20 µL of nuclease-free water. Mix well by pipetting.
Place tubes on magnet stand. When solution appears clear, remove supernatant without disturbing the beads and place into new tubes.
Quantify the DNA concentration using the Qubit High Sensitivity DNA kit (or equivalent) from 1 µL of each product. Expected range = 10-100 ng/µL DNA. Sequencing from lower concentrations may still work.
Note: If your lab has a KingFisher, you can download our automated protocols here: https://github.com/grubaughlab/Kingfisher_protocols (use ‘purification.bdz’ for this step)
Preparation of cDNA
Isolate viral RNA using Omega Viral DNA/RNA kit, Trizol, or equivalent.
Many different cDNA synthesis kits can be used, but choose one that is relatively high-fidelity. The current protocol uses SuperScript IV VILO Master Mix because the enzyme has a low error rate and the protocol is fast and easy.
Dilute a working stock of barcoded spike-ins 1:100,000 to obtain a concentration of 38 fM. Select a unique spike-in for each sample. Try not to repeat spike-ins from recent runs.
Note: Be careful not to cross-contaminate the spike-ins: centrifuge all liquid from the caps and open only one spike-in at a time.

| Component | Volume in 20 µL reaction |
| SSIV VILO Master Mix | 4 µL |
| Nuclease-free water | 5-14 µL |
| Barcoded Spike-in (38 fM) | 1 µL |
| Virus RNA | 1-10 µL |
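For a sense of scale, the spike-in input and the water volume can be worked out as below. This is only a sketch: the function names are illustrative, and the copy number simply follows from the 38 fM concentration and 1 µL volume stated above.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

REACTION_UL = 20.0
MASTER_MIX_UL = 4.0
SPIKE_IN_UL = 1.0


def molecules(concentration_fM, volume_uL):
    """Number of molecules in a given volume at a femtomolar concentration."""
    moles = concentration_fM * 1e-15 * volume_uL * 1e-6
    return moles * AVOGADRO


def water_volume(rna_uL):
    """Nuclease-free water needed to bring the cDNA reaction to 20 µL."""
    if not 1.0 <= rna_uL <= 10.0:
        raise ValueError("virus RNA volume should be between 1 and 10 µL")
    return REACTION_UL - MASTER_MIX_UL - SPIKE_IN_UL - rna_uL


# 1 µL of 38 fM spike-in is roughly 2.3e4 transcript copies per reaction.
print(f"{molecules(38, SPIKE_IN_UL):.3g} spike-in copies")
# A 1:100,000 dilution reaching 38 fM implies a 3.8 nM working stock.
print(38e-15 * 100_000 * 1e9, "nM stock")
print(water_volume(1.0), water_volume(10.0))
```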

Run the following cycles on a thermocycler:
| Temperature | Time |
| 25°C | 10 minutes |
| 50°C | 10 minutes |
| 85°C | 5 minutes |
| 4°C | hold |

Store samples at 4°C (for use same day) or -20°C (for use within a week) until ready for PCR.
Pause
PCR generation of tiled amplicons
Validated primer schemes can be found here. Prepare two primer pools by mixing equal volumes of each 10 µM primer. Primers marked “*” should be pooled at a concentration of 50 µM and primers marked “**” at a concentration of 100 µM to help normalize sequencing coverage. The sequences of the primers that amplify the barcoded transcript are shown below:

> Contamination Primer Forward
AATGTCGCAGGCACTTGTCC

> Contamination Primer Reverse
GGTCAGAGCTGTCTCCTGCT

Note: The concentration of the contamination primers should be identical to the concentration of the individual primer pairs in the pool. The concentration listed below is for our West Nile virus protocol and will vary for other schemes.

Prepare two PCR reactions for each sample (one for each primer pool):
| Component | Volume in 25 µL reaction |
| Q5 2x Master Mix | 12.5 µL |
| Primer pool (#1 or #2) | 1 µL |
| Contamination Primer Forward (0.26 µM) | 1 µL |
| Contamination Primer Reverse (0.26 µM) | 1 µL |
| Nuclease-free water | 8.5 µL |
| cDNA | 1 µL |

Run the following cycles on a thermocycler:

| Cycles | Temperature | Time |
| 1 | 98°C | 30 seconds |
| 35 | 95°C | 15 seconds |
|    | 65°C | 5 minutes |
| 1 | 4°C | hold |

(Optional) Run 5 µL of each product on a 1% agarose gel. Each should produce a visible 400 bp band.
Post PCR cleanup
Allow Mag-Bind TotalPure NGS beads to equilibrate to room temperature, then vortex until homogeneous.
Bring PCR product volume up to 25 µL with water (if not at volume already).
Add 45 µL of beads to 25 µL of PCR product, mix well, and incubate at room temperature for 10 minutes.
Place tubes on a magnetic stand and incubate until solution appears clear.
Discard supernatant without disturbing the beads.
While tubes are on the magnet, add 200 µL of 80% EtOH, incubate for 30 seconds, and discard the EtOH wash.
Repeat previous 80% EtOH wash and remove as much EtOH as possible.
Leave tubes on magnet and air dry for 5 minutes.
Remove tubes from magnet and add 20 µL of nuclease-free water. Mix well by pipetting.
Place tubes on magnet stand. When solution appears clear, remove supernatant without disturbing the beads and place into new tubes.
Quantify the DNA concentration using the Qubit High Sensitivity DNA kit (or equivalent) from 1 µL of each product. Expected range = 10-100 ng/µL DNA. Sequencing from lower concentrations may still work.
Note: If your lab has a KingFisher, you can download our automated protocols here: https://github.com/grubaughlab/Kingfisher_protocols (use ‘purification.bdz’ for this step)
Pause
End-repair and A-tailing
Combine 25-50 ng of PCR-amplified DNA from primer pool 1 and 2 together for a total of 50-100 ng in 12.5 µL (equal concentrations of each amplicon pool). QS to a total volume of 12.5 µL using nuclease-free water.
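The volumes for this step follow directly from the Qubit readings. Below is a minimal sketch, assuming equal mass from each pool; the function name and the example concentrations are hypothetical.

```python
def combine_pools(conc_pool1, conc_pool2, ng_each=25.0, final_volume_ul=12.5):
    """Volumes (µL) of pool 1, pool 2, and water for an equal-mass mix.

    conc_pool1, conc_pool2: cleaned PCR product concentrations in ng/µL.
    ng_each: mass taken from each pool (25-50 ng per the protocol).
    """
    volume_1 = ng_each / conc_pool1
    volume_2 = ng_each / conc_pool2
    water = final_volume_ul - volume_1 - volume_2
    if water < 0:
        raise ValueError("products too dilute to fit the requested mass in 12.5 µL")
    return volume_1, volume_2, water


# Hypothetical Qubit readings of 20 and 12.5 ng/µL, taking 25 ng from each pool.
pool1_ul, pool2_ul, water_ul = combine_pools(20.0, 12.5)
print(f"pool 1: {pool1_ul:.2f} µL, pool 2: {pool2_ul:.2f} µL, water: {water_ul:.2f} µL")
```

If the products are too dilute for the requested mass to fit in 12.5 µL, the helper raises an error rather than returning a negative water volume.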

Alternatively: proceed using 50-100 ng of primer pool product separately for library preparation. This allows for additional monitoring of cross-contamination. Data can be merged computationally post sequencing.
Combine the following components from the Kapa Hyper prep kit for end repair:
| Component | Volume in 15 µL reaction |
| End Repair & A-tailing buffer | 1.75 µL |
| End Repair & A-tailing enzyme mix | 0.75 µL |
| PCR-amplified DNA (50 ng) | 12.5 µL |

Run the following cycles on a thermocycler:

| Temperature | Time |
| 20°C | 30 minutes |
| 65°C | 30 minutes |
| 4°C | hold |

Adapter ligation
Dilute a working stock of NEXTflex Dual-Indexed DNA Barcodes 1:100 to obtain a concentration of 250 nM. Select unique barcodes for each sample. Try not to repeat barcodes from recent runs.

Note: Be careful to not cross-contaminate the adaptors by centrifuging all liquid from the caps and only opening one index at a time.

Combine the following components:
| Component | Volume in 27.5 µL reaction |
| Ligation buffer | 7.5 µL |
| DNA ligase | 2.5 µL |
| NEXTflex DNA Barcodes (250 nM) | 2.5 µL |
| End repair reaction product | 15 µL |

Incubate at 20°C for 15 minutes
Proceed immediately to cleanup
Post ligation cleanup
Allow Mag-Bind TotalPure NGS beads to equilibrate to room temperature, then vortex until homogeneous.
Add 22 µL of beads to 27.5 µL of ligation product, mix well, and incubate at room temperature for 10 minutes.
Place tubes on a magnetic stand and incubate until solution appears clear.
Discard supernatant without disturbing the beads.
While tubes are on the magnet, add 200 µL of 80% EtOH, incubate for 30 seconds, and discard the EtOH wash.
Repeat previous 80% EtOH wash and remove as much EtOH as possible.
Leave tubes on magnet and air dry for 5 minutes.
Remove tubes from magnet and add 20 µL of nuclease-free water. Mix well by pipetting.
Place tubes on magnet stand. When solution appears clear, remove supernatant without disturbing the beads and place into new tubes - 15 µL will go into library amplification.
Note: If your lab has a KingFisher, you can download our automated protocols here: https://github.com/grubaughlab/Kingfisher_protocols (use ‘purification.bdz’ for this step)
Library amplification
Combine the following components:
| Component | Volume in 34 µL reaction |
| 2X KAPA HiFi HotStart ReadyMix | 17 µL |
| Illumina primer mix | 2 µL |
| Adaptor-ligated library | 15 µL |

Run the following cycles on a thermocycler:

| Cycles | Temperature | Time |
| 1 | 98°C | 45 seconds |
| 12 | 98°C | 15 seconds |
|    | 60°C | 30 seconds |
|    | 72°C | 30 seconds |
| 1 | 72°C | 1 minute |
| 1 | 4°C | hold |


Proceed immediately to cleanup or store at 4°C.
Pause
Post amplification cleanup
Allow Mag-Bind TotalPure NGS beads to equilibrate to room temperature, then vortex until homogeneous.
Add 27.2 µL of beads to 34 µL of amplified product, mix well, and incubate at RT for 10 minutes.
Place tubes on a magnetic stand and incubate until solution appears clear.
Discard supernatant without disturbing the beads.
While tubes are on the magnet, add 200 µL of 80% EtOH, incubate for 30 seconds, and discard the EtOH wash.
Repeat previous 80% EtOH wash and remove as much EtOH as possible.
Leave tubes on magnet and air dry for 5 minutes.
Remove tubes from magnet and add 25 µL of Tris-EDTA or elution buffer. Mix well by pipetting.
Place tubes on magnet stand. When solution appears clear, remove supernatant without disturbing the beads and place into new tubes.
Note: If your lab has a KingFisher, you can download our automated protocols here: https://github.com/grubaughlab/Kingfisher_protocols (use ‘purification.bdz’ for this step)
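The four bead cleanups in this protocol differ mainly in their bead-to-sample ratio. The sketch below derives those ratios from the stated volumes and shows how a bead volume could be recomputed if the sample volume changes. The step labels and function names are illustrative only.

```python
# (bead volume, sample volume) in µL, as stated in each cleanup section.
CLEANUPS = {
    "barcode spike-in PCR": (50.0, 25.0),
    "tiled amplicon PCR": (45.0, 25.0),
    "adapter ligation": (22.0, 27.5),
    "library amplification": (27.2, 34.0),
}


def bead_ratio(bead_ul, sample_ul):
    """Bead-to-sample volume ratio, rounded to two decimals."""
    return round(bead_ul / sample_ul, 2)


def bead_volume(sample_ul, ratio):
    """Bead volume (µL) needed for a sample volume at a given ratio."""
    return round(sample_ul * ratio, 1)


for step, (bead_ul, sample_ul) in CLEANUPS.items():
    print(f"{step}: {bead_ratio(bead_ul, sample_ul)}x")

# Recompute the bead volume for a 34 µL amplified library at 0.8x.
print(bead_volume(34.0, 0.8))
```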
Library quantification and pooling
Quantify the DNA concentration of each sample (1 µL) using the Qubit High Sensitivity DNA kit.
Pool equal concentrations (e.g., 1-10 ng) of each library for sequencing.
Check the DNA fragment distribution of the pooled sample using the BioAnalyzer DNA 1000 kit. The peak fragment size from 400 bp tiled amplicons with properly ligated adaptors should be ~580 bp. If ~180 bp peaks (adaptor dimers) are still present, repeat the post amplification cleanup.
Quantify the DNA concentration of the pooled library (1 µL) using the Qubit High Sensitivity DNA kit.
Note: At least 0.76 ng/µL is required to achieve 2 nM for library pooling. Libraries will need to be concentrated or re-amplified if less than this amount.
Convert DNA libraries from weight to moles:
Molarity [nM] = Library concentration [ng/µL] / ((ave. library size [bp] x 650) / 1,000,000)
Example: if the ave. size of the library is 580 bp and the concentration is 2.5 ng/µL:

(580 x 650) / 1,000,000 = 0.377
2.5 / 0.377 = 6.6 nM

Dilute the pooled library to 2 nM in 10 mM TE.
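The conversion and the dilution to 2 nM can be scripted as below. This is a sketch: the function names and the 20 µL final volume are assumptions for illustration, and 650 g/mol per base pair is the average mass used in the formula above.

```python
GRAMS_PER_MOL_PER_BP = 650  # average mass of one base pair of dsDNA


def library_molarity_nM(conc_ng_per_ul, mean_size_bp):
    """Convert a dsDNA library concentration from ng/µL to nM."""
    return conc_ng_per_ul / (mean_size_bp * GRAMS_PER_MOL_PER_BP / 1_000_000)


def dilution_volumes(stock_nM, target_nM=2.0, final_volume_ul=20.0):
    """Volumes (µL) of library and 10 mM TE needed to reach the target molarity."""
    if stock_nM < target_nM:
        raise ValueError("library is below the target; concentrate or re-amplify")
    library_ul = final_volume_ul * target_nM / stock_nM
    return library_ul, final_volume_ul - library_ul


molarity = library_molarity_nM(2.5, 580)  # a 580 bp library at 2.5 ng/µL
library_ul, te_ul = dilution_volumes(molarity)
print(f"{molarity:.1f} nM; mix {library_ul:.2f} µL library with {te_ul:.2f} µL TE")

# Lowest concentration of a 580 bp library that still reaches 2 nM (ng/µL).
print(2.0 * 580 * GRAMS_PER_MOL_PER_BP / 1_000_000)
```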
(Optional) Confirm the library molar concentration using the Kapa Library Quantification kit.
If sending your sample to a genomics core (i.e., not loading the MiSeq yourself), stop here.
Diluting the pooled library for sequencing
Combine 10 µL of the 2 nM pooled library with 10 µL of 0.1 N NaOH and mix. Incubate for 5 minutes at room temperature to denature the dsDNA.
Add 980 µL of HT1 (comes with the MiSeq kits). New concentration = 20 pM.
Dilute to the desired concentration using the following volumes.

| Concentration | 10 pM | 12 pM | 14 pM | 16 pM |
| 20 pM Library | 295 µL | 355 µL | 415 µL | 475 µL |
| Prechilled HT1 | 300 µL | 240 µL | 180 µL | 120 µL |
| PhiX control * | 5 µL | 5 µL | 5 µL | 5 µL |

*PhiX control should also be denatured and diluted to 20 pM.
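The loading volumes follow from a simple dilution of the 20 pM stock. The sketch below assumes a 600 µL final volume and counts the 5 µL of 20 pM PhiX toward the loaded DNA; both are inferred from the volumes in the table rather than stated in the protocol. Under those assumptions the library volume rises by 60 µL for every 2 pM (295, 355, 415, 475 µL).

```python
def loading_volumes(target_pM, stock_pM=20.0, total_ul=600.0, phix_ul=5.0):
    """Return (library µL, HT1 µL) for a target loading concentration.

    Library and PhiX are both at stock_pM, so together they must fill
    total_ul * target_pM / stock_pM; HT1 makes up the remainder.
    """
    if not 0 < target_pM <= stock_pM:
        raise ValueError("target must be above 0 and no higher than the stock")
    dna_ul = total_ul * target_pM / stock_pM
    return dna_ul - phix_ul, total_ul - dna_ul


for target in (10, 12, 14, 16):
    library_ul, ht1_ul = loading_volumes(target)
    print(f"{target} pM: {library_ul:.0f} µL library, {ht1_ul:.0f} µL HT1, 5 µL PhiX")
```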
Note: Loading too high a concentration on a MiSeq leads to over-clustering and decreased quality, which may make the data unusable. Loading too low a concentration leads to under-clustering and may not generate enough data for sufficient sequencing coverage. In our hands, optimal cluster densities were reached using 10-12 pM with the MiSeq v2 kits and 14-16 pM with the MiSeq v3 kits. Loading concentrations should be empirically determined by each lab.
Follow the loading instructions in the MiSeq user guides.
Data processing and analysis
Use iVar, following instructions on: www.github.com/andersen-lab/ivar.

An example pipeline for generating consensus sequences and utilizing barcode transcripts to estimate contamination can be found on: www.github.com/watronfire/PrimalSeq_Pipeline