Dec 04, 2024

Experimental validation for transcoding scheme named R+ in DNA data storage

Peer-reviewed method
  • Deruilin Liu1,2,
  • Demin Xu2,
  • Liuxin Shi2,
  • Jianyuan Zhang3,
  • Kewei Bi4,
  • Bei Luo5,
  • Chen Liu5,
  • Yuxiang Li6,
  • Guangyi Fan2,7,
  • Wen Wang2,8,
  • Zhi Ping2,9
  • 1College of Life Sciences, University of Chinese Academy of Sciences;
  • 2BGI research, Shenzhen;
  • 3BGI research, Hangzhou;
  • 4BGI research, Changzhou;
  • 5Wuhan BGI Technology Service Co.,Ltd.;
  • 6HIM-BGI Omics Center, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences;
  • 7BGI research, Qingdao;
  • 8BGI research, Beijing;
  • 9School of Medicine, The Chinese University of Hong Kong, Shenzhen
  • GigaScience Press
Protocol Citation: Deruilin Liu, Demin Xu, Liuxin Shi, Jianyuan Zhang, Kewei Bi, Bei Luo, Chen Liu, Yuxiang Li, Guangyi Fan, Wen Wang, Zhi Ping 2024. Experimental validation for transcoding scheme named R+ in DNA data storage. protocols.io https://dx.doi.org/10.17504/protocols.io.q26g7mr78gwz/v1
Manuscript citation:
Liu D, Xu D, Shi L, Zhang J, Bi K, Luo B, Liu C, Li Y, Fan G, Wang W, Ping Z (2025) A practical DNA data storage using an expanded alphabet introducing 5-methylcytosine. GigaByte 2025(). doi: 10.46471/gigabyte.147
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: Working
We use this protocol and it's working
Created: November 05, 2024
Last Modified: December 04, 2024
Protocol Integer ID: 111579
Keywords: DNA storage, Sequence assembly
Funders Acknowledgements:
National Key Research and Development Program of China
Grant ID: 2021YFF1200204
National Natural Science Foundation of China
Grant ID: 32101182, 32201175
Shenzhen Science, Technology and Innovation Commission
Grant ID: SGDX20220530110802015
Abstract
We have developed a universal transcoding scheme named R+ for expanded molecular alphabets and demonstrated, in vitro, the experimental validation of DNA data storage containing 5-methylcytosine (5mC), from data writing to data reading.
The experimental validation mainly included assembly of 5mC fragments and nanopore sequencing. For more information see the accompanying paper and the following protocol.
Assembly of 5mC fragments.
Use T4 Polynucleotide Kinase (NEB, CAT#: M0202L) to phosphorylate the 108-nt oligonucleotides, then anneal them to form double-stranded DNA with 8-nt sticky ends.

Phosphorylation and annealing are performed using the following program:
Temperature: 37 °C, Duration: 02:00:00
Temperature: 95 °C, Duration: 00:05:00
Temperature: 70 °C, Duration: 00:30:00
Temperature: 55 °C, Duration: 00:05:00
Temperature: 37 °C, Duration: 00:05:00
Temperature: 12 °C, hold

2h 45m
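
As a quick check of the step time, the program above can be written down as data and summed. This is a minimal Python sketch, not part of the protocol: the names PROGRAM, total_runtime and format_hm are illustrative, and the final 12 °C hold is assumed to contribute no time.

```python
from datetime import timedelta

# (temperature in deg C, duration) for each stage of the program above.
# The final 12 deg C stage is an open-ended hold, so it is entered as zero time
# (an assumption made for this sketch).
PROGRAM = [
    (37, timedelta(hours=2)),
    (95, timedelta(minutes=5)),
    (70, timedelta(minutes=30)),
    (55, timedelta(minutes=5)),
    (37, timedelta(minutes=5)),
    (12, timedelta(0)),
]


def total_runtime(program):
    """Sum the stage durations of a thermocycler program."""
    return sum((duration for _, duration in program), timedelta(0))


def format_hm(delta):
    """Format a timedelta as hours and minutes, e.g. '2h 45m'."""
    minutes = int(delta.total_seconds()) // 60
    return f"{minutes // 60}h {minutes % 60}m"


print(format_hm(total_runtime(PROGRAM)))  # 2h 45m
```

The result matches the 2h 45m listed for this step.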
Ligate three sets of five 108-bp fragments and four 108-bp fragments to form three 540-bp segments and 432-bp segments, respectively.

The T4 ligation is performed using the following program:
Temperature: 16 °C, Duration: 18:00:00
18h
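
The segment sizes in this step follow from the fragment length: five 108-bp units give 540 bp and four give 432 bp. The Python sketch below reproduces that arithmetic and lists the sizes of shorter intermediates, which may help when reading the gel in the next step. It assumes each fragment contributes its full 108 positions to the product, which is consistent with the sizes stated in the protocol; the function names are hypothetical.

```python
FRAGMENT_LENGTH = 108  # unit length in bp, from the protocol


def assembled_length(n_fragments, fragment_length=FRAGMENT_LENGTH):
    """Expected length of n fragments ligated end to end.

    Assumes every fragment contributes its full length to the product.
    """
    if n_fragments < 1:
        raise ValueError("need at least one fragment")
    return n_fragments * fragment_length


def ladder(max_fragments, fragment_length=FRAGMENT_LENGTH):
    """Sizes of all products from 1 up to max_fragments ligated units."""
    return [assembled_length(n, fragment_length)
            for n in range(1, max_fragments + 1)]


print(assembled_length(5))  # 540
print(assembled_length(4))  # 432
print(ladder(5))            # [108, 216, 324, 432, 540]
```

Bands below the full-length size would correspond to incomplete ligation products under this simple model.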
Gel electrophoresis.

Agarose gel concentration: 3% (mass/volume)
Run condition: 180 V, Duration: 01:00:00

The ligation products are then gel-purified before use. Gel purification is performed using the Freeze N Squeeze DNA Gel Extraction Spin Columns (BIO-RAD, CAT#: 4106139) and the NucleoSpin Gel and PCR Clean-up Kit (MACHEREY-NAGEL, CAT#: 740609.250).

1h
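
A mass/volume percentage means grams of agarose per 100 mL of buffer. The Python sketch below converts a gel percentage into a mass to weigh out. The 50 mL gel volume is a made-up example, since the protocol does not state a gel volume, and the function name is hypothetical.

```python
def agarose_mass_g(percent_w_v, gel_volume_ml):
    """Grams of agarose for a gel of the given % (w/v) and volume in mL."""
    if percent_w_v <= 0 or gel_volume_ml <= 0:
        raise ValueError("percentage and volume must be positive")
    return percent_w_v * gel_volume_ml / 100


EXAMPLE_VOLUME_ML = 50  # hypothetical gel volume, not from the protocol

for percent in (3, 1.5):  # the two gel concentrations used in this protocol
    mass = agarose_mass_g(percent, EXAMPLE_VOLUME_ML)
    print(f"{percent}% gel, {EXAMPLE_VOLUME_ML} mL: {mass:.2f} g agarose")
```

The same helper covers both the 3% gel used here and the 1.5% gel used after the second ligation.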
Use T4 DNA ligase (NEB, CAT#: M0202L) to further assemble the three 540-bp segments and the 432-bp segments, respectively.

T4 ligation is performed using the following program:
Temperature: 16 °C, Duration: 18:00:00

18h
Gel electrophoresis.

Agarose gel concentration: 1.5% (mass/volume)
Run condition: 180 V, Duration: 01:00:00

The ligation products are gel-purified before use. Gel purification is performed using the Freeze N Squeeze DNA Gel Extraction Spin Columns (BIO-RAD, CAT#: 4106139) and the NucleoSpin Gel and PCR Clean-up Kit (MACHEREY-NAGEL, CAT#: 740609.250).

1h
5mC fragments sequencing experiments on MinION.
Prepare a pooled barcoded library: give each sample its own barcode (up to 96 unique barcodes), then combine the samples and run them together on a MinION R10.4.1 flow cell.
Perform barcode ligation using NEB Blunt/TA Ligase Master Mix (catalogue number M0367) and NEBNext Quick Ligation Modules (catalogue number E6056), which contain all the NEB reagents needed for use with the Native Barcoding Kit.
Use EDTA to stop the barcode ligation reaction.

Final sequencing library mixture (total 75 µL):
eluted DNA: 12 µL
sequencing buffer: 37.5 µL
library beads: 25.5 µL
Load 75 µL of the library mix into the flow cell via the SpotON sample port after the flow cell has been flushed with priming mix via the priming port.
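
Before loading, it can be worth confirming that the component volumes add up to the stated 75 µL. The Python sketch below does that with the volumes listed for the final library mixture; the dictionary and function names are illustrative only.

```python
# Component volumes in microlitres, as listed for the final library mixture.
LIBRARY_MIX_UL = {
    "eluted DNA": 12.0,
    "sequencing buffer": 37.5,
    "library beads": 25.5,
}
EXPECTED_TOTAL_UL = 75.0  # loading volume stated in the protocol


def total_volume(mix):
    """Total volume of a mix given as {component: volume in uL}."""
    return sum(mix.values())


def check_mix(mix, expected_total, tolerance=1e-9):
    """Raise ValueError if the mix does not add up to the expected volume."""
    total = total_volume(mix)
    if abs(total - expected_total) > tolerance:
        raise ValueError(f"mix totals {total} uL, expected {expected_total} uL")
    return total


print(f"total = {check_mix(LIBRARY_MIX_UL, EXPECTED_TOTAL_UL)} uL")
```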
Record the raw data in FAST5 format using MinKNOW software (Oxford Nanopore Technologies).