① Plant materials and growth conditions described in MATERIALS
Agarie S, Kawaguchi A, Kodera A, Sunagawa H, Kojima H, Nose A, Nakahara T. 2009. Potential of the common ice plant, Mesembryanthemum crystallinum as a new high-functional food as evaluated by polyol accumulation. Plant Prod Sci. 12(1):37–46. doi:10.1626/pps.12.37.
Gamborg, OL., Miller, RA., Ojima, K. 1968. Nutrient requirements of suspension cultures of soybean root cells. Exp. Cell Res. 50(1): 151–158. https://doi.org/10.1016/0014-4827(68)90403-5
Winter K, Lüttge U, Winter E, Troughton JH. 1978. Seasonal shift from C3 photosynthesis to crassulacean acid metabolism in Mesembryanthemum crystallinum growing in its natural environment. Oecologia (Berl). 34:225–237. https://doi.org/10.1007/BF00345168
② Clean read preparation and genome size estimation
Hunt M, Kikuchi T, Sanders M, Newbold C, Berriman M, Otto TD. 2013. REAPR: a universal tool for genome assembly evaluation. Genome Biol. 14(5):R47. doi:10.1186/gb-2013-14-5-r47.
Liu Y, Schröder J, Schmidt B. 2013. Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data. Bioinformatics. 29(3):308–315. doi:10.1093/bioinformatics/bts690.
Marçais G, Kingsford C. 2011. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 27(6):764–770. doi:10.1093/bioinformatics/btr011.
Ranallo-Benavidez TR, Jaron KS, Schatz MC. 2020. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun. 11:1432.
③ De novo genome assembly and quality evaluation
Lim SD, Lee S, Choi WG, Yim WC, Cushman JC. 2019. Laying the foundation for crassulacean acid metabolism (CAM) biodesign: expression of the C4 metabolism cycle genes of CAM in Arabidopsis. Front Plant Sci. 10:101. doi:10.3389/fpls.2019.00101.
Manni M, Berkeley MR, Seppey M, Zdobnov EM. 2021. BUSCO: assessing genomic data quality and beyond. Curr Protoc. 1(12):e323. doi:10.1002/cpz1.323.
McGinnis S, Madden TL. 2004. BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res. 32:W20–W25. doi:10.1093/nar/gkh435.
Nishimura O, Hara Y, Kuraku S. 2017. GVolante for standardizing completeness assessment of genome and transcriptome assemblies. Bioinformatics. 33(22):3635–3637. doi:10.1093/bioinformatics/btx445.
Pedersen BS, Quinlan AR. 2018. Mosdepth: Quick coverage calculation for genomes and exomes. Bioinformatics. 34(5):867–868. doi:10.1093/bioinformatics/btx699.
Pryszcz LP, Gabaldón T. 2016. Redundans: an assembly pipeline for highly heterozygous genomes. Nucleic Acids Res. 44(12):e113. doi:10.1093/nar/gkw294.
Sato, R. Kondo, Y. Agarie, S. (2022) Supplementary_Information. figshare. Dataset. https://doi.org/10.6084/m9.figshare.21788624.v5
Swat S, Laskowski A, Badura J, Frohmberg W, Wojciechowski P, Swiercz A, Kasprzak M, Blazewicz J. 2021. Genome-scale de novo assembly using ALGA. Bioinformatics. 37(12):1644–1651. doi:10.1093/bioinformatics/btab005.
④ Phylogenetic tree creation among multiple plant species using 18S ribosomal DNA
Lemoine F, Correia D, Lefort V, Doppelt-Azeroual O, Mareuil F, Cohen-Boulakia S, Gascuel O. 2019. NGPhylogeny.fr: new generation phylogenetic services for non-specialists. Nucleic Acids Res. 47(W1):W260–W265. doi:10.1093/nar/gkz303.
Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig W, Peplies J, Glöckner FO. 2007. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res. 35(21):7188–7196. doi:10.1093/nar/gkm864
Seemann T. 2018. barrnap 0.9: rapid ribosomal RNA prediction. https://github.com/tseemann/barrnap
Shimodaira H, Hasegawa M. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol Biol Evol. 16:1114–1116.
⑤ Detection of repetitive regions
Abrusán, G., Grundmann, N., DeMester, L., & Makalowski, W. (2009). TEclass--a tool for automated classification of unknown eukaryotic transposable elements. Bioinformatics (Oxford, England), 25(10), 1329–1330. https://doi.org/10.1093/bioinformatics/btp084
Bao W, Kojima KK, Kohany O. 2015. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob DNA. 6(1):4–9. doi:10.1186/s13100-015-0041-9.
Flynn JM, Hubley R, Goubert C, Rosen J, Clark AG, Feschotte C, Smit AF. 2020. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci U S A. 117(17):9451–9457. doi:10.1073/pnas.1921046117.
Smit, AFA, Hubley, R, Green, P. RepeatMasker Open-4.0. 2013-2015 <http://www.repeatmasker.org>
⑥ Search for genomic sequences coding transfer RNA (tRNA) and micro-RNA (miRNA)
Chan PP, Lin BY, Mak AJ, Lowe TM. 2021. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 49(16):9077–9096. doi:10.1093/nar/gkab688.
Cognat V, Pawlak G, Duchêne AM, Daujat M, Gigant A, Salinas T, Michaud M, Gutmann B, Giegé P, Gobert A, et al. 2013. PlantRNA, a database for tRNAs of photosynthetic eukaryotes. Nucleic Acids Res. 41(D1):D273–D279. doi:10.1093/nar/gks935.
Nawrocki EP, Eddy SR. 2013. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics. 29(22):2933–2935. doi:10.1093/bioinformatics/btt509.
Brůna T, Hoff KJ, Lomsadze A, Stanke M, Borodovsky M. 2021. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genom Bioinform. 3(1):1–11. doi:10.1093/nargab/lqaa108.
Buchfink B, Reuter K, Drost HG. 2021. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat Methods. 18(4):366–368. doi:10.1038/s41592-021-01101-x
Shen W, Le S, Li Y, Hu F. 2016. SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS One. 11(10):e0163962. doi:10.1371/journal.pone.0163962.
Lim SD, Lee S, Choi WG, Yim WC, Cushman JC. 2019. Laying the foundation for crassulacean acid metabolism (CAM) biodesign: expression of the C4 metabolism cycle genes of CAM in Arabidopsis. Front Plant Sci. 10:101. doi:10.3389/fpls.2019.00101.
⑧ Protein domain searches
Mistry J, Chuguransky S, Williams L, Qureshi M, Salazar GA, Sonnhammer ELL, Tosatto SCE, Paladin L, Raj S, Richardson LJ, et al. 2021. Pfam: The protein families database in 2021. Nucleic Acids Res. 49(D1):D412–D419. doi:10.1093/nar/gkaa913.
Potter SC, Luciani A, Eddy SR, Park Y, Lopez R, Finn RD. 2018. HMMER web server: 2018 update. Nucleic Acids Res. 46(W1):W200–W204. doi:10.1093/nar/gky448.
Zheng, Y, Jiao, C, Sun, H, Rosli, HG, Pombo, MA, Zhang, P, Banf, M, Dai, X, Martin, GB, Giovannoni, JJ, Zhao, PX, Rhee, SY, Fei, Z. 2016. iTAK: A Program for genome-wide prediction and classification of plant transcription factors, transcriptional regulators, and protein kinases. Mol. Plant. 9(12): 1667–1670. https://doi.org/10.1016/j.molp.2016.09.014