Scientific Updates

PLOS Biology | Tang's research group and collaborators develop a novel single-cell RNA-seq technology based on a third-generation sequencing platform

The first single-cell RNA-seq technology was developed in 2009, ushering in the era of single-cell genomics (Tang et al., 2009). Over the past decade, single-cell sequencing technologies have greatly accelerated biomedical research, helping researchers overcome major challenges such as cellular heterogeneity and sample rarity. Single-cell transcriptome maps of humans and model organisms have been reported. However, almost all current single-cell sequencing technologies are based on second-generation sequencing platforms, whose read lengths are generally around 150 bp. Transcripts in the human transcriptome are generally longer than 1,000 bp, and some exceed 100 kb (Piovesan et al., 2016; Frankish et al., 2019), far beyond the maximum read length that second-generation sequencing methods can deliver.

On December 30, 2020, Prof. Fuchou Tang's group at the Beijing Advanced Innovation Center for Genomics and the Biomedical Pioneering Innovation Center of Peking University, in collaboration with Beijing GrandOmics Biosciences Co., Ltd., published a research paper entitled "Single-cell RNA-seq analysis of mouse preimplantation embryos by third-generation sequencing" in PLOS Biology. The work addresses a core difficulty of single-cell RNA-seq based on next-generation sequencing (NGS) platforms: obtaining accurate full-length transcript information in individual cells. The main breakthroughs of this research are:

1) Developed SCAN-seq (Single cell amplification and sequencing of full-length RNAs by Nanopore platform), a high-sensitivity single-cell RNA-seq method based on a third-generation sequencing platform. This technology directly captures full-length transcripts in single cells with high sensitivity and robustness. It can detect more than 8,000 genes in each individual mouse embryonic stem cell (mESC), which is comparable to, or even better than, previous NGS-based single-cell RNA-seq methods (Fig. 1).

Fig. 1 Characterization and evaluation of SCAN-seq

2) Identified more than 30,000 unannotated transcripts. This study identified 6,487 and 27,250 unannotated transcripts of different types in mESCs and mouse preimplantation embryos, respectively. In particular, new combinations of annotated splice junctions, whether from different transcripts or within the same transcript, could be reliably identified using this method (Fig. 2).

Fig. 2 Identified unannotated transcripts

3) Evaluated allele-specific transcripts within each individual cell during mouse preimplantation development. SCAN-seq identifies mouse strain-specific SNPs with high accuracy (average error rate of 1.8%). The study captured the gradual increase of mRNAs from the paternal alleles after the 2-cell embryo stage, when zygotic genome activation starts. By the blastocyst stage, the copy numbers of mRNAs from the maternal and paternal alleles become comparable in each individual cell (Fig. 3).

Fig. 3 Analysis of allele-specific transcripts
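The allele-specific analysis in point 3 can be illustrated with a minimal sketch. It is not the pipeline used in the paper: the function names, the input layout (a read represented as a mapping from SNP position to the observed base) and the simple majority vote are all assumptions made for illustration. The idea is that a long read may cover several strain-specific SNPs, so each read can be assigned to the maternal or paternal allele, and the share of paternal reads can then be compared between cells or developmental stages.

```python
from collections import Counter


def assign_read_allele(read_bases, snp_table):
    """Assign one read to a parental allele by majority vote (hypothetical helper).

    read_bases: {position: base observed in the read}
    snp_table:  {position: (maternal_base, paternal_base)}
    Returns "maternal", "paternal", or None if the read is uninformative or tied.
    """
    votes = Counter()
    for pos, base in read_bases.items():
        if pos not in snp_table:
            continue  # position is not a strain-specific SNP
        maternal_base, paternal_base = snp_table[pos]
        if base == maternal_base:
            votes["maternal"] += 1
        elif base == paternal_base:
            votes["paternal"] += 1
        # any other base is treated as a sequencing error and ignored
    if votes["maternal"] == votes["paternal"]:
        return None
    return "maternal" if votes["maternal"] > votes["paternal"] else "paternal"


def paternal_fraction(reads, snp_table):
    """Fraction of allele-informative reads assigned to the paternal allele."""
    calls = [assign_read_allele(read, snp_table) for read in reads]
    maternal = calls.count("maternal")
    paternal = calls.count("paternal")
    informative = maternal + paternal
    return paternal / informative if informative else None


if __name__ == "__main__":
    # Toy data, invented for this example only.
    snps = {10: ("A", "G"), 20: ("C", "T")}
    reads = [{10: "A", 20: "C"}, {10: "G"}, {10: "G", 20: "T"}]
    print(paternal_fraction(reads, snps))
```

Under this sketch, a cell dominated by maternal transcripts would give a value near 0, and a value near 0.5 would correspond to the comparable maternal and paternal contributions reported at the blastocyst stage.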

The SCAN-seq method developed in this study should have broad application prospects, as it can overcome various limitations of single-cell RNA-seq based on second-generation sequencing platforms. It advances single-cell genomics from the "second-generation" to the "third-generation" era in five respects:

(1) From sequence information limited to one end of the cDNA in a single cell, to full-length cDNA information.

(2) From a mixed measurement of all the alternative splicing products (transcripts) of a gene in a single cell, to precise separation of each of those products.

(3) From parental-origin expression information that is mixed and indistinguishable for a gene in a single cell, to precise separation of the transcripts of each gene by parental origin.

(4) From detecting only transcripts of genes with unique sequences in single cells, to also accurately detecting transcripts of highly repetitive sequences.

(5) From "one gene, one phenotype" (there are about 30,000 genes in the human genome) to "one RNA isoform, one phenotype" (there are approximately 300,000 different alternatively spliced transcripts in the human genome).

In conclusion, single-cell RNA-seq based on third-generation sequencing platforms will unveil more of the "dark matter" in the transcriptome and bring new opportunities to human biomedical research.

Xiaoying Fan, a research fellow at Bioland Laboratory; Yuhan Liao, a Ph.D. candidate in the School of Life Sciences at Peking University; and Dong Tang and Pidong Li of Beijing GrandOmics Biosciences Co., Ltd. are the co-first authors of the paper. Professor Fuchou Tang of the Beijing Advanced Innovation Center for Genomics and the Biomedical Pioneering Innovation Center of Peking University and Dr. Yang Wang of Beijing GrandOmics Biosciences Co., Ltd. are the co-corresponding authors. The research was supported by the National Natural Science Foundation of China, the Beijing Municipal Science & Technology Commission, and the Beijing Advanced Innovation Center for Genomics.