Scientific Updates

PNAS | Genome-wide single-cell and single-molecule footprinting of transcription factors with deaminase

In multicellular organisms, each somatic cell generally harbors the same genome; however, gene expression varies across different tissues, allowing cells to perform specialized functions. Transcription factors (TFs) are a class of proteins that regulate gene transcription by binding to DNA, primarily at promoter and distal enhancer regions. They play a crucial role in modulating gene expression, cellular differentiation, and developmental processes in response to environmental signals. Understanding TF-DNA interactions at a genome-wide scale is of paramount importance, as aberrant TF binding is closely associated with complex diseases such as developmental disorders, cancer, and neurodegenerative diseases.

 

Nevertheless, analyzing TF binding in mammalian cells presents multiple challenges. First, TF binding is dynamic and context-dependent, varying across cell types, developmental stages, and environmental conditions. This necessitates single-cell resolution analysis of TF binding in primary tissues. Second, despite the ability of TFs to recognize relatively short sequence motifs, motifs alone do not reliably predict actual TF binding sites, necessitating experimental validation. Finally, with thousands of TFs encoded in the mammalian genome, current methods are limited in scalability, resolution, and throughput, hindering a systematic understanding of TF occupancy across different cell populations. The coordination of TFs in regulating gene networks remains a central yet unresolved question in eukaryotic biology.

 

Traditional methods for identifying TF-DNA interactions, such as chromatin immunoprecipitation followed by sequencing (ChIP-seq) and ChIP-exo, rely on antibodies, whose quality and availability vary widely, which limits their general applicability. Another commonly used approach, DNase-seq, uses nuclease digestion to detect TF footprints on DNA; however, these experiments typically require large numbers of cells and deep sequencing, restricting their use when samples are limited or when TF binding must be resolved on single molecules. Recently, several laboratories have developed single-molecule footprinting (SMF) techniques that combine the cytosine methyltransferase M.CviPI with bisulfite sequencing to determine TF occupancy. However, because M.CviPI methylates cytosines only within GpC dinucleotides, SMF is applicable only to biological systems in which endogenous cytosine methylation does not interfere, is restricted to TFs that bind GpC-containing sequences, and offers limited resolution (7–14 bp). Another single-molecule footprinting method, Fiber-seq, uses an N6-adenine methyltransferase and long-read sequencing to map TF footprints and nucleosome positioning. However, its high cost, low sequencing depth, and limited 6mA detection accuracy have hindered its application to genome-wide TF binding analysis.
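To make the GpC restriction concrete, here is a minimal Python sketch (not taken from the paper). It counts how many cytosines in a binding-site sequence could carry footprint information when only GpC cytosines are readable, and compares that with the count when any cytosine on either strand is readable. The function name `informative_cytosines` and the example sequences are illustrative assumptions, and the sketch ignores enzyme efficiency and endogenous methylation.

```python
from typing import Tuple


def informative_cytosines(seq: str) -> Tuple[int, int]:
    """Return (gpc_cytosines, all_cytosines) counted over both DNA strands.

    seq is the top strand, written 5'->3'. A G on the top strand is a C on
    the bottom strand, so every C or G contributes one cytosine overall.
    Each top-strand "GC" dinucleotide is palindromic and therefore places
    one cytosine in GpC context on each strand.
    """
    seq = seq.upper()
    all_cytosines = sum(base in "CG" for base in seq)
    gc_dinucleotides = sum(seq[i:i + 2] == "GC" for i in range(len(seq) - 1))
    return 2 * gc_dinucleotides, all_cytosines


# Hypothetical example sequences, chosen only for illustration.
for site in ("TGACTCA", "CCGCGG"):
    gpc, total = informative_cytosines(site)
    print(f"{site}: {gpc} GpC cytosines out of {total} cytosines on both strands")
```

Under these assumptions, a site with no GC dinucleotide offers no readable position to a GpC-specific enzyme, even though it still contains cytosines on one or both strands.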

 

 

To address these challenges, on December 17, 2024, the laboratory of Xiaoliang Sunney Xie at the Center for Life Sciences / Biomedical Pioneering Innovation Center, Peking University, published a study in PNAS titled "Genome-wide single-cell and single-molecule footprinting of transcription factors with deaminase". The study reports a novel single-molecule footprinting technology based on a double-stranded DNA deaminase, termed FOODIE (FOOtprinting with DeamInasE). The method uses the deaminase to convert cytosines in the genome to uracils, which are subsequently read as thymines during PCR amplification and next-generation sequencing (NGS). Cytosines at TF binding sites, in contrast, are protected and remain unconverted, enabling high-resolution footprinting of TF binding sites (Figure 1). FOODIE's advantages include compatibility with widely used NGS platforms, lower cost, greater scalability, and the ability to analyze TF binding at both the single-molecule and single-cell levels, significantly enhancing throughput and resolution.
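The readout described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes gapless reads that are already aligned to the top strand and have the same length as the reference, and it uses an arbitrary threshold of 0.5. The function names `conversion_rates` and `protected_cytosines` are hypothetical.

```python
from typing import List, Optional


def conversion_rates(reference: str, reads: List[str]) -> List[Optional[float]]:
    """Per-position C-to-T conversion rate; None where the reference is not C
    or where no read shows C or T at that position."""
    rates: List[Optional[float]] = []
    for i, ref_base in enumerate(reference):
        if ref_base != "C":
            rates.append(None)
            continue
        calls = [read[i] for read in reads if read[i] in "CT"]
        rates.append(sum(b == "T" for b in calls) / len(calls) if calls else None)
    return rates


def protected_cytosines(rates: List[Optional[float]], max_rate: float = 0.5) -> List[int]:
    """Indices of reference cytosines whose conversion rate is below max_rate.
    The threshold is an assumption made for this sketch."""
    return [i for i, r in enumerate(rates) if r is not None and r < max_rate]


# Toy data: cytosines at positions 4 and 5 mostly stay C (candidate footprint),
# while those at positions 1 and 9 are mostly read as T (unprotected).
reference = "ACGTCCGATC"
reads = ["ATGTCCGATT", "ATGTCCGATT", "ATGTCTGATT", "ACGTCCGATT"]

rates = conversion_rates(reference, reads)
print(protected_cytosines(rates))
```

A real analysis would also need to handle the opposite strand (where conversion appears as G-to-A), alignment gaps, and sequencing errors, all of which are omitted here.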

 

 

To identify an optimal double-stranded DNA deaminase for TF footprinting, the authors screened multiple deaminases and selected DddB, an enzyme with high efficiency and low sequence preference, as the basis for FOODIE. Because TFs typically bind accessible chromatin regions, the authors combined FOODIE with Tn5 transposase-based fragmentation to isolate these regions for TF footprinting analysis. This streamlined workflow is robust, user-friendly, and requires only a small number of cells, making it suitable for single-cell applications. Using the human lymphoblastoid cell line GM12878, the authors demonstrated FOODIE’s accuracy in detecting genome-wide TF footprints. Compared with previous methods, FOODIE exhibits significantly improved sensitivity and resolution. By integrating motif analysis and ChIP-seq data, the authors confirmed the precise binding sites of 79 TFs in GM12878 cells and quantified their binding distribution at promoters and enhancers. FOODIE was further applied to individual mammalian cells in microtiter wells. This single-cell version, scFOODIE, enabled the detection of TF binding states in heterogeneous tissue samples (Figure 2). The authors demonstrated that scFOODIE could resolve TF binding patterns within a mixture of four human cell lines (GM12878, K562, HEK293, and HeLa) and classify cells based on cell-type-specific chromatin accessibility signals. They then applied scFOODIE to mouse hippocampal tissue, analyzing 11,200 cells corresponding to eight major cell types. scFOODIE thus provides a powerful tool for systematically investigating gene regulatory networks in heterogeneous tissues.

 

 

Leveraging FOODIE, the authors explored co-binding patterns of neighboring TFs within accessible genomic regions and examined positive (cooperative) and negative (exclusive) TF interactions. They developed a computational algorithm to quantify TF cooperativity from FOODIE data (Figure 3). The study revealed that, genome-wide, more TF pairs exhibit positive cooperativity than negative cooperativity. In positive cooperativity, TFs facilitate each other's DNA binding, which produces a sigmoidal dependence on TF concentration and thereby enhances regulatory sensitivity. Negative cooperativity, in contrast, ensures rapid gene regulation responses to abrupt changes in TF concentration in the nucleus. Understanding TF binding and combinatorial interactions in different cell types provides critical insights into gene regulation mechanisms and cellular function.
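One simple way to express positive and negative cooperativity numerically is a log odds ratio of co-occupancy across single molecules. The Python sketch below is a generic statistic assumed here for illustration and is not the authors' algorithm. It assumes each molecule has already been reduced to a pair of booleans, one per neighboring TF site, indicating whether each site is footprinted. The function name `cooccupancy_log_odds` and the pseudocount are hypothetical choices.

```python
import math
from typing import Iterable, Tuple


def cooccupancy_log_odds(molecules: Iterable[Tuple[bool, bool]],
                         pseudocount: float = 0.5) -> float:
    """Log odds ratio of the 2x2 table (site A bound?, site B bound?).

    Values above 0 suggest the two sites are bound together more often than
    expected under independence (positive cooperativity); values below 0
    suggest mutual exclusion (negative cooperativity).
    """
    both = only_a = only_b = neither = 0
    for a_bound, b_bound in molecules:
        if a_bound and b_bound:
            both += 1
        elif a_bound:
            only_a += 1
        elif b_bound:
            only_b += 1
        else:
            neither += 1
    # The pseudocount keeps the ratio finite when a cell of the table is empty.
    return math.log(((both + pseudocount) * (neither + pseudocount))
                    / ((only_a + pseudocount) * (only_b + pseudocount)))


# Toy molecule sets: mostly co-bound versus mostly mutually exclusive.
cooperative = [(True, True)] * 40 + [(True, False)] * 10 + [(False, True)] * 10 + [(False, False)] * 40
exclusive = [(True, True)] * 5 + [(True, False)] * 45 + [(False, True)] * 45 + [(False, False)] * 5

print(cooccupancy_log_odds(cooperative) > 0)
print(cooccupancy_log_odds(exclusive) < 0)
```

A log odds ratio is used because it is symmetric around zero and does not depend on the overall occupancy of either site. A real analysis would also need a significance test and a correction for read coverage.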

 

 

FOODIE offers novel insights into mammalian transcriptional regulatory programs. Research suggests that gene expression occurs in bursts, a phenomenon attributed to the single-molecule nature of DNA in single cells. Coordinating the expression of multiple genes for a specific biological function requires that their expression be synchronized. Researchers have long sought to identify correlated gene modules (CGMs), groups of genes whose coordinated transcription underlies distinct biological processes. The authors hypothesized that the genes within a CGM are regulated by a shared set of TFs. To test this hypothesis, they investigated the enrichment of specific TFs within three types of CGMs. The results showed that housekeeping gene CGMs exhibit TF enrichment at promoters, whereas tissue-specific CGMs exhibit TF enrichment at enhancers (Figure 4). Based on these analyses, the authors propose that genes involved in fundamental cellular functions rely on promoter-bound TFs to ensure robustness, while tissue-specific genes depend more on enhancer-bound TFs to achieve greater regulatory flexibility. This finding advances our understanding of the intricate mechanisms governing cellular gene regulation.

 

 

To facilitate interactive analysis of genome-wide TF footprints identified by FOODIE, the authors developed a user-friendly web server (http://foodie.sunneyxielab.org). As systematic FOODIE databases are established across various human tissues, our understanding of eukaryotic transcriptional regulation and cellular function will continue to expand.

 

In conclusion, FOODIE is a high-precision in situ TF footprinting method capable of genome-wide TF mapping with near-single-nucleotide resolution at the single-cell level. Compared to existing TF footprinting techniques, FOODIE possesses several key advantages. It is easy to implement, scalable for high-throughput workflows, and cost-effective. Moreover, FOODIE can be extended to single-cell resolution, enabling the analysis of distinct cell types in heterogeneous tissues and the assessment of TF binding patterns within each cellular population. By applying FOODIE across diverse tissues, researchers can elucidate TF combinatorial binding patterns and gene regulatory modules. Additionally, FOODIE holds promise for clinical applications, allowing the identification of disease-specific regulatory features through TF footprinting in patient samples. Ultimately, FOODIE represents a transformative technology poised to accelerate gene regulation research across nearly all biological contexts.

 

Professor Xiaoliang Sunney Xie of Changping Laboratory, the Peking University Biomedical Pioneering Innovation Center, and the Peking University-Tsinghua University Joint Center for Life Sciences is the corresponding author of this paper. Dr. Runsheng He (Associate Researcher, Peking University Biomedical Pioneering Innovation Center/Changping Laboratory), Dr. Wenyang Dong (Associate Researcher, Changping Laboratory), Zhi Wang (Ph.D. student, Peking University), Dr. Chen Xie (Associate Researcher, Peking University Biomedical Pioneering Innovation Center/Changping Laboratory), Dr. Long Gao (Associate Researcher, Changping Laboratory), and Wenping Ma (Associate Researcher, Changping Laboratory) are co-first authors of the paper. Ph.D. students Ke Shen (Peking University), Dubai Li (Changping Laboratory), Yuxuan Pang (Changping Laboratory), and others made significant contributions to this study. The research was supported by funding from Changping Laboratory, the Ministry of Science and Technology, the Beijing Advanced Innovation Center for Genomics, and the Peking University-Tsinghua University Joint Center for Life Sciences.

 

Expert Review

Chuan He (University of Chicago): The Xiaoliang Sunney Xie research group has long been dedicated to developing high-precision, single-cell, whole-genome sequencing methods for studying fundamental questions of chromatin conformation and transcriptional mechanisms in mammals. FOODIE, published online in December 2024, is a high-precision, genome-wide method developed by this group for mapping transcription factor binding sites. It uses an efficient double-stranded DNA deaminase to systematically measure transcription factor binding sites across the entire genome. The method is simple to operate, cost-effective, and applicable to analysis at the single-cell and single-molecule levels. In brain tissue samples, FOODIE demonstrated outstanding capabilities in distinguishing different cell types and studying transcription factor interactions, providing a valuable new tool for exploring the complex transcriptional regulatory mechanisms in mammals.

 

Bing Zhu (Institute of Biophysics, Chinese Academy of Sciences): "The Friendship Boat of Transcription Factors." As sequence-specific DNA-binding proteins, transcription factors regulate the specific expression of genes. The simplified model presented in textbooks typically describes a transcription factor recognizing a DNA sequence to either activate or repress the expression of a target gene. However, many transcription factors bind to thousands of target sequences in the genome but regulate the expression of only a small subset of the associated genes. As Robert Roeder, a pioneer in eukaryotic transcription, famously said: "Binding is NOT functioning." What is the reason behind this? And how do we approach studying it?

Just as people are not isolated in society, transcription factors rarely act alone at the regulatory elements of their target genes. Gene expression is seldom determined by a single transcription factor; rather, it reflects the combined action of multiple transcription factors and the chromatin environment. Understanding the "friendship boat" of different transcription factors, that is, their synergistic or antagonistic interactions within the same regulatory element, is therefore an important research direction. However, progress in this area has been hampered by a lack of suitable research tools.

Recently, the Xiaoliang Sunney Xie group published a groundbreaking technical study in PNAS that advances this research direction: FOODIE. This technology uses deaminase to convert cytosine (C) in the genome into uracil (U), while cytosine in transcription factor binding sites escapes this reaction, leaving behind its footprint. By analyzing the target sequence characteristics of transcription factors, it is possible to infer the synergistic or antagonistic effects between various transcription factors on the same DNA strand. Importantly, this technology can be applied at the single-cell level, opening up possibilities for studying the heterogeneity exhibited by different cells in response to various signals.

Additionally, FOODIE is a simple, inexpensive, and practical technology that is expected to be widely used by researchers in various biological scenarios.

 

Paper Link: https://www.pnas.org/doi/10.1073/pnas.2423270121

 

References:

  1. Lambert SA, et al. (2018) The Human Transcription Factors. Cell, 172(4):650-665.
  2. Robertson G, et al. (2007) Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nature Methods, 4(8):651-657.
  3. Rhee HS & Pugh BF (2011) Comprehensive Genome-wide Protein-DNA Interactions Detected at Single-Nucleotide Resolution. Cell, 147(6):1408-1419.
  4. Hesselberth JR, et al. (2009) Global mapping of protein-DNA interactions in vivo by digital genomic footprinting. Nature Methods, 6(4):283-289.
  5. Sönmezer C, et al. (2021) Molecular co-occupancy identifies transcription factor binding cooperativity in vivo. Molecular Cell, 81(2):255-267.e256.
  6. Stergachis AB, Debo BM, Haugen E, Churchman LS, & Stamatoyannopoulos JA (2020) Single-molecule regulatory architectures captured by chromatin fiber sequencing. Science, 368(6498):1449-1454.
  7. Chong S, Chen C, Ge H, & Xie XS (2014) Mechanism of transcriptional bursting in bacteria. Cell, 158(2):314-326.
  8. Ma W & Xie XS (2024) CGMFinder Identifies Correlated Gene Modules from 3H scRNA-seq Data. bioRxiv, 2024-11.

 

Xiaoliang Sunney Xie is an Academician of the Chinese Academy of Sciences, the Li Zhaoji Chair Professor at Peking University, where he concurrently serves as Dean of the Faculty of Science, and the Director of Changping Laboratory. He is also the Founding Director of the Biomedical Pioneering Innovation Center (BIOPIC) at Peking University.

Professor Xie’s research group focuses on three major areas: (1) fundamental research, including single-molecule enzymology, single-molecule biophysical chemistry, gene expression and regulation, epigenetics, cellular differentiation and reprogramming mechanisms, and genome instability; (2) technological development, including single-molecule imaging, single-cell genomics, coherent Raman scattering microscopy, and DNA sequencing; and (3) medical research, including preimplantation genetic screening and diagnosis for in vitro fertilization (IVF), early cancer diagnostics, and the development of neutralizing antibody drugs for COVID-19.

In 2004, Professor Xie was awarded the NIH Director’s Pioneer Award by the U.S. National Institutes of Health (NIH). In 2015, he became the first Chinese recipient of the Albany Medical Center Prize in Medicine and Biomedical Research, which he received jointly with Professor Karl Deisseroth of Stanford University. That same year, he was also honored with the ACS Peter Debye Award in Physical Chemistry by the American Chemical Society (ACS). On September 16, 2017, he received the Qiu Shi Outstanding Scientist Award. In 2022, he was honored with the Zhongguancun Award for Outstanding Contribution, the highest scientific and technological award in Beijing for the year 2021. In 2024, he received the Tengchong Science Award.