Quick Summary
This study introduces SCEG-HiC, a machine learning method that predicts enhancer-gene links from single-cell multi-omics data by integrating bulk average Hi-C data as prior knowledge. According to the authors, the method improves prediction accuracy, recovers biologically relevant links, and outperforms existing single-cell models across 10 human and mouse datasets.
Key Details
- Datasets: 10 human and mouse single-cell multi-omics datasets
- Features used: scATAC/RNA-seq data
- Technology: SCEG-HiC, based on weighted graphical lasso
- Performance: Outperforms existing single-cell models in accuracy
- Application: COVID-19 datasets for gene regulatory networks
Key Takeaways
- Enhancers regulate transcription by modulating gene expression from distal genomic locations.
- SCEG-HiC integrates bulk average Hi-C data as prior knowledge to improve predictions.
- It supports both paired scATAC/RNA-seq and scATAC-only inputs.
- On COVID-19 datasets, it more reliably reconstructed gene regulatory networks underlying disease severity.
- It is freely available as an open-source R package for regulatory genomics research.
- It helps elucidate functional associations between noncoding variants and their putative target genes.

Background
Enhancers are key elements in the regulation of gene expression and are often located far from their target genes. Single-cell ATAC and RNA sequencing (scATAC/RNA-seq) data have been used to infer enhancer-gene links, but establishing these links remains difficult because such data carry no chromatin conformation information. Integrating single-cell multi-omics data with prior Hi-C information is a promising way to address this gap.
Study
The researchers developed SCEG-HiC, a machine learning approach that utilizes weighted graphical lasso to decode enhancer-gene links from single-cell multi-omics data. By incorporating bulk average Hi-C data as prior knowledge, the method aims to improve prediction accuracy while maintaining context-specific correlations. The study evaluated SCEG-HiC across multiple datasets, including those related to COVID-19, to assess its effectiveness in reconstructing gene regulatory networks.
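To make the idea of a weighted graphical lasso with a Hi-C prior more concrete, here is a minimal Python sketch. It is not the SCEG-HiC implementation (which is distributed as an R package) and it does not reproduce the authors' weighting scheme, which the summary above does not specify. It assumes a hypothetical rule, `penalty = lam * (1 - contact)`, so that pairs with strong Hi-C contact are penalized less, and it uses a generic ADMM solver for the objective -logdet(Theta) + tr(S Theta) + sum_ij penalty_ij |Theta_ij|. The toy covariance matrix, the contact values, and all function and variable names are illustrative assumptions.

```python
import numpy as np


def weighted_graphical_lasso(S, penalty, rho=1.0, n_iter=500):
    """Generic ADMM solver for a graphical lasso with entry-specific penalties.

    Minimizes -logdet(Theta) + tr(S @ Theta) + sum_ij penalty[i, j] * |Theta[i, j]|.
    Returns the sparse estimate of the precision matrix Theta.
    """
    p = S.shape[0]
    Z = np.eye(p)          # sparse copy of Theta
    U = np.zeros((p, p))   # scaled dual variable
    for _ in range(n_iter):
        # Theta-update: closed form from the eigendecomposition of rho*(Z - U) - S
        eigval, eigvec = np.linalg.eigh(rho * (Z - U) - S)
        theta_eig = (eigval + np.sqrt(eigval**2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (eigvec * theta_eig) @ eigvec.T
        # Z-update: elementwise soft-thresholding with per-entry thresholds
        A = Theta + U
        Z = np.sign(A) * np.maximum(np.abs(A) - penalty / rho, 0.0)
        # Dual update
        U = U + Theta - Z
    return Z


# Toy example (made-up numbers): variable 0 is a gene, 1 and 2 are enhancers.
# Both enhancers have the same covariance (0.3) with the gene.
S = np.array([
    [1.00, 0.30, 0.30],
    [0.30, 1.00, 0.09],
    [0.30, 0.09, 1.00],
])

# Hypothetical normalized Hi-C contact in [0, 1]: only gene-enhancer 1 is in contact.
contact = np.array([
    [0.0, 0.9, 0.0],
    [0.9, 0.0, 0.0],
    [0.0, 0.0, 0.0],
])

lam = 0.5
penalty = lam * (1.0 - contact)   # assumed weighting rule, not taken from the paper
np.fill_diagonal(penalty, 0.0)    # leave the diagonal unpenalized

theta = weighted_graphical_lasso(S, penalty)
print(np.round(theta, 3))
# The gene-enhancer 2 entry is shrunk to zero, while the gene-enhancer 1 entry stays nonzero.
```

In this toy case the two enhancers are indistinguishable from the covariance alone; only the prior-derived penalty separates them. This illustrates why a contact prior can help when correlation data by themselves are ambiguous.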
Results
Across 10 human and mouse single-cell multi-omics datasets, SCEG-HiC outperformed existing single-cell models in predicting enhancer-gene links. The method retained context-specific correlations, which enabled the discovery of biologically relevant links. Applied to COVID-19 datasets, it more reliably reconstructed the gene regulatory networks underlying disease severity and helped elucidate functional associations between noncoding variants and their putative target genes.
Impact and Implications
SCEG-HiC offers a more reliable way to predict enhancer-gene links, which could improve our understanding of gene regulation in various biological contexts, including disease mechanisms. Because it is distributed as an open-source R package, other groups can readily apply it in regulatory genomics research.
Conclusion
The development of SCEG-HiC showcases the transformative potential of integrating machine learning with multi-omics data in understanding gene regulation. By improving the accuracy of enhancer-gene link predictions, this method opens new avenues for research in regulatory genomics and disease biology. Continued exploration in this area promises to yield valuable insights into the complexities of gene regulation and its implications for health and disease.
Predicting enhancer-gene links from single-cell multi-omics data by integrating prior Hi-C information.
Abstract
Enhancers play an important role in transcriptional regulation by modulating gene expression from distal genomic locations. Although single-cell ATAC and RNA sequencing (scATAC/RNA-seq) data have been leveraged to infer enhancer-gene links, establishing regulatory links between enhancers and their target genes remains a challenge due to the absence of chromatin conformation information. Here, we present SCEG-HiC, a machine learning method based on weighted graphical lasso, which decodes enhancer-gene links from single-cell multi-omics data by integrating bulk average Hi-C as prior knowledge. SCEG-HiC supports both paired scATAC/RNA-seq and scATAC-only inputs, improving prediction accuracy while retaining context-specific correlations and enabling the discovery of biologically relevant links. Comprehensive evaluation across 10 human and mouse single-cell multi-omics datasets shows that SCEG-HiC outperforms existing single-cell models. Application of SCEG-HiC to COVID-19 datasets illustrates its capacity to more reliably reconstruct gene regulatory networks underlying disease severity, and elucidate functional associations between noncoding variants and their putative target genes. SCEG-HiC is freely available as an open-source and user-friendly R package, facilitating broad applications in regulatory genomics research.
Authors: Liang X, Miao Y, Han D, Li Y, Zhang W, Wang Z
Journal: Nucleic Acids Res
Citation: Liang X, et al. Predicting enhancer-gene links from single-cell multi-omics data by integrating prior Hi-C information. Nucleic Acids Res. 2026; 54. doi: 10.1093/nar/gkag437