Researchers from the University of Pittsburgh, USA, have developed IndepthPathway, a method for pathway enrichment analysis from scRNA-seq data. To cope with the high noise and low gene coverage of single-cell sequencing, the tool uses the Weighted Concept Signature Enrichment Analysis (WCSEA) approach, which takes a broader view when assessing the functional relationship of pathway gene sets to differentially expressed genes. The technique, built on the “universal concept signature,” is designed to reveal the pathways and processes at work in individual cells and cell populations.
Challenges in Pathway Enrichment Analysis
Genomics research faces significant challenges in identifying causal pathways underlying physiological processes, disease initiation, progression, and therapeutic resistance based on proteomic, genomic, and transcriptomic datasets. In single-cell sequencing (SCS)-based studies, pathway enrichment (PE) methods are crucial in identifying key molecular pathways and biological processes governing cell behavior. Over time, PE methods have evolved through three generations: over-representation analysis (ORA), functional class scoring, and network topology-based methods. These methods can be further classified as unweighted or weighted, depending on whether they use gene weights (i.e., levels of differential expression).
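To make the weighted versus unweighted distinction concrete, here is a minimal Python sketch of a functional-class-scoring statistic in the style of a Kolmogorov-Smirnov running sum. It is not code from IndepthPathway; the function name `running_sum_enrichment` and the `power` parameter are assumptions made for illustration. With `power=0` every hit counts equally (unweighted); with `power=1` each hit contributes in proportion to its differential expression weight.

```python
def running_sum_enrichment(ranked_genes, weights, gene_set, power=1.0):
    """Illustrative running-sum enrichment score (hypothetical helper).

    ranked_genes: genes sorted from most to least differentially expressed.
    weights: differential expression weight of each gene, in the same order.
    gene_set: the pathway gene set being tested.
    power: 0 gives an unweighted score, 1 weights hits by |weight|.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(w) ** power for w, h in zip(weights, hits) if h)
    if hit_total == 0 or n_miss == 0:
        return 0.0  # score is undefined without both hits and misses

    running, best = 0.0, 0.0
    for w, h in zip(weights, hits):
        # Step up at pathway genes, step down at all other genes.
        running += abs(w) ** power / hit_total if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the largest deviation from zero
    return best


genes = ["a", "b", "c", "d"]
de_weights = [4.0, 3.0, 2.0, 1.0]
print(running_sum_enrichment(genes, de_weights, {"a", "d"}, power=1.0))
print(running_sum_enrichment(genes, de_weights, {"a", "d"}, power=0.0))
```

In this toy ranking the weighted score rewards the gene set for containing the most strongly changed gene, while the unweighted score only sees positions in the list.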
PE methods serve a distinct purpose from pathway activity inference, pathway and gene set overdispersion analysis (PAGODA), regulon activity inference, and functional gene set inference. While PE methods test the enrichment of functional pathways in differentially expressed genes for pathway discovery, pathway activity inference tools assess the activity of specific pathways in each sample or single cell. However, current PE methods struggle with SCS data because of its low genome coverage, high amplification bias, and inherent technical variability.
To address these challenges, a concept signature enrichment analysis (CSEA) was developed, leveraging diverse knowledge-based gene sets (molecular concepts) to assess functional relations between pathway gene sets and target gene lists. CSEA employs concept signatures to calculate a cumulative genome-wide Universal Concept Signature (uniConSig) score, representing the functional relevance of genes underlying the target gene list. Building upon CSEA, a weighted CSEA (WCSEA) method was introduced to enable an in-depth functional assessment of weighted gene lists based on differential expressions (DEs) from SCS data.
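The general idea of scoring every gene by the company it keeps across molecular concepts can be sketched in a few lines of Python. This is a toy illustration only, not the published uniConSig formula: the Jaccard weighting, the square-root normalization, and the function names `concept_weights` and `toy_concept_signature_scores` are all assumptions chosen for simplicity.

```python
from math import sqrt


def concept_weights(target_genes, concepts):
    """Weight each concept by its Jaccard overlap with the target gene list."""
    target = set(target_genes)
    weights = {}
    for name, members in concepts.items():
        members = set(members)
        union = target | members
        weights[name] = len(target & members) / len(union) if union else 0.0
    return weights


def toy_concept_signature_scores(target_genes, concepts):
    """Toy genome-wide score: sum of concept weights per gene.

    The division by sqrt(number of concepts) is an assumed penalty so that
    genes annotated in many concepts do not dominate; it is not taken from
    the paper.
    """
    weights = concept_weights(target_genes, concepts)
    totals, counts = {}, {}
    for name, members in concepts.items():
        for gene in set(members):
            totals[gene] = totals.get(gene, 0.0) + weights[name]
            counts[gene] = counts.get(gene, 0) + 1
    return {gene: totals[gene] / sqrt(counts[gene]) for gene in totals}


# Hypothetical miniature knowledge base of three concepts.
concepts = {
    "cell_cycle": ["CDK1", "CCNB1", "MKI67"],
    "dna_repair": ["BRCA1", "RAD51"],
    "unrelated": ["ALB", "APOA1"],
}
scores = toy_concept_signature_scores(["CDK1", "CCNB1"], concepts)
print(sorted(scores.items(), key=lambda item: -item[1]))
```

Note that MKI67 receives a high score even though it is absent from the target list, because it shares a concept with the target genes. That is the kind of functional relevance a sparse single-cell gene list can miss on its own.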
The Power of Single-Cell Sequencing Data
Traditional bulk sequencing methods provide a snapshot of the average gene expression levels across a population of cells. In contrast, single-cell sequencing allows the profiling of individual cells, enabling the discovery of rare cell types, cell heterogeneity, and dynamic cellular responses. However, this vast amount of granular data demands sophisticated analysis tools to derive meaningful biological insights.
The authors compiled 45,522 molecular concepts from the Molecular Signatures Database (MSigDB) C2CP pathways or hallmark, the NCBI EntrezGene interactome database, the Pathway Commons database, and the Conserved Domain Database to generate a comprehensive knowledge base.
To demonstrate the performance of IndepthPathway, the authors simulated the technical variability and gene expression dropouts characteristic of scRNA-seq, and benchmarked the tool on a real dataset of matched single-cell and bulk RNA-seq data.
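The article does not describe how the dropouts were simulated, so the following Python sketch shows one generic approach rather than the authors' procedure: binomial thinning, in which each transcript is independently captured with a fixed probability. The function name `simulate_dropout` and the `capture_rate` parameter are assumptions for illustration.

```python
import random


def simulate_dropout(counts, capture_rate, seed=0):
    """Thin a cells-by-genes count matrix by random transcript capture.

    Each of the c molecules in an entry is kept with probability
    capture_rate, so low counts frequently fall to zero (a dropout).
    """
    rng = random.Random(seed)  # fixed seed keeps the simulation repeatable
    return [
        [sum(rng.random() < capture_rate for _ in range(c)) for c in cell]
        for cell in counts
    ]


bulk_like = [[10, 0, 5], [3, 8, 1]]  # made-up counts: 2 cells, 3 genes
print(simulate_dropout(bulk_like, capture_rate=0.3, seed=1))
```

Running an enrichment method on the original and the thinned matrices, then comparing the resulting pathway lists, is one simple way to probe how stable a method is under this kind of noise.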
Pathway Enrichment Analysis: Unraveling the Cellular Blueprint
Pathway enrichment analysis is a powerful computational method that helps researchers understand the biological functions and processes associated with differentially expressed genes in a given dataset. It involves mapping genes to predefined biological pathways and identifying which pathways are overrepresented in the dataset compared to what would be expected by chance.
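The "more than expected by chance" comparison is classically made with a hypergeometric test. The Python sketch below shows that standard over-representation calculation using only the standard library. It is not IndepthPathway code, and the function names and the toy gene identifiers are assumptions for illustration.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, pathway_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from a universe that contains pathway_size pathway genes."""
    max_overlap = min(pathway_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def over_representation(de_genes, pathway, universe):
    """Return the overlap size and its one-sided enrichment p-value."""
    universe = set(universe)
    de = set(de_genes) & universe      # ignore genes outside the universe
    pw = set(pathway) & universe
    overlap = len(de & pw)
    p_value = hypergeom_enrichment_p(len(universe), len(pw), len(de), overlap)
    return overlap, p_value


universe = [f"g{i}" for i in range(20)]      # toy background of 20 genes
pathway = ["g0", "g1", "g2", "g3", "g4"]     # toy pathway gene set
de_genes = ["g0", "g1", "g2", "g3", "g10"]   # toy differentially expressed genes
print(over_representation(de_genes, pathway, universe))
```

A test like this treats every gene in the list equally and depends on which genes happen to be detected, which is part of why sparse single-cell data are hard for conventional tools.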
Conventional pathway enrichment tools designed for bulk RNA-seq data have limitations when applied to single-cell sequencing data. The unique challenges stem from the sparse and noisy nature of single-cell data, making it essential to develop specialized tools that address these issues.
IndepthPathway: A Comprehensive Solution
IndepthPathway, a purpose-built PE package, was developed to enhance pathway discovery from bulk and SCS data. The package accommodates both weighted and unweighted target gene lists, allowing users to perform DE analysis between sample or cell groups using preferred DE methods. IndepthPathway incorporates human concept datasets from various knowledge databases and precompiled pathway gene sets for curated canonical and hallmark pathways. Additionally, it offers modules for disambiguating independent pathway modules, visualizing DE and interactome of core enriched genes, and generating pathway networks.
Unraveling Complex Pathway Alterations
Applying IndepthPathway to a published SCS dataset of active versus quiescent populations of hematopoietic stem cells (HSCs) demonstrated its capacity for comprehensive pathway discovery. WCSEA revealed significant alterations in pathways related to cell cycle activities, MYC targets controlling HSC self-renewal and differentiation, spliceosome activity, and DNA repair pathways. In contrast, commonly used PE methods lacked the depth to interpret such complex pathway changes.
Furthermore, IndepthPathway’s reproducibility was tested under simulated technical variability of scRNA-seq data and evaluated using real datasets with matched bulk RNA-seq and scRNA-seq from the same cell line models. CSEA and WCSEA consistently outperformed other PE tools in detecting enriched pathways. These methods also offered a more in-depth view of pathway alterations during endothelial differentiation and in neural stem cells versus GBM stem cells.
Practical Implications
The development of the IndepthPathway tool will help biologists to perform pathway enrichment analysis based on single-cell sequencing data with improved scientific rigor. The use of the WCSEA approach in the IndepthPathway tool will enable the analysis of pathways enriched in less abundant cells that are vulnerable to disturbances.
The outstanding stability and depth in pathway enrichment results under stochasticity of the data demonstrated by the IndepthPathway tool will help researchers obtain more reliable and accurate study results. Incorporating WCSEA into the IndepthPathway R package will make it easier for biologists to use this method for pathway analysis based on bulk and single-cell sequencing data.
Conclusion
IndepthPathway is a new option for pathway enrichment analysis, particularly in the context of single-cell sequencing data. Its ability to tolerate high noise and gene expression dropouts in scRNA-seq data, together with its deep functional interpretation, makes it a powerful tool for pathway discovery in biomedical research. The authors demonstrated that IndepthPathway delivers stable and in-depth pathway enrichment results despite the stochasticity of the data, which should substantially improve the scientific rigor of pathway analysis for single-cell sequencing data. As single-cell sequencing continues to drive genomics and biomedical discoveries, IndepthPathway’s comprehensive approach could play an important role in realizing the potential of this technology.
Story Source: Reference Paper | The IndepthPathway R package is available through GitHub
Dr. Tamanna Anwar is a Scientist and Co-founder of the Centre of Bioinformatics Research and Technology (CBIRT). She is a passionate bioinformatics scientist and a visionary entrepreneur. Dr. Tamanna has worked as a Young Scientist at Jawaharlal Nehru University, New Delhi. She has also worked as a Postdoctoral Fellow at the University of Saskatchewan, Canada. She has several scientific research publications in high-impact research journals. Her latest endeavor is the development of a platform that acts as a one-stop solution for all bioinformatics related information as well as developing a bioinformatics news portal to report cutting-edge bioinformatics breakthroughs.