Fig. 1 | Molecular Neurodegeneration

Fig. 1

From: Guidelines for bioinformatics of single-cell sequencing data analysis in Alzheimer’s disease: review, recommendation, implementation and application

Overview of the bioinformatics approaches to analyze scRNA-seq, scATAC-seq, and spatial transcriptomics data, with a focus on scRNA-seq data. scRNA-seq and scATAC-seq data (A) go through quality control (QC) to remove outliers and cells with low-quality sequencing data, followed by normalization (B). QC-ed and normalized data are then used for dimension reduction and feature extraction (C) and for clustering analysis to identify cell clusters (D). Marker genes for each cell cluster will then be identified to infer its association with a known or novel cell type (E). Meanwhile, differential gene expression analysis is performed between cell groups of interest (e.g., AD and Control) in each cell cluster to identify gene expression changes associated with the disease (F). Trajectory inference can be performed on all cells, on the cells in each cluster, or on the cells from multiple closely related cell clusters to infer cellular dynamics during development or disease progression (G). Copy number variations (CNVs) can also be inferred from scRNA-seq data (H). Integration of gene expression and genomic (SNPs & CNVs) data leads to the identification of expression quantitative trait loci (eQTLs) (I). Epigenomic analysis by scATAC-seq can study gene expression regulatory elements in open chromatin regions (J) and will be detailed in Fig. 7. Gene coexpression and causal networks will be constructed for each cell cluster or for multiple closely related cell clusters, while priors from eQTL and epigenomic analyses can be developed to assist causal network inference (K). Cell clusters can be prioritized based on the number of differentially expressed genes between disease and control across all cell clusters (L). scRNA-seq data can also be integrated with bulk RNA-seq data to robustly identify key molecular changes and network structures (M). Finally, cell cluster-based networks will be analyzed to prioritize key subnetworks (e.g., coexpressed gene modules) and potential network regulators for a disease (e.g., AD) under study (N). Novel cell clusters, key subnetworks, and key driver genes can be validated through single-cell spatial transcriptomics analysis, which offers more insight into spatially distributed molecular signals in a system or a disease under study (O). Key findings from human AD single-cell sequencing data will be validated in AD mouse models, and integration of mouse and human single-cell data is critical for informing the correspondence between AD mouse models and human AD (P).
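
As a rough illustration of the QC and normalization step (B), the sketch below filters out low-quality cells and then applies library-size normalization followed by a log transform. It is a minimal, pure-Python sketch and not the pipeline used in the article: the function names, the toy count matrix, and the thresholds (`min_genes`, `min_total`, `target_sum`) are assumptions chosen for illustration only. In practice this step is usually carried out with dedicated toolkits such as Seurat or Scanpy.

```python
import math


def qc_filter(counts, min_genes=2, min_total=5):
    """Keep cells with enough detected genes and enough total counts.

    counts: dict mapping a cell ID to a list of raw counts (one per gene).
    The thresholds are hypothetical and far smaller than real-data values.
    """
    kept = {}
    for cell, row in counts.items():
        n_genes = sum(1 for c in row if c > 0)  # genes detected in this cell
        total = sum(row)                        # library size of this cell
        if n_genes >= min_genes and total >= min_total:
            kept[cell] = row
    return kept


def normalize_log1p(counts, target_sum=10_000):
    """Scale each cell to target_sum total counts, then apply log(1 + x)."""
    normalized = {}
    for cell, row in counts.items():
        total = sum(row)
        if total == 0:
            # An empty cell cannot be scaled; leave it as all zeros.
            normalized[cell] = [0.0 for _ in row]
            continue
        normalized[cell] = [math.log1p(c / total * target_sum) for c in row]
    return normalized


# Toy raw count matrix: three cells, four genes (made-up numbers).
raw = {
    "cell_a": [10, 0, 5, 5],
    "cell_b": [1, 0, 0, 0],   # low quality: one gene detected, one count
    "cell_c": [0, 20, 20, 0],
}

filtered = qc_filter(raw)
norm = normalize_log1p(filtered)
```

Here `cell_b` is dropped by the QC filter, and the remaining cells are put on a comparable scale before dimension reduction and clustering (C, D).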
