Table 3 Summary of CNV calling methods for scRNA-seq data

From: Guidelines for bioinformatics of single-cell sequencing data analysis in Alzheimer’s disease: review, recommendation, implementation and application

| Method | Brief explanation | Input | Resolution | Advantages | Disadvantages |
| --- | --- | --- | --- | --- | --- |
| InferCNV | Hidden Markov model (i3 or i6 model) plus Bayesian analysis. The i3 model has three states: deletion, neutral, and amplification. The i6 model has six states: complete loss, loss of one copy, neutral, addition of one copy, addition of two copies, and more than three copies. | Expression profiling | Large-scale, chromosome-scale CNVs | 1) Works both with and without a normal-cell reference; 2) offers two analysis modes: treating predefined cell types as whole samples, or subclustering cells by CNV pattern; 3) provides an interactive R Shiny web app. | Assumes the copy number dosage is constant over the whole predicted region. |
| HoneyBADGER | Hidden Markov model and Bayesian approach | Allelic imbalance and normalized expression profiling | Robust identification of subclonal focal alterations as small as 10 Mb; identification of chromosome-arm-level CNVs at frequencies as low as 30% of target cells, and of full-chromosome-level CNVs. | 1) Identifies CNVs as small as 10 Mb, a much higher resolution than methods based on average expression; 2) detects copy-number-neutral loss-of-heterozygosity events. | 1) Relies on WES or on common natural SNP information from other public datasets as a reference to generate heterozygous SNP positions; 2) distinguishes copy-number-altered regions from copy-number-neutral regions rather than estimating precise copy number. |
| CONICS | Compares the control distribution with the observed distribution at each CNV region (CNVR) in each cell. | Expression profiling | CNV regions inferred from other DNA sequencing data, or the chromosome-arm level. | Provides routines for further differential-expression, phylogeny, and co-expression network analyses. | 1) Requires CNV locations predefined from orthogonal DNA sequencing data such as WES; 2) cannot identify novel CNV regions. |
| CONICSmat | Bayesian approach: chi-squared likelihood-ratio test comparing a 2-component Gaussian mixture model with a 1-component Gaussian model. | Expression profiling | Chromosome-arm level | 1) Needs neither an explicit normal control dataset nor DNA-sequencing data; 2) provides routines for further differential-expression, phylogeny, and co-expression network analyses. | 1) Identifies CNVs only at the megabase scale; 2) cannot identify gene-level CNVs. |
| CaSpER | Hidden Markov model and Bayesian approach | Allele frequency shift and expression profiling | Large-scale, gene-based, and segment-based CNV calls | 1) Variant calling is not needed, which speeds up the whole detection process; 2) provides several downstream analyses: inferring clonal evolution, discovering mutually exclusive and co-occurring CNV events, and identifying gene expression signatures of the identified clones. | 1) The true positive rate reaches only 60–80%; 2) detection accuracy is much higher for deletions than for amplifications. |
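
The CONICSmat entry above describes a likelihood-ratio test between a 1-component Gaussian and a 2-component Gaussian mixture. The following is a minimal Python sketch of that idea on synthetic per-cell scores for one chromosome arm; it is not CONICSmat's own code (CONICSmat is an R package), and every function name here is hypothetical. It assumes 3 degrees of freedom for the chi-squared reference (the difference in free parameters, 5 versus 2), a median-split initialisation for EM, and simulated data; the chi-squared approximation is known to be only rough for mixture models, so the p-value should be read as a heuristic score.

```python
import math
import random


def normal_logpdf(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def loglik_one_gaussian(xs):
    """Maximised log-likelihood of a single Gaussian (2 free parameters)."""
    n = len(xs)
    mean = sum(xs) / n
    var = max(sum((x - mean) ** 2 for x in xs) / n, 1e-6)
    return sum(normal_logpdf(x, mean, var) for x in xs)


def _log_joint(x, weights, means, variances):
    """log(weight_k * N(x | mean_k, var_k)) for each of the two components."""
    return [math.log(weights[k]) + normal_logpdf(x, means[k], variances[k])
            for k in range(2)]


def _logsumexp(values):
    """Numerically stable log(sum(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def loglik_two_gaussians(xs, n_iter=200):
    """Log-likelihood of a 2-component mixture fitted by EM (5 free parameters)."""
    n = len(xs)
    if n < 4:
        raise ValueError("need at least 4 observations")
    ordered = sorted(xs)
    half = n // 2
    # Assumption: initialise the two means from a median split of the data.
    means = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, 1e-6)
    variances = [grand_var, grand_var]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each cell.
        resp = []
        for x in xs:
            lj = _log_joint(x, weights, means, variances)
            norm = _logsumexp(lj)
            resp.append([math.exp(v - norm) for v in lj])
        # M-step: re-estimate weights, means, and variances.
        counts = [max(sum(r[k] for r in resp), 1e-9) for k in range(2)]
        total = sum(counts)
        for k in range(2):
            weights[k] = counts[k] / total
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / counts[k]
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs))
            variances[k] = max(spread / counts[k], 1e-6)  # floor avoids collapse
    return sum(_logsumexp(_log_joint(x, weights, means, variances)) for x in xs)


def chi2_sf_df3(stat):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    if stat <= 0.0:
        return 1.0
    tail = (math.erfc(math.sqrt(stat / 2.0))
            + math.sqrt(2.0 * stat / math.pi) * math.exp(-stat / 2.0))
    return min(1.0, tail)


def mixture_lrt(xs):
    """Return (likelihood-ratio statistic, approximate p-value) for 2 vs 1 components."""
    # EM can stop at a local optimum, so the statistic is clamped at zero.
    stat = max(0.0, 2.0 * (loglik_two_gaussians(xs) - loglik_one_gaussian(xs)))
    return stat, chi2_sf_df3(stat)


# Synthetic example: one arm with a single population, one with a shifted subclone.
rng = random.Random(0)
neutral = [rng.gauss(0.0, 0.3) for _ in range(300)]
mixed = ([rng.gauss(0.0, 0.3) for _ in range(150)]
         + [rng.gauss(2.0, 0.3) for _ in range(150)])
stat_neutral, p_neutral = mixture_lrt(neutral)
stat_mixed, p_mixed = mixture_lrt(mixed)
```

In this sketch, a chromosome arm whose scores split into two well-separated groups yields a far larger statistic than an arm with a single population, which is the signal such a test uses to flag a candidate arm-level CNV.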