CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data

Most existing dimensionality reduction and clustering packages for single-cell RNA-seq (scRNA-seq) data deal with dropouts by heavy modeling and computational machinery. Here, we introduce CIDR (Clustering through Imputation and Dimensionality Reduction), an ultrafast algorithm that uses a novel yet very simple implicit imputation approach to alleviate the impact of dropouts in scRNA-seq data in a principled manner. Using a range of simulated and real data, we show that CIDR improves the standard principal component analysis and outperforms the state-of-the-art methods, namely t-SNE, ZIFA, and RaceID, in terms of clustering accuracy. CIDR typically completes within seconds when processing a data set of hundreds of cells and minutes for a data set of thousands of cells. CIDR can be downloaded at https://github.com/VCCRI/CIDR.
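The abstract does not spell out how the implicit imputation works, so the following is only an illustrative sketch of one way dropouts could be softened while computing a pairwise cell dissimilarity. It is not the CIDR package's code. The function names, the dropout `threshold`, and the logistic `midpoint` and `steepness` parameters are all assumptions made for this example.

```python
import math


def dropout_weight(x, midpoint=2.0, steepness=2.0):
    """Hypothetical decreasing logistic weight (an assumption of this sketch).

    Close to 1 for lowly expressed genes, where a zero in the other cell is
    plausibly a dropout, and close to 0 for highly expressed genes.
    """
    return 1.0 / (1.0 + math.exp(steepness * (x - midpoint)))


def euclidean(a, b):
    """Plain Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def imputed_dissimilarity(a, b, threshold=0.5):
    """Euclidean-style dissimilarity with per-pair imputation of dropouts.

    For each gene, if exactly one cell is at or below `threshold` (a dropout
    candidate), its value is pulled toward the other cell's value by the
    dropout weight before the squared difference is taken. No imputed
    expression matrix is stored; only the dissimilarity is affected.
    """
    total = 0.0
    for x, y in zip(a, b):
        x_drop = x <= threshold
        y_drop = y <= threshold
        if x_drop and not y_drop:
            w = dropout_weight(y)
            x = w * y + (1.0 - w) * x
        elif y_drop and not x_drop:
            w = dropout_weight(x)
            y = w * x + (1.0 - w) * y
        total += (x - y) ** 2
    return math.sqrt(total)


# Toy log-scale expression vectors for two cells (made-up values).
cell_a = [1.5, 0.0, 6.0]  # second gene looks like a dropout
cell_b = [1.4, 1.6, 6.1]
print(euclidean(cell_a, cell_b), imputed_dissimilarity(cell_a, cell_b))
```

In a full pipeline, a dissimilarity matrix built this way would feed the dimensionality reduction and clustering steps named in the abstract. Those steps are omitted here to keep the sketch short.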
References in zbMATH (referenced in 7 articles)
- Ma, Xiuyu; Korthauer, Keegan; Kendziorski, Christina; Newton, Michael A.: A compositional model to assess expression changes from single-cell RNA-seq data (2021)
- Wang, Y. X. Rachel; Li, Lexin; Li, Jingyi Jessica; Huang, Haiyan: Network modeling in biology: statistical methods for gene and brain networks (2021)
- Lin, Zhixiang; Zamanighomi, Mahdi; Daley, Timothy; Ma, Shining; Wong, Wing Hung: Model-based approach to the joint analysis of single-cell data on chromatin accessibility and gene expression (2020)
- Liu, Yiyi; Warren, Joshua L.; Zhao, Hongyu: A hierarchical Bayesian model for single-cell clustering using RNA-sequencing data (2019)
- Park, Seyoung; Zhao, Hongyu: Sparse principal component analysis with missing observations (2019)
- Suner, Aslı: Clustering methods for single-cell RNA-sequencing expression data: performance evaluation with varying sample sizes and cell compositions (2019)
- Zhu, Lingxue; Lei, Jing; Devlin, Bernie; Roeder, Kathryn: A unified statistical framework for single cell and bulk RNA sequencing data (2018)