STAR: ultrafast universal RNA-seq aligner. Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80–90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from
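
The "sequential maximum mappable seed search in uncompressed suffix arrays" mentioned in the abstract can be illustrated with a toy sketch. This is not STAR's C++ implementation: it assumes exact matching only, a single strand, and a naively built suffix array, and it omits the clustering and stitching stages entirely. The function names (`build_suffix_array`, `maximal_mappable_prefix`, `sequential_seed_search`) and the example genome and read are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def build_suffix_array(genome: str) -> List[int]:
    """Naive suffix array: start positions sorted by suffix (toy sizes only)."""
    return sorted(range(len(genome)), key=lambda i: genome[i:])


def _narrow(genome: str, sa: List[int], lo: int, hi: int,
            depth: int, ch: str) -> Tuple[int, int]:
    """Restrict sa[lo:hi], whose suffixes share a prefix of length `depth`,
    to the sub-range whose character at offset `depth` equals `ch`."""
    def char_at(k: int) -> str:
        pos = sa[k] + depth
        # A suffix that has ended sorts before any real character.
        return genome[pos] if pos < len(genome) else ""

    a, b = lo, hi
    while a < b:                      # first index with char >= ch
        m = (a + b) // 2
        if char_at(m) < ch:
            a = m + 1
        else:
            b = m
    new_lo, b = a, hi
    while a < b:                      # first index with char > ch
        m = (a + b) // 2
        if char_at(m) <= ch:
            a = m + 1
        else:
            b = m
    return new_lo, a


def maximal_mappable_prefix(genome: str, sa: List[int], read: str,
                            start: int) -> Tuple[int, List[int]]:
    """Longest prefix of read[start:] that occurs exactly in the genome,
    together with all genome positions where it occurs."""
    lo, hi, depth = 0, len(sa), 0
    while start + depth < len(read):
        new_lo, new_hi = _narrow(genome, sa, lo, hi, depth, read[start + depth])
        if new_lo == new_hi:          # no suffix extends the match further
            break
        lo, hi, depth = new_lo, new_hi, depth + 1
    return depth, (sorted(sa[lo:hi]) if depth > 0 else [])


def sequential_seed_search(genome: str, sa: List[int],
                           read: str) -> List[Tuple[int, int, List[int]]]:
    """Repeat the prefix search on the still-unmapped remainder of the read.
    Returns (read_offset, seed_length, genome_positions) for each seed."""
    seeds, start = [], 0
    while start < len(read):
        length, positions = maximal_mappable_prefix(genome, sa, read, start)
        if length == 0:               # base absent from the genome: skip it
            start += 1
            continue
        seeds.append((start, length, positions))
        start += length
    return seeds


if __name__ == "__main__":
    # Hypothetical toy genome: exon "GATTACA", intron "GTCCCCCAG", exon "CTAGCA".
    genome = "TTGATTACAGTCCCCCAGCTAGCATT"
    read = "GATTACACTAGCA"            # spliced read spanning both exons
    print(sequential_seed_search(genome, build_suffix_array(genome), read))
```

On this toy input the read splits into two seeds, one per exon; the gap between the end of the first seed (genome position 9) and the start of the second (position 18) corresponds to the 9-base intron. A real aligner would then have to decide whether such seeds can be stitched into a single spliced alignment, which this sketch does not attempt.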

References in zbMATH (referenced in 15 articles)

  1. Bima, Abdulhadi Ibrahim H.; Elsamanoudy, Ayman Zaky; Albaqami, Walaa F.; Khan, Zeenath; Parambath, Snijesh Valiya; Al-Rayes, Nuha; Kaipa, Prabhakar Rao; Elango, Ramu; Banaganapalli, Babajan; Shaik, Noor A.: Integrative system biology and mathematical modeling of genetic networks identifies shared biomarkers for obesity and diabetes (2022)
  2. Akalin, Altuna: Computational genomics with R. With the assistance of Vedran Franke, Bora Uyar and Jonathan Ronen (2021)
  3. Acuña, V.; Grossi, R.; Italiano, G. F.; Lima, L.; Rizzi, R.; Sacomoto, G.; Sagot, M.-F.; Sinaimeri, B.: On bubble generators in directed graphs (2020)
  4. Zhang, Lingxiang: Linearity tests and stochastic trend under the STAR framework (2020)
  5. Chao, Kuan-Hao; Hsiao, Yi-Wen; Lee, Yi-Fang; Lee, Chien-Yueh; Lai, Liang-Chuan; Tsai, Mong-Hsun; Lu, Tzu-Pin; Chuang, Eric Y.: RNASeqR: an R package for automated two-group RNA-Seq analysis workflow (2019) arXiv
  6. Teixeira, Andreia Sofia; Fernandes, Francisco; Francisco, Alexandre P.: SpliceTAPyR -- an efficient method for transcriptome alignment (2018)
  7. Wolff, Alexander: Analysis of expression profile and gene variation via development of methods for next generation sequencing data (2018)
  8. Faisal, Shahla; Tutz, Gerhard: Missing value imputation for gene expression data by tailored nearest neighbors (2017)
  9. Fu, Rong; Wang, Pei; Ma, Weiping; Taguchi, Ayumu; Wong, Chee-Hong; Zhang, Qing; Gazdar, Adi; Hanash, Samir M.; Zhou, Qinghua; Zhong, Hua; Feng, Ziding: A statistical method for detecting differentially expressed SNVs based on next-generation RNA-seq data (2017)
  10. Mao, Shunfu; Mohajer, Soheil; Ramachandran, Kannan; Tse, David; Kannan, Sreeram: abSNP: RNA-Seq SNP calling in repetitive regions via abundance estimation (2017)
  11. Shen, Carol; Shen, Tony; Lin, Jimmy: Comparative assessment of alignment algorithms for NGS data: features, considerations, implementations, and future (2017)
  12. Shomroni, Orr: Development of algorithms and next-generation sequencing data workflows for the analysis of gene regulatory networks (2017)
  13. Carugo, Oliviero (ed.); Eisenhaber, Frank (ed.): Data mining techniques for the life sciences (2016)
  14. Mathé, Ewy (ed.); Davis, Sean (ed.): Statistical genomics. Methods and protocols (2016)
  15. Picardi, Ernesto (ed.): RNA bioinformatics (2015)