TACOA - Taxonomic classification of environmental genomic fragments using a kernelized nearest neighbor approach.

Results: Our novel strategy was extensively evaluated using leave-one-out cross-validation on fragments of variable length (800 bp – 50 Kbp) from 373 completely sequenced genomes. TACOA classifies genomic fragments of length 800 bp and 1 Kbp with high accuracy up to the rank of class. For longer fragments (≥ 3 Kbp), accurate predictions are made at even deeper taxonomic ranks (order and genus). Remarkably, TACOA also produces reliable results when the taxonomic origin of a fragment is not represented in the reference set: such fragments are assigned to their known broader taxonomic class or simply labeled "unknown". We compared the classification accuracy of TACOA with that of the latest intrinsic classifier, PhyloPythia, using 63 recently published complete genomes. For fragments of length 800 bp and 1 Kbp, the overall accuracy of TACOA is higher than that of PhyloPythia at all taxonomic ranks. For all fragment lengths, both methods achieved comparably high specificity up to the rank of class, together with low false negative rates.

Conclusion: An accurate multi-class taxonomic classifier was developed for environmental genomic fragments. TACOA can predict with high reliability the taxonomic origin of genomic fragments as short as 800 bp. The proposed method is transparent, fast and accurate, and the reference set can be easily updated as newly sequenced genomes become available. Moreover, the method proved competitive with the most current classifier, PhyloPythia, and has the advantage that it can be installed locally, so the reference set can be kept up to date.
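To illustrate the general idea of a kernelized nearest neighbor classifier with an "unknown" fallback, here is a minimal Python sketch. It is not TACOA's implementation: the dinucleotide features, the Gaussian kernel, and the names `kmer_profile`, `gaussian_kernel`, `classify`, `bandwidth` and `min_share` are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from itertools import product


def kmer_profile(seq, k=2):
    """Relative k-mer frequencies of seq, in a fixed lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # k-mers with ambiguous bases are skipped
        if j is not None:
            counts[j] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(kmers)


def gaussian_kernel(x, y, bandwidth):
    """Similarity that decays with the squared distance between profiles."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * bandwidth ** 2))


def classify(fragment, references, k=2, bandwidth=0.05, min_share=0.6):
    """Assign fragment to the taxon with the largest kernel-weighted vote.

    references is a list of (taxon_label, sequence) pairs (hypothetical
    reference set). If no taxon collects at least min_share of the total
    kernel weight, the fragment is labeled "unknown".
    """
    query = kmer_profile(fragment, k)
    scores = defaultdict(float)
    for label, seq in references:
        scores[label] += gaussian_kernel(query, kmer_profile(seq, k), bandwidth)
    total = sum(scores.values())
    if total == 0.0:
        return "unknown"
    best = max(scores, key=scores.get)
    return best if scores[best] / total >= min_share else "unknown"


# Toy reference set with two made-up taxa.
refs = [("taxonA", "AT" * 50), ("taxonB", "GC" * 50)]
print(classify("ATATATATATAT", refs))  # close to taxonA's composition
print(classify("AAAAAAAAAAAA", refs))  # equally far from both taxa
```

In this toy setup the first fragment is assigned to `taxonA`, while the second is equally dissimilar to both references and falls back to `unknown`. A real reference set would use profiles of complete genomes and a tuned kernel width.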

References in zbMATH (referenced in 5 articles)

  1. Garbarine, Elaine; DePasquale, Joseph; Gadia, Vinay; Polikar, Robi; Rosen, Gail: Information-theoretic approaches to SVM feature selection for metagenome read classification (2011)
  2. Slabbinck, Bram; Waegeman, Willem; Dawyndt, Peter; De Vos, Paul; De Baets, Bernard: From learning taxonomies to phylogenetic learning: integration of 16S rRNA gene data into FAME-based bacterial classification (2010)
  3. Yang, Bin; Peng, Yu; Leung, Henry Chi-Ming; Yiu, Siu-Ming; Chen, Jing-Chi; Chin, Francis Yuk-Lun: Unsupervised binning of environmental genomic fragments based on an error robust selection of l-mers (2010)
  4. Diaz, Naryttza N.; Krause, Lutz; Goesmann, Alexander; Niehaus, Karsten; Nattkemper, Tim W.: TACOA - taxonomic classification of environmental genomic fragments using a kernelized nearest neighbor approach (2009)
  5. Gerlach, Wolfgang; Jünemann, Sebastian; Tille, Felix; Goesmann, Alexander; Stoye, Jens: WebCARMA: a web application for the functional and taxonomic classification of unassembled metagenomic reads (2009)