Scoredist: A simple and robust protein sequence distance estimator. Results: We propose a correction-based protein sequence distance estimator called Scoredist. It applies a logarithmic correction to the observed divergence, which is derived from the alignment score under the BLOSUM62 score matrix. We evaluated Scoredist and a number of optimal matrix methods using three evolutionary models for both training and testing (Dayhoff, Jones-Taylor-Thornton, and Müller-Vingron), with the Whelan and Goldman model used solely for testing. Test alignments with known distances between 0.01 and 2 substitutions per position (1–200 PAM) were simulated using ROSE. Scoredist proved as accurate as the optimal matrix methods, yet substantially more robust: when trained on one model but tested on another, Scoredist was nearly always more accurate. The Jukes-Cantor and Kimura correction methods were also tested, but were substantially less accurate. Conclusion: The Scoredist distance estimator is fast to implement and run, and combines robustness with accuracy. Scoredist has been incorporated into the Belvu alignment viewer, which is available at ftp://ftp.cgb.ki.se/pub/prog/belvu/.
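The idea of a logarithmic correction applied to a score-based divergence can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the Belvu implementation: it assumes the alignment score is normalized between an expected random level and the average self-score of the two sequences, and that the distance is the logarithm of that ratio, scaled by a calibration factor. The function name `scoredist_sketch`, the parameters `expected_score` and `calibration`, the cap `max_distance`, and the toy scoring function `toy_score` are all hypothetical; a real run would use BLOSUM62 scores and a calibration factor trained on an evolutionary model.

```python
import math


def toy_score(a, b):
    """Hypothetical stand-in for a BLOSUM62 lookup: +4 match, -1 mismatch."""
    return 4 if a == b else -1


def scoredist_sketch(seq_a, seq_b, score, expected_score,
                     calibration=1.0, max_distance=300.0):
    """Score-based, log-corrected distance for two aligned sequences.

    score          -- function (a, b) -> substitution score (assumed input)
    expected_score -- assumed per-position score of unrelated sequences
    calibration    -- assumed scaling factor (would be trained on a model)
    max_distance   -- assumed cap returned when the score is at random level
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    # Use only columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")

    observed = sum(score(a, b) for a, b in pairs)
    # Upper bound: mean of the two self-alignment scores over the same columns.
    upper = (sum(score(a, a) for a, _ in pairs)
             + sum(score(b, b) for _, b in pairs)) / 2
    random_level = expected_score * len(pairs)

    # Shift both scores so that the random level sits at zero.
    norm = observed - random_level
    norm_upper = upper - random_level
    if norm <= 0 or norm_upper <= 0:
        return max_distance  # indistinguishable from unrelated sequences

    # Logarithmic correction, expressed in PAM-like units (x100).
    raw = math.log(norm_upper / norm) * 100
    return min(calibration * raw, max_distance)


if __name__ == "__main__":
    for other in ("ACDEFGHIKL", "ACDEFGHIKV", "ACDEFGHAAV"):
        d = scoredist_sketch("ACDEFGHIKL", other, toy_score, expected_score=-0.5)
        print(other, round(d, 2))
```

Taking the scoring function and both constants as arguments keeps the sketch independent of any particular matrix file or trained calibration value, so the toy values can be swapped for real ones without changing the logic.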
References in zbMATH (referenced in 3 articles)
- Brown, Daniel G.; Truszkowski, Jakub: Fast error-tolerant quartet phylogeny algorithms (2011)
- Lemey, Philippe; Lott, Martin; Martin, Darren P.; Moulton, Vincent: Identifying recombinants in human and primate immunodeficiency virus sequence alignments using quartet scanning (2009)
- Kelil, Abdellali; Wang, Shengrui; Brzezinski, Ryszard; Fleury, Alain: CLUSS: Clustering of protein sequences based on a new similarity measure (2007)