MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction.
BACKGROUND: Knowledge of the subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology, since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for prediction methods that are more robust and more accurate.
RESULTS: We extended our previous MultiLoc predictor by incorporating phylogenetic profiles and Gene Ontology terms. Two different datasets were used for training the system, resulting in two versions of this high-accuracy prediction method. One version is specialized for globular proteins and predicts up to five localizations, whereas the second version covers all eleven main eukaryotic subcellular localizations. In a benchmark study with five localizations, MultiLoc2 performs considerably better than other methods for animal and plant proteins and comparably for fungal proteins. Furthermore, MultiLoc2 performs clearly better when using a second dataset that extends the benchmark study to all eleven main eukaryotic subcellular localizations.
CONCLUSION: MultiLoc2 is an extensive high-performance subcellular protein localization prediction system. By incorporating phylogenetic profiles and Gene Ontology terms, MultiLoc2 yields higher accuracies than its previous version. Moreover, it outperforms other prediction systems in two benchmark studies. MultiLoc2 is available as a free, user-friendly web service at http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc2.
References in zbMATH (referenced in 8 articles)
- Han, Guo-Sheng; Yu, Zu-Guo; Anh, Vo: A two-stage SVM method to predict membrane protein types by incorporating amino acid classifications and physicochemical properties into a general form of Chou’s PseAAC (2014)
- Mei, Suyu: SVM ensemble based transfer learning for large-scale membrane proteins discrimination (2014)
- Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan: GOASVM: a subcellular location predictor by incorporating term-frequency gene ontology into the general form of Chou’s pseudo-amino acid composition (2013)
- Mei, Suyu: Multi-kernel transfer learning based on Chou’s PseAAC formulation for protein submitochondria localization (2012)
- Mei, Suyu: Predicting plant protein subcellular multi-localization by Chou’s PseAAC formulation based multi-label homolog knowledge transfer learning (2012)
- Mei, S.; Wang, F.; Zhou, S.: Gene ontology based transfer learning for protein subcellular localization (2011)
- Zakeri, Pooya; Moshiri, Behzad; Sadeghi, Mehdi: Prediction of protein submitochondria locations based on data fusion of various features of sequences (2011)
- Blum, Torsten; Briesemeister, Sebastian; Kohlbacher, Oliver: MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction (2009)