Secator : a program for inferring protein subfamilies from phylogenetic trees. With the huge increase of protein data, an important problem is to estimate, within a large protein family, the number of sensible subsets for subsequent in-depth structural, functional, and evolutionary analyses. To tackle this problem, we developed a new program, Secator, which implements the principle of an ascending hierarchical method using a distance matrix based on a multiple alignment of protein sequences. Dissimilarity values assigned to the nodes of a deduced phylogenetic tree are partitioned by a new stopping rule introduced to automatically determine the significant dissimilarity values. The quality of the clusters obtained by Secator is verified by a separate Jackknife study. The method is demonstrated on 24 large protein families covering a wide spectrum of structural and sequence conservation and its usefulness and accuracy with real biological data is illustrated on two well-studied protein families (the Sm proteins and the nuclear receptors).

References in zbMATH (referenced in 2 articles )

Showing results 1 to 2 of 2.
Sorted by year (citations)

  1. Murua, Alejandro; Wicker, Nicolas: Kernel-based mixture models for classification (2015)
  2. Kelil, Abdellali; Wang, Shengrui; Brzezinski, Ryszard; Fleury, Alain: CLUSS: Clustering of protein sequences based on a new similarity measure (2007) ioport