NR-2L: a two-level predictor for identifying nuclear receptor subfamilies based on sequence-derived features.
Nuclear receptors (NRs) are one of the most abundant classes of transcriptional regulators in animals. They regulate diverse functions, such as homeostasis, reproduction, development and metabolism, which makes them an important target for drug development. NRs form a superfamily of phylogenetically related proteins and have been subdivided into subfamilies according to their domain diversity. In this study, a two-level predictor called NR-2L was developed. At the first level, it identifies whether a query protein is a nuclear receptor based on its sequence information alone; if it is, the prediction automatically continues to a second level that assigns the protein to one of the following seven subfamilies: (1) thyroid hormone like (NR1), (2) HNF4-like (NR2), (3) estrogen like (NR3), (4) nerve growth factor IB-like (NR4), (5) fushi tarazu-F1 like (NR5), (6) germ cell nuclear factor like (NR6), and (7) knirps like (NR0). The identification is made by the fuzzy K-nearest-neighbor (FK-NN) classifier operating on a pseudo amino acid composition that incorporates various physicochemical and statistical features derived from the protein sequences, such as amino acid composition, dipeptide composition, complexity factor, and low-frequency Fourier spectrum components. On low-redundancy benchmark datasets derived from NucleaRDB and UniProt, the overall success rates achieved by the jackknife test were about 93% at the first level and 89% at the second level. These high success rates indicate that the two-level predictor can be a useful tool for identifying NRs and their subfamilies. As a user-friendly web server, NR-2L is freely accessible at either http://icpr.jci.edu.cn/bioinfo/NR2L or http://www.jci-bioinfo.cn/NR2L.
Each job submitted to NR-2L can contain up to 500 query protein sequences and finishes in less than 2 minutes. The fewer the query proteins, the shorter the run time usually is. All the program code for NR-2L is available for non-commercial purposes upon request.
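The two-level scheme described above can be illustrated with a minimal Python sketch. It is not the NR-2L code, which is available only on request. It assumes a reduced feature vector (amino acid and dipeptide composition only, omitting the complexity factor and Fourier components) and a standard Keller-style fuzzy K-NN with crisp training labels. All function names, the label strings "NR" and "non-NR", and the parameters k and m are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def sequence_features(seq):
    """Return 20 amino acid frequencies followed by 400 dipeptide frequencies.

    Assumption: a simplified stand-in for the pseudo amino acid composition.
    """
    seq = "".join(c for c in seq.upper() if c in AMINO_ACIDS)
    aa_counts = Counter(seq)
    dp_counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_aa = max(len(seq), 1)
    n_dp = max(len(seq) - 1, 1)
    return ([aa_counts[a] / n_aa for a in AMINO_ACIDS]
            + [dp_counts[d] / n_dp for d in DIPEPTIDES])


def fuzzy_knn(query, train_x, train_y, k=3, m=2.0):
    """Fuzzy K-NN: return {label: membership}, with memberships summing to 1.

    Each of the k nearest neighbours votes with weight 1 / d^(2/(m-1)).
    """
    nearest = sorted(
        (math.dist(query, x), y) for x, y in zip(train_x, train_y)
    )[:k]
    eps = 1e-12  # avoids division by zero for exact matches
    weights = [(1.0 / (d ** (2.0 / (m - 1.0)) + eps), y) for d, y in nearest]
    total = sum(w for w, _ in weights)
    memberships = {}
    for w, y in weights:
        memberships[y] = memberships.get(y, 0.0) + w / total
    return memberships


def predict_two_level(seq, level1, level2, k=3):
    """level1 = (features, 'NR'/'non-NR' labels); level2 = (features, subfamilies)."""
    q = sequence_features(seq)
    m1 = fuzzy_knn(q, level1[0], level1[1], k=k)
    if max(m1, key=m1.get) != "NR":
        return "non-NR", None
    m2 = fuzzy_knn(q, level2[0], level2[1], k=k)
    return "NR", max(m2, key=m2.get)


def jackknife_accuracy(xs, ys, k=3):
    """Leave-one-out test: predict each sample from all the others."""
    hits = 0
    for i in range(len(xs)):
        m = fuzzy_knn(xs[i], xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], k=k)
        hits += max(m, key=m.get) == ys[i]
    return hits / len(xs)
```

In this sketch the second-level classifier is consulted only when the first level assigns the query to the NR class, mirroring the "identify, then subdivide" flow. The jackknife helper removes one sample at a time and predicts it from the rest, which is the general form of the test reported for the benchmark datasets.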
References in zbMATH (referenced in 9 articles)
- Srivastava, Abhishikha; Kumar, Ravindra; Kumar, Manish: BlaPred: predicting and classifying β-lactamase using a 3-tier prediction system via Chou’s general PseAAC (2018)
- Li, Yao-Wang; Li, Bo: Characterization of structure-antioxidant activity relationship of peptides in free radical systems using QSAR models: key sequence positions and their amino acid properties (2013)
- Xiao, Xuan; Min, Jian-Liang; Wang, Pu; Chou, Kuo-Chen: iCDI-PseFpt: identify the channel-drug interaction in cellular networking with PseAAC and molecular fingerprints (2013)
- Jahandideh, Samad; Mahdavi, Abbas: RFCRYS: sequence-based protein crystallization propensity prediction by means of random forest (2012)
- Li, Tao; Li, Qian-Zhong: Annotating the protein-RNA interaction sites in proteins using evolutionary information and protein backbone structure (2012)
- Mei, Suyu: Multi-kernel transfer learning based on Chou’s PseAAC formulation for protein submitochondria localization (2012)
- Mei, Suyu: Predicting plant protein subcellular multi-localization by Chou’s PseAAC formulation based multi-label homolog knowledge transfer learning (2012)
- Mishra, Pooja; Nath Pandey, Paras: Elman RNN based classification of proteins sequences on account of their mutual information (2012)
- Qiu, Zhijun; Wang, Xicheng: Prediction of protein-protein interaction sites using patch-based residue characterization (2012)