Semi-Supervised Boosting Fold Recognition Algorithm (SB-FR) for Protein Fold Recognition

Wessam H. El-Behaidy, Aliaa A. A. Youssif, and Atef Z. Ghalwash
Faculty of Computers and Information, Helwan University, Cairo, Egypt

Abstract—Protein structure prediction is a challenging problem in drug discovery and computational biology. To obtain a better multi-class classification model for the fold recognition problem, semi-supervised learning and boosting are combined in the proposed Semi-supervised Boosting Fold Recognition (SB-FR) algorithm. In addition, a testing method, “TreeTest”, is introduced to improve the overall accuracy of the SB-FR algorithm. To benchmark SB-FR, the well-known and challenging “Ding and Dubchak” dataset is used for training and testing, and different parameter settings are applied to the same random sets of labeled and unlabeled sequences. To benchmark “TreeTest”, the All-versus-All (AvA) testing method is used for comparison. Using SB-FR together with “TreeTest” for multi-class fold recognition improves overall accuracy over the base classifier by 5.6% for three-class classification and by 8.2% for five-class classification.

Index Terms—protein structure prediction, multi-class classification, fold recognition, semi-supervised, boosting, Ding and Dubchak dataset, All-vs-All

Cite: Wessam H. El-Behaidy, Aliaa A. A. Youssif, and Atef Z. Ghalwash, "Semi-Supervised Boosting Fold Recognition Algorithm (SB-FR) for Protein Fold Recognition," International Journal of Electrical Energy, Vol. 2, No. 1, pp. 182-187, March 2014. doi: 10.12720/ijoee.2.1.1-6
