Abstract:
During the past, there was a massive growth of knowledge of unknown proteins with the advancement of high throughput microarray technologies. Protein function prediction is the most challenging problem in bioinformatics. In the past, the homology based approaches were used to predict the protein function, but they failed when a new protein was different from the previous one. Therefore, to alleviate the problems associated with homology based traditional approaches, numerous computational intelligence techniques have been proposed in the recent past. This paper presents a state-of-the-art comprehensive review of various computational intelligence techniques for protein function predictions using sequence, structure, protein-protein interaction network, and gene expression data used in wide areas of applications such as prediction of DNA and RNA binding sites, subcellular localization, enzyme functions, signal peptides, catalytic residues, nuclear/G-protein coupled receptors, membrane proteins, and pathway analysis from gene expression datasets. This paper also summarizes the result obtained by many researchers to solve these problems by using computational intelligence techniques with appropriate datasets to improve the prediction performance. The summary shows that ensemble classifiers and integration of multiple heterogeneous data are useful for protein function prediction.