


Subject: C4.5 Classification, Gene Ontology, Protein function prediction


Year: 2010


Type: Proceedings



Title: Hierarchical protein classification based on gene ontology and decision trees


Author: Ivanoska, Ilinka
Author: Trivodaliev, Kire
Author: Kalajdziski, Slobodan
Author: Mirceva, Georgina



Abstract: Proteins are the most important components of the cell, so knowing their exact function is of great significance. However, the function of a large number of proteins is still unknown. In addition, biologists today insist on a hierarchical organization of the living world, and this organization carries over to protein databases. Many protein classification algorithms have been proposed for determining protein function, but only a few of them take these hierarchical structures into account. The Gene Ontology (GO) is a protein and gene database structured as a controlled hierarchical vocabulary of terms for describing protein functions. This paper introduces a new hierarchical multi-label protein classifier that uses the relationships among the GO terms. First, protein descriptors are extracted from the structural coordinates stored in Protein Data Bank (PDB) files. Then, a modified C4.5 algorithm is applied to select the most appropriate descriptor features for protein classification based on the GO hierarchy. An evaluation of this approach is presented, and the results show that the hierarchical structure of GO is important for improving classification accuracy at higher levels.


Publisher:


Relation: ICT Innovations 2010



Identifier: oai:repository.ukim.mk:20.500.12188/22915
Identifier: http://hdl.handle.net/20.500.12188/22915



Title | Date | Views
Hierarchical protein classification based on gene ontology and decision trees | 2010 | 23