IJBBB 2018 Vol.8(4): 210-217 ISSN: 2010-3638
doi: 10.17706/ijbbb.2018.8.4.210-217
Evaluation of Database Annotation to Determine Human Mitochondrial Proteins
Katsuhiko Murakami, Masaharu Sugita
Abstract—Subcellular localization can be a helpful indication of the function of an unknown protein. Among
the reported human mitochondrial proteins, hundreds of proteins have still not been functionally confirmed.
To date, several databases for such proteins have been developed; however, their annotations only partially
overlap. Completing a reliable catalog of mitochondrial proteins requires integrating all of this information
and evaluating the influence of the different forms of evidence.
Here, we integrated various pieces of evidence (features) from both experimental and computational
analyses. Linear and nonlinear prediction models were examined to predict human mitochondrial proteins.
By employing a random forest model, an F-score of 0.929 was achieved by cross validation. The
contributions of individual features toward the accurate prediction of localization were evaluated. We found
only minor differences in importance among different features, with accurate prediction requiring the
combination of many features; however, evidence from mass spectrometry experiments emerged as a
prominent feature. Focusing on human mitochondrial proteins, we have constructed a high-accuracy
prediction model that utilizes many weak features. Evaluation of the importance of individual features
provides insights into what information is most valuable for the confirmation of protein localization.
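As an illustration of the evaluation metric mentioned above, the following is a minimal sketch of how an F-score can be computed from cross-validation predictions. It is not the authors' pipeline: the function names `f_score` and `mean_fold_f_score`, the use of the balanced F1 variant, the averaging over folds, and the toy labels are all assumptions made for this example.

```python
def f_score(y_true, y_pred, beta=1.0):
    """F-beta score for binary labels (1 = mitochondrial, 0 = other).

    With beta=1.0 this is the harmonic mean of precision and recall.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive prediction: define the score as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def mean_fold_f_score(folds):
    """Average the F-score over (y_true, y_pred) pairs, one per held-out fold."""
    scores = [f_score(y_true, y_pred) for y_true, y_pred in folds]
    return sum(scores) / len(scores)


# Hypothetical toy predictions for two held-out folds (not data from the paper).
toy_folds = [
    ([1, 1, 0, 0], [1, 1, 0, 1]),
    ([1, 0, 1, 0], [1, 0, 0, 0]),
]

if __name__ == "__main__":
    print(round(mean_fold_f_score(toy_folds), 3))
```

Averaging per-fold scores is only one convention; pooling the predictions from all folds before computing a single score is another, and the abstract does not say which was used.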
Index Terms—Database, machine learning, mitochondria, localization, random forest.
Katsuhiko Murakami is with the Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan (email: murakami.ktk@gmail.com).
Masaharu Sugita is with the School of Bioscience and Biotechnology, Tokyo University of Technology, 1404-1 Katakuramachi, Tokyo 192-0982, Japan.
Cite: Katsuhiko Murakami, Masaharu Sugita, "Evaluation of Database Annotation to Determine Human Mitochondrial Proteins," International Journal of Bioscience, Biochemistry and Bioinformatics vol. 8, no. 4, pp. 210-217, 2018.