Staff View: Efficient feature selection and classification of protein sequence data in bioinformatics

Efficient feature selection and classification of protein sequence data in bioinformatics

Main Authors:	Iqbal, M.J., Faye, I., Samir, B.B., Md Said, A.
Format:	Article
Institution:	Universiti Teknologi Petronas
Record Id / ISBN-0:	utp-eprints.32341 /
Published:	Hindawi Publishing Corporation 2014
Online Access:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-84904113027&doi=10.1155%2f2014%2f173869&partnerID=40&md5=f280cf37fafc0a3810f3bf162a4cf8ae http://eprints.utp.edu.my/32341/

id	utp-eprints.32341
recordtype	eprints
spelling	utp-eprints.32341 2022-03-29T05:27:34Z Efficient feature selection and classification of protein sequence data in bioinformatics Iqbal, M.J. Faye, I. Samir, B.B. Md Said, A. Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics were to store and manage the biological data, and develop and analyze computational tools to enhance their understanding. The size of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for the experimental methods. To reduce the gap between newly sequenced protein and proteins with known functions, many computational techniques involving classification and clustering algorithms were proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of large amount of newly discovered proteins. The existing classification results are unsatisfactory due to a huge size of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique has been proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. © 2014 Muhammad Javed Iqbal et al. Hindawi Publishing Corporation 2014 Article NonPeerReviewed https://www.scopus.com/inward/record.uri?eid=2-s2.0-84904113027&doi=10.1155%2f2014%2f173869&partnerID=40&md5=f280cf37fafc0a3810f3bf162a4cf8ae Iqbal, M.J. and Faye, I. and Samir, B.B. and Md Said, A. (2014) Efficient feature selection and classification of protein sequence data in bioinformatics. Scientific World Journal, 2014 . http://eprints.utp.edu.my/32341/
institution	Universiti Teknologi Petronas
collection	UTP Institutional Repository
description	Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics are to store and manage biological data and to develop and analyze computational tools that enhance its understanding. The size of the data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms have been proposed. The classification of protein sequences into existing superfamilies helps predict the structure and function of the large number of newly discovered proteins. Existing classification results are unsatisfactory because of the very large number of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique is proposed to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. © 2014 Muhammad Javed Iqbal et al.
format	Article
author	Iqbal, M.J. Faye, I. Samir, B.B. Md Said, A.
spellingShingle	Iqbal, M.J. Faye, I. Samir, B.B. Md Said, A. Efficient feature selection and classification of protein sequence data in bioinformatics
author_sort	Iqbal, M.J.
title	Efficient feature selection and classification of protein sequence data in bioinformatics
title_short	Efficient feature selection and classification of protein sequence data in bioinformatics
title_full	Efficient feature selection and classification of protein sequence data in bioinformatics
title_fullStr	Efficient feature selection and classification of protein sequence data in bioinformatics
title_full_unstemmed	Efficient feature selection and classification of protein sequence data in bioinformatics
title_sort	efficient feature selection and classification of protein sequence data in bioinformatics
publisher	Hindawi Publishing Corporation
publishDate	2014
url	https://www.scopus.com/inward/record.uri?eid=2-s2.0-84904113027&doi=10.1155%2f2014%2f173869&partnerID=40&md5=f280cf37fafc0a3810f3bf162a4cf8ae http://eprints.utp.edu.my/32341/
_version_	1741197720170790912
score	11.62408

Similar Items