Volume 1 Number 4 (Nov. 2011)
IJBBB 2011 Vol.1(4): 292-296 ISSN: 2010-3638
DOI: 10.7763/IJBBB.2011.V1.55

A Modified Fastmap K-Means Clustering Algorithm for Large Scale Gene Expression Datasets

Shital A. Raut and S. R. Sathe
Abstract—Clustering is an important data mining tool for extracting context or meaningful patterns from datasets, and cluster analysis is widely used to extract patterns from gene expression datasets. Over the last two to three years, public repositories such as GEO, ArrayExpress, and GenBank have grown noticeably, not only in the number of datasets but also in their dimensions: a single experiment may provide thousands of genes and hundreds of samples. Handling these large dimensions requires modifying the traditional clustering algorithms. K-means is one of the most widely used and tested clustering algorithms, not only for gene expression datasets but for many other kinds of data. However, as the dimension increases, its CPU time and memory requirements also increase. Here, we try to speed up the K-means algorithm by adding a phase (a modified FastMap) before K-means is applied to the dataset. The Modified FastMap K-means Clustering Algorithm (MFKCA) is thus a two-phase algorithm that tries to reduce CPU time and memory requirements compared with traditional K-means. We present tabular results for three datasets downloaded from the public NCBI GEO repository. The algorithm generates good results for large as well as small datasets.
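The two-phase idea described in the abstract (reduce dimensionality first, then cluster) can be illustrated with a minimal Python sketch. It uses the standard FastMap projection followed by plain Lloyd-style K-means. The abstract does not describe the authors' modification to FastMap, so this sketch does not reproduce MFKCA. The function names `fastmap` and `kmeans`, their parameters, and the toy data are assumptions made for illustration only.

```python
import math
import random


def fastmap(points, k, seed=0):
    """Project points (equal-length lists of floats) onto k axes with standard FastMap."""
    rng = random.Random(seed)
    n = len(points)
    coords = [[0.0] * k for _ in range(n)]

    def dist2(i, j, col):
        # Squared residual distance after removing the first `col` projected axes.
        d2 = sum((p - q) ** 2 for p, q in zip(points[i], points[j]))
        d2 -= sum((coords[i][c] - coords[j][c]) ** 2 for c in range(col))
        return max(d2, 0.0)

    for col in range(k):
        # Pivot heuristic: start at a random object, then jump to the farthest object twice.
        b = rng.randrange(n)
        a = max(range(n), key=lambda i: dist2(i, b, col))
        b = max(range(n), key=lambda i: dist2(i, a, col))
        dab2 = dist2(a, b, col)
        if dab2 == 0.0:
            break  # no residual distance left; remaining axes stay at 0
        dab = math.sqrt(dab2)
        for i in range(n):
            # Cosine-law projection of object i onto the line through pivots a and b.
            coords[i][col] = (dist2(a, i, col) + dab2 - dist2(b, i, col)) / (2.0 * dab)
    return coords


def kmeans(data, n_clusters, iters=100, seed=0):
    """Plain Lloyd-style K-means; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, n_clusters)]
    labels = [-1] * len(data)
    for _ in range(iters):
        new_labels = [
            min(range(n_clusters),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        for c in range(n_clusters):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy stand-in for expression profiles: 6 "genes" measured over 6 "samples".
genes = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.2, 0.0, 0.1, 0.0],
    [0.0, 0.2, 0.0, 0.1, 0.0, 0.1],
    [10.0, 10.0, 10.0, 10.0, 10.0, 10.0],
    [10.1, 10.0, 9.8, 10.0, 10.2, 10.0],
    [9.9, 10.1, 10.0, 10.0, 9.9, 10.2],
]
low_dim = fastmap(genes, k=2)               # phase 1: reduce 6 dimensions to 2
labels, centers = kmeans(low_dim, n_clusters=2)  # phase 2: cluster the projection
print(labels)
```

In this sketch K-means computes distances over 2 coordinates per object instead of 6, which is where a two-phase scheme of this kind would be expected to save time and memory. How much is saved on real gene expression data depends on the number of projected axes and on how well the projection preserves the cluster structure.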

Index Terms—Gene expression analysis, MFKCA, Cluster analysis.

The authors are with the Department of CSE, VNIT, Nagpur (phone: 0712-2801259; e-mail: rautsa@gmail.com).



Cite: Shital A. Raut and S. R. Sathe, "A Modified Fastmap K-Means Clustering Algorithm for Large Scale Gene Expression Datasets," International Journal of Bioscience, Biochemistry and Bioinformatics, vol. 1, no. 4, pp. 292-296, 2011.

General Information

ISSN: 2010-3638 (Online)
Abbreviated Title: Int. J. Biosci. Biochem. Bioinform.
Frequency: Quarterly 
DOI: 10.17706/IJBBB
Editor-in-Chief: Prof. Ebtisam Heikal 
Abstracting/ Indexing:  Electronic Journals Library, Chemical Abstracts Services (CAS), Engineering & Technology Digital Library, Google Scholar, and ProQuest.
E-mail: ijbbb@iap.org