A hybrid unsupervised and supervised clustering applied to microarray data

Raul Malutan, Pedro Gomez Vilda, Monica Borda


This work shows how to determine an optimal combination of clustering algorithms by performing a hybrid biclustering of the data with unsupervised methods, and how to extract coherent, typically small clusters of genes that vary as much as possible across the samples using a supervised method such as Gene Shaving.

DOI: http://dx.doi.org/10.11601/ijates.v2i3.21

