MicroArray Data Analysis Candice Quadros & Amol Kothari
Harnessing the power of a neural network for classifying samples. Neural Network for classification
Reduce the number of genes: We have to reduce the data dimensionality, i.e. reduce the number of genes to consider. PCA can be used to select the most informative genes, but computing the eigenvectors of high-dimensional data is computationally expensive. Instead, use the method suggested by Golub et al. to obtain the informative genes.
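As a rough illustration of the gene-selection step, the sketch below ranks genes by the signal-to-noise score from Golub et al., (mean_a - mean_b) / (std_a + std_b), for a two-class problem. The function names, the use of the sample standard deviation, and ranking by absolute score are assumptions made for this example; the slides do not say how the top genes were picked.

```python
from statistics import mean, stdev

def signal_to_noise(values_a, values_b):
    """Golub-style score for one gene: (mean_a - mean_b) / (std_a + std_b)."""
    spread = stdev(values_a) + stdev(values_b)
    if spread == 0:
        return 0.0  # sketch-level choice to avoid dividing by zero
    return (mean(values_a) - mean(values_b)) / spread

def select_informative_genes(expression, labels, n_genes):
    """Return the indices of the n_genes genes with the largest |score|.

    expression: list of samples, each a list of per-gene values.
    labels: one class label per sample (exactly two distinct classes,
            each with at least two samples).
    """
    class_a, class_b = sorted(set(labels))
    scores = []
    for g in range(len(expression[0])):
        a = [row[g] for row, y in zip(expression, labels) if y == class_a]
        b = [row[g] for row, y in zip(expression, labels) if y == class_b]
        scores.append(signal_to_noise(a, b))
    # Assumption: rank by absolute score, so genes high in either class count.
    ranked = sorted(range(len(scores)), key=lambda g: abs(scores[g]), reverse=True)
    return ranked[:n_genes]

# Toy data: gene 0 separates the classes, genes 1 and 2 barely do.
expression = [[1.0, 5.0, 2.0], [1.2, 5.1, 8.0], [3.0, 5.0, 2.1], [3.1, 5.2, 7.9]]
labels = ["ALL", "ALL", "AML", "AML"]
top_genes = select_informative_genes(expression, labels, 2)
```

Each gene is scored independently, so the cost grows linearly with the number of genes, which is what makes this cheaper than an eigendecomposition on the full gene set.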
Steps in classification:
1. Obtain the informative genes using Golub's method.
2. Normalize each gene by subtracting its mean and dividing by its standard deviation.
3. Train the neural network on the training data and targets to obtain the weights.
4. Classify the test data using the weights obtained above.
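The normalization, training, and classification steps above can be sketched with NumPy as follows. This is a minimal illustration, not the network used for the reported results: the single sigmoid hidden layer, the cross-entropy gradient, the hyperparameters (`n_hidden`, `epochs`, `lr`, `seed`), and the choice to normalize test data with training statistics are all assumptions.

```python
import numpy as np

def zscore(train, test):
    """Standardize each gene with the training mean and standard deviation."""
    mu = train.mean(axis=0)
    sigma = train.std(axis=0)
    sigma[sigma == 0] = 1.0  # leave constant genes unscaled
    return (train - mu) / sigma, (test - mu) / sigma

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_network(x, targets, n_hidden=4, epochs=2000, lr=0.5, seed=0):
    """Fit a one-hidden-layer network by batch gradient descent; return weights."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=0.5, size=(x.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(scale=0.5, size=n_hidden)
    b2 = 0.0
    for _ in range(epochs):
        hidden = sigmoid(x @ w1 + b1)
        out = sigmoid(hidden @ w2 + b2)
        # Gradient of the mean cross-entropy loss at the output pre-activation.
        d_out = (out - targets) / len(targets)
        d_hidden = np.outer(d_out, w2) * hidden * (1.0 - hidden)
        w2 -= lr * hidden.T @ d_out
        b2 -= lr * d_out.sum()
        w1 -= lr * x.T @ d_hidden
        b1 -= lr * d_hidden.sum(axis=0)
    return w1, b1, w2, b2

def classify(x, weights):
    """Return 0/1 class predictions using the trained weights."""
    w1, b1, w2, b2 = weights
    return (sigmoid(sigmoid(x @ w1 + b1) @ w2 + b2) >= 0.5).astype(int)

# Toy data standing in for the selected informative genes (two genes here).
train = np.array([[1.0, 2.0], [2.0, 1.0], [1.5, 1.5],
                  [8.0, 9.0], [9.0, 8.0], [8.5, 8.5]])
targets = np.array([0, 0, 0, 1, 1, 1])
test = np.array([[1.2, 1.8], [8.8, 8.2]])

train_n, test_n = zscore(train, test)
weights = train_network(train_n, targets)
predictions = classify(test_n, weights)
```

Reusing the training mean and standard deviation for the test samples keeps the two sets on the same scale and avoids leaking test information into the model.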
Results obtained: table with columns No. of informative genes, No. of hidden units, NN accuracy, Golub accuracy.
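Hierarchical Merging: When to stop? Question: When should the merging stop? Suggested solutions: keep merging only while $\mathrm{Diameter}(C) \le \mathrm{MaxD}$, or while $\mathrm{Avg}(\mathrm{sim}(O_i, O_j))$ stays at or above a similarity threshold, for $O_i, O_j \in C$. These parameters are difficult to estimate in high dimensions.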
Another solution: Stop merging when m clusters are present. Problem: The m clusters might include single-point clusters. Use the concept of MinPts (from DBSCAN): a set of points is a significant cluster only if it contains at least MinPts points. Stop when there are m significant clusters.
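The MinPts stopping rule can be sketched as below. The slides do not name a linkage criterion or say whether to stop the first time m significant clusters appear, so single linkage, Euclidean distance, and stopping at the first match are assumptions here; `merge_until_significant` is a hypothetical name.

```python
from itertools import combinations
from math import dist

def single_link(a, b):
    """Smallest point-to-point distance between two clusters."""
    return min(dist(p, q) for p in a for q in b)

def count_significant(clusters, min_pts):
    """Number of clusters holding at least min_pts points."""
    return sum(1 for c in clusters if len(c) >= min_pts)

def merge_until_significant(points, m, min_pts):
    """Merge the closest pair of clusters until m clusters are significant.

    Returns (clusters, history), where history[i] is the number of
    significant clusters after merge iteration i.
    """
    clusters = [[p] for p in points]
    history = []
    while len(clusters) > 1 and count_significant(clusters, min_pts) != m:
        # combinations yields i < j, so deleting clusters[j] keeps i valid.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        history.append(count_significant(clusters, min_pts))
    return clusters, history

# Two tight groups of three points plus one outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
clusters, history = merge_until_significant(points, m=2, min_pts=3)
```

In this toy run the outlier stays in its own cluster and is not counted, which is the case a plain "stop at m clusters" rule handles badly.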
Plot axes: No. of iterations, No. of significant clusters.
Visualization of data: Vizstruct
Equation used:
How do we weigh each dimension, i.e. how do we select λ? The default value is 0.5. Instead, use the eigenvalue of each dimension to obtain the value of λ.
Steps for visualization:
1. Project the data into eigenspace.
2. Take the eigenvalue of each dimension i as λ_i.
3. Use the same formula to calculate the 2D point, where λ_i is the eigenvalue of the i-th dimension.
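The steps above might look like the following NumPy sketch. The mapping formula itself does not survive in the slide text, so the code assumes a weighted first Fourier harmonic, z = sum_k λ_k x[k] exp(2πi k / N), as a stand-in; that form, its sign convention, and the function names are assumptions rather than the authors' exact equation.

```python
import numpy as np

def project_to_eigenspace(data):
    """Center the data and rotate it onto the covariance eigenvectors.

    Returns (projected, eigenvalues), ordered by decreasing eigenvalue.
    """
    centered = data - data.mean(axis=0)
    eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(eigenvalues)[::-1]
    return centered @ eigenvectors[:, order], eigenvalues[order]

def weighted_2d_points(projected, weights):
    """Map each N-dimensional row to a 2D point.

    ASSUMPTION: a weighted first Fourier harmonic stands in for the slide's
    formula, which is not present in the text.
    """
    n_dims = projected.shape[1]
    harmonics = np.exp(2j * np.pi * np.arange(n_dims) / n_dims)
    z = (projected * weights) @ harmonics
    return np.column_stack([z.real, z.imag])

# Toy data: 20 samples, 3 dimensions with very different spreads.
rng = np.random.default_rng(0)
data = rng.normal(size=(20, 3)) * np.array([5.0, 1.0, 0.2])

projected, eigenvalues = project_to_eigenspace(data)
points_eigen = weighted_2d_points(projected, eigenvalues)        # λ_i = eigenvalue
points_default = weighted_2d_points(projected, np.full(3, 0.5))  # λ = 0.5
```

With eigenvalue weights, high-variance directions contribute more to the 2D position than low-variance ones, whereas the default of 0.5 treats every dimension equally.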
Results: The visualization obtained by this method is more representative of the data than the one produced by the original Vizstruct. Demo
Thank You