Published by Augustine Ethan Robbins. Modified over 8 years ago.
1
MicroArray Data Analysis
Candice Quadros & Amol Kothari
2
Neural Network for classification
Harnessing the power of a neural network for classifying samples.
3
Reduce the number of genes: We have to reduce the dimensionality of the data, i.e. the number of genes to consider. PCA can be used to select the most informative genes, but obtaining the eigenvectors is computationally expensive for high-dimensional data. Use the method suggested by Golub et al. to obtain the informative genes.
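As a rough illustration of the gene-selection step, the sketch below ranks genes by the signal-to-noise score described by Golub et al., P(g) = (mean_1 - mean_2) / (std_1 + std_2), and keeps an equal number of genes from each end of the ranking. The function name, the (samples x genes) layout and the 0/1 class labels are assumptions made for illustration; the slides do not show the authors' implementation.

```python
import numpy as np

def select_informative_genes(X, y, n_genes):
    """Rank genes by a signal-to-noise score and return the selected column indices.

    X: (samples, genes) expression matrix; y: binary labels (0 or 1).
    Assumes at least two samples per class.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    a, b = X[y == 0], X[y == 1]
    # Per-gene score: difference of class means over the sum of class std devs.
    denom = a.std(axis=0, ddof=1) + b.std(axis=0, ddof=1)
    score = (a.mean(axis=0) - b.mean(axis=0)) / np.where(denom == 0, np.inf, denom)
    # Take genes from both ends of the ranking so each class gets marker genes.
    order = np.argsort(score)
    half = n_genes // 2
    return np.concatenate([order[:half], order[len(order) - (n_genes - half):]])
```

Genes with a large positive score are high in class 0 and low in class 1; genes with a large negative score show the opposite pattern.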
4
Steps in classification:
1. Obtain the informative genes using Golub's method.
2. Normalize each gene by subtracting its mean and dividing by its standard deviation.
3. Train the neural network on the training data and targets to obtain the weights.
4. Classify the test data using the weights obtained above.
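The normalization, training and classification steps can be sketched as follows. This is a minimal example, not the authors' network: the single tanh hidden layer, the sigmoid output, the learning rate, the epoch count, and the choice to normalize the test data with the training-set statistics are all assumptions, since the slides give only the number of hidden units.

```python
import numpy as np

def zscore(train, test):
    """Standardize each gene using the training-set mean and standard deviation."""
    mu, sd = train.mean(axis=0), train.std(axis=0)
    sd = np.where(sd == 0, 1.0, sd)  # leave constant genes unscaled
    return (train - mu) / sd, (test - mu) / sd

def train_network(X, y, n_hidden=3, epochs=2000, lr=0.1, seed=0):
    """Fit a one-hidden-layer network by batch gradient descent; return its weights."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, n_hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                    # hidden activations
        p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))    # predicted P(class 1)
        d_out = (p - y) / len(y)                    # cross-entropy gradient at the output
        d_hid = np.outer(d_out, W2) * (1.0 - H**2)  # back-propagated through tanh
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum()
        W1 -= lr * (X.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return W1, b1, W2, b2

def classify(X, weights):
    """Label each row 0 or 1 using the trained weights."""
    W1, b1, W2, b2 = weights
    return (np.tanh(X @ W1 + b1) @ W2 + b2 > 0).astype(int)
```

A typical call would be `Ztr, Zte = zscore(Xtr, Xte)`, then `weights = train_network(Ztr, ytr, n_hidden=3)` and `classify(Zte, weights)`, where `Xtr` and `Xte` hold only the selected informative genes.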
5
Results obtained:
Informative genes | No. of hidden units | NN accuracy (%) | Golub accuracy (%)
100 | 3 | 70.55 | 61.76
200 | 13 | 76.74 | 58.82
6
Hierarchical Merging: When to stop?
Suggested solutions:
- Diameter(C) ≤ MaxD
- Avg(sim(O_i, O_j)) ≥ a minimum-similarity threshold, for O_i, O_j ∈ C
Problem: these parameters are difficult to estimate in high dimensions.
7
Another solution: stop merging when m clusters are present.
Problem: the m clusters might include single-point clusters.
Use the concept of MinPts (from DBSCAN): a set of points is a significant cluster only if it contains at least MinPts points. Stop when there are m significant clusters.
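The MinPts stopping rule can be sketched as below. The slides do not say which linkage is used, so merging the two clusters with the closest centroids is an assumption, as are the function and parameter names. The naive search is meant to show the stopping condition, not to be efficient.

```python
import numpy as np

def merge_until_significant(points, m, min_pts):
    """Merge the two closest clusters until m clusters hold >= min_pts points.

    Returns a list of clusters, each a list of row indices into `points`.
    """
    points = np.asarray(points, dtype=float)
    clusters = [[i] for i in range(len(points))]

    def significant(cs):
        # Count the clusters that are large enough to be considered significant.
        return sum(len(c) >= min_pts for c in cs)

    while len(clusters) > 1 and significant(clusters) < m:
        centroids = np.array([points[c].mean(axis=0) for c in clusters])
        # Pairwise centroid distances; mask the diagonal so a cluster never pairs with itself.
        dist = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=2)
        np.fill_diagonal(dist, np.inf)
        i, j = np.unravel_index(np.argmin(dist), dist.shape)
        i, j = min(i, j), max(i, j)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Clusters smaller than `min_pts` are still returned, so outliers remain visible as small leftover clusters instead of being forced into a significant one.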
8
[Plot: No. of Iterations vs. No. of Significant Clusters]
9
Visualization of data: Vizstruct
10
Equation used:
How do we weight each dimension, i.e. how do we select λ?
Default value: λ = 0.5.
Use the eigenvalue of each dimension to obtain the value of λ.
11
Steps for visualization:
1. Project the data into eigen space.
2. Take the eigenvalue of each dimension i as λ_i.
3. Use the same formula to calculate the 2D point, where λ_i is the eigenvalue of the i-th dimension.
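A minimal sketch of these steps follows. The mapping equation itself is not preserved in this transcript, so the code assumes a weighted first Fourier harmonic projection, the kind of mapping Vizstruct is built on, with λ_i multiplying the i-th term. Normalizing the eigenvalues to sum to 1 is a further assumption, and the function name is hypothetical.

```python
import numpy as np

def eigen_weighted_projection(X):
    """Map each row of X to a 2D point, weighting eigen-space dimensions by eigenvalue.

    X: (samples, dimensions) matrix with at least two dimensions.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    # Eigen-decomposition of the covariance matrix (eigh returns ascending eigenvalues).
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = np.clip(eigvals[order], 0.0, None), eigvecs[:, order]
    Z = Xc @ eigvecs               # data projected into eigen space
    lam = eigvals / eigvals.sum()  # assumed normalization of the weights
    n = Z.shape[1]
    # Assumed mapping: sum_i lam_i * z_i * exp(-2*pi*j*i/n), read as a point (real, imag).
    harmonic = np.exp(-2j * np.pi * np.arange(n) / n)
    F = (Z * lam) @ harmonic
    return np.column_stack([F.real, F.imag])
```

Because the weights follow the eigenvalues, dimensions that carry more variance contribute more to the 2D position than they would under a uniform default such as λ = 0.5.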
12
Results: The visualization obtained by this method is more representative of the data than the one produced by Vizstruct.
Demo
13
Thank You