Published by Noah Norris. Modified over 9 years ago.
1
Lloyd's Algorithm: K-Means Clustering
2
Gene Expression
Susumu Ohno: whole genome duplications. The expression of genes can be measured over time. Identifying which genes are expressed at a given moment can help determine their function.
3
Grouping
Genes are grouped by derivative, so the data must be clustered by derivative.
4
Clustering Problems
Cluster d data points into k clusters, such that each point is closer to the points in its own cluster than to those in any other cluster. Real data is usually not that clearly organized.
5
Lloyd’s Algorithm
Assign each point to the cluster whose center is nearest. Then set each cluster's center of gravity as its new center. Repeat both steps until the centers no longer change. The goal is to minimize the squared error distortion, although the result is only guaranteed to be a local minimum.
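The slide names the squared error distortion without defining it. A minimal sketch follows, assuming the common definition: the mean, over all points, of the squared Euclidean distance from each point to its nearest center. The function names `squared_distance` and `distortion` and the sample data are illustrative and do not come from the presentation.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def distortion(points, centers):
    """Mean squared distance from each point to its nearest center.

    Assumed definition of squared error distortion (not stated on the slide).
    """
    total = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return total / len(points)


# Hypothetical data: two tight pairs of points and one center per pair.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
print(distortion(points, centers))  # 1.0
```

Every point here lies at squared distance 1 from its nearest center, so the mean is 1.0. Replacing both centers with a single center at (5.0, 1.0) raises the distortion to 26.0, which illustrates why centers placed inside the clusters are preferred.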
6
The Computational Problem
Input: A matrix of points, each with m dimensions, and the desired number of clusters k. Output: The points organized into k clusters, minimizing each point's distance from its cluster center, and a visual representation of the data.
7
Pseudo-pseudocode
Arbitrarily assign k centers. Assign each point to one of the k clusters, minimizing Euclidean distance from the center. Assign each cluster's center of gravity as its new center. Repeat the last two steps until the algorithm converges.
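The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the presenter's implementation: the names `nearest`, `center_of_gravity`, and `lloyd`, the `seed` and `max_iter` parameters, the choice of initial centers by sampling k input points, and the handling of empty clusters are all assumptions added here.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def center_of_gravity(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def lloyd(points, k, seed=0, max_iter=100):
    """Sketch of Lloyd's algorithm; returns (centers, clusters)."""
    rng = random.Random(seed)
    # Assumption: "arbitrary" centers are k distinct positions in the input list.
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Update step: move each center to its cluster's center of gravity.
        # Assumption: an empty cluster keeps its previous center.
        new_centers = [
            center_of_gravity(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: centers did not change
            break
        centers = new_centers
    return centers, clusters


# Hypothetical data: two loose groups of 2-D points.
pts = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (9.0, 9.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = lloyd(pts, k=2)
for center, cluster in zip(centers, clusters):
    print(center, cluster)
```

The result depends on the initial centers: the algorithm stops at a local minimum of the distortion, so different seeds can produce different clusterings of the same data. Running it from several starting points and keeping the lowest-distortion result is a common workaround.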
8
Plotting