COMP 328: Final Review, Spring 2010. Nevin L. Zhang, Department of Computer Science & Engineering, The Hong Kong University of Science & Technology. Can be used as a cheat sheet.
Pre-Midterm
- Algorithms for supervised learning
  - Decision trees
  - Instance-based learning
  - Naïve Bayes classifiers
  - Neural networks
  - Support vector machines
- General issues regarding supervised learning
  - Classification error and confidence interval
  - Bias-variance tradeoff
  - PAC learning theory
Post-Midterm
- Clustering
  - Distance-Based Clustering
  - Model-Based Clustering
- Dimension Reduction
  - Principal Component Analysis
- Reinforcement Learning
- Ensemble Learning
Clustering
Distance/Similarity Measures
Distance-Based Clustering
- Partitional and hierarchical clustering
K-Means: Partitional Clustering
- Different initial points might lead to different partitions
- Solution:
  - Multiple runs
  - Use an evaluation criterion such as SSE (sum of squared errors) to pick the best one
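The multiple-restart recipe above can be sketched in Python. This is a minimal 1-D k-means for illustration; the function names and the restriction to 1-D points are assumptions, not part of the course material.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """One run of k-means on 1-D points; returns (centers, sse)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            clusters[nearest].append(p)
        # update step: each center becomes the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    sse = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, sse

def best_of_restarts(points, k, restarts=10):
    """Run k-means from several initialisations, keep the lowest SSE."""
    runs = [kmeans(points, k, seed=s) for s in range(restarts)]
    return min(runs, key=lambda run: run[1])
```

Because each restart uses a different seed, a bad initial partition in one run is usually corrected by another, and SSE picks the winner.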
Hierarchical Clustering
- Agglomerative (bottom-up) and divisive (top-down)
Cluster Similarity
Cluster Validation
- External indices
  - Entropy: average purity of the clusters obtained
  - Mutual information between class label and cluster label
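A sketch of the entropy index, assuming it is defined as the size-weighted average entropy of the class-label distribution inside each cluster (0 when every cluster is pure):

```python
import math
from collections import Counter

def entropy_index(labels, clusters):
    """Size-weighted average entropy of the class labels within each
    cluster; 0 means every cluster is pure."""
    n = len(labels)
    total = 0.0
    for c in set(clusters):
        members = [lab for lab, cl in zip(labels, clusters) if cl == c]
        counts = Counter(members).values()
        # entropy of the label distribution inside this cluster
        h = -sum((m / len(members)) * math.log2(m / len(members))
                 for m in counts)
        total += (len(members) / n) * h
    return total
```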
Cluster Validation
- External measures
  - Jaccard index
  - Rand index
- Both measure agreement between two pair relations: in-same-class and in-same-cluster

                          # pairs in same cluster   # pairs in diff. cluster
  # pairs w/ same label              a                         b
  # pairs w/ diff. label             c                         d
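Both indices are computed from the four pair counts (a: same label and same cluster, b: same label only, c: same cluster only, d: neither), with Rand = (a + d)/(a + b + c + d) and Jaccard = a/(a + b + c). A small sketch (function names are illustrative):

```python
from itertools import combinations

def pair_counts(labels, clusters):
    """Count object pairs by label/cluster agreement:
    a: same label & same cluster,  b: same label, different cluster,
    c: different label, same cluster,  d: different on both."""
    a = b = c = d = 0
    for i, j in combinations(range(len(labels)), 2):
        same_label = labels[i] == labels[j]
        same_cluster = clusters[i] == clusters[j]
        if same_label and same_cluster:
            a += 1
        elif same_label:
            b += 1
        elif same_cluster:
            c += 1
        else:
            d += 1
    return a, b, c, d

def rand_index(labels, clusters):
    a, b, c, d = pair_counts(labels, clusters)
    return (a + d) / (a + b + c + d)

def jaccard_index(labels, clusters):
    a, b, c, _ = pair_counts(labels, clusters)
    return a / (a + b + c)
```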
Cluster Validation
- Internal measures
  - Dunn's index
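Dunn's index is the smallest inter-cluster distance divided by the largest intra-cluster diameter (larger is better). A 1-D sketch, with clusters encoded as lists of points (an assumed encoding for illustration):

```python
def dunn_index(clusters):
    """Dunn's index for 1-D clusters given as lists of points:
    min inter-cluster distance / max intra-cluster diameter."""
    # smallest distance between points in different clusters
    inter = min(abs(p - q)
                for i in range(len(clusters))
                for j in range(i + 1, len(clusters))
                for p in clusters[i] for q in clusters[j])
    # largest distance between points in the same cluster
    diam = max(abs(p - q) for c in clusters for p in c for q in c)
    return inter / diam
```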
Model-Based Clustering
- Assume the data are generated from a mixture model with K components
- Estimate the parameters of the model from the data
- Assign objects to clusters based on posterior probability: soft assignment
Gaussian Mixtures
Learning Gaussian Mixture Models
EM
- l(t): log-likelihood of the model after the t-th iteration
- l(t) increases monotonically with t
- But it might go to infinity in case of singularities
  - Solution: place a bound on the eigenvalues of the covariance matrix
- Local maxima
  - Multiple restarts
  - Use the likelihood to pick the best model
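These points can be illustrated with a minimal EM loop for a 1-D Gaussian mixture; the variance floor below plays the role of the eigenvalue bound, and the initialisation and iteration count are arbitrary choices for this sketch.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(xs, k=2, iters=50):
    """EM for a 1-D Gaussian mixture: a minimal sketch."""
    n = len(xs)
    lo, hi = min(xs), max(xs)
    # crude initialisation: means spread over the data range
    means = [lo + (j + 1) * (hi - lo) / (k + 1) for j in range(k)]
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        resp = []
        for x in xs:
            ps = [w * normal_pdf(x, m, v)
                  for w, m, v in zip(weights, means, variances)]
            s = sum(ps)
            resp.append([p / s for p in ps])
        # M-step: re-estimate parameters from the soft assignments
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # variance floor: avoids the singularity
    log_lik = sum(math.log(sum(w * normal_pdf(x, m, v)
                               for w, m, v in zip(weights, means, variances)))
                  for x in xs)
    return weights, means, variances, log_lik
```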
EM and K-Means
- K-means is hard-assignment EM
Mixture Variable for Discrete Data
Latent Class Model
Learning Latent Class Models
- EM always converges
Dimension Reduction
- Necessary because some data sets have so many attributes that they are difficult for learning algorithms to handle.
Principal Component Analysis
PCA Solution
PCA Illustration
Eigenvalues and Projection Error
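The eigenvalue/projection-error link: the variance captured along a principal component equals its eigenvalue, and the squared error of projecting onto the kept components equals the sum of the discarded eigenvalues. A closed-form 2-D sketch (the 2-D restriction is only to avoid needing a linear-algebra library):

```python
import math

def pca_2d(points):
    """PCA for 2-D points via the closed-form eigen-decomposition of the
    2x2 covariance matrix; returns (largest eigenvalue, smallest
    eigenvalue, first principal component as a unit vector)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # eigenvalues of [[sxx, sxy], [sxy, syy]] from trace and determinant
    tr, det = sxx + syy, sxx * syy - sxy ** 2
    disc = math.sqrt(max(tr ** 2 / 4 - det, 0.0))
    lam1, lam2 = tr / 2 + disc, tr / 2 - disc
    # eigenvector for lam1; handle the already-diagonal case separately
    if abs(sxy) > 1e-12:
        vx, vy = lam1 - syy, sxy
    else:
        vx, vy = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return lam1, lam2, (vx / norm, vy / norm)
```

For perfectly collinear data the second eigenvalue is 0, so projecting onto the first component loses nothing.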
Reinforcement Learning
Markov Decision Process
- A model of how an agent interacts with its environment
Markov Decision Process
Value Iteration
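A tabular value-iteration sketch with in-place sweeps; the dict-of-dicts MDP encoding (`transition[s][a]` mapping next states to probabilities, `reward[s][a]` a scalar) is an assumption for illustration:

```python
def value_iteration(states, actions, transition, reward,
                    gamma=0.9, eps=1e-6):
    """V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ],
    swept in place until the largest change is below eps."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                reward[s][a]
                + gamma * sum(p * V[s2] for s2, p in transition[s][a].items())
                for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V
```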
Reinforcement Learning
Q-Learning
- Derived from Q-function-based value iteration
- Ideas:
  - In-place/asynchronous value iteration
  - Approximate the expectation using samples
  - ε-greedy policy (for the exploration/exploitation tradeoff)
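Those three ideas combine into the familiar tabular Q-learning loop. In this sketch, `step(s, a)` is an assumed environment interface returning (next state, reward, done), and the starting state is taken to be the first entry of `states`:

```python
import random

def q_learning(step, states, actions, episodes=500,
               alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning sketch; step(s, a) is an assumed environment
    interface returning (next_state, reward, done)."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s, done = states[0], False
        while not done:
            # epsilon-greedy: explore with prob. epsilon, else exploit
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda b: Q[(s, b)])
            s2, r, done = step(s, a)
            # sample-based, in-place approximation of value iteration
            target = r if done else r + gamma * max(Q[(s2, b)] for b in actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q
```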
Temporal Difference Learning
- Sarsa is also a temporal-difference learning method
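The Sarsa update differs from Q-learning only in the target: it bootstraps from the action actually taken in the next state (on-policy) rather than the max over actions (off-policy). A one-step sketch:

```python
def sarsa_update(Q, s, a, r, s2, a2, alpha=0.5, gamma=0.9, done=False):
    """One Sarsa step: the target uses Q(s2, a2) for the action actually
    taken next, not max_b Q(s2, b) as in Q-learning."""
    target = r if done else r + gamma * Q[(s2, a2)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q
```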
Ensemble Learning
Bagging: Reduce Variance
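A bagging sketch: train each base model on a bootstrap resample of the data, then majority-vote their predictions; averaging many high-variance models is what reduces variance. The 1-nearest-neighbour base learner here is just an illustrative high-variance model, not part of the course material.

```python
import random
from collections import Counter

def bagging_predict(train, x, learner, n_models=25, seed=0):
    """Bagging sketch: fit each base model on a bootstrap resample of
    the training data, then majority-vote the models' predictions."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        # bootstrap: draw n training examples with replacement
        sample = [rng.choice(train) for _ in train]
        votes.append(learner(sample)(x))
    return Counter(votes).most_common(1)[0][0]

def nn_learner(sample):
    """Illustrative high-variance base learner: 1-nearest-neighbour
    over 1-D labeled points (value, label)."""
    return lambda x: min(sample, key=lambda p: abs(p[0] - x))[1]
```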
Boosting: Reduce Classification Error
AdaBoost: Exponential Error
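A sketch of AdaBoost over a fixed pool of weak classifiers: each round picks the classifier with the lowest weighted error eps, gives it vote alpha = 0.5 * ln((1 - eps)/eps), and exponentially up-weights the examples it misclassified (labels are assumed to be +1/-1; the clamping of eps is an implementation choice for this sketch):

```python
import math

def adaboost(train, stumps, rounds=10):
    """AdaBoost sketch: train is a list of (x, y) with y in {+1, -1},
    stumps is a pool of weak classifiers mapping x to +1/-1."""
    n = len(train)
    w = [1.0 / n] * n                     # example weights
    ensemble = []                         # (alpha, weak classifier) pairs
    for _ in range(rounds):
        # pick the weak classifier with the lowest weighted error
        errs = [sum(wi for wi, (x, y) in zip(w, train) if h(x) != y)
                for h in stumps]
        eps, h = min(zip(errs, stumps), key=lambda pair: pair[0])
        eps = min(max(eps, 1e-10), 1 - 1e-10)  # clamp away from 0 and 1
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, h))
        # exponential reweighting: up-weight the examples h got wrong
        w = [wi * math.exp(-alpha if h(x) == y else alpha)
             for wi, (x, y) in zip(w, train)]
        z = sum(w)
        w = [wi / z for wi in w]

    def predict(x):
        return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
    return predict
```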