Presentation transcript:

Christoph F. Eick Questions and Topics Review Dec. 10, 2013

1. Compare AGNES/hierarchical clustering with K-means; what are the main differences?

2. K-means has a runtime complexity of O(t*k*n*d), where t is the number of iterations, d is the dimensionality of the dataset (the number of attributes an object has!), k is the number of clusters, and n is the number of objects in the dataset. Explain! In general, is K-means an efficient clustering algorithm? Give a reason for this answer by discussing its runtime, referring to its runtime complexity formula! [5]

3. Assume the Apriori-style sequence mining algorithm described at pages … is used and that the algorithm generated the 3-sequences listed below (see 2007 Final Exam!):

Frequent 3-sequences | Candidate Generation | Candidates that survived pruning
[table body not reproduced in the transcript]
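Since the 3-sequences themselves did not survive into the transcript, a generic sketch of Apriori-style candidate generation and pruning (with hypothetical sequences) may still make the table's three columns concrete: take the frequent 3-sequences, join them into candidate 4-sequences, then prune candidates that have an infrequent 3-subsequence. This follows the common GSP-style join/prune rules, which may differ in detail from the variant on the referenced pages.

```python
# Generic sketch of Apriori-style candidate generation for sequence
# mining (GSP-like). The frequent 3-sequences below are hypothetical;
# the actual sequences from the exam are not reproduced in the transcript.

def generate_candidates(freq_k):
    """Join step: s1 joins with s2 if dropping s1's first element equals
    dropping s2's last element; the candidate appends s2's last element."""
    return {s1 + (s2[-1],)
            for s1 in freq_k for s2 in freq_k
            if s1[1:] == s2[:-1]}

def prune(candidates, freq_k):
    """Prune step: a (k+1)-candidate survives only if every k-subsequence
    obtained by deleting one element is itself frequent."""
    freq_set = set(freq_k)
    return [c for c in candidates
            if all(c[:i] + c[i + 1:] in freq_set for i in range(len(c)))]

frequent_3 = [(1, 2, 3), (2, 3, 4), (1, 2, 4)]   # hypothetical input
cands = generate_candidates(frequent_3)
print(sorted(cands))              # [(1, 2, 3, 4)]
print(prune(cands, frequent_3))   # []: (1, 3, 4) is not frequent, so pruned
```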

Christoph F. Eick Answers Question 1

a. AGNES creates a set of clusterings/a dendrogram; K-means creates a single clustering.
b. K-means forms clusters using an iterative procedure that minimizes an objective function; AGNES forms the dendrogram by merging the closest two clusters until a single cluster is obtained.
c. …
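A minimal contrast of the two, assuming scipy and scikit-learn are available (the dataset reuses the four points from question 4): agglomerative clustering returns a complete merge hierarchy encoded in a linkage matrix, from which a flat clustering at any level can be cut, while K-means returns exactly one flat labeling.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

X = np.array([[1, 0], [0, 1], [1, 2], [3, 4]], dtype=float)

# AGNES-style agglomerative clustering: repeatedly merge the two closest
# clusters; the linkage matrix Z encodes the entire dendrogram.
Z = linkage(X, method="single")
print(Z)                                        # n-1 merges, one per row
print(fcluster(Z, t=2, criterion="maxclust"))   # one possible cut: 2 clusters

# K-means: a single flat clustering that locally minimizes the
# within-cluster sum of squared distances to the k centroids.
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X))
```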

Christoph F. Eick Answers Questions 2&3

Answer Question 2: t: #iterations; k: #clusters; n: #objects to be clustered; d: #attributes. In each iteration, all n points are compared with the k centroids to assign each point to its nearest centroid, which takes O(k*n) distance computations; each distance computation has complexity O(d). Therefore, the overall complexity is O(t*k*n*d).
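A naive K-means sketch (a hypothetical illustration, not the course's reference code) makes the four factors of O(t*k*n*d) visible: the outer loop runs t times, the assignment step compares all n points with all k centroids, and each distance computation scans d attributes.

```python
import random

def kmeans(points, k, t):
    """Naive K-means over a list of n points, each a list of d floats."""
    centroids = [list(c) for c in random.sample(points, k)]
    for _ in range(t):                                # t iterations
        clusters = [[] for _ in range(k)]
        for p in points:                              # n points
            # squared Euclidean distance to each of the k centroids,
            # each computed in O(d)
            dists = [sum((a - b) ** 2 for a, b in zip(p, c))
                     for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        for j, cl in enumerate(clusters):             # recompute centroids
            if cl:
                centroids[j] = [sum(vals) / len(cl) for vals in zip(*cl)]
    return centroids
```

The assignment step dominates: t * n * k distance computations at O(d) each gives O(t*k*n*d); since t, k, and d are usually small relative to n, the runtime is effectively linear in the number of objects, which is why K-means is considered an efficient clustering algorithm.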

Christoph F. Eick Questions and Topics Review Dec. 10, 2013

4. Gaussian Kernel Density Estimation and DENCLUE
a. Assume we have a 2D dataset X containing 4 objects: X={(1,0), (0,1), (1,2), (3,4)}; moreover, we use the Gaussian kernel density function to measure the density of X. Assume we want to compute the density at point (1,1); you can also assume h=1 (σ=1) and that we use Manhattan distance as the distance function. Give a sketch of how the Gaussian kernel density estimation approach determines the density at point (1,1). Be specific!
b. What is a density attractor? How does DENCLUE form clusters?

5. PageRank [8]
a) What does PageRank compute? What are the challenges in using the PageRank algorithm in practice? [3]
b) Give the equation system that PageRank would use for the webpage structure given below. Give a sketch of an approach that determines the PageRank of the 4 pages from this equation system! [5]
[Figure: link structure over the webpages P1, P2, P3, P4]

Christoph F. Eick Answer Question 4

4. Gaussian Kernel Density Estimation and DENCLUE
a. Assume we have a 2D dataset X containing 4 objects: X={(1,0), (0,1), (1,2), (3,4)}; moreover, we use the Gaussian kernel density function to measure the density of X. Assume we want to compute the density at point (1,1); you can also assume h=1 (σ=1) and that we use Manhattan distance as the distance function. Give a sketch of how the Gaussian kernel density estimation approach determines the density at point (1,1). Be specific!
b. What is a density attractor? How does DENCLUE form clusters?

a. The Manhattan distances from (1,1) to the four objects are 1, 1, 1, and 5, so the density at (1,1) is computed as follows:
f_X((1,1)) = e^(-1/2) + e^(-1/2) + e^(-1/2) + e^(-25/2) = 3e^(-1/2) + e^(-25/2)
b. A density attractor is a local maximum of the density function. DENCLUE iterates over the objects in the dataset and uses hill climbing to associate each object with a density attractor. Next, it forms clusters such that each cluster contains exactly the objects associated with the same density attractor; objects belonging to a cluster whose attractor's density is below a user-defined threshold are considered outliers.
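A quick numerical check of part a (a minimal sketch, assuming the unnormalized kernel sum used on the slide, f_X(q) = Σᵢ exp(-d(q,xᵢ)²/(2h²)) with h=1):

```python
import math

X = [(1, 0), (0, 1), (1, 2), (3, 4)]
q = (1, 1)
h = 1.0

def manhattan(p, r):
    return sum(abs(a - b) for a, b in zip(p, r))

# f_X(q) = sum_i exp(-d(q, x_i)^2 / (2 * h^2)); the distances are 1, 1, 1, 5
density = sum(math.exp(-manhattan(x, q) ** 2 / (2 * h * h)) for x in X)
print(density)  # 3*e^(-1/2) + e^(-25/2) ≈ 1.8196
```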

Christoph F. Eick Answers Questions 5 and 6

5a) What does PageRank compute? What are the challenges in using the PageRank algorithm in practice? [3]
It computes the probability that a webpage is accessed. [1] As there are huge numbers of webpages and links, finding an efficient, scalable algorithm is a major challenge. [2]

5b) Give the equation system that PageRank would use for the webpage structure given below. Give a sketch of an approach that determines the PageRank of the 4 pages from this equation system! [5]
PR(P1) = (1-d) + d * (PR(P3)/2 + PR(P4)/3)
PR(P2) = (1-d) + d * (PR(P3)/2 + PR(P4)/3 + PR(P1))
PR(P3) = (1-d) + d * PR(P4)/3
PR(P4) = 1-d
[One solution: initialize all PageRanks with 1 [0.5] and then update the PageRank of each page using the above 4 equations until there is some convergence. [1]]

6) A Delaunay triangulation for a set P of points in a plane is a triangulation DT(P) such that no point in P is inside the circumcircle of any triangle in DT(P).
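A direct transcription of the sketch in 5b into Python; the damping factor d = 0.85 is an assumption, as the slide leaves it unspecified:

```python
d = 0.85                                         # assumed damping factor
pr = {p: 1.0 for p in ("P1", "P2", "P3", "P4")}  # initialize all ranks with 1

for _ in range(100):  # update until (approximate) convergence
    new = {
        "P1": (1 - d) + d * (pr["P3"] / 2 + pr["P4"] / 3),
        "P2": (1 - d) + d * (pr["P3"] / 2 + pr["P4"] / 3 + pr["P1"]),
        "P3": (1 - d) + d * pr["P4"] / 3,
        "P4": 1 - d,
    }
    if max(abs(new[p] - pr[p]) for p in pr) < 1e-9:
        pr = new
        break
    pr = new

print(pr)  # PR(P4) = 1-d exactly; the other ranks follow from it
```

The Delaunay triangulation of question 6 can likewise be illustrated in a few lines, assuming scipy is available:

```python
import numpy as np
from scipy.spatial import Delaunay

P = np.array([[0, 0], [2, 0], [1, 2], [2, 2], [0.9, 0.8]])
tri = Delaunay(P)
# Each row lists the point indices of one triangle in DT(P); by definition,
# no point of P lies inside the circumcircle of any of these triangles.
print(tri.simplices)
```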

Christoph F. Eick Questions and Topics Review Dec. 10, 2013

6. What is a Delaunay triangulation?

7. SVM
a) The soft margin support vector machine solves the following optimization problem [the standard soft-margin objective: minimize ½||w||² + C·Σᵢ ξᵢ subject to yᵢ(w·xᵢ + b) ≥ 1 - ξᵢ, ξᵢ ≥ 0]: What does the second term minimize? Depict all non-zero ξᵢ in the figure below! What is the advantage of the soft margin approach over the linear SVM approach? [5]
b) Referring to the figure above, explain how examples are classified by SVMs! What is the relationship between ξᵢ and example i being classified correctly? [4]

Christoph F. Eick Answer Question 7

a. The second term minimizes the error, which is measured as the distance to the class's margin hyperplane for points that are on the wrong side of that hyperplane. [1.5] Depicting the non-zero ξᵢ: [2; at most 1 point if the distances are drawn to the wrong hyperplane]. The advantage is that the soft margin approach can deal with classification problems in which the examples are not linearly separable. [1.5]
b. The middle hyperplane (the decision boundary) is used to classify the examples. [1.5] If ξᵢ is less than or equal to half of the width of the margin, example i is classified correctly. In the figure, the length of the arrow for point i is the value of ξᵢ; for points i without an arrow, ξᵢ = 0.
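A small sketch, assuming scikit-learn, that recovers the slack values ξᵢ = max(0, 1 - yᵢ·f(xᵢ)) from a fitted soft-margin SVM. In these functional-margin units the margin hyperplanes sit at f(x) = ±1, so the half-width of the margin is 1 and example i is classified correctly exactly when ξᵢ < 1, in the spirit of the relationship described above:

```python
import numpy as np
from sklearn.svm import SVC

# Toy two-class data; one point per class is placed on the wrong side
# so the classes are not linearly separable and some slacks are non-zero.
X = np.array([[0, 0], [1, 0], [0, 1], [3, 3],
              [4, 4], [5, 3], [3, 5], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)
f = clf.decision_function(X)       # f(x_i) = w . x_i + b
xi = np.maximum(0.0, 1.0 - y * f)  # slack variables xi_i

for i, (fi, s) in enumerate(zip(f, xi)):
    side = "correct" if s < 1 else "wrong side of the decision boundary"
    print(f"x{i}: f={fi:+.3f}  xi={s:.3f}  ({side})")
```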