1. Randomized Hill Climbing

Presentation transcript:

1. Randomized Hill Climbing

Randomized Hill Climbing: Sample p points randomly in the neighborhood of the currently best solution; determine the best of the p sampled points. If it is better than the current solution, make it the new current solution and continue the search; otherwise, terminate and return the current solution.

Advantages: easy to apply, does not need many resources, usually fast.

Problems: How do I define my neighborhood? What parameter p should I choose?

Eick et al., ParCo11, Ghent
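The following R sketch (our own function and parameter names, not code from the original slides) shows one way to implement the procedure just described; the objective f and the neighborhood sampler are passed in as functions:

randomized_hill_climbing <- function(f, sample_neighbor, x0, p) {
  current <- x0                                   # current best solution
  current_val <- f(current)
  repeat {
    # sample p points randomly in the neighborhood of the current solution
    candidates <- replicate(p, sample_neighbor(current), simplify = FALSE)
    vals <- sapply(candidates, f)
    if (max(vals) > current_val) {                # improvement found: move and continue
      current <- candidates[[which.max(vals)]]
      current_val <- max(vals)
    } else {                                      # no improvement: terminate
      return(list(solution = current, value = current_val))
    }
  }
}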

Example: Randomized Hill Climbing

Maximize f(x,y,z) = |x-y-0.2| * |x*z-0.8| * |0.3-z*z*y| with x, y, z in [0,1].

Neighborhood Design: Create 50 solutions s, such that:
s = (min(1, max(0, x+r1)), min(1, max(0, y+r2)), min(1, max(0, z+r3))),
with r1, r2, r3 being random numbers in [-0.05, +0.05].

Eick et al., ParCo11, Ghent
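Reusing the randomized_hill_climbing sketch above, the slide's concrete objective and its clipped +/-0.05 neighborhood with p = 50 could be plugged in as follows (again our own code, and the seed is an arbitrary choice):

f <- function(s) abs(s[1]-s[2]-0.2) * abs(s[1]*s[3]-0.8) * abs(0.3-s[3]*s[3]*s[2])

sample_neighbor <- function(s) {
  r <- runif(3, min = -0.05, max = 0.05)  # r1, r2, r3
  pmin(1, pmax(0, s + r))                 # clip each coordinate back into [0,1]
}

set.seed(1)
result <- randomized_hill_climbing(f, sample_neighbor, x0 = runif(3), p = 50)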

Clustering Basics

We assume we cluster objects 1, ..., 7 and obtain the clusters {1,2}, {3}, {5,6,7}; object 4 is an outlier. The following matrix A summarizes the clustering:

A_ii := 1 if object i is not an outlier; otherwise, 0.
A_ij (with i < j) := 1 if objects i and j are in the same cluster; otherwise, 0.

For the example we obtain (only the diagonal and the upper triangle are defined):

      1  2  3  4  5  6  7
  1   1  1  0  0  0  0  0
  2      1  0  0  0  0  0
  3         1  0  0  0  0
  4            0  0  0  0
  5               1  1  1
  6                  1  1
  7                     1

Idea: To assess the agreement of two clusterings, we compute their respective matrices and then determine in what percentage of the entries they agree/disagree.
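A minimal R sketch of this idea (the helper names are ours): represent a clustering as a label vector with 0 marking outliers, build its matrix, and compare two clusterings on the defined entries (diagonal plus upper triangle):

cluster_matrix <- function(labels) {
  n <- length(labels)
  A <- matrix(0, n, n)
  for (i in 1:n) {
    if (labels[i] != 0) A[i, i] <- 1        # A_ii = 1 unless i is an outlier
    for (j in 1:n) {
      if (i < j && labels[i] != 0 && labels[i] == labels[j])
        A[i, j] <- 1                        # i and j share a (non-outlier) cluster
    }
  }
  A
}

agreement <- function(labels1, labels2) {
  A <- cluster_matrix(labels1)
  B <- cluster_matrix(labels2)
  keep <- upper.tri(A, diag = TRUE)         # only the defined entries
  mean(A[keep] == B[keep])                  # fraction of entries that agree
}

labels <- c(1, 1, 2, 0, 3, 3, 3)  # the example: {1,2} {3} {5,6,7}, object 4 an outlier
cluster_matrix(labels)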

Purity of a Clustering X

Purity(X) := (number of majority examples in clusters) / (number of examples that belong to clusters)

E.g., consider the following R code:

r <- dbscan(iris[3:4], 0.15, 3)
d <- data.frame(a=iris[,3], b=iris[,4], c=iris[,5], z=factor(r$cluster))
mytable <- table(d$c, d$z)

(The resulting contingency table did not survive in the transcript; its rows are the classes setosa, versicolor, and virginica, and its columns are the clusters.)

Remark 1: There are 20 outliers in the clustering r; cluster 0 contains the outliers!

Purity(d) = 114/(150-20) = 114/130 ≈ 87.7%

Remark 2: As there are 150 examples, 20 of which are outliers, the number of examples that belong to clusters is 150-20 = 130. Only cluster 2 is not pure (16 virginicas do not belong to the majority class Setosa of cluster 2); all other clusters are pure.
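Assuming, as the slide's call signature suggests, the fpc package's dbscan() and that the label 0 marks the outliers, the purity computation could be automated like this (a sketch, not the slide's original code):

library(fpc)                                      # provides dbscan()
r <- dbscan(iris[3:4], 0.15, 3)
mytable <- table(iris[, 5], factor(r$cluster))
clustered <- mytable[, colnames(mytable) != "0", drop = FALSE]  # drop the outlier column
purity <- sum(apply(clustered, 2, max)) / sum(clustered)  # majority counts / clustered examples
purity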