分群分析 (Cluster Analysis)

Slides:



Advertisements
Similar presentations
Social Media Marketing Research 社會媒體行銷研究 SMMR08 TMIXM1A Thu 7,8 (14:10-16:00) L511 Exploratory Factor Analysis Min-Yuh Day 戴敏育 Assistant Professor.
Advertisements

Data Warehousing 資料倉儲 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University Dept. of Information ManagementTamkang.
Case Study for Information Management 資訊管理個案
Data Mining 資料探勘 文字探勘與網頁探勘 (Text and Web Mining) Min-Yuh Day 戴敏育
Social Media Marketing Analytics 社群網路行銷分析 SMMA02 TLMXJ1A (MIS EMBA) Fri 12,13,14 (19:20-22:10) D326 社群網路行銷分析 (Social Media Marketing Analytics) Min-Yuh.
Data Mining 資料探勘 DM02 MI4 Wed, 7,8 (14:10-16:00) (B130) Association Analysis ( 關連分析 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information.
商業智慧實務 Practices of Business Intelligence BI04 MI4 Wed, 9,10 (16:10-18:00) (B113) 資料倉儲 (Data Warehousing) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授.
Case Study for Information Management 資訊管理個案
Data Warehousing 資料倉儲 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University Dept. of Information ManagementTamkang.
Business Intelligence 商業智慧 BI01 IM EMBA Fri 12,13,14 (19:20-22:10) D502 Introduction to Business Intelligence 商業智慧導論 Min-Yuh Day 戴敏育 Assistant Professor.
Social Media Marketing Research 社會媒體行銷研究 SMMR06 TMIXM1A Thu 7,8 (14:10-16:00) L511 Measuring the Construct Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授.
社會媒體服務專題 Special Topics in Social Media Services 淡江大學 淡江大學 資訊管理學系 資訊管理學系 Dept. of Information ManagementDept. of Information Management, Tamkang UniversityTamkang.
Data Mining 資料探勘 DM02 MI4 Thu. 9,10 (16:10-18:00) B513 關連分析 (Association Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information.
Web Mining ( 網路探勘 ) WM12 TLMXM1A Wed 8,9 (15:10-17:00) U705 Web Usage Mining ( 網路使用挖掘 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information.
商業智慧 Business Intelligence
Business Intelligence Trends 商業智慧趨勢 BIT08 MIS MBA Mon 6, 7 (13:10-15:00) Q407 商業智慧導入與趨勢 (Business Intelligence Implementation and Trends) Min-Yuh.
Social Media Management 社會媒體管理 SMM01 TMIXM1A Fri. 7,8 (14:10-16:00) L215 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management,
Case Study for Information Management 資訊管理個案 CSIM4B07 TLMXB4B (M1824) Tue 2, 3, 4 (9:10-12:00) B502 Telecommunications, the Internet, and Wireless.
分類與預測 (Classification and Prediction)
Social Media Marketing 社群網路行銷 SMM08 TLMXJ1A (MIS EMBA) Mon 12,13,14 (19:20-22:10) D504 社群網路行銷計劃 (Social Media Marketing Plan) Min-Yuh Day 戴敏育 Assistant.
Special Topics in Social Media Services 社會媒體服務專題 1 992SMS07 TMIXJ1A Sat. 6,7,8 (13:10-16:00) D502 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information.
Data Mining 資料探勘 Introduction to Data Mining 資料探勘導論 Min-Yuh Day 戴敏育
商業智慧實務 Practices of Business Intelligence BI01 MI4 Wed, 9,10 (16:10-18:00) (B113) 商業智慧導論 (Introduction to Business Intelligence) Min-Yuh Day 戴敏育.
Social Media Marketing Management 社會媒體行銷管理 SMMM12 TLMXJ1A Tue 12,13,14 (19:20-22:10) D325 確認性因素分析 (Confirmatory Factor Analysis) Min-Yuh Day 戴敏育.
Social Media Marketing Analytics 社群網路行銷分析 SMMA04 TLMXJ1A (MIS EMBA) Fri 12,13,14 (19:20-22:10) D326 測量構念 (Measuring the Construct) Min-Yuh Day 戴敏育.
Data Warehousing 資料倉儲 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University Dept. of Information ManagementTamkang.
商業智慧實務 Practices of Business Intelligence BI06 MI4 Wed, 9,10 (16:10-18:00) (B130) 資料科學與巨量資料分析 (Data Science and Big Data Analytics) Min-Yuh Day 戴敏育.
Social Media Marketing Research 社會媒體行銷研究 SMMR09 TMIXM1A Thu 7,8 (14:10-16:00) L511 Confirmatory Factor Analysis Min-Yuh Day 戴敏育 Assistant Professor.
Data Mining 資料探勘 DM04 MI4 Thu 9, 10 (16:10-18:00) B216 分群分析 (Cluster Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management,
Web Mining ( 網路探勘 ) WM04 TLMXM1A Wed 8,9 (15:10-17:00) U705 Unsupervised Learning ( 非監督式學習 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of.
Case Study for Information Management 資訊管理個案 CSIM4B08 TLMXB4B Thu 8, 9, 10 (15:10-18:00) B508 Securing Information System: 1. Facebook, 2. European.
Case Study for Information Management 資訊管理個案 CSIM4B03 TLMXB4B (M1824) Tue 3,4 (10:10-12:00) L212 Thu 9 (16:10-17:00) B601 Global E-Business and Collaboration:
Case Study for Information Management 資訊管理個案
商業智慧實務 Practices of Business Intelligence BI01 MI4 Wed, 9,10 (16:10-18:00) (B130) 商業智慧導論 (Introduction to Business Intelligence) Min-Yuh Day 戴敏育.
商業智慧實務 Practices of Business Intelligence
Business Intelligence Trends 商業智慧趨勢 BIT01 MIS MBA Mon 6, 7 (13:10-15:00) Q407 Course Orientation for Business Intelligence Trends 商業智慧趨勢課程介紹 Min-Yuh.
Data Mining 資料探勘 分類與預測 (Classification and Prediction) Min-Yuh Day 戴敏育
Data Mining 資料探勘 DM04 MI4 Wed, 6,7 (13:10-15:00) (B216) 分群分析 (Cluster Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management,
商業智慧實務 Practices of Business Intelligence BI05 MI4 Wed, 9,10 (16:10-18:00) (B113) 商業智慧的資料探勘 (Data Mining for Business Intelligence) Min-Yuh Day 戴敏育.
Data Mining 資料探勘 DM02 MI4 Thu 9, 10 (16:10-18:00) B216 關連分析 (Association Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information.
Social Media Marketing Research 社會媒體行銷研究 SMMR04 TMIXM1A Thu 7,8 (14:10-16:00) L511 Marketing Research Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授.
Data Mining 資料探勘 DM01 MI4 Wed, 6,7 (13:10-15:00) (B216) Introduction to Data Mining ( 資料探勘導論 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of.
Data Warehousing 資料倉儲 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University Dept. of Information ManagementTamkang.
Data Mining 資料探勘 Introduction to Data Mining (資料探勘導論) Min-Yuh Day 戴敏育
Case Study for Information Management 資訊管理個案
商業智慧 Business Intelligence BI04 IM EMBA Fri 12,13,14 (19:20-22:10) D502 資料倉儲 (Data Warehousing) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept.
Big Data Mining 巨量資料探勘 DM01 MI4 (M2244) (3094) Tue, 3, 4 (10:10-12:00) (B216) Course Orientation for Big Data Mining ( 巨量資料探勘課程介紹 ) Min-Yuh Day 戴敏育.
Social Computing and Big Data Analytics 社群運算與大數據分析
巨量資料基礎: MapReduce典範、Hadoop與Spark生態系統
Social Computing and Big Data Analytics 社群運算與大數據分析
Social Computing and Big Data Analytics 社群運算與大數據分析 SCBDA05 MIS MBA (M2226) (8628) Wed, 8,9, (15:10-17:00) (B309) Big Data Analytics with Numpy in.
Big Data Mining 巨量資料探勘 DM04 MI4 (M2244) (3094) Tue, 3, 4 (10:10-12:00) (B216) 分類與預測 (Classification and Prediction) Min-Yuh Day 戴敏育 Assistant Professor.
社群網路行銷管理 Social Media Marketing Management SMMM03 MIS EMBA (M2200) (8615) Thu, 12,13,14 (19:20-22:10) (D309) 顧客價值與品牌 (Customer Value and Branding)
Social Computing and Big Data Analytics 社群運算與大數據分析
Social Computing and Big Data Analytics 社群運算與大數據分析
Big Data Mining 巨量資料探勘 DM05 MI4 (M2244) (3094) Tue, 3, 4 (10:10-12:00) (B216) 分群分析 (Cluster Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授.
Big Data Mining 巨量資料探勘 DM03 MI4 (M2244) (3094) Tue, 3, 4 (10:10-12:00) (B216) 關連分析 (Association Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授.
巨量資料基礎: MapReduce典範、Hadoop與Spark生態系統
資訊管理專題 Hot Issues of Information Management
Social Computing and Big Data Analytics 社群運算與大數據分析
Course Orientation for Big Data Mining (巨量資料探勘課程介紹)
Case Study for Information Management 資訊管理個案
Business Intelligence Trends 商業智慧趨勢
Data Mining 資料探勘 分群分析 (Cluster Analysis) Min-Yuh Day 戴敏育
商業智慧實務 Practices of Business Intelligence
Case Study for Information Management 資訊管理個案
Case Study for Information Management 資訊管理個案
Topic 5: Cluster Analysis
Case Study for Information Management 資訊管理個案
Case Study for Information Management 資訊管理個案
Presentation transcript:

分群分析 (Cluster Analysis) Tamkang University Tamkang University Big Data Mining 巨量資料探勘 分群分析 (Cluster Analysis) 1052DM05 MI4 (M2244) (3069) Thu, 8, 9 (15:10-17:00) (B130) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University 淡江大學 資訊管理學系 http://mail. tku.edu.tw/myday/ 2017-03-16

課程大綱 (Syllabus) 週次 (Week) 日期 (Date) 內容 (Subject/Topics) 1 2017/02/16 巨量資料探勘課程介紹 (Course Orientation for Big Data Mining) 2 2017/02/23 巨量資料基礎:MapReduce典範、Hadoop與Spark生態系統 (Fundamental Big Data: MapReduce Paradigm, Hadoop and Spark Ecosystem) 3 2017/03/02 關連分析 (Association Analysis) 4 2017/03/09 分類與預測 (Classification and Prediction) 5 2017/03/16 分群分析 (Cluster Analysis) 6 2017/03/23 個案分析與實作一 (SAS EM 分群分析): Case Study 1 (Cluster Analysis – K-Means using SAS EM) 7 2017/03/30 個案分析與實作二 (SAS EM 關連分析): Case Study 2 (Association Analysis using SAS EM)

課程大綱 (Syllabus) 週次 (Week) 日期 (Date) 內容 (Subject/Topics) 8 2017/04/06 教學行政觀摩日 (Off-campus study) 9 2017/04/13 期中報告 (Midterm Project Presentation) 10 2017/04/20 期中考試週 (Midterm Exam) 11 2017/04/27 個案分析與實作三 (SAS EM 決策樹、模型評估): Case Study 3 (Decision Tree, Model Evaluation using SAS EM) 12 2017/05/04 個案分析與實作四 (SAS EM 迴歸分析、類神經網路): Case Study 4 (Regression Analysis, Artificial Neural Network using SAS EM) 13 2017/05/11 Google TensorFlow 深度學習 (Deep Learning with Google TensorFlow) 14 2017/05/18 期末報告 (Final Project Presentation) 15 2017/05/25 畢業班考試 (Final Exam)

Outline Cluster Analysis K-Means Clustering

A Taxonomy for Data Mining Tasks Source: Turban et al. (2011), Decision Support and Business Intelligence Systems

Example of Cluster Analysis Point P P(x,y) p01 a (3, 4) p02 b (3, 6) p03 c (3, 8) p04 d (4, 5) p05 e (4, 7) p06 f (5, 1) p07 g (5, 5) p08 h (7, 3) p09 i (7, 5) p10 j (8, 5)

K-Means Clustering Point P P(x,y) m1 distance m2 distance Cluster p01 (3, 4) 1.95 3.78 Cluster1 p02 b (3, 6) 0.69 4.51 p03 c (3, 8) 2.27 5.86 p04 d (4, 5) 0.89 3.13 p05 e (4, 7) 1.22 4.45 p06 f (5, 1) 5.01 3.05 Cluster2 p07 g (5, 5) 1.57 2.30 p08 h (7, 3) 4.37 0.56 p09 i (7, 5) 3.43 1.52 p10 j (8, 5) 4.41 m1 (3.67, 5.83) m2 (6.75, 3.50)

Cluster Analysis

Cluster Analysis Used for automatic identification of natural groupings of things Part of the machine-learning family Employ unsupervised learning Learns the clusters of things from past data, then assigns new instances There is not an output variable Also known as segmentation Source: Turban et al. (2011), Decision Support and Business Intelligence Systems

Cluster Analysis Clustering of a set of objects based on the k-means method. (The mean of each cluster is marked by a “+”.) Source: Han & Kamber (2006)

Cluster Analysis Clustering results may be used to Identify natural groupings of customers Identify rules for assigning new cases to classes for targeting/diagnostic purposes Provide characterization, definition, labeling of populations Decrease the size and complexity of problems for other data mining methods Identify outliers in a specific domain (e.g., rare-event detection) Source: Turban et al. (2011), Decision Support and Business Intelligence Systems

Example of Cluster Analysis Point P P(x,y) p01 a (3, 4) p02 b (3, 6) p03 c (3, 8) p04 d (4, 5) p05 e (4, 7) p06 f (5, 1) p07 g (5, 5) p08 h (7, 3) p09 i (7, 5) p10 j (8, 5)

Cluster Analysis for Data Mining Analysis methods Statistical methods (including both hierarchical and nonhierarchical), such as k-means, k-modes, and so on Neural networks (adaptive resonance theory [ART], self-organizing map [SOM]) Fuzzy logic (e.g., fuzzy c-means algorithm) Genetic algorithms Divisive versus Agglomerative methods Source: Turban et al. (2011), Decision Support and Business Intelligence Systems

Cluster Analysis for Data Mining How many clusters? There is not a “truly optimal” way to calculate it Heuristics are often used Look at the sparseness of clusters Number of clusters = (n/2)1/2 (n: no of data points) Use Akaike information criterion (AIC) Use Bayesian information criterion (BIC) Most cluster analysis methods involve the use of a distance measure to calculate the closeness between pairs of items Euclidian versus Manhattan (rectilinear) distance Source: Turban et al. (2011), Decision Support and Business Intelligence Systems

k-Means Clustering Algorithm k : pre-determined number of clusters Algorithm (Step 0: determine value of k) Step 1: Randomly generate k random points as initial cluster centers Step 2: Assign each point to the nearest cluster center Step 3: Re-compute the new cluster centers Repetition step: Repeat steps 2 and 3 until some convergence criterion is met (usually that the assignment of points to clusters becomes stable) Source: Turban et al. (2011), Decision Support and Business Intelligence Systems

Cluster Analysis for Data Mining - k-Means Clustering Algorithm Source: Turban et al. (2011), Decision Support and Business Intelligence Systems

Similarity Distance

Similarity and Dissimilarity Between Objects Distances are normally used to measure the similarity or dissimilarity between two data objects Some popular ones include: Minkowski distance: where i = (xi1, xi2, …, xip) and j = (xj1, xj2, …, xjp) are two p-dimensional data objects, and q is a positive integer If q = 1, d is Manhattan distance Source: Han & Kamber (2006)

Similarity and Dissimilarity Between Objects (Cont.) If q = 2, d is Euclidean distance: Properties d(i,j)  0 d(i,i) = 0 d(i,j) = d(j,i) d(i,j)  d(i,k) + d(k,j) Also, one can use weighted distance, parametric Pearson product moment correlation, or other disimilarity measures Source: Han & Kamber (2006)

Euclidean distance vs Manhattan distance Distance of two point x1 = (1, 2) and x2 (3, 5) Euclidean distance: = ((3-1)2 + (5-2)2 )1/2 = (22 + 32)1/2 = (4 + 9)1/2 = (13)1/2 = 3.61 x2 (3, 5) 5 4 3.61 3 3 2 2 x1 = (1, 2) Manhattan distance: = (3-1) + (5-2) = 2 + 3 = 5 1 1 2 3

The K-Means Clustering Method Example 1 2 3 4 5 6 7 8 9 10 10 9 8 7 6 5 Update the cluster means 4 Assign each objects to most similar center 3 2 1 1 2 3 4 5 6 7 8 9 10 reassign reassign K=2 Arbitrarily choose K object as initial cluster center Update the cluster means Source: Han & Kamber (2006)

K-Means Clustering

Example of Cluster Analysis Point P P(x,y) p01 a (3, 4) p02 b (3, 6) p03 c (3, 8) p04 d (4, 5) p05 e (4, 7) p06 f (5, 1) p07 g (5, 5) p08 h (7, 3) p09 i (7, 5) p10 j (8, 5)

K-Means Clustering Step by Step Point P P(x,y) p01 a (3, 4) p02 b (3, 6) p03 c (3, 8) p04 d (4, 5) p05 e (4, 7) p06 f (5, 1) p07 g (5, 5) p08 h (7, 3) p09 i (7, 5) p10 j (8, 5)

K-Means Clustering Point P P(x,y) p01 a (3, 4) p02 b (3, 6) p03 c Step 1: K=2, Arbitrarily choose K object as initial cluster center Point P P(x,y) p01 a (3, 4) p02 b (3, 6) p03 c (3, 8) p04 d (4, 5) p05 e (4, 7) p06 f (5, 1) p07 g (5, 5) p08 h (7, 3) p09 i (7, 5) p10 j (8, 5) Initial m1 m2 M2 = (8, 5) m1 = (3, 4)

K-Means Clustering p01 a (3, 4) 0.00 5.10 Cluster1 p02 b (3, 6) 2.00 Step 2: Compute seed points as the centroids of the clusters of the current partition Step 3: Assign each objects to most similar center Point P P(x,y) m1 distance m2 distance Cluster p01 a (3, 4) 0.00 5.10 Cluster1 p02 b (3, 6) 2.00 p03 c (3, 8) 4.00 5.83 p04 d (4, 5) 1.41 p05 e (4, 7) 3.16 4.47 p06 f (5, 1) 3.61 5.00 p07 g (5, 5) 2.24 3.00 p08 h (7, 3) 4.12 Cluster2 p09 i (7, 5) 1.00 p10 j (8, 5) Initial m1 m2 M2 = (8, 5) m1 = (3, 4) K-Means Clustering

K-Means Clustering Euclidean distance b(3,6) m2(8,5) Step 2: Compute seed points as the centroids of the clusters of the current partition Step 3: Assign each objects to most similar center Point P P(x,y) m1 distance m2 distance Cluster p01 a (3, 4) 0.00 5.10 Cluster1 p02 b (3, 6) 2.00 p03 c (3, 8) 4.00 5.83 p04 d (4, 5) 1.41 p05 e (4, 7) 3.16 4.47 p06 f (5, 1) 3.61 5.00 p07 g (5, 5) 2.24 3.00 p08 h (7, 3) 4.12 Cluster2 p09 i (7, 5) 1.00 p10 j (8, 5) Initial m1 m2 M2 = (8, 5) Euclidean distance b(3,6) m2(8,5) = ((8-3)2 + (5-6)2 )1/2 = (52 + (-1)2)1/2 = (25 + 1)1/2 = (26)1/2 = 5.10 m1 = (3, 4) Euclidean distance b(3,6) m1(3,4) = ((3-3)2 + (4-6)2 )1/2 = (02 + (-2)2)1/2 = (0 + 4)1/2 = (4)1/2 = 2.00 K-Means Clustering

Step 4: Update the cluster means, Repeat Step 2, 3, stop when no more new assignment Point P P(x,y) m1 distance m2 distance Cluster p01 a (3, 4) 1.43 4.34 Cluster1 p02 b (3, 6) 1.22 4.64 p03 c (3, 8) 2.99 5.68 p04 d (4, 5) 0.20 3.40 p05 e (4, 7) 1.87 4.27 p06 f (5, 1) 4.29 4.06 Cluster2 p07 g (5, 5) 1.15 2.42 p08 h (7, 3) 3.80 1.37 p09 i (7, 5) 3.14 0.75 p10 j (8, 5) 4.14 0.95 m1 (3.86, 5.14) m2 (7.33, 4.33) m1 = (3.86, 5.14) M2 = (7.33, 4.33) K-Means Clustering

Step 4: Update the cluster means, Repeat Step 2, 3, stop when no more new assignment Point P P(x,y) m1 distance m2 distance Cluster p01 a (3, 4) 1.95 3.78 Cluster1 p02 b (3, 6) 0.69 4.51 p03 c (3, 8) 2.27 5.86 p04 d (4, 5) 0.89 3.13 p05 e (4, 7) 1.22 4.45 p06 f (5, 1) 5.01 3.05 Cluster2 p07 g (5, 5) 1.57 2.30 p08 h (7, 3) 4.37 0.56 p09 i (7, 5) 3.43 1.52 p10 j (8, 5) 4.41 m1 (3.67, 5.83) m2 (6.75, 3.50) m1 = (3.67, 5.83) M2 = (6.75., 3.50) K-Means Clustering

K-Means Clustering stop when no more new assignment p01 a (3, 4) 1.95 Point P P(x,y) m1 distance m2 distance Cluster p01 a (3, 4) 1.95 3.78 Cluster1 p02 b (3, 6) 0.69 4.51 p03 c (3, 8) 2.27 5.86 p04 d (4, 5) 0.89 3.13 p05 e (4, 7) 1.22 4.45 p06 f (5, 1) 5.01 3.05 Cluster2 p07 g (5, 5) 1.57 2.30 p08 h (7, 3) 4.37 0.56 p09 i (7, 5) 3.43 1.52 p10 j (8, 5) 4.41 m1 (3.67, 5.83) m2 (6.75, 3.50) K-Means Clustering

K-Means Clustering (K=2, two clusters) stop when no more new assignment Point P P(x,y) m1 distance m2 distance Cluster p01 a (3, 4) 1.95 3.78 Cluster1 p02 b (3, 6) 0.69 4.51 p03 c (3, 8) 2.27 5.86 p04 d (4, 5) 0.89 3.13 p05 e (4, 7) 1.22 4.45 p06 f (5, 1) 5.01 3.05 Cluster2 p07 g (5, 5) 1.57 2.30 p08 h (7, 3) 4.37 0.56 p09 i (7, 5) 3.43 1.52 p10 j (8, 5) 4.41 m1 (3.67, 5.83) m2 (6.75, 3.50) K-Means Clustering

K-Means Clustering Point P P(x,y) m1 distance m2 distance Cluster p01 (3, 4) 1.95 3.78 Cluster1 p02 b (3, 6) 0.69 4.51 p03 c (3, 8) 2.27 5.86 p04 d (4, 5) 0.89 3.13 p05 e (4, 7) 1.22 4.45 p06 f (5, 1) 5.01 3.05 Cluster2 p07 g (5, 5) 1.57 2.30 p08 h (7, 3) 4.37 0.56 p09 i (7, 5) 3.43 1.52 p10 j (8, 5) 4.41 m1 (3.67, 5.83) m2 (6.75, 3.50)

Summary Cluster Analysis K-Means Clustering

References Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Second Edition, Elsevier, 2006. Jiawei Han, Micheline Kamber and Jian Pei, Data Mining: Concepts and Techniques, Third Edition, Morgan Kaufmann 2011. Efraim Turban, Ramesh Sharda, Dursun Delen, Decision Support and Business Intelligence Systems, Ninth Edition, Pearson, 2011.