商業智慧實務 Practices of Business Intelligence 1 1022BI05 MI4 Wed, 9,10 (16:10-18:00) (B113) 商業智慧的資料探勘 (Data Mining for Business Intelligence) Min-Yuh Day 戴敏育.

Slides:

Advertisements

Similar presentations

高大網頁競賽評鑑量表改良 Yu-Hui Tao 2011/4/27. 目地是甚麼 ? 1. 世界網路大學排名 2. 校務評鑑.

Advertisements

序列分析工具:MDDLogo 謝勝任林宗慶指導教授:李宗夷教授.

Data Mining 資料探勘 DM02 MI4 Wed, 7,8 (14:10-16:00) (B130) Association Analysis ( 關連分析 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information.

Chapter 0 Computer Science (CS) 計算機概論教學目標瞭解現代電腦系統之發展歷程瞭解電腦之元件、功能及組織架構瞭解電腦如何表示資料及其處理方式學習運用電腦來解決問題認知成為一位電子資訊人才所需之基本條件認知進階電子資訊之相關領域.

全球化環境下的組織管理本章內容全球化的趨勢國際化的階段國際企業母公司對分支機構的管理取向國際企業組織的結構設計 Chapter 6

1 真理大學運輸管理學系實務實習說明目錄  實務實習類別  實務實習條例  校外實習單位  實務實習成績計算方式  校外實習甄選 / 自洽申請流程  附錄：相關表格.

如何寫好一篇報告釐清問題選擇資料庫制定檢索策略實機操作. 報告內容跨國公司 – 公司簡介（如公司成立時間、目前在幾個國家有據點等） – 公司計畫 – 公司組織 – 公司領導 – 公司控制 – 總結（主要為結論，但是如果可以對該公司提出建議，會額外加分） – 參考文獻.

Last modified 2004/02 An Introduction to SQL (Structured Query Language )

真理大學航空運輸管理學系實務實習說明. 實務實習部份實務實習校內實習校外實習實習時數必須在 300 小時 ( 含 ) 以上才承認校內實習時數及實習成績。二個寒假各一個月暑假兩個月.

以 Business Dimensional Lifecycle 方法開發 Data Warehouse 系統之初探指導老師吳美宜老師組別 :12 組陳敬憲鍾政宏王國璿.

1 單元三查詢結果的引用分析 Web of Science 利用指引查看出版及被引用情況在查詢結果的清單中，可以瀏覽近 20 年來查詢主題出版和被引用的情況。

物流通關專業教室 (052) 國貿實務專業教室 (054) 企業資源整合專業教室 (055) 整合各專業教室資訊進行即時動態及異常管理 (051) 貿易運籌研訓中心專業實習、、師生研究討論、、、、海關模擬系統、貨況追蹤、貨物進出倉管理、海空運通關承攬、通關自動化作業等相關模組全球運籌決策中心.

McGraw-Hill/Irwin © 2003 The McGraw-Hill Companies, Inc.,All Rights Reserved. 貳研究設計.

FGU LDT. FGU EIS 96 ‧ 8 ‧ 25 FGU LDT 佛光大學學習與數位科技學系.

論文研討 ( 一 ) B 組課程簡介劉美纓 / 尚榮安 / 胡凱傑 2009/09/17. 一、課程基本資料科目名稱： ( 中文 ) 論文研討（一）Ｂ組 ( 英文 ) SEMINARS (I) 開課學期： 98 學年度第 1 學期開課班級：企碩一學分數： 2 學分星期節次：四 34.

第三章自動再裝載運用篇使用時機：裝載計劃完成時，尚有剩餘空間的情形，維持已固定計劃而繼續做裝載最佳化。以支持次日裝載計劃而提前調整作業模式。裝載物品設定和裝載容器設定如前兩章介紹，於此不再重複此動作，直接從裝載計劃設定開始，直接從系統內定的物品和容器選取所需.

真理大學航空服務管理學系實務實習說明. 實務實習部份實務實習校內實習校外實習實習時數必須在 300 小時 ( 含 ) 以上才承認校內實習時數及實習成績。二個寒假各一個月暑假兩個月.

第二十一章研究流程、論文結構　　　　　　　與研究範例 21-1 　研究流程 21-2 　論文結構 21-3 　研究範例.

生產系統導論生產系統簡介績效衡量現代工廠之特徵管理機能.

1 高等演算法授課老師 : 陳建源研究室 : 法 401 網站

2010 MCML introduction 製作日期： 2010/9/10 製作人 : 胡名霞.

校園網頁整合平台介紹電算中心綜合業務組. 大綱設計理念功能介紹實做 FAQ 特殊案例 Q&A.

大華技術學院九十五學年度資工系計算機概論教學大綱吳弘翔. Wu Hung-Hsiang2 科目名稱：計算機概論與實習授課老師：吳弘翔學分數： 4 修別：必修老師信箱：

寬頻通訊系統基礎教育計畫分項計畫二寬頻網路通訊主要參與人員黎碧煌教授鍾順平副教授

廣電新聞播報品質電腦化評估系統之研發國立政治大學資訊科學系指導教授：廖文宏學生：蘇以暄.

統計學 ( 二 ) 朝陽科技大學工業工程與管理系副教授洪弘祈 Statistics II2 企業與統計之關係 n 品質管制 n 預測統計與市場調查 n 績效與人事管理 n 例行報告之方案評估與決策參考 n 製程改善 n 研發能力之提昇 n 產品可靠度 n 生產管制.

Chapter 7 Sampling Distribution

Cluster Analysis 目的 – 將資料分成幾個相異性最大的群組基本問題 – 如何衡量事務之間的相似性 – 如何將相似的資料歸入同一群組 – 如何解釋群組的特性.

McGraw-Hill/Irwin © 2003 The McGraw-Hill Companies, Inc.,All Rights Reserved. 壹企業研究導論.

McGraw-Hill/Irwin © 2003 The McGraw-Hill Companies, Inc.,All Rights Reserved. 壹企業研究導論.

1 高等管理資訊系統. 2 授課教師 : 王耀德研究室 : 主顧 686 電話 : (04) # 課輔時間 Wednesday 09:00~13:00 介紹.

全國奈米科技人才培育推動計畫辦公室中北區奈米科技Ｋ -12 教育發展中心計畫簡報報告人：楊鏡堂教授計畫執行單位：國立清華大學動力機械工程學系計畫種子學校：教育部顧問室 94 年度奈米科技人才培育先導型計畫年度成果報告中華民國九十四年十月十四日.

管理變革 Prentice Hall Chapter 18

商業智慧實務 Practices of Business Intelligence BI04 MI4 Wed, 9,10 (16:10-18:00) (B113) 資料倉儲 (Data Warehousing) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授.

Data Warehousing 資料倉儲 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University Dept. of Information ManagementTamkang.

Business Intelligence 商業智慧 BI01 IM EMBA Fri 12,13,14 (19:20-22:10) D502 Introduction to Business Intelligence 商業智慧導論 Min-Yuh Day 戴敏育 Assistant Professor.

社會媒體服務專題 Special Topics in Social Media Services 淡江大學淡江大學資訊管理學系資訊管理學系 Dept. of Information ManagementDept. of Information Management, Tamkang UniversityTamkang.

Data Mining 資料探勘 DM02 MI4 Thu. 9,10 (16:10-18:00) B513 關連分析 (Association Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information.

Web Mining ( 網路探勘 ) WM12 TLMXM1A Wed 8,9 (15:10-17:00) U705 Web Usage Mining ( 網路使用挖掘 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information.

商業智慧實務 Practices of Business Intelligence BI03 MI4 Wed, 9,10 (16:10-18:00) (B113) 企業績效管理 (Business Performance Management) Min-Yuh Day 戴敏育 Assistant.

Business Intelligence Trends 商業智慧趨勢 BIT08 MIS MBA Mon 6, 7 (13:10-15:00) Q407 商業智慧導入與趨勢 (Business Intelligence Implementation and Trends) Min-Yuh.

Social Media Management 社會媒體管理 SMM01 TMIXM1A Fri. 7,8 (14:10-16:00) L215 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management,

Social Media Marketing 社群網路行銷 SMM08 TLMXJ1A (MIS EMBA) Mon 12,13,14 (19:20-22:10) D504 社群網路行銷計劃 (Social Media Marketing Plan) Min-Yuh Day 戴敏育 Assistant.

Data Mining 資料探勘 Introduction to Data Mining 資料探勘導論 Min-Yuh Day 戴敏育

商業智慧實務 Practices of Business Intelligence BI01 MI4 Wed, 9,10 (16:10-18:00) (B113) 商業智慧導論 (Introduction to Business Intelligence) Min-Yuh Day 戴敏育.

Social Media Marketing Management 社會媒體行銷管理 SMMM12 TLMXJ1A Tue 12,13,14 (19:20-22:10) D325 確認性因素分析 (Confirmatory Factor Analysis) Min-Yuh Day 戴敏育.

Social Media Marketing Analytics 社群網路行銷分析 SMMA04 TLMXJ1A (MIS EMBA) Fri 12,13,14 (19:20-22:10) D326 測量構念 (Measuring the Construct) Min-Yuh Day 戴敏育.

商業智慧實務 Practices of Business Intelligence BI06 MI4 Wed, 9,10 (16:10-18:00) (B130) 資料科學與巨量資料分析 (Data Science and Big Data Analytics) Min-Yuh Day 戴敏育.

Data Mining 資料探勘 DM04 MI4 Thu 9, 10 (16:10-18:00) B216 分群分析 (Cluster Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management,

Web Mining ( 網路探勘 ) WM04 TLMXM1A Wed 8,9 (15:10-17:00) U705 Unsupervised Learning ( 非監督式學習 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of.

Case Study for Information Management 資訊管理個案 CSIM4B03 TLMXB4B (M1824) Tue 3,4 (10:10-12:00) L212 Thu 9 (16:10-17:00) B601 Global E-Business and Collaboration:

商業智慧實務 Practices of Business Intelligence BI01 MI4 Wed, 9,10 (16:10-18:00) (B130) 商業智慧導論 (Introduction to Business Intelligence) Min-Yuh Day 戴敏育.

商業智慧實務 Practices of Business Intelligence

Business Intelligence Trends 商業智慧趨勢 BIT01 MIS MBA Mon 6, 7 (13:10-15:00) Q407 Course Orientation for Business Intelligence Trends 商業智慧趨勢課程介紹 Min-Yuh.

Data Mining 資料探勘 DM04 MI4 Wed, 6,7 (13:10-15:00) (B216) 分群分析 (Cluster Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management,

Data Mining 資料探勘 DM01 MI4 Wed, 6,7 (13:10-15:00) (B216) Introduction to Data Mining ( 資料探勘導論 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of.

Data Warehousing 資料倉儲 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University Dept. of Information ManagementTamkang.

Data Mining 資料探勘 Introduction to Data Mining (資料探勘導論) Min-Yuh Day 戴敏育

Case Study for Information Management 資訊管理個案

商業智慧 Business Intelligence BI04 IM EMBA Fri 12,13,14 (19:20-22:10) D502 資料倉儲 (Data Warehousing) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept.

Case Study for Information Management 資訊管理個案 CSIM4C01 TLMXB4C Mon 8, 9, 10 (15:10-18:00) B602 Introduction to Case Study for Information Management.

Case Study for Information Management 資訊管理個案 CSIM4C01 TLMXB4C (M1824) Tue 2 (9:10-10:00) L212 Thu 7,8 (14:10-16:00) B601 Introduction to Case Study.

Case Study for Information Management 資訊管理個案 CSIM4B01 TLMXB4B (M1824) Tue 2, 3, 4 (9:10-12:00) B502 Introduction to Case Study for Information Management.

Big Data Mining 巨量資料探勘 DM05 MI4 (M2244) (3094) Tue, 3, 4 (10:10-12:00) (B216) 分群分析 (Cluster Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授.

Case Study for Information Management 資訊管理個案

分群分析 (Cluster Analysis)

Data Mining 資料探勘分群分析 (Cluster Analysis) Min-Yuh Day 戴敏育

商業智慧實務 Practices of Business Intelligence

Presentation transcript:

商業智慧實務 Practices of Business Intelligence BI05 MI4 Wed, 9,10 (16:10-18:00) (B113) 商業智慧的資料探勘 (Data Mining for Business Intelligence) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University Dept. of Information ManagementTamkang University 淡江大學淡江大學資訊管理學系資訊管理學系 tku.edu.tw/myday/ Tamkang University

週次 (Week) 日期 (Date) 內容 (Subject/Topics) 1 103/02/19 商業智慧導論 (Introduction to Business Intelligence) 2 103/02/26 管理決策支援系統與商業智慧 (Management Decision Support System and Business Intelligence) 3 103/03/05 企業績效管理 (Business Performance Management) 4 103/03/12 資料倉儲 (Data Warehousing) 5 103/03/19 商業智慧的資料探勘 (Data Mining for Business Intelligence) 6 103/03/26 商業智慧的資料探勘 (Data Mining for Business Intelligence) 7 103/04/02 教學行政觀摩日 (Off-campus study) 8 103/04/09 資料科學與巨量資料分析 (Data Science and Big Data Analytics) 課程大綱 ( Syllabus) 2

週次日期內容（ Subject/Topics ） 9 103/04/16 期中報告 (Midterm Project Presentation) /04/23 期中考試週 (Midterm Exam) /04/30 文字探勘與網路探勘 (Text and Web Mining) /05/07 意見探勘與情感分析 (Opinion Mining and Sentiment Analysis) /05/14 社會網路分析 (Social Network Analysis) /05/21 期末報告 (Final Project Presentation) /05/28 畢業考試週 (Final Exam) 課程大綱 ( Syllabus) 3

Data Mining at the Intersection of Many Disciplines Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 4

A Taxonomy for Data Mining Tasks Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 5

Data Mining Software Commercial – SPSS - PASW (formerly Clementine) – SAS - Enterprise Miner – IBM - Intelligent Miner – StatSoft – Statistical Data Miner – … many more Free and/or Open Source – Weka – RapidMiner… Source: KDNuggets.com, May 2009 Source: Turban et al. (2011), Decision Support and Business Intelligence Systems

Data Mining Process A manifestation of best practices A systematic way to conduct DM projects Different groups has different versions Most common standard processes: – CRISP-DM (Cross-Industry Standard Process for Data Mining) – SEMMA (Sample, Explore, Modify, Model, and Assess) – KDD (Knowledge Discovery in Databases) Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 7

Data Mining Process: CRISP-DM Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 8

Data Mining Process: CRISP-DM Step 1: Business Understanding Step 2: Data Understanding Step 3: Data Preparation (!) Step 4: Model Building Step 5: Testing and Evaluation Step 6: Deployment The process is highly repetitive and experimental (DM: art versus science?) Accounts for ~85% of total project time Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 9

Data Preparation – A Critical DM Task Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 10

Data Mining Process: SEMMA Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 11

Cluster Analysis Used for automatic identification of natural groupings of things Part of the machine-learning family Employ unsupervised learning Learns the clusters of things from past data, then assigns new instances There is not an output variable Also known as segmentation Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 12

Cluster Analysis 13 Clustering of a set of objects based on the k-means method. (The mean of each cluster is marked by a “+”.) Source: Han & Kamber (2006)

Cluster Analysis Clustering results may be used to – Identify natural groupings of customers – Identify rules for assigning new cases to classes for targeting/diagnostic purposes – Provide characterization, definition, labeling of populations – Decrease the size and complexity of problems for other data mining methods – Identify outliers in a specific domain (e.g., rare-event detection) Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 14

15 PointPP(x,y) p01a(3, 4) p02b(3, 6) p03c(3, 8) p04d(4, 5) p05e(4, 7) p06f(5, 1) p07g(5, 5) p08h(7, 3) p09i(7, 5) p10j(8, 5) Example of Cluster Analysis

Cluster Analysis for Data Mining How many clusters? – There is not a “truly optimal” way to calculate it – Heuristics are often used 1.Look at the sparseness of clusters 2.Number of clusters = (n/2) 1/2 (n: no of data points) 3.Use Akaike information criterion (AIC) 4.Use Bayesian information criterion (BIC) Most cluster analysis methods involve the use of a distance measure to calculate the closeness between pairs of items – Euclidian versus Manhattan (rectilinear) distance Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 16

k-Means Clustering Algorithm k : pre-determined number of clusters Algorithm (Step 0: determine value of k) Step 1: Randomly generate k random points as initial cluster centers Step 2: Assign each point to the nearest cluster center Step 3: Re-compute the new cluster centers Repetition step: Repeat steps 2 and 3 until some convergence criterion is met (usually that the assignment of points to clusters becomes stable) Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 17

Cluster Analysis for Data Mining - k-Means Clustering Algorithm Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 18

Similarity and Dissimilarity Between Objects Distances are normally used to measure the similarity or dissimilarity between two data objects Some popular ones include: Minkowski distance: where i = (x i1, x i2, …, x ip ) and j = (x j1, x j2, …, x jp ) are two p- dimensional data objects, and q is a positive integer If q = 1, d is Manhattan distance 19 Source: Han & Kamber (2006)

Similarity and Dissimilarity Between Objects (Cont.) If q = 2, d is Euclidean distance: – Properties d(i,j)  0 d(i,i) = 0 d(i,j) = d(j,i) d(i,j)  d(i,k) + d(k,j) Also, one can use weighted distance, parametric Pearson product moment correlation, or other disimilarity measures 20 Source: Han & Kamber (2006)

Euclidean distance vs Manhattan distance Distance of two point x 1 = (1, 2) and x 2 (3, 5) x 2 (3, 5) 2 x 1 = (1, 2) Euclidean distance: = ((3-1) 2 + (5-2) 2 ) 1/2 = ( ) 1/2 = (4 + 9) 1/2 = (13) 1/2 = 3.61 Manhattan distance: = (3-1) + (5-2) = = 5

The K-Means Clustering Method Example K=2 Arbitrarily choose K object as initial cluster center Assign each objects to most similar center Update the cluster means reassign 22 Source: Han & Kamber (2006)

23 PointPP(x,y) p01a(3, 4) p02b(3, 6) p03c(3, 8) p04d(4, 5) p05e(4, 7) p06f(5, 1) p07g(5, 5) p08h(7, 3) p09i(7, 5) p10j(8, 5) K-Means Clustering Step by Step

24 PointPP(x,y) p01a(3, 4) p02b(3, 6) p03c(3, 8) p04d(4, 5) p05e(4, 7) p06f(5, 1) p07g(5, 5) p08h(7, 3) p09i(7, 5) p10j(8, 5) Initialm1(3, 4) Initialm2(8, 5) m 1 = (3, 4) M 2 = (8, 5) K-Means Clustering Step 1: K=2, Arbitrarily choose K object as initial cluster center

25 M 2 = (8, 5) Step 2: Compute seed points as the centroids of the clusters of the current partition Step 3: Assign each objects to most similar center m 1 = (3, 4) K-Means Clustering PointPP(x,y) m1 distance m2 distance Cluster p01a(3, 4) Cluster1 p02b(3, 6) Cluster1 p03c(3, 8) Cluster1 p04d(4, 5) Cluster1 p05e(4, 7) Cluster1 p06f(5, 1) Cluster1 p07g(5, 5) Cluster1 p08h(7, 3) Cluster2 p09i(7, 5) Cluster2 p10j(8, 5) Cluster2 Initialm1(3, 4) Initialm2(8, 5)

26 PointPP(x,y) m1 distance m2 distance Cluster p01a(3, 4) Cluster1 p02b(3, 6) Cluster1 p03c(3, 8) Cluster1 p04d(4, 5) Cluster1 p05e(4, 7) Cluster1 p06f(5, 1) Cluster1 p07g(5, 5) Cluster1 p08h(7, 3) Cluster2 p09i(7, 5) Cluster2 p10j(8, 5) Cluster2 Initialm1(3, 4) Initialm2(8, 5) M 2 = (8, 5) Step 2: Compute seed points as the centroids of the clusters of the current partition Step 3: Assign each objects to most similar center m 1 = (3, 4) K-Means Clustering Euclidean distance b(3,6)  m2(8,5) = ((8-3) 2 + (5-6) 2 ) 1/2 = (5 2 + (-1) 2 ) 1/2 = (25 + 1) 1/2 = (26) 1/2 = 5.10 Euclidean distance b(3,6)  m1(3,4) = ((3-3) 2 + (4-6) 2 ) 1/2 = (0 2 + (-2) 2 ) 1/2 = (0 + 4) 1/2 = (4) 1/2 = 2.00

27 PointPP(x,y) m1 distance m2 distance Cluster p01a(3, 4) Cluster1 p02b(3, 6) Cluster1 p03c(3, 8) Cluster1 p04d(4, 5) Cluster1 p05e(4, 7) Cluster1 p06f(5, 1) Cluster2 p07g(5, 5) Cluster1 p08h(7, 3) Cluster2 p09i(7, 5) Cluster2 p10j(8, 5) Cluster2 m1(3.86, 5.14) m2(7.33, 4.33) m 1 = (3.86, 5.14) M 2 = (7.33, 4.33) Step 4: Update the cluster means, Repeat Step 2, 3, stop when no more new assignment K-Means Clustering

28 PointPP(x,y) m1 distance m2 distance Cluster p01a(3, 4) Cluster1 p02b(3, 6) Cluster1 p03c(3, 8) Cluster1 p04d(4, 5) Cluster1 p05e(4, 7) Cluster1 p06f(5, 1) Cluster2 p07g(5, 5) Cluster1 p08h(7, 3) Cluster2 p09i(7, 5) Cluster2 p10j(8, 5) Cluster2 m1(3.67, 5.83) m2(6.75, 3.50) M 2 = (6.75., 3.50) m 1 = (3.67, 5.83) Step 4: Update the cluster means, Repeat Step 2, 3, stop when no more new assignment K-Means Clustering

29 PointPP(x,y) m1 distance m2 distance Cluster p01a(3, 4) Cluster1 p02b(3, 6) Cluster1 p03c(3, 8) Cluster1 p04d(4, 5) Cluster1 p05e(4, 7) Cluster1 p06f(5, 1) Cluster2 p07g(5, 5) Cluster1 p08h(7, 3) Cluster2 p09i(7, 5) Cluster2 p10j(8, 5) Cluster2 m1(3.67, 5.83) m2(6.75, 3.50) stop when no more new assignment K-Means Clustering

K-Means Clustering (K=2, two clusters) 30 PointPP(x,y) m1 distance m2 distance Cluster p01a(3, 4) Cluster1 p02b(3, 6) Cluster1 p03c(3, 8) Cluster1 p04d(4, 5) Cluster1 p05e(4, 7) Cluster1 p06f(5, 1) Cluster2 p07g(5, 5) Cluster1 p08h(7, 3) Cluster2 p09i(7, 5) Cluster2 p10j(8, 5) Cluster2 m1(3.67, 5.83) m2(6.75, 3.50) stop when no more new assignment K-Means Clustering

個案分析與實作一 (SAS EM 分群分析 ) ： Case Study 1 (Cluster Analysis – K-Means using SAS EM) Banking Segmentation 31

行銷客戶分群 32

33 案例情境 ABC 銀行的行銷部門想要針對該銀行客戶的使用行為，進行分群分析，以了解現行客戶對本行的往來方式，並進一步提供適宜的行銷接觸模式。該銀行從有效戶 ( 近三個月有交易者 ) ，取出 10 萬筆樣本資料。依下列四種交易管道計算交易次數： – 傳統臨櫃交易 (TBM) – 自動櫃員機交易 (ATM) – 銀行專員服務 (POS) – 電話客服 (CSC) Source: SAS Enterprise Miner Course Notes, 2014, SAS

34 資料欄位說明資料集名稱： profile.sas7bdat Source: SAS Enterprise Miner Course Notes, 2014, SAS

35 分析目的依據各往來交易管道 TBM 、 ATM 、 POS 、 CSC 進行客戶分群分析。演練重點 : 極端值資料處理分群變數選擇衍生變數產出分群參數調整與分群結果解釋行銷客戶分群實機演練 Source: SAS Enterprise Miner Course Notes, 2014, SAS

SAS Enterprise Miner (SAS EM) Case Study SAS EM 資料匯入 4 步驟 – Step 1. 新增專案 (New Project) – Step 2. 新增資料館 (New / Library) – Step 3. 建立資料來源 (Create Data Source) – Step 4. 建立流程圖 (Create Diagram) SAS EM SEMMA 建模流程 36

37 Download EM_Data.zip (SAS EM Datasets)

Upzip EM_Data.zip to C:\DATA\EM_Data 38

Upzip EM_Data.zip to C:\DATA\EM_Data 39

VMware Horizon View Client softcloud.tku.edu.tw SAS Enterprise Miner 40

SAS Locale Setup Manager  English UI 41

SAS Enterprise Guide 5.1 (SAS EG) 42

SAS Enterprise Miner 12.1 (SAS EM) 43

SAS Locale Setup Manager

45 Source:

46 C:\Program Files\SASHome\SASEnterpriseMinerWorkstationConfiguration\12.1\em.exe

47

48

49

SAS Enterprise Miner 12.1 (SAS EM) 50

SAS Enterprise Miner (SAS EM) 51 SAS Enterprise Miner (SAS EM) English UI

SAS Enterprise Guide 5.1 (SAS EG) Open SAS.sas7bdat File Export to Excel.xlsx File Import Excel.xlsx File Export to SAS.sas7bdat File 52

SAS Enterprise Guide 5.1 (SAS EG) 53

New Project 54

Open SAS Data File: Profile.sas7bdat 55

56

57

58

59

60

61

62

63

64

65

66 Export SAS.sas7bdat to to Excel.xlsx File

Import Excel File to SAS EG 67

68 Import Excel File to SAS EG

69

70

71

72

73

74

75

Export Excel.xlsx File to SAS.sas7bdat File 76

77

78

79

Profile_Excel.xlsx 80

Profile_SAS.sas7bdat 81

SAS Enterprise Miner 12.1 (SAS EM) 82

SAS EM 資料匯入 4 步驟 Step 1. 新增專案 (New Project) Step 2. 新增資料館 (New / Library) Step 3. 建立資料來源 (Create Data Source) Step 4. 建立流程圖 (Create Diagram) 83

Step 1. 新增專案 (New Project) 84

85 Step 1. 新增專案 (New Project)

86 Step 1. 新增專案 (New Project)

SAS Enterprise Miner (EM_Project1) 87

Step 2. 新增資料館 (New / Library) 88

89 Step 2. 新增資料館 (New / Library)

90 Step 2. 新增資料館 (New / Library)

91 Step 2. 新增資料館 (New / Library)

92 Step 2. 新增資料館 (New / Library)

Step 3. 建立資料來源 (Create Data Source) 93

94 Step 3. 建立資料來源 (Create Data Source)

95 Step 3. 建立資料來源 (Create Data Source)

96 Step 3. 建立資料來源 (Create Data Source)

97 Step 3. 建立資料來源 (Create Data Source)

98 EM_LIB.PROFILE Step 3. 建立資料來源 (Create Data Source)

99 Step 3. 建立資料來源 (Create Data Source)

100 Step 3. 建立資料來源 (Create Data Source)

101 Step 3. 建立資料來源 (Create Data Source)

102 Step 3. 建立資料來源 (Create Data Source)

103 Step 3. 建立資料來源 (Create Data Source)

104 Step 3. 建立資料來源 (Create Data Source)

105 Step 3. 建立資料來源 (Create Data Source)

106 Step 3. 建立資料來源 (Create Data Source)

107 Step 3. 建立資料來源 (Create Data Source)

108 Step 4. 建立流程圖 (Create Diagram)

109 Step 4. 建立流程圖 (Create Diagram)

110 Step 4. 建立流程圖 (Create Diagram)

SAS Enterprise Miner (SAS EM) Case Study SAS EM 資料匯入 4 步驟 – Step 1. 新增專案 (New Project) – Step 2. 新增資料館 (New / Library) – Step 3. 建立資料來源 (Create Data Source) – Step 4. 建立流程圖 (Create Diagram) SAS EM SEMMA 建模流程 111

112 案例情境模型流程

113

EM_Lib.Profile 114

115

116

117

118

119

篩選－間隔變數 … 120

121

篩選－匯入的資料 … [ 勘查 ] 122

123 篩選－匯入的資料

124 篩選－匯出的資料 … [ 勘查 ]

125 篩選－匯出的資料

勘查 (Explore) －群集 (Cluster) 126

127

128

129

130

群集 (Cluster) 結果 131

評估 (Assess) －區段特徵描繪 (Segment Profile) 132 區段特徵描繪 (Segment Profile)

133 評估 (Assess) －區段特徵描繪 (Segment Profile)

134 評估 (Assess) －區段特徵描繪 (Segment Profile)

135 評估 (Assess) －區段特徵描繪 (Segment Profile)

136 評估 (Assess) －區段特徵描繪 (Segment Profile)

區段特徵描繪 (Segment Profile) 137

138 區段特徵描繪 (Segment Profile)

References Efraim Turban, Ramesh Sharda, Dursun Delen, Decision Support and Business Intelligence Systems, Ninth Edition, 2011, Pearson. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Second Edition, 2006, Elsevier Jim Georges, Jeff Thompson and Chip Wells, Applied Analytics Using SAS Enterprise Miner, SAS, 2010 SAS Enterprise Miner Course Notes, 2014, SAS SAS Enterprise Miner Training Course, 2014, SAS SAS Enterprise Guide Training Course, 2014, SAS 139