Big Data Mining 巨量資料探勘 1 1042DM01 MI4 (M2244) (3094) Tue, 3, 4 (10:10-12:00) (B216) Course Orientation for Big Data Mining ( 巨量資料探勘課程介紹 ) Min-Yuh Day 戴敏育.

Slides:



Advertisements
Similar presentations
Data Warehousing 資料倉儲 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University Dept. of Information ManagementTamkang.
Advertisements

Case Study for Information Management 資訊管理個案
HSR 課程介紹. 指定用書 Health Services Research Method Leiyu Shi 2008.
Chapter 0 Computer Science (CS) 計算機概論 教學目標 瞭解現代電腦系統之發展歷程 瞭解電腦之元件、功能及組織架構 瞭解電腦如何表示資料及其處理方式 學習運用電腦來解決問題 認知成為一位電子資訊人才所需之基本條 件 認知進階電子資訊之相關領域.
1 真理大學運輸管理學系 實務實習說明 目錄  實務實習類別  實務實習條例  校外實習單位  實務實習成績計算方式  校外實習甄選 / 自洽申請流程  附錄:相關表格.
元智大學應用外語系碩士班 Department of Foreign Languages and Applied Linguistics Master’s Program.
真理大學航空運輸管理學系 實務實習說明. 實務實習部份 實務實習 校內實習 校外實習 實習時數必須在 300 小時 ( 含 ) 以上才承認 校內實習時數及實習成績。 二個寒假 各一個月 暑假兩個月.
統計資訊軟體應用 授課者:蔡桂宏 系別:應用統計資訊系 職務:專任副教授 連絡: 轉 3485 系辦
物流通關專業教室 (052) 國貿實務專業教室 (054) 企業資源整合專業教室 (055) 整合各專業教室資訊進行 即時動態及異常管理 (051) 貿易運籌研訓中心 專業實習、 、 師生研究討論 、、 、、 海關模擬系統、貨況追蹤、貨物 進出倉管理、海空運通關承攬、 通關自動化作業等相關模組 全球運籌決策中心.
1 數位控制(一) 2 數位控制 課程計畫 課程目標 介紹數位控制理論 與工業界常用之數位控制器比較 實習數位控制器之模擬與設計 課程綱要 Introduction to Digital Control System The z Transform z-Plane Analysis of Discrete-Time.
國立中央大學電機工程學系 99 學年度第 2 學期 助教會議 中央大學電機工程學系 工程認證 1.
CS1103 電機資訊工程實習 Department of Computer Science National Tsing Hua University.
電子計算機概論電子計算機概論 教科書 計算機概論 Introduction to Computers 原著: Peter Norton 審閱: 陳正雄‧趙立本‧簡文山‧林碧蘭 編譯:普羅數位科技 總審閱:林志敏 NT 590 洽助教.
論文研討 ( 一 ) B 組 課程簡介 劉美纓 / 尚榮安 / 胡凱傑 2009/09/17. 一、課程基本資料 科目名稱: ( 中文 ) 論文研討(一)B組 ( 英文 ) SEMINARS (I) 開課學期: 98 學年度第 1 學期 開課班級:企碩一 學 分 數: 2 學分 星期節次: 四 34.
1 Syllabus Computer Network 計算機網路 賴秉樑 Dept. of Electronic Engineering National Chin-Yi University of Technology Spring 2008.
大華技術學院九十三學年度 資工系計算機概論教學大綱 吳弘翔. Wu Hung-Hsiang2 科目名稱:計算機概論與實習 適用班別:夜資工技一A 授課老師:吳弘翔 學分數: 4 修別:必修 老師信箱:
鄭瑞興的個人簡介 中山資工所 鄭瑞興.
教材名稱:網際網路安全之技術及其應用 (編號: 41 ) 計畫主持人:胡毓忠 副教授 聯絡電話: 教材網址: 執行單位: 政治大學資訊科學系.
1 高等演算法 授課老師 : 陳建源 研究室 : 法 401 網站
大華技術學院九十五學年度 資工系計算機概論教學大綱 吳弘翔. Wu Hung-Hsiang2 科目名稱:計算機概論與實習 授課老師:吳弘翔 學分數: 4 修別:必修 老師信箱:
寬頻通訊系統基礎教育計畫 分項計畫二 寬頻網路通訊 主要參與人員 黎碧煌 教 授 鍾順平 副教授
統計學 ( 二 ) 朝陽科技大學工業工程與管理系副教授洪弘祈 Statistics II2 企業與統計之關係 n 品質管制 n 預測統計與市場調查 n 績效與人事管理 n 例行報告之方案評估與決策參考 n 製程改善 n 研發能力之提昇 n 產品可靠度 n 生產管制.
McGraw-Hill/Irwin © 2003 The McGraw-Hill Companies, Inc.,All Rights Reserved. 壹 企業研究導論.
1 高等管理資訊系統. 2 授課教師 : 王耀德 研究室 : 主顧 686 電話 : (04) # 課輔時間 Wednesday 09:00~13:00 介紹.
Case Study for Information Management 資訊管理個案
Data Warehousing 資料倉儲 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University Dept. of Information ManagementTamkang.
Social Media Marketing Research 社會媒體行銷研究 SMMR01 TMIXM1A Thu 7,8 (14:10-16:00) U505 Course Orientation for Social Media Marketing Research Min-Yuh.
Business Intelligence 商業智慧 BI01 IM EMBA Fri 12,13,14 (19:20-22:10) D502 Introduction to Business Intelligence 商業智慧導論 Min-Yuh Day 戴敏育 Assistant Professor.
Case Study for Information Management 資訊管理個案 CSIM4B06 TLMXB4B (M1824) Tue 2, 3, 4 (9:10-12:00) B502 Foundations of Business Intelligence - Database.
企業網路系統研究 (Enterprise Networking Systems Research) (VMware vSphere) VM01 MI4 Thu 9,10 (16:10-18:00) (B310) Course Introduction ( 課程介紹 ) Min-Yuh Day.
社會媒體服務專題 Special Topics in Social Media Services 淡江大學 淡江大學 資訊管理學系 資訊管理學系 Dept. of Information ManagementDept. of Information Management, Tamkang UniversityTamkang.
Business Intelligence Trends 商業智慧趨勢 BIT08 MIS MBA Mon 6, 7 (13:10-15:00) Q407 商業智慧導入與趨勢 (Business Intelligence Implementation and Trends) Min-Yuh.
Social Media Management 社會媒體管理 SMM01 TMIXM1A Fri. 7,8 (14:10-16:00) L215 Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management,
Case Study for Information Management 資訊管理個案 CSIM4B07 TLMXB4B (M1824) Tue 2, 3, 4 (9:10-12:00) B502 Telecommunications, the Internet, and Wireless.
Data Mining 資料探勘 Introduction to Data Mining 資料探勘導論 Min-Yuh Day 戴敏育
商業智慧實務 Practices of Business Intelligence BI01 MI4 Wed, 9,10 (16:10-18:00) (B113) 商業智慧導論 (Introduction to Business Intelligence) Min-Yuh Day 戴敏育.
Social Media Marketing Management 社會媒體行銷管理 SMMM12 TLMXJ1A Tue 12,13,14 (19:20-22:10) D325 確認性因素分析 (Confirmatory Factor Analysis) Min-Yuh Day 戴敏育.
商業智慧實務 Practices of Business Intelligence BI06 MI4 Wed, 9,10 (16:10-18:00) (B130) 資料科學與巨量資料分析 (Data Science and Big Data Analytics) Min-Yuh Day 戴敏育.
Case Study for Information Management 資訊管理個案
Data Mining 資料探勘 DM04 MI4 Thu 9, 10 (16:10-18:00) B216 分群分析 (Cluster Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management,
Case Study for Information Management 資訊管理個案 CSIM4B08 TLMXB4B Thu 8, 9, 10 (15:10-18:00) B508 Securing Information System: 1. Facebook, 2. European.
Case Study for Information Management 資訊管理個案 CSIM4B03 TLMXB4B (M1824) Tue 3,4 (10:10-12:00) L212 Thu 9 (16:10-17:00) B601 Global E-Business and Collaboration:
商業智慧實務 Practices of Business Intelligence BI01 MI4 Wed, 9,10 (16:10-18:00) (B130) 商業智慧導論 (Introduction to Business Intelligence) Min-Yuh Day 戴敏育.
Business Intelligence Trends 商業智慧趨勢 BIT01 MIS MBA Mon 6, 7 (13:10-15:00) Q407 Course Orientation for Business Intelligence Trends 商業智慧趨勢課程介紹 Min-Yuh.
Data Mining 資料探勘 DM04 MI4 Wed, 6,7 (13:10-15:00) (B216) 分群分析 (Cluster Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management,
Social Media Marketing Management 社會媒體行銷管理 SMMM01 TLMXJ1A Tue 12,13,14 (19:20-22:10) D325 Course Orientation of Social Media Marketing Management.
Management 周 宇 超 Office: SL412 Tel:
Data Mining 資料探勘 DM01 MI4 Wed, 6,7 (13:10-15:00) (B216) Introduction to Data Mining ( 資料探勘導論 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of.
Data Mining 資料探勘 Introduction to Data Mining (資料探勘導論) Min-Yuh Day 戴敏育
Case Study for Information Management 資訊管理個案
Social Computing and Big Data Analytics 社群運算與大數據分析
Case Study for Information Management 資訊管理個案 CSIM4C01 TLMXB4C Mon 8, 9, 10 (15:10-18:00) B602 Introduction to Case Study for Information Management.
巨量資料基礎: MapReduce典範、Hadoop與Spark生態系統
Case Study for Information Management 資訊管理個案 CSIM4C01 TLMXB4C (M1824) Tue 2 (9:10-10:00) L212 Thu 7,8 (14:10-16:00) B601 Introduction to Case Study.
社群網路行銷管理 Social Media Marketing Management SMMM01 MIS EMBA (M2200) (8615) Thu, 12,13,14 (19:20-22:10) (D309) 社群網路行銷管理課程介紹 (Course Orientation for.
Social Computing and Big Data Analytics 社群運算與大數據分析
Social Computing and Big Data Analytics 社群運算與大數據分析 SCBDA02 MIS MBA (M2226) (8628) Wed, 8,9, (15:10-17:00) (Q201) Data Science and Big Data Analytics:
Case Study for Information Management 資訊管理個案 CSIM4B01 TLMXB4B (M1824) Tue 2, 3, 4 (9:10-12:00) B502 Introduction to Case Study for Information Management.
Big Data Mining 巨量資料探勘 DM05 MI4 (M2244) (3094) Tue, 3, 4 (10:10-12:00) (B216) 分群分析 (Cluster Analysis) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授.
巨量資料基礎: MapReduce典範、Hadoop與Spark生態系統
資訊管理專題 Hot Issues of Information Management
Social Computing and Big Data Analytics 社群運算與大數據分析
Course Orientation for Big Data Mining (巨量資料探勘課程介紹)
大數據行銷研究 Big Data Marketing Research
Case Study for Information Management 資訊管理個案
分群分析 (Cluster Analysis)
Case Study for Information Management 資訊管理個案
Case Study for Information Management 資訊管理個案
Case Study for Information Management 資訊管理個案
Presentation transcript:

Big Data Mining 巨量資料探勘 DM01 MI4 (M2244) (3094) Tue, 3, 4 (10:10-12:00) (B216) Course Orientation for Big Data Mining ( 巨量資料探勘課程介紹 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University Dept. of Information ManagementTamkang University 淡江大學 淡江大學 資訊管理學系 資訊管理學系 tku.edu.tw/myday/ Tamkang University

淡江大學 104 學年度第 2 學期 課程教學計畫表 Spring 2016 ( ) 課程名稱:巨量資料探勘 (Big Data Mining) 授課教師:戴敏育 (Min-Yuh Day) 開課系級:資管四 P (TLMXB4P) (M2244) (3094) 開課資料:選修 單學期 2 學分 (2 Credits, Elective) 上課時間:週二 3,4 (Tue 10:10-12:00) 上課教室: B216 2

課程簡介 本課程介紹巨量資料探勘 (Big Data Mining) 的 基礎概念及應用技術。 課程內容包括 – 巨量資料探勘 (Big Data Mining) – 巨量資料基礎: MapReduce 典範、 Hadoop 與 Spark 生態系統 (Fundamental Big Data: MapReduce Paradigm, Hadoop and Spark Ecosystem ) – 關連分析 (Association Analysis) – 分類與預測 (Classification and Prediction) – 分群分析 (Cluster Analysis) – SAS 企業資料採礦實務 (SAS EM) – 巨量資料探勘個案分析與實作 – Google TensorFlow 深度學習 (Deep Learning with Google TensorFlow) 3

Course Introduction This course introduces the fundamental concepts and applications technology of big data mining. Topics include – Big Data Mining – Fundamental Big Data: MapReduce Paradigm, Hadoop and Spark Ecosystem – Association Analysis – Classification and Prediction – Cluster Analysis – Data Mining Using SAS Enterprise Miner (SAS EM) – Case Study and Implementation of Big Data Mining – Deep Learning with Google TensorFlow 4

課程目標 (Objective) 瞭解及應用巨量資料探勘基本概念與技術。 Understand and apply the fundamental concepts and technology of big data mining 5

週次 (Week) 日期 (Date) 內容 (Subject/Topics) /02/16 巨量資料探勘課程介紹 (Course Orientation for Big Data Mining) /02/23 巨量資料基礎: MapReduce 典範、 Hadoop 與 Spark 生態系統 (Fundamental Big Data: MapReduce Paradigm, Hadoop and Spark Ecosystem) /03/01 關連分析 (Association Analysis) /03/08 分類與預測 (Classification and Prediction) /03/15 分群分析 (Cluster Analysis) /03/22 個案分析與實作一 (SAS EM 分群分析 ) : Case Study 1 (Cluster Analysis – K-Means using SAS EM) /03/29 個案分析與實作二 (SAS EM 關連分析 ) : Case Study 2 (Association Analysis using SAS EM) 課程大綱 (Syllabus) 6

週次 (Week) 日期 (Date) 內容 (Subject/Topics) /04/05 教學行政觀摩日 (Off-campus study) /04/12 期中報告 (Midterm Project Presentation) /04/19 期中考試週 (Midterm Exam) /04/26 個案分析與實作三 (SAS EM 決策樹、模型評估 ) : Case Study 3 (Decision Tree, Model Evaluation using SAS EM) /05/03 個案分析與實作四 (SAS EM 迴歸分析、類神經網路 ) : Case Study 4 (Regression Analysis, Artificial Neural Network using SAS EM) /05/10 Google TensorFlow 深度學習 (Deep Learning with Google TensorFlow) /05/17 期末報告 (Final Project Presentation) /05/24 畢業班考試 (Final Exam) 課程大綱 (Syllabus) 7

教學方法與評量方法 教學方法 – 講述、討論、賞析、模擬、實作、問題解決 評量方法 – 紙筆測驗、實作、報告、上課表現 8

教材課本 – 講義 (Slides) – 資料採礦運用 : 以 SAS Enterprise Miner 為工具, 李淑娟, 2015 , SAS 賽仕電腦軟體 參考書籍 – Big Data, Data Mining, and Machine Learning: Value Creation for Business Leaders and Practitioners, Jared Dean, Wiley, 2014 – Data Science for Business: What you need to know about data mining and data-analytic thinking, Foster Provost and Tom Fawcett, O'Reilly, 2013 – Applied Analytics Using SAS Enterprise Mining, Jim Georges, Jeff Thompson and Chip Wells, SAS, 2010 – Data Mining: Concepts and Techniques, Third Edition, Jiawei Han, Micheline Kamber and Jian Pei, Morgan Kaufmann,

作業與學期成績計算方式 作業篇數 – 3 篇 學期成績計算方式 –  期中評量: 30 % –  期末評量: 30 % –  其他(課堂參與及報告討論表現): 40 % 10

Team Term Project Term Project Topics – Big Data mining – Web and Text mining – Business Intelligence – Big Data Analytics – Social Computing 3-4 人為一組 – 分組名單於 2016/02/23 ( 二 ) 課程下課時繳交 – 由班代統一收集協調分組名單 11

2016/02/23 巨量資料基礎: MapReduce 典範、 Hadoop 與 Spark 生態系統 (Fundamental Big Data: MapReduce Paradigm, Hadoop and Spark Ecosystem) 12

2016/05/10 Google TensorFlow 深度學習 (Deep Learning with Google TensorFlow) 13

Big Data Analytics and Data Mining 14

Stephan Kudyba (2014), Big Data, Mining, and Analytics: Components of Strategic Decision Making, Auerbach Publications 15 Source:

Architecture of Big Data Analytics 16 Source: Stephan Kudyba (2014), Big Data, Mining, and Analytics: Components of Strategic Decision Making, Auerbach Publications Data Mining OLAP Reports Queries Hadoop MapReduce Pig Hive Jaql Zookeeper Hbase Cassandra Oozie Avro Mahout Others Middleware Extract Transform Load Data Warehouse Traditional Format CSV, Tables * Internal * External * Multiple formats * Multiple locations * Multiple applications Big Data Sources Big Data Transformation Big Data Platforms & Tools Big Data Analytics Applications Big Data Analytics Transformed Data Raw Data

Architecture of Big Data Analytics 17 Source: Stephan Kudyba (2014), Big Data, Mining, and Analytics: Components of Strategic Decision Making, Auerbach Publications Data Mining OLAP Reports Queries Hadoop MapReduce Pig Hive Jaql Zookeeper Hbase Cassandra Oozie Avro Mahout Others Middleware Extract Transform Load Data Warehouse Traditional Format CSV, Tables * Internal * External * Multiple formats * Multiple locations * Multiple applications Big Data Sources Big Data Transformation Big Data Platforms & Tools Big Data Analytics Applications Big Data Analytics Transformed Data Raw Data Data Mining Big Data Analytics Applications

Social Big Data Mining (Hiroshi Ishikawa, 2015) 18 Source:

Architecture for Social Big Data Mining (Hiroshi Ishikawa, 2015) 19 Hardware Software Social Data Physical Layer Logical Layer Integrated analysis Multivariate analysis Application specific task Data Mining Conceptual Layer Enabling TechnologiesAnalysts Model Construction Explanation by Model Construction and confirmation of individual hypothesis Description and execution of application-specific task Integrated analysis model Natural Language Processing Information Extraction Anomaly Detection Discovery of relationships among heterogeneous data Large-scale visualization Parallel distrusted processing Source: Hiroshi Ishikawa (2015), Social Big Data Mining, CRC Press

Business Intelligence (BI) Infrastructure 20 Source: Kenneth C. Laudon & Jane P. Laudon (2014), Management Information Systems: Managing the Digital Firm, Thirteenth Edition, Pearson.

Data Warehouse Data Mining and Business Intelligence Increasing potential to support business decisions End User Business Analyst Data Analyst DBA Decision Making Data Presentation Visualization Techniques Data Mining Information Discovery Data Exploration Statistical Summary, Querying, and Reporting Data Preprocessing/Integration, Data Warehouses Data Sources Paper, Files, Web documents, Scientific experiments, Database Systems 21 Source: Jiawei Han and Micheline Kamber (2006), Data Mining: Concepts and Techniques, Second Edition, Elsevier

The Evolution of BI Capabilities Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 22

Data Mining 23 Source:

24 Source: 郝沛毅, 李御璽, 黃嘉彥 編譯, 資料探勘 (Jiawei Han, Micheline Kamber, Jian Pei, Data Mining - Concepts and Techniques 3/e), 高立圖書, 2014

Data Mining at the Intersection of Many Disciplines Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 25

Knowledge Discovery (KDD) Process Data Cleaning Data Integration Databases Data Warehouse Task-relevant Data Selection Data Mining Pattern Evaluation 26 Source: Han & Kamber (2006) Data mining: core of knowledge discovery process

A Taxonomy for Data Mining Tasks Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 27

28 Source:

Deep Learning Intelligence from Big Data 29 Source:

30 Source:

31 Source:

32 Source:

Big Data with Hadoop Architecture 33 Source:

34 Big Data with Hadoop Architecture Logical Architecture Processing: MapReduce Source:

35 Big Data with Hadoop Architecture Logical Architecture Storage: HDFS Source:

36 Big Data with Hadoop Architecture Process Flow Source:

37 Big Data with Hadoop Architecture Hadoop Cluster Source:

Traditional ETL Architecture 38 Source:

39 Source: Offload ETL with Hadoop (Big Data Architecture)

Big Data Solution 40 Source:

HDP A Complete Enterprise Hadoop Data Platform 41 Source:

Spark and Hadoop 42 Source:

Spark Ecosystem 43 Source:

Python for Big Data Analytics 44 (The column on the left is the 2015 ranking; the column on the right is the 2014 ranking for comparison Source:

45 Source:

Yves Hilpisch, Python for Finance: Analyze Big Financial Data, O'Reilly, Source:

Business Insights with Social Analytics 47

Analyzing the Social Web: Social Network Analysis 48

49 Source: Jennifer Golbeck (2013), Analyzing the Social Web, Morgan Kaufmann

Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites 50 Source:

Web Mining Success Stories Amazon.com, Ask.com, Scholastic.com, … Website Optimization Ecosystem Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 51

Business Intelligence Trends 1.Agile Information Management (IM) 2.Cloud Business Intelligence (BI) 3.Mobile Business Intelligence (BI) 4.Analytics 5.Big Data 52 Source:

Business Intelligence Trends: Computing and Service Cloud Computing and Service Mobile Computing and Service Social Computing and Service 53

Business Intelligence and Analytics Business Intelligence 2.0 (BI 2.0) – Web Intelligence – Web Analytics – Web 2.0 – Social Networking and Microblogging sites Data Trends – Big Data Platform Technology Trends – Cloud computing platform 54 Source: Lim, E. P., Chen, H., & Chen, G. (2013). Business Intelligence and Analytics: Research Directions. ACM Transactions on Management Information Systems (TMIS), 3(4), 17

Business Intelligence and Analytics: Research Directions 1. Big Data Analytics – Data analytics using Hadoop / MapReduce framework 2. Text Analytics – From Information Extraction to Question Answering – From Sentiment Analysis to Opinion Mining 3. Network Analysis – Link mining – Community Detection – Social Recommendation 55 Source: Lim, E. P., Chen, H., & Chen, G. (2013). Business Intelligence and Analytics: Research Directions. ACM Transactions on Management Information Systems (TMIS), 3(4), 17

56 Source: Davenport, T. H., & Patil, D. J. (2012). Data Scientist. Harvard business review

SAS 第五屆大數據資料科學家競賽 文字分析與數位行銷大賽 57

58 SAS 第五屆大數據資料科學家競賽 文字分析與數位行銷大賽

Summary This course introduces the fundamental concepts and applications technology of big data mining. Topics include – Big Data Mining – Fundamental Big Data: MapReduce Paradigm, Hadoop and Spark Ecosystem – Association Analysis – Classification and Prediction – Cluster Analysis – Data Mining Using SAS Enterprise Miner (SAS EM) – Case Study and Implementation of Big Data Mining – Deep Learning with Google TensorFlow 59

Contact Information 戴敏育 博士 (Min-Yuh Day, Ph.D.) 專任助理教授 淡江大學 淡江大學 資訊管理學系 資訊管理學系 電話: #2846 傳真: 研究室: B929 地址: 新北市淡水區英專路 151 號 : 網址: