CSC 4740 / 6740 Data Mining, Fall 2016. Instructor: Yubao Wu



Welcome!
Instructor: Yubao Wu
Office: 25 Park Place Suite 737
Phone: (office)
Website:
Office Hours: 4:00 pm - 5:30 pm, Wednesday; 3:30 pm - 5:00 pm, Friday; or by appointment

Classroom and Date
Classroom: Petit Science Center 230
Date/Time: Monday/Wednesday, 10:00 am - 11:45 am

Textbook
Data Mining: Concepts and Techniques, Third Edition, by Jiawei Han, Micheline Kamber, and Jian Pei, Morgan Kaufmann Publishers, ISBN:

References
- Introduction to Data Mining, by Tan, Steinbach, and Kumar, Addison Wesley, (ISBN: )
- Principles of Data Mining, by Hand, Mannila, and Smyth, MIT Press, (ISBN: X)
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Hastie, Tibshirani, and Friedman, Springer, (ISBN: )

Course Content
Basic data mining techniques
- Association rules mining
- Sequential Patterns
- Classification and Prediction
- Clustering and Outlier Detection
- Regression
- Pattern Interestingness
- Dimensionality Reduction
- ……
Big data mining applications
- Web data mining
- Bioinformatics
- Social networks
- Text mining
- Visualization
- Financial data analysis
- Software Engineering
- ……

Course Requirements
Course Requirements:
- Basic theoretical principles
- Practical hands-on experience
Prerequisite: CSC 3410 Data Structures
- Assignments
- Mid-term Exam
- Final Exam
- Research Project
The department will strictly enforce all prerequisites. Students without proper prerequisites will be dropped from the class, without any prior notice, at any time during the semester.

Assignments and Exams
Mid-Term Exam: Open Textbook
Final Exam: Open Textbook
The problems for CSC 4740 and CSC 6740 may be different.

Research Projects
CSC 4740: One or two undergraduate students form a group. Each group does a project and submits one project report.
CSC 6740: Each graduate student does a project and submits one project report.

Research Projects
A research project discovers interesting relationships within a significant amount of data.
Some project ideas (examples only; it is best to propose your own):
- Statistical Computing (speed up traditional statistical methods, such as correlation computation)
- Data Mining in Business Applications (Customer Segmentation, Accounting, Marketing)
- Literature Survey
- Mining Biological Datasets
- Social Network Analysis
- Your own ideas

Research Projects
- Project proposal (2 - 4 pages, ACM SIGKDD or IEEE ICDM template)
  - Title, project idea, survey of related work, data source, key algorithms/technology, and what you expect to submit at the end of the semester.
- Final report ( pages, ACM SIGKDD or IEEE ICDM template)
  - A comprehensive description of your project.
  - Project idea, extended survey of related work, detailed algorithm/technology, specific implementation, key results
  - What worked, what did not work, what surprised you, and why

Research Projects
CSC 4740:
- Project Proposal
- Final Report
- Software, user manual, and sample dataset
CSC 6740:
- Project Proposal
- Final Report
- Software, user manual, and sample dataset
- Slides

Research Projects
Final presentation
- In the last few classes, each graduate student presents his/her project to the rest of the class.
- About 15 minutes of presentation plus 2 minutes of questions
Checkpoints
- Proposal (due Sep 21): ~ 1 month
- Final Report (due Dec 5): ~ 2 months

Class Policy:
- Attendance: Students are required to attend all classes.
- Academic honesty: Plagiarism will result in a score of zero on the test or project. The instructor has the right to make a decision.
- Assignments and Projects: They must be handed in on time and will not be accepted when past due.
- Withdrawals: Tuesday, Oct 11 is the last day to withdraw and possibly receive a W.
- Make-ups: Require the instructor's special permission.

Grading Policy:

Component        CSC 4740   CSC 6740
Mid-term Exam    25%        20%
Final Exam       25%        20%
Assignments      30%        25%
Project          15%        30%
Attendance       5%         5%

Letter grades:
A+ [97, 100]   A [93, 97)   A- [90, 93)
B+ [87, 90)    B [83, 87)   B- [80, 83)
C+ [77, 80)    C [73, 77)   C- [70, 73)
D  [60, 70)    F [0, 60)

A student whose score is at least 97 receives an A+. The scores may be adjusted if the average is low.
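As a rough illustration of how the weights and letter-grade intervals in the grading policy combine, the Python sketch below computes a weighted total and maps it to a letter. It assumes each component is scored on a 0-100 scale, that attendance counts 5% in both courses, and that no adjustment for a low class average is applied. The names WEIGHTS, CUTOFFS, weighted_score, and letter_grade are hypothetical and are not part of the course materials.

```python
# Component weights (percent of the final score), taken from the grading policy.
# Assumption: the 5% attendance weight applies to both courses.
WEIGHTS = {
    "CSC 4740": {"midterm": 25, "final": 25, "assignments": 30,
                 "project": 15, "attendance": 5},
    "CSC 6740": {"midterm": 20, "final": 20, "assignments": 25,
                 "project": 30, "attendance": 5},
}

# Lower bound of each letter-grade interval, highest first.
CUTOFFS = [(97, "A+"), (93, "A"), (90, "A-"),
           (87, "B+"), (83, "B"), (80, "B-"),
           (77, "C+"), (73, "C"), (70, "C-"),
           (60, "D"), (0, "F")]


def weighted_score(course, scores):
    """Combine component scores (each assumed to be on a 0-100 scale)."""
    weights = WEIGHTS[course]
    return sum(weights[name] * scores[name] for name in weights) / 100


def letter_grade(score):
    """Map a total score to a letter using the half-open intervals."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower, letter in CUTOFFS:
        if score >= lower:
            return letter


# Hypothetical component scores, for illustration only.
example = {"midterm": 88, "final": 92, "assignments": 95,
           "project": 90, "attendance": 100}

for course in WEIGHTS:
    total = weighted_score(course, example)
    print(course, total, letter_grade(total))
# CSC 4740 92.0 A-
# CSC 6740 91.75 A-
```

The same component scores give different totals in the two courses because the project carries twice the weight in CSC 6740.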

Tentative Course Outline and Schedule:
- Chapter 1 Introduction (Aug. 22)
- Chapter 2 Getting to Know Your Data; Chapter 3 Data Preprocessing (Aug. 24)
- Chapter 6 Mining Frequent Patterns, Associations, and Correlations: Basic Concepts and Methods (Aug. 29, 31, Sep. 7)
- Chapter 8 Classification: Basic Concepts (Sep. 12)
- Chapter 9 Classification: Advanced Methods (Sep. 14, 19, 21)
- Project Proposal Due (6 pm Eastern Time, Sep. 21)

Tentative Course Outline and Schedule:
- Chapter 10 Cluster Analysis: Basic Concepts and Methods (Sep. 26, 28, Oct. 5, 10)
- Mid-term Exam (Oct. 3)
- Chapter 11 Advanced Cluster Analysis (Oct. 12, 17, 19, 24)
- Chapter 13 Data Mining Trends and Research Frontiers (Oct. 26, 31, Nov. 2, 7, 9, 14)
- Project Presentations (Nov. 16, 28, 30)
- Final Exam (Dec. 5)
- Research Project Due (6 pm Eastern Time, Dec. 8)

KDD References
- Data mining and KDD
  - Conferences: ACM-SIGKDD, IEEE-ICDM, SIAM-DM, PKDD, PAKDD, etc.
  - Journals: ACM-KDD, Data Mining and Knowledge Discovery, KDD Explorations
- Database systems
  - Conferences: ACM-SIGMOD, ACM-PODS, VLDB, IEEE-ICDE, EDBT, ICDT, DASFAA
  - Journals: ACM-TODS, IEEE-TKDE, JIIS, J. ACM, etc.
- AI & Machine Learning
  - Conferences: Machine learning (ICML), AAAI, IJCAI, COLT (Learning Theory), etc.
  - Journals: Machine Learning, Artificial Intelligence, etc.

KDD References
- Statistics
  - Conferences: Joint Stat. Meeting, etc.
  - Journals: Annals of Statistics, etc.
- Bioinformatics
  - Conferences: ISMB, RECOMB, PSB, CSB, BIBE, etc.
  - Journals: J. of Computational Biology, Bioinformatics, PLoS Computational Biology, etc.

Questions?