2015/6/1 Course Introduction
Welcome! MSCIT 521: Knowledge Discovery and Data Mining
Qiang Yang
Hong Kong University of Science and Technology
qyang@cs.ust.hk
http://www.cs.ust.hk
Data Mining: An Example
- KDD Cup from past years
  - 2007:
    - Predict whether a user is going to rate a movie
    - Predict how many users are going to rate a movie
  - 2006:
    - Predict whether a patient has cancer from medical images
  - 2005:
    - Given a web query ("Apple"), predict the categories (IT, Food)
  - 1998:
    - Given a person, predict whether this person is going to donate money
- In general, we wish to
  - Input: Data
  - Output:
    - Build model
    - Apply model to future data
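The build-then-apply pattern on this slide can be sketched in a few lines of Python. This is only an illustration, not the method used in any KDD Cup: the function names `build_model` and `apply_model`, the age-group feature, and the toy donation records are all hypothetical. The "model" here is just a lookup table that remembers the most common outcome for each input value.

```python
from collections import Counter, defaultdict

def build_model(records):
    """Build step: learn the most common label for each input value."""
    counts = defaultdict(Counter)
    for feature, label in records:
        counts[feature][label] += 1
    # Fallback for inputs never seen in the past data: the overall majority label.
    overall = Counter(label for _, label in records)
    default = overall.most_common(1)[0][0]
    table = {feature: c.most_common(1)[0][0] for feature, c in counts.items()}
    return table, default

def apply_model(model, feature):
    """Apply step: predict the label for a new, future input."""
    table, default = model
    return table.get(feature, default)

# Hypothetical past data: (age group, donated?)
past = [("senior", True), ("senior", True), ("senior", False),
        ("young", False), ("young", False), ("adult", False)]

model = build_model(past)
print(apply_model(model, "senior"))  # a group seen in the past data
print(apply_model(model, "teen"))    # an unseen group, uses the fallback
```

Real predictive models replace the lookup table with something that generalizes across many inputs, but the two-step shape (fit on past data, then predict on future data) stays the same.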
Data Mining: Convergence of Three Technologies
Definition: Predictive Model
- A "black box" that makes predictions about the future based on information from the past and present
- Large number of inputs usually available
How are Models Built and Used?
- High-level view:
The Data Mining Process
What Does the Real World Look Like?
Predictive Models are …
- Decision Trees
- Nearest Neighbor Classification
- Neural Networks
- Rule Induction
- Clustering
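As one concrete instance from this list, nearest neighbor classification can be sketched as follows. This is a minimal illustration under assumptions of my own: the function name `knn_predict`, the choice of Euclidean distance, the default `k=3`, and the toy 2-D points are hypothetical and not taken from the course material.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training examples by Euclidean distance to the query, keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical labeled points: two well-separated groups, "A" and "B".
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B")]

print(knn_predict(train, (1.1, 1.0)))  # query near the "A" group
print(knn_predict(train, (5.1, 5.0)))  # query near the "B" group
```

Unlike most of the other model families listed, this approach has no separate training phase: the stored examples are the model, and all the work happens at prediction time.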
Course Description
Data Mining and Knowledge Discovery
Focus:
- Focus 1: Theoretical foundations in Pattern Recognition and Machine Learning
  - Algorithms: How do they differ? Where do they apply?
- Focus 2: Broad survey of recent research
- Focus 3: Hands-on, apply algorithms to KDD data sets
Topic 1: Foundations
- Classification algorithms
- Clustering algorithms
- Association algorithms
- Sequential Data Mining
- Novel Applications
  - Web
  - Customer Relationship Management
  - Biological Data
Topic 2: Hands On
- Apply learned algorithms to selected data sets
- Homework assignments
- Get familiar with existing system packages and libraries
- In-class workshops
- Programming Assignments
Important Sites
- Instructor Web Site: http://www.cse.ust.hk/~qyang/521
- TA: Kaixiang Mo
- Assignment Hand-in: online, csit5210@ust.hk
- Course Discussion Site: Check out the web site
Prerequisites
- Statistics and Probability would help, but are not necessary
- Pattern Recognition would help, but is not necessary
- Databases
  - Knowledge of SQL and relational algebra
  - But not necessary
- One programming language
  - One of Java, C++, Perl, Matlab, etc.
  - Will need to read a Java library
Grading
Grade Distribution:
- Assignments: 20%
- Course Project: 20%
- Exams: 60%
  - Midterm: 20%
  - Final: 40%
More info
Textbooks: For reference only
- Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. Pearson International Edition, 2005.
- Data Mining by Ian Witten and Eibe Frank. (Google books)
- Data Mining: Concepts and Techniques by Jiawei Han and Micheline Kamber. Morgan Kaufmann Publishers.
Available in our bookstore