COMP 4332 / RMBI 4330 Big Data Mining (Spring 2015)
Lei Chen
Hong Kong University of Science and Technology
Topics
Review of Basics
Practical Data Mining
–Imbalanced Data
–Text and Web Mining
–Big Data
–Social Recommendation
–Social Media and Social Networks
Hands-on: 2 Major Projects
Student Presentations
2015/9/11 Course Introduction
Outcomes and Objectives
Students will know the current state of the art in data mining
Students will be able to implement a practical data mining project
Students will be able to present their ideas well
Students will be prepared for PG study, internships, etc.
Projects: based on KDDCUPs
Project 1:
–KDDCUPs on credit rating and customer retention (KDDCUP 2009)
Project 2:
–Micro-blog (Weibo) User Recommendation (KDDCUP 2012)
Project 3 (Optional): KDDCUP
KDDCUP Examples
–KDDCUP from past years
–2007:
  –Predict if a user is going to rate a movie
  –Predict how many users are going to rate a movie
–2006:
  –Predict if a patient has cancer from medical images
–2005:
  –Given a web query (“Apple”), predict the categories (IT, Food)
–1998:
  –Given a person, predict if this person is going to donate money
–In general, we wish to
  –Input: Data
  –Output:
    –Build model
    –Apply model to future data
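The "build model, then apply model to future data" pattern above can be sketched in a few lines. This is only a minimal illustration, not the course's method: the toy donation data, the two features, and the choice of a nearest-centroid classifier are all assumptions made for the example.

```python
# Hypothetical toy data, loosely modeled on the 1998 donation task:
# each row is (past_donations, years_on_mailing_list); label 1 = donated.
# The features, values, and the nearest-centroid model are assumptions.

def build_model(rows, labels):
    """Build step: compute one centroid (mean feature vector) per class."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def apply_model(model, row):
    """Apply step: predict the class whose centroid is closest to the row."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


# Input: past (labeled) data.
train_rows = [(0, 1), (1, 2), (5, 8), (6, 10)]
train_labels = [0, 0, 1, 1]

# Output, part 1: the built model.
model = build_model(train_rows, train_labels)

# Output, part 2: the model applied to future (unlabeled) data.
future_rows = [(0, 2), (7, 9)]
predictions = [apply_model(model, row) for row in future_rows]
print(predictions)
```

The point of the split is that `build_model` only ever sees past data, while `apply_model` can be called on records that did not exist when the model was built.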
Important Sites
Course Web Site
TA: Yue Wang
Assignment Hand-in: CASS
Prerequisites
Statistics and Probability would help,
–But will be reviewed in class
Machine Learning/Pattern Recognition would help,
–We will review some of the most important algorithms
One programming language
–We will teach new languages in the tutorial
Grading
Assignments: 20%
Course Projects: 60%
Presentations: 10%
Term Paper: 10%
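As a quick check of how these weights combine, here is a small sketch. It assumes each component is scored on a 0 to 100 scale and that the final score is a plain weighted sum; the slide does not say either, and the component scores below are made up.

```python
# Weights in percent, taken from the grading breakdown above.
WEIGHTS = {
    "assignments": 20,
    "course_projects": 60,
    "presentations": 10,
    "term_paper": 10,
}


def final_score(scores):
    """Weighted sum of component scores (assumed to be on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical component scores, for illustration only.
example_scores = {
    "assignments": 80,
    "course_projects": 90,
    "presentations": 70,
    "term_paper": 85,
}
total = final_score(example_scores)
print(total)
```

With the projects at 60%, a 10-point change in the project score moves the final score by 6 points, more than the other three components combined would for the same change in any one of them.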
More info
Textbooks:
–Listed on Course Website
–Buy them online if you wish