CSCE 5073 Section 001: Data Mining Spring 2016. Overview Class hour 12:30 – 1:45pm, Tuesday & Thur, JBHT 239 Office hour 2:00 – 4:00pm, Tuesday & Thur,


1 CSCE 5073 Section 001: Data Mining Spring 2016

2 Overview
Class hour: 12:30 – 1:45pm, Tuesday & Thursday, JBHT 239
Office hour: 2:00 – 4:00pm, Tuesday & Thursday, JBHT 516
Instructor: Dr. Xintao Wu
Email: xintaowu@uark.edu
Office: JBHT 516
Webpage: http://csce.uark.edu/~xintaowu/5073/5073.htm
Textbook: Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining: Concepts and Techniques, 3rd edition, Morgan Kaufmann, 2011. ISBN: 978-0-12-381479-1

3 Topic Description
Introduction to data mining
Know your data
Data preprocessing
Data warehousing and OLAP
Frequent pattern mining, association and correlation
Classification
Cluster analysis
Outlier detection
Advanced topics
    Deep learning
    Big data analysis, including MapReduce and Spark
    Social-aware data mining

4 Course Prerequisites
Data structures and algorithms
Familiarity with programming in Java or C++ is assumed; Matlab/R/Python/Scala is preferred
Basic concepts of probability and statistics
Knowledge of linear algebra is a big plus

5 Grading Composition
Homework and quiz: 10%
Project: 30%
Midterm: 20%
Final: 40%

6 Homework and Project Reports
Late policy: late submissions are not accepted
Hard copy is preferred
Electronic submission (Word or PDF) is accepted

7 Project
Data Analysis Project
    Each group consists of 2-3 students
    Develop/implement/apply data mining techniques on real, challenging data mining problems
Individual Research Project
More information: http://csce.uark.edu/~xintaowu/5073/proj.htm

8 Midterm & Final
Open books/notes/internet
No discussion
No help from any entity, e.g., by posting/uploading your questions on the Web
Cumulative
No makeup
Class attendance is not required
Bonus is expected

9 Textbook & Recommended Reference Books
Textbook
    Jiawei Han, Micheline Kamber, Jian Pei, Data Mining: Concepts and Techniques, 3rd ed., Morgan Kaufmann, 2011.
Recommended reference books
    C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2007.
    S. Chakrabarti, Mining the Web: Statistical Analysis of Hypertext and Semi-Structured Data, Morgan Kaufmann, 2002.
    T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed., Springer-Verlag, 2009.
    B. Liu, Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer, 2006.
    D. Easley and J. Kleinberg, Networks, Crowds, and Markets: Reasoning About a Highly Connected World, Cambridge Univ. Press, 2010.
    M. Newman, Networks: An Introduction, Oxford Univ. Press, 2010.

10 Reference Papers
Course research papers: check Reading_List
Major conference proceedings that will be used
    DM conferences: ACM SIGKDD (KDD), ICDM (IEEE, Int. Conf. Data Mining), SDM (SIAM Data Mining), PKDD (Principles KDD)/ECML, PAKDD (Pacific-Asia)
    DB conferences: ACM SIGMOD, VLDB, ICDE
    ML conferences: NIPS, ICML
    IR conferences: SIGIR, CIKM
    Web conferences: WWW, WSDM
Other related conferences and journals
    IEEE TKDE, ACM TKDD, DMKD, ML
Use course Web page, DBLP, Google Scholar, Citeseer

11 Research Frontiers in Data Mining
Mining social and information networks
Mining spatiotemporal data, moving object data & cyber-physical systems
Mining multimedia, social media, text and Web
Data software engineering and computer system data
Multidimensional online analytical analysis
Pattern mining, pattern usage, and pattern understanding
Biological data mining
Stream data mining

