A Data Mining Course for Computer Science and non Computer Science Students Jamil Saquer Computer Science Department Missouri State University Springfield,


A Data Mining Course for Computer Science and non Computer Science Students
Jamil Saquer
Computer Science Department
Missouri State University
Springfield, MO

Outline
- Introduction
- Motivation
- Challenges
- Design of the Course
- Topics Covered
- Assignments
- Examination Format
- Conclusion

Introduction
- What is data mining (DM)?
  - Non-trivial process of identifying valid, novel, useful, and ultimately understandable patterns in large volumes of data
- DM is an interdisciplinary topic
- Has many things in common with machine learning and pattern recognition

Motivation for the Course
- Introducing more electives
- Introducing graduate-level CS courses
- Informatics program
- Interest from faculty members and students in other departments
- Author's main area of research

Challenges in Designing the Course
- Diverse student population
  - CS vs. non-CS
  - Undergrad vs. grad
- Solution
  - Informatics program in design stages
  - MNAS CS option is new
    - Therefore, emphasis on undergrad CS students

Accommodating Other Students
- Minimize prerequisites
  - CS 2 (or even CS 1)
  - Capable of using DM software
  - Scientific background/mentality
    - One from business, another from GGP
- For grad CS students:
  - Project requires more research
  - Tests could be a little different
- Emphasize understanding basic DM concepts and using software for mining data

Design of the Course
- Used book by Dunham
- Book divided into 3 parts
- About 1 week spent on definitions, applications, motivations, challenges, …
- Core of the course spent on core DM subjects: classification, clustering, mining association rules
- Last week for project presentations

Classification
- Assigning objects to classes
- Supervised learning
- Example: classify a military vehicle as friendly or enemy
- Methods covered include: decision trees, naïve Bayes, k-nearest neighbor, backpropagation
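
To make the supervised-learning idea concrete, here is a minimal sketch of one of the listed methods, k-nearest neighbor, in plain Python. The two features (vehicle length and top speed), the data values, and the names `knn_classify` and `training_set` are hypothetical and chosen only to echo the vehicle example; they are not from the course materials.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # train is a list of (feature_vector, label) pairs
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled examples: (length in m, top speed in km/h) -> class
training_set = [
    ((6.0, 60.0), "friendly"),
    ((6.5, 65.0), "friendly"),
    ((7.0, 62.0), "friendly"),
    ((9.0, 40.0), "enemy"),
    ((9.5, 45.0), "enemy"),
    ((10.0, 42.0), "enemy"),
]

print(knn_classify(training_set, (6.2, 61.0)))
print(knn_classify(training_set, (9.8, 43.0)))
```

The labels in the training set are what make this supervised: the classifier only votes among classes it has already been shown. In practice the features would usually be rescaled first so that one unit of length and one unit of speed contribute comparably to the distance.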

Clustering
- Grouping objects into different classes
- Unsupervised learning
- Example: cluster Weblog data to discover groups of similar access patterns
- Techniques covered include: link algorithms, nearest neighbor, k-means, PAM, BIRCH, DBSCAN, CURE, ROCK
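
As an illustration of one listed technique, the following is a minimal k-means sketch in plain Python. The toy points (imagined as pages per visit and minutes per visit from a Web log), the function name `k_means`, and the choice to pass the starting centroids in explicitly are all assumptions made for this example.

```python
import math

def k_means(points, centroids, max_iters=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]  # keep an empty cluster's centroid
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical sessions: (pages per visit, minutes per visit)
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

No labels appear anywhere in the input, which is what distinguishes this from the classification setting. The result depends on the starting centroids, so real experiments typically repeat the run with several initializations.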

Association Rules
- Finding patterns that occur together
- Example: diapers and beer are usually bought together
- Techniques covered: Apriori, sampling, partitioning, FP-growth
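
The level-wise idea behind Apriori can be sketched in a few lines of plain Python. The basket data, the absolute support threshold, and the names `apriori` and `baskets` are hypothetical; this is a simplified version for illustration, not an optimized implementation.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that contain exactly k items
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Hypothetical market baskets
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"beer", "bread"},
    {"diapers", "beer", "bread"},
]
frequent = apriori(baskets, min_support=3)

# Confidence of the rule diapers -> beer: support(both) / support(diapers)
pair = frozenset({"diapers", "beer"})
confidence = frequent[pair] / frequent[frozenset({"diapers"})]
print(sorted((sorted(s), n) for s, n in frequent.items()))
print(confidence)
```

The prune step is what gives Apriori its efficiency: an itemset can only be frequent if all of its subsets are, so most candidates are discarded before the data is scanned again.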

Assignments
- Students need to learn how to mine data
- One assignment on each core DM topic
  - Apply two different algorithms to at least two data sets, one of which has to be relatively large
  - Can use any DM package (e.g., Weka)
- Students write a report
- Students learn how to run an experiment
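
The shape of such an experiment (two algorithms, two data sets, one table of results) can be sketched in plain Python. The course itself used a DM package such as Weka, so everything below is an assumed stand-in: the two classifiers, the leave-one-out protocol, the synthetic data generator `make_blobs`, and the data values are all hypothetical.

```python
import math
import random
from collections import Counter

def one_nn(train, query):
    """Algorithm 1: predict the label of the single nearest training point."""
    nearest = min(train, key=lambda pair: math.dist(pair[0], query))
    return nearest[1]

def majority_class(train, query):
    """Algorithm 2 (baseline): always predict the most common training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]

def leave_one_out_accuracy(dataset, predict):
    """Hold out each example in turn, train on the rest, and score the prediction."""
    correct = 0
    for i, (features, label) in enumerate(dataset):
        train = dataset[:i] + dataset[i + 1:]
        correct += predict(train, features) == label
    return correct / len(dataset)

def make_blobs(n_per_class, seed):
    """Hypothetical larger data set: two Gaussian blobs, one per class."""
    rng = random.Random(seed)
    data = []
    for label, center in (("a", 0.0), ("b", 10.0)):
        for _ in range(n_per_class):
            data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data

small = [((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
         ((8, 8), "b"), ((9, 8), "b"), ((8, 9), "b")]
large = make_blobs(100, seed=0)

# Run every algorithm on every data set and collect the accuracies
results = {
    (data_name, algo.__name__): leave_one_out_accuracy(data, algo)
    for data_name, data in (("small", small), ("large", large))
    for algo in (one_nn, majority_class)
}
for key, accuracy in sorted(results.items()):
    print(key, f"{accuracy:.2f}")
```

A report would typically tabulate these numbers and discuss them. One point worth noticing here: on perfectly balanced classes, the majority baseline is always wrong under leave-one-out, because removing the test example makes its class the minority. That kind of evaluation artifact is exactly what running an experiment carefully is meant to surface.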

Term Project
- Group projects
- Either provide a non-trivial implementation of a DM algorithm
- Or learn about a DM topic not discussed in class
- Graduate students required to read at least three research papers and to write a report
- All students present their project in class

Examination Format
- Open book
- Two types of questions
- First type requires basic knowledge of the material
  - Definitions, T/F, short answers
- Second type asks students to apply certain algorithms to small data sets

Conclusion
- DM is an interesting course for CS and non-CS students
- DM can be taught to non-CS students
- A DM course can be taught to students with minimal CS background

Questions