CSC 478 Programming Data Mining Applications: Course Summary
Bamshad Mobasher, DePaul University

What we did
- Data Mining Overview
  - The KDD Process
- Data Preprocessing and Understanding
  - Using Python, NumPy, Pandas
  - Using Scikit-learn modules
  - Some emphasis on visualizing and understanding characteristics of the data
- Supervised Knowledge Discovery
  - Classification
  - Regression Analysis
  - Techniques such as KNN, Ridge Regression, Decision Tree and Bayesian classification
  - Lots of emphasis on model evaluation
    - Evaluation metrics
    - Train-Test methodologies such as cross-validation
    - Systematic parameter selection (e.g., grid search)
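The evaluation items on this slide (cross-validation and grid search over a parameter) can be illustrated with a small sketch. The course used Scikit-learn for this; the version below is a library-free approximation so that it runs anywhere, and every name in it (`knn_predict`, `cross_val_accuracy`, `grid_search_k`, the toy data) is hypothetical rather than taken from the course materials.

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to x
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def cross_val_accuracy(X, y, k, n_folds=5):
    """Mean k-NN accuracy over n_folds folds.

    Fold f holds out the points whose index is congruent to f modulo n_folds.
    """
    scores = []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / n_folds

def grid_search_k(X, y, candidates, n_folds=5):
    """Score every candidate k by cross-validation and return the best one."""
    results = {k: cross_val_accuracy(X, y, k, n_folds) for k in candidates}
    best_k = max(results, key=results.get)  # ties go to the earliest candidate
    return best_k, results

# Toy data (assumption, for illustration): two well-separated classes, interleaved
X, y = [], []
for i in range(10):
    X.append((0.1 * i, 0.05 * i))
    y.append("a")
    X.append((5 + 0.1 * i, 5 - 0.05 * i))
    y.append("b")

best_k, results = grid_search_k(X, y, candidates=[1, 3, 5])
print(best_k, results)
```

On real data the candidate scores would differ, and the selected parameter should then be confirmed on a held-out test set that played no part in the search.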

What we did
- Unsupervised Knowledge Discovery
  - Cluster analysis
  - Using PCA and SVD for dimensionality reduction, data characterization, and noise reduction
  - Association rule discovery
  - Emphasis on using unsupervised approaches as components of larger knowledge discovery efforts
    - E.g., using PCA before clustering; using clustering as the basis for classification
- Real application domains
  - Text Mining and document analysis/filtering
  - Recommender systems
  - Predictive modeling for marketing/business applications
  - Image analysis
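As a reminder of what association rule discovery measures, here is a minimal sketch of support and confidence over a handful of transactions. It enumerates single-item rules by brute force, which is only practical for tiny item sets; it is not Apriori or any specific algorithm from the course, and the function names and transactions are made up for illustration.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def pair_rules(transactions, min_support, min_confidence):
    """List rules {lhs} -> {rhs} over item pairs that meet both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            if s < min_support:
                continue  # the pair is too rare to be interesting
            c = confidence({lhs}, {rhs}, transactions)
            if c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules

# Hypothetical market-basket data
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

rules = pair_rules(transactions, min_support=0.3, min_confidence=0.7)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

Note that a rule and its reverse can have very different confidence: here every basket with butter also contains bread, but only half of the baskets with bread contain butter.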

What we did not do (and you should learn later)
- Approaches for mining sequential/temporal data
  - Markov models; time series analysis; sequential pattern mining
- More Ensemble and Hybrid Classifiers/Predictors
  - Combining multiple classifiers
  - Random Forest classifiers
  - Other Meta-learners such as AdaBoost
- Support Vector Machines and Kernel-Based Classifiers
- Topic modeling with Latent factor models
  - LDA (Latent Dirichlet Allocation)
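For a first taste of "combining multiple classifiers", the simplest scheme is hard majority voting: each base classifier predicts a label and the ensemble returns the most common one. The sketch below assumes three deliberately weak, hand-written rules (all hypothetical); Random Forest and AdaBoost are more elaborate and also learn their base classifiers from data, which this example does not attempt.

```python
from collections import Counter

def majority_vote(classifiers, x):
    """Return the label predicted by the largest number of classifiers for x."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Three hypothetical weak rules on a two-feature input (illustration only)
def rule_a(x):
    return "pos" if x[0] > 0.5 else "neg"

def rule_b(x):
    return "pos" if x[1] > 0.5 else "neg"

def rule_c(x):
    return "pos" if x[0] + x[1] > 1.0 else "neg"

ensemble = [rule_a, rule_b, rule_c]

# rule_b disagrees with the other two here, and the majority wins
print(majority_vote(ensemble, (0.9, 0.2)))
```

Using an odd number of voters on a two-class problem avoids ties; with more classes or an even number of voters, a tie-breaking policy would need to be chosen explicitly.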