Chong Ho Yu

 Data mining (DM) is a cluster of techniques, including decision trees, artificial neural networks, and clustering, that has been employed in the field of Business Intelligence (BI) for years.  DM inherits the spirit of exploratory data analysis (EDA), but there is a crucial difference: there is no learning in EDA.

 Big data are everywhere now.  Every day we create 2.5 quintillion bytes of data.  They come from sensors, social media, e-commerce, cell phones, GPS, etc.

 These are real data that reflect your actual psychological state and behavior.  Self-report data are not highly reliable.  If a survey item asks about what my favorite movies are, I may not tell you the truth.  But my Netflix records will not lie!

 Use large quantities of data: Big data analytics  Exploration and pattern recognition. Like EDA, it does not start with a strong hypothesis. The logic is P(H|D), not P(D|H).  Resampling (e.g. cross-validation, bootstrapping)  Automated algorithms; machine learning
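As a rough illustration of the resampling idea, the sketch below bootstraps the mean of a small sample. The function name `bootstrap_means` and the `scores` data are hypothetical and made up for this example; they do not come from the slides.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=1000, seed=0):
    """Return the mean of each bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(data)
    means = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = [rng.choice(data) for _ in range(n)]
        means.append(statistics.mean(resample))
    return means


# Hypothetical sample of test scores (made up for illustration).
scores = [52, 61, 58, 70, 66, 49, 75, 63]
means = sorted(bootstrap_means(scores))
# Rough 95% percentile interval for the mean.
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means)) - 1]
print(lower, upper)
```

The spread of the resampled means gives a sense of how stable the estimate is without relying on a parametric sampling distribution.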

Data analysis can be more efficient and effective if a machine can learn (think).

Name            Origin                 Approach
Symbolists      Logic, philosophy      Some form of deduction
Connectionists  Neuroscience           Networking, pathway
Evolutionists   Evolutionary biology   Genetic programming
Bayesians       Statistics             Probabilistic inference
Analogists      Psychology             Learn by examples

 Can a machine think like us if we can mimic the neuropathway?

 Supervised: Train the algorithm by giving labelled training data (examples).  Unsupervised: try to find the hidden structure in unlabeled data (without examples).
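To make the contrast concrete, here is a minimal sketch on one-dimensional data. The supervised part learns from labels with a nearest-centroid rule; the unsupervised part groups the same numbers without labels using a two-cluster k-means. Both techniques, the function names, and the data are assumptions chosen for illustration, not methods named in the slides.

```python
def train_centroids(points, labels):
    """Supervised: use the given labels to compute one centroid per class."""
    sums, counts = {}, {}
    for x, label in zip(points, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def two_means(points, iterations=10):
    """Unsupervised: split unlabeled 1-D points into two clusters (k-means, k=2).

    Assumes the points contain at least two distinct values.
    """
    c0, c1 = min(points), max(points)  # simple starting centers
    for _ in range(iterations):
        g0 = [x for x in points if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in points if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return sorted(g0), sorted(g1)


# Hypothetical study hours, with and without outcome labels.
hours = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
labels = ["fail", "fail", "fail", "pass", "pass", "pass"]

centroids = train_centroids(hours, labels)
print(predict(centroids, 7.0))   # classify a new case using the labeled examples
print(two_means(hours))          # discover two groups with no labels at all
```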

 In resampling we can do cross-validation (CV).  CV is a form of supervised machine learning.  You can hold back a portion of your data (e.g. 30%).  The larger subset (e.g. the remaining 70%) is for training and the held-back portion is for validation.
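The hold-back step can be sketched as a random split. The function name `holdout_split` and the integer stand-in data are hypothetical; statistical packages usually provide their own option for this.

```python
import random


def holdout_split(rows, holdout_fraction=0.3, seed=0):
    """Shuffle rows, then return (training, validation) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_validation = round(len(shuffled) * holdout_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


rows = list(range(100))  # stand-in for 100 observations
training, validation = holdout_split(rows)
print(len(training), len(validation))
```

Shuffling before the split matters when the rows are stored in a meaningful order, for example sorted by school or by date.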

 Data mining can handle large data sets without the problem of excessive statistical power.  Non-parametric. Say “Hasta la vista, baby” to parametric assumptions.

 Can handle different data types (nominal, ordinal, continuous). If you use categorical data as IV in regression, you need dummy coding.  Immune to outliers.  Some can do data transformation for you.  Machine learning: avoid overfitting.  Replication (bootstrap forest)
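For readers who have not done dummy coding by hand, the sketch below shows what regression requires for a categorical predictor: one 0/1 column per level, with one reference level left out. The function name `dummy_code` and the `school_type` values are hypothetical.

```python
def dummy_code(values, reference=None):
    """Return (column_names, rows) of 0/1 indicators, omitting the reference level."""
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]
    kept = [level for level in levels if level != reference]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return kept, rows


# Hypothetical nominal predictor with three levels.
school_type = ["public", "private", "charter", "public"]
columns, coded = dummy_code(school_type, reference="public")
print(columns)
print(coded)
```

A predictor with k levels becomes k - 1 columns, which is the bookkeeping that the data mining methods listed here can spare the analyst.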

 Decision tree (classification tree, recursive partition tree)  Bootstrap forest (random forest)  Multivariate adaptive regression splines (MARS)  Support vector machine  Clustering  Artificial Neural Network (ANN)

 ANN is a good example of data mining: machine learning  In some cases ANN is better than conventional OLS regression.  OLS regression is linear; it imposes a simple structure on the data.  When you have collinear predictors, you need to “orthogonalize” the problematic variables.  Non-linear regression may overfit the data.
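The point about OLS imposing a simple structure can be seen with a small made-up example: a straight line fitted to a perfectly regular but curved relationship. The data and function names below are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(xs, ys, intercept, slope):
    """Proportion of variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # a perfectly regular, but curved, relationship
intercept, slope = fit_line(xs, ys)
r2 = r_squared(xs, ys, intercept, slope)
print(intercept, slope, r2)
```

Here the fitted line is flat and explains none of the variance, even though y is completely determined by x. A more flexible model can capture the curve, at the cost of the overfitting risk mentioned above.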

 Artificial neural network: Stopping rule to prevent overfitting  It can work with different data types: nominal, ordinal, and continuous
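The slides do not say which stopping rule is used, so the following is only one common possibility: a patience-based rule that stops training once the validation loss stops improving. The function name, the `patience` parameter, and the loss values are hypothetical.

```python
def early_stopping_epoch(validation_losses, patience=2):
    """Return the index of the epoch whose weights would be kept.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch


# Hypothetical validation losses: improving at first, then rising (overfitting).
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
best = early_stopping_epoch(losses)
print(best)
```

The rising validation loss after the best epoch is the signature of overfitting: the network keeps improving on the training data while getting worse on data it has not seen.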

 Neural networks, as the name implies, try to mimic interconnected neurons in the brain in order to make the algorithm capable of complex learning for extracting patterns and detecting trends.

 It is built upon the premise that real-world data structures are complex and thus necessitate complex learning systems.  Usually regression is “one-shot”; you cannot “train” a regression model. In other words, regression cannot “learn”.

 A trained neural network can be viewed as an “expert” in the category of information it has been given to analyze. This expert system can provide projections given new situations and answer “what if” questions.  Flexible models for regression and classification  Higher predictive power than regression and classification trees

 Artificial Neural Network in Education (ANNIE).  For CV you can hold back a certain portion of the data or choose K-fold.
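As an alternative to a single hold-back, K-fold CV lets every row serve as validation data exactly once. The sketch below only generates the row indices for each fold; the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_rows, k):
    """Yield (training_indices, validation_indices) for each of k folds."""
    indices = list(range(n_rows))
    # Spread rows as evenly as possible: the first n_rows % k folds get one extra.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        yield training, validation
        start += size


folds = list(k_fold_indices(10, 5))
for training_idx, validation_idx in folds:
    print(validation_idx, training_idx)
```

In practice the rows are usually shuffled before the folds are formed, and the model is refitted once per fold.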

 A typical neural network is composed of three types of layers: input layer (data); hidden layer (data transformation and manipulation); output layer.  Data transformation? We were there before!
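How data move through the three types of layers can be sketched as a single forward pass. The tanh activation, the network size, and every weight below are assumptions made for illustration; they are not fitted to any data and are not taken from the slides.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: input layer -> hidden layer (tanh) -> single output."""
    # Hidden layer: each node transforms a weighted sum of the inputs.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # Output layer: a weighted sum of the hidden-node values.
    output = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return hidden, output


# Hypothetical weights for 3 inputs and 2 hidden nodes (not fitted to any data).
hidden_weights = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_biases = [0.0, 0.1]
output_weights = [1.2, -0.7]
output_bias = 0.05

hidden, output = forward([1.0, 0.5, -1.0], hidden_weights, hidden_biases,
                         output_weights, output_bias)
print(hidden, output)
```

Training a network amounts to adjusting these weights so that the output matches the observed Y as closely as possible on the training data.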

 You can explore the inter-relationships among many variables in a single panel.

 You can partition your data for machine learning.

 Difficult to interpret

 There are three types of layers, not three layers, in the network. There may be more than one hidden layer and it depends on how complex the researcher wants the model to be.  Because the input and the output are mediated by the hidden layer, neural networks are commonly seen as a “black box.”  Harder to interpret and understand

 Use it when predictive accuracy is the most important objective  When you need a non-linear fit but do not want over-fitting and want to avoid the tedious work of orthogonalization  When you have mixed data types, such as nominal, ordinal, and continuous, but want to avoid laborious data transformation

 Download the data set ‘PISA_ANN.jmp’ from the Unit 9 folder.  Run a neural network.  Use ability as Y, use all science interest, science value, and science enjoyment as Xs.  Use Surface profiler to explore the relationships among ability, science interest, science value, and science enjoyment (It may be hard to see the back of the graph. Rotation is necessary).