Alan D. Mead, Talent Algorithms Inc. Jialin Huang, Amazon

A machine learning "Rosetta Stone" for psychologists and psychometricians

Why a “Rosetta Stone”? We’ve been doing “big data” analyses for a century. Johnny-come-lately “data scientists” are re-inventing our vocabulary… and possibly eating our lunch. This glossary of terms is a first step towards understanding these guys, and incorporating the new aspects of data science.

Machine Learning: artificial intelligence (AI)? Statistical analysis? Text mining?
▶ Primarily means “statistics” and “statistical analysis”
▶ Also refers generically to AI approaches using statistical modeling (which may be extremely varied)
▶ Abbreviated ML

Big Data: nothing (a meaningless buzzword)? Statistical analysis? Text mining?
▶ Means different things to different people
▶ Infrastructure to hold huge amounts of data
▶ Analysis of truly large samples (often using simple analyses like crosstabs)
▶ Refers generically to statistical analysis

Supervised Learning: criterion-related analysis? Statistical analyses? Text mining?
▶ “Learning” means statistics/statistical analysis
▶ “Supervised” refers to analyses with a “Y-variable”
▶ Cluster analysis is an example of unsupervised learning; so are most “mining” analyses, which search for hidden patterns in the dataset
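The distinction can be made concrete with a toy sketch in plain Python (the data and function names below are invented for illustration): a supervised analysis predicts a Y-variable, while an unsupervised one only searches for structure in the X's.

```python
# Supervised: a least-squares slope predicting y from x (there IS a Y-variable).
def ls_slope(xs, ys):
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# Unsupervised: a one-dimensional 2-means clustering (no Y-variable at all;
# the analysis just looks for two groups hidden in the x values).
def two_means_1d(xs, iters=10):
    c1, c2 = min(xs), max(xs)  # start the two cluster centers at the extremes
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return c1, c2
```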

K-fold Crossvalidation: Wherry’s correction for shrinkage? Criterion-related validation? Resampling crossvalidation?
▶ Divide the sample randomly into K (equal) subsets
▶ Perform K crossvalidation analyses; in each analysis the Kth subset is the holdout sample and the rest form the calibration sample
▶ Estimates variability in crossvalidation
▶ Double-crossvalidation is 2-fold crossvalidation
▶ If K=N => leave-one-out crossvalidation (LOOCV)
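The procedure above can be sketched in plain Python; `fit_model` and `evaluate` are hypothetical stand-ins for any calibration/validation pair (say, fitting a regression and computing its holdout error).

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the case indices 0..n-1 and split them into k (nearly) equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def k_fold_crossvalidate(data, k, fit_model, evaluate):
    """Each fold serves once as the holdout sample; the rest calibrate the model."""
    scores = []
    for holdout in k_fold_indices(len(data), k):
        holdout_set = set(holdout)
        calibration = [data[i] for i in range(len(data)) if i not in holdout_set]
        validation = [data[i] for i in holdout_set]
        model = fit_model(calibration)
        scores.append(evaluate(model, validation))
    return scores  # K holdout estimates; their spread shows variability
```

With K equal to the sample size, each "fold" is a single case and this becomes LOOCV.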

Features: dependent variable(s)? Independent variable(s)? Factors, like in factor analysis?
▶ Encompasses nominal, ordinal, interval, and ratio measures
▶ Often (e.g., in optical character recognition), the features need to be “extracted” from the data (they are not “measured” directly and discretely)
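A minimal sketch of what "extraction" means for OCR-like data: the raw input is a grid of pixels, and the features are computed from it rather than measured directly. The two feature names below are invented for illustration.

```python
def extract_features(image):
    """image: a list of rows of 0/1 pixels -> a dict of derived features."""
    n_pixels = sum(len(row) for row in image)
    ink = sum(sum(row) for row in image)                       # count of "on" pixels
    ink_cols = [c for c in range(len(image[0]))
                if any(row[c] for row in image)]               # columns touched by ink
    return {
        "ink_density": ink / n_pixels,   # a ratio-level feature
        "stroke_width": len(ink_cols),   # a count feature
    }
```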

Co-occurrence Matrix: correlation matrix for nominal features? A text mining methodology? A statistic showing similarity?
▶ Assumes features (e.g., words) in groups that are the unit of analysis (e.g., items)
▶ Counts the occurrences of a feature in a group
▶ A factor-analysis-like process produces a latent semantic space
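A sketch of a word-by-item co-occurrence matrix in plain Python: rows are words (features), columns are items (groups), and each cell counts how often the word occurs in the item. A factor-analytic step over such a matrix is the basis of latent semantic analysis.

```python
from collections import Counter

def cooccurrence_matrix(items):
    """items: a list of token lists -> (sorted vocabulary, word-by-item count matrix)."""
    vocab = sorted({w for item in items for w in item})
    counts = [Counter(item) for item in items]          # per-item word counts
    matrix = [[c[w] for c in counts] for w in vocab]    # one row per word
    return vocab, matrix
```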

Artificial Neural Network (ANN): artificial intelligence (AI)? Unsupervised learning algorithm? Supervised learning algorithm?
▶ Non-parametric, nonlinear regression for nominal DVs; complex fitting
▶ Has its own terminology (nodes, layers, activation, etc.)
▶ Can have enormous numbers of parameters; thus the model is typically incomprehensible
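The terminology (nodes, layers, weights, activation) can be grounded with a minimal forward pass through a one-hidden-layer network. The weights here are arbitrary illustrative values; a real network estimates them (often thousands of them) during fitting.

```python
import math

def sigmoid(x):
    """A common activation function, squashing any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    """Each hidden node applies an activation to a weighted sum of the inputs;
    the output node does the same over the hidden-layer values."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)))
              for ws in hidden_weights]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))
```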

Naïve Bayes: artificial intelligence (AI)? Unsupervised learning algorithm? Supervised learning algorithm?
▶ Only classifies (nominal DV)
▶ May have a very large number of dichotomous IVs
▶ Simple estimation
▶ “Naïve” refers to assuming the IVs are independent, i.e., that P(IV1=1 & IV2=1) = P(IV1=1) P(IV2=1)
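The independence assumption shows up directly in the scoring rule: the probability of the IVs given a class is just the product of their marginal probabilities. A sketch in plain Python, with made-up probabilities standing in for estimates from training data:

```python
def naive_bayes_classify(x, class_priors, cond_probs):
    """Score each class as P(class) * prod_j P(x_j | class); return the best.
    x is a list of 0/1 IVs; cond_probs[c][j] is the estimated P(IV_j = 1 | c)."""
    scores = {}
    for c, prior in class_priors.items():
        p = prior
        for j, xj in enumerate(x):
            pj = cond_probs[c][j]
            p *= pj if xj == 1 else (1.0 - pj)   # the "naive" product
        scores[c] = p
    return max(scores, key=scores.get)
```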

One-Hot Encoding: data imputation? Dummy coding? Data cleaning?
▶ May include schemes where K values get K dummy codes (not K-1)
▶ A co-occurrence matrix is a table of one-hot encodings
▶ Comes from digital circuits (“one hot” = “a single 1 in a group of binary values”)
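A minimal sketch: each of the K observed values of a nominal variable gets its own 0/1 code, so every row has exactly a single 1 (K codes, not the K-1 of regression-style dummy coding).

```python
def one_hot(values):
    """Return the K distinct levels and one K-wide 0/1 code per observation."""
    levels = sorted(set(values))
    codes = [[1 if v == lvl else 0 for lvl in levels] for v in values]
    return levels, codes
```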

Bagging: statistical analysis? Resampling crossvalidation? Resampling model-fitting?
▶ Bootstrap aggregating, an “ensemble meta-algorithm” to improve stability
▶ Re-sample (with replacement) your N observations to create D bootstrap samples of size N; average the output (for regression) or vote (for classification)
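A sketch of bootstrap aggregating for a regression-style prediction, in plain Python; `fit_model` and `predict` are hypothetical stand-ins for any base learner (decision trees are the classic choice).

```python
import random

def bag_predict(data, d, fit_model, predict, x, seed=0):
    """Fit d models, one per bootstrap sample of size N, and average their predictions."""
    rng = random.Random(seed)
    preds = []
    for _ in range(d):
        boot = [rng.choice(data) for _ in range(len(data))]  # sample WITH replacement
        preds.append(predict(fit_model(boot), x))
    return sum(preds) / len(preds)  # average for regression; vote for classification
```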

Bagging example (figure). Source: https://en.wikipedia.org/wiki/Bootstrap_aggregating

What’s new in data science?
▶ Little balkanization; predicting anything from anything
▶ Includes methods that are supervised and unsupervised (“mining” for patterns)
▶ Resampling to improve efficiency and reduce instability
▶ New terminology
▶ Wider variation in problems and methodologies

Thanks! Questions? Alan Mead <amead@alanmead.org> Jialin Huang <huangpsych@gmail.com>

Further reading… Wikipedia?!? Yes, Wikipedia provides excellent definitions:
▶ Rosetta Stone
▶ Machine Learning
▶ Big Data
▶ Supervised Learning
▶ K-fold crossvalidation
▶ Feature
▶ Co-occurrence matrix
▶ Artificial Neural Network
▶ Naïve Bayes
▶ One-hot encoding
▶ Bagging
▶ Boosting