Machine learning in Action: Unpacking the Biographical Questionnaire

Presentation transcript:

Machine learning in Action: Unpacking the Biographical Questionnaire “ Rethinking University Engagement in Africa ” Innocent Mamvura Wits Business Intelligence Services All Rights Reserved 2017 24th SAAIR Annual Conference 9/14/2018

Machine Learning Introduced

Examples of Machine Learning and AI Applications: Autonomous vehicles have sensors and cameras that continuously scan the surrounding area and capture information. Machine learning algorithms employ deep learning to learn from the behavior of human drivers. Many hours on the road are needed to train the self-driving brain how to act near pedestrians, when to slow down, how to stay in the middle of a lane, and when to stop at a red light.

Netflix: Machine learning is integral to Netflix’s video recommendation engine.

Uber: The tech giant uses machine learning to determine arrival times and pick-up locations, and to plan UberEats food deliveries.

Healthcare: Disease identification and diagnosis (e.g. IBM Watson Genomics); personalized treatment and behavioral modification; clinical trial research; radiology and radiotherapy; smart electronic health records.

Top Universities Using Predictive Analytics: https://vimeo.com/119487844

Problem Statement: Wits is collecting large amounts of data on student background, school information, financial background, level of competence in computer skills, library usage skills, etc. The existing Student Early Warning System is based only on quantitative variables. Identifying at-risk students with the existing system flags large numbers of first-year students, while fewer resources are available to assist them. There has been no significant improvement in first-year success rates.

Definitions: Binary Logistic Regression is a technique used when the outcome variable is dichotomous (has two values). Logistic Regression uses binomial probability theory, in which there are only two outcome categories. The technique fits its function using the Maximum Likelihood Method, which chooses the regression coefficients that make the observed outcomes most probable. Artificial neural networks (ANNs) are computing systems inspired by the biological neural networks that constitute animal brains. Such systems learn (progressively improve performance) to do tasks by considering examples, generally without task-specific programming.
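The maximum likelihood idea in the definition above can be sketched in a few lines of plain Python. This is only an illustration, not the model built for the study: the slides do not say which software or fitting routine was used, the two predictors (`first_generation`, `in_residence`) and the ten toy records are invented, and simple gradient ascent stands in for whatever optimizer a statistics package would use.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a probability in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, x):
    """P(outcome = 1) for one record; weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

def log_likelihood(weights, rows, labels):
    """Log of the probability of the observed outcomes under the model."""
    total = 0.0
    for x, y in zip(rows, labels):
        p = min(max(predict_proba(weights, x), 1e-12), 1.0 - 1e-12)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logistic(rows, labels, learning_rate=0.1, steps=2000):
    """Maximize the log-likelihood by gradient ascent (illustrative only)."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = y - predict_proba(weights, x)  # observed minus predicted
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w + learning_rate * g / n for w, g in zip(weights, grad)]
    return weights

# Hypothetical toy data: [first_generation, in_residence] -> passed first year (1/0)
rows = [[1, 0], [1, 0], [1, 0], [1, 1], [1, 1],
        [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]]
labels = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

weights = fit_logistic(rows, labels)
print("intercept and coefficients:", weights)
print("P(pass | first generation, not in residence):", predict_proba(weights, [1, 0]))
```

A negative coefficient lowers the predicted probability of passing and a positive one raises it; `math.exp(coefficient)` gives the corresponding odds ratio.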

Assumptions of the Model: A linear relationship between the outcome and the predictor variables is not required (linearity is assumed only between the predictors and the log-odds of the outcome); the outcome variable must have two categories; the predictor variables need not follow a normal distribution; Maximum Likelihood coefficients are large-sample estimates.
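For reference, the binary logistic model behind these assumptions can be written in standard textbook notation. The symbols are introduced here and do not appear on the slides: y_i is the 0/1 outcome for student i, p_i is the probability that y_i = 1, and x_i1 ... x_ik are the predictors. The predictors enter linearly on the log-odds scale, and the coefficients are chosen to maximize the likelihood of the observed outcomes:

```latex
\log\frac{p_i}{1-p_i} = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik},
\qquad
L(\beta) = \prod_{i=1}^{n} p_i^{\,y_i}\,(1-p_i)^{1-y_i}
```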

CRISP-DM: Cross-Industry Standard Process for Data Mining

Creating a Machine Learning Model

Predictors

What are the Findings? Distance to campus, Funding, Residence, First Generation, Computer Skills, Library Usage and Science Laboratory usage are predictors of academic success. Working part time, Exposure to Textbooks and Science lab usage are not significant. The predictive accuracy of both the Binary Logistic model and the Neural Network model is at least 80%. These findings will help align the allocation of resources with where they are needed most.
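As a reminder of what a predictive accuracy figure measures, the sketch below computes it from actual and predicted outcomes. The ten records are invented for illustration and are not the study's data; the slides do not say how the reported accuracy was validated, so the function names and the choice to also show sensitivity and specificity are assumptions.

```python
def confusion_counts(actual, predicted):
    """Return (true positives, true negatives, false positives, false negatives)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(actual, predicted):
    """Share of records whose predicted class matches the actual class."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical outcomes: 1 = passed first year, 0 = failed
actual    = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print("accuracy:", accuracy(actual, predicted))
print("sensitivity (passes correctly predicted):", tp / (tp + fn))
print("specificity (failures correctly predicted):", tn / (tn + fp))
```

For an early warning system, specificity as defined here matters as much as overall accuracy, because it counts how many students who went on to fail were actually flagged.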

Neural Network; Binary Logistic Regression

Future Work: Include more variables from the BQ data; use other machine learning techniques; explore deep learning algorithms; put the power of data into the hands of students; collaborate with other universities on the BQ.

Conclusion & Recommendations: Students who travel long distances to campus, are not on financial aid, are first generation, are not in residence, have no computer skills, or have had no exposure to library usage skills are the most likely to fail their first year. Working part time is not significant. The university could provide accommodation to students living more than 20 km from campus, and could provide access to computers and workshops for students who were not exposed to computers at high school. The university could also provide study skills workshops and training in the general use of library resources to these students. Academic mentors could also be arranged for first generation students as they require motivation and