Neural Networks in RStudio with the H2O module


Neural Networks in RStudio with the H2O module Raymond Fraizer

Goal To learn how to implement a neural network using RStudio and H2O. To learn how to use the H2O web interface. To gain experience with sensor data.

Dataset Background Found the PUC-Rio dataset at the UCI repository, described in the paper "Wearable Computing: Accelerometers' Data Classification of Body Postures and Movements" by Ugulino et al. [1] They used C4.5 with AdaBoost (10 boosting iterations) and 10-fold cross-validation, achieving an overall accuracy of 99.4%.

What is H2O A machine learning platform with interfaces for many languages, such as R, Python, Scala, and Java. It also has a web interface for interactive control. It provides many common models: GLM, GBM, DRF, PCA, K-Means, "Deep Learning", and "Naive Bayes". [2]

What is H2O The platform is set up for easy model building: it automates common pre- and post-processing tasks and provides good defaults for most settings. You don't need to worry about missing values, categorical features, or scaling your data; H2O handles all of that for you. [3]
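As a minimal sketch of the workflow described above: starting a local H2O cluster from R, importing a file, and training a small deep learning model. The file path and the response column name ("class") are placeholders, not the actual dataset details from the slides.

```r
library(h2o)
h2o.init(nthreads = -1)          # start a local cluster using all cores

# H2O parses the file itself; categorical columns become factors and
# missing values are handled automatically, as noted above.
data  <- h2o.importFile("dataset.csv")          # hypothetical path
parts <- h2o.splitFrame(data, ratios = 0.8, seed = 1)

model <- h2o.deeplearning(
  x = setdiff(names(data), "class"),  # predictors: all non-response columns
  y = "class",                        # assumed response column name
  training_frame   = parts[[1]],
  validation_frame = parts[[2]],
  hidden = c(50, 50),                 # two hidden layers of 50 nodes
  epochs = 10
)

h2o.confusionMatrix(model, valid = TRUE)
```

While the cluster is running, the same models and frames are also visible in the H2O web interface mentioned earlier.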

Method Using RStudio and the H2O module, I trained several individual neural networks with different configurations to compare results. With so many parameters to tune, I used random grid search to choose the activation function, the number of nodes, and, in some runs, the l1 and l2 regularization strengths.
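The random grid search over activation, layer sizes, and l1/l2 can be sketched with `h2o.grid` as below. The hyper-parameter values, file path, and column names are illustrative assumptions, not the exact grid used in the experiments.

```r
library(h2o)
h2o.init(nthreads = -1)

data  <- h2o.importFile("dataset.csv")          # hypothetical path
parts <- h2o.splitFrame(data, ratios = 0.8, seed = 1)

# Candidate values for each tuned parameter (illustrative only).
hyper_params <- list(
  activation = c("Rectifier", "Tanh", "RectifierWithDropout"),
  hidden     = list(c(45), c(60), c(45, 45), c(400)),
  l1         = c(0, 1e-5, 1e-4),
  l2         = c(0, 1e-5, 1e-4)
)

grid <- h2o.grid(
  algorithm = "deeplearning",
  grid_id   = "nn_grid",
  x = setdiff(names(data), "class"),
  y = "class",                                  # assumed response column
  training_frame   = parts[[1]],
  validation_frame = parts[[2]],
  hyper_params     = hyper_params,
  # RandomDiscrete samples random combinations instead of the full grid.
  search_criteria  = list(strategy = "RandomDiscrete",
                          max_models = 20, seed = 1)
)

# Rank the sampled models by validation accuracy.
h2o.getGrid("nn_grid", sort_by = "accuracy", decreasing = TRUE)
```

Random search keeps the number of trained models bounded (`max_models`) even though the full Cartesian grid here would be far larger.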

Results

Conclusion Small networks did poorly, if 92%-or-better accuracy can be called poor. The 45-60 node networks all exceeded 98% accuracy, but with some overfitting. The large (400+ node) networks that did not use dropout exceeded 99% accuracy, but with substantial overfitting. Even without an ensemble, the neural network came close to the original work's 99.4% accuracy from a 10-iteration AdaBoosted, 10-fold cross-validated decision tree.

References
[1] Ugulino, W.; Cardador, D.; Vega, K.; Velloso, E.; Milidiu, R.; Fuks, H. "Wearable Computing: Accelerometers' Data Classification of Body Postures and Movements." Proceedings of the 21st Brazilian Symposium on Artificial Intelligence (SBIA 2012), Lecture Notes in Computer Science, pp. 52-61. Curitiba, PR: Springer Berlin/Heidelberg, 2012.
[2] Aboyoun, P.; Aiello, S.; Eckstrand, E.; Fu, A.; Landry, M.; Lanford, J. (Sep. 2016). "Machine Learning with R and H2O." http://h2o2016.wpengine.com/wp-content/themes/h2o2016/images/resources/RBooklet.pdf (accessed Apr. 11, 2017).
[3] Arora, A.; Candel, A.; Lanford, J.; LeDell, E.; Parmar, V. (Sep. 2016). "Deep Learning with H2O." http://h2o2016.wpengine.com/wp-content/themes/h2o2016/images/resources/DeepLearningBooklet.pdf (accessed Apr. 11, 2017).