Presented by Namir Shammas


Machine Learning for Best Linearized Regression Model Using the HP Prime
Presented by Namir Shammas

Dedication To the late Jon Johnston, curator of the HP Computer Museum (hpmuseum.net), who lost his life in April 2016 while on a mountain-climbing expedition in Tibet. His contributions in posting documentation for HP desktop and handheld computers remain invaluable.

History of Best Linearized Regression Models The HP-65 and HP-67 Stat Pacs offered programs for various linearized regression models. The PPC-ROM included a best linearized regression routine, and the HP-41 Advantage ROM also offers best linearized regression. These applications chose from a set of four regression models: linear, exponential, logarithmic, and power.

History of Best Linearized Regression Models (cont.) Several PPC members, William Kolb among them, explored obtaining the best linearized regression model using a wider set of models. I wrote programs to find best linearized regression models in: The Corvallis library. HHC presentations. HP Solver (HP 39GII programs that also work on the HP Prime).

History of Best Linearized Regression Models (cont.) Covering a large number of linearized and multiple linearized regression models may use: A set of indices, each specifying all of the transformations for the observations. A set of enumerated transformations for each variable entering the regression models. A range of powers applied to each variable (a power of 0 denotes the logarithmic transformation): integer values, like -4 to 4 in steps of 1, or floating-point values, like -4, -3.75, -3.5, …, 3.5, 3.75, 4.
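The power-based enumeration above can be sketched in Python (the talk's actual programs are in HP Prime PPL; the function name and layout here are illustrative only, with a power of 0 standing in for the logarithmic transformation):

```python
import math

def transform(value, power):
    """Apply value**power, or log(value) when power is 0."""
    if power == 0:
        return math.log(value)
    return value ** power

# Integer powers from -4 to 4 in steps of 1, as described above.
powers = range(-4, 5)
x = 2.0
transformed = [transform(x, p) for p in powers]
```

Each candidate model then pairs one transformation for X with one for Y, so the search space is the Cartesian product of the two power ranges.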

Machine Learning Basics for Finding the Best Fits Use two data sets: the first data set is used for training, and the second for testing. Each data set should have different noise; thus, if possible, avoid splitting one big data set into two subsets.

ML Basics (cont.) For each model do the following: Calculate the regression slope and intercept from the training set. Calculate MSSE1 from the training set. Calculate MSSE2 from the test set, using the regression slope and intercept obtained from the training data set. Calculate a weighted MSSE from MSSE1 and MSSE2. Rank the best models using the weighted MSSE.
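The steps above can be sketched as follows (a Python illustration, not the HP Prime code; the function names are assumptions, and the weight plays the role of the MLwt parameter introduced later):

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def msse(xs, ys, slope, intercept):
    """Mean sum of squared errors of the fitted line on (xs, ys)."""
    return sum((y - (intercept + slope * x)) ** 2
               for x, y in zip(xs, ys)) / len(xs)

def weighted_msse(train, test, wt):
    """Fit on the training pair (xs, ys), then blend MSSE1 and MSSE2."""
    slope, intercept = fit_line(*train)
    msse1 = msse(*train, slope, intercept)   # training error
    msse2 = msse(*test, slope, intercept)    # test error, same fit
    return wt * msse1 + (1 - wt) * msse2
```

In the full scheme, this score is computed once per candidate transformation pair and the models are then sorted by it.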

Options Options for model selection: Use an enumerated set of models, each with specific transformations for the X and Y values. Or use a range of powers to apply to the X and Y values; the power value of 0 translates into the logarithmic transformation. Options for handling data: Use the X and Y values without transformations. Normalize the X and Y data using minimum and maximum values to map the data into the recommended range of [1, 2]. Or subtract the mean value and then divide by the standard deviation.

Bootstrap Method Alternative Use a single data set. Repeat N times, each time selecting a different subset (using the same number of observations) to calculate the regression model statistics. Calculate the average slope, intercept, and coefficient of determination. Rank the models by their average coefficient of determination values.
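The bootstrap procedure above can be sketched like this (a Python stand-in for the HP Prime program; the names, the fixed seed, and the subset-without-replacement sampling are illustrative assumptions):

```python
import random

def bootstrap_regression(points, fraction, num_sims, seed=1):
    """Average slope, intercept, and R^2 over num_sims random subsets."""
    random.seed(seed)
    k = max(2, int(fraction * len(points)))  # same subset size each run
    slopes, intercepts, r2s = [], [], []
    for _ in range(num_sims):
        sample = random.sample(points, k)
        xs = [p[0] for p in sample]
        ys = [p[1] for p in sample]
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx = sum((x - mx) ** 2 for x in xs)
        syy = sum((y - my) ** 2 for y in ys)
        slope = sxy / sxx
        slopes.append(slope)
        intercepts.append(my - slope * mx)
        r2s.append(sxy * sxy / (sxx * syy))  # coefficient of determination
    m = num_sims
    return sum(slopes) / m, sum(intercepts) / m, sum(r2s) / m
```

Each candidate model would run this loop on its transformed data, and the models would then be ranked by the averaged R^2.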

Creating Test/Training Data Use the PopData function in HP Prime file PopData.txt. Parameters of function PopData: n – number of points. x – starting value for x. a, b, pwr – coefficients for y = a*b*x^pwr. PEF1 – % error factor for the training data. PEF2 – % error factor for the test data. Matrix M1 stores the training data set. Matrix M2 stores the test data set.
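A rough Python analogue of what PopData does, per the parameter list above (the real function is HP Prime PPL; the noise model shown, a uniform random percentage within ±pef percent, is an assumption, as are the step size and seeding):

```python
import random

def pop_data(n, x0, a, b, pwr, pef, seed=0):
    """Generate n (x, y) points with y = a*b*x^pwr plus % error noise."""
    random.seed(seed)
    data = []
    x = x0
    for _ in range(n):
        y = a * b * x ** pwr
        noise = 1 + random.uniform(-pef, pef) / 100.0  # % error factor
        data.append((x, y * noise))
        x += 1
    return data
```

Calling it twice with different error factors (PEF1 and PEF2) would produce the training and test matrices M1 and M2 with independent noise, as the ML basics slide recommends.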

Sample M1 matrix of 100 (X, Y) points.

Sample M1 matrix (X, Y) points.

Sample M2 matrix (X, Y) points.

Getting Min Max Range For normalized data, use the GetMinMax function to get the minimum and maximum values of the X and Y data.
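Min-max normalization into the recommended [1, 2] range can be sketched as follows (a Python illustration; the helper name is hypothetical, and only the min/max lookup corresponds to GetMinMax):

```python
def normalize_1_2(values):
    """Map values linearly onto [1, 2] using their min and max."""
    lo, hi = min(values), max(values)
    return [1 + (v - lo) / (hi - lo) for v in values]
```

Presumably the [1, 2] range is recommended because it keeps every value strictly positive, so the logarithmic and negative-power transformations remain defined for all normalized data.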

ML Program Version 1 Use function ML1 in HP Prime file Best_YX_LR_Machine_Learning_1.txt. Parameters of function ML1: pDataMat – matrix containing the training data. pTestDataMat – matrix containing the test data. MLwt – weight used to combine the training and test MSSE values. Returns a matrix of Rsqr, MSSE, Y transformation, X transformation, slope, and intercept values.

Example of ML Program Version 1

ML Program Version 2 Use function ML2 in HP Prime file Best_YX_LR_Machine_Learning_2.txt. Parameters of function ML2: pDataMat – matrix containing the training data. pTestDataMat – matrix containing the test data. MLwt – weight used to combine the training and test MSSE values. Uses normalized values for variables X and Y. Returns a matrix of Rsqr, MSSE, Y transformation, X transformation, slope, and intercept values.

Example of ML Program Version 2

Bonus Regression Programs! The proceedings include source code for multiple linearized regression versions of the programs presented earlier. File Best_ZYX_MLR_Machine_Learning_1.txt has machine learning regression for Z=f(X,Y). File Best_ZYX_MLR_Machine_Learning_2.txt has machine learning regression for Z=f(X,Y) using normalized data.

Bootstrap Regression Use function BSR in HP Prime file Best_YX_LR_Bootstrap.txt. Parameters of function BSR: pDataMat – matrix containing the data. FractionDataUsed – fraction of the data used in each simulation. NumSimulations – number of simulations. Uses normalized values for variables X and Y. Returns a matrix of Rsqr, Y transformation, X transformation, slope, and intercept values.

Example of Bootstrap Program

Thank You!