A Core Course on Modeling: Introduction to Modeling (0LAB0 0LBB0 0LCB0 0LDB0), S.28


Reflection: interpret result (Conclude phase)
- Right problem? (problem validation)
- Right concepts? (concepts validation)
- Right model? (model verification)
- Right outcome? (outcome verification)
- Right answer? (answer verification)

Confidence for Black Box models

Black box models have empirical data as input. Quantities try to capture the essential behavior of this data, and typically involve aggregation. The most common aggregations: average and standard deviation (univariate: every item is a single quantity); correlation and fit (bivariate: every item is a pair of quantities).

Average ($\mu$): the central tendency in a set (for mathematical details, see courses in statistics).

Standard deviation ($\sigma$; variance is $\sigma^2$): how much variation a set contains.

Correlation ($\rho$): the agreement between two sets (a measure of similarity).
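As an aside (not part of the deck), a minimal Python sketch computing these three aggregations for a small, made-up bivariate data set; all numbers are hypothetical:

```python
# A minimal sketch (made-up numbers, not from the slides): computing the
# aggregations named above for a small bivariate data set.
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])   # hypothetical measurements
y = np.array([1.9, 4.2, 5.8, 8.3, 9.8])    # hypothetical paired measurements

mu = x.mean()                  # average: central tendency of the set
sigma = x.std()                # standard deviation: spread (variance = sigma**2)
rho = np.corrcoef(x, y)[0, 1]  # correlation: agreement between the two sets

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}, rho = {rho:.3f}")
```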

Fit: an example of extracting a meaningful pattern from data. Example: a data set $(x_i, y_i)$; assume a linear dependency $y = f(x)$. Intuition: find a line $y = ax + b$ such that the sum of squares of the vertical differences is minimal. Use a penalty to find the best-fit line: $q = \sum_i (y_i - f(x_i))^2$, where $f(x_i) = a x_i + b$, with $a$ and $b$ unknown. [Figure: a sequence of candidate lines: ...very bad, ...still not good, ...try again, ...good (best?)]
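A minimal sketch of this idea (made-up data): setting the derivatives of $q$ with respect to $a$ and $b$ to zero yields the classic closed-form least-squares line.

```python
# A minimal sketch (made-up data, not from the slides) of the least-squares
# line y = a*x + b that minimizes the penalty q = sum_i (y_i - (a*x_i + b))**2.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data set (x_i, y_i)
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Setting dq/da = 0 and dq/db = 0 gives the closed-form solution:
a = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - a * x.mean()

q = np.sum((y - (a * x + b)) ** 2)        # the penalty at its minimum
print(f"a = {a:.3f}, b = {b:.3f}, q = {q:.4f}")
```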

A black box model should explain the essence of a body of data: subtracting the explained part of the data should leave little of the initial data. For data $(x_i, y_i)$ 'explained' by a model $y = f(x)$, the part left over is $\sum_i (y_i - f(x_i))^2$ (the residue). This should be small compared to $\sum_i (y_i - \mu_y)^2$ (what you would get assuming no functional dependency). Therefore: confidence is high iff $\sum_i (y_i - f(x_i))^2 / \sum_i (y_i - \mu_y)^2 \ll 1$.
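Continuing the sketch above (same made-up data), this confidence check is a two-line computation; the ratio is the fraction of unexplained variance, i.e. $1 - r^2$.

```python
# Compare the residue of the fitted line with what assuming
# "no functional dependency" would leave (made-up data).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
a, b = np.polyfit(x, y, 1)                 # least-squares line y = a*x + b

residue = np.sum((y - (a * x + b)) ** 2)   # sum_i (y_i - f(x_i))**2
baseline = np.sum((y - y.mean()) ** 2)     # sum_i (y_i - mu_y)**2
print(f"residue/baseline = {residue / baseline:.4f}")  # confidence high iff << 1
```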

A black box model should be distinctive, that is: it should make it possible to distinguish input sets that are intuitively distinct.

Average, variance and least squares may not be very distinctive. Anscombe (1973) constructed four very distinct data sets with equal averages, variances, and least-squares fits. Hasty conclusion: 'these sets are similar'... not!
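The effect can be checked numerically. The sketch below uses the quartet's values as published in Anscombe (1973); all four sets report (nearly) the same mean, variance, and fitted line, even though their scatter plots look completely different.

```python
# Anscombe's quartet: four visually different sets, same summary statistics.
import numpy as np

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}
for name, (xs, ys) in quartet.items():
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    a, b = np.polyfit(xs, ys, 1)            # least-squares line per set
    print(f"{name}: mean(y) = {ys.mean():.2f}, var(y) = {ys.var(ddof=1):.2f}, "
          f"fit y = {a:.2f}x + {b:.2f}")
```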

Raw data is reasonably well explained by a linear least-squares fit (low residue). So what? Challenge the hypothesis that the raw data stems from one set: cluster analysis reveals two sets. Conclusion 1: women will overtake men in 2050? Conclusion 2: men will break the 0-second record around 2200?
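A minimal sketch of that 'challenge the hypothesis' step, on synthetic data standing in for the slide's two record series (all numbers are invented; scikit-learn's KMeans is one possible clustering tool):

```python
# Pooled "raw data" secretly containing two trends: cluster first, fit per cluster.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
t = np.linspace(1950, 2000, 26)                                # hypothetical years
y1 = 10.6 - 0.006 * (t - 1950) + rng.normal(0, 0.03, t.size)   # series A (invented)
y2 = 11.9 - 0.012 * (t - 1950) + rng.normal(0, 0.03, t.size)   # series B (invented)
x, y = np.concatenate([t, t]), np.concatenate([y1, y2])

# The two series are well separated in y, so clustering the y-values suffices here.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(y.reshape(-1, 1))
for k in (0, 1):
    a, b = np.polyfit(x[labels == k], y[labels == k], 1)
    print(f"cluster {k}: slope {a:+.4f} per year")  # two distinct trends emerge
```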

Summary:
- Black box models involve aggregation: often a (linear least-squares) fit, minimizing residues.
- Values of ($\mu$, $\sigma$, residue) don't give a full characterisation of the data. Residual error: how much of the behavior of the data is captured in the model?
- Distinctiveness (more on distinctiveness in chapter 7): how well can the model distinguish between different modeled systems?
- Common sense: how plausible are conclusions drawn from a black box model?