Vectors geometry: Playing with arrows

Presentation transcript:

Vectors geometry: Playing with arrows. How a single vector (arrow) can represent the concepts of mean, variance (standard deviation), normalization and standardization. How two vectors can represent the concepts of correlation and regression.

A datum (0) (16)

Two data (8) (0) (16) Principle of independence of observations: each datum gets its own axis, and the two axes are perpendicular

Two data (16,8) (8) (0, 0) (0) (16)

Two data (16,8) (0, 0)

Starting point: Zero Ending point (16,8) Starting point (0,0)

Starting point: Mean Ending point x = (x1, x2) Starting point

Starting point: Mean Starting point (12, 12) Ending point x = (16, 8)

One group

Many groups

Degrees of freedom

We removed the effect of the mean: we centered the data. The starting point (mean) (12, 12) becomes (0, 0). The ending point x = (16, 8) becomes (16 - 12, 8 - 12) = (4, -4).
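As a minimal sketch of the centering step on this slide, the snippet below subtracts the mean from each datum. The helper name `center` is an assumption made for illustration; the values (16, 8) and the mean of 12 come from the slide.

```python
def center(data):
    """Subtract the mean from every observation (centering)."""
    mean = sum(data) / len(data)
    return [value - mean for value in data]


x = [16, 8]            # the two data from the slide
centered = center(x)   # the arrow now starts at (0, 0) instead of (12, 12)
print(centered)        # [4.0, -4.0]
```

After centering, the components always sum to zero, which is what "removing the effect of the mean" amounts to.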

We removed the effect of the mean (many groups)

We removed the effect of the mean (many groups) What is the real dimensionality?

We removed the effect of the mean. If we have two data, we get one dimension. If we have three data, we get two dimensions. If we have n data, we get n-1 dimensions. In other words, the degrees of freedom represent the true dimensionality of the data.
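One way to see why centering costs a dimension: the centered components must sum to zero, so the last one is fully determined by the others. The sketch below illustrates this with three data; the values in `data` are hypothetical and `center` is an illustrative helper, not something named in the slides.

```python
import math


def center(data):
    """Subtract the mean from every observation."""
    mean = sum(data) / len(data)
    return [value - mean for value in data]


data = [16, 8, 9]          # hypothetical three data, mean = 11
centered = center(data)    # [5.0, -3.0, -2.0]

# Only n - 1 components are free: the last one can be rebuilt from the rest.
rebuilt_last = -sum(centered[:-1])
print(centered[-1], rebuilt_last)

degrees_of_freedom = len(data) - 1
print(degrees_of_freedom)  # 2
```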

Variance

What is the difference between these three vectors (composed of two data each)? Length (distance): the higher the variability, the longer the vector. (-0.5, 0.5) (1.5, -1.5) (2.5, -2.5)

What is the difference between these three arrows? How do we measure the length (distance)? Pythagoras: the length is the hypotenuse of a right triangle with legs 4 and 3, so $\sqrt{4^2 + 3^2} = \sqrt{25} = 5$. The arrow ending at (4, 3) has length 5.

What is the difference between these three arrows? Therefore, the point (4,3) is at a distance of 5 from its starting point. When the starting point is the mean, squared length = sum of squares = variance × (n-1).
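The Pythagorean length extends to any number of components. The following is a small sketch, with a hypothetical helper named `length`, applied to the (4, 3) example and to the three two-data vectors shown earlier.

```python
import math


def length(vector):
    """Euclidean length: square root of the sum of squared components."""
    return math.sqrt(sum(component * component for component in vector))


print(length([4, 3]))  # 5.0

# The three centered vectors from the earlier slide: more variability, longer arrow.
for vector in ([-0.5, 0.5], [1.5, -1.5], [2.5, -2.5]):
    print(vector, round(length(vector), 3))
# [-0.5, 0.5] 0.707
# [1.5, -1.5] 2.121
# [2.5, -2.5] 3.536
```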

What is the difference between these three arrows? What is the length of these three lines? [Figure: three arrows whose components all equal 1, drawn in A) one, B) two and C) three dimensions, with lengths 1, $\sqrt{2}$ and $\sqrt{3}$.] The dimensionality inflates the variability. To obtain a measure that takes the dimensionality into account, what do we need to do?

What is the difference between these three arrows? We divide the squared length of the data set by its true dimensionality. Variance = (quadratic) distance from the mean, corrected by the true dimensionality of the data (n-1).
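As a hedged illustration of this definition, the sketch below computes the variance as the squared length of the centered vector divided by n - 1, and checks it against the sample variance from Python's standard `statistics` module. The function name `variance_from_length` and the second data set are assumptions for the example.

```python
import math
import statistics


def variance_from_length(data):
    """Squared length of the centered vector divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    centered = [value - mean for value in data]
    squared_length = sum(d * d for d in centered)  # sum of squares
    return squared_length / (n - 1)


two_data = [16, 8]         # from the slides: centered vector (4, -4)
three_data = [16, 8, 12]   # hypothetical: centered vector (4, -4, 0)

print(variance_from_length(two_data))    # 32.0
print(variance_from_length(three_data))  # 16.0

# Agrees with the standard-library sample variance.
print(math.isclose(variance_from_length(three_data),
                   statistics.variance(three_data)))  # True
```

Both data sets have the same squared length (32), but the correction by n - 1 gives different variances.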

Normalization and standardization

Normalization vs Standardization. To normalize is to rescale a centered vector x (arrow, mean = 0) to a length of 1. Normalization: $z = x / \lVert x \rVert$ (x divided by its length), so $\sum z^2 = 1$. Standardization: $z_x = x / SD$, so $\sum z_x^2 = n-1$, and therefore $z_x = z\sqrt{n-1}$.
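The two rescalings can be compared numerically. This is a minimal sketch, assuming non-constant data (so the length is not zero). The helper names `center`, `normalize` and `standardize` and the sample values are hypothetical.

```python
import math


def center(data):
    """Subtract the mean from every observation."""
    mean = sum(data) / len(data)
    return [value - mean for value in data]


def normalize(data):
    """Center the data, then divide by the vector's length (result has length 1)."""
    centered = center(data)
    length = math.sqrt(sum(d * d for d in centered))
    return [d / length for d in centered]


def standardize(data):
    """Center the data, then divide by the sample standard deviation."""
    centered = center(data)
    sd = math.sqrt(sum(d * d for d in centered) / (len(data) - 1))
    return [d / sd for d in centered]


data = [2, 4, 9]  # hypothetical sample, n = 3
n = len(data)
z = normalize(data)
zx = standardize(data)

print(sum(v * v for v in z))    # approximately 1
print(sum(v * v for v in zx))   # approximately n - 1 = 2
# Each standardized score is the normalized score times sqrt(n - 1).
print(all(math.isclose(a, b * math.sqrt(n - 1)) for a, b in zip(zx, z)))
```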

Two groups or two variables

One group of three participants

Two groups of three participants

Two groups of three participants They can be represented by a plane

Two groups of three participants They can be represented by a plane This is true whatever the number of participants

Correlation and regression

Relation between two vectors If two groups (u and v) have the same data, then the two vectors are superposed on each other. As the angle between them increases, the direction changes.

Relation between two vectors If the angle reaches 90 degrees, then they share nothing in common.

Relation between two vectors. The cosine of the angle between the two centered vectors is the correlation coefficient.
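As a sketch of this relationship, the snippet below centers two vectors, takes the cosine of the angle between them, and reads it as the correlation. The helper names and the sample groups `u`, `v`, `w` are assumptions for illustration, and the data are assumed to be non-constant.

```python
import math


def center(data):
    """Subtract the mean from every observation."""
    mean = sum(data) / len(data)
    return [value - mean for value in data]


def correlation(u, v):
    """Cosine of the angle between the centered vectors u and v."""
    cu, cv = center(u), center(v)
    dot = sum(a * b for a, b in zip(cu, cv))
    length_u = math.sqrt(sum(a * a for a in cu))
    length_v = math.sqrt(sum(b * b for b in cv))
    return dot / (length_u * length_v)


def angle_in_degrees(u, v):
    """Angle between the centered vectors, clamped against rounding error."""
    r = max(-1.0, min(1.0, correlation(u, v)))
    return math.degrees(math.acos(r))


u = [1, 2, 3]
v = [2, 4, 6]    # same direction as u once centered: arrows superposed
w = [1, -2, 1]   # perpendicular to centered u: nothing in common

print(correlation(u, v), angle_in_degrees(u, v))  # close to 1 and 0 degrees
print(correlation(u, w), angle_in_degrees(u, w))  # close to 0 and 90 degrees
```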