Linear regression
How to analyse data? Plot!
The human brain is one of the most powerful computational tools. It works differently than a computer…
Simple example: finding the maximum y(xmax)
Computer:
1. Set xmax = x1, so the current maximum is y(x1).
2. Go to the next point x2: if y(x2) > y(xmax), then set xmax = x2; else do nothing.
3. Repeat this procedure for each following point until you reach the end.
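The computer's procedure can be sketched in a few lines of Python. This is only an illustration of the scan described in the steps: the function name `argmax_scan` and the sample values are assumptions, not part of the slides.

```python
def argmax_scan(ys):
    """Return the index of the largest value using one left-to-right scan."""
    if not ys:
        raise ValueError("need at least one point")
    i_max = 0                      # step 1: start with the first point
    for i in range(1, len(ys)):    # steps 2 and 3: visit every following point
        if ys[i] > ys[i_max]:      # strictly larger value found: remember it
            i_max = i
        # otherwise do nothing
    return i_max


ys = [1.0, 3.0, 2.0]       # illustrative values, not taken from the slides
i_max = argmax_scan(ys)    # index of the maximum in ys
```

The scan touches every point once, so its running time grows linearly with the number of points.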
Human brain: "Here!"
With an increasing number of points, the brain gives the quicker answer.
How to analyse data? Plot x against y and observe the trend (correlation).
How to "measure" linearity? Geometry: compare the directions of two vectors $\mathbf{a}$ and $\mathbf{b}$.
How to measure the angle between two vectors? Scalar product.
For $\mathbf{a} = (a_1, a_2)$ and $\mathbf{b} = (b_1, b_2)$ with angle $\alpha$ between them:
$$\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 = \sum_{i=1}^{2} a_i b_i$$
$$\mathbf{a} \cdot \mathbf{a} = a_1^2 + a_2^2 = |\mathbf{a}|^2$$
$$\cos\alpha = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}|\,|\mathbf{b}|}$$
Example (vectors in x, y, z coordinates). How to do it?
We choose two vectors: $\mathbf{a} = (1, 0, 1)$, $\mathbf{b} = (0, 1, 1)$.
$$\cos\alpha = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}|\,|\mathbf{b}|} = \frac{(1,0,1) \cdot (0,1,1)}{|(1,0,1)|\,|(0,1,1)|} = \frac{1 \cdot 0 + 0 \cdot 1 + 1 \cdot 1}{\sqrt{2}\,\sqrt{2}} = \frac{1}{2}$$
$$\alpha = 60^\circ$$
What's the relevance? Two sets of data:
X   1   2     3     4
Y   2   4.1   5.4   8.3
Data are vectors!
$$\mathbf{x} = (1, 2, 3, 4), \quad \mathbf{y} = (2, 4.1, 5.4, 8.3)$$
Linear relationship: $\mathbf{y} = a\,\mathbf{x}$, so the vectors $\mathbf{x}$ and $\mathbf{y}$ are parallel.
How to measure parallelism between two vectors?
Linear relationship: parallel = zero angle, so $\alpha \approx 0 \rightarrow \cos\alpha \approx 1$.
How to calculate the angle? Scalar product!
$$\cos\alpha = \frac{\mathbf{x} \cdot \mathbf{y}}{|\mathbf{x}|\,|\mathbf{y}|} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
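Changing the origin from $(0, 0)$ to the mean point $(\bar{x}, \bar{y})$, where
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$
the cosine of the angle becomes the correlation coefficient $r$, and its square is $R^2$:
$$r = \cos\alpha = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}, \quad R^2 = r^2$$
Our case:
$$\bar{x} = \frac{1 + 2 + 3 + 4}{4} = 2.5, \quad \bar{y} = \frac{2 + 4.1 + 5.4 + 8.3}{4} = 4.95$$
x - mean(x)   -1.5    -0.5    0.5    1.5
y - mean(y)   -2.95   -0.85   0.45   3.35
$$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = (-1.5)(-2.95) + (-0.5)(-0.85) + (0.5)(0.45) + (1.5)(3.35) = 10.1$$
$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = 5, \quad \sum_{i=1}^{n} (y_i - \bar{y})^2 = 20.85$$
$$r = \frac{10.1}{\sqrt{5}\,\sqrt{20.85}} \approx 0.989, \quad R^2 = r^2 \approx 0.98$$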
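The worked numbers can be reproduced with a short script. This is a sketch under one stated assumption: the cosine between the mean-centred vectors is treated as the correlation coefficient r, and R² is taken as r², which holds for a straight-line fit. The function name `pearson_r` is chosen here for illustration.

```python
import math


def pearson_r(xs, ys):
    """Cosine of the angle between the mean-centred data vectors."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    dx = [x - x_mean for x in xs]            # x_i - mean(x)
    dy = [y - y_mean for y in ys]            # y_i - mean(y)
    sxy = sum(u * v for u, v in zip(dx, dy))
    sxx = sum(u * u for u in dx)
    syy = sum(v * v for v in dy)
    # Undefined (division by zero) if all x or all y values are identical.
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4]
ys = [2, 4.1, 5.4, 8.3]
r = pearson_r(xs, ys)    # about 0.989
r_squared = r * r        # about 0.98
```

A value of R² close to 1 indicates that the centred vectors are nearly parallel, that is, the data are close to a straight line.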
What is the best position of the line? The best = smallest error.
Error = data value - estimated value
For a line $f(x) = ax + b$, the error at point $i$ is
$$E_i = y_i - f(x_i), \quad \text{e.g. } E_1 = y_1 - f(x_1), \quad E_2 = y_2 - f(x_2)$$
and the sum of squared errors is
$$SSE = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} \left(y_i - f(x_i)\right)^2 = \sum_{i=1}^{n} \left(y_i - a x_i - b\right)^2$$
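To make the error measure concrete, the sketch below evaluates SSE for the data set used in the lecture. The trial lines y = 2x and y = x are arbitrary guesses chosen for this illustration, not fitted results.

```python
def sse(a, b, xs, ys):
    """Sum of squared errors of the line f(x) = a*x + b on the data."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4]
ys = [2, 4.1, 5.4, 8.3]

sse_steep = sse(2.0, 0.0, xs, ys)   # trial line y = 2x (hypothetical guess)
sse_flat = sse(1.0, 0.0, xs, ys)    # trial line y = x  (hypothetical guess)
better = "y = 2x" if sse_steep < sse_flat else "y = x"
```

Comparing SSE values in this way ranks candidate lines; the regression problem is to find the pair (a, b) with the smallest SSE of all.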
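How to adjust a and b so that SSE is the smallest?
$$SSE(a, b) = \sum_{i=1}^{n} \left(y_i - a x_i - b\right)^2$$
How to calculate the minimum of the SSE(a, b) function? Set both partial derivatives to zero:
$$\frac{\partial SSE(a, b)}{\partial a} = 0, \quad \frac{\partial SSE(a, b)}{\partial b} = 0$$
The derivatives are
$$\frac{\partial SSE(a, b)}{\partial a} = \frac{\partial}{\partial a} \sum_{i=1}^{n} \left(y_i - a x_i - b\right)^2 = \sum_{i=1}^{n} \frac{\partial}{\partial a} \left(y_i - a x_i - b\right)^2 = \sum_{i=1}^{n} -x_i \cdot 2\left(y_i - a x_i - b\right) = -2 \sum_{i=1}^{n} x_i \left(y_i - a x_i - b\right)$$
$$\frac{\partial SSE(a, b)}{\partial b} = \frac{\partial}{\partial b} \sum_{i=1}^{n} \left(y_i - a x_i - b\right)^2 = \sum_{i=1}^{n} \frac{\partial}{\partial b} \left(y_i - a x_i - b\right)^2 = \sum_{i=1}^{n} -2\left(y_i - a x_i - b\right) = -2 \sum_{i=1}^{n} \left(y_i - a x_i - b\right)$$
so the conditions for the minimum become
$$\frac{\partial SSE(a, b)}{\partial a} = 0 \;\rightarrow\; -2 \sum_{i=1}^{n} x_i \left(y_i - a x_i - b\right) = 0$$
$$\frac{\partial SSE(a, b)}{\partial b} = 0 \;\rightarrow\; -2 \sum_{i=1}^{n} \left(y_i - a x_i - b\right) = 0$$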
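One way to gain confidence in the two partial derivatives is to compare them with finite differences. The sketch below does this at an arbitrary trial point; the step size `h` and the function names are assumptions made for this check.

```python
def sse(a, b, xs, ys):
    """Sum of squared errors of the line f(x) = a*x + b."""
    return sum((y - a * x - b) ** 2 for x, y in zip(xs, ys))


def grad_sse(a, b, xs, ys):
    """Analytic partial derivatives (dSSE/da, dSSE/db) from the derivation."""
    d_a = -2 * sum(x * (y - a * x - b) for x, y in zip(xs, ys))
    d_b = -2 * sum(y - a * x - b for x, y in zip(xs, ys))
    return d_a, d_b


def numeric_grad(a, b, xs, ys, h=1e-6):
    """Central finite differences; h is a hypothetical step size."""
    d_a = (sse(a + h, b, xs, ys) - sse(a - h, b, xs, ys)) / (2 * h)
    d_b = (sse(a, b + h, xs, ys) - sse(a, b - h, xs, ys)) / (2 * h)
    return d_a, d_b


xs = [1, 2, 3, 4]
ys = [2, 4.1, 5.4, 8.3]
analytic = grad_sse(2.0, 0.0, xs, ys)     # trial point a = 2, b = 0 (arbitrary)
numeric = numeric_grad(2.0, 0.0, xs, ys)  # should agree up to rounding error
```

Since SSE is quadratic in a and b, the central difference agrees with the analytic derivative up to floating-point rounding.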
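We obtain a set of linear equations in two variables, a and b.
From $\partial SSE / \partial a = 0$:
$$\sum_{i=1}^{n} x_i \left(y_i - a x_i - b\right) = 0$$
$$\sum_{i=1}^{n} \left(x_i y_i - a x_i^2 - b x_i\right) = 0$$
$$a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} x_i y_i = 0$$
From $\partial SSE / \partial b = 0$:
$$\sum_{i=1}^{n} \left(y_i - a x_i - b\right) = 0$$
$$a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} 1 - \sum_{i=1}^{n} y_i = 0$$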
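Finally… a set of linear equations:
$$a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} x_i y_i = 0 \;\rightarrow\; a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i$$
$$a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} 1 - \sum_{i=1}^{n} y_i = 0 \;\rightarrow\; a \sum_{i=1}^{n} x_i + b\,n = \sum_{i=1}^{n} y_i$$
In matrix form:
$$\begin{pmatrix} \sum_{i=1}^{n} x_i^2 & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & n \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{n} x_i y_i \\ \sum_{i=1}^{n} y_i \end{pmatrix}$$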
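How to solve it? It is a set of linear equations:
$$\begin{pmatrix} \sum_{i=1}^{n} x_i^2 & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & n \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{n} x_i y_i \\ \sum_{i=1}^{n} y_i \end{pmatrix}$$
This has the generic form $A\mathbf{x} = \mathbf{b}$, where in this generic notation the unknown vector $\mathbf{x}$ is $(a, b)^T$ and $\mathbf{b}$ is the right-hand side (not the data $x_i$ or the intercept $b$).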
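It has a solution if $\det A \neq 0$:
$$\det A = \begin{vmatrix} \sum_{i=1}^{n} x_i^2 & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & n \end{vmatrix} = n \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i \neq 0$$
Dividing by $n^2$:
$$\frac{1}{n} \sum_{i=1}^{n} x_i^2 - \left(\frac{1}{n} \sum_{i=1}^{n} x_i\right)\left(\frac{1}{n} \sum_{i=1}^{n} x_i\right) \neq 0$$
$$\overline{x^2} - \bar{x}^2 \neq 0$$
$$\mathrm{Cov}(X, X) \neq 0$$
In other words, the x values must not all be identical.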
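For a 2×2 system the solution can be written out directly with Cramer's rule. The sketch below is one possible way to solve the equations above, not necessarily the method intended in the lecture; the function name `fit_line` is an assumption.

```python
def fit_line(xs, ys):
    """Solve the 2x2 normal equations for slope a and intercept b."""
    n = len(xs)
    sx = sum(xs)                               # sum of x_i
    sy = sum(ys)                               # sum of y_i
    sxx = sum(x * x for x in xs)               # sum of x_i^2
    sxy = sum(x * y for x, y in zip(xs, ys))   # sum of x_i * y_i
    det = n * sxx - sx * sx                    # det A from the condition above
    if det == 0:
        # Exact comparison; with noisy float data a tolerance may be safer.
        raise ValueError("det A = 0: all x values are identical (or no data)")
    a = (n * sxy - sx * sy) / det              # Cramer's rule, first unknown
    b = (sxx * sy - sx * sxy) / det            # Cramer's rule, second unknown
    return a, b


xs = [1, 2, 3, 4]
ys = [2, 4.1, 5.4, 8.3]
a, b = fit_line(xs, ys)   # roughly a = 2.02, b = -0.1 for this data
```

For this data set the sums are Σx = 10, Σx² = 30, Σy = 19.8 and Σxy = 59.6, so det A = 4·30 − 10² = 20 and the system has a unique solution.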
Linear regression procedure
1. Plot the data, make observations, and decide which model fits best.
2. If you decide to use linear regression, compute $R^2$.
3. Solve the linear regression problem.