1
Projection and Some of Its Applications
Mohammed Nasser, Professor, Dept. of Statistics, RU, Bangladesh
"The use of matrix theory is now widespread and essential in the modern treatment of univariate and multivariate statistical methods." (C. R. Rao)
2
Contents
Oblique and Orthogonal Projection in R2
Orthogonal Projection into a Line in Rn
Inner Product Space
Projection into a Subspace
Gram-Schmidt Orthogonalization
Projection and Matrices
Projection in Infinite-dimensional Space
Projection in Multivariate Methods
3
Mathematical Concepts
3
4
Mathematical Concepts
Covariance, Projection, Variance
5
1.Oblique and Orthogonal Projection in R2
Let v1 and v2 be two independent vectors in R2. Then V1 = span{v1} and V2 = span{v2} are two one-dimensional subspaces of R2, and R2 = V1 + V2 with V1 ∩ V2 = {0}, so R2 = V1 ⊕ V2.
6
1.Oblique and Orthogonal Projection in R2
[Figure: the subspaces V1 and V2 spanned by (1,1) and (0,1).]
7
1.Oblique and Orthogonal Projection in R2
v1 = (1,1) and v2 = (0,1) are two independent (but not orthogonal) vectors in R2.
8
1.Oblique and Orthogonal Projection in R2
We define L: R2 → R2 by L(a1v1 + a2v2) = a1v1. We can easily show that L is a linear map. This linear map is called an (oblique) projection: a vector is projected onto the space generated by v1 along the space generated by v2.
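The slides contain no code, but the construction above is easy to check numerically. Below is a minimal sketch (added here), assuming v1 = (1,1) and v2 = (0,1) as in the adjacent figure; the function name is ours.

```python
import numpy as np

# Basis of R2: v1 spans the target space V1, v2 spans the direction V2 we project along.
v1 = np.array([1.0, 1.0])
v2 = np.array([0.0, 1.0])
B = np.column_stack([v1, v2])      # change-of-basis matrix [v1 v2]

def oblique_projection(x):
    """Project x onto span{v1} along span{v2}."""
    a = np.linalg.solve(B, x)      # coordinates of x in the basis {v1, v2}
    return a[0] * v1               # keep only the v1-component

x = np.array([2.0, 5.0])
print(oblique_projection(x))       # [2. 2.]; note x minus the projection is (0, 3), a multiple of v2
```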
9
1.Oblique and Orthogonal Projection in R2
[Figure: the orthogonal subspaces V1 and V2 spanned by (1,1) and (-1,1).]
10
1.Oblique and Orthogonal Projection in R2
v1 = (1,1) and v2 = (-1,1) are two orthogonal (hence independent) vectors in R2. Let us consider x = a1v1 + a2v2. In this case we can find the values of the coefficients ai without inverting any matrix: ai = (x·vi)/(vi·vi).
11
1.Oblique and Orthogonal Projection in R2
We define L: R2 → R2 by L(x) = ((x·v1)/(v1·v1)) v1. This linear map is called an orthogonal projection: the vector x is projected onto the space generated by v1 along the space generated by v2, which is the orthogonal complement of span{v1}.
12
Projections
[Figure: projection examples. The vector (2,2,2) in R3 with the standard basis vectors (1,0,0), (0,1,0), (0,0,1), and the vectors a = (1,0), b = (2,2) in the plane.]
13
1.Orthogonal Projection Into a Line
Definition 1.1 (Orthogonal Projection): The orthogonal projection of v into the line spanned by a nonzero s is the vector proj_s(v) = ((v·s)/(s·s)) s. If s has unit length, then proj_s(v) = (v·s)s and its length is |v·s|.
Example 1.1: Orthogonal projection of the vector (2 3)T into the line y = 2x.
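A short numerical sketch of Definition 1.1 applied to Example 1.1 (added here, not part of the original slides); the direction vector (1, 2) spans the line y = 2x.

```python
import numpy as np

def proj_line(v, s):
    """Orthogonal projection of v onto the line spanned by the nonzero vector s."""
    v, s = np.asarray(v, dtype=float), np.asarray(s, dtype=float)
    return (np.dot(v, s) / np.dot(s, s)) * s

# Example 1.1: project (2, 3)^T onto the line y = 2x, i.e. onto span{(1, 2)}.
print(proj_line([2, 3], [1, 2]))   # [1.6 3.2], i.e. (8/5, 16/5)
```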
14
Example 1.2: Orthogonal projection of a general vector in R3 into the y-axis (only the y-component is kept).
Example 1.3: Project = discard orthogonal components. A railroad car left on an east-west track without its brake is pushed by a wind blowing toward the northeast at fifteen miles per hour; what speed will the car reach? Projecting the wind velocity onto the east-west direction gives 15/√2, about 10.6 miles per hour.
15
Example 1.4: Nearest Point
Let A = (a,b) and B = (c,d) be two vectors. We have to find the vector on the line spanned by B that is nearest to A. That means we have to find the value of k for which F(k) = ((a,b) - k(c,d))T((a,b) - k(c,d)) is minimum, i.e., the length of A - kB is minimum. An easy application of the derivative shows that the nearest point is {(A·B)/(B·B)} B, the orthogonal projection of A onto B.
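A quick numerical illustration (added here, not on the slide) that the closed-form minimizer k = (A·B)/(B·B) really is where F(k) is smallest; the particular vectors A and B are arbitrary choices.

```python
import numpy as np

A = np.array([3.0, 1.0])                 # an arbitrary (a, b)
B = np.array([2.0, 2.0])                 # an arbitrary (c, d)

k_star = np.dot(A, B) / np.dot(B, B)     # closed-form minimizer of F(k) = ||A - kB||^2

ks = np.linspace(-2.0, 3.0, 100001)      # brute-force check over a grid of k values
F = ((A[None, :] - ks[:, None] * B[None, :]) ** 2).sum(axis=1)
print(k_star, ks[np.argmin(F)])          # both are approximately 1.0
```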
16
Exercises 1
1. Consider the function mapping the plane to itself that takes a vector to its projection into the line y = x. (a) Produce a matrix that describes the function's action. Show also that this map can be obtained by first rotating everything in the plane π/4 radians clockwise, then projecting into the x-axis, and then rotating π/4 radians counterclockwise.
2. Show that …
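A sketch of Exercise 1(a), added here for illustration: the projection matrix onto y = x equals rotate clockwise by π/4, project onto the x-axis, rotate back.

```python
import numpy as np

# Projection onto the line y = x, via the formula s s^T / (s^T s) with s = (1, 1).
P = 0.5 * np.array([[1.0, 1.0],
                    [1.0, 1.0]])

c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
R = np.array([[c, -s], [s, c]])          # counterclockwise rotation by pi/4
Px = np.array([[1.0, 0.0], [0.0, 0.0]])  # projection onto the x-axis

# R @ Px @ R.T rotates clockwise by pi/4, projects onto the x-axis, then rotates back.
print(np.allclose(P, R @ Px @ R.T))      # True
```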
17
2. Inner Product Space
2.1 Definition
An inner product on a real vector space V is a function that associates a number, denoted 〈u, v〉, with each pair of vectors u and v of V. This function has to satisfy the following conditions for vectors u, v, and w, and scalar c:
1. 〈u, v〉=〈v, u〉 (symmetry axiom)
2. 〈u + v, w〉=〈u, w〉+〈v, w〉 (additivity axiom)
3. 〈cu, v〉= c〈u, v〉 (homogeneity axiom)
4. 〈u, u〉 ≥ 0, and 〈u, u〉= 0 if and only if u = 0 (positive definiteness axiom)
18
2. Inner Product Space
A vector space V on which an inner product is defined is called an inner product space. Any function on a vector space that satisfies the inner product axioms defines an inner product on the space; there can be many different inner products on a given vector space.
19
Example 2.1 Let u = (x1, x2), v = (y1, y2), and w = (z1, z2) be arbitrary vectors in R2. Prove that〈u, v〉, defined as follows, is an inner product on R2. 〈u, v〉= x1y1 + 4x2y2 Determine the inner product of the vectors (-2, 5), (3, 1) under this inner product. Solution Axiom 1:〈u, v〉= x1y1 + 4x2y2 = y1x1 + 4y2x2 =〈v, u〉 Axiom 2:〈u + v, w〉=〈 (x1, x2) + (y1, y2) , (z1, z2) 〉 =〈 (x1 + y1, x2 + y2), (z1, z2) 〉 = (x1 + y1) z1 + 4(x2 + y2)z2 = x1z1 + 4x2z2 + y1 z1 + 4 y2z2 =〈(x1, x2), (z1, z2)〉+〈(y1, y2), (z1, z2) 〉 =〈u, w〉+〈v, w〉
20
Axiom 3:〈cu, v〉= 〈c(x1, x2), (y1, y2)〉 =〈 (cx1, cx2), (y1, y2) 〉
= cx1y1 + 4cx2y2 = c(x1y1 + 4x2y2) = c〈u, v〉
Axiom 4: 〈u, u〉=〈(x1, x2), (x1, x2)〉= x1² + 4x2² ≥ 0. Further, x1² + 4x2² = 0 if and only if x1 = 0 and x2 = 0, that is, u = 0. Thus 〈u, u〉 ≥ 0, and 〈u, u〉= 0 if and only if u = 0. The four inner product axioms are satisfied, so 〈u, v〉= x1y1 + 4x2y2 is an inner product on R2. The inner product of the vectors (-2, 5), (3, 1) is 〈(-2, 5), (3, 1)〉= (-2)(3) + 4(5)(1) = 14.
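A two-line numerical check of Example 2.1 (added here). Writing the weighted inner product as uᵀWv with W = diag(1, 4) is our observation, not from the slides.

```python
import numpy as np

def ip(u, v):
    """Weighted inner product <u, v> = x1*y1 + 4*x2*y2 on R^2."""
    W = np.diag([1.0, 4.0])
    return float(np.asarray(u) @ W @ np.asarray(v))

print(ip([-2, 5], [3, 1]))   # 14.0, matching the hand computation above
```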
21
Example 2.2: Consider the vector space M22 of 2 × 2 matrices. Let u = [a b; c d] and v = [e f; g h] be arbitrary 2 × 2 matrices. Prove that the following function is an inner product on M22: 〈u, v〉= ae + bf + cg + dh. Determine the inner product of the two given matrices.
Solution
Axiom 1: 〈u, v〉= ae + bf + cg + dh = ea + fb + gc + hd =〈v, u〉
Axiom 3: Let k be a scalar. Then 〈ku, v〉= kae + kbf + kcg + kdh = k(ae + bf + cg + dh) = k〈u, v〉
22
Example 2.3: Consider the vector space Pn of polynomials of degree ≤ n. Let f and g be elements of Pn. Prove that the following function defines an inner product on Pn: 〈f, g〉= ∫₀¹ f(x)g(x) dx. Determine the inner product of the polynomials f(x) = x² + 2x – 1 and g(x) = 4x + 1.
Solution
Axiom 1: 〈f, g〉= ∫₀¹ f(x)g(x) dx = ∫₀¹ g(x)f(x) dx =〈g, f〉
Axiom 2: 〈f + g, h〉= ∫₀¹ (f(x) + g(x))h(x) dx = ∫₀¹ f(x)h(x) dx + ∫₀¹ g(x)h(x) dx =〈f, h〉+〈g, h〉
23
We now find the inner product of the functions f(x) = x² + 2x – 1 and g(x) = 4x + 1:
〈f, g〉= ∫₀¹ (x² + 2x – 1)(4x + 1) dx = ∫₀¹ (4x³ + 9x² – 2x – 1) dx = 1 + 3 – 1 – 1 = 2.
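A numerical cross-check of Example 2.3 (not on the slides). It assumes the inner product on Pn used in these examples is ∫₀¹ f(x)g(x) dx; the interval [0, 1] is inferred from the worked answers, since the original formula appears only as an image.

```python
import numpy as np

def ip_poly(f, g, n=100000):
    """<f, g> = integral over [0, 1] of f(x) g(x), via the trapezoid rule."""
    x = np.linspace(0.0, 1.0, n + 1)
    y = f(x) * g(x)
    return (y[0] / 2 + y[1:-1].sum() + y[-1] / 2) / n

f = lambda x: x**2 + 2*x - 1
g = lambda x: 4*x + 1
print(ip_poly(f, g))   # approximately 2.0, agreeing with the integral above
```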
24
Norm of a Vector
The norm of a vector in Rn can be expressed in terms of the dot product: ||v|| = √(v·v). We generalize this definition. Norms in a general vector space do not necessarily have a geometric interpretation, but they are often important in numerical work.
Definition 2.2: Let V be an inner product space. The norm of a vector v is denoted ||v|| and is defined by ||v|| = √〈v, v〉.
25
Example 2.4: Consider the vector space Pn of polynomials with inner product 〈f, g〉= ∫₀¹ f(x)g(x) dx. The norm of a function f generated by this inner product is ||f|| = √〈f, f〉 = √(∫₀¹ [f(x)]² dx). Determine the norm of the function f(x) = 5x² + 1.
Solution
Using the above definition of norm, we get ||f|| = √(∫₀¹ (5x² + 1)² dx) = √(∫₀¹ (25x⁴ + 10x² + 1) dx) = √(28/3). The norm of the function f(x) = 5x² + 1 is √(28/3).
26
Example 2.5: Consider the vector space M22 of 2 × 2 matrices, with u and v defined as in Example 2.2. It is known from Example 2.2 that the function 〈u, v〉= ae + bf + cg + dh is an inner product on M22. The norm of a matrix is then ||u|| = √〈u, u〉 = √(a² + b² + c² + d²).
27
Angle between two vectors
The dot product in Rn was used to define the angle between vectors: the angle between u and v in Rn satisfies cos θ = (u·v)/(||u|| ||v||).
Definition 2.3: Let V be an inner product space. The angle θ between two nonzero vectors u and v in V is given by cos θ = 〈u, v〉/(||u|| ||v||).
28
Angle between two vectors
In R2 we first define cos θ and then prove the Cauchy-Schwarz inequality; in Rn (and in general inner product spaces) we first prove the Cauchy-Schwarz inequality and then define cos θ.
29
Example 2.6: Consider the inner product space Pn of polynomials with inner product 〈f, g〉= ∫₀¹ f(x)g(x) dx. The angle between two nonzero functions f and g is given by cos θ = 〈f, g〉/(||f|| ||g||). Determine the cosine of the angle between the functions f(x) = 5x² and g(x) = 3x.
Solution
We first compute ||f|| and ||g||: ||f|| = √(∫₀¹ 25x⁴ dx) = √5 and ||g|| = √(∫₀¹ 9x² dx) = √3, while 〈f, g〉= ∫₀¹ 15x³ dx = 15/4. Thus cos θ = (15/4)/(√5 √3) = √15/4.
30
Example 2.7: Consider the vector space M22 of 2 × 2 matrices with the inner product 〈u, v〉= ae + bf + cg + dh of Example 2.2. The norm of a matrix is ||u|| = √〈u, u〉, and the angle between u and v is given by cos θ = 〈u, v〉/(||u|| ||v||).
31
Orthogonal Vectors
Definition 2.4: Let V be an inner product space. Two nonzero vectors u and v in V are said to be orthogonal if 〈u, v〉= 0.
Example 2.8: Show that the functions f(x) = 3x – 2 and g(x) = x are orthogonal in Pn with inner product 〈f, g〉= ∫₀¹ f(x)g(x) dx.
Solution: 〈f, g〉= ∫₀¹ (3x – 2)x dx = ∫₀¹ (3x² – 2x) dx = [x³ – x²]₀¹ = 0. Thus the functions f and g are orthogonal in this inner product space.
32
Distance
As with the norm, the concept of distance need not have a direct geometric interpretation. It is, however, useful in numerical mathematics to be able to discuss how far apart various functions are.
Definition 2.5: Let V be an inner product space with norm ||v|| = √〈v, v〉. The distance between two vectors (points) u and v is denoted d(u, v) and is defined by d(u, v) = ||u – v||.
33
Example 2.9: Consider the inner product space Pn of polynomials discussed earlier. Determine which of the functions g(x) = x² – 3x + 5 or h(x) = x² + 4 is closest to f(x) = x².
Solution: d(f, g) = ||f – g|| = ||3x – 5|| = √(∫₀¹ (3x – 5)² dx) = √13, and d(f, h) = ||f – h|| = ||–4|| = √(∫₀¹ 16 dx) = 4. Thus the distance between f and h is 4; as we might suspect, g is closer than h to f.
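The same quadrature idea verifies Examples 2.8 and 2.9; this is a sketch added here, again assuming the inner product ∫₀¹ f(x)g(x) dx.

```python
import numpy as np

def ip(f, g, n=100000):
    """<f, g> = integral over [0, 1] of f(x) g(x), via the trapezoid rule."""
    x = np.linspace(0.0, 1.0, n + 1)
    y = f(x) * g(x)
    return (y[0] / 2 + y[1:-1].sum() + y[-1] / 2) / n

norm = lambda f: np.sqrt(ip(f, f))
dist = lambda f, g: norm(lambda x: f(x) - g(x))

f = lambda x: x**2
print(ip(lambda x: 3*x - 2, lambda x: x))     # approximately 0: 3x - 2 and x are orthogonal
print(dist(f, lambda x: x**2 - 3*x + 5))      # approximately sqrt(13) = 3.606
print(dist(f, lambda x: x**2 + 4))            # approximately 4.0
```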
34
3.Gram-Schmidt Orthogonalization
Given a nonzero vector s, any vector v in an inner product space V can be decomposed as v = proj_s(v) + (v - proj_s(v)), where proj_s(v) = (〈v, s〉/〈s, s〉)s and the second term is orthogonal to s.
Definition 3.1 (Mutually Orthogonal Vectors): Vectors v1, …, vk ∈ V are mutually orthogonal if vi · vj = 0 for all i ≠ j.
Theorem 3.1: A set of mutually orthogonal nonzero vectors is linearly independent.
Proof: Suppose c1v1 + … + ckvk = 0. Taking the inner product of both sides with vj leaves only cj〈vj, vj〉= 0, hence cj = 0 for every j.
35
3.Gram-Schmidt Orthogonalization
Corollary 3.1: A set of k mutually orthogonal nonzero vectors in a k-dimensional space V is a basis for the space.
Definition 3.2 (Orthogonal Basis): An orthogonal basis for a vector space is a basis of mutually orthogonal vectors.
Definition 3.3 (Orthonormal Basis): An orthonormal basis for a vector space is a basis of mutually orthogonal vectors of unit length.
Definition 3.4 (Orthogonal Complement): The orthogonal complement of a subspace M of R3 is M⊥ = { v ∈ R3 | v is perpendicular to all vectors in M } (read "M perp"). The orthogonal projection projM(v) of a vector is its projection into M along M⊥.
36
Lemma 3.1: Let M be a subspace of Rn. Then M⊥ is also a subspace and Rn = M ⊕ M⊥. Hence, for every v ∈ Rn, v - projM(v) is perpendicular to all vectors in M.
Proof: Construct bases for M and M⊥ using Gram-Schmidt orthogonalization.
Theorem 3.2: Let v be a vector in Rn and let M be a subspace of Rn with basis β1, …, βk. If A is the matrix whose columns are the β's, then projM(v) = c1β1 + … + ckβk, where the coefficients ci are the entries of the vector (ATA)-1ATv. That is, projM(v) = A(ATA)-1ATv.
Proof: Write projM(v) = Ac, where c is a column vector. By Lemma 3.1, v - Ac is perpendicular to every column of A, so AT(v - Ac) = 0. Hence ATAc = ATv and c = (ATA)-1ATv.
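A small numpy sketch of Theorem 3.2 (added here; the spanning vectors below are a hypothetical example, not taken from the slides).

```python
import numpy as np

def proj_matrix(A):
    """P = A (A^T A)^{-1} A^T: orthogonal projection onto the column space of A
    (the columns of A are assumed linearly independent)."""
    return A @ np.linalg.inv(A.T @ A) @ A.T

A = np.array([[1.0, 0.0],          # a subspace M of R^3 spanned by two columns
              [1.0, 1.0],
              [0.0, 1.0]])
P = proj_matrix(A)
v = np.array([1.0, 2.0, 3.0])
print(P @ v)                                        # proj_M(v)
print(np.allclose(P @ P, P), np.allclose(P, P.T))   # True True: idempotent and symmetric
```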
37
Interpretation of Theorem 3.2:
If {β1, …, βk} is an orthonormal basis, then ATA = I, in which case projM(v) = A(ATA)-1ATv = AATv. In particular, if the basis is the standard basis Ek, then A = AT = I. In case the columns of A are not orthonormal, the task is to find C such that B = AC has orthonormal columns, i.e. BTB = I; then projM(v) = BBTv.
38
To orthogonally project into a subspace
Example 3.1: Given a vector v and a basis of a subspace M, form the matrix A whose columns are the basis vectors. From Theorem 3.2 we get projM(v) = A(ATA)-1ATv.
39
Exercises 3
1. Perform the Gram-Schmidt process on the given basis for R3.
2. Show that the columns of an n × n matrix form an orthonormal set if and only if the inverse of the matrix is its transpose. Produce such a matrix.
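A minimal sketch of the Gram-Schmidt process for Exercise 1 (the input basis below is a stand-in, since the slide's basis is not reproduced in this text); it also illustrates Exercise 2, because the resulting matrix satisfies QᵀQ = I.

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: orthonormalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for b in basis:
            w -= np.dot(w, b) * b            # remove the component along each earlier vector
        basis.append(w / np.linalg.norm(w))  # normalize to unit length
    return np.column_stack(basis)

Q = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])   # a stand-in basis of R^3
print(np.allclose(Q.T @ Q, np.eye(3)))                # True: columns orthonormal, so Q^{-1} = Q^T
```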
40
4. Projection Into a Subspace
Definition 4.1: For any direct sum V = M ⊕ N and any v ∈ V such that v = m + n with m ∈ M and n ∈ N, the projection of v into M along N is defined as E(v) = projM,N(v) = m.
Reminder: M and N need not be orthogonal; there need not even be an inner product defined.
Theorem 4.1: Show that (i) E is linear and (ii) E² = E.
Theorem 4.2: Let E: V → V be linear with E² = E. Then (i) E(u) = u for any u ∈ Im E; (ii) V is the direct sum of the image and kernel of E, i.e. V = Im E ⊕ Ker E; (iii) E is the projection of V into Im E, its image, along Ker E.
41
Projection and Matrices
Let L(V) = {L | L is a linear map from V to itself}. L(V) is a vector space under function addition and scalar multiplication, and dim(L(V)) = n² if dim(V) = n. If we fix a basis in V, there arises a one-to-one correspondence between L(V) and the set of all matrices of order n (the latter set is also a vector space of dimension n²). This result implies that matrices represent linear operators.
42
Orthogonal Projection
[Figure: the vectors (1,2) and (2,1) in the X-Y plane.]
We form two vector spaces V1, V2 by multiplying the vectors (1,2) and (2,1) by k, where k ∈ R: V1 = {k(1,2)} and V2 = {k(2,1)}.
43
Orthogonal Projection
Now we can write the vector space R2 as V1 ⊕ V2. Let x = (x1, x2) be any vector of R2 and k1, k2 ∈ R; then we can write x = k1(1,2) + k2(2,1). Solving the two equations gives k1 = (2x2 - x1)/3 and k2 = (2x1 - x2)/3.
44
Orthogonal Projection
Now let P be a projection matrix and x a vector in R2; the projection from R2 onto V1 (along V2) is given by Px = k1(1,2) = (-x1/3 + 2x2/3)(1,2), that is, P = (1/3)[[-1, 2], [-2, 4]].
Ex. 1) Check that P is idempotent but not symmetric. Why? 2) Prove that if the second vector is replaced by one orthogonal to (1,2) (for example (2,-1)), then P is idempotent as well as symmetric.
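A quick check of the two exercises above (added here), using the matrix P derived from Px = k1(1,2) and an orthogonal counterpart built from (1,2) alone.

```python
import numpy as np

P = np.array([[-1.0, 2.0],
              [-2.0, 4.0]]) / 3.0                    # oblique projection onto V1 along V2
print(np.allclose(P @ P, P))                         # True: idempotent
print(np.allclose(P, P.T))                           # False: not symmetric (oblique projection)

s = np.array([[1.0], [2.0]])
Q = (s @ s.T) / float(s.T @ s)                       # orthogonal projection onto span{(1, 2)}
print(np.allclose(Q @ Q, Q), np.allclose(Q, Q.T))    # True True: idempotent and symmetric
```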
45
Meaning of Pn×n xn×1 Case 1: Pn×n is singular but not idempotent
The whole space Rn is mapped onto the column space of Pn×n, a proper subspace of Rn. A vector of the subspace may be mapped to another vector of the subspace.
46
Meaning of Pn×n xn×1. Case 2: Pn×n is singular and idempotent (asymmetric). The whole space Rn is mapped onto the column space of Pn×n, a proper subspace of Rn. A vector of the subspace is mapped to the same vector of the subspace, but Px is not orthogonal to x - Px (an oblique projection).
47
Meaning of Pn×n xn×1. Case 3: Pn×n is singular and idempotent (symmetric). The whole space Rn is mapped onto the column space of Pn×n, a proper subspace of Rn. A vector of the subspace is mapped to the same vector of the subspace. This is orthogonal projection; that is, the subspace is orthogonal to its complement. For example, with P = (1/2)[[1, 1], [1, 1]], Px = ((x1 + x2)/2, (x1 + x2)/2) and Px is orthogonal to x - Px.
48
Meaning of Pn×n xn×1 Case 4: Pn×n is non-singular and non-orthogonal
Meaning: The whole space Rn is mapped onto the column space of Pn×n, which is all of Rn. The mapping is one-to-one and onto. The columns of Pn×n now serve as a new (oblique) basis in place of the standard basis. Angles between vectors and lengths of vectors are not preserved in general.
49
Meaning of Pn×n xn×1. Case 5: Pn×n is non-singular and orthogonal. The whole space Rn is mapped onto the column space of Pn×n, which is all of Rn. The mapping is one-to-one and onto. The columns of Pn×n now serve as a new (orthonormal) basis in place of the standard basis. Angles between vectors and lengths of vectors are preserved: we have only a rotation of axes. From a symmetric matrix we always obtain such a P from its n independent eigenvectors, and we always obtain a symmetric idempotent projection from any r (< n) of these independent eigenvectors.
50
Projection Theorem In a Hilbert Space
Let M be a closed subspace of a Hilbert space H. There exists a unique pair of mappings P: H → M and Q: H → M⊥ such that x = Px + Qx for all x ∈ H. P and Q have the following properties:
i) x ∈ M ⇒ Px = x, Qx = 0
ii) x ∈ M⊥ ⇒ Px = 0, Qx = x
iii) Px is the closest vector in M to x
iv) Qx is the closest vector in M⊥ to x
v) ||Px||² + ||Qx||² = ||x||²
vi) P and Q are linear maps and P² = P, Q² = Q
Finite-dimensional subspaces are always closed.
51
Applications
What is the common characteristic (structure) among the following statistical methods?
1. Principal components analysis
2. (Ridge) regression
3. Fisher discriminant analysis
4. Canonical correlation analysis
5. Singular value decomposition
6. Independent component analysis
We consider linear combinations of the input vector and make use of the concepts of length and dot product available in Euclidean space.
52
What is feature reduction?
Original data → (linear transformation) → reduced data
53
Dimensionality Reduction
One approach to dealing with high-dimensional data is to reduce their dimensionality: project the high-dimensional data onto a lower-dimensional subspace using linear or non-linear transformations.
54
Principal Component Analysis (PCA)
Find a basis in a low-dimensional subspace and approximate vectors by projecting them onto that subspace:
(1) Original space representation: x = a1v1 + a2v2 + … + aNvN
(2) Lower-dimensional subspace representation: x̂ = b1u1 + b2u2 + … + bKuK
Note: if K = N, then x̂ = x and there is no loss of information.
55
Principal Component Analysis (PCA)
Information loss: dimensionality reduction implies information loss. PCA preserves as much information as possible, i.e. it keeps the reconstruction error ||x - x̂|| as small as possible.
What is the "best" lower-dimensional subspace? The "best" low-dimensional space is centered at the sample mean and has directions determined by the "best" eigenvectors of the covariance matrix of the data x. By "best" eigenvectors we mean those corresponding to the largest eigenvalues (i.e., the "principal components"). Since the covariance matrix is real and symmetric, these eigenvectors are orthogonal and form a set of basis vectors.
56
Principal Component Analysis (PCA)
Methodology
Suppose x1, x2, ..., xM are N × 1 vectors.
Step 1: compute the sample mean x̄ = (1/M) Σ xi.
Step 2: subtract the mean: Φi = xi - x̄.
Step 3: form the N × M matrix A = [Φ1 Φ2 … ΦM] and compute the covariance matrix C = (1/M) A Aᵀ.
57
Principal Component Analysis (PCA)
Methodology – cont.
Step 4: compute the eigenvalues of C: λ1 > λ2 > … > λN.
Step 5: compute the corresponding eigenvectors u1, u2, …, uN.
Step 6: since C is symmetric, u1, …, uN form a basis, so any x can be written as x - x̄ = Σ bi ui with bi = uiᵀ(x - x̄).
58
Principal Component Analysis (PCA)
Linear transformation implied by PCA: the linear transformation RN → RK that performs the dimensionality reduction is y = (b1, …, bK)ᵀ = Uᵀ(x - x̄), where U = [u1 u2 … uK] holds the K chosen eigenvectors as columns.
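A minimal numpy sketch of the methodology and the transformation just described (added here; the data and names are illustrative, not from the slides).

```python
import numpy as np

def pca(X, K):
    """PCA on the rows of X (M samples of dimension N). Returns the sample mean,
    the top-K eigenvectors of the covariance matrix, and the K-dimensional scores."""
    mean = X.mean(axis=0)
    Phi = X - mean                            # mean-centred data (Step 2)
    C = Phi.T @ Phi / X.shape[0]              # covariance matrix (Step 3)
    eigvals, eigvecs = np.linalg.eigh(C)      # C is symmetric, so eigenvectors are orthogonal
    order = np.argsort(eigvals)[::-1]         # sort by decreasing eigenvalue (Steps 4-5)
    U = eigvecs[:, order[:K]]                 # N x K matrix of "best" eigenvectors
    B = Phi @ U                               # scores b_i = u_i^T (x - mean) (Step 6)
    return mean, U, B

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                 # illustrative data: M = 200 samples in R^5
mean, U, B = pca(X, K=2)
X_hat = mean + B @ U.T                        # reconstruction from the 2-D representation
print(B.shape, np.linalg.norm(X - X_hat))     # reconstruction error due to the discarded components
```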
59
Principal Component Analysis (PCA)
[Figure: eigenvalue spectrum, λi plotted against index i, with a cutoff at K of the N eigenvalues.]
60
Principal Component Analysis (PCA)
What is the error due to dimensionality reduction? It can be shown that the mean squared error due to dimensionality reduction equals the sum of the eigenvalues of the discarded components, λK+1 + … + λN.
61
Principal Component Analysis (PCA)
Standardization: the principal components are dependent on the units used to measure the original variables as well as on the range of values they assume. We should always standardize the data prior to using PCA. A common standardization method is to transform all the data to have zero mean and unit standard deviation: replace each value x by (x - x̄)/s, where x̄ and s are the mean and standard deviation of the corresponding variable.
62
Principal Component Analysis (PCA)
Case Study: Eigenfaces for Face Detection/Recognition. M. Turk, A. Pentland, "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
Face Recognition: the simplest approach is to treat it as a template matching problem. Problems arise when performing recognition in a high-dimensional space. Significant improvements can be achieved by first mapping the data into a lower-dimensional space. How do we find this lower-dimensional space?
63
Principal Component Analysis (PCA)
Main idea behind eigenfaces: represent each face as the average face plus a weighted combination of eigenfaces. [Figure: the average face.]
64
Principal Component Analysis (PCA)
Computation of the eigenfaces – cont. Note that each eigenface ui is normalized to unit length.
65
Principal Component Analysis (PCA)
Computation of the eigenfaces – cont.
66
Principal Component Analysis (PCA)
Representing faces onto this basis
67
Principal Component Analysis (PCA)
Representing faces onto this basis – cont.
68
Principal Component Analysis (PCA)
Face Recognition Using Eigenfaces
69
Principal Component Analysis (PCA)
Face Recognition Using Eigenfaces – cont. The distance er is called the distance within the face space (difs). Comment: we can use the common Euclidean distance to compute er; however, it has been reported that the Mahalanobis distance, which weights the squared difference in each coefficient bi by the reciprocal of the corresponding eigenvalue λi, performs better.
70
Principal Component Analysis (PCA)
Face Detection Using Eigenfaces
71
Principal Component Analysis (PCA)
Face Detection Using Eigenfaces – cont.
72
Principal Component Analysis (PCA)
Reconstruction of faces and non-faces
73
Principal Component Analysis (PCA)
Applications: face detection, tracking, and recognition (using the distance from face space, dffs).
74
Principal Components Analysis
So, the principal components are given by:
b1 = u11x1 + u12x2 + … + u1NxN
b2 = u21x1 + u22x2 + … + u2NxN
...
bN = uN1x1 + uN2x2 + … + uNNxN
The xj's are standardized if the correlation matrix is used (mean 0.0, SD 1.0).
75
Principal Components Analysis
Score of the ith unit on the jth principal component: bi,j = uj1xi1 + uj2xi2 + … + ujNxiN
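A short sketch (added here) of the score computation on standardized data, using the correlation matrix as the slides suggest; the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))                          # synthetic data: 50 units, 4 variables
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)      # standardize to mean 0, SD 1
R = np.corrcoef(Z, rowvar=False)                      # correlation matrix
eigvals, eigvecs = np.linalg.eigh(R)
U = eigvecs[:, ::-1].T                                # row j = u_j, eigenvalues in decreasing order
scores = Z @ U.T                                      # b_{i,j} = u_{j1} x_{i1} + ... + u_{jN} x_{iN}
print(scores.shape)
print(eigvals[::-1], eigvals.mean())                  # eigenvalues of a correlation matrix average to 1
```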
76
PCA Scores. [Figure: data points plotted on the original axes xi1, xi2, with the principal component scores bi,1, bi,2 measured along the rotated principal axes.]
77
Principal Components Analysis
Amount of variance accounted for by:
1st principal component: λ1, the 1st eigenvalue
2nd principal component: λ2, the 2nd eigenvalue
...
λ1 > λ2 > λ3 > λ4 > ...
Average λj = 1 (correlation matrix)
78
Principal Components Analysis: Eigenvalues
[Figure: scatter of the data with eigenvalues λ1, λ2 and the first principal axis u1.]
79
Thank you