1
Linear Algebra Primer Juan Carlos Niebles and Ranjay Krishna
Stanford Vision and Learning Lab Another, very in-depth linear algebra review from CS229 is available here: And a video discussion of linear algebra from EE263 is here (lectures 3 and 4): 10/2/17
2
Outline Vectors and matrices Transformation Matrices Matrix inverse
Basic Matrix Operations Determinants, norms, trace Special Matrices Transformation Matrices Homogeneous coordinates Translation Matrix inverse Matrix rank Eigenvalues and Eigenvectors Matrix Calculus 10/2/17
3
Outline Vectors and matrices Transformation Matrices Matrix inverse
Vectors and matrices are just collections of ordered numbers that represent something: movements in space, scaling factors, pixel brightness, etc. We’ll define some common uses and standard operations on them. Vectors and matrices Basic Matrix Operations Determinants, norms, trace Special Matrices Transformation Matrices Homogeneous coordinates Translation Matrix inverse Matrix rank Eigenvalues and Eigenvectors Matrix Calculus 10/2/17
4
Vector A column vector v ∈ R^(n×1) where v = [v1; v2; …; vn]. A row vector v^T ∈ R^(1×n) where v^T = [v1 v2 … vn]
T denotes the transpose operation 10/2/17
5
Vector We’ll default to column vectors in this class
You’ll want to keep track of the orientation of your vectors when programming in python. You can transpose a vector V in python by writing V.T. (But in class materials, we will always use V^T to indicate transpose, and we will use V’ to mean “V prime”) 10/2/17
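A minimal numpy sketch of this point (the values are illustrative): a 1-D numpy array has no row/column orientation, so an explicit 2-D array is needed for a true column vector.

import numpy as np

v = np.array([1, 2, 3])            # 1-D array: no orientation, v.T is a no-op
print(v.T.shape)                   # (3,)

col = np.array([[1], [2], [3]])    # explicit column vector, shape (3, 1)
row = col.T                        # transpose gives a row vector, shape (1, 3)
print(col.shape, row.shape)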
6
Vectors have two main uses
Vectors can represent an offset in 2D or 3D space; points are just vectors from the origin. Data (pixels, gradients at an image keypoint, etc) can also be treated as a vector; such vectors don’t have a geometric interpretation, but calculations like “distance” can still have value 10/2/17
7
Matrix A matrix A ∈ R^(m×n) is an array of numbers with size m by n, i.e. m rows and n columns. If m = n, we say that A is square. 10/2/17
8
= Images Python represents an image as a matrix of pixel brightnesses
Note that the upper left corner is [y,x] = (0,0) Coordinates: [row, column] 10/2/17
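A small sketch of indexing an image as a matrix (the pixel values below are made up for illustration):

import numpy as np

# Hypothetical 3 x 4 grayscale image, one brightness value per pixel
img = np.array([[  0,  50, 100, 150],
                [ 25,  75, 125, 175],
                [ 50, 100, 150, 200]], dtype=np.uint8)

print(img.shape)    # (3, 4): m rows, n columns
print(img[0, 0])    # upper-left pixel, indexed as [row, column]
print(img[2, 3])    # bottom-right pixel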
9
Color Images Grayscale images have one number per pixel, and are stored as an m × n matrix. Color images have 3 numbers per pixel – red, green, and blue brightnesses (RGB) Stored as an m × n × 3 matrix = 10/2/17
10
Basic Matrix Operations
We will discuss: Addition Scaling Dot product Multiplication Transpose Inverse / pseudoinverse Determinant / trace 10/2/17
11
Matrix Operations Addition Scaling
Addition is element-wise: can only add a matrix with matching dimensions, or a scalar. Scaling by a scalar multiplies every entry 10/2/17
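A quick numpy sketch of addition and scaling (matrices chosen arbitrarily):

import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])
B = np.array([[10., 20.],
              [30., 40.]])

print(A + B)     # element-wise addition: shapes must match
print(A + 5.0)   # adding a scalar adds it to every entry
print(3.0 * A)   # scaling multiplies every entry by the scalar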
12
Vector Norm More formally, a norm is any function f : R^n → R that satisfies 4 properties: Non-negativity: f(x) ≥ 0 for all x ∈ R^n. Definiteness: f(x) = 0 if and only if x = 0. Homogeneity: f(tx) = |t| f(x) for all t ∈ R, x ∈ R^n. Triangle inequality: f(x + y) ≤ f(x) + f(y) for all x, y ∈ R^n. 10/2/17
13
Matrix Operations Example Norms: ||x||_1 = Σ_i |x_i| (L1 norm), ||x||_2 = sqrt(Σ_i x_i^2) (L2 / Euclidean norm), ||x||_∞ = max_i |x_i|. General Lp norms: ||x||_p = (Σ_i |x_i|^p)^(1/p) 10/2/17
14
Matrix Operations Inner product (dot product) of vectors
Multiply corresponding entries of two vectors and add up the result: x·y = x^T y = Σ_i x_i y_i. x·y also equals |x| |y| cos(the angle between x and y) 10/2/17
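A short numpy sketch of the dot product and the angle formula (vectors chosen for illustration):

import numpy as np

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

dot = np.dot(x, y)                                     # sum of element-wise products
cos_theta = dot / (np.linalg.norm(x) * np.linalg.norm(y))
print(dot, np.degrees(np.arccos(cos_theta)))           # 1.0 and 45 degrees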
15
Matrix Operations Inner product (dot product) of vectors
If B is a unit vector, then A·B gives the length of the projection of A onto the direction of B 10/2/17
16
Matrix Operations The product of two matrices: if A is m × n and B is n × p, then C = AB is m × p (the number of columns of A must match the number of rows of B) 10/2/17
17
Matrix Operations Multiplication The product AB is:
Each entry in the result is (that row of A) dot product with (that column of B) Many uses, which will be covered later 10/2/17
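A small numpy sketch of the row-dot-column rule (matrices are illustrative):

import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])          # 2 x 3
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])             # 3 x 2

C = A @ B                          # 2 x 2 result
print(C)
print(C[0, 1] == A[0, :] @ B[:, 1])   # each entry = (row of A) dot (column of B)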
18
Matrix Operations Multiplication example:
Each entry of the matrix product is made by taking the dot product of the corresponding row in the left matrix, with the corresponding column in the right one. 10/2/17
19
Matrix Operations The product of two matrices: C = AB, where each entry is C_ij = Σ_k A_ik B_kj 10/2/17
20
Matrix Operations Powers
By convention, we can refer to the matrix product AA as A^2, and AAA as A^3, etc. Only square matrices can be multiplied with themselves in this way 10/2/17
21
Matrix Operations Transpose – flip matrix, so row 1 becomes column 1
A useful identity: (AB)^T = B^T A^T 10/2/17
22
Matrix Operations Determinant returns a scalar
Represents area (or volume) of the parallelogram described by the vectors in the rows of the matrix. For A = [[a, b], [c, d]], det(A) = ad − bc. Properties: det(AB) = det(A) det(B), det(A^T) = det(A), det(A) = 0 if and only if A is singular 10/2/17
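A quick numpy check of the determinant (example matrix is arbitrary):

import numpy as np

A = np.array([[3., 1.],
              [1., 2.]])
print(np.linalg.det(A))            # about 5.0 = 3*2 - 1*1
# det(AB) = det(A) det(B):
print(np.isclose(np.linalg.det(A @ A), np.linalg.det(A) ** 2))   # True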
23
Matrix Operations Trace
tr(A) is the sum of the diagonal entries: tr(A) = Σ_i A_ii. Invariant to a lot of transformations, so it’s used sometimes in proofs. (Rarely in this class though.) Properties: tr(AB) = tr(BA), tr(A + B) = tr(A) + tr(B) 10/2/17
24
Matrix Operations Vector Norms
Matrix norms: norms can also be defined for matrices, such as the Frobenius norm: ||A||_F = sqrt(Σ_i Σ_j A_ij^2) 10/2/17
25
Special Matrices Identity matrix I Diagonal matrix
Square matrix, 1’s along diagonal, 0’s elsewhere I ∙ [another matrix] = [that matrix] Diagonal matrix Square matrix with numbers along diagonal, 0’s elsewhere A diagonal ∙ [another matrix] scales the rows of that matrix 10/2/17
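A small numpy sketch of the identity and a diagonal matrix acting on another matrix (values are illustrative):

import numpy as np

I = np.eye(2)                      # identity matrix
D = np.diag([2.0, 10.0])           # diagonal matrix
A = np.array([[1., 1.],
              [1., 1.]])

print(np.allclose(I @ A, A))       # I times any matrix returns that matrix
print(D @ A)                       # row 0 scaled by 2, row 1 scaled by 10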
26
Special Matrices Symmetric matrix: A^T = A. Skew-symmetric matrix: A^T = −A 10/2/17
27
Linear Algebra Primer Juan Carlos Niebles and Ranjay Krishna
Stanford Vision and Learning Lab Another, very in-depth linear algebra review from CS229 is available here: And a video discussion of linear algebra from EE263 is here (lectures 3 and 4): 10/2/17
28
Announcements – part 1 HW0 submitted last night HW1 is due next Monday
HW2 will be released tonight. Class notes from last Thursday are due before class, in exactly 48 hours 10/2/17
29
Announcements – part 2 Future homework assignments will be released via github. This will allow you to keep track of changes if they happen. Submissions for HW1 onwards will all be done through Gradescope. No more Corn submissions. You will have separate submissions for the ipython pdf and the python code. 10/2/17
30
Recap - Vector A column vector v ∈ R^(n×1) where v = [v1; v2; …; vn]. A row vector v^T ∈ R^(1×n) where v^T = [v1 v2 … vn]
T denotes the transpose operation 10/2/17
31
Recap - Matrix A matrix A ∈ R^(m×n) is an array of numbers with size m by n, i.e. m rows and n columns. If m = n, we say that A is square. 10/2/17
32
Recap - Color Images Grayscale images have one number per pixel, and are stored as an m × n matrix. Color images have 3 numbers per pixel – red, green, and blue brightnesses (RGB) Stored as an m × n × 3 matrix = 10/2/17
33
Recap - Vector Norm More formally, a norm is any function f : R^n → R that satisfies 4 properties: Non-negativity: f(x) ≥ 0 for all x ∈ R^n. Definiteness: f(x) = 0 if and only if x = 0. Homogeneity: f(tx) = |t| f(x) for all t ∈ R, x ∈ R^n. Triangle inequality: f(x + y) ≤ f(x) + f(y) for all x, y ∈ R^n. 10/2/17
34
Recap – projection Inner product (dot product) of vectors
If B is a unit vector, then A·B gives the length of the projection of A onto the direction of B 10/2/17
35
Outline Vectors and matrices Transformation Matrices Matrix inverse
Basic Matrix Operations Determinants, norms, trace Special Matrices Transformation Matrices Homogeneous coordinates Translation Matrix inverse Matrix rank Eigenvalues and Eigenvectors Matrix Calculus Matrix multiplication can be used to transform vectors. A matrix used in this way is called a transformation matrix. 10/2/17
36
Transformation Matrices can be used to transform vectors in useful ways, through multiplication: x’ = Ax. Simplest is scaling: [[s_x, 0], [0, s_y]] [x, y]^T = [s_x·x, s_y·y]^T (Verify to yourself that the matrix multiplication works out this way) 10/2/17
37
Rotation How can you convert a vector represented in frame “0” to a new, rotated coordinate frame “1”? 10/2/17
38
Rotation How can you convert a vector represented in frame “0” to a new, rotated coordinate frame “1”? Remember what a vector is: [component in direction of the frame’s x axis, component in direction of y axis] 10/2/17
39
Rotation So to rotate it we must produce this vector: [component in direction of new x axis, component in direction of new y axis] We can do this easily with dot products! New x coordinate is [original vector] dot [the new x axis] New y coordinate is [original vector] dot [the new y axis] 10/2/17
40
Rotation Insight: this is what happens in a matrix*vector multiplication Result x coordinate is: [original vector] dot [matrix row 1] So matrix multiplication can rotate a vector p: RED is rotation matrix 10/2/17
41
Rotation Suppose we express a point in the new coordinate system, which is rotated left. If we plot the result in the original coordinate system, we have rotated the point right. Thus, rotation matrices can be used to rotate vectors. We’ll usually think of them in that sense -- as operators to rotate vectors. Left rotation of the axes == right rotation of the point 10/2/17
42
2D Rotation Matrix Formula
Counter-clockwise rotation by an angle θ: x’ = x cos θ − y sin θ, y’ = x sin θ + y cos θ. In matrix form, P’ = R P with R = [[cos θ, −sin θ], [sin θ, cos θ]] (figure: point P rotated to P’) 10/2/17
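A minimal numpy sketch of the rotation formula (angle and point chosen for illustration):

import numpy as np

theta = np.deg2rad(90)                         # counter-clockwise rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

p = np.array([1.0, 0.0])                       # a point on the x axis
print(R @ p)                                   # approximately [0, 1]: rotated onto the y axis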
43
Transformation Matrices
Multiple transformation matrices can be used to transform a point: p’=R2 R1 S p 10/2/17
44
Transformation Matrices
Multiple transformation matrices can be used to transform a point: p’=R2 R1 S p The effect of this is to apply their transformations one after the other, from right to left. In the example above, the result is (R2 (R1 (S p))) 10/2/17
45
Transformation Matrices
Multiple transformation matrices can be used to transform a point: p’=R2 R1 S p The effect of this is to apply their transformations one after the other, from right to left. In the example above, the result is (R2 (R1 (S p))) The result is exactly the same if we multiply the matrices first, to form a single transformation matrix: p’=(R2 R1 S) p 10/2/17
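A quick numpy check that applying the transforms one by one and multiplying the matrices first give the same result (the particular rotations and scale are arbitrary):

import numpy as np

theta = np.deg2rad(45)
R1 = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])
R2 = R1.copy()                     # a second 45-degree rotation
S  = np.diag([2.0, 2.0])           # uniform scaling by 2

p = np.array([1.0, 0.0])
step_by_step = R2 @ (R1 @ (S @ p))
combined     = (R2 @ R1 @ S) @ p
print(np.allclose(step_by_step, combined))     # True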
46
Homogeneous system In general, a matrix multiplication lets us linearly combine components of a vector This is sufficient for scale, rotate, skew transformations. But notice, we can’t add a constant! 10/2/17
47
Homogeneous system The (somewhat hacky) solution? Stick a “1” at the end of every vector: Now we can rotate, scale, and skew like before, AND translate (note how the multiplication works out, above) This is called “homogeneous coordinates” 10/2/17
48
Homogeneous system In homogeneous coordinates, the multiplication works out so the rightmost column of the matrix is a vector that gets added. Generally, a homogeneous transformation matrix will have a bottom row of [0 0 1], so that the result has a “1” at the bottom too. 10/2/17
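A small numpy sketch of translation in homogeneous coordinates (the offsets are illustrative):

import numpy as np

tx, ty = 3.0, -1.0
T = np.array([[1., 0., tx],
              [0., 1., ty],
              [0., 0., 1.]])       # bottom row [0 0 1] keeps the trailing "1"

p = np.array([2.0, 5.0, 1.0])      # the point (2, 5) in homogeneous coordinates
print(T @ p)                       # [5., 4., 1.]  ->  translated point (5, 4)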
49
2D Translation P’ = P + t (figure: point P translated by the vector t to P’) 10/2/17
50
2D Translation using Homogeneous Coordinates
Write P = (x, y) as [x, y, 1]^T; the translation by t = (t_x, t_y) is then a matrix multiplication: P’ = [[1, 0, t_x], [0, 1, t_y], [0, 0, 1]] [x, y, 1]^T = [x + t_x, y + t_y, 1]^T 10/2/17
55
Scaling P’ = S P (figure: point P scaled to P’) 10/2/17
56
Scaling Equation x’ = s_x · x, y’ = s_y · y. In matrix form, P’ = [[s_x, 0], [0, s_y]] P (figure: P = (x, y) scaled to P’ = (s_x x, s_y y)) 10/2/17
59
Scaling & Translating P’=S∙P P’’=T∙P’ P’’=T ∙ P’=T ∙(S ∙ P)= T ∙ S ∙P
10/2/17
60
Scaling & Translating A 10/2/17
61
Scaling & Translating 10/2/17
62
Translating & Scaling versus Scaling & Translating
10/2/17
63
Translating & Scaling != Scaling & Translating
10/2/17
64
Translating & Scaling != Scaling & Translating
10/2/17
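A quick numpy check that the two orderings really differ (matrices and point are illustrative, in homogeneous coordinates):

import numpy as np

S = np.array([[2., 0., 0.],
              [0., 2., 0.],
              [0., 0., 1.]])       # scale by 2
T = np.array([[1., 0., 3.],
              [0., 1., 1.],
              [0., 0., 1.]])       # translate by (3, 1)

p = np.array([1., 1., 1.])         # the point (1, 1)
print(T @ S @ p)                   # scale, then translate: [5., 3., 1.]
print(S @ T @ p)                   # translate, then scale: [8., 4., 1.]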
65
Rotation P’ = R P (figure: point P rotated to P’) 10/2/17
66
Rotation Equations Counter-clockwise rotation by an angle θ: x’ = x cos θ − y sin θ, y’ = x sin θ + y cos θ (figure: P rotated to P’)
10/2/17
67
Rotation Matrix Properties
A 2D rotation matrix is 2x2. Note: R belongs to the category of normal matrices and satisfies many interesting properties, e.g. R·R^T = R^T·R = I and det(R) = 1 10/2/17
68
Rotation Matrix Properties
Transpose of a rotation matrix produces a rotation in the opposite direction The rows of a rotation matrix are always mutually perpendicular (a.k.a. orthogonal) unit vectors (and so are its columns) 10/2/17
69
Scaling + Rotation + Translation
P’= (T R S) P This is the form of the general-purpose transformation matrix 10/2/17
70
Outline Vectors and matrices Transformation Matrices Matrix inverse
Basic Matrix Operations Determinants, norms, trace Special Matrices Transformation Matrices Homogeneous coordinates Translation Matrix inverse Matrix rank Eigenvalues and Eigenvectors Matrix Calculus The inverse of a transformation matrix reverses its effect 10/2/17
71
Inverse Given a matrix A, its inverse A^-1 is a matrix such that AA^-1 = A^-1A = I. The inverse does not always exist. If A^-1 exists, A is invertible or non-singular. Otherwise, it’s singular. Useful identities, for matrices that are invertible: (A^-1)^-1 = A, (AB)^-1 = B^-1 A^-1, (A^-1)^T = (A^T)^-1 10/2/17
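A minimal numpy check of the inverse (example matrix is arbitrary but invertible):

import numpy as np

A = np.array([[2., 1.],
              [1., 3.]])
A_inv = np.linalg.inv(A)

print(np.allclose(A @ A_inv, np.eye(2)))   # A A^-1 = I
print(np.allclose(A_inv @ A, np.eye(2)))   # A^-1 A = I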
72
Matrix Operations Pseudoinverse
Say you have the matrix equation AX=B, where A and B are known, and you want to solve for X 10/2/17
73
Matrix Operations Pseudoinverse
Say you have the matrix equation AX=B, where A and B are known, and you want to solve for X. You could calculate the inverse and pre-multiply by it: A^-1AX = A^-1B → X = A^-1B 10/2/17
74
Matrix Operations Pseudoinverse
Say you have the matrix equation AX=B, where A and B are known, and you want to solve for X. You could calculate the inverse and pre-multiply by it: A^-1AX = A^-1B → X = A^-1B. The Python command would be np.linalg.inv(A) @ B But calculating the inverse for large matrices often brings problems with computer floating-point resolution (because it involves working with very small and very large numbers together). Or, your matrix might not even have an inverse. 10/2/17
75
Matrix Operations Pseudoinverse
Fortunately, there are workarounds to solve AX=B in these situations. And python can do them! Instead of taking an inverse, ask python to solve for X in AX=B directly: np.linalg.solve(A, B) handles square, invertible A. If A is singular or non-square, use the least-squares solver np.linalg.lstsq(A, B) or the pseudoinverse np.linalg.pinv(A) @ B. These return the value of X which comes closest to solving the equation: if there is no exact solution, they return the closest one; if there are many solutions, they return the smallest one 10/2/17
76
Matrix Operations Python example: >> import numpy as np
>> x = np.linalg.solve(A,B) x = 1.0000 10/2/17
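A self-contained sketch of the above (A and B here are hypothetical example values, not the ones from the slide):

import numpy as np

A = np.array([[3., 1.],
              [1., 2.]])
B = A @ np.array([1., 2.])                 # built so the true solution is X = [1, 2]

X = np.linalg.solve(A, B)                  # for square, invertible A
print(X)                                   # [1. 2.]

# For singular or non-square A, least squares / the pseudoinverse still work:
X_ls, *_ = np.linalg.lstsq(A, B, rcond=None)
X_pinv = np.linalg.pinv(A) @ B
print(np.allclose(X_ls, X), np.allclose(X_pinv, X))   # True True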
77
Outline Vectors and matrices Transformation Matrices Matrix inverse
Basic Matrix Operations Determinants, norms, trace Special Matrices Transformation Matrices Homogeneous coordinates Translation Matrix inverse Matrix rank Eigenvalues and Eigenvectors Matrix Calculus The rank of a transformation matrix tells you how many dimensions it transforms a vector to. 10/2/17
78
Linear independence Suppose we have a set of vectors v1, …, vn
If we can express v1 as a linear combination of the other vectors v2…vn, then v1 is linearly dependent on the other vectors. The direction v1 can be expressed as a combination of the directions v2…vn. (E.g. v1 = .7 v2 -.7 v4) 10/2/17
79
Linear independence Suppose we have a set of vectors v1, …, vn
If we can express v1 as a linear combination of the other vectors v2…vn, then v1 is linearly dependent on the other vectors. The direction v1 can be expressed as a combination of the directions v2…vn. (E.g. v1 = .7 v2 -.7 v4) If no vector is linearly dependent on the rest of the set, the set is linearly independent. Common case: a set of vectors v1, …, vn is always linearly independent if each vector is perpendicular to every other vector (and non-zero) 10/2/17
80
Linear independence Linearly independent set Not linearly independent
Geometric interpretation 10/2/17
81
Matrix rank Column/row rank Matrix rank
Column rank: the number of linearly independent columns of A. Row rank: the number of linearly independent rows. Column rank always equals row rank, and the matrix rank is this common value 10/2/17
82
Matrix rank For transformation matrices, the rank tells you the dimensions of the output E.g. if rank of A is 1, then the transformation p’=Ap maps points onto a line. Here’s a matrix with rank 1 (an illustrative example): A = [[1, 1], [2, 2]], whose columns are both multiples of (1, 2). Example of how a rank deficient matrix can collapse all the space (any point) into a line: all points get mapped to the line y=2x 10/2/17
83
Matrix rank If an m x m matrix is rank m, we say it’s “full rank”
Maps an m x 1 vector uniquely to another m x 1 vector An inverse matrix can be found If rank < m, we say it’s “singular” At least one dimension is getting collapsed. No way to look at the result and tell what the input was Inverse does not exist Inverse also doesn’t exist for non-square matrices Geometrically: transformation is like a rotation, scale, translation… Singular -> cannot go back to original data 10/2/17
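A small numpy sketch of computing rank (matrices are illustrative):

import numpy as np

full_rank = np.array([[1., 0.],
                      [0., 1.]])
rank_one  = np.array([[1., 1.],
                      [2., 2.]])    # second row is a multiple of the first

print(np.linalg.matrix_rank(full_rank))   # 2: full rank, invertible
print(np.linalg.matrix_rank(rank_one))    # 1: singular, collapses the plane onto a line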
84
Outline Vectors and matrices Transformation Matrices Matrix inverse
Basic Matrix Operations Determinants, norms, trace Special Matrices Transformation Matrices Homogeneous coordinates Translation Matrix inverse Matrix rank Eigenvalues and Eigenvectors(SVD) Matrix Calculus 10/2/17
85
Eigenvector and Eigenvalue
An eigenvector x of a linear transformation A is a non-zero vector that, when A is applied to it, does not change direction. 10/2/17
86
Eigenvector and Eigenvalue
An eigenvector x of a linear transformation A is a non-zero vector that, when A is applied to it, does not change direction. Applying A to the eigenvector only scales the eigenvector by the scalar value λ, called an eigenvalue. 10/2/17
87
Eigenvector and Eigenvalue
We want to find all the eigenvalues of A: Ax = λx, which can be written as: Ax = (λI)x. Therefore: Ax − (λI)x = 0, i.e. (A − λI)x = 0 10/2/17
88
Eigenvector and Eigenvalue
We can solve for the eigenvalues by solving: (A − λI)x = 0. Since we are looking for non-zero x, we can instead solve the above equation as: det(A − λI) = 0 10/2/17
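A quick numpy sketch of computing eigenpairs and checking Av = λv (the matrix is an arbitrary example):

import numpy as np

A = np.array([[2., 0.],
              [0., 3.]])
eigvals, eigvecs = np.linalg.eig(A)        # columns of eigvecs are the eigenvectors

for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))     # True for every eigenpair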
89
Properties The trace of A is equal to the sum of its eigenvalues: tr(A) = Σ_i λ_i
10/2/17
90
Properties The trace of A is equal to the sum of its eigenvalues: tr(A) = Σ_i λ_i
The determinant of A is equal to the product of its eigenvalues 10/2/17
91
Properties The trace of A is equal to the sum of its eigenvalues: tr(A) = Σ_i λ_i
The determinant of A is equal to the product of its eigenvalues The rank of A is equal to the number of non-zero eigenvalues of A. 10/2/17
92
Properties The trace of A is equal to the sum of its eigenvalues: tr(A) = Σ_i λ_i
The determinant of A is equal to the product of its eigenvalues. The rank of A is equal to the number of non-zero eigenvalues of A. The eigenvalues of a diagonal matrix D = diag(d1, …, dn) are just the diagonal entries d1, …, dn 10/2/17
93
Spectral theory We call an eigenvalue λ and an associated eigenvector an eigenpair. The space of vectors x where (A − λI)x = 0 is often called the eigenspace of A associated with the eigenvalue λ. The set of all eigenvalues of A is called its spectrum, λ(A) 10/2/17
94
Spectral theory The magnitude of the largest eigenvalue (in magnitude) is called the spectral radius: ρ(A) = max{ |λ| : λ ∈ C }, where C is the set of all eigenvalues of A 10/2/17
95
Spectral theory The spectral radius is bounded by the (induced) infinity norm of a matrix: ρ(A) ≤ ||A||∞ Proof: Turn to a partner and prove this! 10/2/17
96
Spectral theory The spectral radius is bounded by the (induced) infinity norm of a matrix: ρ(A) ≤ ||A||∞ Proof: Let λ and v be an eigenpair of A: |λ|·||v||∞ = ||λv||∞ = ||Av||∞ ≤ ||A||∞·||v||∞, and dividing by ||v||∞ gives |λ| ≤ ||A||∞ 10/2/17
97
Diagonalization An n × n matrix A is diagonalizable if it has n linearly independent eigenvectors. Most square matrices (in a sense that can be made mathematically rigorous) are diagonalizable: Normal matrices are diagonalizable Matrices with n distinct eigenvalues are diagonalizable Lemma: Eigenvectors associated with distinct eigenvalues are linearly independent. 10/2/17
98
Diagonalization An n × n matrix A is diagonalizable if it has n linearly independent eigenvectors. Most square matrices are diagonalizable: Normal matrices are diagonalizable Matrices with n distinct eigenvalues are diagonalizable Lemma: Eigenvectors associated with distinct eigenvalues are linearly independent. 10/2/17
99
Diagonalization Eigenvalue equation: A V = V D
Where D is a diagonal matrix of the eigenvalues and the columns of V are the corresponding eigenvectors 10/2/17
100
Diagonalization Eigenvalue equation: A V = V D. Assuming all λi’s are unique (so the eigenvectors are linearly independent): A = V D V^-1
Remember that the inverse of an orthogonal matrix is just its transpose; when A is symmetric the eigenvectors are orthogonal, so V^-1 = V^T 10/2/17
101
Symmetric matrices Properties:
For a symmetric matrix A, all the eigenvalues are real. The eigenvectors of A are orthonormal. 10/2/17
102
Symmetric matrices Therefore: A = V D V^T
where D = diag(λ1, …, λn) and V is orthogonal. So, if we wanted to find the vector x that maximizes x^T A x subject to ||x||2 = 1: 10/2/17
103
Symmetric matrices Therefore: A = V D V^T
where D = diag(λ1, …, λn) and V is orthogonal. So, finding the vector x that maximizes x^T A x subject to ||x||2 = 1 is the same as finding the eigenvector that corresponds to the largest eigenvalue. 10/2/17
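A small numpy sketch of this fact (the symmetric matrix is an arbitrary example):

import numpy as np

A = np.array([[2., 1.],
              [1., 3.]])                    # symmetric

eigvals, eigvecs = np.linalg.eigh(A)        # eigh is for symmetric/Hermitian matrices
v_max = eigvecs[:, np.argmax(eigvals)]      # unit eigenvector of the largest eigenvalue

print(v_max @ A @ v_max)                    # x^T A x at that eigenvector...
print(np.max(eigvals))                      # ...equals the largest eigenvalue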
104
Some applications of Eigenvalues
PageRank Schrödinger’s equation PCA 10/2/17
105
Outline Vectors and matrices Transformation Matrices Matrix inverse
Basic Matrix Operations Determinants, norms, trace Special Matrices Transformation Matrices Homogeneous coordinates Translation Matrix inverse Matrix rank Eigenvalues and Eigenvectors(SVD) Matrix Calculus 10/2/17
106
Matrix Calculus – The Gradient
Let a function f : R^(m×n) → R take as input a matrix A of size m × n and return a real value. Then the gradient of f with respect to A is the m × n matrix of partial derivatives ∇_A f(A) 10/2/17
107
Matrix Calculus – The Gradient
Every entry in the matrix is: (∇_A f(A))_ij = ∂f(A)/∂A_ij, so the size of ∇_A f(A) is always the same as the size of A. So if A is just a vector x: ∇_x f(x) = [∂f(x)/∂x_1, …, ∂f(x)/∂x_n]^T 10/2/17
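A short numerical sketch of the gradient definition, using finite differences on an assumed example function f(x) = x^T x (whose gradient is 2x):

import numpy as np

def f(x):
    return float(x @ x)                     # example function: f(x) = x^T x

def numerical_gradient(f, x, eps=1e-6):
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (f(x + e) - f(x - e)) / (2 * eps)   # central difference for df/dx_i
    return grad

x = np.array([1.0, -2.0, 3.0])
print(numerical_gradient(f, x))             # approximately 2x = [2, -4, 6]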
108
Exercise Example: Find: 10/2/17
109
Exercise Example: From this we can conclude that: 10/2/17
110
Matrix Calculus – The Gradient
Properties 10/2/17
111
Matrix Calculus – The Hessian
The Hessian matrix with respect to x, written ∇²_x f(x) or simply as H, is the n × n matrix of partial derivatives 10/2/17
112
Matrix Calculus – The Hessian
Each entry can be written as: H_ij = ∂²f(x) / (∂x_i ∂x_j). Exercise: Why is the Hessian always symmetric? 10/2/17
113
Matrix Calculus – The Hessian
Each entry can be written as: H_ij = ∂²f(x) / (∂x_i ∂x_j). The Hessian is always symmetric, because ∂²f(x) / (∂x_i ∂x_j) = ∂²f(x) / (∂x_j ∂x_i). This is known as Schwarz's theorem: the order of partial derivatives doesn’t matter as long as the second derivative exists and is continuous. 10/2/17
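A numerical sketch of the symmetry claim, using a finite-difference Hessian of an assumed smooth example function (not one from the slides):

import numpy as np

def f(x):
    return x[0] ** 2 * x[1] + np.sin(x[0] * x[1])     # arbitrary smooth function

def numerical_hessian(f, x, eps=1e-4):
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.zeros(n), np.zeros(n)
            e_i[i], e_j[j] = eps, eps
            # central second difference for d^2 f / (dx_i dx_j)
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * eps ** 2)
    return H

H = numerical_hessian(f, np.array([0.5, -1.0]))
print(np.allclose(H, H.T, atol=1e-4))       # True: mixed partials agree (Schwarz)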
114
Matrix Calculus – The Hessian
Note that the Hessian is not the gradient of the whole gradient of a vector (this is not defined). It is actually the gradient of every entry of the gradient of the vector. 10/2/17
115
Matrix Calculus – The Hessian
E.g., the first column is the gradient of ∂f(x)/∂x_1 10/2/17
116
Exercise Example: 10/2/17
117
Exercise 10/2/17
118
Exercise Divide the summation into 3 parts depending on whether:
i == k or j == k 10/2/17
119
Exercise 10/2/17
120
Exercise 10/2/17
121
Exercise 10/2/17
122
Exercise 10/2/17
123
Exercise 10/2/17
124
Exercise 10/2/17
125
Exercise 10/2/17
126
Exercise 10/2/17
127
Exercise 10/2/17
128
What we have learned Vectors and matrices Transformation Matrices
Basic Matrix Operations Special Matrices Transformation Matrices Homogeneous coordinates Translation Matrix inverse Matrix rank Eigenvalues and Eigenvectors Matrix Calculus 10/2/17