Threading Local Kernel Functions: Local versus Holistic Representations. Amnon Shashua, Tamir Hazan. School of Computer Science & Eng., The Hebrew University

1 Threading Local Kernel Functions: Local versus Holistic Representations. Amnon Shashua, Tamir Hazan. School of Computer Science & Eng., The Hebrew University

2 Conventional Signal Representation for Purposes of Machine Learning. Given a training set of labeled examples $(x_1, y_1), \dots, (x_m, y_m)$, find a classification function $f$ that is approximately consistent with the training examples and that "generalizes" well. The instance space is typically a space of 1D signals (vectors). Goal in this work: generalize this concept to the case where the instance space is a space of sets, for example sets of vectors with varying cardinality.

3 Examples for Learning over Sets. Conventional: identify a face from a single image. [Figure: model images and a novel image.] Each picture is represented as a 1D vector.

4 Examples for Learning over Sets, Example I. A better way: identify a face from a collection of images. [Figure: model images and an input set.]

5 Examples for Learning over Sets, Example II. Find an irregular motion trajectory (surveillance application). An instance: a video clip of people in motion over a certain period of time. A possible representation: each trajectory is a column vector of (x,y) positions over time (rows represent time and columns represent people). For example, a clip of 3 people tracked over n time steps is represented by a matrix with 3 columns and 2n rows.
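The matrix representation described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `trajectories_to_matrix` and the choice to interleave x and y for each time step are assumptions, since the slide only states that the matrix has one column per person and 2n rows.

```python
def trajectories_to_matrix(trajectories):
    """Stack k trajectories, each a list of n (x, y) positions, into a
    2n-by-k matrix stored as a list of rows (one column per person).

    Assumed layout: rows 2t and 2t+1 hold x and y at time step t.
    """
    n = len(trajectories[0])
    if any(len(t) != n for t in trajectories):
        raise ValueError("all trajectories must cover the same n time steps")
    # Flatten each trajectory into one column of length 2n.
    columns = [[coord for point in t for coord in point] for t in trajectories]
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


# Hypothetical clip: 3 people observed over n = 2 time steps.
clip = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(5.0, 5.0), (5.0, 6.0)],
    [(2.0, 1.0), (3.0, 3.0)],
]
A = trajectories_to_matrix(clip)  # 2n = 4 rows, 3 columns
```

A clip with a different number of people yields a matrix with a different number of columns, which is exactly the varying-cardinality setting the talk addresses.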

6 Examples for Learning over Sets, Example II. Find an irregular motion trajectory (surveillance application). An instance: a video clip of people in motion over a certain period of time. A possible representation: each trajectory is a column vector of (x,y) positions over time (rows represent time and columns represent people). With each instance matrix $A_i$ associate a label $y_i \in \{1, -1\}$, where $y_i = 1$ if $A_i$ contains an "irregular" trajectory and $y_i = -1$ if it does not.

7 Examples for Learning over Sets, Example II. Find an irregular motion trajectory (surveillance application). Given a training set of examples $(A_1, y_1), \dots, (A_m, y_m)$, find a classification function $f$, where $f(A) = 1$ if $A$ contains an "irregular" trajectory.

8 Examples for Learning over Sets, Example II. Find an irregular motion trajectory (surveillance application). Say we want to use an "optimal margin" classification algorithm like the SVM. The SVM solves a quadratic program, subject to linear constraints, in which the instances appear only through $k(A_i, A_j)$, where $k$ is a positive definite kernel. It is not enough to find "some" measure of similarity on a pair of matrices: it must also form an inner-product space.
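For reference, the requirement on the similarity measure can be read off the standard soft-margin SVM dual, written here for matrix instances $A_i$ with labels $y_i$. This is the textbook formulation and not necessarily the exact form shown on the slide; the multipliers $\alpha_i$, the trade-off constant $C$, and the number of examples $m$ are notation assumed here.

```latex
\max_{\alpha}\;\; \sum_{i=1}^{m} \alpha_i
  \;-\; \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m}
        \alpha_i \alpha_j \, y_i y_j \, k(A_i, A_j)
\qquad \text{subject to} \qquad
0 \le \alpha_i \le C, \quad \sum_{i=1}^{m} \alpha_i y_i = 0 .
```

The objective is concave for every training set only when the Gram matrix $[k(A_i, A_j)]_{ij}$ is positive semi-definite, which is why an arbitrary similarity score over pairs of matrices is not sufficient.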

9 Another Example (will be used as the running example). Holistic representation: measurements form a single vector. Typical: straightforward representation, advanced inference engine (SVM, ...). When the vector is high-dimensional, this places unreasonable demands on the inference engine. Down-sampling blurs the distinction near the boundary, causing loss of accuracy. Sensitive to occlusions, local and global transformations, non-rigidity of local parts, and so on. Local parts representation: measurements form a collection (ordered or unordered) of vectors. Typical: advanced (well thought out) data modeling but a straightforward inference engine (nearest neighbor). Robust to local changes of appearance and occlusions. The number of local parts may vary from one instance to the next.

10 Problem Statement. Local parts representation: measurements form a collection (ordered or unordered) of vectors. Typical: advanced (well thought out) data modeling but a straightforward inference engine (nearest neighbor). Robust to local changes of appearance and occlusions. The number of local parts may vary from one instance to the next. Goal: endow advanced algorithms (SVM, ...) with the ability to work with local representations. Key: the similarity function over instances must be positive definite (one can form a metric from it).

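11 Problem Statement: similarity over sets of varying cardinality. Given two sets of vectors, the columns of $A = [a_1, \dots, a_k]$ and $B = [b_1, \dots, b_q]$ with $a_i, b_j \in R^n$, find a similarity measure $sim(A,B)$ such that: (i) $sim(A,B)$ is an inner product, i.e., $sim(A,B) = \phi(A)^\top \phi(B)$ for some map $\phi$; (ii) $sim(A,B)$ is built over local kernel functions on pairs of columns $(a_i, b_j)$; (iii) the column cardinality (rank) may vary: $n$ is fixed but $k, q$ are arbitrary; (iv) the parameters of $sim(A,B)$ should induce invariance to order, robustness under occlusions, a degree of interaction between local parts, and local transformations.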

12 Existing Approaches. Fit a distribution to the set of local parts (column vectors), followed by a match over distributions such as the KL-divergence (Shakhnarovich, Fisher, Darrell, ECCV02; Kondor, Jebara, ICML03; Vasconcelos, Ho, Moreno, ECCV04). Pros: the number of parts can vary. Cons: the variation could be complex and not likely to fit a known distribution, and the sample size (number of parts) may not be sufficiently large for a fit. Direct (algebraic): Wolf, Shashua, JMLR03 (set cardinality must be equal); Wallraven, Caputo, Graf, ICCV03 (heuristic, not positive definite).

13 Inner-Products over Matrices: Linear Family. Generally, the inner product is $sim(A,B) = vec(A)^\top S \, vec(B)$, where $vec(A), vec(B)$ are the vector representations of $A, B$ via column-wise concatenation and $S$ is the upper-left sub-matrix (of matching size) of some positive definite matrix. We will begin by representing $S$ in a more convenient form allowing explicit access to the columns of $A, B$.

14 Inner-Products over Matrices: Linear Family. We will begin by representing the matrix $S$ of the inner product $vec(A)^\top S \, vec(B)$ in a more convenient form allowing explicit access to the columns of $A, B$. Assume for now linear local kernels: $k(a_i, b_j) = a_i^\top b_j$. Any $S$ can be represented as a sum over tensor products, $S = \sum_r G_r \otimes F_r$, where $G_r, F_r$ are matrices.

15 Tensor Product Space. $e_1, \dots, e_n$ is a basis of $V$. $\{e_i \otimes e_j\}$ is a basis (in reverse lexicographical order) of the vector space $V \otimes V$. $V \otimes V$ is the $n^2$-dimensional dual space of the space of all multilinear functions of pairs of vectors from $V$. If $G, F$ are linear maps from $V$ to itself, then $G \otimes F$ is a linear map from $V \otimes V$ to itself.

16 Inner-Products over Matrices: Linear Family. We will begin by representing the matrix $S$ of the inner product $vec(A)^\top S \, vec(B)$ in a more convenient form allowing explicit access to the columns of $A, B$. Any $S$ can be represented as a sum over tensor products, $S = \sum_r G_r \otimes F_r$, where $G_r, F_r$ are matrices.

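17 Inner-Products over Matrices: Linear Family. $sim(A,B) = \sum_r \sum_{i,j} (F_r)_{ij} \, a_i^\top G_r b_j$, where $A^\top G_r B$ is of size $k \times q$ with the $(i,j)$ entry equal to $a_i^\top G_r b_j$, and $\sum_r G_r \otimes F_r$ is the upper-left sub-matrix of a positive definite matrix.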

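18 Inner-Products over Matrices: Linear Family. $sim(A,B) = \sum_r \sum_{i,j} (F_r)_{ij} \, a_i^\top G_r b_j$, where $A^\top G_r B$ is of size $k \times q$ with the $(i,j)$ entry equal to $a_i^\top G_r b_j$, and $\sum_r G_r \otimes F_r$ is the upper-left sub-matrix of a positive definite matrix. What are the necessary conditions on $G_r, F_r$ such that $sim(A,B)$ is pos. def.? This is difficult (NP-hard, Gurvits 2003). Sufficient conditions: if both $G_r$ and $F_r$ are p.d., then so is $G_r \otimes F_r$. Another condition: $G_r$ must "distribute with the local kernel".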

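19 Inner-Products over Matrices: Linear Family. $sim(A,B) = \sum_r \sum_{i,j} (F_r)_{ij} \, a_i^\top G_r b_j$, where $A^\top G_r B$ is of size $k \times q$ with the $(i,j)$ entry equal to $a_i^\top G_r b_j$, and $\sum_r G_r \otimes F_r$ is the upper-left sub-matrix of a positive definite matrix. The role of $G_r$ is to perform a local coordinate change (mix the entries of the columns). We will assume that $G_r = I$, and as a result $sim(A,B) = \sum_{i,j} f_{ij} \, k(a_i, b_j)$, where the $f_{ij}$ are part of the upper-left block, with appropriate dimensions, of a fixed p.d. matrix $F$.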

20 Inner-Products over Matrices: Linear Family. $sim(A,B) = \sum_{i,j} f_{ij} \, k(a_i, b_j)$, where the $f_{ij}$ are part of the upper-left block, with appropriate dimensions, of a fixed p.d. matrix $F$. The choice of $F$ determines the type of invariances one can obtain. Examples: (i) parts are assumed aligned: take a weighted sum of local matches; (ii) permutation invariance: take a weighted sum of all matches; (iii) decaying interaction among local parts.
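The three choices of $F$ listed on this slide can be sketched in a few lines. The formulas for $F$ did not survive in the transcript, so the concrete matrices below (identity block, all-ones block, and $f_{ij} = \sigma^{-|i-j|}$) are assumptions chosen to match the verbal descriptions; the function names are hypothetical as well.

```python
def dot(u, v):
    """Linear local kernel between two columns."""
    return sum(x * y for x, y in zip(u, v))


def sim(A_cols, B_cols, F, local_kernel=dot):
    """Weighted superposition of local kernels: sum_ij F[i][j] * k(a_i, b_j)."""
    return sum(
        F[i][j] * local_kernel(a, b)
        for i, a in enumerate(A_cols)
        for j, b in enumerate(B_cols)
    )


def aligned_F(k, q):
    """Parts assumed aligned: only matching indices i == j interact (assumed form)."""
    return [[1.0 if i == j else 0.0 for j in range(q)] for i in range(k)]


def all_pairs_F(k, q):
    """Permutation invariance: every pair of parts interacts equally (assumed form)."""
    return [[1.0] * q for _ in range(k)]


def decaying_F(k, q, sigma=2.0):
    """Interaction decays with index distance |i - j| (assumed form, sigma > 1)."""
    return [[sigma ** -abs(i - j) for j in range(q)] for i in range(k)]


# Two instances with different numbers of parts (k = 2, q = 3), columns in R^2.
A_cols = [[1.0, 0.0], [0.0, 1.0]]
B_cols = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

s_aligned = sim(A_cols, B_cols, aligned_F(2, 3))    # 0.0
s_all = sim(A_cols, B_cols, all_pairs_F(2, 3))      # 4.0
s_decay = sim(A_cols, B_cols, decaying_F(2, 3))     # 1.75

# Reordering the parts of B leaves the all-pairs similarity unchanged.
B_shuffled = [B_cols[2], B_cols[0], B_cols[1]]
s_all_shuffled = sim(A_cols, B_shuffled, all_pairs_F(2, 3))
```

Each of the three matrices is the upper-left block of a positive (semi-)definite matrix, so the resulting `sim` stays within the family described above.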

21 Lifting matrices to higher dimensions: the non-linear family. $sim(A,B) = \sum_{i,j} f_{ij} \, k(a_i, b_j)$ forms a linear super-position of local kernels. Applying $sim$ to lifted matrices, where the lifting maps matrices onto higher-dimensional matrices (possibly in a non-linear manner), will form non-linear super-positions of the local kernels.

22 Lifting matrices to higher dimensions: the non-linear family. There are two generic algebraic ways to "lift" the matrices, by applying symmetric operators on the tensor product space: the l-fold alternating tensor and the l-fold symmetric tensor. These liftings introduce the determinant and permanent operations on blocks of $A^\top B$ (whose entries are $k(a_i, b_j)$).

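23 The Alternating Tensor. The l-fold alternating tensor is a multilinear map of $l$ vectors from $V$, $(x_1, \dots, x_l) \mapsto x_1 \wedge \dots \wedge x_l$, obtained by antisymmetrizing the tensor product $x_1 \otimes \dots \otimes x_l$ over all permutations of its factors. Example: $x_1 \wedge x_2$ is proportional to $x_1 \otimes x_2 - x_2 \otimes x_1$. If $e_1, \dots, e_n$ forms a basis of $V$, then the elements $e_{i_1} \wedge \dots \wedge e_{i_l}$, where $1 \le i_1 < \dots < i_l \le n$, form a basis of the l-fold alternating tensor product of $V$, denoted by $\Lambda^l V$.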

24 The Alternating Tensor: example. Example: n=3, l=2. Let $e_1, e_2, e_3$ be the standard basis of $R^3$. In $\Lambda^2 R^3$ we have 3 such elements: $e_1 \wedge e_2$, $e_1 \wedge e_3$, $e_2 \wedge e_3$.

25 The Alternating Tensor: example

26 The Alternating Tensor: example

27 The Alternating Tensor: example. Note that the basic units are the 2x2 minors of A.

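28 The l-fold Alternating Tensor of a matrix $A$ is a linear map. Its matrix representation $C_l(A)$ is called the "l'th compound matrix". The entry indexed by an $l$-subset of rows $\alpha$ and an $l$-subset of columns $\beta$ has the value $\det(A[\alpha|\beta])$. For an $n \times k$ matrix $A$, $C_l(A)$ has dimensions $\binom{n}{l} \times \binom{k}{l}$, and its entries are equal to the $l \times l$ minors of $A$.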

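29 The l-fold Alternating Tensor. For an $n \times k$ matrix $A$, $C_l(A)$ has dimensions $\binom{n}{l} \times \binom{k}{l}$, and its entries are equal to the $l \times l$ minors of $A$. Finally, from the identity that the alternating tensor of a composition of linear maps is the composition of their alternating tensors, we conclude $C_l(AB) = C_l(A) \, C_l(B)$, which is known as the Binet-Cauchy theorem.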
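The compound matrix and the Binet-Cauchy identity can be checked numerically on a small case. The sketch below is illustrative only: the helper names are hypothetical, subsets are enumerated in lexicographic order (an assumption; any fixed order works as long as it is used consistently), and the determinant uses Laplace expansion, which is adequate only for small blocks.

```python
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def compound(A, l):
    """l'th compound matrix: all l-by-l minors of A, indexed by row and column subsets."""
    row_sets = list(combinations(range(len(A)), l))
    col_sets = list(combinations(range(len(A[0])), l))
    return [
        [det([[A[i][j] for j in cols] for i in rows]) for cols in col_sets]
        for rows in row_sets
    ]


def transpose(A):
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Two 3-by-2 integer matrices, so the check below is exact.
A = [[1, 0], [0, 1], [1, 1]]
B = [[1, 2], [0, 1], [1, 3]]
l = 2

# Binet-Cauchy applied to the product A^T B: C_l(A^T B) = C_l(A)^T C_l(B).
lhs = compound(matmul(transpose(A), B), l)
rhs = matmul(transpose(compound(A, l)), compound(B, l))
```

Here both sides evaluate to the 1-by-1 matrix `[[3]]`: the single 2x2 minor of $A^\top B$ equals the inner product of the vectors of 2x2 minors of $A$ and $B$.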

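30 The l-fold Alternating Kernel. Recall $sim(A,B) = \sum_r \sum_{i,j} (F_r)_{ij} \, a_i^\top G_r b_j$, where $A^\top G_r B$ is of size $k \times q$ with the $(i,j)$ entry equal to $a_i^\top G_r b_j$, and $\sum_r G_r \otimes F_r$ is the upper-left sub-matrix of a positive definite matrix. In particular, when $G_r = I$, $sim(A,B) = \sum_{i,j} f_{ij} \, a_i^\top b_j$. Define the l-fold alternating kernel as $sim(C_l(A), C_l(B))$, where $C_l(A)$ has dimension $\binom{n}{l} \times \binom{k}{l}$, $C_l(B)$ has dimension $\binom{n}{l} \times \binom{q}{l}$, and $F$ is the $\binom{k}{l} \times \binom{q}{l}$ upper-left block of a fixed p.d. matrix.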

31 The l-fold Alternating Kernel. In particular, when $G_r = I$, $sim(C_l(A), C_l(B))$ is a linear weighted combination of the $l \times l$ minors of $A^\top B$. Since the entries of $A^\top B$ are $k(a_i, b_j)$, we get that $sim(C_l(A), C_l(B))$ is a non-linear combination of the local kernels.
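A weighted combination of minors of the local-kernel matrix can be sketched directly. The code below is an illustration under assumptions: the function names and the `match_all_pairs` switch are hypothetical, the weights are restricted to 0 or 1, and the RBF local kernel with its `gamma` parameter is used only as an example of a non-linear local kernel.

```python
import math
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (small blocks only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def rbf(u, v, gamma=0.5):
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(u, v)))


def alternating_kernel(A_cols, B_cols, l, local_kernel, match_all_pairs=False):
    """Sum of l-by-l minors of the matrix K with K[i][j] = k(a_i, b_j).

    match_all_pairs=False: only minors whose row and column index sets coincide
    (matching groups of parts). match_all_pairs=True: every pair of index sets.
    """
    K = [[local_kernel(a, b) for b in B_cols] for a in A_cols]
    total = 0.0
    for I in combinations(range(len(A_cols)), l):
        for J in combinations(range(len(B_cols)), l):
            if match_all_pairs or I == J:
                total += det([[K[i][j] for j in J] for i in I])
    return total


A_cols = [[1.0, 0.0], [0.0, 1.0]]
B_cols = [[0.0, 1.0], [1.0, 0.0]]  # the same two parts in swapped order

same = alternating_kernel(A_cols, A_cols, 2, dot)      # det of the identity: 1.0
swapped = alternating_kernel(A_cols, B_cols, 2, dot)   # swapping columns flips the sign: -1.0
```

Each minor is a polynomial of degree l in the local kernel values, which is the sense in which the lifted similarity is a non-linear combination of local kernels. The sign flip under reordering shows that a single determinant is sensitive to the order of parts.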

32 The l-fold Alternating Kernel: Geometric Interpretation. Let $A = Q_A R_A$ and $B = Q_B R_B$ be the "QR" factorizations of $A$ and $B$. The "QR" factorization of $A$ (and $B$) can be computed using only the local kernel values $k(a_i, a_j)$ (Wolf & Shashua, JMLR'03). Therefore, without loss of generality we can assume that the input matrices fed into $sim$ are unitary (have orthonormal columns).

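33 The l-fold Alternating Kernel: Geometric Interpretation. Therefore, without loss of generality we can assume that the input matrices fed into $sim$ are unitary (have orthonormal columns). The determinant of an $l \times l$ block of $A^\top B$ is equal to the product of the cosines of the principal angles between the corresponding pair of l-dimensional subspaces, spanned by the selected $l$ columns of $A$ and of $B$ (recall that $C_l(A)^\top C_l(B) = C_l(A^\top B)$). Reminder: the cosines of the principal angles between two subspaces with orthonormal bases $Q_A, Q_B$ are the singular values of $Q_A^\top Q_B$.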
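The geometric reading can be verified on a case where the principal angles are known in advance. The sketch below is an assumption-laden illustration: it orthonormalizes the columns with plain Gram-Schmidt rather than the kernel-only QR cited on the slide, handles only 2-dimensional subspaces, and takes the absolute value because the sign of the determinant depends on the orientation of the chosen bases.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(cols):
    """Orthonormalize a list of linearly independent column vectors."""
    basis = []
    for v in cols:
        w = list(v)
        for q in basis:
            c = dot(q, w)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            raise ValueError("columns are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis


def product_of_cosines(A_cols, B_cols):
    """|det(Q_A^T Q_B)| for two 2-dimensional subspaces given by two columns each."""
    QA, QB = gram_schmidt(A_cols), gram_schmidt(B_cols)
    M = [[dot(qa, qb) for qb in QB] for qa in QA]
    return abs(M[0][0] * M[1][1] - M[0][1] * M[1][0])


# span{e1, e2} versus span{e1, cos(t) e2 + sin(t) e3}: principal angles are 0 and t.
theta = math.pi / 3
A_cols = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
B_cols = [[1.0, 0.0, 0.0], [0.0, math.cos(theta), math.sin(theta)]]

value = product_of_cosines(A_cols, B_cols)  # cos(0) * cos(theta)
```

Because the two planes share the direction `e1` and are tilted by `theta` in the orthogonal direction, the product of cosines reduces to `cos(theta)`, which is what the determinant returns up to floating-point error.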

34 The l-fold Alternating Kernel: The role of F. The choice of $F$ determines the type of invariances one can obtain. Examples: (i) sum of products of cosines of principal angles between matching subspaces of $A$ and $B$; (ii) only a subset of matching subspaces is considered (say, a sliding window arrangement); (iii) sum of products of cosines of principal angles between all pairs of subspaces of $A$ and $B$.

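35 The Symmetric Tensor and Kernel. The points in the l-fold symmetric tensor space are defined by symmetrizing the tensor product $x_1 \otimes \dots \otimes x_l$ over all permutations of its factors. The points $e_{i_1} \cdots e_{i_l}$, where $1 \le i_1 \le \dots \le i_l \le n$, form a basis for this space. The analogue of the compound matrix $C_l(A)$ is the "l'th power matrix" of $A$, of dimensions $\binom{n+l-1}{l} \times \binom{k+l-1}{l}$, whose entries are equal (up to normalization) to permanents of $l \times l$ blocks of $A$, with repeated rows and columns allowed.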
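The permanent is the determinant's expansion with every sign set to +1. A minimal sketch follows; the function name is hypothetical, the sum over all permutations is practical only for small blocks, and the normalization used in the entries of the power matrix is not reproduced here.

```python
from itertools import permutations


def permanent(M):
    """Permanent of a square matrix: sum over permutations of products of entries."""
    n = len(M)
    total = 0
    for sigma in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= M[i][sigma[i]]
        total += prod
    return total


p2 = permanent([[1, 2], [3, 4]])                      # 1*4 + 2*3 = 10
p3 = permanent([[1, 1, 1], [1, 1, 1], [1, 1, 1]])     # one term per permutation: 6
```

Unlike the determinant, the permanent does not change sign when two columns are swapped, so kernels built from it respond differently to reordering of the local parts.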

36 Compound Kernels: Example. The "square l-fold alternating kernel" is obtained with an appropriate sparse $F$.

37 Pedestrian Detection using Local Parts. Nine local regions, each providing 22 measurements.

38 Holistic Down-sampled Images

Raw         linear   Poly d=2   Poly d=6   RBF
20x20       78%      83%        84%        86%
32x32       78%      84%        85%        82%
occlusion   73.5%    72%        77%        76.5%

* The % is the ratio of the sum of true positives and true negatives to the total number of test examples.
* An SVM with 4000 training and 4000 test sets was used.

39 Sim(A,B) Kernels

Local kernel   [four sim(A,B) kernels; column headings not preserved]
linear         90.8%   85%   90.6%   88%
RBF            91.2%   85%   90.4%   90%

* The % is the ratio of the sum of true positives and true negatives to the total number of test examples.
* An SVM with 4000 training and 4000 test sets was used.

40 Sim(A,B) Kernels with Occlusions

Local kernel   [two sim(A,B) kernels; column headings not preserved]
linear         62%   87%
RBF            83%   88%

* The % is the ratio of the sum of true positives and true negatives to the total number of test examples.
* An SVM with 4000 training and 4000 test sets was used.

41 END

