Outline
- Facial Attributes Analysis
- Animated Pose Templates (APT) for Modeling and Detecting Human Actions
- Unsupervised Structure Learning of Stochastic And-Or Grammars
A Deep Sum-Product Architecture for Robust Facial Attributes Analysis
Motivation:
- An attribute can be estimated from a small facial region
- An occluded region can be inferred from the other regions
- Attributes may indicate the presence or absence of other attributes
Algorithm
- Use a discriminative binary decision tree (DDT) for each attribute.
- Each node of the tree contains a detector (locates the region) and a classifier (determines the presence or absence of the attribute).
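As a rough sketch of this per-attribute structure (the node layout and the stub detector/classifier below are invented for illustration, not the paper's implementation):

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical sketch of one DDT: each node pairs a detector that localizes
# a face region with a classifier that decides attribute presence, and the
# boolean decision selects which child node to visit next.

@dataclass
class DDTNode:
    detector: Callable    # image -> region (e.g., a cropped patch)
    classifier: Callable  # region -> bool (attribute present?)
    present: Optional["DDTNode"] = None  # child followed when True
    absent: Optional["DDTNode"] = None   # child followed when False

def evaluate(node, image):
    """Walk the binary decision tree, collecting the decisions made."""
    decisions = []
    while node is not None:
        region = node.detector(image)
        has_attr = node.classifier(region)
        decisions.append(has_attr)
        node = node.present if has_attr else node.absent
    return decisions

# Toy usage with stub detector/classifier on a fake "image":
root = DDTNode(detector=lambda img: img, classifier=lambda r: sum(r) > 0)
print(evaluate(root, [1, 2, 3]))  # [True]
```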
Sum-Product Tree (SPT)
- Models the joint probability: the value at the root equals the joint probability of the variables.
- All children of a product node are sum nodes; all children of a sum node are product nodes or terminals.
- The edges from a sum node to its children carry weights.
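The evaluation rule above can be sketched as a tiny tree of node classes (a hypothetical representation, not the paper's code):

```python
# Minimal sketch of sum-product tree evaluation: terminals hold leaf
# probabilities, product nodes multiply their children, and sum nodes
# take a weighted sum over their children.

class Terminal:
    def __init__(self, value):
        self.value = value  # probability for the observed variable state

    def eval(self):
        return self.value

class Product:
    def __init__(self, children):
        self.children = children

    def eval(self):
        result = 1.0
        for c in self.children:
            result *= c.eval()
        return result

class Sum:
    def __init__(self, children, weights):
        self.children = children
        self.weights = weights  # mixture weights, summing to 1

    def eval(self):
        return sum(w * c.eval() for w, c in zip(self.weights, self.children))

# Tiny example: a root sum over two products of terminals.
root = Sum(
    [Product([Terminal(0.9), Terminal(0.8)]),
     Product([Terminal(0.2), Terminal(0.1)])],
    [0.6, 0.4],
)
print(root.eval())  # 0.6*0.72 + 0.4*0.02, i.e. about 0.44
```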
Sum-Product Tree (SPT)
With an SPT, we can efficiently infer the values of unobserved variables using MPE (most probable explanation) inference.
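A minimal sketch of MPE inference, assuming the usual max-product trick of replacing each weighted sum with a weighted max and backtracking the winning child (the node layout and attribute labels are invented for illustration):

```python
# Nodes as tuples: ("leaf", prob, label), ("prod", children),
# ("sum", children, weights). MPE replaces each sum with a max and
# collects the leaf labels of the winning branch.

def mpe(node):
    """Return (max-product value, chosen leaf labels)."""
    kind, payload = node[0], node[1:]
    if kind == "leaf":
        value, label = payload
        return value, [label]
    if kind == "prod":
        (children,) = payload
        total, labels = 1.0, []
        for c in children:
            v, l = mpe(c)
            total *= v
            labels += l
        return total, labels
    # sum node: take the max over weighted children instead of the sum
    children, weights = payload
    best_val, best_child = max(
        ((w * mpe(c)[0], c) for w, c in zip(weights, children)),
        key=lambda t: t[0])
    # re-run on the winner to collect its labels (simple, not efficient)
    _, labels = mpe(best_child)
    return best_val, labels

tree = ("sum",
        [("prod", [("leaf", 0.9, "smiling"), ("leaf", 0.8, "eyes_open")]),
         ("prod", [("leaf", 0.2, "frowning"), ("leaf", 0.1, "eyes_closed")])],
        [0.6, 0.4])
value, assignment = mpe(tree)
print(assignment)  # ['smiling', 'eyes_open']
```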
Algorithm
- Transform the DDT into a sum-product tree (SPT) to exploit the interdependencies among regions.
- The model can then handle occlusions even when the training data contains no occlusions.
(Figure: SPT structure, showing separators, clusters, sum nodes, and product nodes)
Algorithm
- Organize all the SPTs into a sum-product network (SPN) to learn the correlations among different attributes; the SPN is learned by EM.
(Figure: the SPN, with three different types of sum weights)
Inference
- Run the region detector with a sliding window
- Locate the region
- Apply the region classifier
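A toy sketch of the sliding-window step (pure Python with an invented score map; in the paper the scores come from the learned detectors):

```python
# Illustrative sketch, not the paper's code: slide a fixed-size window
# over a detector score map, keep the best-scoring location, then hand
# that region to the classifier.

def best_window(score_map, win=3):
    """Return the (row, col) of the highest-scoring win x win window."""
    h, w = len(score_map), len(score_map[0])
    best, best_pos = float("-inf"), (0, 0)
    for y in range(h - win + 1):
        for x in range(w - win + 1):
            s = sum(score_map[y + dy][x + dx]
                    for dy in range(win) for dx in range(win))
            if s > best:
                best, best_pos = s, (y, x)
    return best_pos

# Toy 6x6 score map with a strong 3x3 response at rows 2-4, cols 1-3.
scores = [[0.0] * 6 for _ in range(6)]
for y in range(2, 5):
    for x in range(1, 4):
        scores[y][x] = 1.0
print(best_window(scores))  # (2, 1)
```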
Learning
1) Train a DDT for each attribute
2) Transform each DDT into an SPT
3) Build the SPN and learn it by EM:
- E-step: infer the unobserved data
- M-step: renormalize the parameters
- Prune edges with zero weights
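The M-step bookkeeping described above, renormalizing sum-node weights and pruning zero-weight edges, can be sketched as follows (the dict-based edge representation is invented for illustration):

```python
# Hedged sketch of one M-step update for a single sum node: turn expected
# edge counts into normalized weights and drop edges whose count is zero.

def m_step(expected_counts):
    total = sum(expected_counts.values())
    # Keep only edges with nonzero expected count, renormalized to sum to 1.
    return {edge: c / total for edge, c in expected_counts.items() if c > 0}

print(m_step({"edge_a": 2.0, "edge_b": 0.0, "edge_c": 6.0}))
# {'edge_a': 0.25, 'edge_c': 0.75}
```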
Outline
- Facial Attributes Analysis
- Animated Pose Templates for Modeling and Detecting Human Actions
- Unsupervised Structure Learning of Stochastic And-Or Grammars
Formulation
- Short-term action snippets (2–5 frames): moving pose templates
- Long-term transitions between the pose templates: animated pose templates (APTs)
- Contextual objects
Short-term action snippets
- A moving pose template for each pose = shape template (HOG) + motion template (HOF)
- Jointly models human geometry, appearance, and motion
Moving Pose Template (MPT)
An MPT models appearance (HOG), deformation, and motion (variation of HOF).
Long-term actions
An animated pose template is a sequence of moving pose templates.
Animated Pose Templates
An HMM with:
- Transition probabilities for the MPT labels
- Tracking probabilities for the movement of parts between frames
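A minimal sketch of how such an HMM scores a label sequence, assuming per-frame MPT matching scores in log space (the labels and probabilities below are invented for illustration, not values from the paper):

```python
import math

# Hedged sketch of the long-term scoring idea: combine an initial label
# probability, per-frame template matching scores, and label-to-label
# transition probabilities into one log score for the sequence.

def sequence_log_prob(labels, init, trans, frame_scores):
    """log score = log init + per-frame scores + log transition probs."""
    lp = math.log(init[labels[0]]) + frame_scores[0]
    for t in range(1, len(labels)):
        lp += math.log(trans[(labels[t - 1], labels[t])]) + frame_scores[t]
    return lp

init = {"reach": 0.7, "drink": 0.3}
trans = {("reach", "reach"): 0.6, ("reach", "drink"): 0.4,
         ("drink", "drink"): 0.8, ("drink", "reach"): 0.2}
frame_scores = [-0.1, -0.2, -0.1]  # log-space MPT matching scores per frame
print(sequence_log_prob(["reach", "reach", "drink"], init, trans, frame_scores))
```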
Animated Pose Templates (APT)
Animated Pose Templates with Contextual Objects
- Weak objects (e.g., a cigarette or the ground): too small or too diverse to detect directly, so they are represented through the body parts
- Strong objects (e.g., a cup): distinguishable, so they are represented with HOG features
- These objects are treated in the same way as the body parts.
Inference
Learning
Semi-supervised structural SVM:
- Cluster the annotated key frames into pose templates by EM
- For the unannotated frames and the model parameters:
  - Learn the model from the labeled data with a latent SVM (LSVM)
  - Accept high-scoring frames as additional labeled frames
Outline
- Facial Attributes Analysis
- Animated Pose Templates for Modeling and Detecting Human Actions
- Unsupervised Structure Learning of Stochastic And-Or Grammars
Unsupervised Structure Learning
Problem definition: find the grammar G that best explains the training data X.
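In standard MAP notation (not taken verbatim from the slides), this objective is:

```latex
G^{*} = \arg\max_{G} \; p(G \mid X) = \arg\max_{G} \; p(X \mid G)\, p(G)
```

where the likelihood term measures how well the grammar explains the data and the prior favors compact grammars.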
Algorithm Framework
Iteratively introduce new intermediate nonterminal nodes so as to increase the posterior probability of the grammar.
And-Or Fragments
- Learning And-fragments alone fails when the training data is scarce.
- Learning Or-fragments alone decreases the posterior probability.
- With And-Or fragments, And-rules and Or-rules are learned in a more unified manner.
Posterior Gain
- Likelihood gain = change in likelihood × change in the context matrix
- Prior gain: trades off the increase in grammar size against the reduction in the number of configurations
- Posterior gain = likelihood gain × prior gain
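As a toy numeric illustration of this factorization (the numbers are invented):

```python
import math

# The posterior gain of a candidate And-Or fragment is the product of its
# likelihood gain and its prior gain, so the gains can equivalently be
# compared by adding them in log space.

def posterior_gain(likelihood_gain, prior_gain):
    return likelihood_gain * prior_gain

lg, pg = 4.0, 0.5  # likelihood improves 4x, prior shrinks by half
print(posterior_gain(lg, pg))       # 2.0, so the fragment is accepted
print(math.log(lg) + math.log(pg))  # the same comparison in log space
```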