2 Outline  Facial Attributes Analysis  Animated Pose Templates (APT) for Modeling and Detecting Human Actions  Unsupervised Structure Learning of Stochastic And-Or Grammars

3 Outline  Facial Attributes Analysis  Animated Pose Templates for Modeling and Detecting Human Actions  Unsupervised Structure Learning of Stochastic And-Or Grammars

4 A Deep Sum-Product Architecture for Robust Facial Attributes Analysis  Motivation: An attribute can be estimated from a small facial region An occluded region can be inferred from the other regions Attributes may indicate the presence or absence of other attributes

5 Algorithm  Use a discriminative binary decision tree (DDT) for each attribute. Each node of the tree contains a detector (to locate a region) and a classifier (to determine the presence or absence of the attribute).
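As a rough sketch of the structure this implies (the detector and classifier below are hypothetical placeholders, not the paper's actual models):

```python
from dataclasses import dataclass
from typing import Callable, Optional
import numpy as np

@dataclass
class DDTNode:
    # detector: locates the image region relevant to this node
    detector: Callable[[np.ndarray], np.ndarray]    # image -> region patch
    # classifier: decides presence/absence of the attribute in that region
    classifier: Callable[[np.ndarray], bool]        # patch -> present?
    present: Optional["DDTNode"] = None             # child followed if present
    absent: Optional["DDTNode"] = None              # child followed if absent

def predict(node: DDTNode, image: np.ndarray) -> bool:
    """Walk the tree: detect the region, classify it, recurse on the branch."""
    region = node.detector(image)
    is_present = node.classifier(region)
    child = node.present if is_present else node.absent
    return is_present if child is None else predict(child, image)
```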

6 Sum-Product Tree (SPT)  Models the joint probability: the value of the root equals the joint probability of the variables.  All children of a product node are sum nodes; all children of a sum node are product nodes or terminals. The edges from a sum node to its children carry weights.

7 Sum-Product Tree (SPT)  With an SPT, we can efficiently infer the values of unobserved variables using MPE (most probable explanation) inference.
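A minimal sketch of such a tree, assuming binary indicator leaves; `Leaf`, `Product`, and `Sum` are illustrative names. Evaluating the root gives the joint probability, replacing sums with maxima gives an MPE-style query, and leaving a variable out of the evidence marginalizes it (which is also how an occluded region can be handled at test time):

```python
class Leaf:
    """Indicator leaf for one binary variable taking one value."""
    def __init__(self, var, value):
        self.var, self.value = var, value
    def eval(self, evidence, use_max=False):
        if self.var not in evidence:          # unobserved -> marginalize out
            return 1.0
        return 1.0 if evidence[self.var] == self.value else 0.0

class Product:
    def __init__(self, children):
        self.children = children
    def eval(self, evidence, use_max=False):
        p = 1.0
        for c in self.children:               # product of child values
            p *= c.eval(evidence, use_max)
        return p

class Sum:
    def __init__(self, children, weights):
        self.children, self.weights = children, weights
    def eval(self, evidence, use_max=False):
        terms = [w * c.eval(evidence, use_max)
                 for w, c in zip(self.weights, self.children)]
        # sum for (marginal) probabilities, max for MPE-style inference
        return max(terms) if use_max else sum(terms)

# Joint probability of full evidence: root.eval({...}).
# MPE-style query: evaluate with use_max=True, then backtrack the maximizing
# child at every sum node to read off the values of the unobserved variables.
```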

8 Algorithm  Transform each DDT into a sum-product tree (SPT) to explore the interdependencies of regions. This makes it possible to handle occlusions even when the training data contains no occlusions.

9 Algorithm  Organize all the SPTs into a sum-product network (SPN) to learn the correlations among different attributes (learned by EM). Three different types of sum weights are used.

10 Inference  Run the region detector with a sliding window  Locate the region  Apply the region classifier
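A sketch of that inference loop; `detector_score` and `classify_region` are hypothetical placeholders for the learned detector and classifier:

```python
import numpy as np

def detect_and_classify(image, detector_score, classify_region,
                        win=(64, 64), stride=16):
    """Slide a window over the image, keep the best-scoring location as the
    detected region, then run the attribute classifier on that region."""
    H, W = image.shape[:2]
    wh, ww = win
    best_score, best_box = -np.inf, None
    for y in range(0, H - wh + 1, stride):
        for x in range(0, W - ww + 1, stride):
            patch = image[y:y + wh, x:x + ww]
            s = detector_score(patch)          # hypothetical region detector
            if s > best_score:
                best_score, best_box = s, (y, x, wh, ww)
    y, x, wh, ww = best_box
    return classify_region(image[y:y + wh, x:x + ww])   # presence / absence
```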

11 Learning  1) Train a DDT for each attribute  2) Transform each DDT into an SPT  3) Build the SPN and learn its weights by EM E-step: infer the unobserved data M-step: renormalize the parameters Finally, prune edges with zero weights
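A rough sketch of the M-step and the pruning step under the usual EM treatment of sum-node weights (the exact updates in the paper may differ); `sum_node_counts` stands in for the soft counts produced by the E-step:

```python
import numpy as np

def m_step(sum_node_counts):
    """Renormalize each sum node's weights from its E-step soft counts."""
    new_weights = {}
    for node_id, counts in sum_node_counts.items():
        counts = np.asarray(counts, dtype=float)
        total = counts.sum()
        new_weights[node_id] = counts / total if total > 0 else counts
    return new_weights

def prune_zero_edges(children, weights, tol=1e-8):
    """Drop the children whose learned weight fell to (near) zero."""
    kept = [(c, w) for c, w in zip(children, weights) if w > tol]
    return [c for c, _ in kept], [w for _, w in kept]
```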

12 Outline  Facial Attributes Analysis  Animated Pose Templates for Modeling and Detecting Human Actions  Unsupervised Structure Learning of Stochastic And-Or Grammars

13 Formulation  Short-term action snippets (2–5 frames): moving pose templates  Long-term transitions between the pose templates: APTs  Contextual objects

14 Short-term action snippets  Moving pose template for each pose = shape template (HOG) + motion template (HOF)  Models human geometry, appearance, and motion jointly

16 Moving Pose Template (MPT)  An MPT models appearance (HOG), deformation, and motion (variation of HOF).
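One plausible way to read this is a linear score with appearance, motion, and deformation terms, in the spirit of deformable part models; the sketch below is an assumption rather than the paper's exact energy, and all names are illustrative:

```python
import numpy as np

def mpt_score(hog_feat, hof_feat, part_offsets, w_app, w_mot, w_def, anchors):
    """Score of one moving pose template on one snippet.

    hog_feat, hof_feat : per-part appearance / motion feature vectors
    part_offsets       : observed (dx, dy) displacement of each part
    anchors            : ideal part placements of this template
    """
    app = sum(np.dot(w, f) for w, f in zip(w_app, hog_feat))        # appearance (HOG)
    mot = sum(np.dot(w, f) for w, f in zip(w_mot, hof_feat))        # motion (HOF)
    deform = sum(wd * np.sum((np.asarray(o) - np.asarray(a)) ** 2)  # quadratic deformation
                 for wd, o, a in zip(w_def, part_offsets, anchors))
    return app + mot - deform
```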

17 Long-term actions  An animated pose template is a sequence of moving pose templates.

18 Animated Pose Templates  An HMM model with: Transition probability over the MPT labels Tracking probability for the movement of parts between frames
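Decoding such a model reduces to standard HMM/Viterbi dynamic programming over the MPT labels; a minimal sketch, assuming per-frame MPT log-scores and a label transition matrix are given (this mirrors generic HMM decoding, not the paper's exact formulation):

```python
import numpy as np

def decode_mpt_sequence(frame_scores, log_trans):
    """frame_scores: (T, K) log-score of each MPT label per frame/snippet.
    log_trans: (K, K) log transition probabilities between MPT labels.
    Returns the best label sequence."""
    T, K = frame_scores.shape
    dp = np.full((T, K), -np.inf)
    back = np.zeros((T, K), dtype=int)
    dp[0] = frame_scores[0]
    for t in range(1, T):
        cand = dp[t - 1][:, None] + log_trans      # previous label x current label
        back[t] = cand.argmax(axis=0)
        dp[t] = cand.max(axis=0) + frame_scores[t]
    labels = [int(dp[-1].argmax())]
    for t in range(T - 1, 0, -1):                  # backtrack best path
        labels.append(int(back[t, labels[-1]]))
    return labels[::-1]
```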

19 Animated Pose Templates (APT)

20 Animated Pose Templates with Contextual Objects  Contextual objects Weak objects (e.g. a cigarette or the ground) ○ Too small or too diverse ○ Modeled using body parts Strong objects (e.g. a cup) ○ Distinguishable ○ Modeled using HOG Treat these objects in the same way as the body parts.

22 Inference

23 Learning  Semi-supervised structural SVM Annotated key frames: cluster them into pose templates by EM For the unannotated frames and the model parameters: ○ Learn the model from the labeled data by LSVM ○ Accept high-scoring frames as labeled frames
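A rough sketch of that semi-supervised loop; `extract_pose_features`, `train_latent_svm`, and `score_frame` are hypothetical placeholders, and a Gaussian mixture stands in for the EM clustering of key frames:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def learn_apt(annotated_frames, unannotated_frames, n_templates,
              extract_pose_features, train_latent_svm, score_frame,
              accept_threshold, n_rounds=5):
    """Cluster annotated key frames into pose templates with EM, then
    alternately train the model and promote confidently scored
    unannotated frames to labeled frames."""
    feats = np.stack([extract_pose_features(f) for f in annotated_frames])
    gmm = GaussianMixture(n_components=n_templates).fit(feats)   # EM clustering
    labels = gmm.predict(feats)
    labeled = list(zip(annotated_frames, labels))
    pool = list(unannotated_frames)
    for _ in range(n_rounds):
        model = train_latent_svm(labeled)              # hypothetical LSVM trainer
        promoted = []
        for f in pool:
            label, score = score_frame(model, f)       # hypothetical scorer
            if score > accept_threshold:               # accept high-score frames
                promoted.append((f, label))
        labeled += promoted
        pool = [f for f in pool if all(f is not g for g, _ in promoted)]
    return model
```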

24 Outline  Facial Attributes Analysis  Animated Pose Templates for Modeling and Detecting Human Actions  Unsupervised Structure Learning of Stochastic And-Or Grammars

25 Unsupervised Structure Learning  Problem definition  G is the grammar  X is the training data
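In the usual MAP formulation this amounts to G* = argmax_G p(G | X) ∝ p(G) p(X | G), i.e. finding the grammar with the highest posterior probability given the training data, which is the quantity the following slides try to increase.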

26 Algorithm Framework  Introduce new intermediate nonterminal nodes to increase the grammar's posterior probability.

27 And-Or Fragments  And-fragments alone: fail when training data is scarce.  Or-fragments alone: decrease the posterior probability.  And-Or fragments: And-rules and Or-rules are learned in a more unified manner.

28 Likelihood gain = change in the likelihood × change in the context matrix  Prior gain = increase in grammar size + reduction in the number of configurations  Posterior gain = likelihood gain × prior gain
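A sketch of how these gains could drive the greedy framework from slide 26; `propose_fragments`, the two gain functions, and `add_fragment` are hypothetical placeholders, not the paper's interfaces:

```python
def learn_grammar(grammar, data, propose_fragments,
                  likelihood_gain, prior_gain, min_gain=1.0):
    """Greedy framework: repeatedly add the And-Or fragment with the largest
    posterior gain, stopping when no candidate improves the posterior."""
    while True:
        candidates = propose_fragments(grammar, data)   # hypothetical proposals
        best, best_gain = None, min_gain
        for frag in candidates:
            gain = likelihood_gain(grammar, frag, data) * prior_gain(grammar, frag)
            if gain > best_gain:
                best, best_gain = frag, gain
        if best is None:                                # no fragment improves the posterior
            return grammar
        grammar = grammar.add_fragment(best)            # introduce new nonterminal nodes
```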
