Describing People: A Poselet-Based Approach to Attribute Classification Lubomir Bourdev 1,2 Subhransu Maji 1 Jitendra Malik 1 1 EECS U.C. Berkeley 2 Adobe.

1 Describing People: A Poselet-Based Approach to Attribute Classification. Lubomir Bourdev (1,2), Subhransu Maji (1), Jitendra Malik (1). (1) EECS, U.C. Berkeley; (2) Adobe Systems Inc.

2 Goal: Extract attributes from images of people

3 Who has long hair?

4 Who has short pants?

5 Male or female?

6 Prior work on poselets and on attributes

7 Prior work on Poselets. Introduced by [Bourdev and Malik, ICCV09]. Detection with poselets: [Bourdev et al, ECCV10]. Applications: Segmentation [Brox et al, ECCV10] [Maire et al, ICCV11]; Actions [Yang et al, CVPR10] [Maji et al, CVPR11] [Yao et al, ICCV11]; Human parsing [Wang et al, CVPR11]; Semantic contours [Hariharan et al, ICCV11]; Subordinate level categorization [Farrell et al, ICCV11].

12 Prior work on Attributes. Topics: Attributes as intermediate parts; Discovering attributes from text; Discovering attributes from images; Attributes from motion capture; Joint learning of classes & attributes; Image retrieval with attributes; Attributes and actions; Active learning with attributes; Attributes of people; Gender attribute. References: [Cottrell and Metcalfe, NIPS90] [Golomb et al, NIPS90] [Moghaddam & Yang, PAMI02] [Ferrari & Zisserman, NIPS07] [Kumar et al, ECCV08] [Gallagher and Chen, CVPR08] [Cao et al, ACM08] [Lampert et al, CVPR09] [Farhadi et al, CVPR09] [Wang et al, BMVC09] [Wang and Forsyth, ICCV09] [Kumar et al, ICCV09] [Farhadi et al, CVPR10] [Berg et al, ECCV10] [Wang and Mori, ECCV10] [Sigal et al, ECCV10] [Branson et al, ECCV10] [Hwang et al, CVPR11] [Parikh and Grauman, CVPR11] [Douze et al, CVPR11] [Kovashka et al, ICCV11] [Liu et al, CVPR11] [Qiu et al, ICCV11] [Yao et al, ICCV11] [Dhar et al, CVPR11] [Parikh and Grauman, ICCV11] [Siddiquie et al, CVPR11]

23 Poselets for Attribute Classification

24 Male or female?

25 Gender recognition is easier if we factor out the pose

26 Poselets [Bourdev & Malik, ICCV09]

27 Poselets Examples may differ visually but have common semantics

28 How do we train a poselet?

29 Finding correspondences at training time: Given part of a human pose, how do we find a similar pose configuration in the training set?

30 Finding correspondences at training time: We use keypoints to annotate the joints, eyes, nose, etc. of people (labels shown: Left Hip, Left Shoulder).

31 Finding correspondences at training time: Residual Error

32 Training poselet classifiers: 1. Given a seed patch; 2. Find the closest patch for every other person; 3. Sort them by residual error; 4. Threshold them. Residual errors shown: 0.15, 0.20, 0.10, 0.35, 0.15, 0.85.

33 Training poselet classifiers: 1. Given a seed patch; 2. Find the closest patch for every other person; 3. Sort them by residual error; 4. Threshold them; 5. Use them as positive training examples to train a linear SVM with HOG features.
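
Steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not define the residual error, so the sketch assumes it is the normalized least-squares error left after aligning a candidate's keypoints to the seed's keypoints with a 2D similarity transform (translation, rotation, uniform scale). The function names, the `candidates` dictionary, and the threshold value are hypothetical.

```python
def residual_error(seed_pts, cand_pts):
    """Normalized residual after aligning cand_pts to seed_pts.

    ASSUMPTION: alignment is a least-squares 2D similarity transform;
    the slides only name the quantity "residual error".
    Points are (x, y) tuples listed in the same keypoint order.
    """
    # Treat each 2D point as a complex number: a similarity transform is z -> a*z + b.
    p = [complex(x, y) for x, y in seed_pts]
    q = [complex(x, y) for x, y in cand_pts]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)
    p = [z - p_mean for z in p]  # centering removes the translation
    q = [z - q_mean for z in q]
    p_energy = sum(abs(z) ** 2 for z in p)
    q_energy = sum(abs(z) ** 2 for z in q)
    if p_energy == 0 or q_energy == 0:
        return float("inf")  # degenerate configuration, never a match
    # Closed-form least-squares rotation + scale as one complex factor.
    a = sum(zq.conjugate() * zp for zp, zq in zip(p, q)) / q_energy
    return sum(abs(zp - a * zq) ** 2 for zp, zq in zip(p, q)) / p_energy


def select_positives(seed_pts, candidates, threshold):
    """Sort candidates by residual error and keep those under the threshold."""
    scored = sorted((residual_error(seed_pts, pts), name)
                    for name, pts in candidates.items())
    return [(name, err) for err, name in scored if err <= threshold]


seed = [(0, 0), (1, 0), (0, 1)]
candidates = {
    "rotated_copy": [(5, 5), (5, 7), (3, 5)],  # seed rotated 90 degrees, scaled x2, shifted
    "different_pose": [(0, 0), (1, 0), (3, 3)],
}
positives = select_positives(seed, candidates, threshold=0.05)
```

Here `positives` keeps only the candidate whose keypoints match the seed configuration up to a similarity transform; the surviving patches would then serve as the positive examples in step 5.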

34 Attribute Classification Algorithm at Test Time

35 Goal: Extract attributes of this person

36 Input: target person bounds; bounds of other nearby people

37 Step 1: Detect poselet activations [Bourdev et al, ECCV10]

38 Step 2: Cluster the activations [Bourdev et al, ECCV10]

39 Step 3: Predict person bounds [Bourdev et al, ECCV10]

40 Step 4: Identify the correct cluster Max-flow in bipartite graph
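
One way to picture Step 4 is as a matching problem between the given person bounds and the bounds predicted from each cluster. The sketch below is an assumption-laden illustration, not the authors' formulation: it assumes an edge exists when two boxes overlap by at least a hypothetical `min_iou`, and it solves the resulting unit-capacity max-flow problem with the standard augmenting-path method for maximum bipartite matching. All names are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_people_to_clusters(person_boxes, cluster_boxes, min_iou=0.3):
    """Maximum bipartite matching (unit-capacity max-flow) via augmenting paths.

    ASSUMPTION: an edge links a person to a cluster when their boxes
    overlap by at least min_iou; the slides do not specify the edges.
    Returns a dict {person index: cluster index}.
    """
    edges = [[j for j, c in enumerate(cluster_boxes) if iou(p, c) >= min_iou]
             for p in person_boxes]
    cluster_owner = {}  # cluster index -> person index currently holding it

    def try_assign(i, visited):
        for j in edges[i]:
            if j in visited:
                continue
            visited.add(j)
            # Take a free cluster, or re-route the current owner elsewhere.
            if j not in cluster_owner or try_assign(cluster_owner[j], visited):
                cluster_owner[j] = i
                return True
        return False

    for i in range(len(person_boxes)):
        try_assign(i, set())
    return {i: j for j, i in cluster_owner.items()}


people = [(0, 0, 10, 10), (6, 0, 16, 10)]      # target person, nearby person
clusters = [(4, 0, 14, 10), (-3, 0, 7, 10)]    # bounds predicted from two clusters
matches = match_people_to_clusters(people, clusters)
```

In this toy input the target person overlaps both clusters while the nearby person overlaps only the first, so a greedy choice would strand the second person; the augmenting step re-routes the target person to the second cluster.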

41 Poselet Activations: Start with its poselet activations

42 Features (computed from poselet activations): Pyramid HOG; LAB histogram; Skin features (Hands-skin, Legs-skin). [Diagram labels: Poselet patch; Skin mask; Arms mask; B.*C]
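
The `B.*C` label on this slide reads like an elementwise product of two masks. As a rough sketch of how a skin feature such as "hands-skin" could be computed from that product, the Python below multiplies a skin mask by a body-part mask and normalizes by the part area. The normalization, the function name, and the toy masks are assumptions; the slides show only the diagram labels.

```python
def masked_skin_fraction(skin_mask, part_mask):
    """Fraction of the body-part mask that is covered by skin.

    ASSUMPTION: the feature is sum(skin .* part) / sum(part); the slide
    shows only the elementwise product, not the normalization.
    Masks are equally sized 2D lists with values in [0, 1].
    """
    overlap = 0.0
    part_area = 0.0
    for skin_row, part_row in zip(skin_mask, part_mask):
        for s, m in zip(skin_row, part_row):
            overlap += s * m  # elementwise product, accumulated
            part_area += m
    return overlap / part_area if part_area > 0 else 0.0


skin = [[1, 0],
        [1, 1]]
arms = [[1, 1],
        [0, 1]]
fraction = masked_skin_fraction(skin, arms)  # 2 of the 3 arm pixels are skin
```

A high value would suggest bare arms (for example, short sleeves), a low value covered arms.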

43 Attribute Classification Overview: Poselet Activations -> Features -> Poselet-level Attribute Classifiers

44 Attribute Classification Overview: Poselet Activations -> Features -> Poselet-level Attribute Classifiers -> Person-level Attribute Classifiers

45 Attribute Classification Overview: Poselet Activations -> Features -> Poselet-level Attribute Classifiers -> Person-level Attribute Classifiers -> Context-level Attribute Classifiers
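
The three classifier levels can be pictured as a cascade of scoring functions, each consuming the outputs of the previous one. The sketch below is illustrative only: the slides name the levels but not the classifier forms, so it assumes a logistic-squashed linear score at every level, and all weights, poselet names, and attribute names are made up.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def poselet_level_score(features, weights, bias):
    """Score one attribute from the features of a single poselet activation."""
    return sigmoid(sum(f * w for f, w in zip(features, weights)) + bias)


def person_level_score(poselet_scores, poselet_weights, bias):
    """Combine poselet-level scores for one person.

    ASSUMPTION: a weighted sum; poselets that did not fire are simply absent.
    """
    total = bias + sum(poselet_weights[pid] * s for pid, s in poselet_scores.items())
    return sigmoid(total)


def context_level_score(person_scores, context_weights, bias):
    """Rescore one attribute using the person-level scores of all attributes."""
    total = bias + sum(context_weights[a] * s for a, s in person_scores.items())
    return sigmoid(total)


# Hypothetical numbers for a single person and the "long hair" attribute.
long_hair = person_level_score({"face": 0.9, "torso": 0.6},
                               {"face": 2.0, "torso": 1.0}, bias=-1.5)
final = context_level_score({"long_hair": long_hair, "female": 0.7},
                            {"long_hair": 1.0, "female": 1.0}, bias=-0.5)
```

The context level is where correlated attributes can reinforce each other, which is why it takes the scores of every attribute as input rather than only the one being predicted.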

46 Results

47 Our dataset. Source: VOC 2010 trainval for Person + H3D. ~8000 annotations (4000 train + 4000 test). 9 binary attributes specified by 5 independent annotators via AMT. Ground truth label: assigned if 4 of the 5 agree. Dataset will be made publicly available.
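
The 4-of-5 agreement rule is easy to state as code. The sketch below assumes each annotator casts a plain True/False vote per attribute and that examples without enough agreement are left unlabeled (`None`); the function name and the vote encoding are hypothetical.

```python
from collections import Counter


def ground_truth_label(votes, min_agree=4):
    """Return the majority label if at least min_agree annotators agree, else None.

    ASSUMPTION: votes are booleans, one per annotator; how the dataset
    treats examples without consensus is not stated in the slides.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agree else None


ground_truth_label([True, True, True, True, False])   # consensus: True
ground_truth_label([True, True, True, False, False])  # only 3 agree: None
```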

48 Visual search on our test set: “Female”, “Wears hat”

49 “Has long hair” “Wears glasses”

50 “Wears shorts” “Has long sleeves”

51 “Doesn’t have long sleeves”

52 Our baseline: Canny-modulated HOG with SPM kernel [Lazebnik et al, CVPR06]. To help the baseline, we trained a separate SPM for each of four viewpoints: full view, head zoom, upper body, legs. For each attribute we pick the best SPM as our baseline.

53 Precision/recall on our test set. [Plot legend: Label frequency (dashed line); SPM; No context; Full Model]
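
For readers who want to reproduce curves of this kind, the sketch below computes precision/recall points from ranked classifier scores, plus a label-frequency baseline and an average precision (AP) summary. It is a generic illustration: the slides do not say which AP variant was used, so the non-interpolated definition here is an assumption, and the scores and labels are made up.

```python
def precision_recall_curve(scores, labels):
    """(precision, recall) after each item of the score-ranked list.

    labels are 1 for positive and 0 for negative examples.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return []
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    curve, true_pos = [], 0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        curve.append((true_pos / rank, true_pos / total_pos))
    return curve


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive example.

    ASSUMPTION: non-interpolated AP; other definitions give other numbers.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    true_pos, precision_sum = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            true_pos += 1
            precision_sum += true_pos / rank
    return precision_sum / total_pos


def label_frequency(labels):
    """Fraction of positives: the expected precision of a random ranking."""
    return sum(labels) / len(labels)


scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
curve = precision_recall_curve(scores, labels)
ap = average_precision(scores, labels)
baseline = label_frequency(labels)
```

A classifier is only informative for an attribute when its curve sits above the label-frequency line.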

54 State-of-the-art Gender Recognition: We outperform Cognitec (top-notch face recognizer). We outperform any gender recognizer based on frontal faces (are there others?). 61% of our test examples have frontal faces. Even with perfect classification of frontal faces, max AP = 80.5% vs. our AP of 82.4%.

55 Confusions: Men most confused as women; women most confused as men. [Image labels: baseball hat; long hair; hair hidden]

56 Short pants most confused with long pants; non-T-shirt most confused with T-shirt. [Image labels: annotation errors; “Are these pants short?”; wrong person; occlusion]

57 Best poselets per attribute: Gender; Long Hair; Wears glasses

58 We can describe a picture of a person: “A woman with long hair, glasses and long pants” (??)

59 Conclusion

60 How poselets help in high-level vision: The image is a complex function of the viewpoint, pose, appearance, etc. Poselets decouple pose and camera view from appearance.

61 Google “poselets” to get: the set of published poselet papers; H3D data set + Matlab tools; Java3D annotation tool + video tutorial; Matlab code to detect people using poselets; our latest trained poselets.

62 Describing people: “A man with short hair, glasses, short sleeves and shorts”; “A man with short hair and long sleeves”; “A person with short hair, no hat and long sleeves”; “A woman with long hair, glasses, short sleeves and long pants”; “A person with long pants”; “A computer vision professor who likes machine learning”. [Image label: Failure mode] Poselets website: http://eecs.berkeley.edu/~lbourdev/poselets (the set of published poselet papers; H3D data set + Matlab tools; Java3D annotation tool + video tutorial; Matlab code to detect people using poselets; our latest trained poselets).

