Actor-Object Relation in Videos


1 Actor-Object Relation in Videos
Volodymyr Bobyr and Aayushjungbahadur Rana

2 Task
Input: a video with:
Actors: adult, child, dog
Objects: toys, furniture, etc.
Actions: “holding”, “in front”, “talking to”, etc.
Output: spatial and temporal pixel-perfect localization of actors, objects, and actions
Dataset: VidOR (10,000 video clips)

3 Approach
Convolutional encoder/decoder network:
Encoder backbone: I3D pretrained on Kinetics
Decoder: feature pyramid network with dilated convolutions and side connections
4 stages:
Actor and object spatial segmentation
Centroid detection
Action spatial segmentation
Temporal connection (postprocessing)
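To illustrate the dilated convolutions mentioned for the decoder, here is a minimal NumPy sketch in one dimension. It is not the network's actual layer (the slides do not give kernel sizes or dilation rates); the function name `dilated_conv1d` and the toy kernel are assumptions made for illustration. The point is only that spacing the kernel taps apart widens the receptive field without adding weights.

```python
import numpy as np

def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D cross-correlation with `dilation - 1` gaps between kernel taps."""
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    # Number of input samples covered by one output value.
    span = (len(kernel) - 1) * dilation + 1
    n_out = len(signal) - span + 1
    if n_out <= 0:
        raise ValueError("signal is shorter than the dilated kernel span")
    out = np.zeros(n_out)
    for k, weight in enumerate(kernel):
        start = k * dilation
        out += weight * signal[start:start + n_out]
    return out

x = np.arange(8.0)
dense = dilated_conv1d(x, [1, 1, 1], dilation=1)    # each output sees 3 neighbours
dilated = dilated_conv1d(x, [1, 1, 1], dilation=2)  # each output spans 5 samples
```

With the same three weights, the dilated version covers five input samples per output instead of three, which is why such layers are commonly used in segmentation decoders.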

4 Details
Input: (n_frames, 224, 224, 3)
Output:
Actor/Object Segmentation: (n_frames, 56, 56, 80)
Centroid Detection: (n_frames, 56, 56, 1)
Action Segmentation: (n_frames, 56, 56, 52)
Class Imbalance:
People: 56% of all objects
Background: in every video clip
Solution: class weights
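The slide names class weights as the remedy for the imbalance but does not say how they were computed. One common choice is inverse-frequency weighting, sketched below under that assumption. The function name `inverse_frequency_weights`, the toy label map, and the convention that class 0 is background are all hypothetical.

```python
import numpy as np

def inverse_frequency_weights(labels, n_classes):
    """Weight each class by total / (n_present * count); absent classes get weight 0."""
    counts = np.bincount(labels.ravel(), minlength=n_classes).astype(float)
    present = counts > 0
    weights = np.zeros(n_classes)
    weights[present] = counts.sum() / (present.sum() * counts[present])
    return weights

# Toy label maps at the 56x56 output resolution for 2 frames.
labels = np.zeros((2, 56, 56), dtype=int)  # class 0: background (assumed)
labels[:, 10:40, 10:40] = 1                # a large, frequent class such as a person
labels[:, 45:50, 45:50] = 2                # a small, rare object
weights = inverse_frequency_weights(labels, n_classes=80)
```

Rare classes receive larger weights than frequent ones, so errors on a small object count for more in the loss than errors on background pixels.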

5 IoU Metrics
Mean Intersection over Union among pixels in each frame
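A minimal sketch of a per-frame mean IoU follows. The slide states only that IoU is averaged over pixels in each frame; averaging over classes within a frame, skipping classes absent from both prediction and ground truth, and then averaging over frames are assumptions of this example, as is the function name `mean_iou`.

```python
import numpy as np

def mean_iou(pred, target, n_classes):
    """pred, target: integer class maps of shape (n_frames, H, W)."""
    frame_scores = []
    for p, t in zip(pred, target):
        ious = []
        for c in range(n_classes):
            p_c, t_c = (p == c), (t == c)
            union = np.logical_or(p_c, t_c).sum()
            if union == 0:
                continue  # class absent from both maps: not counted
            ious.append(np.logical_and(p_c, t_c).sum() / union)
        frame_scores.append(np.mean(ious))
    return float(np.mean(frame_scores))

# One 4x4 frame: ground truth has a 2x2 object, the prediction is one column too wide.
target = np.zeros((1, 4, 4), dtype=int)
target[0, :2, :2] = 1
pred = np.zeros((1, 4, 4), dtype=int)
pred[0, :2, :3] = 1
score = mean_iou(pred, target, n_classes=2)
```

Here the object class scores 4/6 and the background class 10/12, so the frame's mean IoU is 0.75.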

6 Data Preparation & Output Example
[Figure panels: original image, augmented image, experimental segmentation output, original centroids, augmented centroids]

7 Experimental Results
In the past:
Loss: Binary Cross-Entropy

8 Experimental Results
Before:
Loss: Categorical Cross-Entropy

9 Experimental Results
Now:
Categorical Cross-Entropy + Augmentation Tweaks
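The results slides contrast two losses without defining them. The NumPy sketch below shows the usual definitions for per-pixel segmentation logits: binary cross-entropy scores each class channel independently through a sigmoid, while categorical cross-entropy normalises across channels with a softmax. The function names, the optional `class_weights` argument, and the toy tensors are assumptions for illustration, not the authors' training code.

```python
import numpy as np

def binary_cross_entropy(logits, one_hot, eps=1e-7):
    """Sigmoid per channel; every channel is an independent yes/no decision."""
    probs = np.clip(1.0 / (1.0 + np.exp(-logits)), eps, 1.0 - eps)
    loss = -(one_hot * np.log(probs) + (1.0 - one_hot) * np.log(1.0 - probs))
    return float(loss.mean())

def categorical_cross_entropy(logits, one_hot, class_weights=None):
    """Softmax over the last axis; exactly one class is assumed per pixel."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    per_pixel = -(one_hot * log_probs).sum(axis=-1)
    if class_weights is not None:
        # Scale each pixel's loss by the weight of its true class.
        per_pixel = per_pixel * (one_hot * class_weights).sum(axis=-1)
    return float(per_pixel.mean())

# Toy case: 1 frame, 2x2 pixels, 4 classes, uninformative (all-zero) logits.
class_ids = np.array([[[0, 1], [2, 3]]])
one_hot = np.eye(4)[class_ids]           # shape (1, 2, 2, 4)
logits = np.zeros_like(one_hot)
bce = binary_cross_entropy(logits, one_hot)
cce = categorical_cross_entropy(logits, one_hot)
weighted_cce = categorical_cross_entropy(logits, one_hot, class_weights=np.full(4, 2.0))
```

With uninformative logits, the binary loss equals log 2 per channel and the categorical loss equals log 4 per pixel, which makes the two easy to sanity-check.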

