Visual Attributes in Video

Visual Attributes in Video Marielle Morris May 26, 2017

Project Goals
- Attributes: descriptive labels (e.g. a trotting horse, a man with a pointy nose)
- Identify and track attributes in videos
- Focus on time-dependent traits
- Different from action localization

Project Outline
- Create large-scale video attribute dataset
  - Ideally build on existing, pre-annotated set
  - Amazon Mechanical Turk to tag attributes
- Localize primary object in each frame
  - Maintain in “spatio-temporal tube” across clip
- Train attribute prediction model
  - Predict attribute dynamics in relation to object tube
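One way to picture the "spatio-temporal tube" is as one bounding box per frame for the primary object. The sketch below is illustrative only: the function name `build_tube`, the box layout, and the use of linear interpolation between sparsely annotated frames are all assumptions, not details taken from the slides.

```python
from typing import Dict, List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, ...]


def build_tube(keyframes: Dict[int, Box]) -> List[Tuple[int, Box]]:
    """Return one (frame_index, box) pair per frame between the first and
    last annotated keyframe.

    Boxes on unannotated frames are linearly interpolated between the two
    neighbouring keyframes (an assumption made for this sketch).
    """
    if not keyframes:
        return []
    frames = sorted(keyframes)
    tube: List[Tuple[int, Box]] = []
    for start, end in zip(frames, frames[1:]):
        a, b = keyframes[start], keyframes[end]
        for f in range(start, end):
            t = (f - start) / (end - start)  # 0.0 at start, approaches 1.0 at end
            box = tuple(pa + t * (pb - pa) for pa, pb in zip(a, b))
            tube.append((f, box))
    # The last keyframe is not covered by the loop above, so add it here.
    tube.append((frames[-1], keyframes[frames[-1]]))
    return tube


if __name__ == "__main__":
    # Two annotated frames, one unannotated frame in between.
    for frame, box in build_tube({0: (0, 0, 10, 10), 2: (2, 2, 12, 12)}):
        print(frame, box)
```

A learned tracker could replace the interpolation step; the list-of-boxes representation would stay the same.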

Dataset Survey
Columns: Name, Clips, Annotation, Attributes, Notes
- DAVIS: 150 clips; Obj segmentation; —; Humans, animals, objects, vehicles
- USAA: 100 clips; “Weak” Per-video; Attributes are not per-frame
- SegTrack v2: 14 clips; Small dataset
- FBMS-59: 59 clips; Background sub; Not very diverse, small dataset
- BVSD: Fragmented ground-truth annotations
- CMU: 30 clips; Outlines; Few frames annotated, limited motion
- UMASS: 38 clips; Bad annotations
- VSB100: Every 20th frame annotated
- McGillFaces: 60 clips; Pose; All Faces
- Gesture: 16 clips; Gestures; All Hand Gestures
- Youtube-Faces: 3425 clips; Bounding box
- CMU Panoptic: 65 clips; Skeleton/Pose; All action sequences
- WWW Crowd: 10000 clips; Action/scene; Poor attribute selections
- Youtube-Objects: 126 clips; Object is not in every frame, every 10th
- CRP: 7 clips; Action sequences, high-res, every 10th
- Youtube-BB: 380000 clips; Single object, 23 object types

Current Choice: Youtube-BB
- 380,000 video segments of 15-20s
- Single object bounding boxes at 1 fps
- 23 object types
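The figures on this slide give a rough sense of annotation scale. The following back-of-the-envelope sketch assumes every segment is annotated for its full length at 1 fps, which the slide does not state, so the result should be read as an upper-bound range rather than an actual dataset count.

```python
SEGMENTS = 380_000        # video segments, from the slide
SECONDS_RANGE = (15, 20)  # segment length in seconds, from the slide
ANNOTATION_FPS = 1        # annotated frames per second, from the slide


def annotated_frame_bounds(segments, seconds_range, fps):
    """Return (low, high) bounds on the number of annotated frames,
    assuming every second of every segment carries one box per fps tick."""
    low_s, high_s = seconds_range
    return segments * low_s * fps, segments * high_s * fps


low, high = annotated_frame_bounds(SEGMENTS, SECONDS_RANGE, ANNOTATION_FPS)
print(f"{low:,} to {high:,} annotated frames")
# prints "5,700,000 to 7,600,000 annotated frames"
```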

Attribute Labeling: Single Frame
Choose three of the following attributes:
- White
- Sniffing
- Preparing to jump
- Small
- Crouching

Attribute Annotation: Reel
Choose three of the following attributes:
- White
- Sniffing
- Preparing to jump
- Small
- Crouching
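Responses to a "choose three" task from several workers need to be combined into one label set per item. A minimal sketch of one possible approach, simple majority voting, follows; the function name `consensus_attributes`, the `min_share` threshold, and the sample responses are hypothetical and are not taken from the slides.

```python
from collections import Counter
from typing import Iterable, List, Set


def consensus_attributes(responses: Iterable[Set[str]],
                         min_share: float = 0.5) -> List[str]:
    """Keep attributes picked by more than `min_share` of the workers.

    Each response is the set of attributes one worker selected.
    The result is sorted alphabetically for stable output.
    """
    responses = list(responses)
    if not responses:
        return []
    votes = Counter(attr for picks in responses for attr in picks)
    return sorted(a for a, n in votes.items() if n / len(responses) > min_share)


if __name__ == "__main__":
    # Made-up responses from three workers for one item.
    workers = [
        {"White", "Small", "Sniffing"},
        {"White", "Small", "Crouching"},
        {"White", "Sniffing", "Preparing to jump"},
    ]
    print(consensus_attributes(workers))
    # prints ['Small', 'Sniffing', 'White']
```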

Timeline: Week
- Proof of concept with Amazon Mechanical Turk
- Crowd-source results of single frame vs. reel
- Method: COCO Attributes (Patterson & Hays)
- Economic Labeling Algorithm for annotation
- Formulate plan for attribute generation
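To compare the single-frame and reel conditions, one simple option is to measure how much the label sets collected for the same clip overlap. The sketch below uses Jaccard similarity for this; that metric choice, the function name `jaccard`, and the sample labels are assumptions for illustration, not the method described in the slides.

```python
from typing import Iterable


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity of two label sets: |A and B| / |A or B|.

    Two empty sets are treated as identical (similarity 1.0).
    """
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    # Made-up labels for one clip under each condition.
    single_frame = {"White", "Small", "Crouching"}
    reel = {"White", "Small", "Preparing to jump"}
    print(jaccard(single_frame, reel))
    # prints 0.5
```

A low score on time-dependent attributes would suggest that the reel gives annotators information a single frame does not.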

Timeline: Month
- Prepare YT-BB dataset
- Create Human Intelligence Task for labeling
- Build on the COCO Attributes GitHub repository
- Minor modification: adding image reel to GUI