Video Analysis Mei-Chen Yeh May 29, 2012

Outline Video representation Motion Actions in Video

Videos A natural video stream is continuous in both the spatial and temporal domains. A digital video stream samples pixels in both domains.

Video processing YCbCr

Video signal representation (1) Composite color signal – R, G, B – Y, Cb, Cr Why Y, Cb, Cr? – Backward compatibility (black-and-white to color TV) – The eye is less sensitive to changes in the Cb and Cr components Luminance (Y) Chrominance (Cb + Cr)
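The RGB-to-YCbCr transform behind this slide is a fixed linear combination; a minimal sketch using the standard BT.601 full-range coefficients (the function name is illustrative, not from the slides):

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert RGB in [0, 255] to YCbCr (BT.601, full range).
    rgb: float array of shape (..., 3)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b               # luma
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b   # blue chroma
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b   # red chroma
    return np.stack([y, cb, cr], axis=-1)
```

Note that any gray pixel (R = G = B) maps to Cb = Cr = 128, which is why the chroma channels carry so little energy on typical content, and why they tolerate subsampling.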

Video signal representation (2) Y is the luma component, and Cb and Cr are the blue and red chroma components.

Sampling formats (1) 4:4:4, 4:2:2 (DVB), 4:1:1 (DV) Slide from Dr. Ding

Sampling formats (2) 4:2:0 (VCD, DVD)
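In 4:2:0 sampling, each chroma plane keeps one sample per 2×2 block of luma samples. A minimal sketch of that subsampling by block averaging (one common choice; the function name is illustrative):

```python
import numpy as np

def subsample_420(chroma):
    """4:2:0-style chroma subsampling: average each 2x2 block.
    chroma: (H, W) array with H and W even; returns (H/2, W/2)."""
    h, w = chroma.shape
    blocks = chroma.reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))
```

With both chroma planes reduced this way, a 4:2:0 frame carries half the samples of a 4:4:4 frame (1 + 1/4 + 1/4 instead of 3 planes at full resolution).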

TV encoding system (1) PAL – Phase Alternating Line, a color encoding system used in broadcast television in large parts of the world. SECAM – (French: Séquentiel Couleur à Mémoire, "sequential color with memory"), an analog color television system first used in France. NTSC – National Television System Committee, the analog television system used in most of North America and South America, and in Burma, South Korea, Taiwan, Japan, the Philippines, and some Pacific island nations and territories.

TV encoding system (2)

Uncompressed bitrate of videos Slide from Dr. Chang
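The bitrate table on this slide is an image, but the arithmetic behind it is just width × height × chroma factor × bit depth × frame rate. A hedged sketch (function name and defaults are illustrative):

```python
def uncompressed_bitrate(width, height, fps, bits_per_sample=8, chroma_factor=1.5):
    """Bits per second of raw video.
    chroma_factor: samples per pixel across Y+Cb+Cr,
    1.5 for 4:2:0, 2.0 for 4:2:2, 3.0 for 4:4:4."""
    return width * height * chroma_factor * bits_per_sample * fps
```

For example, 720×480 at 30 fps in 4:2:0 with 8-bit samples is about 124 Mbit/s uncompressed, which is why video compression is indispensable.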

Outline Video representation Motion Actions in Video

Motion and perceptual organization Sometimes, motion is the foremost cue

Motion and perceptual organization Even poor motion data can evoke a strong percept

Motion and perceptual organization Even poor motion data can evoke a strong percept

Uses of motion Estimating 3D structure Segmenting objects based on motion cues Learning dynamical models Recognizing events and activities Improving video quality (motion stabilization) Compressing videos ……

Motion field The motion field is the projection of the 3D scene motion into the image

Motion field P(t) is a moving 3D point. Velocity of the scene point: V = dP/dt. p(t) = (x(t), y(t)) is the projection of P in the image. Apparent velocity v in the image: vx = dx/dt, vy = dy/dt. These components are known as the motion field of the image.

Motion estimation techniques Based on temporal changes in image intensities Direct methods – Directly recover image motion at each pixel from spatio-temporal image brightness variations – Dense motion fields, but sensitive to appearance variations – Suitable when image motion is small Feature-based methods – Extract visual features (corners, textured areas) and track them over multiple frames – Sparse motion fields, but more robust tracking – Suitable when image motion is large

Optical flow The field of observed 2-D motion vectors Can be caused by – object motions – camera movements – illumination condition changes

Optical flow ≠ the true motion field A motion field can exist with no optical flow (e.g., a rotating uniform sphere); optical flow can appear with no motion field (e.g., shading changes under a moving light source)

Problem definition: optical flow How to estimate pixel motion from image I(x, y, t) to image I(x, y, t+dt)? Solve the pixel correspondence problem – given a pixel in I_t, look for nearby pixels of the same color in I_{t+dt} Key assumptions color constancy: a point in I_t looks the same in I_{t+dt} – For grayscale images, this is brightness constancy small motion: points do not move very far This is called the optical flow problem.

Optical flow constraints (grayscale images) Let's look at these constraints more closely: brightness constancy: I(x+u, y+v, t+dt) = I(x, y, t) small motion (u and v are small) – using Taylor's expansion: I(x+u, y+v, t+dt) ≈ I(x, y, t) + I_x u + I_y v + I_t dt

Optical flow equation Combining these two equations gives I_x u + I_y v + I_t dt = 0 (u, v: displacement vectors). Dividing both sides by dt: I_x (u/dt) + I_y (v/dt) + I_t = 0, i.e., ∇I · v + I_t = 0, where v = (u/dt, v/dt) is the velocity vector and ∇I = (I_x, I_y) is the spatial gradient vector. Known as the optical flow equation.

Q: how many unknowns and equations per pixel? – 2 unknowns, one equation What does this constraint mean? The component of the flow perpendicular to the gradient (i.e., parallel to the edge) is unknown: if (vx, vy) satisfies the equation, so does (vx + u', vy + v') for any (u', v') parallel to the edge, since then ∇I · (u', v') = 0

Q: how many unknowns and equations per pixel? – 2 unknowns, one equation What does this constraint mean? The component of the flow perpendicular to the gradient (i.e., parallel to the edge) is unknown This explains the Barber Pole illusion

The aperture problem Perceived motion

The aperture problem Actual motion

The barber pole illusion

The barber pole illusion

To solve the aperture problem… We need more equations for a pixel. Example – Spatial coherence constraint: pretend the pixel's neighbors have the same (vx, vy) – Lucas & Kanade (1981)
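The Lucas-Kanade idea above can be sketched in a few lines: stack the single-pixel constraint I_x u + I_y v = -I_t over every pixel in a window and solve the over-determined system by least squares. A minimal, single-window sketch (no pyramid, no iteration; function name and window size are illustrative):

```python
import numpy as np

def lucas_kanade(I1, I2, x, y, win=7):
    """Estimate flow (u, v) at pixel (x, y) between frames I1, I2
    by solving Ix*u + Iy*v = -It over a (win x win) window."""
    Ix = np.gradient(I1, axis=1)   # spatial gradient in x
    Iy = np.gradient(I1, axis=0)   # spatial gradient in y
    It = I2 - I1                   # temporal derivative
    r = win // 2
    sl = (slice(y - r, y + r + 1), slice(x - r, x + r + 1))
    A = np.stack([Ix[sl].ravel(), Iy[sl].ravel()], axis=1)
    b = -It[sl].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

When the window contains gradients in two independent directions, A has rank 2 and the system has a unique solution; on a pure edge it degenerates, which is the aperture problem again.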

Outline Video representation Motion Actions in Video – Background subtraction – Recognition of actions based on motion patterns

Using optical flow: recognizing facial expressions Recognizing Human Facial Expression (1994) by Yaser Yacoob, Larry S. Davis

Example use of optical flow: visual effects in films

Slide credit: Birgi Tamersoy

Background subtraction Simple techniques can do OK with a static camera… but it is hard to do perfectly Widely used: – Traffic monitoring (counting vehicles, detecting & tracking vehicles and pedestrians) – Human action recognition (run, walk, jump, squat) – Human-computer interaction – Object tracking

Slide credit: Birgi Tamersoy

Frame differences vs. background subtraction Toyama et al. 1999

Slide credit: Birgi Tamersoy

Pros and cons Advantages: extremely easy to implement and use; fast; background models need not be constant – they can change over time Disadvantages: accuracy of frame differencing depends on object speed and frame rate; the median background model has relatively high memory requirements; setting the global threshold Th… Slide credit: Birgi Tamersoy
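The median-background variant discussed above can be sketched directly: model each pixel's background as its temporal median over a frame buffer, then threshold the absolute difference (function names and the threshold value are illustrative):

```python
import numpy as np

def median_background(frames):
    """Per-pixel temporal median over a buffer of frames (T, H, W)."""
    return np.median(frames, axis=0)

def foreground_mask(frame, background, th=25):
    """Foreground = pixels whose absolute difference from the
    background model exceeds a global threshold th."""
    return np.abs(frame.astype(float) - background) > th
```

The buffer is what drives the memory cost noted on the slide: T full frames must be kept to compute the median, versus one frame for simple differencing.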

Background subtraction with depth How can we select foreground pixels based on depth information? Leap:

Outline Video representation Motion Actions in video – Background subtraction – Recognition of actions based on motion patterns

Motion analysis in video “Actions”: atomic motion patterns -- often gesture- like, single clear-cut trajectory, single nameable behavior (e.g., sit, wave arms) “Activity”: series or composition of actions (e.g., interactions between people) “Event”: combination of activities or actions (e.g., a football game, a traffic accident) Modified from Venu Govindaraju

Surveillance

Interfaces

Human activity in video: basic approaches Model-based action/activity recognition: – Use human body tracking and pose estimation techniques, relate to action descriptions – Major challenge: accurate tracks in spite of occlusion, ambiguity, low resolution Activity as motion, space-time appearance patterns – Describe overall patterns, but no explicit body tracking – Typically learn a classifier – We'll look at a specific instance…

Recognize actions at a distance [ICCV 2003] – Low resolution, noisy data, not going to be able to track each limb. – Moving camera, occlusions – Wide range of actions (including non-periodic) [Efros, Berg, Mori, & Malik 2003] The 30-Pixel Man

Approach Motion-based approach – Non-parametric; use large amount of data – Classify a novel motion by finding the most similar motion from the training set More specifically, – A motion description based on optical flow – an associated similarity measure used in a nearest neighbor framework [Efros, Berg, Mori, & Malik 2003]
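The descriptor-plus-nearest-neighbor pipeline above can be sketched in simplified form. Efros et al. split the optical flow into four half-wave rectified channels (u+, u-, v+, v-) and blur them; the sketch below keeps the channel split but omits the blurring and their normalized-correlation similarity, using plain Euclidean distance instead (both simplifications are mine, not from the paper):

```python
import numpy as np

def half_wave_channels(u, v):
    """Split a flow field (u, v) into four non-negative channels
    (u+, u-, v+, v-), a simplified Efros-style motion descriptor."""
    return np.stack([np.maximum(u, 0), np.maximum(-u, 0),
                     np.maximum(v, 0), np.maximum(-v, 0)])

def nearest_neighbor_label(query, train_descs, train_labels):
    """Classify a descriptor by its nearest training descriptor."""
    dists = [np.linalg.norm(query - t) for t in train_descs]
    return train_labels[int(np.argmin(dists))]
```

The non-parametric flavor of the approach is visible here: there is no trained model, only a labeled database of motion descriptors and a similarity measure.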

Motion description matching results More matching results: video 1, video 2

Action classification result demo video

Gathering action data Tracking – Simple correlation-based tracker – User-initialized

Figure-centric representation Stabilized spatio-temporal volume – No translation information – All motion caused by person’s limbs, indifferent to camera motion!

Extract optical flow to describe the region’s motion. Using optical flow: action recognition at a distance [Efros, Berg, Mori, & Malik 2003]

Input Sequence Matched Frames Use nearest neighbor classifier to name the actions occurring in new video frames. Using optical flow: action recognition at a distance [Efros, Berg, Mori, & Malik 2003]

Football Actions: classification (8 actions, 4500 frames, taken from 72 tracked sequences)

Application: motion retargeting [Efros, Berg, Mori, & Malik 2003] SHOW VIDEO

Summary Background subtraction: – Essential low-level processing tool to segment moving objects from static camera’s video Action recognition: – Increasing attention to actions as motion and appearance patterns – For constrained environments, relatively simple techniques allow effective gesture or action recognition

Closing remarks Thank you all for your attention and participation in the class! Please be well prepared for the final project (06/12 and 06/19). Come to class on time. Start early!