
Slide 1: Visual input processing
Lecturer: Smilen Dimitrov
Cross-sensorial processing – MED7

Slide 2: Introduction
The immobot base exercise – work on the visual input.
Goal – object localization in 3D.
Setup:
– PC
– Two Logitech QC Zoom webcams

Slide 3: Setup
Setup for a PC:
1. Logitech QuickCam (QC) drivers
2. QuickTime
3. WinVDig (matching the installed version of QuickTime)
4. Max/MSP/Jitter

Slide 4: Setup – camera parameters
– Image sensor: 1/4" color 640 x 480 pixel CMOS
– Lens type: 3P
– F-number: F/2.4
– Effective focal length: 5.0 mm

Slide 5: Setup
Low-tech configuration – synchronized stereo imaging is not guaranteed (frame delays).
Other options:
– Bumblebee camera – a true stereo camera, but FireWire (power issues, drivers)
– Axis 206 camera – an IP camera (drivers)

Slide 6: Goal of the vision processing algorithm
Object detection:
– The application needs to detect the presence of a new object whenever it enters the monitored environment.
Object recognition:
– Once a new object is detected, it needs to be classified to determine its type (e.g., a car versus a truck, a tiger versus a deer).
Object tracking:
– Assuming the new object is of interest to the application, it can be tracked as it moves through the environment. Tracking involves computing the current location of the object and its trajectory.
Here: color tracking, and estimation of 3D location through two-view geometry – stereopsis.

Slide 7: Goal of the vision processing algorithm

Slide 8: Color tracking
Uses an algorithm provided with Max/MSP/Jitter – jit.findbounds.
– Input: the min and max of the color range to react to, and the video.
– Output: the min and max (x, y) coordinates of the rectangle where the color has been found.
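A minimal sketch of driving jit.findbounds from a [js] object inside Max. The color-range values are placeholders, and it is assumed here that boundmin/boundmax are gettable attributes after matrixcalc (as jit.3m's min/max are); this is not the lecture's actual patch.

    // inside a Max [js] object; a sketch, not the lecture's code
    var fb = new JitterObject("jit.findbounds");
    fb.min = [0., 0.5, 0.0, 0.0];   // ARGB lower bound of tracked color (placeholder)
    fb.max = [1., 1.0, 0.3, 0.3];   // ARGB upper bound (placeholder)
    var out = new JitterMatrix(4, "char", 320, 240);  // scratch output matrix

    function jit_matrix(inname)     // receives each frame, e.g. from jit.qt.grab
    {
        fb.matrixcalc(inname, out);
        var lo = fb.boundmin;       // (x, y) of one corner (assumed gettable)
        var hi = fb.boundmax;       // (x, y) of the opposite corner
        outlet(0, (lo[0] + hi[0]) / 2, (lo[1] + hi[1]) / 2);  // rectangle center
    }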

Slide 9: Color tracking
jit.findbounds output – a rectangle; its center coordinate is taken as the tracked object's image position.

Slide 10: Color tracking – example code
The bounds search can also be performed in Max/MSP JavaScript using jsui – slow!
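A sketch of what such a manual implementation might look like, scanning a Jitter char matrix cell by cell with getcell; the per-pixel loop in interpreted JavaScript is exactly why this approach is slow. The color-range test values are placeholders.

    // sketch: brute-force bounding box of pixels within a color range
    function bounds(inname)
    {
        var m = new JitterMatrix(inname);       // wrap the incoming named matrix
        var w = m.dim[0], h = m.dim[1];
        var xmin = w, ymin = h, xmax = -1, ymax = -1;
        for (var y = 0; y < h; y++) {
            for (var x = 0; x < w; x++) {
                var c = m.getcell(x, y);        // [A, R, G, B] values, 0..255
                // logical test: does this pixel fall in the tracked color range?
                if (c[1] > 128 && c[2] < 80 && c[3] < 80) {   // e.g. "reddish"
                    if (x < xmin) xmin = x;
                    if (x > xmax) xmax = x;
                    if (y < ymin) ymin = y;
                    if (y > ymax) ymax = y;
                }
            }
        }
        if (xmax >= 0)                          // found at least one matching pixel
            outlet(0, xmin, ymin, xmax, ymax);
    }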

Slide 11: Color tracking – background
Video tracking – the process of locating a moving object (or several) over time using a camera. An algorithm analyses the video frames and outputs the location of moving targets within the video frame.
– Video tracking systems usually employ a motion model which describes how the image of the target might change for the different possible motions of the object being tracked.
Video tracking approaches:
– Blob tracking: segmentation of the object interior (for example blob detection, block-based correlation, or optical flow)
– Contour tracking: detection of the object boundary (e.g., active contours or the Condensation algorithm)
– Visual feature matching: registration
Color tracking is a type of blob tracking.

Slide 12: Color tracking – background
Blob detection refers to visual modules aimed at detecting points and/or regions in the image that are either brighter or darker than their surroundings. There are two main classes of blob detectors: (i) differential methods based on derivative expressions, and (ii) methods based on local extrema in the intensity landscape.
A blob (binary large object) is an area of touching pixels with the same logical state – a group of pixels organized into a structure.
Problems related to blobs:
1. Where are the edges?
2. Where is the center?
3. How many pixels does it contain?
4. What is the average pixel intensity?
5. What is the blob's orientation (angle)?

Slide 13: Color tracking – background
Blob center calculation – simple method.
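The slide's figure is not reproduced here; one simple method (an assumption about what the slide shows) is the centroid of the blob's N pixel coordinates:

    \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i

The approach used with jit.findbounds (slide 9) is simpler still: the midpoint of the bounding rectangle.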

Slide 14: Color tracking – background
A blob (binary large object) is an area of touching pixels with the same logical state.
– All pixels in an image that belong to a blob are in a foreground state.
– All other pixels are in a background state.
– In a binary image, pixels in the background have values equal to zero, while every nonzero pixel is part of a binary object.
For jit.findbounds, this logical test of belonging to the blob is whether the color of the currently tested pixel falls within the range set to be detected.

Slide 15: Color tracking – background
What is easily identifiable by the human eye as several distinct but touching blobs may be interpreted by software as a single blob. A reliable software package will tell you how touching blobs are defined: for example, only pixels adjacent along the vertical or horizontal axis may count as touching, or diagonally adjacent pixels may be included as well.
Segmentation of the image – separating the good blobs from the background and from each other, as well as eliminating everything else that is not of interest. Segmentation usually involves a binarization operation, resulting in a black-and-white image.

Slide 16: Color tracking – background
Blob analysis is logical – (generally) performed on a black-and-white image.
Brightness – rectangle algorithm:
– The rectangle algorithm keeps track of four points in each frame – the topmost, leftmost, rightmost, and bottommost points where the brightness exceeds a certain threshold value (see the sketch below).
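The scan is the same as in the slide 10 sketch, with the logical per-pixel test swapped for a brightness threshold. A sketch of that test only; the threshold value and the luminance weighting are assumptions:

    // sketch: test one cell of a 4-plane char matrix against a brightness threshold
    var THRESH = 200;                    // assumed threshold, 0..255
    function isBright(c)                 // c = [A, R, G, B]
    {
        var luma = 0.299 * c[1] + 0.587 * c[2] + 0.114 * c[3];  // Rec. 601 weights
        return luma > THRESH;
    }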

Slide 17: Color tracking – background
Tracking types:
(I) objects of a given nature, e.g., cars, people, faces
(II) objects of a given nature with a specific attribute, e.g., moving cars, walking people, talking heads, the face of a given person
(III) objects of a priori unknown nature but of specific interest, e.g., moving objects, objects of semantic interest manually picked in the first frame
For (I) and (II), part of the input video frame is searched against a reference model (image patches, or overall shape/geometry) describing the appearance of the object.
For (III), the reference can be extracted from the first frame and kept frozen – color tracking.
Recent color tracking algorithms:
– MeanShift
– Continuously Adaptive Mean Shift (CamShift)

Slide 18: Color tracking – background
Advanced application of tracking in stereo – matching.
Starting from a collection of images or a video sequence, the first step consists in relating the different images to each other. (The slide shows two images with their extracted corners.) Note that it is not possible to find the corresponding corner for every corner, but for many of them it is.
In our example, we have only one 3D point to deal with – we assume the data obtained from the two cameras are already matched.

Slide 19: Camera parameters
Extrinsic and intrinsic parameters.
Extrinsic parameters:
– the orientation and position of the camera Euclidean coordinate system with respect to the world Euclidean coordinate system. This relation is given by the matrices R and t.
– Thus there are six extrinsic camera parameters: three rotations and three translations.
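The slide's formula is not reproduced here; in the usual notation (an assumption), a world point M_world maps into the camera frame as

    M_{cam} = R\,M_{world} + t

so R (three rotation angles) and t (three translations) carry the six extrinsic parameters.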

Slide 20: Camera parameters
Intrinsic parameters – the coefficients of the calibration matrix K:
– p_x and p_y are the width and the height of the pixels,
– c = [c_x c_y 1]^T is the principal point (defined as the intersection of the optical axis and the retinal [image] plane – the center of the image plane),
– a is the skew angle, as indicated.
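The calibration matrix itself is not reproduced on the transcript; in this notation K typically takes the form (a common textbook convention, assumed here):

    K = \begin{pmatrix} f/p_x & s & c_x \\ 0 & f/p_y & c_y \\ 0 & 0 & 1 \end{pmatrix}

where f is the focal length and s is a skew term determined by the angle a (s = 0 for rectangular pixels).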

Slide 21: Stereo 3D localization algorithm

Slide 22: Stereo 3D localization algorithm
Problem: given the 2D image coordinates of the tracked object in the left and right views, estimate its 3D location O(X, Y, Z).

Slide 23: Stereo 3D localization algorithm
Writing the system for the two cameras.
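The slide's equations are not reproduced; a standard way of writing the system (assumed notation), with one projection per camera:

    \lambda_l\, m_l = K_l [R_l \mid t_l]\, M, \qquad \lambda_r\, m_r = K_r [R_r \mid t_r]\, M

where M is the 3D point in homogeneous coordinates, m_l and m_r are its homogeneous image projections, and \lambda_l, \lambda_r are projective scale factors. Each equation constrains M to a ray in space; the object lies where the two rays (nearly) intersect.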

Slide 24: Stereo 3D localization algorithm
Special case – canonical (binocular) configuration:
– The model has two identical cameras separated only in the X direction by a baseline distance b. The image planes are coplanar in this model.
– The baseline is aligned with the horizontal coordinate axis, the optical axes of the cameras are parallel, the epipoles move to infinity, and the epipolar lines in the image planes are parallel. The rotation matrices are identities.
Extrinsic parameters: b – baseline distance, f – focal length.
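In the canonical configuration the 3D location follows directly from the disparity; this standard result is stated here since the slide's derivation is not reproduced, assuming image coordinates measured from each camera's principal point:

    Z = \frac{f\,b}{x_l - x_r}, \qquad X = \frac{x_l\,Z}{f}, \qquad Y = \frac{y_l\,Z}{f}

where (x_l, y_l) and (x_r, y_r) are the matched left and right image coordinates of the point, and x_l - x_r is the disparity.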

Slide 25: Stereo 3D localization algorithm
Intrinsic parameters are ignored here – no calibration! We will try to scale the coordinates manually until we get something meaningful.

Slide 26: Stereo 3D localization algorithm
Intersection of the lines in 3D is not guaranteed.
Derivation uses the principle behind CPA (closest points of approach):
– looking for the closest points on the two lines,
– solution using parametric equations (a JavaScript sketch follows the next slide).

Slide 27: Stereo 3D localization algorithm
Finally, we obtain the estimated point C_MID, which we declare to be our object location O(X, Y, Z).
We will use this in code to calculate the 3D location from the coordinates obtained by color tracking. It will be programmed in JavaScript and called from Max/MSP/Jitter.
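A sketch of the CPA midpoint computation in the kind of JavaScript that Max can call. The ray representation (a point plus a direction per camera ray) and all names are assumptions, not the lecture's actual code:

    // closest points of approach between two 3D rays, and their midpoint C_MID
    // each ray: point p plus direction u, as [x, y, z] arrays
    function dot(a, b) { return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]; }
    function addScaled(p, u, s) { return [p[0]+s*u[0], p[1]+s*u[1], p[2]+s*u[2]]; }

    function cpaMidpoint(p1, u1, p2, u2)
    {
        var w0 = [p1[0]-p2[0], p1[1]-p2[1], p1[2]-p2[2]];
        var a = dot(u1, u1), b = dot(u1, u2), c = dot(u2, u2);
        var d = dot(u1, w0), e = dot(u2, w0);
        var denom = a*c - b*b;           // 0 when the rays are parallel
        var s = 0, t = 0;
        if (Math.abs(denom) > 1e-12) {
            s = (b*e - c*d) / denom;     // parameter of closest point on ray 1
            t = (a*e - b*d) / denom;     // parameter of closest point on ray 2
        }
        var c1 = addScaled(p1, u1, s);   // closest point on ray 1
        var c2 = addScaled(p2, u2, t);   // closest point on ray 2
        return [(c1[0]+c2[0])/2, (c1[1]+c2[1])/2, (c1[2]+c2[2])/2];  // C_MID
    }

In use, cpaMidpoint would be fed each camera's center as p and the back-projected ray direction derived from that camera's tracked image coordinates as u.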

Slide 28: Problems with the approach
– No calibration – no intrinsic parameters taken into account.
– Low-end cameras – aberrations.
– Low-end cameras – radial distortions.
– No guarantee of time sync between left and right images.
– In general – approximative/illustrative.

Slide 29: Implementation in Max/MSP/Jitter

