Processing visual information for Computer Vision

Computer vision introduction
- construct scene descriptions from images
- make useful decisions about real physical objects and scenes based on sensed images

Fundamental issues
- Sensing
- Encoded information
- Representation
- Algorithms
Fundamental issues
- Sensing – how do sensors obtain images of the world? How do the images encode properties of the world, such as material, shape, illumination and spatial relationships?
- Encoded information – how do images yield information for understanding the 3D world, including the geometry, texture, motion and identity of objects in it?
- Representation – what representations should be used for stored descriptions of objects, their parts, properties and relationships?
- Algorithms – what methods are there to process image information and construct descriptions of the world and its objects?
Some topics
- Motion analysis from 2D image sequences
- Image analysis and segmentation
- 3D shape from 2D (shape from “x”, e.g., shading, texture, motion, ...)
- Depth perception from stereo
- Stereo model (camera calibration)
- 3D/2D models and matching
Camera Model

Camera calibration
- obtain the camera parameters by a computational procedure based on a set of image points whose world coordinates are known
  (Note: it is not necessary to measure the focal length, offsets, or tilt/pan angles directly)
- 3D affine transformation
  -- translation by a certain distance
  -- scaling
  -- rotation about the x-axis, y-axis or z-axis, or an arbitrary rotation

(1) Monocular image – single camera
- The shape information is contained in shading, texture, contours, shadow, motion, etc.
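The 3D affine transformations listed above (translation, scaling, rotation about a coordinate axis) are commonly written as 4x4 homogeneous matrices so that they can be chained by matrix multiplication. The following is a minimal sketch under assumptions the slides do not state: column vectors, a right-handed coordinate system, and counter-clockwise rotation for positive angles. All function names are hypothetical.

```python
import math

def translation(tx, ty, tz):
    """4x4 homogeneous matrix that shifts a point by (tx, ty, tz)."""
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def scaling(sx, sy, sz):
    """4x4 homogeneous matrix that scales each axis independently."""
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]

def rotation_x(theta):
    """Rotation by theta radians about the x-axis (assumed right-handed)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def rotation_y(theta):
    """Rotation by theta radians about the y-axis (assumed right-handed)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

def rotation_z(theta):
    """Rotation by theta radians about the z-axis (assumed right-handed)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices; matmul(a, b) applies b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, point):
    """Apply a 4x4 homogeneous transform to a 3D point (x, y, z)."""
    v = (point[0], point[1], point[2], 1.0)
    out = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return (out[0], out[1], out[2])

# Rotate (1, 0, 0) by 90 degrees about z, then translate by (1, 2, 3).
composite = matmul(translation(1.0, 2.0, 3.0), rotation_z(math.pi / 2))
moved = apply(composite, (1.0, 0.0, 0.0))  # approximately (1, 3, 3)
```

An arbitrary rotation can be built in this sketch by multiplying the three axis rotations together; the order of multiplication matters, because rotations do not commute.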
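Camera Model

(1) Monocular image – single camera (cont’d)
[Figure: a 3D point P(x,y,z) in the X, Y, Z coordinate system and its image point I(x’,y’), for a camera with focal length f]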
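The mapping from P(x,y,z) to I(x’,y’) can be sketched in code. The slides do not state the projection equations for the single-camera case, so this sketch assumes the convention implied by the stereo equations in this deck with the camera offset d set to zero, that is (f - z)x’ = xf and likewise for y’. The function name is hypothetical.

```python
def project_point(x, y, z, f):
    """Perspective projection of a 3D point onto the image plane.

    Assumed convention: (f - z) * x' = x * f and (f - z) * y' = y * f.
    """
    denom = f - z
    if denom == 0:
        raise ValueError("point lies in the plane z = f and has no finite image")
    return (f * x / denom, f * y / denom)

# Two different 3D points on the same viewing ray give the same image point,
# which is why a single image does not determine depth.
near = project_point(2.0, 4.0, -3.0, 1.0)  # (0.5, 1.0)
far = project_point(4.0, 8.0, -7.0, 1.0)   # (0.5, 1.0)
```

The two calls illustrate the depth ambiguity of a monocular image: z cannot be recovered from (x’, y’) alone without extra cues such as shading, texture or motion.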
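Camera Model

(2) Binocular image – two cameras (stereo)
-- missing depth information can be obtained by using stereoscopic imaging techniques

Disparity: D = x’’ - x’
For eL: (f - z)x’ = (x - d)f
For eR: (f - z)x’’ = (x + d)f
Subtracting the eL equation from the eR equation gives (f - z)D = 2df, so:
z = f - 2df/(x’’ - x’) = f - 2df/D = f(1 - 2d/D)

[Figure: stereo geometry showing the point P(x, f-z), the two cameras eL and eR each offset by d from x = 0, the image points (x’, f) and (x’’, f) measured from x’ = 0 and x’’ = 0, and the distances f and z]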
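The depth formula can be checked numerically with a round trip: project a known point through both cameras using the eL and eR equations, then recover z from the disparity. This is a minimal sketch; the function names are hypothetical, while the symbols f, d, x, z and D follow the slide.

```python
def project_stereo(x, z, f, d):
    """Forward model from the slide: returns (x_left, x_right).

    eL: (f - z) * x_left  = (x - d) * f
    eR: (f - z) * x_right = (x + d) * f
    """
    denom = f - z
    if denom == 0:
        raise ValueError("point lies in the plane z = f")
    return ((x - d) * f / denom, (x + d) * f / denom)

def depth_from_disparity(x_left, x_right, f, d):
    """Recover z from the disparity D = x_right - x_left: z = f(1 - 2d/D)."""
    disparity = x_right - x_left
    if disparity == 0:
        raise ValueError("zero disparity: the point is infinitely far away")
    return f * (1.0 - 2.0 * d / disparity)

# Round trip for a point at x = 2, z = -3 with f = 1 and half-baseline d = 0.5.
x_l, x_r = project_stereo(2.0, -3.0, 1.0, 0.5)   # (0.375, 0.625)
z = depth_from_disparity(x_l, x_r, 1.0, 0.5)     # -3.0
```

Note that the recovered depth does not depend on x: only the difference between the two image coordinates enters the formula.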
Camera Model

(3) Active camera (movable camera) – Pin-hole model
3D information reconstruction

3D object information and representation
Shape from “X”
-- shape from shading
-- shape from texture
-- shape from motion
Image Understanding

High-level vision
Graph isomorphism
-- find the similarity of graph structures through matching
Dynamic programming and linear programming
-- Dynamic programming: solving combinatorial optimization problems that involve a sequential decision-making process
-- Linear programming: solving optimization problems where the cost function and constraints are strictly linear
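To make the graph matching idea concrete, the sketch below tests two small undirected graphs for isomorphism by trying every node-to-node mapping. This brute-force search is an illustrative assumption, not a method named in the slides; its cost grows factorially with the number of nodes, so it is only usable for very small graphs. The function name and the example graphs are hypothetical.

```python
from itertools import permutations

def find_isomorphism(nodes_a, edges_a, nodes_b, edges_b):
    """Return a node mapping from graph A to graph B, or None if none exists.

    Graphs are undirected; each edge is a pair of node labels.
    """
    set_a = {frozenset(e) for e in edges_a}
    set_b = {frozenset(e) for e in edges_b}
    # Isomorphic graphs must have the same number of nodes and edges.
    if len(nodes_a) != len(nodes_b) or len(set_a) != len(set_b):
        return None
    for perm in permutations(nodes_b):
        mapping = dict(zip(nodes_a, perm))
        # The mapping is valid if every edge of A lands on an edge of B.
        if all(frozenset(mapping[n] for n in e) in set_b for e in set_a):
            return mapping
    return None

# A triangle with one extra node attached, labeled in two different ways.
model = (["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")])
scene = ([1, 2, 3, 4], [(1, 2), (2, 3), (3, 1), (1, 4)])
mapping = find_isomorphism(*model, *scene)  # maps "c" to 1 and "d" to 4
```

In a vision setting the nodes would typically stand for image primitives (regions, edges, parts) and the graph edges for relations between them, so a successful mapping pairs each part of a stored model with a part found in the scene.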
Knowledge based computer vision

Prior knowledge of a certain object/scene helps the computer understand what it has captured
Similar to the human vision system
-- E.g.,
   Template matching design
   Relational model design
   Image training for computing the basis image
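Template matching is the simplest of these examples to sketch: the prior knowledge is a small stored image of the object, and recognition means finding where it fits best in the captured image. The sketch below slides the template over the image and scores each position with the sum of squared differences (SSD). The choice of SSD, the nested-list image format and the function name are assumptions made for illustration; the slides do not specify a matching score.

```python
def match_template(image, template):
    """Find the best match of a template in a grayscale image.

    Both arguments are lists of rows of numbers. Returns ((row, col), score)
    for the top-left corner with the lowest sum of squared differences.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best_pos, best_score = None, float("inf")
    for r in range(img_h - tpl_h + 1):
        for c in range(img_w - tpl_w + 1):
            score = sum((image[r + i][c + j] - template[i][j]) ** 2
                        for i in range(tpl_h) for j in range(tpl_w))
            if score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score

image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 8], [7, 6]]
position, score = match_template(image, template)  # ((1, 2), 0)
```

Plain SSD is sensitive to changes in brightness and contrast; a normalized score such as normalized cross-correlation is a common alternative when lighting varies between the template and the scene.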
Knowledge based computer vision
[Figure: block diagram with blocks labeled image acquisition & processing, representation & description, 3D understanding, segmentation, and recognition]
Conclusion
- Low-level: spatial-domain methods, transform-domain methods, ...
- Intermediate level: region-based, edge-based, ... (segmentation, ...)
- High-level: primitive structure, 3D understanding, object matching, description/modeling, ...