CSE 484: Computer Vision Fatoş T. Yarman Vural Atıl İşçen


CSE 484: Computer Vision Fatoş T. Yarman Vural Atıl İşçen Güzelyurt, Ankara Spring 2009

Textbook: L. Shapiro and G. Stockman, Computer Vision Recommended Books: R. Szeliski, Computer Vision: Algorithms and Applications, Dec 23, 2008; D. Forsyth, J. Ponce, Computer Vision: A Modern Approach; B. Jähne, H. Haußecker, Computer Vision and Applications

Grading: Midterm: 30% Final: 40% Homework: 30% with your partner

What is Computer Vision? Make the computer SEE SEE: Extracting Visual information from any sensed data Goal : Make useful decisions about objects and scenes based on sensed data

OBJECT perceptible material thing vision

Object According to Plato Things consisting of forms and matter. Forms are proper subjects of philosophical investigation, for they have the highest degree of reality. Matter is the ordinary substance.

OBJECTS (taxonomy figure) Top level: ANIMALS, PLANTS, INANIMATE, … Lower levels: NATURAL, MAN-MADE, …; VERTEBRATE; MAMMALS, BIRDS Examples: TAPIR, BOAR, GROUSE, CAMERA

How many object categories are there? ~10,000 to 30,000 Biederman 1987

SCENE Consists of multiple objects Goal : Make useful decisions about objects and scenes based on sensed data

Bruegel, 1564

Sensed Data: Images All sorts of sensor data carrying visual info Optic Thermal IR MR SAR …. Goal : Make useful decisions about objects and scenes based on sensed data

IMAGES: Satellite, CT, SAR, Thermal, scientific

Useful Decisions Recognize, classify, detect, localize, retrieve, annotate, verify Goal : Make useful decisions about objects and scenes based on sensed data

So what does recognition involve?

Verification: is that a lamp?

Detection: are there people?

Identification: is that Potala Palace?

Object categorization mountain tree building banner street lamp vendor people

Scene and context categorization outdoor city …

Application Domains of Computer Vision

Traffic Assisted driving: pedestrian and car detection, lane detection Collision warning systems with adaptive cruise control, lane departure warning systems, rear object detection systems

Retrieval: Improving online search Query: STREET Digital Album

Similarity Retrieval of Brain Data

Image Databases: Content-Based Retrieval Images from my Ground-Truth collection. What categories of image databases exist today?

Abstract Regions for Object Recognition Original Images Color Regions Texture Regions Line Clusters Caption!

Insect Identification for Ecology Studies Doroneuria (Dor) Calineuria (Cal) Yoraperla (Yor)

Document Analysis

Surveillance: Object and Event Recognition in Aerial Videos Original Video Frame Color Regions Structure Regions

Video Analysis What are the objects? What are the events?

3D Reconstruction of the Blood Vessel Tree

Recognition of 3D Object Classes from Range Data

3D Scanning Scanning Michelangelo’s “The David” The Digital Michelangelo Project - http://graphics.stanford.edu/projects/mich/ UW Prof. Brian Curless, collaborator 2 BILLION polygons, accuracy to .29mm

The Digital Michelangelo Project, Levoy et al.

Tasks in Computer Vision Segment an image into useful regions Perform measurements on certain areas Determine what object(s) are in the scene Calculate the precise location(s) of objects Visually inspect a manufactured object Construct a 3D model of the imaged object Find “interesting” events in a video liver kidney spleen

HISTORY OF COMPUTER VISION

1970

1980s

1990

2000s

Why is it Difficult? What are the Challenges

Challenges 1: view point variation Michelangelo 1475-1564

Challenges 2: illumination slide credit: S. Ullman

Challenges 3: occlusion Magritte, 1957

Challenges 4: scale

Challenges 5: deformation Xu, Beihong 1943

Challenges 6: background clutter Klimt, 1913

Challenges 7: intra-class variation

The Three Stages of Computer Vision low-level: image → image; mid-level: image → features; high-level: features → analysis

Low-Level sharpening blurring

Low-Level: Canny (original image → edge image) Mid-Level: ORT (edge image → data structure of circular arcs and line segments)

Mid-level K-means clustering (followed by connected component analysis) regions of homogeneous color original color image data structure

Low- to High-Level low-level: edge image; mid-level: consistent line clusters; high-level: Building Recognition

Recognition Scale / orientation range to search over Speed Context

Course content Image representation Matrices, functions Image file formats Binary Image Analysis Pixel and neighborhood Masks and convolution Counting and labeling Morphological operations

Thresholding Object Recognition concepts Representation Classification Measures Gray-level Image Analysis Gray level mapping Noise removal, Smoothing

Color and shading Color spaces Shades Texture Texels, texture description Texture measure Segmentation Clustering Region Growing Content Based Image retrieval

Imaging and Image Representation Ch:2 Shapiro et al.

Classical Imaging Process Light reaches surfaces in 3D Surfaces reflect Sensor element receives light energy Intensity counts Angles count Material counts What are radiance and irradiance?

Radiometry and Computer Vision* Radiometry is a branch of physics that deals with the measurement of the flow and transfer of radiant energy. Radiance is the power of light that is emitted from a unit surface area into some spatial angle; the corresponding photometric term is brightness. Irradiance is the amount of energy that an image-capturing device gets per unit of an efficient sensitive area of the camera. Quantizing it gives image gray tones. From Sonka, Hlavac, and Boyle, Image Processing, Analysis, and Machine Vision, ITP, 1999.

Sensors: Image acquisition Devices CCD (Charge-Coupled Device) X-Ray Devices Microwave Devices UV Devices Thermal Cameras IR Devices 3-D scanners

CCD type camera: Commonly used in industrial applications Array of small fixed elements Each element converts the light energy to electric charge 1x1 cm Can add refracting elements to get color in 2x2 neighborhoods 8-bit intensity common

Computer Vision Algorithms Main concern of CV is to develop Algorithms

LIDAR also senses surfaces Single sensing element scans scene Laser light reflected off surface and returned Phase shift codes distance Brightness change codes albedo (surface reflectance) Stockman MSU/CSE Fall 2008

2.5D face image from Minolta Vivid 910 scanner A rotating mirror scans a laser stripe across the object. 320x240 rangels obtained in about 2 seconds. [x,y,z,R,G,B] image.

3D scanning technology 3D image of voxels obtained Usually computationally expensive reconstruction of 3D from many 2D scans (CAT computer-aided-tomography)

Magnetic Resonance Imaging Sense density of certain chemistry S slices x R rows x C columns Volume element (voxel) about 2mm per side At left is shaded 2D image created by “volume rendering” a 3D volume: darkness codes depth

Single slice through human head MRIs are computed structures, computed from many views. At left is MRA (angiograph), which shows blood flow. CAT scans are computed in much the same manner from X-ray transmission data.

Problems in Image Acquisition

Human eye as a spherical camera 75-150 million rods sense intensity 6-7 million cones sense color Fovea has tightly packed area, more cones Periphery has more rods Focal length is about 20mm Pupil/iris controls light entry Eye scans, or saccades, to image details on fovea 100M sensing cells funnel to 1M optic nerve connections to the brain

RODS AND CONES

Cones

Image Formation

Problems in HVS Mach Band Effect

Contrast

Illusions

Images: 2D projections of 3D The 3D world has color, texture, surfaces, volumes, light sources, temperature, reflectance, … A 2D image is a projection of a scene from a specific viewpoint.

Digital Images form arrays

Digitizing: Sampling

Quantization

Digital Image: Sampled and quantized

Sampling at different resolution

Sampling

Quantization

What are the appropriate sampling and quantization rates?

Resolution resolution: precision of the sensor nominal resolution: size of a single pixel in scene coordinates (e.g. meters, mm) common use of resolution: num_rows X num_cols (e.g. 515 x 480) field of view (FOV): size of the scene a sensor can sense

Images as Functions A gray-tone image is a function: g(x,y) = val or f(row, col) = val A color image is just three functions or a vector-valued function: f(row,col) = (r(row,col), g(row,col), b(row,col)) Multi-spectral Image: f(row,col) = (f1(row,col), f2(row,col), …, fn(row,col))

Gray-tone Image as Function

Image vs Matrix There are many different file formats.

Digital Image Terminology:
0 0 0  0  1  0  0
0 0 1  1  1  0  0
0 1 95 96 94 93 92
0 0 92 93 93 92 92
0 0 93 93 94 92 93
0 1 92 93 93 93 93
0 0 94 95 95 96 95
Terms illustrated: pixel (with value 94), its 3x3 neighborhood, region of medium intensity, resolution (7x7) Image types: binary image, gray-scale (or gray-tone) image, color image, multi-spectral image, range image, labeled image

Image File Formats Portable Gray Map (PGM) older form GIF was early commercial version JPEG (JPG) is modern version MPEG for motion Many others exist: header plus data Do they handle color? Do they provide for compression? Are there good packages that use them or at least convert between them?

Compression: Reduce the redundancy Lossy Lossless

Run Coding
Row1 0001001000000
Row2 0001111000000
Row3 0001001000000
Row1 as Code 1: 3(0)1(1)2(0)1(1)6(0)
or as Code 2: (4,4)(7,7)
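The two run codes on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming rows are given as strings of '0' and '1' and that Code 2 counts columns from 1 (which is what makes Row1 come out as (4,4)(7,7)); the function names `run_lengths` and `one_runs` are made up for this example.

```python
def run_lengths(row):
    """Code 1: (count, value) pairs covering the whole row, left to right."""
    runs = []
    for ch in row:
        if runs and runs[-1][1] == ch:
            runs[-1] = (runs[-1][0] + 1, ch)  # extend the current run
        else:
            runs.append((1, ch))              # start a new run
    return runs


def one_runs(row):
    """Code 2: (start, end) columns of each run of 1s, columns counted from 1."""
    spans, start = [], None
    for col, ch in enumerate(row, start=1):
        if ch == "1" and start is None:
            start = col
        elif ch != "1" and start is not None:
            spans.append((start, col - 1))
            start = None
    if start is not None:                     # run reaches the end of the row
        spans.append((start, len(row)))
    return spans


rows = ["0001001000000", "0001111000000", "0001001000000"]
for r in rows:
    print(run_lengths(r), one_runs(r))
```

Code 2 stores only the foreground runs, so it tends to be the more compact of the two when the 1s are sparse.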

PGM image with ASCII info. P2 means ASCII gray Comments W=16; H=8 192 is max intensity Can be made with editor Large images are usually not stored as ASCII

PBM/PGM/PPM Codes P1: ascii binary (PBM) P2: ascii grayscale (PGM) P3: ascii color (PPM) P4: byte binary (PBM) P5: byte grayscale (PGM) P6: byte color (PPM)
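As a rough illustration of the "header plus data" layout, the sketch below writes and reads an ASCII gray-scale (P2) image. It assumes the usual P2 layout of magic number, width and height, maximum value, then pixel values, with `#` starting a comment; the helper names `write_pgm_ascii` and `read_pgm_ascii` and the sample pixel values are invented for this example, and the reader does no validation beyond the magic number.

```python
def write_pgm_ascii(pixels, max_val):
    """Return the text of a P2 (ASCII gray-scale) image for a list of rows."""
    height, width = len(pixels), len(pixels[0])
    lines = ["P2", "# hypothetical example image", f"{width} {height}", str(max_val)]
    lines += [" ".join(str(v) for v in row) for row in pixels]
    return "\n".join(lines) + "\n"


def read_pgm_ascii(text):
    """Parse P2 text back into (rows of ints, max_val)."""
    tokens = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]          # drop comments
        tokens.extend(line.split())
    if tokens[0] != "P2":
        raise ValueError("not an ASCII PGM (P2) file")
    width, height, max_val = int(tokens[1]), int(tokens[2]), int(tokens[3])
    values = [int(t) for t in tokens[4:4 + width * height]]
    pixels = [values[r * width:(r + 1) * width] for r in range(height)]
    return pixels, max_val


image = [[0, 64, 128, 192],
         [192, 128, 64, 0]]
text = write_pgm_ascii(image, 192)
print(text)
print(read_pgm_ascii(text))
```

Because the file is plain text it can be made with an editor, which is also why large images are usually stored in the byte formats (P5) instead.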

JPG current popular form Public standard Allows for image compression; often 10:1 or 30:1 are easily possible 8x8 intensity regions are fit with basis of cosines Error in cosine fit coded as well Parameters then compressed with Huffman coding Common for most digital cameras

From 3D Scenes to 2D Images Object World Camera Real Image Pixel Image

Binary Image Analysis

Binary image analysis consists of a set of image analysis operations that are used to produce or process binary images, usually images of 0’s and 1’s. 0 represents the background; 1 represents the foreground. Example rows: 00010010001000 00011110001000

Binary Image Analysis is used in a number of practical applications, e.g. part inspection riveting fish counting document processing

What kinds of operations? Separate objects from background and from one another Aggregate pixels for each object Compute features for each object

Example: red blood cell image Many blood cells are separate objects Many touch – bad! Salt and pepper noise from thresholding How useable is this data?

Results of analysis 63 separate objects detected Single cells have area about 50 Noise spots Gobs of cells

Useful Operations 1. Thresholding a gray-tone image 2. Determining good thresholds 3. Connected components analysis 4. Binary mathematical morphology 5. All sorts of feature extractors (area, centroid, circularity, …)

1. Thresholding Convert gray level or color image into binary image Use histogram

Histogram Background is black Healthy cherry is bright Bruise is medium dark Histogram shows two cherry regions (black background has been removed) pixel counts 256 gray-tone values

Histogram-Directed Thresholding How can we use a histogram to separate an image into 2 (or several) different regions? Is there a single clear threshold? 2? 3?

Automatic Thresholding: Otsu’s Method Assumption: the histogram is bimodal, so a threshold t separates the gray tones into two groups (Grp 1 and Grp 2). Method: find the threshold t that minimizes the weighted sum of within-group variances for the two groups that result from separating the gray tones at value t.
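The search described on this slide can be written as a brute-force loop over candidate thresholds. The sketch below is one straightforward reading of it, not an optimized implementation: it assumes group 1 holds gray tones up to and including t, group 2 holds the rest, and foreground pixels are the brighter ones. The names `otsu_threshold` and `threshold` and the tiny sample image are illustrative only.

```python
def otsu_threshold(hist):
    """Return the t that minimizes the weighted sum of within-group variances.

    hist[g] is the number of pixels with gray tone g.
    Group 1 is gray tones <= t, group 2 is gray tones > t.
    """
    total = sum(hist)
    best_t, best_score = None, float("inf")
    for t in range(len(hist) - 1):
        g1, g2 = hist[:t + 1], hist[t + 1:]
        n1, n2 = sum(g1), sum(g2)
        if n1 == 0 or n2 == 0:
            continue                           # one group is empty: not a split
        mean1 = sum(i * c for i, c in enumerate(g1)) / n1
        mean2 = sum((i + t + 1) * c for i, c in enumerate(g2)) / n2
        var1 = sum(c * (i - mean1) ** 2 for i, c in enumerate(g1)) / n1
        var2 = sum(c * (i + t + 1 - mean2) ** 2 for i, c in enumerate(g2)) / n2
        score = (n1 * var1 + n2 * var2) / total  # weights are group proportions
        if score < best_score:
            best_t, best_score = t, score
    return best_t


def threshold(image, t):
    """Binary image: 1 where the gray tone is above t, else 0."""
    return [[1 if v > t else 0 for v in row] for row in image]


image = [[10, 12, 11],
         [200, 205, 11],
         [198, 10, 202]]
hist = [0] * 256
for row in image:
    for v in row:
        hist[v] += 1
t = otsu_threshold(hist)
print(t, threshold(image, t))
```

Recomputing the group statistics for every t is quadratic in the number of gray tones; with 256 tones that is cheap enough for a sketch, and running sums would remove the inner loops.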

Thresholding Example original gray tone image binary thresholded image

2. Connected Components Labeling Once you have a binary image, you can identify and then analyze each connected set of pixels. The connected components operation takes in a binary image and produces a labeled image in which each pixel has the integer label of either the background (0) or a component. connected components binary image after morphology

Methods for CC Analysis Recursive Tracking (almost never used) Parallel Growing (needs parallel hardware) Row-by-Row (most common) Classical Algorithm (see text) Efficient Run-Length Algorithm (developed for speed in real industrial applications)

Equivalent Labels Original Binary Image
0 0 0 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1
0 0 0 1 1 1 1 0 0 0 1 1 1 1 0 0 0 1 1
0 0 0 1 1 1 1 1 0 0 1 1 1 1 0 0 1 1 1
0 0 0 1 1 1 1 1 1 0 1 1 1 1 0 0 1 1 1
0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1
0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1

Equivalent Labels The Labeling Process
0 0 0 1 1 1 0 0 0 0 2 2 2 2 0 0 0 0 3
0 0 0 1 1 1 1 0 0 0 2 2 2 2 0 0 0 3 3
0 0 0 1 1 1 1 1 0 0 2 2 2 2 0 0 3 3 3
0 0 0 1 1 1 1 1 1 0 2 2 2 2 0 0 3 3 3
0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 3 3 3
0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1
Equivalences found: 1 ≡ 2, 1 ≡ 3

Run-Length Data Structure
Binary Image (rows 0-4, columns 0-4):
    0 1 2 3 4
0   1 1 0 1 1
1   1 1 0 0 1
2   1 1 1 0 1
3   0 0 0 0 0
4   0 1 1 1 1
Runs (entry 0 is UNUSED):
run row scol ecol label
1   0   0    1    0
2   0   3    4    0
3   1   0    1    0
4   1   4    4    0
5   2   0    2    0
6   2   4    4    0
7   4   1    4    0
Row Index:
row Rstart Rend
0   1      2
1   3      4
2   5      6
3   0      0
4   7      7

Run-Length Algorithm
Procedure run_length_classical
{
  initialize Run-Length and Union-Find data structures
  count <- 0
  /* Pass 1 (by rows) */
  for each current row and its previous row
    move pointer P along the runs of current row
    move pointer Q along the runs of previous row

Case 1: No Overlap (figure: run P in the current row and run Q in the previous row do not overlap)
  P’s run ends before Q’s run begins:
    /* new label */
    count <- count + 1
    label(P) <- count
    P <- P + 1
  Q’s run ends before P’s run begins:
    /* check Q’s next run */
    Q <- Q + 1

Case 2: Overlap (figure: run P in the current row overlaps run Q in the previous row)
  Subcase 1: P’s run has no label yet
    label(P) <- label(Q)
    move pointer(s)
  Subcase 2: P’s run has a label that is different from Q’s run
    union(label(P),label(Q))
    move pointer(s)
}

Pass 2 (by runs)
/* Relabel each run with the name of the equivalence class of its label */
For each run M { label(M) <- find(label(M)) }
where union and find refer to the operations of the Union-Find data structure, which keeps track of sets of equivalent labels.
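The two passes can be turned into a short runnable sketch. This is one possible Python reading of the slides, not the textbook's code: it assumes 4-connectivity (two runs are connected when their column ranges overlap in adjacent rows), keeps the runs per row instead of in a flat table with a row index, and uses a plain list as the Union-Find structure. All function names are chosen for this example, and the sample image is the one from the "Equivalent Labels" slide.

```python
def find(parent, x):
    """Return the representative label of x's equivalence class."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the classes of labels a and b; the smaller root label wins."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def extract_runs(image):
    """For each row, list the runs of 1s as [scol, ecol, label]; label 0 = none yet."""
    all_runs = []
    for row in image:
        runs, start = [], None
        for col, value in enumerate(list(row) + [0]):  # trailing 0 closes a final run
            if value and start is None:
                start = col
            elif not value and start is not None:
                runs.append([start, col - 1, 0])
                start = None
        all_runs.append(runs)
    return all_runs


def label_runs(image):
    all_runs = extract_runs(image)
    parent = [0]                       # parent[k] for label k; index 0 is unused
    prev = []
    for cur in all_runs:               # Pass 1 (by rows)
        p = q = 0
        while p < len(cur):
            if q < len(prev) and prev[q][1] < cur[p][0]:
                q += 1                 # Q ends before P begins: check Q's next run
            elif q < len(prev) and prev[q][0] <= cur[p][1]:
                if cur[p][2] == 0:     # overlap, P has no label yet
                    cur[p][2] = prev[q][2]
                else:                  # overlap, P already labeled: record equivalence
                    union(parent, cur[p][2], prev[q][2])
                if prev[q][1] <= cur[p][1]:
                    q += 1             # advance whichever run ends first
                else:
                    p += 1
            else:                      # nothing (more) in the previous row touches P
                if cur[p][2] == 0:
                    parent.append(len(parent))   # new label
                    cur[p][2] = len(parent) - 1
                p += 1
        prev = cur
    for runs in all_runs:              # Pass 2 (by runs)
        for run in runs:
            run[2] = find(parent, run[2])
    return all_runs


def labeled_image(image):
    """Paint the run labels back into an image-shaped list of lists."""
    out = [[0] * len(row) for row in image]
    for r, runs in enumerate(label_runs(image)):
        for scol, ecol, label in runs:
            for c in range(scol, ecol + 1):
                out[r][c] = label
    return out


rows = ["0001110000111100001",
        "0001111000111100011",
        "0001111100111100111",
        "0001111110111100111",
        "0001111111111100111",
        "0001111111111111111",
        "0001111110000011111"]
image = [[int(ch) for ch in r] for r in rows]
for row in labeled_image(image):
    print(" ".join(str(v) for v in row))
```

On this image pass 1 hands out labels 1, 2 and 3 and records that 1 is equivalent to both 2 and 3, so pass 2 leaves a single component. Final labels are class representatives and are not guaranteed to be consecutive.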

Labeling shown as Pseudo-Color connected components of 1’s from thresholded image connected components of cluster labels

Mathematical Morphology Binary mathematical morphology consists of two basic operations, dilation and erosion, and several composite relations: closing and opening, conditional dilation, . . .

Dilation Dilation expands the connected sets of 1s of a binary image. It can be used for 1. growing features 2. filling holes and gaps

Erosion Erosion shrinks the connected sets of 1s of a binary image. It can be used for 1. shrinking features 2. Removing bridges, branches and small protrusions

Structuring Elements A structuring element is a shape mask used in the basic morphological operations. They can be any shape and size that is digitally representable, and each has an origin. box disk hexagon something box(length,width) disk(diameter)

Dilation with Structuring Elements The arguments to dilation and erosion are a binary image B and a structuring element S. dilate(B,S) takes binary image B, places the origin of structuring element S over each 1-pixel, and ORs the structuring element S into the output image at the corresponding position. (Figure: B, S with its origin marked, and the result B ⊕ S.)

Erosion with Structuring Elements erode(B,S) takes a binary image B, places the origin of structuring element S over every pixel position, and ORs a binary 1 into that position of the output image only if every position of S (with a 1) covers a 1 in B. (Figure: B, S with its origin marked, and the result B ⊖ S.)
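The two definitions translate almost word for word into code. In the sketch below the structuring element is represented as a list of (row, column) offsets of its 1-cells relative to the origin, and pixels outside the image are treated as 0; both choices are assumptions of this example, as are the small test images and the L-shaped element.

```python
def dilate(B, S):
    """Place S's origin over each 1-pixel of B and OR S into the output."""
    rows, cols = len(B), len(B[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if B[r][c] == 1:
                for dr, dc in S:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:  # clip at the border
                        out[rr][cc] = 1
    return out


def erode(B, S):
    """Output 1 at a position only if every 1-cell of S covers a 1 in B."""
    rows, cols = len(B), len(B[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if all(0 <= r + dr < rows and 0 <= c + dc < cols
                   and B[r + dr][c + dc] == 1 for dr, dc in S):
                out[r][c] = 1
    return out


# Hypothetical L-shaped element: the origin, the cell above it, the cell to its right.
S = [(0, 0), (-1, 0), (0, 1)]

B = [[0, 0, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 0, 0]]
print(dilate(B, S))

B2 = [[0, 0, 0, 0, 0],
      [0, 1, 1, 1, 0],
      [0, 1, 1, 1, 0],
      [0, 0, 0, 0, 0]]
print(erode(B2, S))
```

Listing offsets rather than storing S as a mask makes the position of the origin explicit, which matters because the same shape with a different origin shifts the result.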

Example to Try
B:
0 0 1 0 0 1 0 0
0 0 1 1 1 1 1 0
1 1 1 1 1 1 0 0
1 1 1 1 1 1 1 1
0 0 1 1 1 1 0 0
S: 1 1 1
erode, then dilate with same structuring element

Opening and Closing Closing is the compound operation of dilation followed by erosion (with the same structuring element) Opening is the compound operation of erosion followed by dilation (with the same structuring element)
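A compact way to experiment with the two compound operations is to treat the foreground as a set of (row, column) coordinates. The sketch below makes that assumption and works on an unbounded grid, so there is no image border to worry about; the function names, the horizontal three-pixel element, and the two sample shapes are all invented for illustration.

```python
def dilate(A, S):
    """All positions reached by placing S's origin on a foreground pixel."""
    return {(r + dr, c + dc) for r, c in A for dr, dc in S}


def erode(A, S):
    """Positions where every cell of S lands on a foreground pixel."""
    dr0, dc0 = S[0]
    # Only positions whose first S-cell hits the foreground can survive.
    candidates = {(r - dr0, c - dc0) for r, c in A}
    return {(r, c) for r, c in candidates
            if all((r + dr, c + dc) in A for dr, dc in S)}


def opening(A, S):
    """Erosion followed by dilation with the same structuring element."""
    return dilate(erode(A, S), S)


def closing(A, S):
    """Dilation followed by erosion with the same structuring element."""
    return erode(dilate(A, S), S)


S = [(0, -1), (0, 0), (0, 1)]              # horizontal bar, origin in the middle

bar = {(0, c) for c in range(5)}
noisy = bar | {(2, 2)}                     # bar plus one isolated noise pixel
print(sorted(opening(noisy, S)))           # the noise pixel is gone

gapped = {(0, 0), (0, 1), (0, 3), (0, 4)}  # bar with a one-pixel gap
print(sorted(closing(gapped, S)))          # the gap is filled
```

Opening removes foreground pieces that the structuring element does not fit inside, while closing fills gaps that it cannot fit into from the background side.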

Use of Opening Original Opening Corners What kind of structuring element was used in the opening? How did we get the corners?

Gear Tooth Inspection original binary image How did they do it? detected defects

Some Details

Region Properties Properties of the regions can be used to recognize objects. geometric properties (Ch 3) gray-tone properties color properties texture properties shape properties (a few in Ch 3) motion properties relationship properties (1 in Ch 3)

Geometric and Shape Properties area centroid perimeter perimeter length circularity elongation mean and standard deviation of radial distance bounding box extremal axis length from bounding box second order moments (row, column, mixed) lengths and orientations of axes of best-fit ellipse Which are statistical? Which are structural?
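A few of the listed properties can be computed directly from a labeled image. The sketch below covers area, centroid, bounding box, and the second-order row, column, and mixed moments, using their usual definitions as averages over the region's pixels; the function name `region_properties`, the dictionary keys, and the sample labeled image are assumptions made for this example.

```python
def region_properties(labeled, label):
    """Basic geometric properties of the region carrying the given label."""
    pixels = [(r, c) for r, row in enumerate(labeled)
              for c, v in enumerate(row) if v == label]
    area = len(pixels)
    if area == 0:
        raise ValueError(f"no pixels with label {label}")
    rbar = sum(r for r, _ in pixels) / area          # centroid row
    cbar = sum(c for _, c in pixels) / area          # centroid column
    return {
        "area": area,
        "centroid": (rbar, cbar),
        # (min row, min col, max row, max col)
        "bounding_box": (min(r for r, _ in pixels), min(c for _, c in pixels),
                         max(r for r, _ in pixels), max(c for _, c in pixels)),
        "mu_rr": sum((r - rbar) ** 2 for r, _ in pixels) / area,
        "mu_cc": sum((c - cbar) ** 2 for _, c in pixels) / area,
        "mu_rc": sum((r - rbar) * (c - cbar) for r, c in pixels) / area,
    }


labeled = [[0, 1, 1, 0],
           [0, 1, 1, 0],
           [2, 0, 0, 0]]
print(region_properties(labeled, 1))
print(region_properties(labeled, 2))
```

The second-order moments are the inputs needed for the axes of the best-fit ellipse mentioned in the list, although that step is not shown here.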

Region Adjacency Graph A region adjacency graph (RAG) is a graph in which each node represents a region of the image and an edge connects two nodes if the regions are adjacent. (Figure: an image with regions 1-4 and the corresponding graph with nodes 1-4.)
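Building a RAG from a labeled image only requires comparing each pixel with its neighbors. The sketch below assumes 4-adjacency (regions are adjacent when two of their pixels share an edge) and treats every distinct label, including a background label if present, as a region; the function name and the sample image are illustrative.

```python
def region_adjacency_graph(labeled):
    """Map each label to the set of labels of regions adjacent to it."""
    rows, cols = len(labeled), len(labeled[0])
    graph = {v: set() for row in labeled for v in row}   # one node per region
    for r in range(rows):
        for c in range(cols):
            # Looking right and down visits every edge-sharing pixel pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and labeled[rr][cc] != labeled[r][c]:
                    a, b = labeled[r][c], labeled[rr][cc]
                    graph[a].add(b)
                    graph[b].add(a)
    return graph


labeled = [[1, 2, 3],
           [1, 2, 3]]
print(region_adjacency_graph(labeled))
```

Here region 2 touches both of the others, while regions 1 and 3 are not adjacent, so the graph is a path 1 - 2 - 3.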