Computer Vision

Overview of the field
- Image / Video => Data
- Compare to graphics (the reverse)

Sample applications
- Video camera feed => ID room occupants
- Camera on assembly line => find defective parts
- X-ray assistant => flag areas as potentially cancerous
- Phone camera => find faces and adaptively focus
- Post office camera => automatically scan and decode address data
- ……

Challenges
- Sloppy data (compare to sorting a list of numbers)
- Limitations of the capture device, e.g. a webcam that auto-adjusts lighting
- Algorithms often require a lot of tuning; they are hard to optimize for the general case

Outline of this lab
- A very simple “pipeline” that illustrates some simple CV techniques
  - Blurring (Gaussian and Box)
  - Automatic thresholding (2-mean Otsu method)
  - Hu moments of central region
  - Creating and querying a database of Hu moments – best fit
- At each stage, there are many improvements we could make
  - I’ll try to mention a few along the way
  - I’d give you bonus points for attempting / completing some
- Implementation
  - I don’t want you to use any CV libraries
  - The only library functionality I want you to use is for loading / saving color images and displaying images on-screen
  - Speed is not an issue (i.e. it’s OK if it’s Python-slow)

Blurring
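A minimal sketch of a box blur in plain Python (no CV libraries, per the lab rules). It assumes the image is already grayscale and stored as a list of rows of integers in 0..255; the function name `box_blur`, the `radius` parameter, and the choice to average only the neighbours that fall inside the image at the borders are all illustrative assumptions, not something the slides specify.

```python
def box_blur(gray, radius=1):
    """Box-blur a grayscale image given as a list of rows of ints (0-255).

    Each output pixel is the integer mean of the (2*radius+1)^2 window
    around it. Assumption: at the borders, only the neighbours that lie
    inside the image are averaged.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            count = 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += gray[ny][nx]
                    count += 1
            out[y][x] = total // count
    return out
```

A Gaussian blur would follow the same loop structure but weight each neighbour by a Gaussian of its distance from the centre instead of weighting all neighbours equally.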

Thresholding
- Basic idea:
  - Group pixels into n groups based on (grayscale) intensity
  - n = 2 in our case (background and foreground)
- Simple method:
  - Select an intensity value T (between 0 and 255)
  - Any pixel below T is set to black (0,0,0); any pixel greater than or equal to T is set to white (255,255,255)
  - You may want to save these into a “region” list (for the next step)
- Disadvantage: no single “universal” threshold value works for all images
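The simple method can be sketched as follows. This assumes the image is a list of rows of (r, g, b) tuples and that grayscale intensity is the plain average of the three channels; both are assumptions for illustration, and `to_gray` and `threshold` are hypothetical names.

```python
BLACK = (0, 0, 0)
WHITE = (255, 255, 255)


def to_gray(pixel):
    """Grayscale intensity of an (r, g, b) pixel (assumed: channel average)."""
    r, g, b = pixel
    return (r + g + b) // 3


def threshold(image, t):
    """Return a black/white copy of image.

    Pixels with intensity below t become black; pixels with intensity
    greater than or equal to t become white.
    """
    return [
        [BLACK if to_gray(pixel) < t else WHITE for pixel in row]
        for row in image
    ]
```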

Otsu Thresholding
- (We’re looking at the “vanilla” algorithm)
- See https://en.wikipedia.org/wiki/Otsu%27s_method for a starting reference
- Assumptions:
  - The histogram of our pixel intensities has a bi-modal distribution
  - (There are more advanced algorithms that will use a k-modal distribution)
- Algorithm idea:
  - Find the T that does either of these (they’re related quantities):
    - Minimizes the weighted sum of the variances within the two groups
    - Maximizes the variance between the two groups
  - Find this value T by testing each possible threshold (0 – 255)
  - Do the normal thresholding
- Advantage: works for any image with a bi-modal histogram!
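A minimal sketch of the vanilla search, assuming the input is a flat list of integer grayscale intensities in 0..255. It uses the "maximize the variance between the two groups" form and the same convention as the simple method (values below T in one group, values at or above T in the other). The name `otsu_threshold` is hypothetical, and tie-breaking (the first best T wins) is an arbitrary choice.

```python
def otsu_threshold(gray_values):
    """Return the T in 0..255 that maximizes the between-group variance.

    Group 0 holds intensities < T, group 1 holds intensities >= T.
    """
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1

    total = len(gray_values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    count_below = 0   # number of pixels with intensity < t
    sum_below = 0     # sum of their intensities
    for t in range(1, 256):
        # Raising the threshold to t moves intensity t-1 into group 0.
        count_below += hist[t - 1]
        sum_below += (t - 1) * hist[t - 1]
        count_above = total - count_below
        if count_below == 0 or count_above == 0:
            continue  # one group is empty, so there is nothing to compare
        mean_below = sum_below / count_below
        mean_above = (sum_all - sum_below) / count_above
        # Proportional to w0 * w1 * (mu0 - mu1)^2; the constant factor
        # total^2 does not change which t wins.
        score = count_below * count_above * (mean_below - mean_above) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t
```

The returned T can then be passed to the normal thresholding step.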

Otsu Thresholding
- Image scaled to 200 x 200
- Grouped bins into 4’s (0 – 3, 4 – 7, …, 252 – 255)
- T = 169 is what my Otsu chose

Otsu Thresholding, cont.

Segmenting
- In a real CV application, we would do more sophisticated segmenting
- Segmenting = breaking the image into regions of interest
- Suppose the region takes this form: R = [(x1, y1), (x2, y2), …, (xn, yn)]
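One simple way to build a region list R of this form from a thresholded image is to collect the coordinates of every foreground pixel. This sketch assumes the image is a list of rows of (r, g, b) tuples and that black is the foreground; which colour counts as foreground depends on the image, so it is left as a parameter. The name `extract_region` is hypothetical, and the sketch makes no attempt to isolate a single connected or central region.

```python
def extract_region(binary, foreground=(0, 0, 0)):
    """Return [(x, y), ...] for every pixel equal to the foreground colour.

    Coordinates are listed row by row, left to right.
    """
    return [
        (x, y)
        for y, row in enumerate(binary)
        for x, pixel in enumerate(row)
        if pixel == foreground
    ]
```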

Hu Moments

Hu Moments, cont.
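A sketch of the seven Hu invariants for a binary region R = [(x1, y1), …, (xn, yn)], written from the standard textbook definitions rather than from these slides. Every pixel in the region is assumed to have weight 1, so the zeroth moment is just n; the region must be non-empty. The names `hu_moments` and `eta` are hypothetical.

```python
def hu_moments(region):
    """Return the seven Hu moment invariants of a list of (x, y) pixels."""
    n = len(region)
    cx = sum(x for x, _ in region) / n   # centroid
    cy = sum(y for _, y in region) / n

    def eta(p, q):
        """Scale-normalized central moment of order (p, q)."""
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in region)
        return mu / n ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)

    a = n30 + n12
    b = n21 + n03
    c = 3 * n21 - n03
    d = n30 - 3 * n12

    h1 = n20 + n02
    h2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    h3 = d ** 2 + c ** 2
    h4 = a ** 2 + b ** 2
    h5 = d * a * (a ** 2 - 3 * b ** 2) + c * b * (3 * a ** 2 - b ** 2)
    h6 = (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b
    h7 = c * a * (a ** 2 - 3 * b ** 2) - d * b * (3 * a ** 2 - b ** 2)
    return [h1, h2, h3, h4, h5, h6, h7]
```

Because the moments are taken about the centroid and normalized by region size, moving the same shape elsewhere in the image or rotating it by 90 degrees should leave the seven values unchanged up to floating-point error.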

Complete Pipeline for apple.jpg
- Warning: my numbers could very well be wrong… (not on purpose)
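The last stage of the pipeline, querying a database of Hu moments for the best fit, could be sketched like this. The slides do not say how "best fit" is measured, so this assumes plain Euclidean distance between moment vectors and a database stored as a dict from label to moment list; `best_match` is a hypothetical name.

```python
import math


def best_match(database, query):
    """Return the label whose stored moment vector is closest to query.

    database: dict mapping a label (e.g. a filename) to a list of moments.
    query:    list of moments of the same length.
    Assumption: closeness is Euclidean distance.
    """
    def distance(a, b):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

    return min(database, key=lambda label: distance(database[label], query))
```

Since the higher-order Hu moments are typically much smaller in magnitude than the first, a distance on log-scaled values may discriminate better; that would be one of the possible improvements to try.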
