Computer Vision Chapter 1 Introduction.  The goal of computer vision is to make useful decisions about real physical objects and scenes based on sensed images.

Presentation transcript:

Computer Vision Chapter 1 Introduction

 The goal of computer vision is to make useful decisions about real physical objects and scenes based on sensed images.

Application areas  Industrial inspection  Medical imaging  Image database and query  Satellite and surveillance imagery  Entertainment  Handwriting and printed character recognition

Image dimensionality  1D –audio (sound)  2D –digital camera picture, chest x-ray, ultrasound  3D –video sequence of 2D images –multispectral 2D images –volumetric medical imagery (CT, MRI)  4D –PET-CT –MRI

Image types  Binary  Grayscale  Color  Multispectral

Operations on images  Neighborhood (local) operations  Enhancing the entire image  Combining multiple images –Ex. differences, noise reduction, blending  Feature extraction –Ex. area, centroid (center of mass), orientation, lines –invariants
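Two of the example features named above, area and centroid (center of mass), can be sketched in a few lines. This is a minimal illustration, not course code: the class name RegionFeatures and the method areaAndCentroid are hypothetical, and the image is assumed to be a binary int[][] indexed as m[r][c], with nonzero entries belonging to the object.

```java
/** Hypothetical helper (not part of the course code): area and centroid of the object pixels. */
final class RegionFeatures {
    /** Returns {area, centroidRow, centroidCol}; the centroid entries are NaN if the area is 0. */
    static double[] areaAndCentroid(int[][] m) {
        long area = 0, sumR = 0, sumC = 0;
        for (int r = 0; r < m.length; r++) {
            for (int c = 0; c < m[r].length; c++) {
                if (m[r][c] != 0) {   // assumption: nonzero means "object pixel"
                    area++;
                    sumR += r;
                    sumC += c;
                }
            }
        }
        if (area == 0) return new double[] { 0, Double.NaN, Double.NaN };
        // the centroid is the mean row and mean column of the object pixels
        return new double[] { area, (double) sumR / area, (double) sumC / area };
    }

    public static void main(String[] args) {
        int[][] m = {
            { 0, 0, 0, 0 },
            { 0, 1, 1, 0 },
            { 0, 1, 1, 0 },
        };
        double[] f = areaAndCentroid(m);
        System.out.println(f[0] + " " + f[1] + " " + f[2]); // prints "4.0 1.5 1.5"
    }
}
```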

Extracting features

Example features

General hardware discussion  General purpose vs. special purpose (DSP, GPU)  Uniprocessors vs. parallel processors (COWs, multiprocessors)  Sensors (discussed later)

General software discussion  Android SDK –Java-based –freely available from –Albie start app  doxygen for source code documentation –freely available from doxygen.org  code format – conv html

General software discussion  C# –use Visual C# (Express Edition is freely available from Microsoft) –CSImageViewer starter app –main course web page has links  doxygen for source code documentation  code format

INTRODUCTION TO DOXYGEN

What’s in a program file? 1. Comments 2. Code

What’s a compiler?  A program –Input –Processing –Output

What’s a compiler?  A program –Input:  Text file (your program) –Processing:  Convert HLL statements into machine code (or similar)  Ignore comments –Output:  A binary file of machine code (or similar)

Traditional documentation  Code files are separate from design documents.  Wouldn’t it be great if we could bring code and documentation together into the same file(s)?

Tools like doxygen and javadoc  A program –Input:  Text file (your program) –Processing:  Convert (specially formatted) comments into documentation  Ignore HLL statements –Output:  Documentation (typically in HTML)

Getting started with doxygen  Download from doxygen.org.  Do this only once, in the directory (folder) containing your source code (already done for you): doxygen -g  This creates a doxygen configuration file called Doxyfile, which you may edit to change default options.  Edit Doxyfile and make sure all EXTRACT options are set to YES.  Then, whenever you change your code and wish to update the documentation: doxygen  This updates all documentation in the html subdirectory.  Demonstrate.

Using doxygen: document every (source code) file
/**
 * \file ImageData.java
 * \brief contains ImageData class definition (note that this
 *        class is abstract)
 *
 * \author George J. Grevera, Ph.D.
 */

Using doxygen: document every class
//
/** \brief CSImageViewer class.
 *
 *  Longer description goes here.
 */
public class CSImageViewer : Form {

Using doxygen: document every function
//
/** \brief Given a pixel's row and column location, this
 *  function returns the gray pixel value.
 *  \param row image row
 *  \param col image column
 *  \returns the gray pixel value at that position
 */
public int getGray ( int row, int col ) {
    int offset = row * mW + col;
    return mOriginalData[ offset ];
}

Using doxygen: document every function (parameters)
//
/** \brief Given a pixel's row and column location, this
 *  function returns the gray pixel value.
 *  \param row image row
 *  \param col image column
 *  \returns the gray pixel value at that position
 */
public int getGray ( int row, int col ) {
    int offset = row * mW + col;
    return mOriginalData[ offset ];
}

Using doxygen: document every function (return value)
//
/** \brief Given a pixel's row and column location, this
 *  function returns the gray pixel value.
 *  \param row image row
 *  \param col image column
 *  \returns the gray pixel value at that position
 */
public int getGray ( int row, int col ) {
    int offset = row * mW + col;
    return mOriginalData[ offset ];
}

Using doxygen: document all class members (and global and static variables in C/C++)
protected bool   mIsColor;        ///< true if color (rgb); false if gray
protected bool   mImageModified;  ///< true if image has been modified
protected int    mW;              ///< image width
protected int    mH;              ///< image height
protected int    mMin;            ///< overall min image pixel value
protected int    mMax;            ///< overall max image pixel value
protected String mFname;          ///< (optional) file name

doxygen (lengthier example including html)
/** \brief Actual original (unmodified) unpacked (1 component per
 *  array entry) image data.
 *
 *  If the image data are gray, each entry in this array represents a
 *  gray pixel value. So mImageData[0] is the first pixel's gray
 *  value, mImageData[1] is the second pixel's gray value, and so
 *  on. Each value may be 8 bits or 16 bits. 16 bits allows for
 *  values in the range [ ].
 *
 *  If the image data are color, triples of entries (i.e., 3) represent
 *  each color rgb value. So each value is in [0..255] for 24-bit
 *  color where each component is 8 bits. So mImageData[0] is the
 *  first pixel's red value, mImageData[1] is the first pixel's green
 *  value, mImageData[2] is the first pixel's blue value, mImageData[3]
 *  is the second pixel's red value, and so on.
 */
protected int[] mOriginalData;
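The unpacked layout described in that comment can be turned into array offsets. The sketch below is only an illustration: the class UnpackedLayout and its two methods are hypothetical names, and it assumes rows are stored one after another (as in getGray's row * mW + col) with color components stored in r, g, b order.

```java
/** Hypothetical illustration of the unpacked layout (not part of the course code). */
final class UnpackedLayout {
    /** Offset of a gray pixel: one array entry per pixel, rows stored one after another. */
    static int grayOffset(int row, int col, int width) {
        return row * width + col;
    }

    /** Offset of one component (0 = red, 1 = green, 2 = blue) of a color pixel. */
    static int colorOffset(int row, int col, int width, int component) {
        return (row * width + col) * 3 + component;
    }

    public static void main(String[] args) {
        int width = 4; // assumed image width for the example
        System.out.println(grayOffset(1, 2, width));     // prints 6
        System.out.println(colorOffset(0, 1, width, 0)); // prints 3 (second pixel's red value)
        System.out.println(colorOffset(1, 2, width, 2)); // prints 20
    }
}
```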

Required documentation rules  Each file, class, method, and member variable must be documented with doxygen. –Exception is when we follow the one-class-per-file rule. In that case only the class or file needs to be documented.  The contents of the body of each method should contain comments, but none of these comments should be in the doxygen format. (Not every comment is a doxygen comment.)

Not every comment should be a doxygen comment. Required: 1. every file/class 2. every function/method 3. every class member (data) 4. (in C/C++, every static and/or global variable) Use regular, plain comments in the body of a function/method. (One exception is the \todo.)

int mColorImageData[][][]; ///< should be mColorImageData[mH][mW][3]
//
/** \brief Given a buffered image, this ctor reads the image data, stores
 *  the raw pixel data in an array, and creates a displayable version of
 *  the image. Note that this ctor is protected. The user should only
 *  use ImageData.load( fileName ) to instantiate an object of this type.
 *  \param bi buffered image used to construct this class instance
 *  \param w width of image
 *  \param h height of image
 *  \returns nothing (constructor)
 */
protected ColorImageData ( final BufferedImage bi, final int w, final int h ) {
    mW = w;
    mH = h;
    mOriginalImage = bi;
    mIsColor = true;
    //format TYPE_INT_ARGB will be saved to mDisplayData
    mDisplayData = mOriginalImage.getRGB(0, 0, mW, mH, null, 0, mW);
    mImageData = new int[ mW * mH * 3 ];
    mMin = mMax = mDisplayData[0] & 0xff;
    for (int i=0,j=0; i<mDisplayData.length; i++) {
        mDisplayData[i] &= 0xffffff; //just to ensure that we only have 24-bit rgb
        final int r = (mDisplayData[i] & 0xff0000) >> 16;
        final int g = (mDisplayData[i] & 0xff00) >> 8;
        final int b = mDisplayData[i] & 0xff;
        if (r<mMin) mMin = r;
        if (g<mMin) mMin = g;
        …
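The constructor on the slide is cut off, so the following is not its missing remainder. It is a stand-alone sketch of the same mask-and-shift idea, under stated assumptions: the class RgbUnpack and its methods unpack and minMax are hypothetical names, and pixels are assumed to be packed as 0xAARRGGBB ints like those returned by BufferedImage.getRGB.

```java
/** Hypothetical stand-alone sketch (not the course's ColorImageData). */
final class RgbUnpack {
    /** Splits packed 0xAARRGGBB pixels into r, g, b triples (one component per entry). */
    static int[] unpack(int[] packed) {
        int[] out = new int[packed.length * 3];
        for (int i = 0, j = 0; i < packed.length; i++) {
            int p = packed[i] & 0xffffff;     // drop the alpha byte, keep 24-bit rgb
            out[j++] = (p & 0xff0000) >> 16;  // red
            out[j++] = (p & 0xff00) >> 8;     // green
            out[j++] = p & 0xff;              // blue
        }
        return out;
    }

    /** Returns {min, max} over all entries; assumes a non-empty array. */
    static int[] minMax(int[] values) {
        int min = values[0], max = values[0];
        for (int v : values) {
            if (v < min) min = v;
            if (v > max) max = v;
        }
        return new int[] { min, max };
    }

    public static void main(String[] args) {
        int[] rgb = unpack(new int[] { 0xFF102030, 0x00FF0005 });
        System.out.println(java.util.Arrays.toString(rgb)); // prints [16, 32, 48, 255, 0, 5]
    }
}
```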

Summary of most useful tags  \file  \author  \brief  \param  \returns  \todo (not used in assignments)  And many, many others.

THE GOOD, THE BAD, AND THE UGLY Back to images and imaging…

The good, the bad, and the ugly.  Success is usually hard won!  Problems: 1.Matching models to reality 2.Lighting variation 3.Sensor noise 4.Occlusion & rotation/translation/scale 5.Limited resolution  An image is a discrete model of an underlying continuous function  Spatial discretization  Sensed values quantization 6.Levels Of Detail (LOD)

So let’s try to recognize chairs.  Task that is trivial for us.

The good, the bad, and the ugly.  Problem: Matching models to reality

The good, the bad, and the ugly.  Problem: Matching models to reality –Maybe CAD/CAM models can help!

The good, the bad, and the ugly.  Problem: Lighting variation

The good, the bad, and the ugly.  Problem: Sensor noise

The good, the bad, and the ugly.  Problem: Occlusion

The good, the bad, and the ugly.  Problem: Rotation, reflection, translation, & scale

The good, the bad, and the ugly.  Problem: –Limited resolution  An image is a discrete model of an underlying continuous function  Spatial discretization (above and below)  Sensed values quantization (next slide)  Too much of a good thing can be a problem too!

 Problem: –Limited resolution  An image is a discrete model of an underlying continuous function  Spatial discretization (previous slide)  Sensed values quantization (below: left 16M colors, right 16 colors very carefully chosen)

The good, the bad, and the ugly. Problem: Limited resolution of gray values (sensed value quantization)
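Quantization of sensed gray values can be simulated by mapping 8-bit values onto fewer levels. This is a rough sketch under assumptions, not the method used to produce the slide images: the class GrayQuantizer and the method quantize are hypothetical, inputs are assumed to lie in 0..255, and the levels are evenly spaced (the 16-color example on the earlier slide used carefully chosen colors instead).

```java
/** Hypothetical sketch of uniform gray-level quantization (not part of the course code). */
final class GrayQuantizer {
    /** Maps an 8-bit gray value (0..255) to one of `levels` evenly spaced values in 0..255. */
    static int quantize(int gray, int levels) {
        if (levels < 2 || levels > 256) {
            throw new IllegalArgumentException("levels must be in 2..256");
        }
        int bin = gray * levels / 256;     // index of the bin, 0..levels-1
        return bin * 255 / (levels - 1);   // spread the bins back over 0..255
    }

    public static void main(String[] args) {
        System.out.println(quantize(127, 2)); // prints 0
        System.out.println(quantize(128, 2)); // prints 255
        System.out.println(quantize(100, 4)); // prints 85
    }
}
```

With levels = 2 the result is a binary image; larger values of levels keep progressively more of the original gray detail.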

The good, the bad, and the ugly.  Problem: Levels Of Detail (LOD)

EXAMPLE APPLICATION: COUNTING BOLT HOLES

Example application: counting bolt holes  A missing bolt hole is a very costly defect during assembly.

Counting bolt holes  Pixel = (is a contraction for what?)  Dark = 1 = no light = no hole  Light = 0 = light = part of hole We could have used other conventions as well. –Dark = 0 = no light = no hole; light = 1 = light = part of hole. –Dark = 255; light = 0. –Dark > 127; light <= 127.
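The first convention (dark = 1, light = 0) can be produced from a gray image with a simple threshold. The sketch below rests on assumptions that are not in the slides: the class Threshold and the method toBinary are hypothetical, larger gray values are taken to be brighter, and the cutoff is passed in as a parameter.

```java
/** Hypothetical thresholding sketch (not part of the course code). */
final class Threshold {
    /**
     * Converts a gray image to binary using the convention dark = 1, light = 0.
     * Assumption: larger gray values are brighter; values above `threshold` count as light.
     */
    static int[][] toBinary(int[][] gray, int threshold) {
        int[][] m = new int[gray.length][];
        for (int r = 0; r < gray.length; r++) {
            m[r] = new int[gray[r].length];
            for (int c = 0; c < gray[r].length; c++) {
                m[r][c] = gray[r][c] > threshold ? 0 : 1;
            }
        }
        return m;
    }

    public static void main(String[] args) {
        int[][] m = toBinary(new int[][] { { 10, 200 }, { 127, 128 } }, 127);
        System.out.println(java.util.Arrays.deepToString(m)); // prints [[1, 0], [1, 0]]
    }
}
```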

Counting bolt holes  External corner (ext) = 2x2 neighborhood of exactly three 1s (and one 0)  Internal corner (int) = exactly three 0s (and one 1)  Holes = (ext – int) / 4

Counting bolt holes  External corner = 2x2 neighborhood of exactly three 1s (and one 0)  Internal corner = exactly three 0s (and one 1)  How many (total) possible combinations (of 4 bits) are there?

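Counting bolt holes  Ex. 3 objects  [Figure: image grid with labels "row 0 (y's)" and "column 5 (x's)". The top-left pixel is (x,y)=(c,r)=(0,0), i.e. [r][c]=[0][0]; the bottom-right pixel is (x,y)=(c,r)=(Cols-1,Rows-1), i.e. [r][c]=[Rows-1][Cols-1]. A pair of pixels is marked as adjacent in memory (in C++).]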

Algorithm for counting holes in a binary image
Input: a binary image, M, of R rows and C cols (indexed as M[r][c])
Output: number of holes M contains
int external=0, internal=0;
for (int r=0; r<R-1; r++) {
    for (int c=0; c<C-1; c++) {
        if ( isExternal(M,r,c) ) external++;
        else if ( isInternal(M,r,c) ) internal++;
    }
}
return (external-internal) / 4;

Hole counting assumptions  All image border pixels must be 1s.  Each region of 0s (holes) must be 4- connected.  Holes must also be simply connected (not contain any objects).
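Putting the corner definitions, the counting loop, and the assumptions above together, one possible Java sketch follows. The slides only name isExternal and isInternal, so their bodies here are an assumption based on the corner definitions (three 1s and one 0; three 0s and one 1). The class HoleCounter and the helper onesIn2x2 are hypothetical, and the image is assumed to contain only 0s and 1s with at least one row.

```java
/** Hypothetical sketch of the hole-counting algorithm (dark = 1, hole = 0). */
final class HoleCounter {
    /** Number of 1s in the 2x2 neighborhood whose top-left pixel is (r, c). */
    static int onesIn2x2(int[][] m, int r, int c) {
        return m[r][c] + m[r][c + 1] + m[r + 1][c] + m[r + 1][c + 1];
    }

    /** External corner: exactly three 1s and one 0. */
    static boolean isExternal(int[][] m, int r, int c) {
        return onesIn2x2(m, r, c) == 3;
    }

    /** Internal corner: exactly three 0s and one 1. */
    static boolean isInternal(int[][] m, int r, int c) {
        return onesIn2x2(m, r, c) == 1;
    }

    /** Counts holes as (external - internal) / 4, under the assumptions listed above. */
    static int countHoles(int[][] m) {
        int rows = m.length, cols = m[0].length;
        int external = 0, internal = 0;
        for (int r = 0; r < rows - 1; r++) {       // (r, c) is the top-left pixel of each 2x2 window
            for (int c = 0; c < cols - 1; c++) {
                if (isExternal(m, r, c)) external++;
                else if (isInternal(m, r, c)) internal++;
            }
        }
        return (external - internal) / 4;
    }

    public static void main(String[] args) {
        int[][] m = {
            { 1, 1, 1, 1, 1, 1, 1 },
            { 1, 0, 0, 1, 1, 0, 1 },
            { 1, 0, 0, 1, 1, 0, 1 },
            { 1, 1, 1, 1, 1, 1, 1 },
        };
        System.out.println(countHoles(m)); // prints 2
    }
}
```

In the example image each rectangular hole contributes four external corners and no internal ones, giving (8 - 0) / 4 = 2.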

Counting bolt holes  Ex. 3 objects –External –Internal

Counting bolt holes  Ex. 3 objects = (21-9) / 4 = 12 / 4 –External –Internal

Algorithm for counting holes in a binary image
Input: a binary image, M, of R rows and C cols (indexed as M[r][c])
Output: number of holes M contains
int external=0, internal=0;
for (int r=0; r<R-1; r++) {
    for (int c=0; c<C-1; c++) {
        if ( isExternal(M,r,c) ) external++;
        else if ( isInternal(M,r,c) ) internal++;
    }
}
return (external-internal) / 4;
Why r<R-1 and c<C-1?

PROBLEMS FOR DISCUSSION

Problems: Ex. 1.5 –Problems can be solved in different ways and a problem solver should not get trapped early in a specific approach. Consider the problem of identifying cars in various situations:  entering a parking lot or a secured area,  passing through a tollgate,  exceeding the speed limit. –Several groups are developing or have developed machine vision methods to read a car’s license plate. Suggest an alternative to machine vision. How do the economic and social costs compare to the machine vision approach?

Problems: Ex. 1.6 –Identify some defects remaining in the right image of Figure 1.8 and describe simple neighborhood operations that will improve the image.

Problems: Ex. 1.9 –Examine the image of bacteria in Figure 1.8 and the sample of automatically computed features in Figure Is there potential for obtaining a count of bacteria, say within 5% accuracy (in this example)? Explain.

Problems: Ex. 1.9 cont’d.

Problems: Ex. 1.12: On face interpretation. –Is it easy for you to decide the gender and approximate age of persons pictured in magazine ads? –Psychologists might tell us that humans have the ability to see a face and immediately decide on the age, sex, and degree of hostility of the person. Assume that this ability exists for humans. If you think it would be based on image features, then what are they? If you think that image features are not used, then explain how humans might make such decisions.

Problems: Ex. 1.14: Toward the correctness of hole counting. (a) How many possible 2x2 neighborhood patterns are there in a binary image? List them all. (b) Which of the patterns of part (a) cannot occur in a binary image that is 4-connected? Define border point to be the center grid point of a 2x2 neighborhood that contains both 0 and 1 pixels. (c) Argue that a single hole cannot be accounted for by just counting the number of e and i patterns along its border and that the formula n=(e-i)/4 is correct when one hole is present. (d) Argue that no two holes can have a common border point. (e) Argue that the formula is correct when an arbitrary number of holes is present.