Computer Vision CSPP 56553 Artificial Intelligence March 3, 2004.


1 Computer Vision CSPP 56553 Artificial Intelligence March 3, 2004

2 Roadmap
Motivation
– Computer vision applications
Is a picture worth a thousand words?
– Low-level features: feature extraction (intensity, color)
– High-level features: top-down constraints (shape from stereo, motion, ...)
Case study: vision as modern AI
– Fast, robust face detection (Viola & Jones 2002)

3 Perception
From observation to facts about the world
– Analogous to speech recognition
– Stimulus (percept) S, world W: S = g(W)
– Recognition: derive the world from the percept, W = g'(S)
Is this possible?

4 Key Perception Problem
Massive ambiguity
– Optical illusions
– Occlusion
– Depth perception: "Objects are closer than they appear" — is it full-sized or a miniature model?

5 Handling Uncertainty
Identify the single perfectly correct solution?
– Impossible: noise, ambiguity, complexity
Solution: a probabilistic model
– P(W|S) = αP(S|W)P(W)
– Maximize the product of the image likelihood P(S|W) and the world prior P(W)
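The Bayesian decision above can be sketched in a few lines: since α does not depend on W, picking the most probable world state only requires maximizing the product of likelihood and prior. The toy percept, world states, and probability tables below are invented purely for illustration.

```python
def map_estimate(percept, worlds, likelihood, prior):
    """Return the world state W maximizing P(S|W) * P(W); the
    normalizer alpha cancels out of the comparison."""
    return max(worlds, key=lambda w: likelihood(percept, w) * prior(w))

# Hypothetical toy example: an ambiguous small retinal image could come
# from a full-sized object far away or a miniature model nearby.
worlds = ["full-sized object far away", "miniature model nearby"]
prior = {"full-sized object far away": 0.9, "miniature model nearby": 0.1}
likelihood_table = {
    ("small retinal image", "full-sized object far away"): 0.6,
    ("small retinal image", "miniature model nearby"): 0.8,
}

best = map_estimate(
    "small retinal image", worlds,
    likelihood=lambda s, w: likelihood_table[(s, w)],
    prior=lambda w: prior[w],
)
# 0.6 * 0.9 = 0.54 beats 0.8 * 0.1 = 0.08, so the strong prior wins.
print(best)  # full-sized object far away
```

Even though the miniature-model hypothesis fits the percept slightly better, the prior over worlds dominates — which is exactly how the probabilistic model resolves ambiguity.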

6 Handling Complexity
Don't solve the whole problem
– Don't recover every object, position, color, ...
Solve a restricted problem
– Find all the faces
– Recognize a person
– Align two images

7 Modern Computer Vision Applications
– Face / object detection
– Medical image registration
– Face recognition
– Object tracking

8 Computer Vision Case Study
"Rapid Object Detection using a Boosted Cascade of Simple Features", Viola & Jones '01
Challenge:
– Object detection: find all faces in an arbitrary image
– Real-time execution: 15 frames per second
– Needs simple features and simple classifiers

9 Rapid Object Detection Overview
Fast detection with simple local features
– Simple, fast feature extraction: a small number of computations per pixel, using rectangular features
– Feature selection with AdaBoost: sequential feature refinement
– Cascade of classifiers: increasingly complex classifiers repeatedly rule out non-object areas

10 Picking Features
What cues do we use for object detection?
– Not direct pixel intensities
– Features can encode task-specific domain knowledge (bias) that is difficult to learn directly from data, and reduce the required training set size
– A feature system can also speed processing

11 Rectangle Features
Treat rectangles as units
– Derive statistics from them
Two-rectangle features
– Two similar rectangular regions, vertically or horizontally adjacent
– Sum the pixels in each region, then compute the difference between the regions

12 Rectangle Features II
Three-rectangle features
– Three similar rectangles, horizontally or vertically adjacent
– Sum the two outside rectangles and subtract from the center region
Four-rectangle features
– Compute the difference between diagonal pairs
HUGE feature set: ~180,000
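The two- and three-rectangle features above can be sketched directly as region sums on an image array. This is a minimal NumPy sketch with a hypothetical (x, y, w, h) parameterization for horizontally adjacent rectangles; a real implementation would also handle the vertical orientations and compute the sums via the integral image.

```python
import numpy as np

def two_rect_feature(img, x, y, w, h):
    """Two horizontally adjacent w-by-h regions: sum of the left
    rectangle minus sum of the right rectangle."""
    left = img[y:y + h, x:x + w].sum()
    right = img[y:y + h, x + w:x + 2 * w].sum()
    return int(left) - int(right)

def three_rect_feature(img, x, y, w, h):
    """Three horizontally adjacent w-by-h regions: sum of the two
    outside rectangles subtracted from the center rectangle."""
    a = img[y:y + h, x:x + w].sum()
    b = img[y:y + h, x + w:x + 2 * w].sum()
    c = img[y:y + h, x + 2 * w:x + 3 * w].sum()
    return int(b) - int(a) - int(c)
```

For example, on a 4x4 image whose left half is all ones and right half all zeros, `two_rect_feature(img, 0, 0, 2, 4)` responds strongly (value 8), which is why these features are sensitive to bars and edges.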

13 Rectangle Features

14 Computing Features Efficiently
Fast detection requires fast feature calculation
Rapidly compute an intermediate representation: the "integral image"
– The value at point (x,y) is the sum of the pixels above and to the left:
  ii(x,y) = Σ_{x'≤x, y'≤y} i(x',y')
– Computed in one pass over the image by the recurrences
  s(x,y) = s(x,y−1) + i(x,y), where s(x,y) is the cumulative row sum
  ii(x,y) = ii(x−1,y) + s(x,y)
Any rectangle sum can then be computed with only 4 array references
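The integral image and the four-reference rectangle sum can be sketched as follows. This is a minimal NumPy sketch: `cumsum` along both axes implements the inclusive sum above-and-left, and the `(x, y, w, h)` rectangle parameterization is an assumption for illustration.

```python
import numpy as np

def integral_image(img):
    """ii(x, y) = sum of all pixels above and to the left of (x, y),
    inclusive; one cumulative sum per axis."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, x, y, w, h):
    """Sum over the w-by-h rectangle with top-left corner (x, y),
    using exactly four array references into the integral image:
    D - B - C + A for the four surrounding corner values."""
    A = ii[y - 1, x - 1] if x > 0 and y > 0 else 0
    B = ii[y - 1, x + w - 1] if y > 0 else 0
    C = ii[y + h - 1, x - 1] if x > 0 else 0
    D = ii[y + h - 1, x + w - 1]
    return int(D) - int(B) - int(C) + int(A)
```

Once `ii` is built, every rectangle feature costs a constant number of lookups regardless of the rectangle's size, which is what makes per-pixel feature evaluation so cheap.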

15 Rectangle Feature Summary
Rectangle features are:
– Relatively simple
– Sensitive to bars, edges, and simple structure, but coarse
– Rich enough for effective learning
– Efficiently computable

16 Learning an Image Classifier
Supervised training with positive and negative examples
Many learning approaches are possible; here, AdaBoost:
– Selects features AND trains the classifier
– Improves the performance of simple classifiers
– Guaranteed to converge exponentially fast
Basic idea: a simple classifier is boosted by focusing each round on the examples previous rounds got wrong

17 Feature Selection and Training
Goal: pick only useful features out of the ~180,000
– Idea: a small number of features can be effective
The learner selects the single feature that best separates the positive and negative examples
– For each feature, the learner selects the optimal threshold θ
– Classifier: h(x) = 1 if p·f(x) < p·θ, 0 otherwise, where the polarity p ∈ {+1, −1} chooses the direction of the inequality
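The thresholded weak classifier and its training step can be sketched as a decision stump. This is an unweighted sketch over a single feature's values; real AdaBoost evaluates the error under the current example weights and repeats this search across all ~180,000 features per round.

```python
def weak_classifier(f_value, theta, polarity):
    """h(x) = 1 if p * f(x) < p * theta, else 0; the polarity
    p in {+1, -1} picks which side of the threshold means 'face'."""
    return 1 if polarity * f_value < polarity * theta else 0

def best_stump(feature_values, labels):
    """Exhaustive search for the (error, theta, polarity) with the
    lowest training error on one feature. Unweighted illustration:
    AdaBoost would weight each example's error term."""
    best = None
    for theta in feature_values:       # candidate thresholds
        for p in (1, -1):              # both polarities
            err = sum(weak_classifier(f, theta, p) != y
                      for f, y in zip(feature_values, labels))
            if best is None or err < best[0]:
                best = (err, theta, p)
    return best
```

On toy data where faces score low, e.g. `best_stump([1, 2, 8, 9], [1, 1, 0, 0])`, the search finds a zero-error stump at threshold 8 with positive polarity.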

18 Basic Learning Results
Initial classifier: frontal faces
– 200 features
– Detects 95% of faces with a 1-in-14,000 false-positive rate
– Very fast, though adding features adds computation time
The selected features are interpretable:
– The region around the eyes is darker than the nose and cheeks
– The eyes are darker than the bridge of the nose

19 "Attentional Cascade"
Goal: improved classification with reduced time
– Insight: small, fast classifiers can reject many windows while producing very few false negatives
– Reject the majority of uninteresting regions quickly; focus computation on interesting regions
Approach: a "degenerate" decision tree, a.k.a. a cascade
– Positive results are passed on to later, high-detection classifiers
– Negative results are rejected immediately

20 Cascade Schematic
[Figure: all sub-windows enter classifier 1; each classifier (CL 1, CL 2, CL 3, ...) passes a sub-window on (T) to the next, more complex classifier or rejects it (F) immediately; sub-windows surviving every stage proceed to further classifiers.]
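The cascade's control flow is short-circuit evaluation: the first stage to say "no" ends all work on that window. A minimal sketch, with hypothetical toy stages (simple thresholds on a score) standing in for trained classifiers of increasing complexity:

```python
def cascade_classify(window, stages):
    """Run a window through the attentional cascade: the first stage
    that returns False rejects the window immediately; only windows
    accepted by every stage are reported as detections."""
    for stage in stages:
        if not stage(window):
            return False
    return True

# Hypothetical stages: each is stricter (and, in practice, costlier)
# than the one before it.
stages = [lambda w: w > 0, lambda w: w > 5, lambda w: w > 9]
```

A clearly bad window such as `cascade_classify(-1, stages)` costs only one stage evaluation, while `cascade_classify(12, stages)` runs all three — matching the observation that most computation is spent on the rare promising regions.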

21 Cascade Construction
Each stage is a trained classifier
– Tune its threshold to minimize false negatives
A good first-stage classifier: a two-feature strong classifier (eye/cheek + eye/nose features)
– Tuned to detect 100% of faces, at 40% false positives
– Very computationally efficient: about 60 microprocessor instructions

22 Cascading
Goal: reject bad windows quickly
– Most windows are bad: reject them early in processing, with little effort
– Good regions trigger the full cascade, but are relatively rare
Classification becomes progressively more difficult deeper in the cascade
– The most obvious cases have already been rejected
– Deeper classifiers are more complex and more error-prone

23 Cascade Training
Tradeoff: accuracy vs. cost
– More accurate classifiers need more, and more complex, features
– More and more complex features make the classifier slower
– A difficult joint optimization
Practical approach:
– Each stage reduces the false-positive rate
– Bound the per-stage reduction in false positives and the allowed increase in misses
– Add features to each stage until it meets its targets
– Add stages until the overall effectiveness targets are met
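The stage-wise targets compose multiplicatively: if each stage passes a fraction f of non-face windows (and stages behave roughly independently, an idealizing assumption), n stages give an overall false-positive rate of about f^n. A small sketch of that arithmetic:

```python
import math

def stages_needed(stage_fpr, overall_fpr_goal):
    """If each stage independently passes a fraction `stage_fpr` of
    non-face windows, the overall false-positive rate after n stages
    is stage_fpr ** n; return the smallest n meeting the goal."""
    return math.ceil(math.log(overall_fpr_goal) / math.log(stage_fpr))

# With the 40% per-stage false-positive rate from the first-stage
# example, driving the overall rate below one in a million takes:
print(stages_needed(0.4, 1e-6))  # 16
```

Detection rates compose the same way, which is why each stage must keep its miss rate extremely low: even 99% per-stage detection over many stages noticeably erodes the overall detection rate.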

24 Results
Task: detect frontal upright faces
– Training images: ~5,000 hand-labeled faces; ~9,500 non-faces from a random web crawl, hand-checked
Classifier characteristics:
– 38-layer cascade with an increasing number of features per layer (1, 10, 25, ...), 6,061 features in total
Classification:
– An average of 10 features evaluated per window; most windows are rejected in the first 2 layers
– Processes a 384×288 image in 0.067 seconds

25 Detection Tuning
Multiple detections:
– Many sub-windows around a face will fire
– Partition overlapping detections into disjoint subsets, and for each subset report a single box: the average of the member boxes' corners
Voting:
– Three similarly trained detectors, majority rules
– Improves overall performance
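The overlap-merging step can be sketched as greedy grouping plus corner averaging. This is a minimal sketch with boxes as (x1, y1, x2, y2) tuples and a simple greedy grouping; the paper's exact partitioning criterion is not spelled out on the slide, so treat the details as illustrative.

```python
def overlaps(a, b):
    """True if the interiors of axis-aligned boxes a and b intersect."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2

def merge_detections(boxes):
    """Greedily group overlapping boxes into disjoint subsets, then
    report one box per group: the corner-wise average of its members."""
    groups = []
    for box in boxes:
        for g in groups:
            if any(overlaps(box, other) for other in g):
                g.append(box)
                break
        else:
            groups.append([box])
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]
```

For example, two near-duplicate hits on one face collapse to a single averaged box, while a detection elsewhere in the image is reported separately.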

26 Conclusions
Fast, robust face detection
– Simple, efficiently computable features
– Simple trained classifiers
– A classification cascade allows early rejection; the early classifiers are themselves simple and fast
– Good overall classification in real time

