Detecting Shapes in Cluttered Images


Detecting Shapes in Cluttered Images CSE 6367 – Computer Vision Vassilis Athitsos University of Texas at Arlington

Objects With Variable Shape Structure Objects with parts that can: occur an unknown number of times (leaves, comb teeth); not be present at all (leaves, comb teeth, fingers); have alternative appearances (fingers). Examples: branches, combs, hands.

Other Examples Satellite Imaging: boat pier, parking lot, residential neighborhood.

Roadways highway exits

Other Examples Medical Imaging ribs & bronchial tree

Goal: Detection in Cluttered Images Branches Hands

Deformable Models Not Adequate Shape Modeling: “Shape deformation” ≠ “structure variation”

Limitations of Fixed-Structure Methods A different model is needed for each fixed structure.

Limitations of Fixed-Structure Methods Worst case: the number of models is exponential in the number of object parts.

Overview of Approach Hidden Markov Models (HMMs) can generate shapes of variable structure, but they cannot be matched directly to an image: image features are not ordered, and many (possibly most) features are clutter. Solution: Hidden State Shape Models.

An HMM for Branches of Leaves initial state stem right leaf left leaf top leaf end state Each box is a state. Arrows show legal state transitions.

An HMM for Branches of Leaves initial state stem right leaf left leaf top leaf end state stem left leaf Each box is a state. For each state there is a probability distribution of shape appearance.
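To make the model concrete, here is a minimal Python sketch of such an HMM. The state names and legal transitions follow the diagram described above; the transition probabilities and the Gaussian "shape appearance" parameters are made-up placeholders, not values from the slides.

```python
import numpy as np

# States of the branch model; "init" and "end" are non-emitting.
STATES = ["init", "stem", "left_leaf", "right_leaf", "top_leaf", "end"]

# Legal transitions with hypothetical probabilities (each row sums to 1).
TRANSITIONS = {
    "init":       {"stem": 1.0},
    "stem":       {"stem": 0.3, "left_leaf": 0.25, "right_leaf": 0.25, "top_leaf": 0.2},
    "left_leaf":  {"stem": 1.0},
    "right_leaf": {"stem": 1.0},
    "top_leaf":   {"end": 1.0},
    "end":        {},
}

# Hypothetical per-state shape-appearance distributions: each emitting state
# generates a small feature vector (say, orientation and relative length)
# drawn from an axis-aligned Gaussian.
SHAPE_MODELS = {
    "stem":       {"mean": np.array([0.0,  1.0]), "std": np.array([0.2, 0.3])},
    "left_leaf":  {"mean": np.array([-0.8, 0.5]), "std": np.array([0.3, 0.2])},
    "right_leaf": {"mean": np.array([0.8,  0.5]), "std": np.array([0.3, 0.2])},
    "top_leaf":   {"mean": np.array([0.0,  0.6]), "std": np.array([0.3, 0.2])},
}
```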

Generating a Branch initial state stem right leaf left leaf top leaf end state Choose a legal initial state. Choose a shape from the state’s shape distribution.

Generating a Branch initial state stem right leaf left leaf top leaf end state Then, repeatedly: choose a legal transition to a next state, and choose a shape from the next state’s shape distribution.

Generating a Branch initial state stem right leaf left leaf top leaf end state Can stop when at a legal end state.
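The generation procedure just described can be written as a short sampler over the hypothetical TRANSITIONS and SHAPE_MODELS from the sketch above: start at the initial state, take legal transitions, draw a shape from each visited emitting state's distribution, and stop when the end state is reached.

```python
import numpy as np

def sample_branch(transitions, shape_models, seed=0):
    """Generate one branch as a list of (state, shape_vector) pairs."""
    rng = np.random.default_rng(seed)
    parts = []
    state = "init"                                   # the single legal initial state
    while True:
        choices = list(transitions[state].keys())
        probs = list(transitions[state].values())
        state = rng.choice(choices, p=probs)         # choose a legal transition
        if state == "end":                           # can stop at a legal end state
            break
        m = shape_models[state]
        shape = rng.normal(m["mean"], m["std"])      # shape from the state's distribution
        parts.append((state, shape))
    return parts

# Example: the state sequence of one generated branch.
# print([s for s, _ in sample_branch(TRANSITIONS, SHAPE_MODELS)])
```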

An HMM for Handshapes Each circle is a state (object part). Arrows show legal state transitions.

Matching Observations via DTW initial state stem right leaf left leaf top leaf end state Given a sequence of shape parts, what is the optimal sequence of corresponding states?

Matching Observations via DTW (shape parts numbered 1–9) initial state stem right leaf left leaf top leaf end state Given a sequence of shape parts, what is the optimal sequence of corresponding states?

Matching Observations via DTW initial state stem right leaf left leaf top leaf end state Optimal matching of parts to states: 1 stem, 2 stem, 3 left leaf, 4 stem, 5 right leaf, 6 stem, 7 left leaf, 8 stem, 9 top leaf. The Viterbi algorithm produces a globally optimal answer.
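For completeness, here is a standard log-space Viterbi sketch for this clean, ordered case (generic HMM notation, not the paper's implementation): log_I holds initial-state log-probabilities, log_A the transition matrix, and log_B[t, i] the log-probability of observation t under state i.

```python
import numpy as np

def viterbi(log_I, log_A, log_B):
    """log_I: (N,), log_A: (N, N), log_B: (T, N) for T observations and N states.
    Returns the most probable state sequence and its log-probability."""
    T, N = log_B.shape
    delta = np.full((T, N), -np.inf)   # best log-prob of any path ending in state i at time t
    back = np.zeros((T, N), dtype=int)
    delta[0] = log_I + log_B[0]
    for t in range(1, T):
        for j in range(N):
            scores = delta[t - 1] + log_A[:, j]
            back[t, j] = np.argmax(scores)
            delta[t, j] = scores[back[t, j]] + log_B[t, j]
    # Backtrack from the best final state.
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(np.max(delta[-1]))
```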

DTW/HMMs Cannot Handle Clutter In a clean image with ordered observations (parts 1–9), DTW/HMM can parse the observations; in a cluttered image, the observations are unordered and this no longer works.

Key Challenges in Clutter The observations are not ordered. Many (possibly most) observations should not be matched. Solution: Hidden State Shape Models (HSSMs), which extend HMMs and the Viterbi algorithm to address clutter.

Inputs (1) Shape Model: shape parts (initial state, stem, right leaf, left leaf, top leaf, end state), legal transitions, initial/end states, and more. (2) Possible matches for each model state: e.g., possible stem locations and possible leaf locations.

Algorithm Input/Output Input: Set of K image features {F1, F2, …, FK} (oriented edge pixels), e.g., possible stem locations and possible leaf locations.

Algorithm Input/Output Input: Set of K image features {F1, F2, …, FK}. Output: Registration ((Q1, O1), (Q2, O2), …, (QT, OT)): a sequence of matches, where Qj is a model state and Oj is an image feature. Model states tell us the structure; features tell us the location. For the branch example, the registration might begin Q1: stem, Q2: left leaf, Q3: stem, Q4: right leaf, … and eventually cover the stem, the left and right leaves, and the top leaf.

Finding the Optimal Registration Some possible registrations vs. the optimal registration (covering the stem, left leaves, right leaves, and top leaf). The number of possible registrations is exponential in the number of image features, so evaluating each possible registration is intractable. We can find the optimal registration in polynomial time (quadratic or linear in the number of features) using dynamic programming (a modified Viterbi algorithm).

Evaluating a Registration ((Q1, O1), (Q2, O2), (Q3, O3), (Q4, O4), …, (QT, OT)). Probability is the product of: I(Q1): prob. of the initial state. B(Qj, Oj): prob. of the feature given the model state. A(Qj, Qj+1): prob. of the transition from Qj to Qj+1. D(Qj, Oj, Qj+1, Oj+1): prob. of observing Oj+1 given state Qj+1 and the previous pair (Qj, Oj); this term does not appear in HMMs, because there the order is known.
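A small sketch of how a candidate registration could be scored according to these four terms, working in log-probabilities to avoid underflow. The four score functions are passed in as callables, since their exact forms depend on the model; this is an illustration, not the authors' code.

```python
def registration_log_prob(registration, log_I, log_B, log_A, log_D):
    """Score a registration, given as a list of (Q_j, O_j) pairs, as the sum of
    the log versions of the I, B, A, and D terms described above."""
    q1, o1 = registration[0]
    total = log_I(q1) + log_B(q1, o1)
    for (q_prev, o_prev), (q_next, o_next) in zip(registration, registration[1:]):
        total += log_A(q_prev, q_next)                    # transition Q_j -> Q_{j+1}
        total += log_B(q_next, o_next)                    # feature given model state
        total += log_D(q_prev, o_prev, q_next, o_next)    # ordering/geometry term (not in plain HMMs)
    return total
```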

Finding the Optimal Registration Dynamic programming algorithm: break the problem up into smaller, interdependent subproblems. Definition: W(i, j, k) is the optimal registration such that: the registration length is j, Qj = Si, and Oj = Fk. Optimal registration: the W(i, j, k) with the highest probability such that Qj is a legal end state. It therefore suffices to compute all W(i, j, k).

Finding the Optimal Registration W(i, j, k): the highest-probability registration such that the registration length is j, Qj = Si, and Oj = Fk. W(i, 1, k) is trivial: there is only one choice. The registration consists of pairing Si with Fk, with cost I(Si) + B(Si, Fk). W(i, 2, k) is non-trivial: we have choices. Registration: ((Q1, O1), (Q2, O2)) with Q2 = Si, O2 = Fk. What we do not know: Q1, O1. We can try all possible (Q1, O1); there are not too many.

Finding the Optimal Registration W(i, 3, k): even more choices. Registration: ((Q1, O1), (Q2, O2), (Q3, O3)) with Q3 = Si, O3 = Fk. What we do not know: Q1, O1, Q2, O2. Here we use dynamic programming: if W(i, 3, k) = ((Q1, O1), (Q2, O2), (Q3, O3)), then its prefix ((Q1, O1), (Q2, O2)) must itself be an optimal W(i', 2, k'), where Si' = Q2 and Fk' = O2. So if we have computed all W(i, 2, k), the number of choices for W(i, 3, k) is manageable. This way we compute W(i, j, k) for all j, and the number of choices does not increase with j.
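The recursion can be written directly as a triple-indexed table. The brute-force sketch below enumerates all predecessors, so it is slower than the quadratic/linear algorithm mentioned above (which restricts the candidate predecessors); it only illustrates the structure of W(i, j, k). The function name and argument conventions are hypothetical, not the authors' implementation.

```python
import numpy as np

def optimal_registration_table(log_I, log_A, log_B, log_D, features, n_states, max_len):
    """Brute-force version of the W(i, j, k) recursion.

    log_I[i], log_A[i2, i]: arrays of initial-state and transition log-probabilities.
    log_B(i, f), log_D(i2, f2, i, f): callables for the emission and pairwise terms.
    W[j, i, k]: best log-prob of a length-(j+1) registration ending with state i
    matched to feature k; back[j, i, k] stores its best predecessor (i2, k2)."""
    K = len(features)
    W = np.full((max_len, n_states, K), -np.inf)
    back = np.full((max_len, n_states, K, 2), -1, dtype=int)

    # Length-1 registrations: only one choice per (state, feature) pair.
    for i in range(n_states):
        for k in range(K):
            W[0, i, k] = log_I[i] + log_B(i, features[k])

    # Longer registrations reuse the optimal shorter ones (dynamic programming).
    for j in range(1, max_len):
        for i in range(n_states):
            for k in range(K):
                for i2 in range(n_states):
                    for k2 in range(K):
                        s = (W[j - 1, i2, k2]
                             + log_A[i2, i]
                             + log_B(i, features[k])
                             + log_D(i2, features[k2], i, features[k]))
                        if s > W[j, i, k]:
                            W[j, i, k] = s
                            back[j, i, k] = (i2, k2)
    # The reported registration is the highest-scoring W[j, i, k] whose state i is
    # a legal end state; backtracking through `back` recovers the (Q, O) pairs.
    return W, back
```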

Unknown-scale Problem What is the length of the optimal registration? (Candidate lengths in the example: T = 400, T = 600, T = 800 vs. T = 500, T = 800, T = 1000.) Probability decreases with registration length, so HMMs are biased towards short registrations.

Unknown-scale Problem What is the length of the optimal registration? (Candidate registrations A, B, C.) Probability decreases with registration length, so HMMs are biased towards short registrations.
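A back-of-the-envelope illustration of this bias (the 0.9 emission factor is purely made up): if every matched feature contributes a probability of about 0.9, then

\[ 0.9^{400} \approx 5 \times 10^{-19} \quad\text{vs.}\quad 0.9^{800} \approx 2.5 \times 10^{-37}, \]

so the registration half as long scores higher by roughly eighteen orders of magnitude, no matter how much more of the object the longer one explains.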

Handling Unknown Scale Registration: ((Q1, O1), (Q2, O2), (Q3, O3), (Q4, O4), …, (QT, OT)). Probability is product of: I(Q1): prob. of initial state. B(Qj, Oj): prob. of feature given model state. A(Qj, Qj+1): prob. of transition from Qj to Qj+1. D(Qj, Oj, Qj+1, Oj+1): prob. of observing Oj+1 given state = Qj+1, and given previous pair (Qj, Oj).

Handling Unknown Scale Before: B(Qj, Oj) was the prob. of the feature given the state, so adding a feature always decreased the registration probability. Now: B(Qj, Oj) is a ratio, B(Qj, Oj) = P(Oj | Qj) / P(Oj | clutter), which can be greater or less than 1. Adding a feature may therefore increase or decrease the registration probability, and the bias towards short registrations is removed.

Another Perspective Before: the probability depended only on the features matched with the model, so fewer features meant higher probability. The probability function should instead consider the entire image: the features matched with the model and the features assigned to clutter.

Another Perspective For every feature, does it match better with the model or with clutter? Compute P(Fk | clutter). Given a registration R, define: C(R): the set of features left out (assigned to clutter). F: the set of all image features. Total probability: P(registration) P(C(R) | clutter).

Another Perspective Given a registration R, define: C(R): the set of features left out. F: the set of all image features. The total probability P(registration) P(C(R) | clutter) is proportional to P(registration) P(C(R) | clutter) / P(F | clutter), since P(F | clutter) does not depend on R.

Another Perspective Goal: maximize P(registration) P(C(R) | clutter) / P(F | clutter). This is equal to the product of: I(Q1): prob. of the initial state. B(Qj, Oj) = P(Oj | Qj) / P(Oj | clutter). A(Qj, Qj+1): prob. of the transition from Qj to Qj+1. D(Qj, Oj, Qj+1, Oj+1): prob. of observing Oj+1 given state Qj+1 and the previous pair (Qj, Oj).
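Spelling this out (a reconstruction from the slide text, with R the registration, assuming clutter features are conditionally independent and each image feature is matched to at most one state):

\[
\frac{P(R)\,P(C(R)\mid \mathrm{clutter})}{P(F\mid \mathrm{clutter})}
= I(Q_1)\,\prod_{j=1}^{T-1} A(Q_j, Q_{j+1})\, D(Q_j, O_j, Q_{j+1}, O_{j+1})
\;\prod_{j=1}^{T} \frac{P(O_j\mid Q_j)}{P(O_j\mid \mathrm{clutter})}.
\]

The features assigned to clutter contribute the same factor to P(C(R) | clutter) and to P(F | clutter) and cancel, so only the likelihood ratios of the matched features remain, which is exactly the ratio form of B(Qj, Oj) introduced above.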

Experiments Datasets: 353 hand images; 100 images of branches. Methods: HSSM (known scale); HSSM+CM (unknown scale, with clutter modeling); HSSM+CM+SEG (unknown scale, with clutter modeling and a segmental model). Time (search over 8 different orientations): 2-3 minutes per image for HSSM and HSSM+CM; under 30 minutes per image for HSSM+CM+SEG.

Results on Hands
                       Chamfer   Oriented Chamfer   HSSM
Correct recognition    4%        22%                34%
Correct localization   35%       55%                60%

Results on Hands
                       HSSM   HSSM+CM   HSSM+CM+SEG
Correct recognition    34%    59%       70%
Correct localization   60%    85%       95%

Results on Branches
                       HSSM
Correct recognition    43%
Correct localization   79%

Results on Branches
                       HSSM   HSSM+CM   HSSM+CM+SEG
Correct recognition    43%    65%       N/A
Correct localization   79%    98%       —

Recap HSSMs model objects of variable shape structure. Key advantages: A single variable-shape model, vs. one model for each shape structure (exponential worst-case complexity). Detection can tolerate very large amounts of clutter. Object features can be a very small fraction of all features. Optimal registration length found automatically. Scale dependency between parts is captured using a segmental model.