1
Petacat: Applying ideas from Copycat to image understanding
2
How Streetscenes Works (Bileschi, 2006)
1. Densely tile the image with windows of different sizes.
2. Compute HMAX C2 features in each window.
3. Give the features in each window as input to each of five trained support vector machines (“pedestrian”, “car”, “bicycle”, “building”, “tree”).
4. If any SVM returns a classification score above a learned threshold, that object is said to be “detected”.
…
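The four steps above can be sketched roughly as follows. This is a minimal illustration, not Bileschi's implementation: `tile_windows`, `detect`, the square windows, the half-window stride, and the `features` and `classifiers` callables are all hypothetical stand-ins for the HMAX C2 computation and the trained SVMs.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def tile_windows(img_w: int, img_h: int, sizes: Sequence[int],
                 stride_frac: float = 0.5) -> Iterator[Window]:
    """Densely tile the image with square windows of several sizes (assumed layout)."""
    for size in sizes:
        step = max(1, int(size * stride_frac))
        for y in range(0, img_h - size + 1, step):
            for x in range(0, img_w - size + 1, step):
                yield (x, y, size, size)


def detect(img_w: int, img_h: int, sizes: Sequence[int],
           features: Callable[[Window], List[float]],
           classifiers: Dict[str, Callable[[List[float]], float]],
           thresholds: Dict[str, float]) -> List[Tuple[str, Window, float]]:
    """Exhaustively score every window with every classifier."""
    detections = []
    for window in tile_windows(img_w, img_h, sizes):
        feats = features(window)  # stand-in for HMAX C2 features
        for label, score_fn in classifiers.items():
            score = score_fn(feats)  # stand-in for a trained SVM's score
            if score > thresholds[label]:
                detections.append((label, window, score))
    return detections
```

The nested loops make the cost explicit: every window is paired with every classifier, which is the exhaustive search the later slides set out to avoid.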
3
Object detection (here, “car”) with HMAX model (Bileschi, 2006)
4
Limitations of Streetscenes approach for “image understanding”
5
Exhaustive search – not scalable.
Does not recognize spatial and abstract relationships among objects for whole-scene understanding.
Has no prior knowledge about object categories and their place in “conceptual space”.
HMAX model is completely feed-forward; no feedback to allow context to aid in scene understanding.
–Where should feedback come in?
6
Representation of High-Level Knowledge: A Simple Semantic Network (or “Ontology”)
[Diagram: “Dog walking” network with nodes Person, Dog, leash, and walking; link labels: holds, attached to, action]
7
But...
8
Modified Ontology
[Diagram: “Dog walking” network with nodes Person, Dog, Dog Group, leash, walking, and running; link labels: holds, attached to, action]
9
Modified Ontology
Allowing “conceptual slippage”
[Diagram: “Dog walking” network with nodes Person, Dog, Dog Group, leash, walking, and running; link labels: holds, attached to, action]
10
But...
12
Modified Ontology
[Diagram: “Dog walking” network with nodes Person, Dog, Dog Group, Cat, Iguana, leash, walking, and running; link labels: holds, attached to, action]
13
But...
17
Modified Ontology
[Diagram: “Dog walking” network with nodes Person, Dog, Dog Group, Cat, Iguana, Bicycle, Car, Helicopter, leash, walking, and running; link labels: holds, attached to, action]
18
But...
19
[Diagram: a much larger concept network] Person, Dog, Leash, Outside, Ground, Walking, Running, Standing, Tree, Inside, Stick, Close to, Far from, Beach, Sidewalk, Attached to, Grass, Lawn mower, Gasoline, Runway, Airplane, Helicopter, Above, Left of, Holding, Dog walking, Dog grooming, Car, Sky, Army, Track, Fanny pack, Backpack
25
Need dynamical process of constructing representation.
Information gained during the unfolding of perception feeds back to guide the directions the perceptual process takes.
–Ongoing perception of “context” brings in appropriate concepts and conceptual slippages, and avoids exhaustive search.
–Prior, higher-level knowledge interacts with lower-level vision in both directions (bottom-up and top-down).
–Concepts are “fluid”, allowed to “slip” in certain contexts. This allows perception of essential similarity in the face of superficial differences, i.e., analogy-making.
26
Active Symbol Architecture (Hofstadter et al., 1995)
27
Basis for
–Copycat (analogy-making), Hofstadter & Mitchell
–Tabletop (analogy-making), Hofstadter & French
–Metacat (analogy-making and self-awareness), Hofstadter & Marshall
and many others…
30
Active Symbol Architecture (Hofstadter et al., 1995)
Components: semantic network, workspace, perceptual agents (codelets), temperature
31
Petacat (descendant of Copycat)
Integration of Active Symbol Architecture and HMAX.
Initial task: Decide if image is an instance of “taking a dog for a walk”, and if so, how good an instance it is.
32
Semantic Network
Situation node: taking a dog for a walk
Node types: Object, Action, Spatial Relation
Object nodes: person, dog, cat, horse, leash, rope, string, belt
Action nodes: walks, runs, stands, sits, drives, flies, swims
Spatial relation nodes: is on, is touching, is in front of, is behind, is next to
Location and surface nodes: outdoors, indoors, road, beach, trail, sidewalk, grass
Link labels: has location, has action, has component
33
Semantic Network
Link types: property links, slip links
[Diagram: same network as slide 32]
34
Semantic Network
Link types: property links, slip links
Properties of nodes
[Diagram: same network as slide 32]
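A network with the two link types named on these slides could be represented along the following lines. This is only a sketch: the class `SemanticNetwork`, its method names, the choice to make slip links symmetric, and the particular slip links added in `build_dog_walking_network` are assumptions for illustration, since the slide's exact link set is not recoverable from the transcript.

```python
from collections import defaultdict


class SemanticNetwork:
    """Toy concept network with labeled property links and slip links."""

    def __init__(self):
        self.property_links = defaultdict(list)  # source -> [(label, target)]
        self.slip_links = defaultdict(set)       # concept -> concepts it may slip to

    def add_property(self, source, label, target):
        self.property_links[source].append((label, target))

    def add_slip(self, a, b):
        # Slippage is treated as symmetric here (an assumption).
        self.slip_links[a].add(b)
        self.slip_links[b].add(a)

    def can_slip(self, a, b):
        """True if concept a is b, or is directly slip-linked to b."""
        return a == b or b in self.slip_links.get(a, set())


def build_dog_walking_network():
    net = SemanticNetwork()
    situation = "taking a dog for a walk"
    for component in ("person", "dog", "leash"):
        net.add_property(situation, "has component", component)
    net.add_property(situation, "has location", "outdoors")
    # Illustrative slip links only (hypothetical choices).
    net.add_slip("dog", "cat")
    net.add_slip("leash", "rope")
    net.add_slip("walks", "runs")
    return net
```

With this structure, asking whether a perceived "rope" can stand in for the prototype's "leash" is a single `can_slip` lookup.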
35
Workspace
36
Semantic network Workspace
37
Semantic network Perceptual Agents (Codelets) Codelets as active symbols
38
[Diagram: semantic network, same nodes and links as slide 32]
42
Dog? Illustration of what we plan to have happen – not a real run of Petacat
46
Proposed windows: Dog? Sidewalk? Person? Dog? Outdoors?
Scout codelets: Send C1 features in window to corresponding SVM. If positive result, post builder codelet with urgency equal to SVM’s confidence.
47
Scout codelets: Send C1 features in window to corresponding SVM. If positive result, post builder codelet with urgency equal to SVM’s confidence.
Scout results:
Dog? negative
Dog? negative
Sidewalk? positive: 0.4
Person? negative
Outdoors? positive: 0.7
Dog? positive: 0.8
48
Builder codelets: Ask HMAX to compute C2 features using prototypes specific to the object (or scene), and send them to corresponding SVM. If positive, decide to build structure with probability equal to SVM confidence. Break competing structures if necessary.
Scout results:
Dog? negative
Dog? negative
Sidewalk? positive: 0.4
Person? negative
Outdoors? positive: 0.7
Dog? positive: 0.8
49
Builder codelets: Ask HMAX to compute object-/scene-specific C2 features, and send them to corresponding SVM. If positive, decide to build structure with probability equal to SVM confidence. Break competing structures if necessary.
Built: Outdoors, Dog
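The scout and builder behavior described on these slides might look roughly like this. Everything named here is hypothetical: the `Coderack` class, the `make_scout` and `make_builder` helpers, and the `c1_svm` and `c2_svm` callables (stand-ins that return a positive confidence or a non-positive value for a negative result). Breaking competing structures is left out of the sketch.

```python
import random


class Coderack:
    """Pool of pending codelets, run one at a time in urgency-weighted random order."""

    def __init__(self, rng):
        self.rng = rng
        self.codelets = []  # list of (urgency, callable)

    def post(self, urgency, codelet):
        self.codelets.append((urgency, codelet))

    def run_one(self):
        weights = [urgency for urgency, _ in self.codelets]
        index = self.rng.choices(range(len(self.codelets)), weights=weights)[0]
        _, codelet = self.codelets.pop(index)
        codelet()


def make_builder(label, window, c2_svm, workspace, rng):
    def builder():
        confidence = c2_svm(label, window)  # stand-in for SVM on object-specific C2 features
        # Build the structure with probability equal to the SVM's confidence.
        if confidence > 0 and rng.random() < confidence:
            workspace.append((label, window, confidence))
    return builder


def make_scout(label, window, c1_svm, c2_svm, coderack, workspace, rng):
    def scout():
        confidence = c1_svm(label, window)  # stand-in for SVM on C1 features
        if confidence > 0:
            # Post a builder whose urgency equals the scout's confidence.
            coderack.post(confidence, make_builder(label, window, c2_svm, workspace, rng))
    return scout
```

Because a builder is only posted after a positive scout result, the more expensive C2 computation is requested only for windows that already look promising.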
50
[Diagram: semantic network, same nodes and links as slide 32]
51
Built: Dog, Outdoors
Proposed: Dog? Leash? Leash? Sidewalk? Person?
52
Built: Dog, Outdoors, Sidewalk, Person
Strength: 0.6
53
Built: Dog, Outdoors, Sidewalk
54
[Diagram: semantic network, same nodes and links as slide 32]
55
Built: Dog, Outdoors, Sidewalk
Proposed: Leash? Dog? Sidewalk? Dog? Rope?
56
Built: Dog, Outdoors, Sidewalk, Leash, Dog (weak)
57
Built: Dog, Outdoors, Sidewalk, Leash, Dog (weak), Dog (strong)
58
Built: Dog, Outdoors, Sidewalk, Leash, Dog
60
Once objects begin to be built, relation and grouping codelets can run on them.
Built: Dog, Outdoors, Sidewalk, Leash, Dog
Proposed relations and groups: is next to, is in front of, is next to, is in front of, Dog group
61
Built: Dog, Outdoors, Sidewalk, Dog, Leash
Relations and groups: is next to, Dog group
62
Built: Dog, Outdoors, Sidewalk, Dog, Leash
Relations and groups: is next to, Dog group, is next to
65
How codelets decide where to look
System starts out with weak segmentation (e.g., “normalized cuts” algorithm).
System creates “heat maps” for location and scale of objects in general (at each pixel, the probability of finding an object at this location and at a particular height/width of bounding box).
Object scout codelets choose location and scale probabilistically from these heat maps.
67
How codelets decide where to look
When codelets look for individual object categories (e.g., dog), object-specific heat maps are created.
As codelets build structure, heat maps are continually updated to reflect prior (learned) expectations about location and scale as a function of location and scale of “built” objects (as well as original weak segmentation).
[Figure: image with built Dog, Person heat map, and proposed Person? window]
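Choosing a window from a heat map, and reweighting the map once an object has been built, could be sketched as below. The representation is an assumption made for brevity: the heat map is a dictionary from `(x, y, width, height)` windows to non-negative weights rather than a per-pixel array, and `sample_window`, `update_heat_map`, and the `boost` callable (standing in for learned expectations) are hypothetical names.

```python
import random


def sample_window(heat_map, rng):
    """Pick a window with probability proportional to its heat-map weight."""
    windows = list(heat_map)
    weights = [heat_map[w] for w in windows]
    return rng.choices(windows, weights=weights)[0]


def update_heat_map(heat_map, boost):
    """Reweight each window by boost(window), then renormalize to sum to 1.

    `boost` stands in for a learned expectation, e.g. where a person tends
    to appear relative to an already built dog.
    """
    updated = {w: p * boost(w) for w, p in heat_map.items()}
    total = sum(updated.values())
    return {w: p / total for w, p in updated.items()}
```

A scout codelet would call `sample_window` on the current map, so windows that context makes unlikely are rarely examined rather than being excluded outright.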
68
How Petacat makes a final decision
Temperature
[Diagram: “taking a dog for a walk” and workspace structures Dog, Dog, Dog group, Leash, Sidewalk, Outdoors; relations: is next to, is next to]
69
How Petacat makes a final decision
“Situation” codelet is more likely to run when temperature is low.
70
Situation codelet tries to match prototypical situation with existing workspace structures, possibly allowing slippages.
71
[Diagram: prototype “taking a dog for a walk” (person, dog, leash, outdoors; has component, has location, is next to, is in front of) alongside workspace structures (Dog, Dog, Dog group, Leash, Sidewalk, Outdoors; is next to)]
72
[Diagram: same prototype and workspace structures as slide 71, with Dog group shown a second time]
73
If resulting temperature is low enough, classify scene as positive.
74
If situation codelet fails enough times or does not run for a long time, program has increasing chance of ending with negative classification.
75
If Petacat classifies the picture as positive, the temperature at the end of the run gives a measure of how good an instance the picture is (e.g., of the “dog walking” situation).
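The decision rule from the preceding slides can be sketched as a loop. The slides give no formulas, so every number and name here is an assumption: a temperature scale of 0 to 100, a run probability of `1 - temperature / 100` for the situation codelet, the `positive_below` cutoff, the linearly growing give-up chance, and the `try_match` callable, which returns the resulting temperature on a successful match or `None` on failure.

```python
import random


def decide(temperature_trace, try_match, rng, positive_below=30.0, give_up_step=0.05):
    """Return ("positive", final_temperature) or ("negative", None)."""
    misses = 0
    for temperature in temperature_trace:
        # The situation codelet is more likely to run when temperature is low.
        ran = rng.random() < 1.0 - temperature / 100.0
        resulting = try_match(temperature) if ran else None
        if resulting is not None and resulting < positive_below:
            # The final temperature doubles as a goodness-of-instance measure.
            return "positive", resulting
        # Each failed or skipped attempt raises the chance of giving up.
        misses += 1
        if rng.random() < min(1.0, misses * give_up_step):
            return "negative", None
    return "negative", None
```

Under these assumptions a lower returned temperature means a better instance of the situation.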
81
Summary: How does Petacat avoid exhaustive search?
Recall Streetscenes system, which, given an image, does exhaustive search over:
Window size and location in the image
–In Petacat, codelets choose window size and location based on learned expectations and perceived context, with probabilities continually changing as more information is obtained.
C1, C2 features in windows
–Codelets request C2 features only in “relevant” windows, and request only C2 features that are relevant to what the codelet is looking for.
Object categories (e.g., car, pedestrian, tree)
–Codelets look for object categories that are activated by context, based on prior expectations and currently perceived information.
82
Summary: How does Petacat avoid exhaustive search?
Petacat effects a parallel terraced scan (Hofstadter, 1995):
Codelets build structures at a rate (urgency) based on their perceived promise, which is continually updated as new information is perceived.
Temperature allows this (continually changing) rate to depend on the global state of the system.
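One way temperature could modulate urgency-based selection is sketched below. The slide does not specify a formula, so the exponent rule, the 0 to 100 temperature scale, and the function name `selection_probabilities` are assumptions chosen only to show the qualitative effect: at high temperature choices stay close to the raw urgencies, and at low temperature the most urgent codelet dominates.

```python
def selection_probabilities(urgencies, temperature):
    """Turn codelet urgencies into selection probabilities at a given temperature."""
    # Assumed scale: temperature runs from 0 (ordered) to 100 (disordered).
    exponent = 1.0 + (100.0 - temperature) / 50.0
    weights = [u ** exponent for u in urgencies]
    total = sum(weights)
    return [w / total for w in weights]
```

For urgencies 0.8 and 0.4, this gives the first codelet a probability of about 0.67 at temperature 100 and about 0.89 at temperature 0, so exploration narrows as the global state settles.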
83
Relation to neuroscience/psychophysics
–Gilbert & Sigman (2007): Emphasis on the role of top-down processing in vision. “V1 and V2 may work as ‘active blackboards’ that integrate and sustain the result of computations performed in higher areas.”
–Kahneman, Treisman, and Gibbs (1992): Notion of “object files”: temporary and modifiable perceptual structures, created on the fly in working memory, which interact with a permanent network of concepts.
–Churchland, Ramachandran, and Sejnowski: Theory of “interactive vision”.
–Treisman and colleagues: Shift between parallel, random, “pre-attentive” bottom-up processing and more deterministic, focused, serial, “attentive” top-down processing.
84
Does Petacat understand pictures?
85
Understanding (MM’s definition):
- Ability to appropriately use one’s knowledge and make appropriate conceptual slippages in a wide variety of environments/contexts.
- Ability to use one’s existing concepts to learn new concepts.