Published by Beverly Harrington. Modified over 9 years ago.
1
Linguistic Spatial Reasoning
Jim Keller
Electrical and Computer Engineering Department
University of Missouri-Columbia
I get by with a lot of help from my friends…
Big group of faculty and students at the University of Missouri, the University of Florida, the University of Notre Dame, and the University of Guelph
2
Our Work
Language Generation
  – Modeling of spatial relationships
  – Natural scene understanding
Communication: human and machine
Text 2 Sketch
Human Geography
Summarization
  – Most of the night, Mary had medium restlessness and low motion
Common Theme
  – Generation of “un-natural” natural language
  – Modeling and matching based on linguistic statements
  – Users of natural language understanding systems
3
Outline
Linguistic Scene Description (External)
Linguistic Scene Description (Ego Centered)
Human/Robot Dialog
Sketched Route Understanding
Text-to-Sketch
  – Inverse of above
4
Scene Description
“B is to the right of A”?
(Figure: two objects, A and B.)
Natural scene understanding is an important aspect of computer vision
Spatial relations among image objects play a vital role in the description of a scene
5
Scene Description
Human intuition varies considerably
  – Vague concepts of what spatial relationships should mean
  – Uncertainty about how to model differing human perceptions
Using fuzzy set-based definitions supports David Marr's “Principle of Least Commitment”
Considerable argument about the "best" method
  – Much of the debate centered on human intuition
  – Interesting results in Pattern Recognition applications
Need flexible mechanisms that can be tailored
  – Perhaps to individual human experts
6
Relative position between 2-D objects
The histogram of angles
(Figure: histogram of angles; horizontal axis is the angle, from -180° to +180°.)
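A histogram of angles can be sketched in a few lines. The version below is only an illustration: it assumes each object is given as a list of (x, y) points, that directions are measured from B to A with 0° meaning "A is directly to the right of B", and that angles fall into 36 equal bins over [-180°, 180°). The function name and all of these conventions are assumptions, not taken from the slides.

```python
import math


def angle_histogram(points_a, points_b, bins=36):
    """Normalized histogram of the directions from every point of B to every point of A."""
    width = 360.0 / bins
    counts = [0] * bins
    for ax, ay in points_a:
        for bx, by in points_b:
            if (ax, ay) == (bx, by):
                continue  # coincident points define no direction
            theta = math.degrees(math.atan2(ay - by, ax - bx))
            # theta lies in (-180, 180]; +180 wraps into the first bin
            counts[int((theta + 180.0) // width) % bins] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


# Object A sits entirely to the right of object B on the same horizontal line,
# so every pairwise direction is 0 degrees.
hist = angle_histogram([(5, 0), (6, 0)], [(0, 0), (1, 0)])
```

Because every point pair counts equally regardless of distance, this histogram carries no metric information, which is the limitation the force histograms on the following slides address.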
7
Histograms of Forces (Matsakis)
Elegant theory
Much more flexible than angles
Incorporates metric information
(Figure: objects A and B with a label v; histogram over angles from -180° to +180°.)
8
Choice of the function
Gravitational Forces
(Figure: force (d) plotted against distance d; objects A and B.)
9
Force-Histograms
(Figure: force histograms over angles from -180° to +180°.)
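To make the role of the force function concrete, here is a rough point-pair approximation of a force histogram. It is a simplification and not the method presented in the slides: it assumes objects are lists of (x, y) points and that each pair contributes a force of magnitude 1/d^r to the bin of its direction, with r = 0 giving constant forces and r = 2 gravitational forces. The function name, the binning, and the B-to-A direction convention are assumptions.

```python
import math


def force_histogram(points_a, points_b, r=2.0, bins=36):
    """Point-pair approximation: each pair adds 1 / d**r to its direction bin."""
    width = 360.0 / bins
    hist = [0.0] * bins
    for ax, ay in points_a:
        for bx, by in points_b:
            d = math.hypot(ax - bx, ay - by)
            if d == 0.0:
                continue  # overlapping points exert no defined force here
            theta = math.degrees(math.atan2(ay - by, ax - bx))
            hist[int((theta + 180.0) // width) % bins] += 1.0 / d ** r
    return hist


# Same geometry, two force functions: distance matters only when r > 0.
constant = force_histogram([(2, 0)], [(0, 0)], r=0.0)
gravitational = force_histogram([(2, 0)], [(0, 0)], r=2.0)
```

With r = 0 this reduces to an unnormalized histogram of angles; with r = 2 nearby parts of the objects dominate, which is one way metric information enters the description.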
10
Truth Membership of “A is RIGHT of B”
$F^{AB}$ is called the F-histogram associated with (A,B)
Divided into 4 quadrants
11
A is RIGHT of B
The average direction $\alpha_r(\mathrm{RIGHT})$ of the effective forces is computed
The degree of truth $a_r(\mathrm{RIGHT})$ of the proposition “A is to the right of B” is computed as:
$a_r(\mathrm{RIGHT}) = \mu(\alpha_r(\mathrm{RIGHT}))\, b_r(\mathrm{RIGHT})$
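The product form of the degree of truth can be illustrated with a small sketch. Several pieces are assumptions, since the slide does not define them: forces within 90° of the RIGHT direction are treated as the effective forces, the membership function is taken to fall linearly from 1 at 0° to 0 at 90°, and the second factor is taken to be the share of the total force that is effective. The four-quadrant treatment from the slides is not reproduced, and the function name is hypothetical.

```python
def degree_of_truth_right(hist):
    """hist: force totals over equal-width bins covering [-180, 180) degrees."""
    bins = len(hist)
    width = 360.0 / bins
    total = sum(hist)
    if total == 0:
        return 0.0
    effective = 0.0
    weighted_angle = 0.0
    for i, force in enumerate(hist):
        center = -180.0 + (i + 0.5) * width
        if abs(center) < 90.0:  # assumption: these forces support RIGHT
            effective += force
            weighted_angle += force * center
    if effective == 0:
        return 0.0
    average_direction = weighted_angle / effective   # in degrees
    membership = max(0.0, 1.0 - abs(average_direction) / 90.0)
    share_effective = effective / total
    return membership * share_effective


# Forces balanced around 0 degrees, with nothing pulling the other way.
h = [0.0] * 36
h[17] = h[18] = 1.0
truth = degree_of_truth_right(h)
```

The two factors play different roles: the membership term penalizes an average direction that drifts away from RIGHT, and the share term penalizes forces that point elsewhere entirely.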
12
Features for Linguistic Description
Compute
  – 1, the Primary Direction [d 1 = a( 1 )]: produces the basic directional term (e.g., RIGHT of)
  – 2, the Secondary Direction [d 2 = a( 2 )]: supplements the description (e.g., “but a little above”)
  – m 1 and m 2, which specify how good the description is for each direction
Each is a combination of F 0 and F 2
13
Linguistic Scene Description
1. Each histogram gives its opinion about the relative position between the objects that are considered.
2. The two opinions are combined. Four numeric and two symbolic features result from this combination.
3. A system of 27 fuzzy rules and meta-rules allows meaningful linguistic descriptions to be produced.
(Figure: the constant and gravitational force histograms feeding steps 1 to 3.)
14
Fuzzy Rule-based System
Examples of Fuzzy Rules and Meta-Rules
(Table: rule outputs indexed by d 1 and m 1, each graded high, medium-high, medium-low, or low. Recoverable entries include the hedges “perfectly”, “nearly”, “loosely”, and “mostly”, and “no primary direction” when d 1 is low.)
Each linguistic output uses hedges from a dictionary of about thirty adverbs and other terms that can be tailored to individual users
15
Secondary directions
Fuzzy rules for secondary direction
(Table: rule outputs indexed by d 2 and m 2. Recoverable entries include the hedges “somewhat”, “strongly”, “a little”, and “slightly”, and “no secondary direction” when d 2 is low.)
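The way hedges turn numeric features into a phrase can be sketched as follows. This is a deliberately crude stand-in: it uses crisp thresholds instead of the fuzzy rules and meta-rules, ignores the m 1 and m 2 quality features, and every threshold and hedge assignment is invented for illustration because the rule tables are not fully recoverable from the slides. The function names are hypothetical.

```python
def grade(value, cuts, labels):
    """Return the label of the first cut that value reaches, or None."""
    for cut, label in zip(cuts, labels):
        if value >= cut:
            return label
    return None


def describe(primary, d1, secondary, d2):
    """Compose a hedged phrase from primary/secondary directions and their degrees."""
    # Hypothetical thresholds; the real system uses 27 fuzzy rules and meta-rules.
    hedge1 = grade(d1, (0.9, 0.7, 0.5), ("perfectly", "nearly", "mostly"))
    if hedge1 is None:
        return "no primary direction"
    text = f"{hedge1} to the {primary}"
    hedge2 = grade(d2, (0.5, 0.3, 0.15), ("strongly", "somewhat", "a little"))
    if hedge2 is not None:
        text += f", but {hedge2} {secondary}"
    return text


phrase = describe("LEFT", 0.95, "ABOVE", 0.2)
```

Keeping the hedge dictionary in plain data like this is one way to support tailoring the vocabulary to individual users, as the slides suggest.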
16
Simple Examples
Green = satisfactory; Orange = rather satisfactory; Red = unsatisfactory
17
Linguistic Scene Description Example
We used LADAR range images of the power plant at China Lake, CA. They were processed by applying a median filter and a pseudo-intensity filter. The filtered images were segmented and labeled manually.
18
Linguistic Scene Description Example
The system handles a rich language to describe the spatial organization of scene regions. It produces good intuitive results.
The system here describes the relative position of the red object with respect to the group of buildings (in blue):
To the LEFT, but a little ABOVE.
19
Linguistic Spatial Scene Description
(Figure: pseudo-intensity image of surface-to-air missile (SAM) site with convoy.)
Objects detected and automatically labeled by extended ATR system
Output of Scene Description Fuzzy Rule Base (from initial system)
  – There are 5 missile launchers (1, 2, 3, 6, 8)
  – They surround a center vehicle (4)
  – The image includes a SAM site
  – A convoy of vehicles (5, 7, 9, 10) is BelowRight of the SAM site
20
An extension to the third dimension
Horizontal descriptions
  – “The fish can is to the left of the truck box”
  – “The truck box is to the right of the fish can but extends to the front”
  – “The candy box is to the left of the truck box”
  – “The truck box is to the right of the candy box but extends to the front”
  – “The candy box is surrounded” and “The fish can is surrounded.”
Vertical descriptions
  – “The fish can is to the left of the truck box”
  – “The truck box is to the right of the fish can”
  – “The truck box is to the right of the candy box”
  – “The candy box is on top of the fish can”
  – “The fish can is below the candy box.”
Near descriptions
  – “The fish can and the candy box are near”
  – “The fish can and the truck box are somewhat near”
  – “The candy box and the truck box are somewhat near.”
21
Outline
Linguistic Scene Description (External)
Linguistic Scene Description (Ego Centered)
Human/Robot Dialog
Sketched Route Understanding
Text-to-Sketch
  – Inverse of above
22
Human/Robot Dialog
Spatial reasoning incorporated into NRL’s Natural Language Understanding System for mobile robots
Sensed data results in a “grid map” that displays occupancy of cells (doesn’t need to be binary)
(Figure: grid map after component labeling; robot heading towards Object 5.)
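Component labeling of a grid map can be sketched with a breadth-first flood fill. The code below is an illustration only: it assumes the map is a list of rows of occupancy values in [0, 1], that a cell counts as occupied at or above a threshold of 0.5, and that cells are joined through 4-connectivity. None of these choices comes from the slides.

```python
from collections import deque


def label_components(grid, threshold=0.5):
    """Label 4-connected groups of occupied cells; returns (labels, count)."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]  # 0 means free or unlabeled
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] < threshold or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] >= threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


occupancy = [
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.7, 0.2],
]
labels, n_objects = label_components(occupancy)
```

Each labeled component can then be treated as one object when spatial relations to the robot are computed.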
23
Generating Spatial Descriptions from Robot Sensors (here, NOMAD range sensors)
(Figure labels: features; Fuzzy Rules.)
24
Scene 1
DETAILED SPATIAL DESCRIPTIONS for 6 OBJECTS:
Object 1 is mostly behind me but somewhat to the right (the description is satisfactory). The object is very close.
Object 2 is behind me (the description is satisfactory). The object is very close.
Object 3 is to the left of me but extends to the rear relative to me (the description is satisfactory). The object is very close.
Object 4 is mostly to the right of me but somewhat forward (the description is satisfactory). The object is very close.
Object 5 is in front of me (the description is satisfactory). The object is very close.
Object 6 is to the left-front of me (the description is satisfactory). The object is close.
25
Human-Driven Spatial Language for Human-Robot Interaction
Investigating spatial language for an eldercare scenario in the home
  – An elderly resident has lost an object; the robot will help the resident to find it.
  – EXAMPLE: The eyeglasses are behind the lamp on the table to the left of the bed in the bedroom
Start with human-subject experiments; develop spatial language algorithms to match the results
NSF project # IIS-1017097
  – U of Missouri (Marge Skubic)
  – U of Notre Dame (Laura Carlson)
26
The virtual scene used for the human subject experiments
(Figure: virtual scene with Hallway, Living room, and Bedroom.)
Robot addressee vs. human addressee
Vary the speaker-addressee alignment
Where to find it vs. how to find it
Vary candidate reference objects
27
Example “where” Statements
  – the key is on the white table in the room to his left
  – the book is on the wooden table in the back of the room to his right
  – the wallet's in the right room behind the bed on the table next to the lamp and the plant
  – the glasses case is down the hall in the right room -on the right side of the room as you enter, it's between the two chairs on a table next to a statue
  – the mug's down the hall in the left room on a table on the light brown table in front of the couch next to a purse and a hat
28
Example “how” Statements
  – move forward into the intersection, look left, move to the table at the far end of the room directly across from where he is and then look down and he'll find the car keys
  – if you turn around and then walk left or walk forward about two steps then walk left into the bedroom, walk forward until you have passed the bed on your left, turn left around the foot of the bed, proceed forward, turn left again, walk forward towards the wall near the head of the bed, look down on the bedside table and there’s the object.
29
Outline
Linguistic Scene Description (External)
  – Results in Scene Matching
Linguistic Scene Description (Ego Centered)
Human/Robot Dialog
Sketched Route Understanding
Text-to-Sketch
  – Inverse of above
30
Extracting a Navigation Path from a Sketch
(Figure: robot path.)
Start: Move forward
When Object #3 is loosely to the left-front, Then Turn right
When Object #3 is to the left, Then Move forward
When Object #4 is mostly in front, Then Turn left
When Object #4 is to the right, Then Move forward
When Object #4 is to the right and Object #5 is in front, Then Stop
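The When/Then steps read naturally as an ordered rule list. The sketch below shows one possible encoding, assuming each condition is a set of (object number, relation) facts that must all hold and that some upstream module supplies the facts observed at each time step. The data layout, relation strings, and function name are hypothetical.

```python
# Each rule: (facts that must all be observed, command to issue).
ROUTE = [
    ({(3, "loosely left-front")}, "turn right"),
    ({(3, "left")}, "move forward"),
    ({(4, "mostly front")}, "turn left"),
    ({(4, "right")}, "move forward"),
    ({(4, "right"), (5, "front")}, "stop"),
]


def follow_route(route, observations, first_command="move forward"):
    """Fire the rules in order; at most one rule fires per observation."""
    commands = [first_command]
    step = 0
    for facts in observations:
        if step < len(route) and route[step][0] <= facts:  # subset test
            commands.append(route[step][1])
            step += 1
    return commands


observed = [
    set(),
    {(3, "loosely left-front")},
    {(3, "left")},
    {(4, "mostly front")},
    {(4, "right")},
    {(4, "right"), (5, "front")},
]
commands = follow_route(ROUTE, observed)
```

Firing rules strictly in order matters here: the fourth and fifth conditions overlap, so the position in the list, not the condition alone, decides which command applies.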
31
Outline
Linguistic Scene Description (External)
  – Results in Scene Matching
Linguistic Scene Description (Ego Centered)
Human/Robot Dialog
Sketched Route Understanding
Text-to-Sketch
  – Inverse of above
32
Motivation
University of Missouri
Funded by the National Geospatial-Intelligence Agency
Consider a frantic (and fictitious) call like this:
  – I saw terrorist X slip into a building up ahead. I don’t know where I am in the city and I can’t read the signs. I’m walking towards the location.
  – There is a somewhat long, thin rectangular shaped parking lot that extends forward.
  – To the immediate right of that parking lot is a parking garage.
  – I see a moderately small rectangular building close to me that is mostly to my left but partially forward.
  – Across a 4-way intersection, there is a small rectangular building close to me on my left that extends behind me.
  – There is another small rectangular building across the street that is mostly to the front of me, but somewhat to the left.
  – A short distance to the right of that building is a small L-shaped office.
  – I’ve reached another 4-way intersection, there is a large L-shaped building that extends to the rear. That’s where he entered.
Can we automatically pinpoint the location?
33
ACADEMIC RESEARCH GRANT
MU Text-to-Sketch System
(Figure labels: Natural Language; IHMC Parser; Text descriptions; Build graphics sketch to match; Histograms of forces + Fuzzy rules; Convert to graph; Structures into database; Histograms of forces for red structure (upper left).)
MATCHING
  – Best approach to date: Evolutionary Computation Algorithm
  – New approach: Hybrid Algorithm
  – EC basic structure
  – Subgraph Isomorphism locally in database in addition to mutation
34
Some Details: Parser
Automatically extract objects and relationships from IHMC’s deep semantic parse
Right now, language variation is restricted
(Figure: syntactic parse tree for “There is a large rectangular building is to my left.” and the corresponding logical form graph of the deep semantic parse.)
J. Allen, M. Swift and W. de Beaumont, “Deep Semantic Analysis of Text”, Proc. Symp. on Semantics in Systems for Text Processing, 2008.
J. Allen, Natural Language Understanding, 2nd Ed. Redwood City, CA, USA: Benjamin-Cummings, 1995.
35
Building blocks: Intelligent object placement scheme
Too expensive to use guided random search only
  – Directional Fuzzy Templates
(Figure captions:)
  – An illustration of “somewhat to the right”, which produces a rather large search region
  – Example of “directly to the front,” which produces a constrained search region, combined with the distance descriptor “somewhat close.”
  – A compound linguistic description: “perfectly to the left of the blue building” and “mostly to the right, but somewhat above the red building”
I. Sledge and J. Keller, "Mapping natural language to imagery: Placing objects intelligently", Proc. IEEE Int. Conf. Fuzzy Syst., Jeju Island, Korea, August 2009, pp. 518-524 (Best student paper award).
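A directional template of the kind described here can be sketched as a grid of membership values around a reference point. The version below is an illustration under stated assumptions: membership falls linearly from 1 on the direction axis to 0 at `spread_deg` degrees off-axis, the reference is a single grid cell rather than a building footprint, and angles follow `atan2` on (x, y) grid indices. The function name and parameters are hypothetical and do not reproduce the templates in the cited paper.

```python
import math


def direction_template(width, height, ref, direction_deg=0.0, spread_deg=90.0):
    """Membership in [0, 1] for each cell of lying in direction_deg from ref."""
    rx, ry = ref
    template = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if (x, y) == (rx, ry):
                continue  # the reference cell itself has no direction
            theta = math.degrees(math.atan2(y - ry, x - rx))
            # smallest absolute angular difference, in [0, 180]
            off = abs((theta - direction_deg + 180.0) % 360.0 - 180.0)
            template[y][x] = max(0.0, 1.0 - off / spread_deg)
    return template


# A narrow spread models a strict hedge; a wide spread models a loose one.
strict = direction_template(5, 5, (2, 2), spread_deg=30.0)
loose = direction_template(5, 5, (2, 2), spread_deg=90.0)
```

Restricting candidate placements to cells with high membership is one way such a template can shrink the search region compared with guided random search alone.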
36
Example of Sketch Creation After Parsing
“To the immediate right of that parking lot is a large parking garage that is the same length as the parking lot.”
(Figure panels: Fuzzy template; Initial Placement; Optimized Placement.)
Linguistic descriptions from fuzzy rule base:
  – To the right but a little bit forward
  – Perfectly to the right
37
Sketch Building Example
(a) “There is a somewhat long, thin rectangular shaped parking lot that extends forward relative to me on my right.”
(b) “To the immediate right of that parking lot is a large parking garage that is the same length as the parking lot.”
(c) “I see a moderately small rectangular building close to me that is mostly to my left but partially forward.”
(d) “Travelling to a 4-way intersection, there is a small rectangular building close to me on my left that extends behind me.”
(e) “There is another small rectangular building across the street that is mostly to the front of me, but somewhat to the left.”
(f) Left image: Final sketch (after a few more descriptions). Right image: Ground truth recovered from matching algorithm
38
Evolutionary Computation Approach
Sketch converted to chromosome structure: building = gene
Each gene attributed with its Histograms of Forces
39
Initial Generation & fitness evaluation of chromosome
Add n random chromosomes to population over the Geospatial database
Each gene attributed with its Histograms of Forces
(Figure: blue are query HoFs; red are those of current chromosome.)
40
MUTATION for Reproduction
Simplest version
Example: Building “B” chosen for replacement
Take neighbors of “B” in database as potential replacements and choose one whose histograms best match sketch
(Figure: green = in the chromosome; blue = chosen replacement.)
To add diversity: every 100 generations, least fit 10% replaced by random chromosomes
41
Iteration & Convergence
(Figure: fitness values 0.917, 0.871, and 0.865; red building replaced by blue in this chromosome.)
When a chromosome’s match quality exceeds q_min, output results
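The search loop across these slides can be sketched in a compact form. Everything below is a simplified illustration: each building is reduced to a single histogram, the match measure (histogram intersection over union) and the keep-the-better-half selection are assumptions, and the periodic random restart mentioned on the mutation slide is left out. The names `similarity`, `fitness`, `mutate`, and `search` are hypothetical.

```python
import random


def similarity(h1, h2):
    """Histogram intersection over union, in [0, 1] (an assumed match measure)."""
    inter = sum(min(a, b) for a, b in zip(h1, h2))
    union = sum(max(a, b) for a, b in zip(h1, h2))
    return inter / union if union else 1.0


def fitness(chromosome, query, database):
    """Mean similarity between each gene's histogram and the sketch's."""
    scores = [similarity(database[g], q) for g, q in zip(chromosome, query)]
    return sum(scores) / len(scores)


def mutate(chromosome, query, database, neighbors, rng):
    """Replace one gene by the database neighbor that best matches the sketch."""
    i = rng.randrange(len(chromosome))
    candidates = [n for n in neighbors[chromosome[i]] if n not in chromosome]
    if not candidates:
        return list(chromosome)
    best = max(candidates, key=lambda n: similarity(database[n], query[i]))
    return chromosome[:i] + [best] + chromosome[i + 1:]


def search(query, database, neighbors, q_min=0.9, pop_size=10, max_gen=500, seed=0):
    rng = random.Random(seed)
    ids = sorted(database)
    population = [rng.sample(ids, len(query)) for _ in range(pop_size)]
    for _ in range(max_gen):
        population.sort(key=lambda c: fitness(c, query, database), reverse=True)
        if fitness(population[0], query, database) >= q_min:
            break  # match quality reached q_min
        survivors = population[: pop_size // 2]
        children = [mutate(c, query, database, neighbors, rng) for c in survivors]
        population = survivors + children
    best = max(population, key=lambda c: fitness(c, query, database))
    return best, fitness(best, query, database)


database = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, 0.5]}
neighbors = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
best, score = search([[1.0, 0.0]], database, neighbors)
```

Because mutation only draws replacements from database neighbors, it acts as a local search step, which is presumably why the slides add random chromosomes periodically to restore diversity.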
42
A walk by Shakespeare’s Pizza in Columbia
43
A walk by Shakespeare’s Pizza in Columbia: Up close and personal
44
Human Geography Project
University of Missouri and University of Florida
HuGeo based knowledge
(Diagram labels: Pattern Recognition Techniques; Simulation and Modeling; GIS; Geospatial representation and reasoning; Conflation; Confidence assessment; Text2Sketch; Math models; Repast; Sampling methods; Swarm Intelligence; Uncertainty Modeling (probabilistic, fuzzy, belief); Classifiers; Clustering; Ontologies; Multi and Hyper Spectral analysis; Fusion.)
45
HuGeo Project Overview
What’s needed/missing in the HuGeo-based Knowledge Discovery component?
  – Extraction of concepts and events from blogs, Twitter, news sources, etc.
  – Interpretation of sentences
  – Placing them geographically
i.e., we need NLP
46
Conclusions
We model and utilize spatial language
  – We’re great at modeling, pattern recognition, fusion
Our languages are not very flexible
To take the next steps, we need better and deeper language models
Help!