Semantic Understanding: An Approach Based on Information-Extraction Ontologies
David W. Embley, Brigham Young University
Presentation Outline
- Grand Challenge
- Meaning, Knowledge, Information, Data
- Fun and Games with Data
- Information Extraction Ontologies
- Applications
- Limitations and Pragmatics
- Summary and Challenges

Grand Challenge: Semantic Understanding
Can we quantify and specify the nature of this grand challenge?
Grand Challenge: Semantic Understanding

"If ever there were a technology that could generate trillions of dollars in savings worldwide …, it would be the technology that makes business information systems interoperable." (Jeffrey T. Pollock, VP of Technology Strategy, Modulant Solutions)

"The Semantic Web: … content that is meaningful to computers [and that] will unleash a revolution of new possibilities … Properly designed, the Semantic Web can assist the evolution of human knowledge …" (Tim Berners-Lee, …, Weaving the Web)

"20th Century: Data Processing. 21st Century: Data Exchange. The issue now is mutual understanding." (Stefano Spaccapietra, Editor-in-Chief, Journal on Data Semantics)

"The Grand Challenge [of semantic understanding] has become mission critical. Current solutions … won't scale. Businesses need economic growth dependent on the web working and scaling (cost: $1 trillion/year)." (Michael Brodie, Chief Scientist, Verizon Communications)
What is Semantic Understanding?
Understanding: "To grasp or comprehend [what's] intended or expressed."
Semantics: "The meaning or the interpretation of a word, sentence, or other language form." (Dictionary.com)

Can We Achieve Semantic Understanding?
"A computer doesn't truly 'understand' anything." But computers can manipulate terms "in ways that are useful and meaningful to the human user." (Tim Berners-Lee)
Key point: it only has to be good enough. And that's our challenge and our opportunity!
Information Value Chain (translating data into meaning):
Data → Information → Knowledge → Meaning
Foundational Definitions (adapted from [Meadow92])
- Meaning: knowledge that is relevant or activates
- Knowledge: information with a degree of certainty or community agreement (ontology)
- Information: data in a conceptual framework
- Data: attribute-value pairs
Data: Attribute-Value Pairs
- Fundamental for information, and thus fundamental for knowledge and meaning

Data Frame
- Extensive knowledge about a data item
  - Everyday data: currency, dates, time, weights and measures
  - Textual appearance, units, context, operators, I/O conversion
- An abstract data type with an extended framework
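As a concrete illustration, a data frame can be thought of as an abstract data type bundled with value recognizers, context keywords, and I/O conversion. The sketch below is illustrative only; the class and field names are assumptions, not the actual system's API.

```python
import re

class MiniDataFrame:
    """Illustrative sketch of a data frame: an abstract data type
    extended with value recognizers, context keywords, and I/O conversion."""
    def __init__(self, name, value_pattern, keyword_patterns, to_internal):
        self.name = name
        self.value_re = re.compile(value_pattern)
        self.keyword_res = [re.compile(p) for p in keyword_patterns]
        self.to_internal = to_internal  # convert external text to internal value

    def recognize(self, text):
        # Return (string, start, end) for each candidate value appearance.
        return [(m.group(), m.start(), m.end())
                for m in self.value_re.finditer(text)]

mileage = MiniDataFrame(
    name="Mileage",
    value_pattern=r"\b\d{1,3}(?:,\d{3})*\b",   # e.g. "7,000"
    keyword_patterns=[r"\bmiles\b", r"\bmi\.?\b"],
    to_internal=lambda s: int(s.replace(",", "")),
)
print(mileage.recognize("only 7,000 miles"))  # candidate values with positions
print(mileage.to_internal("7,000"))           # 7000
```

The recognizer and the conversion function live together with the concept name, which is what lets an extraction ontology both find "7,000 miles" in raw text and store it as a number.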
?
Olympus C-750 Ultra Zoom
Sensor Resolution: 4.2 megapixels
Optical Zoom: 10x
Digital Zoom: 4x
Installed Memory: 16 MB
Lens Aperture: F/8-2.8/3.7
Focal Length min: 6.3 mm
Focal Length max: 63.0 mm
Digital Camera
Olympus C-750 Ultra Zoom
Sensor Resolution: 4.2 megapixels
Optical Zoom: 10x
Digital Zoom: 4x
Installed Memory: 16 MB
Lens Aperture: F/8-2.8/3.7
Focal Length min: 6.3 mm
Focal Length max: 63.0 mm
?
Year: 2002
Make: Ford
Model: Thunderbird
Mileage: 5,500 miles
Features: Red, ABS, 6 CD changer, keyless entry
Price: $33,000
Phone: (916) 972-9117
Car Advertisement
Year: 2002
Make: Ford
Model: Thunderbird
Mileage: 5,500 miles
Features: Red, ABS, 6 CD changer, keyless entry
Price: $33,000
Phone: (916) 972-9117
?
Flight #    Class   From   Time/Date           To    Time/Date           Stops
Delta 16    Coach   JFK    6:05 pm 16 06 06    CDG   7:35 am 17 06 06    0
Delta 119   Coach   CDG    10:20 am 24 06 06   JFK   1:00 pm 24 06 06    0
Airline Itinerary
Flight #    Class   From   Time/Date           To    Time/Date           Stops
Delta 16    Coach   JFK    6:05 pm 02 01 04    CDG   7:35 am 03 01 04    0
Delta 119   Coach   CDG    10:20 am 09 01 04   JFK   1:00 pm 09 01 04    0
?
Monday, October 13th
Group A       W  L  T  GF  GA  Pts.
USA           3  0  0  11   1   9
Sweden        2  1  0   5   3   6
North Korea   1  2  0   3   4   3
Nigeria       0  3  0   0  11   0
Group B       W  L  T  GF  GA  Pts.
Brazil        2  0  1   8   2   7
…
World Cup Soccer
Monday, October 13th
Group A       W  L  T  GF  GA  Pts.
USA           3  0  0  11   1   9
Sweden        2  1  0   5   3   6
North Korea   1  2  0   3   4   3
Nigeria       0  3  0   0  11   0
Group B       W  L  T  GF  GA  Pts.
Brazil        2  0  1   8   2   7
…
?
Calories: 250 cal
Distance: 2.50 miles
Time: 23.35 minutes
Incline: 1.5 degrees
Speed: 5.2 mph
Heart Rate: 125 bpm
Treadmill Workout
Calories: 250 cal
Distance: 2.50 miles
Time: 23.35 minutes
Incline: 1.5 degrees
Speed: 5.2 mph
Heart Rate: 125 bpm
?
Place: Bonnie Lake
County: Duchesne
State: Utah
Type: Lake
Elevation: 10,000 feet
USGS Quad: Mirror Lake
Latitude: 40.711ºN
Longitude: 110.876ºW
Maps
Place: Bonnie Lake
County: Duchesne
State: Utah
Type: Lake
Elevation: 10,000 feet
USGS Quad: Mirror Lake
Latitude: 40.711ºN
Longitude: 110.876ºW
Information Extraction Ontologies
(Diagram: an extraction ontology sits between a source and a target, supporting both information extraction and information exchange.)
What is an Extraction Ontology?
An augmented conceptual-model instance:
- Object and relationship sets
- Constraints
- Data-frame value recognizers
A robust (ontology-based) wrapper:
- Extracts information
- Works even when a site changes or new sites come on-line
CarAds Extraction Ontology
Sample data-frame recognizer patterns for Mileage: [1-9]\d{0,2}[kK] … \bmiles\b …
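To make the patterns concrete, here is a minimal sketch of how the two Mileage expressions above might be applied, assuming Python's re module; the helper name is illustrative, and the keyword pattern is widened here to capture the adjacent number.

```python
import re

# The two patterns from the slide: a "70k"-style mileage constant and the
# "miles" keyword context (widened to also capture the adjacent number).
MILEAGE_CONSTANT = re.compile(r"\b[1-9]\d{0,2}[kK]\b")               # e.g. "70k"
MILEAGE_IN_CONTEXT = re.compile(r"\b\d{1,3}(?:,\d{3})*\s+miles\b")   # e.g. "7,000 miles"

def recognize_mileage(text):
    """Return (string, start, end) for each candidate Mileage appearance."""
    hits = []
    for pattern in (MILEAGE_CONSTANT, MILEAGE_IN_CONTEXT):
        hits += [(m.group(), m.start(), m.end()) for m in pattern.finditer(text)]
    return hits

ad = "'97 CHEVY Cavalier, Red, 5 spd, only 7,000 miles. Asking only $11,995."
print(recognize_mileage(ad))
```

Note that the price "$11,995" matches neither pattern, which is exactly why keyword and context patterns make the recognizer precise.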
Extraction Ontologies: An Example of Semantic Understanding
- "Intelligent" symbol manipulation
- Gives the "illusion of understanding"
- Obtains meaningful and useful results
A Variety of Applications
1. Information Extraction
2. Semantic Web Page Annotation
3. Free-Form Semantic Web Queries
4. Task Ontologies for Free-Form Service Requests
5. High-Precision Classification
6. Schema Mapping for Ontology Alignment
7. Accessing the Hidden Web
8. Ontology Generation
9. Challenging Applications (e.g., Bioinformatics)
Application #1: Information Extraction

'97 CHEVY Cavalier, Red, 5 spd, only 7,000 miles. Previous owner heart broken! Asking only $11,995. #1415. JERRY SEINER MIDVALE, 566-3800 or 566-3888

Constant/Keyword Recognition (Descriptor|String|Position(start/end)):
Year|97|2|3
Make|CHEV|5|8
Make|CHEVY|5|9
Model|Cavalier|11|18
Feature|Red|21|23
Feature|5 spd|26|30
Mileage|7,000|38|42
KEYWORD(Mileage)|miles|44|48
Price|11,995|100|105
Mileage|11,995|100|105
PhoneNr|566-3800|136|143
PhoneNr|566-3888|148|155
Heuristics
- Keyword proximity
- Subsumed and overlapping constants
- Functional relationships
- Nonfunctional relationships
- First occurrence without constraint violation
Keyword Proximity
In the ad above, the Mileage keyword "miles" (positions 44-48) is adjacent to the candidate 7,000 (38-42) but far from 11,995 (100-105), so 7,000 is preferred as the Mileage value.
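This heuristic can be sketched in a few lines: among the candidates that matched Mileage, pick the one whose span lies closest to a Mileage keyword occurrence (positions taken from the recognition table; the helper is illustrative, not the actual system's code).

```python
# Candidates and keyword spans from the recognition table (string, start, end).
mileage_candidates = [("7,000", 38, 42), ("11,995", 100, 105)]
mileage_keywords = [("miles", 44, 48)]

def gap(a, b):
    """Character gap between two non-overlapping spans (0 if they touch)."""
    (_, a_start, a_end), (_, b_start, b_end) = a, b
    return max(a_start - b_end, b_start - a_end, 0)

best = min(mileage_candidates,
           key=lambda c: min(gap(c, k) for k in mileage_keywords))
print(best[0])  # "7,000" -- only 2 characters from "miles"
```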
Subsumed/Overlapping Constants
Make|CHEV (positions 5-8) is subsumed by the overlapping Make|CHEVY (5-9); the subsumed match is discarded in favor of the longer one.
Functional Relationships
A car has at most one Mileage and one Price. With 7,000 already selected for Mileage, the candidate 11,995, which matched both Price and Mileage, is assigned to Price.
Nonfunctional Relationships
Feature is nonfunctional, so multiple matches are all kept: both Red and 5 spd become features of the car.
First Occurrence without Constraint Violation
Among remaining equally plausible candidates, the first occurrence that violates no constraint is chosen; here the functional PhoneNr takes the first number, 566-3800.
Database-Instance Generator
insert into Car values(1001, "97", "CHEVY", "Cavalier", "7,000", "11,995", "566-3800")
insert into CarFeature values(1001, "Red")
insert into CarFeature values(1001, "5 spd")
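The generation step amounts to serializing the resolved record as SQL inserts, with the nonfunctional Feature attribute going to a separate relation. A hedged sketch, with the OID 1001 taken from the slide:

```python
# Resolved attribute values for one car ad (after the heuristics ran).
record = {"Year": "97", "Make": "CHEVY", "Model": "Cavalier",
          "Mileage": "7,000", "Price": "11,995", "PhoneNr": "566-3800"}
features = ["Red", "5 spd"]  # nonfunctional: one CarFeature row each
car_oid = 1001

cols = ["Year", "Make", "Model", "Mileage", "Price", "PhoneNr"]
values = ", ".join('"%s"' % record[c] for c in cols)
stmts = ["insert into Car values(%d, %s)" % (car_oid, values)]
stmts += ['insert into CarFeature values(%d, "%s")' % (car_oid, f)
          for f in features]
for s in stmts:
    print(s)
```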
Application #2: Semantic Web Page Annotation
Annotated Web Page
OWL Annotation (excerpt)
The CarAds ontology emits OWL markup for each recognized value; the excerpt showed a Mileage annotation with its source offsets (237-241). Most of the markup is elided in this transcript.
Application #3: Free-Form Semantic Web Queries
Step 1. Parse Query
"Find me the price and mileage of all red Nissans – I want a 1998 or newer"
Recognized phrases: price, mileage, red, Nissan, 1998, and "or newer" (the >= operator).
Step 2. Find Corresponding Ontology
"Find me the price and mileage of all red Nissans – I want a 1998 or newer"
The car-ads ontology scores a similarity value of 6 against the query; a competing ontology scores only 2, so the car-ads ontology is selected.
Step 3. Formulate XQuery Expression
Conjunctive queries run over the selected ontology's extracted values; e.g., extracted instance 7 has Make "Nissan" (MakeIns7), Year "1999" (YearIns7), and Color "red" (ColorIns7), each recorded with its source offset.
Step 3 (continued). Value-phrase-matching words determine the conditions:
Color = "red", Make = "Nissan", Year >= 1998 ("or newer" maps to the >= operator).
Step 3 (continued). The generated XQuery:

for $doc in document("file:///c:/ontos/owlLib/Car.OWL")/rdf:RDF
for $Record in $doc/owl:Thing  (: for each owl:Thing :)

(: get the instance ID and extracted values :)
let $id := substring-after(xs:string($Record/@rdf:about), "CarIns")
let $Color := $doc/car:Color[@rdf:ID=concat("ColorIns", $id)]/car:ColorValue/text()
let $Make := $doc/car:Make[@rdf:ID=concat("MakeIns", $id)]/car:MakeValue/text()
let $Year := $doc/car:Year[@rdf:ID=concat("YearIns", $id)]/car:YearValue/text()
let $Price := $doc/car:Price[@rdf:ID=concat("PriceIns", $id)]/car:PriceValue/text()
let $Mileage := $doc/car:Mileage[@rdf:ID=concat("MileageIns", $id)]/car:MileageValue/text()

(: check conditions :)
where ($Color="red" or empty($Color)) and
      ($Make="Nissan" or empty($Make)) and
      ($Year>="1998" or empty($Year))

(: return values :)
return
  {$Price}
  {$Mileage}
  {$Color}
  {$Make}
  {$Year}
Step 4. Run the XQuery Expression over the Ontology's Extracted Data
- Uses Qexo 1.7, GNU's XQuery engine for Java
- Uses XSLT to transform the results into an HTML table
Application #4: Task Ontologies for Free-Form Service Requests
Example: Appointment Request
Example: Car Purchase Request
Example: Apartment Request
Application #5: High-Precision Classification
An Extraction Ontology Solution
Density Heuristic
Document 1 (Car Ads) vs. Document 2 (Items for Sale or Rent): the car-ads document has a much higher density of strings recognized by the car-ads ontology.
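One plausible reading of the density heuristic is the fraction of a document's characters covered by ontology-recognized strings; the sketch below assumes that reading, and the spans are invented for illustration.

```python
def density(doc_length, matched_spans):
    """Fraction of the document's characters covered by recognized strings."""
    covered = set()
    for start, end in matched_spans:
        covered.update(range(start, end))  # union handles overlapping matches
    return len(covered) / doc_length

# A 100-character document where recognizers matched four substrings.
spans = [(2, 4), (5, 18), (38, 48), (93, 100)]
print(density(100, spans))  # 0.32 -- a high density suggests the ontology's domain
```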
Expected-Values Heuristic
Document 1 (Car Ads): Year: 3, Make: 2, Model: 3, Mileage: 1, Price: 1, Feature: 15, PhoneNr: 3
Document 2 (Items for Sale or Rent): Year: 1, Make: 0, Model: 0, Mileage: 1, Price: 0, Feature: 0, PhoneNr: 4
Vector Space of Expected Values
            OV     D1   D2
Year        0.98   16    6
Make        0.93   10    0
Model       0.91   12    0
Mileage     0.45    6    2
Price       0.80   11    8
Feature     2.10   29    0
PhoneNr     1.15   15   11
Cosine similarity with OV: D1 = 0.996, D2 = 0.567
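The similarity values on this slide are consistent with cosine similarity between each document's vector of recognized counts and the expected-value vector OV; a quick check with the vectors transcribed from the table:

```python
import math

OV = [0.98, 0.93, 0.91, 0.45, 0.80, 2.10, 1.15]  # Year .. PhoneNr
D1 = [16, 10, 12, 6, 11, 29, 15]                 # car-ads document
D2 = [6, 0, 0, 2, 8, 0, 11]                      # sale/rent document

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda w: math.sqrt(sum(x * x for x in w))
    return dot / (norm(u) * norm(v))

print(round(cosine(OV, D1), 3))  # 0.996
print(round(cosine(OV, D2), 3))  # 0.567
```

A document whose recognized counts are proportional to the ontology's expectations scores near 1, regardless of document length.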
Grouping Heuristic
Document 1 (Car Ads): recognized objects cluster into record-sized groups, e.g., {Year, Make, Model, Price}, {Year, Model}, {Year, Make, Model, Mileage}, …
Document 2 (Items for Sale or Rent): groups are sparse, e.g., {Year, Mileage, …}, {Mileage, Year, Price, …}
Grouping
Expected number in a group = floor(∑ Ave) = 4 (for our example)
Grouping factor = (sum of distinct 1-max object sets in each group) / (number of groups × expected number in a group)
Car Ads: groups contribute 3, 3, 4, 4 → (3+3+4+4)/(4×4) = 0.875
Sale Items: groups contribute 2, 3, 2, 1 → (2+3+2+1)/(4×4) = 0.500
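The grouping computation itself is a one-liner; using the counts from the slide:

```python
def grouping_factor(distinct_1max_counts, expected_in_group):
    """Sum of distinct 1-max object sets per group, normalized by
    (number of groups x expected number in a group)."""
    return sum(distinct_1max_counts) / (len(distinct_1max_counts) * expected_in_group)

print(grouping_factor([3, 3, 4, 4], 4))  # car ads    -> 0.875
print(grouping_factor([2, 3, 2, 1], 4))  # sale items -> 0.5
```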
Application #6: Schema Mapping for Ontology Alignment
Problem: Different Schemas
Target database schema:
{Car, Year, Make, Model, Mileage, Price, PhoneNr}, {PhoneNr, Extension}, {Car, Feature}
Different source table schemas:
- {Run #, Yr, Make, Model, Tran, Color, Dr}
- {Make, Model, Year, Colour, Price, Auto, Air Cond., AM/FM, CD}
- {Vehicle, Distance, Price, Mileage}
- {Year, Make, Model, Trim, Invoice/Retail, Engine, Fuel Economy}
Solution: Remove Internal Factoring
Discover nesting: Make, (Model, (Year, Colour, Price, Auto, Air Cond, AM/FM, CD)*)*
Unnest: μ (Model, Year, Colour, Price, Auto, Air Cond, AM/FM, CD)* μ (Year, Colour, Price, Auto, Air Cond, AM/FM, CD)* Table
(In the example table, factored values such as Make ACURA and Model Legend are repeated into every unnested row.)
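The unnest step (μ) can be pictured on a toy nested table; the rows below are invented for illustration, with Make factored over Model and Model factored over the remaining attributes.

```python
# A factored (nested) table: Make -> Models -> rows of (Year, Colour, Price).
nested = [
    ("ACURA", [
        ("Legend", [("1991", "Red", "9500"), ("1993", "White", "13000")]),
        ("Integra", [("1995", "Black", "8000")]),
    ]),
]

# Unnesting repeats the factored values into every row, yielding flat records.
flat = [
    {"Make": make, "Model": model, "Year": year, "Colour": colour, "Price": price}
    for make, models in nested
    for model, rows in models
    for year, colour, price in rows
]
for row in flat:
    print(row)
```

After unnesting, every row is a plain attribute-value record, which is what the later extraction and mapping steps operate on.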
Solution: Replace Boolean Values
β Auto, β Air Cond., β AM/FM, and β CD replace each Boolean "Yes" with the corresponding attribute name as a value (e.g., a Yes under CD becomes the value "CD").

Solution: Form Attribute-Value Pairs
Each cell is paired with its column attribute (e.g., ⟨Make, ACURA⟩, ⟨Model, Legend⟩, ⟨CD, CD⟩, …).

Solution: Adjust Attribute-Value Pairs

Solution: Do Extraction
Solution: Infer Mappings
Target: {Car, Year, Make, Model, Mileage, Price, PhoneNr}, {PhoneNr, Extension}, {Car, Feature}
Each row is a car.
π Model μ (Year, Colour, Price, Auto, Air Cond, AM/FM, CD)* Table
π Make μ (Model, Year, Colour, Price, Auto, Air Cond, AM/FM, CD)* μ (Year, Colour, Price, Auto, Air Cond, AM/FM, CD)* Table
π Year Table
Note: the mappings produce sets for attributes. Joining to form records is trivial because we have OIDs for table rows (e.g., for each Car).
Solution: Infer Mappings (continued)
π Model μ (Year, Colour, Price, Auto, Air Cond, AM/FM, CD)* Table
π Price Table
ρ Colour←Feature π Colour Table ∪ ρ Auto←Feature π Auto β Auto Table ∪ ρ Air Cond.←Feature π Air Cond. β Air Cond. Table ∪ ρ AM/FM←Feature π AM/FM β AM/FM Table ∪ ρ CD←Feature π CD β CD Table
Application #7: Accessing the Hidden Web
Obtaining Data Behind Forms
- Web information is stored in databases
- Databases are accessed through forms
- Forms are designed in various ways
Hidden Web Extraction System
A user query such as "Find green cars costing no more than $9000." flows through an Input Analyzer to the site form; the retrieved page(s) flow through an Output Analyzer to produce the extracted information, with the application extraction ontology guiding both analyzers.
Application #8: Ontology Generation
TANGO: Table Analysis for Generating Ontologies
1. Recognize and normalize table information
2. Construct mini-ontologies from tables
3. Discover inter-ontology mappings
4. Merge mini-ontologies into a growing ontology
Recognize and Normalize Table Information
Country       Population (July 2001 est.)   Religions
Afghanistan   26,813,057                    Sunni Muslim 84%, Shi'a Muslim 15%, other 1%
Albania       3,510,484                     Muslim 70%, Albanian Orthodox 20%, Roman Catholic 10%

Construct a Mini-Ontology (from the table above)
Discover Mappings
Merge
Application #9: Challenging Applications (e.g., Bioinformatics)
Large Extraction Ontologies
Complex Semi-Structured Pages
Additional Analysis Opportunities
- Sibling Page Comparison
- Semi-automatic Lexicon Update
- Seed Ontology Recognition
Sibling Page Comparison
Comparing sibling pages separates the attributes, which stay constant across pages, from the values, which vary.
Semi-automatic Lexicon Update
- Additional protein names
- Additional source species or organisms
Seed Ontology Recognition
Example record (excerpt): nucleus; zinc ion binding; nucleic acid binding; linear; NP_079345; 9606; Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo; Homo sapiens; human; GTTTTTGTGTT……….ATAAGTGCATTAACGGCCCACATG; FLJ14299; msdspagsnprtpessgsgsgg………tagpyyspyalygqrlasasalgyq; hypothetical protein FLJ14299; 8; eight; "8:?p\s?12"; "8:?p11.2"; "8:?p11.23"; "37,?612,?680"; "37,?610,?585"
Limitations and Pragmatics
- Data-rich, narrow domain
- Ambiguities ~ context assumptions
- Incompleteness ~ implicit information
- Common-sense requirements
- Knowledge prerequisites
- …
Busiest Airport?
- Chicago: 928,735 landings (Nat. Air Traffic Controllers Assoc.); 931,000 landings (Federal Aviation Admin.)
- Atlanta: 58,875,694 passengers (Sep., latest numbers available)
- Memphis: 2,494,190 metric tons (Airports Council Int'l.)
Ambiguous: whom do we trust, and how do they count?
Important qualification: "latest numbers available".
Dow Jones Industrial Average
            High       Low        Last       Chg
30 Indus    10527.03   10321.35   10409.85   +85.18
20 Transp    3038.15    2998.60    3008.16    +9.83
15 Utils      268.78     264.72     266.45    +1.72
66 Stocks    3022.31    2972.94    2993.12   +19.65
Graphics and icons: a chart on the same page reports +44.07 and 10,409.85 for the same date.
Implicit information: the table's change is weekly (stated only in the upper corner of the page); the chart's is daily (not stated).
Some Key Ideas
- Data, information, and knowledge
- Data frames: knowledge about everyday data items; recognizers for data in context
- Ontologies: resilient extraction ontologies; shared conceptualizations
- Limitations and pragmatics
Some Research Issues
- Building a library of open-source data recognizers
- Precisely finding and gathering relevant information: subparts of larger data; scattered data (linked, factored, implied); data behind forms in the hidden web
- Improving concept matching
- Heuristic orchestration
- Application of NLP techniques
- Calculations, unit conversions, data normalization, …

www.deg.byu.edu