1 The Webclopedia
Eduard Hovy, Ulf Hermjakob, Chin-Yew Lin, Mike Junk, Laurie Gerber
Information Sciences Institute, University of Southern California
2 Webclopedia background
Goal: view the web as an encyclopedia, using
–question answering
–multi- and single-doc summarization
–translation
Project start: March 2000
Initial task: QA, in TREC-9
Research foci:
–robust parsing
–QA typology and answer template learning
–multi-document summarization
3 Core QA approach
Many ways to ask the same thing: Question variations
Many ways to answer the same thing: Answer variations
Collect the major classes of ‘thing’: create a QA typology
Classify Q variation patterns and A variation patterns appropriately in the QA typology
Questions:
–what is the QA typology?
–what do the Q and A variation patterns contain?
–how many are there, and how do you collect them?
4 Architecture
IR. Steps: create query from question (WordNet-expand); retrieve top 1000 documents. Engines: MG (Sydney) (Lin); AT&T (TREC) (Lin)
Segmentation. Steps: segment each document into topical segments. Engines: fixed-length (not used); TextTiling (Hearst 94) (Lin); C99 (Choi 00) (Lin); MAXNET (Lin 00, not used)
Segment ranking. Steps: score each sentence in each segment, using WordNet expansion; rank segments. Engine: FastFinder (Junk)
Question parsing. Steps: parse question; find desired semantic type. Engine: CONTEX (Hermjakob)
Segment parsing. Steps: parse segment sentences. Engine: CONTEX (Hermjakob)
Matching. Steps: match general constraint patterns against parse trees; match desired semantic type against parse-tree elements; match desired words against words in sentences. Engine: matcher (Junk)
Ranking and answer extraction. Steps: rank candidate answers; extract and format them. Engine: part of matcher (Junk)
QA typology: categorize QA types in a taxonomy (Gerber)
Constraint patterns: identify likely answers in relation to other parts of the sentence (Gerber)
Pipeline: input question → create query → retrieve documents → segment documents → rank segments → parse top segments (question parsed in parallel) → match segments against question → rank and prepare answers → output answers
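To make the "create query from question (WordNet-expand)" step concrete, here is a minimal Python sketch using NLTK's WordNet interface. The stopword filtering, the three-synonym cap, and the helper name build_query are illustrative assumptions, not details of the actual MG/AT&T setup.

# Minimal sketch of the query-construction step: expand question content
# words with a few WordNet synonyms before sending the query to the IR engine.
# Requires the NLTK 'stopwords' and 'wordnet' data packages.
from nltk.corpus import stopwords, wordnet

def build_query(question: str, max_synonyms: int = 3):
    """Turn a question into a bag of query terms, expanding each content
    word with a handful of WordNet synonym lemmas for recall."""
    stop = set(stopwords.words("english"))
    terms = []
    for word in question.lower().rstrip("?").split():
        if word in stop:
            continue
        terms.append(word)
        # WordNet expansion: collect synonym lemmas from all synsets of the word.
        synonyms = {
            lemma.replace("_", " ")
            for syn in wordnet.synsets(word)
            for lemma in syn.lemma_names()
            if lemma.lower() != word
        }
        terms.extend(sorted(synonyms)[:max_synonyms])
    return terms

# Example: build_query("What is the capital of Burkina Faso?")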
5 QA Typology
Analyzed 17,384 questions (from answers.com)
Typology of typical Q forms: 94 nodes (47 leaf nodes)
(THING
 ((AGENT (NAME (FEMALE-FIRST-NAME (EVE MARY...)) (MALE-FIRST-NAME (LAWRENCE SAM...)))) (COMPANY-NAME (BOEING AMERICAN-EXPRESS)) JESUS ROMANOFF...)
 (ANIMAL-HUMAN (ANIMAL (WOODCHUCK YAK...)) PERSON)
 (ORGANIZATION (SQUADRON DICTATORSHIP...))
 (GROUP-OF-PEOPLE (POSSE CHOIR...))
 (STATE-DISTRICT (TIROL MISSISSIPPI...))
 (CITY (ULAN-BATOR VIENNA...))
 (COUNTRY (SULTANATE ZIMBABWE...))))
(PLACE (STATE-DISTRICT (CITY COUNTRY...)) (GEOLOGICAL-FORMATION (STAR CANYON...)) AIRPORT COLLEGE CAPITOL...)
(ABSTRACT
 (LANGUAGE (LETTER-CHARACTER (A B...)))
 (QUANTITY (NUMERICAL-QUANTITY INFORMATION-QUANTITY MASS-QUANTITY MONETARY-QUANTITY TEMPORAL-QUANTITY ENERGY-QUANTITY TEMPERATURE-QUANTITY ILLUMINATION-QUANTITY (SPATIAL-QUANTITY (VOLUME-QUANTITY AREA-QUANTITY DISTANCE-QUANTITY))... PERCENTAGE)))
(UNIT
 ((INFORMATION-UNIT (BIT BYTE... EXABYTE)) (MASS-UNIT (OUNCE...)) (ENERGY-UNIT (BTU...)) (CURRENCY-UNIT (ZLOTY PESO...)) (TEMPORAL-UNIT (ATTOSECOND... MILLENIUM)) (TEMPERATURE-UNIT (FAHRENHEIT KELVIN CELCIUS)) (ILLUMINATION-UNIT (LUX CANDELA)) (SPATIAL-UNIT ((VOLUME-UNIT (DECILITER...)) (DISTANCE-UNIT (NANOMETER...)))) (AREA-UNIT (ACRE))... PERCENT))
(TANGIBLE-OBJECT
 ((FOOD (HUMAN-FOOD (FISH CHEESE...)))
  (SUBSTANCE ((LIQUID (LEMONADE GASOLINE BLOOD...)) (SOLID-SUBSTANCE (MARBLE PAPER...)) (GAS-FORM-SUBSTANCE (GAS AIR))...))
  (INSTRUMENT (DRUM DRILL (WEAPON (ARM GUN))...) (BODY-PART (ARM HEART...)) (MUSICAL-INSTRUMENT (PIANO)))...
  *GARMENT *PLANT DISEASE)
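Since the typology is a hierarchy, it can be held in code as a nested structure and searched for a node's ancestry. The fragment below is a simplified, illustrative slice (node placement only loosely follows the slide), and the helper path_to is hypothetical.

# Simplified, illustrative slice of the QA typology as a nested dict;
# only a few of the 94 nodes are shown, and the grouping is approximate.
TYPOLOGY = {
    "THING": {
        "AGENT": {"NAME": {}, "COMPANY-NAME": {}, "ORGANIZATION": {}, "GROUP-OF-PEOPLE": {}},
        "PLACE": {"STATE-DISTRICT": {}, "CITY": {}, "COUNTRY": {}, "GEOLOGICAL-FORMATION": {}},
        "ABSTRACT": {"LANGUAGE": {}, "QUANTITY": {"NUMERICAL-QUANTITY": {}, "MONETARY-QUANTITY": {}}},
        "TANGIBLE-OBJECT": {"FOOD": {}, "SUBSTANCE": {}, "INSTRUMENT": {}},
    }
}

def path_to(node, tree=TYPOLOGY, prefix=()):
    """Return the chain of ancestors from the root down to `node`, or None."""
    for name, children in tree.items():
        here = prefix + (name,)
        if name == node:
            return here
        found = path_to(node, children, here)
        if found:
            return found
    return None

# path_to("CITY") -> ("THING", "PLACE", "CITY")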
6 Typology example: Organization
7 Linking Qs and As: Match templates
Hand-constructed over 500 templates:
–Formulated templates in terms of the parse tree.
–Separately developed Q variations and A variations, then cross-multiplied them.
–Most popular types: agent, patient, XofY (President of Costa Rica), born, nationality, where_event.
Problems:
–Core concepts ‘hide’ in other words: “French automaker Peugeot…”; “Grant’s birthplace…”
–No logical entailment: “Who led the Branch Davidians?” has answers near “followers of”
–WordNet term expansion is too general and can mislead
–Word-level matching is too general: “Juan Gomez, former President of Costa Rica…”; “if Van Dyke is the nominee…”
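Webclopedia's templates are stated over CONTEX parse trees; purely as a rough surface-level analogue, the sketch below pairs one hypothetical question variation with one answer variation for a "born" type question using regular expressions. The patterns and the helper match_born are illustrative assumptions.

import re

# Illustrative surface-level analogue of one Q/A template pair; the real
# templates match parse-tree structure, not strings.
Q_BORN = re.compile(r"where was (?P<person>.+?) born\??$", re.IGNORECASE)
A_BORN = r"{person}(?: was)? born in (?P<place>[A-Z][\w ,]+)"

def match_born(question, sentence):
    """If the question fits the 'born' template, look for the paired answer
    pattern in the candidate sentence and return the extracted place."""
    q = Q_BORN.match(question.strip())
    if not q:
        return None
    a = re.search(A_BORN.format(person=re.escape(q.group("person"))), sentence)
    return a.group("place") if a else None

# match_born("Where was Mozart born?", "Mozart was born in Salzburg in 1756.")
# -> "Salzburg in 1756"  (over-greedy; a real parse-tree template would be tighter)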
8 Parser: Performance and ontology
Parser trained on 3,000 Penn Treebank sentences, and then with 251 questions added (5-fold cross-validation; Hermjakob 2000)
Parser ontology: 10,977 nodes; 102 used in the QA task:
C-AT-LOCATION, C-DATE, C-DATE-RANGE, C-DATE-WITH-DAY-OF-THE-WEEK, C-DATE-WITH-YEAR, C-PHONE-NUMBER, C-TEMP-LOC, C-TEMP-LOC-WITH-YEAR, C-TIME, I-EADJ-COLOR, I-EADJ-NATIONALITY, I-EN-ANIMAL, I-EN-AREA-QUANTITY, I-EN-BEVERAGE, I-EN-BODY-PART, I-EN-CITY, I-EN-CONTINENT, I-EN-COUNTRY, I-EN-DAY-OF-THE-WEEK, ...
Lists used to help the system: all oceans, planets, US major league sports teams, etc.
Parser output to the Matcher contains the semantic type of the answer.
9 QTargets identified by parser
(((I-EN-PROPER-PERSON S-PROPER-NAME))) [98]
  q4.1 Who is the author of the book, "The Iron Lady: A Biography of Margaret Thatcher"?
  q4.5 What is the name of the managing director of Apricot Computer?
(((C-DATE) (C-TEMP-LOC-WITH-YEAR) (C-DATE-RANGE)) ((EQ C-TEMP-LOC))) [66]
  q4.15 When was London's Docklands Light Railway constructed?
  q4.22 When did the Jurassic Period end?
(((I-EN-NUMERICAL-QUANTITY) (I-ENUM-CARDINAL))) [51]
  q4.70 How many lives were lost in the China Airlines' crash in Nagoya, Japan?
  q4.82 How many consecutive baseball games did Lou Gehrig play?
(((I-EN-MONETARY-QUANTITY))) [12]
  q4.2 What was the monetary value of the Nobel Peace Prize in 1989?
  q4.4 How much did Mercury spend on advertising in 1993?
(((Q-DEFINITION))) [35]
  q4.30 What are the Valdez Principles?
  q4.115 What is Head Start?
(((Q-WHY-FAMOUS-PERSON))) [35]
  q4.207 What is Francis Scott Key best known for?
  q4.222 Who is Anubis?
(((Q-ABBREVIATION-EXPANSION))) [16]
  q4.224 What does laser stand for?
  q4.238 What does the abbreviation OAS stand for?
Ones the parser cannot handle (87 of 893):
  q4.3 What does the Peugeot company make?
  q4.61 What brand of white rum is made in Cuba?
  q4.63 What nuclear-powered Russian submarine sank in the Norwegian Sea on April 7, 1989?
  q4.68 What does El Nino mean in spanish?
  q4.79 What did Shostakovich write for Rostropovich?
  ...
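In Webclopedia the qtarget comes out of the CONTEX parse of the question; purely to illustrate the idea, here is a keyword-rule stand-in that maps question wording onto a few of the qtarget labels from this slide. The rules and the helper name qtarget are assumptions, not the parser's behavior.

# Toy qtarget (expected answer type) assignment from question keywords.
# The label strings mirror the slide; the trigger rules are illustrative only.
RULES = [
    (("who", "whom"),         "I-EN-PROPER-PERSON"),
    (("when",),               "C-DATE"),
    (("how many",),           "I-EN-NUMERICAL-QUANTITY"),
    (("how much",),           "I-EN-MONETARY-QUANTITY"),
    (("what is", "what are"), "Q-DEFINITION"),
    (("stand for",),          "Q-ABBREVIATION-EXPANSION"),
]

def qtarget(question):
    q = question.lower()
    for triggers, target in RULES:
        if any(t in q for t in triggers):
            return target
    return None  # e.g. "What does the Peugeot company make?" stays untyped

# qtarget("How many consecutive baseball games did Lou Gehrig play?")
# -> "I-EN-NUMERICAL-QUANTITY"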
10 Candidate answer sentence parse tree:
(:SURF "Ouagadougou, the capital of Burkina Faso" :CAT S-NP :LEX "Ouagadougou" :CLASS I-EN-PROPER-PLACE
 :SUBS (((HEAD) (:SURF "Ouagadougou" :CAT S-NP :LEX "Ouagadougou" :CLASS I-EN-PROPER-PLACE
                 :SUBS (((HEAD) (:SURF "Ouagadougou" :CAT S-PROPER-NAME :LEX "Ouagadougou" :CLASS I-EN-PROPER-PLACE)))))
        ((DUMMY) (:SURF "," :CAT D-COMMA :LEX "," :SPAN ((35 36))))
        ((MOD) (:SURF "the capital of Burkina Faso" :CAT S-NP :LEX "capital" :CLASS I-EN-CAPITAL)))...)
Question type constraints:
constraint_set(question,1.0) {
  (sem_equal [question OBJ] I-EN-INTERR-PRONOUN-WHAT)
  (set x? [question SUBJ])
  (sem_equal [question HEAD] I-EV-BE)
}
Answer type constraints:
constraint_set(qtarget2,1.0) {
  (set answer? [answer.*])
  (sem_equal answer? I-EN-PROPER-PLACE)
  (syn_equal answer? S-PROPER-NAME)
}
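A minimal sketch of what the matcher has to do with the answer-type constraint set above: walk the candidate parse tree and return a node whose semantic class and syntactic category satisfy the constraints. The dict layout and the helper find_answer are simplifications of the CONTEX structures shown, not the actual matcher.

# Walk a simplified parse tree and return the first node satisfying
# (sem_equal answer? I-EN-PROPER-PLACE) and (syn_equal answer? S-PROPER-NAME).
def find_answer(node, sem, syn):
    if node.get("CLASS") == sem and node.get("CAT") == syn:
        return node
    for _role, child in node.get("SUBS", []):
        hit = find_answer(child, sem, syn)
        if hit:
            return hit
    return None

tree = {
    "SURF": "Ouagadougou, the capital of Burkina Faso",
    "CAT": "S-NP", "CLASS": "I-EN-PROPER-PLACE",
    "SUBS": [
        ("HEAD", {"SURF": "Ouagadougou", "CAT": "S-PROPER-NAME",
                  "CLASS": "I-EN-PROPER-PLACE", "SUBS": []}),
        ("MOD",  {"SURF": "the capital of Burkina Faso", "CAT": "S-NP",
                  "CLASS": "I-EN-CAPITAL", "SUBS": []}),
    ],
}
# find_answer(tree, "I-EN-PROPER-PLACE", "S-PROPER-NAME")["SURF"] -> "Ouagadougou"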
11 And when the parser fails? Fallback: word-level matching
Process: slide a window over the potential answer region; score each position; select the maximum.
Factors:
–w: window width (modulated by gaps of various lengths: “white house” vs. “white car and house”)
–r: rank of desired target
–I: window word information content (inverse log frequency)
–q: number of different question words, with specific rewards (bonus q = 3.0)
–e: penalty for question word expansion using WordNet synsets (e = 0.8)
–b: boost for main verb match, target words, proper names, etc. (b = 2.0)
–u: semantic type match (with superconcept match, penalty u = 0.1)
Score = (500 / (500 + w)) * (1 / r) * Σ_phrases [ Σ_words ( I^1.5 * e * u * b * q ) ]^1.5
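Reading the formula this way, a minimal sketch of the window scorer follows: per matched phrase, sum the per-word contributions I^1.5 * e * u * b * q, raise each phrase sum to the power 1.5, and scale by window width and target rank. How the individual factor values (q = 3.0, e = 0.8, b = 2.0, u = 0.1) attach to each matched word is an assumption here, and the function name window_score is illustrative.

# Sketch of the fallback window score under the reconstruction above.
# phrases: matched phrases in the window, each a list of per-word factor
# dicts with keys I, e, u, b, q; w: window width; r: rank of desired target.
def window_score(phrases, w, r):
    total = 0.0
    for phrase in phrases:
        phrase_sum = sum(
            (word["I"] ** 1.5) * word["e"] * word["u"] * word["b"] * word["q"]
            for word in phrase
        )
        total += phrase_sum ** 1.5
    return (500.0 / (500.0 + w)) * (1.0 / r) * total

# Example: one two-word phrase of exact, unexpanded matches (e = u = b = q = 1).
# window_score([[{"I": 4.2, "e": 1, "u": 1, "b": 1, "q": 1},
#                {"I": 3.1, "e": 1, "u": 1, "b": 1, "q": 1}]], w=12, r=1)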
12 Performance analysis
Goal: compare the performance of each module. Ran 50 test runs over 6 weeks.
[Table: per-question scores for Question 9 through Question 17, with columns IR hits, finder hits, constraints, qtarget, qword, total, and annotations No IR, Top segment, Best score, QA patterns, Fallback: word match, Sem. type from parser]
13 Webclopedia module performance
Modules: IR query construction; IR retrieval; Text segmenter; Segment ranker; Parser; Qtarget from parser; QA template; Qtarget with match; Qword plus fallback; Overall score
Measures: TREC9 query construction; answer in top 100 docs; answer in top 50 docs; answer in single segment; answer in top 5/10/20/50/100; parse successful and correct; Qtarget correct; template succeeds; match succeeds; ok with expansion
Results: 90%+; merge MG, ATT (no major effect, except speedup); 44/52/64/75/78%; 90%+; ~85%; 5%+; ~25%; ~15%; 31%
14 Scores on TREC-9 questions
QA type   n   qtargets %   ISI %   avg %   ratio (ISI % / avg %)
------------------------------------------------------------
PROPER-PERSON 83 96.39% 42.21% 30.47% 1.39
DATE 64 100.00% 43.75% 26.03% 1.68
OTHER 62 30.65% 12.96% 13.34% 0.97
CITY 44 93.18% 58.60% 36.92% 1.59
PROPER-PLACE 41 95.12% 48.78% 30.89% 1.58
NUMERICAL-QUANTITY 40 95.00% 31.21% 23.08% 1.35
WHY-FAMOUS 35 71.43% 29.62% 20.12% 1.47
DEFINITION 32 18.75% 36.56% 19.30% 1.89
AGENT 25 100.00% 39.00% 34.75% 1.12
TITLED-WORK 21 80.95% 10.71% 12.10% 0.89
ACRONYM-EXPANSION 17 0.00% 28.43% 12.16% 2.34
PROPER-COMPANY 15 86.67% 28.00% 24.24% 1.15
OTHER-NAME 14 71.43% 5.95% 8.07% 0.74
ANIMAL 13 53.85% 13.72% 10.90% 1.26
PROPER-ORGANIZATION 12 50.00% 25.69% 24.49% 1.05
CONTINENT/WORLD-REGION 11 100.00% 34.09% 25.93% 1.31
TEXT 11 18.18% 16.67% 7.92% 2.10
DISTANCE-QUANTITY 8 87.50% 46.87% 22.59% 2.07
STATE-DISTRICT 8 100.00% 21.87% 11.98% 1.83
GEOGRAPHICAL-PERSON 7 57.14% 28.57% 15.07% 1.90
PROPER-SPORT-TEAM 7 100.00% 31.43% 18.38% 1.71
DISEASE 7 71.43% 24.29% 16.35% 1.49
MASS-QUANTITY 6 66.67% 4.17% 6.82% 0.61
MONETARY-QUANTITY 6 100.00% 11.67% 18.50% 0.63
OCCUPATION-PERSON 6 50.00% 50.00% 21.67% 2.31
PROPER-COLLEGE 6 83.33% 9.72% 12.21% 0.80
LIQUID 6 0.00% 0.00% 0.81% 0.00
TEMPORAL-QUANTITY 5 80.00% 25.00% 15.94% 1.57
ANIMAL-FOOD 5 100.00% 36.67% 14.40% 2.55
PROPER-MOUNTAIN 5 80.00% 53.00% 37.08% 1.43
INSTRUMENT 5 0.00% 0.00% 4.08% 0.00
SPORT 5 20.00% 14.00% 11.17% 1.25
COUNTRY 5 100.00% 60.00% 19.51% 3.08
LOCATION/HABITAT 4 75.00% 25.00% 16.53% 1.51
BODY-PART 4 50.00% 0.00% 0.00% ----
LANGUAGE 3 0.00% 51.11% 31.62% 1.62
AREA-QUANTITY 3 100.00% 0.00% 3.54% 0.00
PLANT 3 0.00% 16.67% 9.85% 1.69
PROPER-ANIMAL 3 100.00% 0.00% 19.61% 0.00
SPEED-QUANTITY 2 50.00% 0.00% 2.53% 0.00
TEMPERATURE-QUANTITY 2 100.00% 20.00% 10.20% 1.96
ABBREVIATION 2 0.00% 25.00% 13.38% 1.87
DATE-RANGE 2 100.00% 50.00% 12.37% 4.04
CAUSE-EFFECT 2 0.00% 0.00% 3.41% 0.00
LOCATION 1 100.00% 0.00% 19.80% 0.00
PROPER-PLANET 1 100.00% 100.00% 47.07% 2.12
TANGIBLE-OBJECT 1 0.00% 0.00% 17.17% 0.00
REASON 1 100.00% 0.00% 0.00% ----
CONSEQUENT 1 0.00% 0.00% 0.00% ----
PROPER-RIVER 1 100.00% 0.00% 3.03% 0.00
VEHICLE 1 0.00% 0.00% 1.52% 0.00
PROPER-OCEAN 1 100.00% 100.00% 52.27% 1.91
PROPER-SHIP 1 100.00% 0.00% 5.81% 0.00
SUBSTANCE 1 0.00% 50.00% 56.92% 0.88
PERCENTAGE 1 100.00% 0.00% 3.74% 0.00
COLOR 1 100.00% 0.00% 1.01% 0.00
PURPOSE 1 0.00% 20.00% 2.37% 8.43
PHONE-NUMBER 1 100.00% 0.00% 3.03% 0.00
LAST-NAME 1 100.00% 0.00% 14.09% 0.00
15 Current work
QA:
–extend parser ontology
–integrate QA templates, Qtargets, Qwords, fallback words into a single template notation
–learn QA templates automatically, systematically covering the typology
–problem Qs: time sensitivity of answer, reliability of source, follow-up dialogue with user, etc.
Multi-doc summarization:
–create baseline system
–focus on core item extraction and core item ordering/organization