Presentation is loading. Please wait.

Presentation is loading. Please wait.

Learning in Search Slide 1 Hal Daumé III Learning in Search Hal Daumé III Information Sciences Institute University of Southern California.

Similar presentations


Presentation on theme: "Learning in Search Slide 1 Hal Daumé III Learning in Search Hal Daumé III Information Sciences Institute University of Southern California."— Presentation transcript:

1 Learning in Search Slide 1 Hal Daumé III (hdaume@isi.edu) Learning in Search Hal Daumé III Information Sciences Institute University of Southern California hdaume@isi.edu Acknowledgments: John Langford, Daniel Marcu...or structured prediction when the loss, features and structure don't decompose

2 Learning in Search Slide 2 Hal Daumé III (hdaume@isi.edu) Result: Searn (= “Search + Learn”) Computationally efficient Strong theoretical guarantees State-of-the-art performance Broadly applicable Easy to implement

3 Learning in Search Slide 3 Hal Daumé III (hdaume@isi.edu) Natural Language Processing Problems Machine Translation Summarization Information Extraction Question Answering Parsing Understanding InputOutput Arabic sentenceEnglish sentence Long documentsShort documents DocumentDatabase Corpus + query Sentence Answer Syntactic tree Logical sentence Want: Arbitrary features, arbitrary losses

4 Learning in Search Slide 4 Hal Daumé III (hdaume@isi.edu) Ready-to-use Data Millions of Words (English side)

5 Learning in Search Slide 5 Hal Daumé III (hdaume@isi.edu) Progress in Statistical MT insistent Wednesday may recurred her trips to Libya tomorrow for flying Cairo 6-4 ( AFP ) - an official announced today in the Egyptian lines company for flying Tuesday is a company " insistent for flying " may resumed a consideration of a day Wednesday tomorrow her trips to Libya of Security Council decision trace international the imposed ban comment. And said the official " the institution sent a speech to Ministry of Foreign Affairs of lifting on Libya air, a situation her receiving replying are so a trip will pull to Libya a morning Wednesday ". Egyptair Has Tomorrow to Resume Its Flights to Libya Cairo 4-6 (AFP) - said an official at the Egyptian Aviation Company today that the company egyptair may resume as of tomorrow, Wednesday its flights to Libya after the International Security Council resolution to the suspension of the embargo imposed on Libya. " The official said that the company had sent a letter to the Ministry of Foreign Affairs, information on the lifting of the air embargo on Libya, where it had received a response, the first take off a trip to Libya on Wednesday morning ". 2 0 0 2 2 0 0 3 slide from C. Wayne, DARPA

6 Learning in Search Slide 6 Hal Daumé III (hdaume@isi.edu) Phrase-Based Translation Australia is with North diplomatic relations have that countries few one of Korea 澳洲是北与韩有邦交的国家之一少数 Austral ia isis wit h Nort h Kore a isis diploma tic relation s one of the few countri es Austral ia isis diploma tic relation s wit h Nort h Kore a isis one of the few countri es [Koehn, Och and Marcu, NAACL 2003] NP-hard, even for simple models

7 Learning in Search Slide 7 Hal Daumé III (hdaume@isi.edu) The IBM Model [Brown et al., 1993] Mary did not slap the green witch Mary not slap slap slap the green witch n(3|slap) Maria no daba una botefada a la bruja verde d(j|i) Mary not slap slap slap NULL the green witch P NULL Maria no daba una botefada a la verde bruja t(la|the) Use the EM algorithm for training the parameters

8 Learning in Search Slide 8 Hal Daumé III (hdaume@isi.edu) Training Phrase-Based MT Systems Maria no daba una botefada a la bruja verde witc h green the slap not did Mary [Koehn, Och and Marcu, NAACL03]

9 Learning in Search Slide 9 Hal Daumé III (hdaume@isi.edu) Search Problem Phrase map Foreign input Example feature functions g: Word translation probabilities Phrase translation probabilities Distortion probabilities Length penalty Language model probability... Horribly intractable in general Best performer seems to be multi-pass large beam search.

10 Learning in Search Slide 10 Hal Daumé III (hdaume@isi.edu) Learning in NLP Applications Model Features Learning Search

11 Learning in Search Slide 11 Hal Daumé III (hdaume@isi.edu) Structured Prediction Formulated as a maximization problem: SearchModelFeaturesLearning Search (and learning) tractable only for very simple Y(x) and f

12 Learning in Search Slide 12 Hal Daumé III (hdaume@isi.edu) Search for Phrase-based MT L' homme a mangé un sandwich délicieux. The man ate a tasty / delicieux sandwich / delicieux tasty / sandwich sandwich / sandwich delicious / delicieux tasty sandwich / sandwich alien / sandwich... Multi-class classification problem?

13 Learning in Search Slide 13 Hal Daumé III (hdaume@isi.edu) Reducing Structured Prediction Key Assumption: Optimal Policy for training data Given:input, true output and state; Return:best successor state No assumption of single output

14 Learning in Search Slide 14 Hal Daumé III (hdaume@isi.edu) Conditional Random Fields Max-margin Markov Networks Searn Optimal Policies Optimal Path Approximately Optimal Policy Optimal Policy Weak Feedback Loss-augmented Search Normalized Search ⊇ ⊃ ⊃ ? ⊃ ? ⊃ Most NLP Problems

15 Learning in Search Slide 15 Hal Daumé III (hdaume@isi.edu) How to Learn in Search Idea: Train based only on optimal path (ala MEMM) Better Idea: Train based only on optimal policy, then train based on optimal policy + a little learned policy then train based on optimal policy + a little more learned policy then... eventually only use learned policy

16 Learning in Search Slide 16 Hal Daumé III (hdaume@isi.edu) Searn (= “Search + Learn”) Initialize current policy = optimal policy The man ate a tasty / delicieux sandwich / sandwich delicious / delicieux sandwich / delicieux... Repeat: Search each example using current policy Expand each state into a classification problem Compute per-class costs by completing search sandwich that is tasty sandwich L = 0 L = 0.1 L = 0.3 L = 0.8 Learn new policy using a cost-sensitive classifier Interpolate: cur = {new} + (1- ){cur}

17 Learning in Search Slide 17 Hal Daumé III (hdaume@isi.edu) Theoretical Analysis Theorem: For conservative, after 2T 3 ln T iterations, the loss of the learned policy h is bounded as follows: Loss of the optimal policy Average multiclass classification loss Worst case per-step loss No sampling assumptions! But also no generalization claims (left to classifier)

18 Learning in Search Slide 18 Hal Daumé III (hdaume@isi.edu) From Greedy to Beam Search Clone + Move KILL! Clone + Move KILL!

19 Learning in Search Slide 19 Hal Daumé III (hdaume@isi.edu) Proof on Concept: Sequence Labels Spanish Named Entity Recognition SVM ISO CR F Searn Searn+data Handwriting Recognition LRLR SVMM3N Searn Searn+data Chunking+ Tagging Incremental Perceptron LaSO CR F Searn

20 Learning in Search Slide 20 Hal Daumé III (hdaume@isi.edu) New Task: Summarization Give me information about the Falkland islands war Argentina was still obsessed with the Falkland Islands even in 1994, 12 years after its defeat in the 74- day war with Britain. The country's overriding foreign policy aim continued to be winning sovereignty over the islands. That's too much information to read! The Falkland islands war, in 1982, was fought between Britain and Argentina. That's perfect! Standard approach is sentence extraction, but that is often deemed to “coarse” to produce good, very short summaries. We wish to also drop words and phrases => document compression

21 Learning in Search Slide 21 Hal Daumé III (hdaume@isi.edu) Structure of Search S1S2S3S4S5Sn... = frontier node = summary node ➢ Lay sentences out sequentially ➢ Generate a dependency parse of each sentence ➢ Mark each root as a frontier node ➢ Repeat: ➢ Choose a frontier node node to add to the summary ➢ Add all its children to the frontier ➢ Finish when we have enough words To make search more tractable, we run an initial round of sentence extraction @ 5x length The man ate a big sandwich The man ate abig sandwich Argentina was still obsessed with the Falkland Islands even in 1994, 12 years after its defeat in the 74- day war with Britain. The country's overriding foreign policy aim continued to be winning sovereignty over the islands. Optimal Policy not analytically available; Approximated with beam search.

22 Learning in Search Slide 22 Hal Daumé III (hdaume@isi.edu) Example Output (40 word limit) Sentence Extraction + Compression: Argentina and Britain announced an agreement, nearly eight years after they fought a 74-day war a populated archipelago off Argentina's coast. Argentina gets out the red carpet, official royal visitor since the end of the Falklands war in 1982. Vine Growth (Searn): Argentina and Britain announced to restore full ties, eight years after they fought a 74-day war over the Falkland islands. Britain invited Argentina's minister Cavallo to London in 1992 in the first official visit since the Falklands war in 1982. 6 Diplomatic ties restored3 Falkland war was in 1982 5 Major cabinet member visits3 Cavallo visited UK 5 Exchanges were in 19922 War was 74-days long 3 War between Britain and Argentina +24 +13

23 Learning in Search Slide 23 Hal Daumé III (hdaume@isi.edu) Results

24 Learning in Search Slide 24 Hal Daumé III (hdaume@isi.edu) Discussion + Open Questions

25 Learning in Search Slide 25 Hal Daumé III (hdaume@isi.edu) CRFs vs. Searn on Tagging B A What happens with the transition? Based on discussions with Charles S + Andrew McC CRF: Make it look as bad as possible Greedy Searn: If we get to A, make it look as good as possible Beam Searn: Make it look bad if we hit both A & B Make it look good otherwise

26 Learning in Search Slide 26 Hal Daumé III (hdaume@isi.edu) Increasing the Beam Length of search path Classifier loss As beam gets larger: T gets larger! l avg gets smaller!

27 Learning in Search Slide 27 Hal Daumé III (hdaume@isi.edu) When to Use Structure? For many problems, the loss fn forces it (MT, summarization, parsing, coreference, etc.) For some problems, it doesn't (tagging under hamming loss,... anything else?)... information theoretically, no need... for training-data “fitting,” can only hurt... the rub must be in generalization: previous tags act as “summaries” (seems quite similar to Ando+Zhang)

28 Learning in Search Slide 28 Hal Daumé III (hdaume@isi.edu) Solving Combinatorial Problems? Beam search really bad for such problems Use hill-climbing search ala TSP Here, Searn learns to make changes that improves the outputs (Not clear this problem is worth solving)

29 Learning in Search Slide 29 Hal Daumé III (hdaume@isi.edu) Take Home Messages Want arbitrary features and losses Corresponding “argmax” problems intractable Must run search Then, we should use search during learning Need an optimal policy in order to do learning Can always approximate the optimal policy


Download ppt "Learning in Search Slide 1 Hal Daumé III Learning in Search Hal Daumé III Information Sciences Institute University of Southern California."

Similar presentations


Ads by Google