LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 15: 10/16.

1 LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 15: 10/16

2 Administrivia No lecture this Thursday

3 Today’s Topics Midterm review Finite State Transducers (FST)

4 Question 1 Download the file wsj.txt (~ 50K lines) Write a Perl program that finds all lines containing any possible form of the idiom take... advantage of... How many are there in wsj.txt? Submit your program Submit the lines returned by your program

5 Question 1 First hit on Google: –take advantage (of someone): to use someone's weakness to improve your own situation. Mr. Smith often takes advantage of my friendship and leaves the unpleasant tasks for me to do. –take advantage (of something): to use an opportunity to get or achieve something. He took advantage of the prison's education program to earn a college degree. There are peaches and strawberries grown on the farm, and I sure take full advantage of them. Usage notes: often said of someone who has opportunities that others do not have: The rich can take advantage of clever accounting tricks to avoid taxes. –Cambridge Dictionary of American Idioms, Cambridge University Press 2003

6 Answer 1 1.Investors took advantage of Tuesday 's stock rally 2.Like other forms of arbitrage, it merely seeks to take advantage of momentary discrepancies 3.As usually practiced it takes advantage of a rather basic concept 4.So if index arbitrage is simply taking advantage of thin inefficiencies 5.`` If you could get the rhythm of the program trading, you could take advantage of it. '' 6.Mrs. Gorman took advantage of low prices 7.According to Upjohn 's estimates, only 50 % to 60 % of the 1,100 eligible employees will take advantage of the plan. 8.Nissan has increased earnings more than market share by cutting costs and by taking advantage of a general surge 9.Mr. Peladeau took his first big gamble 25 years ago, when he took advantage of a strike at La Presse 10.In addition, the two companies will develop new steam turbine technology, such as the plants ordered by Florida Power, and even utilize each other 's plants at times to take advantage of currency fluctuations. 11.One of GE 's goals when it bought 80 % of Kidder in 1986 was to take advantage of `` syngeries '' 12.I take advantage of this opportunity given to me by The Wall Street Journal And taking more direct action has the advantage of avoiding sharp increases 13.To take advantage of local expertise and custom 14.Several blue-chip companies tapped the new-issue market yesterday to take advantage of falling interest rates. 15.He also noted that a strong sterling market yesterday might have helped cocoa in New York as arbitragers took advantage of the currency move. 16.My kids ' college education looms as perhaps the greatest future opportunity for spending, although I 'll probably have to cash in their toy portfolio to take advantage of it. 17.As the ad 's tone implies, the Texas spirit is pretty xenophobic these days, and Lone Star is n't alone in trying to take advantage of that. 
18.IBM, which Gartner Group said generates 22 % of its revenue in this market, should be able to take advantage of its loyal following 19.Erik Keller, a Gartner Group analyst, said organizational changes may still be required to really take advantage of CIM 's capabilities

7 Answer 1 20.These latter-day scalawags would be ill-advised to take advantage of the situation 21.Most of trading action now is from professional traders who are trying to take advantage of the price swings 22.For instance, First Quadrant Corp., an asset allocator based in Morristown, N.J., said it quickly boosted stock positions in its `` aggressive '' accounts to 75 % from 55 % to take advantage of plunging prices Friday. 23.Others are doing `` index arbitrage '' a strategy of taking advantage of price discrepancies 24.The campaign, created by Omnicom Group 's DDB Needham agency, takes advantage of the eye-catching photography 25.According to industry lawyers, the ruling gives pipeline companies an important second chance to resolve remaining disputes and take advantage of the cost-sharing mechanism. 26.Thanks to a new air-traffic agreement and the ability of Irish travel agents to issue Aeroflot tickets, tourists here are taking advantage of Aeroflot 's reasonable prices 27.But, `` You never can tell, '' he added, `` you have to take advantage of opportunities. 28.A broad rally began when several major processors began buying futures contracts, apparently to take advantage of the price dip. 29.`` We hope to take advantage of it, '' 30.And we hope to take advantage of panics 31.To take full advantage of the financial opportunities 32.Specifically, it must understand how real-estate markets overreact to shifts in regional economies and then take advantage of these opportunities.

8 Answer 1 Perl Program: –a simple way to exclude the case shown earlier open (F,$ARGV[0]) or die "$ARGV[0] not found!\n"; while (<F>) { print $_ if (/\b(take|takes|taking|taken|took)\b(.*) advantage of/ && $2 !~ /\bthe\b/) }
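
For comparison, the same filter can be sketched in Python. This is only an illustrative rendering of the Perl pattern above, not part of the original answer; the function name is_idiom is hypothetical and the sample lines are copied from the answer slides.

```python
import re

# Same two patterns as the Perl program: a form of "take", any material,
# then " advantage of"; the material in between must not contain "the".
TAKE = re.compile(r"\b(take|takes|taking|taken|took)\b(.*) advantage of")
THE = re.compile(r"\bthe\b")

def is_idiom(line):
    """True if the line matches the idiom pattern and group 2 has no 'the'."""
    m = TAKE.search(line)
    return bool(m) and not THE.search(m.group(2))

samples = [
    "Investors took advantage of Tuesday 's stock rally",
    "To take full advantage of the financial opportunities",
    # the false hit excluded by the second test
    "And taking more direct action has the advantage of avoiding sharp increases",
]
hits = [s for s in samples if is_idiom(s)]
```

The third sample is rejected because the text captured between "taking" and "advantage of" contains "the".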

9 Question 2 Give a regular grammar in Prolog notation that accepts strings with an odd number of a’s (#a’s = 1,3,5,...) followed by an even number of b’s (#b’s = 2,4,6,...), i.e. a^n b^m, n odd, m even Examples: –aaabb –abbbb –aaaaabb –*aabb –*aaab Submit your program Show it works on the given examples

10 Answer 2 Regular grammar in Prolog DCG format: 1.s --> [a], b. 2.s --> [a], d. 3.b --> [a], s. 4.d --> [b], e. 5.e --> [b]. 6.e --> [b], d. Run | ?- s([a,a,a,b,b],[]). yes | ?- s([a,b,b,b,b],[]). yes | ?- s([a,a,a,a,a,b,b],[]). yes | ?- s([a,a,b,b],[]). no | ?- s([a,a,a,b],[]). no

11 Question 3 Using an extra argument with regular grammar rules in Prolog DCG format, give a grammar that accepts L = a^n b^m, n even (n=2,4,6,...), m is the odd number closest to but not exceeding n/2 Note: L is a non-regular language Examples: –aab –aaaab –*aaaabb –aaaaaabbb –*aaaaaabbbb –aaaaaaaabbb –*aaaaaaaabbbb –*aaaaaaaabbbbb Show your program works on the above examples

12 Answer 3 Program 1.s(X) --> [a], b(s(X)). 2.b(X) --> [a], c(s(X)). 3.b(X) --> [a], s(s(X)). 4.c(s(s(0))) --> [b]. 5.c(s(s(s(s(0))))) --> [b]. 6.c(s(s(X))) --> [b], d(X). 7.d(s(s(X))) --> [b], c(X). Run | ?- s(0,[a,a,b],[]). yes | ?- s(0,[a,a,a,a,b],[]). yes | ?- s(0,[a,a,a,a,b,b],[]). no | ?- s(0,[a,a,a,a,a,a,b,b,b],[]). yes | ?- s(0,[a,a,a,a,a,a,b,b,b,b],[]). no | ?- s(0,[a,a,a,a,a,a,a,a,b,b,b],[]). yes | ?- s(0,[a,a,a,a,a,a,a,a,b,b,b,b],[]). no | ?- s(0,[a,a,a,a,a,a,a,a,b,b,b,b,b],[]). no
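
As a cross-check on the definition of L in Question 3, here is a minimal Python sketch that tests membership directly by counting. It is not the DCG above; the function name in_language is an assumption made for illustration.

```python
import re

def in_language(s):
    """Check s = a^n b^m with n even (n >= 2) and m the largest odd number <= n/2."""
    match = re.fullmatch(r"(a+)(b+)", s)
    if not match:
        return False
    n, m = len(match.group(1)), len(match.group(2))
    if n % 2 != 0:
        return False
    half = n // 2
    # largest odd number not exceeding n/2
    expected = half if half % 2 == 1 else half - 1
    return m == expected

examples = ["aab", "aaaab", "aaaabb", "aaaaaabbb", "aaaaaabbbb",
            "aaaaaaaabbb", "aaaaaaaabbbb", "aaaaaaaabbbbb"]
results = [in_language(e) for e in examples]
```

The results follow the starred and unstarred examples of Question 3 in order.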

13 Question 4 Give a regexp for the language described in Question 2: a^n b^m, n odd, m even

14 Answer 4 a^n b^m, n odd, m even: a(aa)*(bb)+
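
The regexp can be tried against the examples from Question 2. This is a small sketch using Python's re module; fullmatch is used so the whole string must match, and the name accepts is hypothetical.

```python
import re

# a(aa)* gives an odd number of a's; (bb)+ gives an even, non-zero number of b's
ODD_A_EVEN_B = re.compile(r"a(aa)*(bb)+")

def accepts(s):
    """True if the entire string is in a^n b^m with n odd and m even (m >= 2)."""
    return ODD_A_EVEN_B.fullmatch(s) is not None

good = ["aaabb", "abbbb", "aaaaabb"]
bad = ["aabb", "aaab"]
```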

15 Question 5 Give a regexp for the complement of the following FSA [figure: FSA with states 1 to 5 over the alphabet {a, b}]

16 Answer 5 Original machine is deterministic Flip the states (accepting becomes non-accepting and vice versa) [figures: original FSA and flipped FSA]

17 Answer 5 Notice 5 is a dead-end state Erase 5 [figures: FSA before and after removing state 5]

18 Answer 5 Eliminated state 5 Eliminate states 2 and 4 [figures: FSA with states 1 to 4, and reduced FSA with states 1 and 3 and arcs labeled ab and ba] (ab|ba)*

19 Answer 5 Eliminated state 5 Equations –E1 = aE2 | bE4 | λ –E2 = bE3 –E4 = aE3 –E3 = aE2 | bE4 | λ Eliminate E4 –E1 = aE2 | baE3 | λ –E3 = aE2 | baE3 | λ Eliminate E2 –E1 = abE3 | baE3 | λ –E3 = abE3 | baE3 | λ Group E3 –E1 = (ab|ba)E3 | λ –E3 = (ab|ba)E3 | λ Solve E3 –E3 = (ab|ba)* –E1 = (ab|ba)(ab|ba)* | λ = (ab|ba)*
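
The result can be sanity-checked by brute force. The sketch below reads a four-state machine off the equations (an arc for each term such as aE2, and an accepting state wherever λ appears, i.e. states 1 and 3) and compares it with (ab|ba)* on every string up to a chosen length. The encoding and the function names are assumptions made for illustration; agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product

# Arcs read off the equations: E1 = aE2 | bE4 | λ, E2 = bE3, E4 = aE3, E3 = aE2 | bE4 | λ
DELTA = {(1, "a"): 2, (1, "b"): 4, (2, "b"): 3,
         (4, "a"): 3, (3, "a"): 2, (3, "b"): 4}
ACCEPTING = {1, 3}  # the equations containing λ

def machine_accepts(s):
    state = 1
    for ch in s:
        state = DELTA.get((state, ch))
        if state is None:  # no arc: reject
            return False
    return state in ACCEPTING

def regex_accepts(s):
    return re.fullmatch(r"(ab|ba)*", s) is not None

def agree_up_to(max_len):
    """True if machine and regexp agree on all strings over {a,b} up to max_len."""
    return all(
        machine_accepts("".join(w)) == regex_accepts("".join(w))
        for n in range(max_len + 1)
        for w in product("ab", repeat=n)
    )
```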

20 Question 6 Give the deterministic FSA corresponding to: [figure: FSA not preserved in transcript]

21 Answer 6 Deterministic machine [figure: deterministic FSA with states 1 to 8 and arcs labeled a, b, c]

22 Finite State Transducers Just like Finite State Automata (FSA) except for an output tape Mealy Machine formulation: –at each transition, an FST can read an input symbol and output a (different) symbol onto the tape Background reading –Chapter 3 of the textbook

23 Morphology morphology –words are composed of morphemes –morpheme: basic semantic unit, e.g. -ee in employee –Inflectional: no change in category, e.g. V -ed → V –can carry information about tense, person, number, gender, case etc. –Derivational: category-changing, e.g. V -able → A –very productive

24 Walkers. Standees. © Sandiway Fong sign above travelator at Pittsburgh International Airport

25 Today’s Topic Finite State Transducers (FST) for morphological processing –... also Prolog implementation

26 Recall Finite State Automata (FSA) from lecture 8 –(Q,s,f,Σ,δ) 1.set of states (Q): {s,x,y} must be a finite set 2.start state (s): s 3.end state(s) (f): y 4.alphabet (Σ): {a, b} 5.transition function δ: signature: character × state → state δ(a,s)=x δ(a,x)=x δ(b,x)=y δ(b,y)=y [figure: FSA with states s, x, y]
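
The five components above can be written down almost literally as a table-driven recognizer. The following is a minimal Python sketch, assuming δ is stored as a dictionary keyed by (character, state) pairs to mirror the signature on the slide; the name accepts is hypothetical.

```python
# Transition function δ: (character, state) -> state
DELTA = {("a", "s"): "x", ("a", "x"): "x", ("b", "x"): "y", ("b", "y"): "y"}
START = "s"
FINAL = {"y"}

def accepts(string):
    """Run the FSA over the string; accept if it ends in an end state."""
    state = START
    for ch in string:
        if (ch, state) not in DELTA:  # no transition defined: reject
            return False
        state = DELTA[(ch, state)]
    return state in FINAL
```

With this table the machine accepts one or more a's followed by one or more b's, for example accepts("aabbb") holds and accepts("ba") does not.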

27 Modeling English Adjectives using FSA –from section 3.2 of textbook examples –big, bigger, biggest, *unbig –cool, cooler, coolest, coolly –red, redder, reddest, *redly –clear, clearer, clearest, clearly, unclear, unclearly –happy, happier, happiest, happily –unhappy, unhappier, unhappiest, unhappily –real, *realer, *realest, unreal, really fsa (3.4) Initial machine is overly simple need more classes to make finer grain distinctions e.g. *unbig

28 Modeling English Adjectives using FSA divide adjectives into classes examples –adj-root 2 : big, bigger, biggest, *unbig –adj-root 2 : cool, cooler, coolest, coolly –adj-root 2 : red, redder, reddest, *redly –adj-root 1 : clear, clearer, clearest, clearly, unclear, unclearly –adj-root 1 : happy, happier, happiest, happily –adj-root 1 : unhappy, unhappier, unhappiest, unhappily –adj-root 1 : real, *realer, *realest, unreal, really fsa (3.5) However... Examples uncooler Smoking uncool and getting uncooler. google: 22,800 (2006), 10,900 (2005) *realer google: 3,500,000 (2006) 494,000 (2005) *realest google: 795,000 (2006) 415,000 (2005)

29 Modeling English Adjectives using FSA e.g. *unbig google: 2,590 hits (2007) morphology is productive morphemes carry (compositional) meaning can be used for dramatic effect unbig vs. small

30 The Mapping Problem To map between a surface form and the decomposition of a word into its components –e.g. root + Φ (person/number/gender) and other features using spelling rules Example: (3.11) Notes: ^ marks a morpheme boundary # is the end-of-word marker

31 Stage 1: Lexical → Intermediate Levels example: –f o x +N +PL (lexical) –f o x ^s# (intermediate) lexical level: –uninflected “dictionary” level intermediate level: –replace abstract morphemes by concrete ones key –+N : noun fox can also be a verb, but fox +V cannot combine with +PL –+PL : (abstract) plural morpheme realized in English as s (basic case) –boundary markers ^ and # for use by the spelling rule machine (later)

32 Stage 1: Lexical → Intermediate Levels example: –f o x +N +PL (lexical) –f o x ^s# (intermediate) machine idea –character-by-character correspondences –f → f –o → o –x → x –+N → ε (ε = empty string) –+PL → ^s# use a Finite State Machine with input/output mapping –Finite State Transducer (FST)

33 Stage 1: Lexical → Intermediate Levels Example: –g o o s e +N +PL (lexical) –g e e s e # (intermediate) Example: –g o o s e +N +SG (lexical) –g o o s e # (intermediate) Example: –m o u s e +N +PL (lexical) –m i ε c e # (intermediate) Example: –s h e e p +N +PL (lexical) –s h e e p # (intermediate)

34 Stage 1: Lexical → Intermediate Levels 3.11 Notation: input : output f means f:f

35 Extension to Finite State Transducers (FST) [Mealy machine extension to FSA] –(Q,s,f,Σ,δ) 1.set of states (Q): {s,x,y} must be a finite set 2.start state (s): s 3.end state(s) (f): y 4.alphabet (Σ): pairs I:O –I = input alphabet, O = output alphabet –ε may be included in I and O 5.transition function (or matrix) δ: signature: i/o pair × state → state δ(a:b,s)=x δ(a:b,x)=x δ(b:a,x)=y δ(b:ε,y)=y [figure: FST with states s, x, y and arcs a:b, b:a, b:ε]
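
The same table-driven idea extends to the transducer: each entry now also carries an output symbol. This is a minimal sketch assuming the machine is run in the input-to-output direction only, with ε represented by the empty string; the name transduce is hypothetical.

```python
# δ for the FST: (input character, state) -> (output string, next state)
# The pair b:ε is encoded with "" as its output.
DELTA = {
    ("a", "s"): ("b", "x"),
    ("a", "x"): ("b", "x"),
    ("b", "x"): ("a", "y"),
    ("b", "y"): ("", "y"),
}
START = "s"
FINAL = {"y"}

def transduce(string):
    """Return the output tape for the input, or None if the input is rejected."""
    state, out = START, []
    for ch in string:
        step = DELTA.get((ch, state))
        if step is None:
            return None
        symbol, state = step
        out.append(symbol)
    return "".join(out) if state in FINAL else None
```

For example, the input aabbb is mapped to bba: each a is written as b, the first b as a, and the remaining b's produce no output.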

36 Finite State Automata (FSA) recall: one possible Prolog encoding strategy –define one predicate for each state taking one argument (the input string) consume input character call next state with remaining input string –query ?- s(L). call start state s

37 Finite State Automata (FSA) –define one predicate for each state take one argument (the input string), and consume input character call next state with remaining input string –query ?- s(L). i.e. call start state s –state s: (start state) s([a|L]) :- x(L). –state x: x([a|L]) :- x(L). x([b|L]) :- y(L). –state y: (end state) y([]). y([b|L]) :- y(L). [figure: FSA with states s, x, y] simple extension to FST: each predicate takes two arguments: input and output

38 Stage 1: Lexical → Intermediate Levels example –s0([f|L1],[f|L2]) :- s1(L1,L2). –s0([c|L1],[c|L2]) :- s3(L1,L2). –s1([o|L1],[o|L2]) :- s2(L1,L2). –s2([x|L1],[x|L2]) :- s5(L1,L2). –s3([a|L1],[a|L2]) :- s4(L1,L2). –s4([t|L1],[t|L2]) :- s5(L1,L2). –s5(['+N'|L1],L2) :- s6(L1,L2). –s6(['+PL'|L1],[^,s,#|L2]) :- s7(L1,L2). –s7([],[]). % end state

39 Stage 1: Lexical → Intermediate Levels FST queries –lexical → intermediate ?- s0([f,o,x,'+N','+PL'],X). –X = [f, o, x, ^, s, #] –intermediate → lexical ?- s0(X,[f,o,x,^,s,#]). –X = [f, o, x, '+N', '+PL'] –enumerator ?- s0(X,Y). –X = [f, o, x, '+N', '+PL'] –Y = [f, o, x, ^, s, #] ; –X = [c, a, t, '+N', '+PL'] –Y = [c, a, t, ^, s, #] ; No inversion of a transducer T: T^-1 switch input and output labels; in Prolog, simply change the call
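
The three query modes can be imitated outside Prolog by enumerating the input/output pairs of the machine. The sketch below is a rough Python analogue, not a translation of Prolog's unification: it relies on this particular machine having no cycles, so the set of pairs is finite. The names ARCS, pairs, generate and analyze are hypothetical.

```python
# state -> list of (input symbols, output symbols, next state), copied from the s0..s7 clauses
ARCS = {
    "s0": [(["f"], ["f"], "s1"), (["c"], ["c"], "s3")],
    "s1": [(["o"], ["o"], "s2")],
    "s2": [(["x"], ["x"], "s5")],
    "s3": [(["a"], ["a"], "s4")],
    "s4": [(["t"], ["t"], "s5")],
    "s5": [(["+N"], [], "s6")],                # +N maps to the empty string
    "s6": [(["+PL"], ["^", "s", "#"], "s7")],
    "s7": [],
}
FINAL = {"s7"}

def pairs(state="s0"):
    """Enumerate all (lexical, intermediate) pairs, like the query ?- s0(X,Y)."""
    if state in FINAL:
        yield [], []
    for inp, out, nxt in ARCS[state]:
        for rest_in, rest_out in pairs(nxt):
            yield inp + rest_in, out + rest_out

def generate(lexical):
    """Lexical to intermediate."""
    return [o for i, o in pairs() if i == lexical]

def analyze(intermediate):
    """Intermediate to lexical: the inverse direction, same machine."""
    return [i for i, o in pairs() if o == intermediate]
```

Inversion here is just a matter of which side of each pair is compared against the query, which parallels changing the call in Prolog.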

40 Stage 1: Lexical → Intermediate Levels Figure 3.17 (top half): tape view of input/output pairs

41 The Mapping Problem Example: (3.11) (Context-Sensitive) Spelling Rule: (3.5) –ε → e / { x, s, z } ^ __ s# ε rewrites to letter e in left context x^ or s^ or z^ and right context s#, i.e. insert e after the ^ when you see x^s# or s^s# or z^s#; in particular, we have x^s# → x^es#
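
The effect of the spelling rule can be illustrated with a regular-expression substitution. This is a stand-in for the rule's input/output behavior, not the FST implementation discussed in the lecture; the function names are hypothetical, and stripping ^ and # to obtain a surface form is an assumed final clean-up step.

```python
import re

def insert_e(intermediate):
    """Apply ε -> e / {x,s,z} ^ __ s# to an intermediate form such as 'fox^s#'."""
    # left context: x, s or z followed by ^ ; right context (lookahead): s#
    return re.sub(r"([xsz])\^(?=s#)", r"\1^e", intermediate)

def to_surface(intermediate):
    """Apply e-insertion, then drop the boundary markers (assumed clean-up)."""
    return insert_e(intermediate).replace("^", "").replace("#", "")
```

Forms that do not match the context, such as cat^s#, pass through insert_e unchanged.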

42 Stage 2: Intermediate → Surface Levels can also be implemented using an FST important! machine is designed to pass input not matching the rule through unmodified (rather than fail) implements context-sensitive rule q0 to q2: left context q3 to q0: right context

43 Stage 2: Intermediate → Surface Levels Example (3.17)

44 Stage 2: Intermediate → Surface Levels Transition table for FST in 3.14 pg.79

45 Stage 2: Intermediate → Surface Levels in Prolog (simplified) –q0([],[]). % final state –q0([^|L1],L2) :- !, q0(L1,L2). % ^:ε –q0([z|L1],[z|L2]) :- !, q1(L1,L2). –% repeat for s,x –q0([#|L1],[#|L2]) :- !, q0(L1,L2). –q0([X|L1],[X|L2]) :- \+ mentioned(X), q0(L1,L2). % other ! is known as the “cut” predicate –it affects how Prolog backtracks for another solution –it means “cut” the backtracking off –Prolog will not try any other possible matching rule on backtracking

46 Exercise Ungraded exercise: –Implement 3.14 in Prolog –Make sure you can do e-insertion and the inverse operation, i.e. go from surface form to intermediate form

