Published by Abigayle Phillips. Modified over 9 years ago.
1
CROSSWORD PUZZLE – TEAM 2
  Members: Derek van Assche, Cody Hansen, Jonathan Juett, Seungbum Park, Anthony Vito
  Date: 4/22/2014
2
Agenda
  Tasks
  Resources
  .puz files
  Components
3
Tasks
  Create components to handle patterns
    Extend current list of clue patterns
    Write regular expressions for clue patterns
  Design and implement a GUI
  Download a larger set of .puz files
4
Resources
  Vehicle make and model database [1]
    7,352 vehicle entries
    Model years from 1909 to 2013
5
Resources
  Notable Names Database [2]
    Contains information on noteworthy people
6
Resources
  List of rock bands and singers [3]
    674 entries
7
Resources
  Dictionary [4]
    Contains words found at dictionary.com
    Large list of words and word-like tokens
8
Resources
  BabelNet [5]
    Integration of WordNet, Open Multilingual WordNet, Wikipedia, and OmegaWiki
9
Resources
  WordNet [6]
    Large lexical database
    Nouns, verbs, adjectives, and adverbs grouped in synsets
  Google Ngram [7]
    Corpus collected from online text by Google
    Information about n-grams of various lengths and their frequencies
  Natural Language Toolkit (NLTK) [8]
    Provides an interface to WordNet
    Text processing for classification, tokenization, stemming, tagging, parsing, and semantic reasoning
10
.puz files
  Sources:
    http://chronicle.com/section/Crosswords/43
      Number of .puz files: 192
    http://puzzle.about.com
      Number of .puz files: 18
    http://bobklahn.home.comcast.net/~bobklahn/CrosSynergy/index.html
      Number of .puz files: 86
    http://www.fleetingimage.com/wij/xyzzy/12-dr.html
      Number of .puz files: 96
  Puzzles located at /data0/projects/cross/more_puz
11
Component
  Input:
    1A Capital of Canada 6
    1D Jaguar, e.g. 6
  Output:
    [ ]
    1A OTTAWA 5
    1A QUEBEC 2
    1D FELINE 3
    1D CANINE 1
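The input and output lines above could be parsed and emitted roughly as follows. This is a minimal sketch, not the team's code: it assumes the trailing number on an input line is the answer length and the trailing number on an output line is a confidence score, which is an interpretation of the slide. The names `Clue`, `Candidate`, `parse_clue`, and `format_candidate` are hypothetical.

```python
import re
from typing import NamedTuple


class Clue(NamedTuple):
    slot: str    # grid position such as "1A" or "1D"
    text: str    # clue text such as "Capital of Canada"
    length: int  # assumed: number of letters in the answer


class Candidate(NamedTuple):
    slot: str
    answer: str
    score: int   # assumed: higher means more confident


# slot, clue text, then a trailing integer
CLUE_LINE = re.compile(r"^(\d+[AD])\s+(.+)\s+(\d+)$")


def parse_clue(line: str) -> Clue:
    """Split one input line into slot, clue text, and answer length."""
    m = CLUE_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"not a clue line: {line!r}")
    return Clue(m.group(1), m.group(2), int(m.group(3)))


def format_candidate(c: Candidate) -> str:
    """Render one candidate in the 'slot ANSWER score' output form."""
    return f"{c.slot} {c.answer} {c.score}"
```

Each component would then take a `Clue` and return zero or more `Candidate` values for the same slot.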
12
Component
  Antonyms pattern:
  Example:
    ZENITH 1 2010 NYT Nadir's opposite
    ABHOR 2 2010 unk Antonym for "adore"
  Regular Expression:
    ^([A-Za-z]+)(([\'][s]){0,1}|([s][\']){0,1})\s(opposite|antonym)
    ^([Oo]pposite|[Aa]ntonym)\s(of|for)\s(\'|\"){0,1}([\w]+)(\'|\"){0,1}$
  Resources used:
    NLTK for access to WordNet
  Evaluation:
    MRAR score of 1.0
    One correct answer out of one attempt
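A sketch of how the two antonym expressions might be applied to pull the target word out of a clue. The regular expressions are tidied equivalents of the ones on the slide (non-capturing groups, `?` instead of `{0,1}`). The slide says WordNet was queried through NLTK; to keep this example self-contained, a tiny hand-written table `TOY_ANTONYMS` stands in for that lookup, and the function names are hypothetical.

```python
import re

# "Nadir's opposite", "Jones' antonym"
POSSESSIVE_FORM = re.compile(r"^([A-Za-z]+)(?:'s|s')?\s(?:opposite|antonym)")
# "Opposite of hot", 'Antonym for "adore"'
OF_FOR_FORM = re.compile(
    r"^(?:[Oo]pposite|[Aa]ntonym)\s(?:of|for)\s['\"]?(\w+)['\"]?$"
)

# Assumption: stand-in for a WordNet antonym lookup, for illustration only.
TOY_ANTONYMS = {"nadir": ["ZENITH"], "adore": ["ABHOR"]}


def antonym_target(clue):
    """Return the lowercased word whose antonym the clue asks for, or None."""
    for pattern in (POSSESSIVE_FORM, OF_FOR_FORM):
        m = pattern.search(clue)
        if m:
            return m.group(1).lower()
    return None


def antonym_candidates(clue, length):
    """Return antonyms of the target word that fit the answer length."""
    target = antonym_target(clue)
    if target is None:
        return []
    return [w for w in TOY_ANTONYMS.get(target, []) if len(w) == length]
```

Filtering by answer length is an addition here; the slide does not say whether the original component did so.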
13
Component
  E.g. clue pattern:
  Example:
    HORSE 5 2002 CSy Chestnut, e.g.
  Regular Expression:
    ,[\s][Ee]\.g\.$
  Resources used:
    NLTK for access to WordNet hypernyms
  Evaluation:
    MRAR score of 0.5
    Two correct answers out of four attempts
14
Component
  Say pattern:
  Example:
    MISS 4 1999 NYT Overshoot, say
  Regular Expression:
    , [Ss]ay$
  Resources used:
    NLTK for access to WordNet hypernyms
  Evaluation:
    MRAR score of 0
    Zero correct answers out of five attempts
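The e.g. and say patterns both end a clue with a marker and leave an example term in front of it. A minimal sketch of that extraction step follows, using the two suffix expressions from the slides. The hypernym lookup is replaced by a small hand-written table, `TOY_HYPERNYMS`, which is an assumption for illustration and not WordNet data; the function names are hypothetical.

```python
import re

EG_SUFFIX = re.compile(r",[\s][Ee]\.g\.$")
SAY_SUFFIX = re.compile(r", [Ss]ay$")

# Assumption: stand-in for WordNet hypernyms reached through NLTK.
TOY_HYPERNYMS = {"chestnut": ["HORSE", "TREE"], "overshoot": ["MISS"]}


def example_term(clue):
    """Return the lowercased text before ', e.g.' or ', say', or None."""
    for suffix in (EG_SUFFIX, SAY_SUFFIX):
        m = suffix.search(clue)
        if m:
            return clue[:m.start()].strip().lower()
    return None


def hypernym_candidates(clue, length):
    """Return stand-in hypernyms of the example term that fit the length."""
    term = example_term(clue)
    if term is None:
        return []
    return [w for w in TOY_HYPERNYMS.get(term, []) if len(w) == length]
```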
15
Component
  In Brief pattern:
  Example:
    ETS 2 2008 NYT Some "Stargate SG-1" characters, in brief
  Regular Expression:
    , [Ii]n brief
  Resources used:
    NLTK for access to WordNet synonyms
  Evaluation:
    Matched zero clues out of thirty
16
Component
  Kind of pattern:
  Example:
    SEAT 2 2000 unk Kind of belt
  Regular Expression:
    [Kk]ind of
  Resources used:
    NLTK for access to WordNet synonyms
  Evaluation:
    Matched zero clues out of thirty
17
Component
  Antonym, E.g., Say, In Brief, Kind of patterns:
  Ways to improve:
    Incorporate scoring system
    Increase performance
      Accessing WordNet can be slow
18
Component
  Rock Band pattern:
  Example:
    SID 5 2001 CSy Rocker Vicious
  Regular Expression:
    \"[\w\s]+\"\s(rock\s)?band|[Rr]ocker[\s]+.*[A-Z]|\".+\"[\s]+.*[Rr]ocker|\'.+\'[\s]+.*[Rr]ocker
  Resources used:
    Rock Band Database
  Evaluation:
    MRAR score of 0.5139
    Two correct answers out of two attempts
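A sketch of how the rock band expression could gate a database lookup. The four alternatives are taken from the slide, reading the character class as `[A-Z]`. The lookup itself is a guess at one workable approach: `TOY_ARTISTS` is a two-entry stand-in for the rock band database, and `complete_name` is a hypothetical helper that returns the part of an artist's name the clue leaves out.

```python
import re

ROCK_CLUE = re.compile(
    r'"[\w\s]+"\s(rock\s)?band'   # "Some Name" rock band
    r"|[Rr]ocker[\s]+.*[A-Z]"     # Rocker Vicious
    r'|".+"[\s]+.*[Rr]ocker'      # "Song Title" rockers
    r"|'.+'[\s]+.*[Rr]ocker"      # 'Song Title' rockers
)

# Assumption: stand-in for the rock band and singer list.
TOY_ARTISTS = ["Sid Vicious", "Styx"]


def is_rock_clue(clue):
    """True if the clue matches any of the rock band alternatives."""
    return ROCK_CLUE.search(clue) is not None


def complete_name(clue, length):
    """Return name parts missing from the clue that fit the answer length."""
    if not is_rock_clue(clue):
        return []
    clue_words = {w.lower() for w in re.findall(r"[A-Za-z]+", clue)}
    answers = []
    for artist in TOY_ARTISTS:
        parts = artist.split()
        # Only consider artists that share at least one word with the clue.
        if any(p.lower() in clue_words for p in parts):
            answers += [
                p.upper()
                for p in parts
                if p.lower() not in clue_words and len(p) == length
            ]
    return answers
```

A clue such as `"Come Sail Away" rockers` passes `is_rock_clue` but yields no answer here, because this toy lookup knows nothing about song titles.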
19
Component
  Rock Band pattern:
  Ways to improve:
    Create a more complete database
      Include well-known songs
    Expand list of current patterns
      Include songs: "Come Sail Away" rockers => Styx
20
Component
  Vehicle pattern:
  Example:
    ACCORD 2 2004 unk Honda model
  Regular Expression:
    [Mm]odels?|[Vv]ehicles?
  Resources used:
    Vehicle make and model database
  Evaluation:
    MRAR score of 0.7246
    Precision of 0.9891
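One way the vehicle expression and the make and model database could fit together is sketched below. The trigger expression is the one on the slide; `TOY_MODELS_BY_MAKE` is a hand-written stand-in for the vehicle database, and the lookup logic (find a known make in the clue, return its models of the right length) is an assumption rather than the team's documented method.

```python
import re

VEHICLE_CLUE = re.compile(r"[Mm]odels?|[Vv]ehicles?")

# Assumption: stand-in for the make and model database, keyed by make.
TOY_MODELS_BY_MAKE = {
    "honda": ["Accord", "Civic"],
    "pontiac": ["GTO"],
}


def vehicle_candidates(clue, length):
    """Return models of any make named in the clue that fit the length."""
    if not VEHICLE_CLUE.search(clue):
        return []
    candidates = []
    for word in re.findall(r"[A-Za-z]+", clue):
        for model in TOY_MODELS_BY_MAKE.get(word.lower(), []):
            if len(model) == length:
                candidates.append(model.upper())
    return candidates
```

Because the trigger needs the word "model" or "vehicle", a clue that names only a make is not handled by this sketch.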
21
Component
  Vehicle pattern:
  Ways to improve:
    Expand list of current patterns
      '70s Pontiac => Pontiac GTO
22
Component
  And/Or pattern:
  Example:
    LEVIS 4 2001 WaP Strauss and Stubbs
    ABE 2 1997 unk Lincoln or Burrows
  Regular Expression:
    [A-Z][a-z]+[\s](and|or)[\s][A-Z][a-z]+$
  Resources used:
    NNDB (Notable Names Database)
  Evaluation:
    MRAR score of 0.57
    Precision of 0.6355
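The two examples suggest that the answer is a first name shared by both surnames, pluralized when the clue uses "and". The sketch below implements that reading. It is an inference from the examples, not a description of the original component. The expression is the slide's with capture groups added around the two surnames, and `TOY_PEOPLE` is a four-entry stand-in for NNDB.

```python
import re
from collections import defaultdict

AND_OR = re.compile(r"([A-Z][a-z]+)[\s](and|or)[\s]([A-Z][a-z]+)$")

# Assumption: stand-in for NNDB, as (first name, surname) pairs.
TOY_PEOPLE = [
    ("Levi", "Strauss"),
    ("Levi", "Stubbs"),
    ("Abe", "Lincoln"),
    ("Abe", "Burrows"),
]


def shared_first_names(clue):
    """Return first names common to both surnames in an and/or clue."""
    m = AND_OR.search(clue)
    if m is None:
        return []
    left, conjunction, right = m.groups()
    first_names = defaultdict(set)
    for first, last in TOY_PEOPLE:
        first_names[last].add(first)
    shared = sorted(first_names[left] & first_names[right])
    # Assumed rule: "and" asks for the plural ("LEVIS"), "or" for the singular.
    suffix = "S" if conjunction == "and" else ""
    return [name.upper() + suffix for name in shared]
```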
23
Component
  And/Or pattern:
  Ways to improve:
    Integrate with Wikipedia or BabelNet
      Saturn and Mars => Planets
    Extend list of current patterns
      The third son of Adam and Eve
24
Component
  Single Word pattern:
  Example:
    ABANDON 2 1999 USA Desert
  Regular Expression:
    [A-Z0-9][a-z0-9]+$
  Resources used:
    BabelNet for synonyms, hyponyms, hypernyms
  Evaluation:
    Undetermined
25
Component
  Single Word pattern:
  Ways to improve:
    Implement BabelNet API
      Accessing HTML is slow
      Eliminates timeout issue
    Implement stemming
      Helps solve conjugated clues
      Challenged => Dared
      Use NLTK
26
Component
  Prefix pattern:
  Example:
    STETHO 3 2003 NYT Prefix with scope
  Regular Expression:
    [Pp]refix
  Resources used:
    dictionary.com all_words.text file on Morana
  Evaluation:
    MRAR score of 0.33
    Precision of 0.665
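A prefix clue can be answered from a plain word list by finding words that end with the clue's stem and returning what comes before it. The sketch below shows that idea under two assumptions: the stem is the last word of the clue, and `TOY_WORDS` stands in for the dictionary word list named on the slide. `prefix_candidates` is a hypothetical name.

```python
import re

PREFIX_CLUE = re.compile(r"[Pp]refix")

# Assumption: stand-in for the dictionary word list.
TOY_WORDS = ["stethoscope", "telescope", "microscope", "periscope", "scope"]


def prefix_candidates(clue, length, words=TOY_WORDS):
    """Return prefixes of the given length that combine with the clue's stem."""
    if not PREFIX_CLUE.search(clue):
        return []
    stem = clue.split()[-1].lower()  # assumed: the stem is the last word
    found = {
        word[: -len(stem)].upper()
        for word in words
        if word.endswith(stem) and len(word) - len(stem) == length
    }
    return sorted(found)
```

For "Prefix with scope" and a six-letter answer this returns `STETHO`; asking for four letters returns both `PERI` and `TELE`, so a real component would still need a way to rank candidates.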
27
Component
  Preceder pattern:
  Example:
    SEMI 3 2007 CSy Final preceder
  Regular Expression:
    [Pp]receder
  Resources used:
    Google n-grams
  Evaluation:
    Undetermined
28
Component
  Preceder pattern:
  Ways to improve:
    Implement downloaded corpus
      Eliminates timeout issue
29
Thanks for listening. Are there any questions?
30
Sources
  [1] https://github.com/n8barr/automotive-model-year-data
  [2] http://www.nndb.com/
  [3] http://www.allmusic.com/
  [4] http://dictionary.reference.com/
  [5] http://babelnet.org/
  [6] http://wordnet.princeton.edu/
  [7] http://www.google.com/url?q=http%3A%2F%2Fgoogleresearch.blogspot.com%2F2006%2F08%2Fall-our-n-gram-are-belong-to-you.html&sa=D&sntz=1&usg=AFQjCNEFJhdTDMnlK11Tg9vumlsRfDgq9Q
  [8] http://www.nltk.org/