1
Videogame Project Progress
- Evaluated previous work
- Crawled Giantbomb game database
- Identified entities in review text
- Clustering adjectives
2
Named Entity Identification
- Problems such as: “7 Wonders of the Ancient World” is sometimes referred to as “7 Wonders”
- Yet the pattern varies for other titles
- First part of a title often refers to a series
- Can't just use “7”
- Giantbomb aliases are very incomplete
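The short-form problem above can be sketched as a candidate generator. This is a minimal illustration, not the project's method: the function name `candidate_short_forms`, the `min_tokens` parameter, and the stopword list are all assumptions. It proposes leading token spans of a title, refuses single-token forms such as “7”, and skips spans that end on a function word.

```python
# Hypothetical stopword list; a real system would likely need a larger one.
STOPWORDS = {"of", "the", "a", "an", "and", "in", "to", "for"}


def candidate_short_forms(title, min_tokens=2):
    """Return leading spans of `title` that might be used as short forms.

    Assumptions (not from the slides): a short form is a proper prefix of
    the title, has at least `min_tokens` tokens, and does not end on a
    stopword.
    """
    tokens = title.split()
    candidates = []
    # n is the prefix length; stop before len(tokens) so the full title is excluded.
    for n in range(min_tokens, len(tokens)):
        if tokens[n - 1].lower() in STOPWORDS:
            continue
        candidates.append(" ".join(tokens[:n]))
    return candidates


forms = candidate_short_forms("7 Wonders of the Ancient World")
```

Since the pattern varies between titles, output like this is only a list of candidates to check against review text, and it does not replace the incomplete alias data.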
3
Named Entity Identification
False starts
- Tried to use capitalization together with sentence splitting to more easily pick out potential abbreviations (particularly of games other than the one being reviewed)
- But most abbreviations occur in user reviews, which often lack reliable capitalization and punctuation
- Bulleted lists not preserved in corpus
4
Named Entity Identification
- Finding “7 Wonders” isn’t too hard on its own
- Prioritization
  - Is it part of the game name?
  - A series name?
  - An abbreviation for another game in the series?
  - A character?
  - Some other game’s name entirely?
- Testing led to repeated priority-order tweaks
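One way to picture the prioritization is an ordered list of checks where the first match wins. The sketch below is hypothetical throughout: the `ReviewContext` fields, the labels, the prefix-matching rule, and the order itself are assumptions. The slides only say the order was tweaked repeatedly, not what it ended up being.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewContext:
    """Hypothetical lookup data available while tagging one review."""
    reviewed_title: str
    series_names: set = field(default_factory=set)
    series_titles: set = field(default_factory=set)  # other games in the series
    characters: set = field(default_factory=set)
    other_titles: set = field(default_factory=set)


def is_prefix(mention, title):
    """True if the mention's tokens are a leading span of the title's tokens."""
    m, t = mention.lower().split(), title.lower().split()
    return 0 < len(m) <= len(t) and t[:len(m)] == m


def in_set(mention, names):
    return mention.lower() in {n.lower() for n in names}


# Assumed order; reordering this list is the "priority-order tweak".
PRIORITY = [
    ("reviewed_game", lambda m, c: is_prefix(m, c.reviewed_title)),
    ("series", lambda m, c: in_set(m, c.series_names)),
    ("series_sibling", lambda m, c: any(is_prefix(m, t) for t in c.series_titles)),
    ("character", lambda m, c: in_set(m, c.characters)),
    ("other_game", lambda m, c: any(is_prefix(m, t) for t in c.other_titles)),
]


def classify(mention, ctx):
    """Return the label of the first matching check, or None."""
    for label, check in PRIORITY:
        if check(mention, ctx):
            return label
    return None


# Illustrative data only.
ctx = ReviewContext(
    reviewed_title="7 Wonders of the Ancient World",
    series_names={"7 Wonders"},
    series_titles={"7 Wonders II"},
)
label = classify("7 Wonders", ctx)
```

Here “7 Wonders” matches both the reviewed game and the series name, so the label depends entirely on which check comes first in `PRIORITY`.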
5
Next
- Cluster adjectives
- Cluster on mutual information with abstract nouns in Google 2-gram data
- Using MK-means: every element belongs to the cluster of its closest centroid, as well as that of any centroid within threshold σ
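The MK-means membership rule can be sketched as follows. This is a minimal reading of the slide, under stated assumptions: “within threshold σ” is taken to mean the element's distance to the centroid is at most σ, distance is Euclidean, and each adjective is already a numeric vector (for example of mutual-information scores). The function names `assign` and `update_centroids` are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(point, centroids, sigma):
    """Indices of all clusters the point belongs to.

    Always includes the nearest centroid; also includes every centroid
    whose distance to the point is <= sigma (assumed interpretation).
    """
    dists = [euclidean(point, c) for c in centroids]
    nearest = min(range(len(centroids)), key=dists.__getitem__)
    within = {i for i, d in enumerate(dists) if d <= sigma}
    return sorted({nearest} | within)


def update_centroids(points, centroids, sigma):
    """One update step: each centroid becomes the mean of its members."""
    members = [[] for _ in centroids]
    for p in points:
        for i in assign(p, centroids, sigma):
            members[i].append(p)
    updated = []
    for old, ms in zip(centroids, members):
        if ms:
            updated.append(tuple(sum(col) / len(ms) for col in zip(*ms)))
        else:
            updated.append(old)  # keep an empty cluster's centroid in place
    return updated


centroids = [(1.0, 0.0), (0.0, 1.5), (5.0, 5.0)]
shared = assign((0.0, 0.0), centroids, sigma=2.0)
far = assign((10.0, 10.0), centroids, sigma=2.0)
```

With these toy values the origin falls within σ of the first two centroids and joins both clusters, while the far point joins only its nearest one.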