Detecting Economic Events Using a Semantics-Based Pipeline 22nd International Conference on Database and Expert Systems Applications (DEXA 2011) September.


1 Detecting Economic Events Using a Semantics-Based Pipeline
22nd International Conference on Database and Expert Systems Applications (DEXA 2011)
September 2, 2011
Alexander Hogenboom hogenboom@ese.eur.nl
Frederik Hogenboom fhogenboom@ese.eur.nl
Flavius Frasincar frasincar@ese.eur.nl
Uzay Kaymak kaymak@ese.eur.nl
Otto van der Meer 276933rm@student.eur.nl
Kim Schouten 288054ks@student.eur.nl
Erasmus University Rotterdam
PO Box 1738, NL-3000 DR Rotterdam, the Netherlands

2 Introduction (1)
News greatly impacts financial markets
Some of many recent examples:

3 Introduction (2)
It is important to identify economic events in news items automatically, accurately, and in a timely manner
This involves processing large amounts of unstructured data from heterogeneous sources
Domain-specific information captured in domain semantics facilitates detection of relevant concepts

4 Introduction (3)
SPEED: a Semantics-based Pipeline for Economic Event Detection
Our approach:
–Extracts financial events from emerging news (RSS feeds)
–Annotates news messages with meta-data
–Aims for fast processing in order to enable real-time use

5 SPEED: Framework

6 SPEED: Implementation (1)
Java-based pipeline using the General Architecture for Text Engineering (GATE)
GATE components used:
–English Tokenizer
–Sentence Splitter
–Part-Of-Speech Tagger
–Morphological Analyzer
Adaptations and additions required:
–Word Sense Disambiguation
–Ontology-based components

7 SPEED: Implementation (2)
Ontology Gazetteer:
–GATE uses an inefficient list of ontology concepts
–We employ a look-up tree based on hash maps
Word Group Look-Up:
–Tree-based approach using WordNet
Word Sense Disambiguator:
–Adaptation of the Structural Semantic Interconnections (SSI) algorithm
Event Phrase Gazetteer:
–Matches event concepts
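
The slide does not show how the look-up tree is built, so the following is only a minimal sketch of one way a hash-map-based tree can match multi-word concept labels in a single pass over the tokens. It is written in Python for brevity (SPEED itself is Java-based), and the class names, labels, and concept identifiers are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class LookupNode:
    """One tree node; children are stored in a hash map keyed by the next token."""

    def __init__(self) -> None:
        self.children: Dict[str, "LookupNode"] = {}
        self.concept: Optional[str] = None  # set when a complete label ends here


class ConceptLookupTree:
    """Hypothetical gazetteer: maps (multi-word) labels to ontology concepts."""

    def __init__(self) -> None:
        self.root = LookupNode()

    def add(self, label: str, concept: str) -> None:
        node = self.root
        for token in label.lower().split():
            node = node.children.setdefault(token, LookupNode())
        node.concept = concept

    def annotate(self, tokens: List[str]) -> List[Tuple[int, int, str]]:
        """Return (start, end, concept) for the longest label found at each position."""
        matches = []
        i = 0
        while i < len(tokens):
            node = self.root
            best = None
            j = i
            # Walk down the tree for as long as the next token is a known child.
            while j < len(tokens) and tokens[j].lower() in node.children:
                node = node.children[tokens[j].lower()]
                j += 1
                if node.concept is not None:
                    best = (i, j, node.concept)
            if best is not None:
                matches.append(best)
                i = best[1]  # continue after the matched label
            else:
                i += 1
        return matches


# Made-up labels and concept names, for illustration only.
tree = ConceptLookupTree()
tree.add("chief executive officer", "CEO")
tree.add("share value", "ShareValue")
tree.add("share", "Share")

tokens = "The chief executive officer said the share value rose".split()
matches = tree.annotate(tokens)
print(matches)
```

Each step down the tree is one hash-map look-up, so the cost of matching at a position depends on the length of the longest candidate label rather than on the number of concepts in the ontology, which is the advantage over scanning a flat list.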

8 SPEED: Implementation (3)
Event Pattern Recognition:
–Based on GATE Rule Transducer, utilizing JAPE patterns
–Additionally operates on event concepts
Ontology Instantiator:
–Retrieves event annotations in text
–Creates event individuals in ontology
–Updates affected concepts
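
As a rough illustration of these two steps, the sketch below matches a fixed sequence of annotation types and then stores each match as an individual under its event concept. It is not JAPE and not the authors' implementation: real JAPE rules support regular-expression operators over annotations, and the annotation types, the pattern, the concept name, and the example entities here are all hypothetical. The "updates affected concepts" step is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Annotation:
    kind: str                      # hypothetical annotation type
    text: str                      # text covered by the annotation
    concept: Optional[str] = None  # ontology concept, if a gazetteer attached one


# Hypothetical pattern: a company, then an event phrase, then a person.
PATTERN = ["Company", "EventPhrase", "Person"]


def recognize_events(annotations: List[Annotation]) -> List[dict]:
    """Slide PATTERN over the annotation sequence and collect exact matches."""
    events = []
    for start in range(len(annotations) - len(PATTERN) + 1):
        window = annotations[start:start + len(PATTERN)]
        if [a.kind for a in window] == PATTERN:
            subject, phrase, obj = window
            events.append(
                {"event": phrase.concept, "subject": subject.text, "object": obj.text}
            )
    return events


def instantiate(ontology: Dict[str, List[dict]], events: List[dict]) -> None:
    """Store each event as an individual under its event concept."""
    for event in events:
        ontology.setdefault(event["event"], []).append(
            {"subject": event["subject"], "object": event["object"]}
        )


# Made-up annotations for a sentence such as "Acme Corp appoints Jane Doe".
annotations = [
    Annotation("Company", "Acme Corp"),
    Annotation("EventPhrase", "appoints", concept="NewCEO"),
    Annotation("Person", "Jane Doe"),
]
events = recognize_events(annotations)
ontology: Dict[str, List[dict]] = {}
instantiate(ontology, events)
print(ontology)
```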

9 Evaluation (1)
Word Sense Disambiguator:
–Evaluated on SemCor
–Original SSI: precision 53%, recall 31%
–Adapted SSI: precision 59%, recall 59%
Entire framework:
–Evaluated on 200 news messages from Yahoo! Business & Technology feeds, annotated by three domain experts (with an inter-annotator agreement (IAA) of 66% or higher) for 10 events regarding: CEOs (60), Partners (23), Revenues (22), Presidents (22), Subsidiaries (46), Profits (33), Products (136), Share values (45), Losses (27), Competitors (50)
–Event instances: precision 86%, recall 81%
–Fully decorated events: precision 62%, recall 53%
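
The slide reports precision and recall only. To compare the four configurations with a single number, one option is the F1 score, the harmonic mean of the two. The sketch below derives it from the reported percentages; the F1 values are computed here and are not figures reported by the authors.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported on the slide.
reported = {
    "Original SSI": (0.53, 0.31),
    "Adapted SSI": (0.59, 0.59),
    "Event instances": (0.86, 0.81),
    "Fully decorated events": (0.62, 0.53),
}

f1_scores = {name: f1(p, r) for name, (p, r) in reported.items()}

for name, score in f1_scores.items():
    print(f"{name}: F1 = {score:.3f}")
```

On these figures the adapted SSI algorithm scores roughly 0.59 against roughly 0.39 for the original, a gain that comes almost entirely from the higher recall.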

10 Evaluation (2)
Latency:
–Total pipeline: 632 milliseconds per document
–Linguistic and syntactic analysis: 30%
–Word Sense Disambiguation: 60%
–Remaining tasks: 10%
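
The percentages can be turned into approximate per-stage times with simple arithmetic. The sketch below assumes the shares are exact fractions of the 632 ms total and that documents are processed one at a time; both are assumptions made for illustration, not statements from the slide.

```python
TOTAL_MS = 632  # total pipeline latency per document, as reported

# Share of the total latency per stage, as reported.
shares = {
    "Linguistic and syntactic analysis": 0.30,
    "Word Sense Disambiguation": 0.60,
    "Remaining tasks": 0.10,
}

stage_ms = {stage: TOTAL_MS * share for stage, share in shares.items()}

# Sequential throughput implied by the total latency (assumption: no parallelism).
docs_per_second = 1000 / TOTAL_MS

for stage, ms in stage_ms.items():
    print(f"{stage}: {ms:.1f} ms")
print(f"Throughput: {docs_per_second:.2f} documents per second")
```

Under these assumptions word sense disambiguation accounts for about 379 ms per document, so it is the stage where optimization would affect total latency most.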

11 Conclusions
SPEED framework:
–Components are semantically enabled
–Pipeline outputs are ontology instances
–Adapted SSI algorithm
Evaluation underlines fast and accurate performance
Future work:
–Applications in algorithmic trading
–Linking sentiment to discovered events (e.g., trends, moods, opinions)

12 Questions

