Erasmus University Rotterdam Introduction With the vast amount of information available on the Web, there is an increasing need to structure Web data in.

Slides:



Advertisements
Similar presentations
Controlled Vocabularies in TELPlus Antoine ISAAC Vrije Universiteit Amsterdam EDLProject Workshop November 2007.
Advertisements

AVATAR: Advanced Telematic Search of Audivisual Contents by Semantic Reasoning Yolanda Blanco Fernández Department of Telematic Engineering University.
A Stepwise Modeling Approach for Individual Media Semantics Annett Mitschick, Klaus Meißner TU Dresden, Department of Computer Science, Multimedia Technology.
Polarity Analysis of Texts using Discourse Structure CIKM 2011 Bas Heerschop Erasmus University Rotterdam Frank Goossen Erasmus.
Learning Semantic Information Extraction Rules from News The Dutch-Belgian Database Day 2013 (DBDBD 2013) Frederik Hogenboom Erasmus.
Semantic News Recommendation Using WordNet and Bing Similarities 28th Symposium On Applied Computing 2013 (SAC 2013) March 21, 2013 Michel Capelle
A Linguistic Approach for Semantic Web Service Discovery International Symposium on Management Intelligent Systems 2012 (IS-MiS 2012) July 13, 2012 Jordy.
Hermes: News Personalization Using Semantic Web Technologies
Exploiting Discourse Structure for Sentiment Analysis of Text OR 2013 Alexander Hogenboom In collaboration with Flavius Frasincar, Uzay Kaymak, and Franciska.
Connecting Customer Relationship Management Systems to Social Networks 7th International Conference on Knowledge Management, Services, and Cloud Computing.
Determining Negation Scope and Strength in Sentiment Analysis SMC 2011 Paul van Iterson Erasmus School of Economics Erasmus University Rotterdam
Exploiting Emoticons in Sentiment Analysis SAC 2013 Daniella Bal Erasmus University Rotterdam Flavius Frasincar Erasmus University.
Erasmus University Rotterdam Frederik HogenboomEconometric Institute School of Economics Flavius Frasincar.
RCQ-GA: RDF Chain Query Optimization using Genetic Algorithms BNAIC 2009 Alexander Hogenboom, Viorel Milea, Flavius Frasincar, and Uzay Kaymak Erasmus.
OWL-AA: Enriching OWL with Instance Recognition Semantics for Automated Semantic Annotation 2006 Spring Research Conference Yihong Ding.
Sensemaking and Ground Truth Ontology Development Chinua Umoja William M. Pottenger Jason Perry Christopher Janneck.
Sentiment Lexicon Creation from Lexical Resources BIS 2011 Bas Heerschop Erasmus School of Economics Erasmus University Rotterdam
March 17, 2008SAC WT Hermes: a Semantic Web-Based News Decision Support System* Flavius Frasincar Erasmus University Rotterdam.
Gimme’ The Context: Context- driven Automatic Semantic Annotation with CPANKOW Philipp Cimiano et al.
Automatically Annotating Web Pages Using Google Rich Snippets 11th Dutch-Belgian Information Retrieval Workshop (DIR 2011) February 4, 2011 Frederik Hogenboom.
Annotating Documents for the Semantic Web Using Data-Extraction Ontologies Dissertation Proposal Yihong Ding.
By ANDREW ZITZELBERGER A Framework for Extraction Ontology Based Information Management.
ReQuest (Validating Semantic Searches) Norman Piedade de Noronha 16 th July, 2004.
Detecting Economic Events Using a Semantics-Based Pipeline 22nd International Conference on Database and Expert Systems Applications (DEXA 2011) September.
An Overview of Event Extraction from Text Workhop on Detection, Representation, and Exploitation of Events in the Semantic Web (DeRiVE'11) October 23,
News Personalization using the CF-IDF Semantic Recommender International Conference on Web Intelligence, Mining, and Semantics (WIMS 2011) May 25, 2011.
Towards Semantic Web: An Attribute- Driven Algorithm to Identifying an Ontology Associated with a Given Web Page Dan Su Department of Computer Science.
1 Cui Tao PhD Dissertation Defense Ontology Generation, Information Harvesting and Semantic Annotation For Machine-Generated Web Pages.
Analyzing Sentiment in a Large Set of Web Data while Accounting for Negation AWIC 2011 Bas Heerschop Erasmus School of Economics Erasmus University Rotterdam.
OIL: An Ontology Infrastructure for the Semantic Web D. Fensel, F. van Harmelen, I. Horrocks, D. L. McGuinness, P. F. Patel-Schneider Presenter: Cristina.
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
Website Content, Forms and Dynamic Web Pages. Electronic Portfolios Portfolio: – A collection of work that clearly illustrates effort, progress, knowledge,
Word Sense Disambiguation for Automatic Taxonomy Construction from Text-Based Web Corpora 12th International Conference on Web Information System Engineering.
Sentiment Analysis with a Multilingual Pipeline 12th International Conference on Web Information System Engineering (WISE 2011) October 13, 2011 Daniëlla.
Knowledge Mediation in the WWW based on Labelled DAGs with Attached Constraints Jutta Eusterbrock WebTechnology GmbH.
Erasmus University Rotterdam Introduction Nowadays, emerging news on economic events such as acquisitions has a substantial impact on the financial markets.
A News-Based Approach for Computing Historical Value-at-Risk International Symposium on Management Intelligent Systems 2012 (IS-MiS 2012) Frederik Hogenboom.
TOWL Time-determined ontology-based information system for real-time stock market analysis Econometric Institute Erasmus School of Economics Erasmus University.
Empirical Methods in Information Extraction Claire Cardie Appeared in AI Magazine, 18:4, Summarized by Seong-Bae Park.
RuleML-2007, Orlando, Florida1 Towards Knowledge Extraction from Weblogs and Rule-based Semantic Querying Xi Bai, Jigui Sun, Haiyan Che, Jin.
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
Content Analysis Techniques to Ease Browsing with Handhelds Jalal Mahmud Yevgen Borodin I.V. Ramakrishnan Department of Computer Science State University.
PAUL ALEXANDRU CHIRITA STEFANIA COSTACHE SIEGFRIED HANDSCHUH WOLFGANG NEJDL 1* L3S RESEARCH CENTER 2* NATIONAL UNIVERSITY OF IRELAND PROCEEDINGS OF THE.
Intelligent Database Systems Lab Presenter: WU, JHEN-WEI Authors: Rodrigo RizziStarr, Jose´ Maria Parente de Oliveira IS Concept maps as the first.
Page 1 Alliver™ Page 2 Scenario Users Contents Properties Contexts Tags Users Context Listener Set of contents Service Reasoner GPS Navigator.
*Erasmus University Rotterdam P.O. Box 1738, NL-3000 DR Rotterdam, the Netherlands † Teezir BV Wilhelminapark 46, NL-3581 NL, Utrecht, the Netherlands.
Semantics-Based News Recommendation with SF-IDF+ International Conference on Web Intelligence, Mining, and Semantics (WIMS 2013) June 13, 2013 Marnix Moerland.
Erasmus University Rotterdam Introduction Content-based news recommendation is traditionally performed using the cosine similarity and TF-IDF weighting.
Understanding User’s Query Intent with Wikipedia G 여 승 후.
A Scalable Machine Learning Approach for Semi-Structured Named Entity Recognition Utku Irmak(Yahoo! Labs) Reiner Kraft(Yahoo! Inc.) WWW 2010(Information.
SKOS. Ontologies Metadata –Resources marked-up with descriptions of their content. No good unless everyone speaks the same language; Terminologies –Provide.
Towards Cross-Language Sentiment Analysis through Universal Star Ratings KMO 2012 Malissa Bal Erasmus University Rotterdam Flavius.
Trustworthy Semantic Webs Dr. Bhavani Thuraisingham The University of Texas at Dallas Lecture #4 Vision for Semantic Web.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Lexico-semantic Patterns for Information Extraction from Text The International Conference on Operations Research 2013 (OR 2013) Frederik Hogenboom
Semantic web Bootstrapping & Annotation Hassan Sayyadi Semantic web research laboratory Computer department Sharif university of.
Knowledge Representation. Keywordsquick way for agents to locate potentially useful information Thesaurimore structured approach than keywords, arranging.
Semantics-Based News Recommendation International Conference on Web Intelligence, Mining, and Semantics (WIMS 2012) June 14, 2012 Michel Capelle
Exploiting Named Entity Taggers in a Second Language Thamar Solorio Computer Science Department National Institute of Astrophysics, Optics and Electronics.
Semantic Data Extraction for B2B Integration Syntactic-to-Semantic Middleware Bruno Silva 1, Jorge Cardoso 2 1 2
Ontology Based Annotation of Text Segments Presented by Ahmed Rafea Samhaa R. El-Beltagy Maryam Hazman.
Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,
GoRelations: an Intuitive Query System for DBPedia Lushan Han and Tim Finin 15 November 2011
© NCSR, Frascati, July 18-19, 2002 CROSSMARC big picture Domain-specific Web sites Domain-specific Spidering Domain Ontology XHTML pages WEB Focused Crawling.
Kenneth Baclawski et. al. PSB /11/7 Sa-Im Shin
European Network of e-Lexicography
Presented by: Prof. Ali Jaoua
[jws13] Evaluation of instance matching tools: The experience of OAEI
CS246: Information Retrieval
Dynamic Facet Ordering for Faceted Product Search Engines*
Presentation transcript:

Erasmus University Rotterdam Introduction With the vast amount of information available on the Web, there is an increasing need to structure Web data in order to make it accessible to both users and machines. E- commerce is one of the areas in which growing data congestion on the Web has serious consequences. We propose a framework that is capable of populating a consumer electronic product ontology using tabular information from Web shops. Formalizing product information in this way enhances product comparison and recommendation applications. Our approach employs both lexical and syntactic matching for mapping properties and instantiating values. A Consumer Electronics Ontology Currently, there is a lack of detailed ontologies for consumer electronic products. Hence, we create the OntoProduct ontology:  It is an extended version of the Consumer Electronics Ontology (CEO) with additional product attributes;  It contains 270 consumer electronic product properties and 24 product classes;  It is fully compatible with the well-known GoodRelations (GR) ontology for e-commerce;  It utilizes the Units of Measurement Ontology (MUO) for linking units of measurements to quantitative values. Ontology Population Framework The population of ontologies is regularly preceded by a knowledge extraction phase, where, for instance, natural language Web pages on products are converted into raw tabular data. Our framework focuses on the population of ontologies with product information in the e-commerce domain, using extracted tabular data in the form of key- value pairs. For this, user-defined annotations for lexical and syntactic matching are employed, which facilitate the main tasks of our framework:  Classification: optional step to determine product class; most Web stores nowadays have some kind of class or category data of each product already available.  Property Matching: step for extracting ontology properties from tabular key-value pairs by lexically matching keys with ontology properties, and by matching regular expressions on values.  Value Instantiation: phase for creating individuals in the ontology and associating them to their properties, while making use of parsers, content spotters, and instantiation tools. Results & Conclusions The property matching and value instantiation processes in the framework have been evaluated separately using a golden standard, under the assumption that the product class of a product description is known. The raw product data was obtained from two different Web sources, i.e., Best Buy and Newegg.com. In our experiments, the proposed framework achieved a solid performance for the property matching and value instantiation processes. The former process resulted in a precision, recall, accuracy, and F1 score of 96.95%, 93.27%, 94.80%, and 95.07%, respectively, whereas the latter process resulted in scores of 77.12%, 76.09%, 62.07%, and 76.60%, respectively. Although some raw product keys in the test set were not present in the training set, many key-value pairs were still matched with ontology properties. In practice, this means that a semi-automatic approach would only require training the algorithm with a few products from each product class in order to achieve satisfactory performance on property matching for all the products in a Web shop. After analyzing the results in more detail, we found that the regular expressions, in conjunction with the lexical representations, are often capable of correctly mapping key-value pairs to properties in the ontology. For example, the key “Product Dimensions” is correctly mapped to ceo:hasWidth, ceo:hasHeight, and ceo:hasDepth, demonstrating the usefulness of regular expressions in this context. Acknowledgemen t Damir Vandic is supported by an NWO Mosaic scholarship for project : Semantic Web Enhanced Product Search (SWEPS). Frederik Hogenboom is supported by the NWO Physical Sciences Free Competition project : Financial Events Recognition in News for Algorithmic Trading (FERNAT) and the Dutch national program COMMIT. Contact Damir Vandic Postal:Erasmus University Rotterdam P.O. Box 1738 NL-3000 DR Rotterdam The Netherlands Phone: +31 (0) Web: Ontology Population from Web Product Information Damir Vandic, Lennart J. Nederstigt, Steven S. Aanen, Flavius Frasincar, Frederik Hogenboom