Jon Atle Gulla, Språkteknologi og innovasjon (Language Technology and Innovation): Language Technology in Industrial Applications. Or: How we have commercialized linguistic technologies

Presentation transcript:

Language Technology in Industrial Applications
Or: How we have commercialized linguistic technologies
1. Linguistics in search
2. Semantics for interoperability
3. Ontologies in process mining
4. Linguistics in news reporting
Jon Atle Gulla, Norwegian University of Science and Technology, Trondheim, Norway
Jon Atle Gulla, Språkteknologi og innovasjon (Language Technology and Innovation)

Who am I?
Professor, Information Systems group, IDI/NTNU
Education:
- Siv.ing./dr.ing. (information systems, NTH)
- Cand.philol. (linguistics, AVH)
- MSc (management, London Business School)
Work experience:
- Fast Search & Transfer, Munich (linguistics in search)
- Norsk Hydro, Brussels (enterprise systems)
- GMD, Darmstadt (information retrieval)
Fields of research:
- Search technologies
- Semantic Web
- Social Web
- Sentiment analysis and recommendations
Jon Atle Gulla, ICEIS 2008

1. The FAST Alltheweb.com site
2000:
- Alltheweb.com was one of the largest search engines on the Internet
- FAST acquired Elexir Sprachtechnologie in Munich
- Intended to add linguistics to the search engine
[Figure labels: Query; Retrieved documents]

Linguistic Techniques in FAST
Linguistics in search:
- Categorizing techniques (reduced search space): documents are assigned to categories of documents; search options are category-based selection rather than all selected. Techniques: language identification, spam detection, topic categorization.
- Transformational techniques (increased semantics): relevant documents become transformed documents and the query becomes a transformed query; content-based search rather than keyword-based search. Techniques: lemmatization, phrasing, anti-phrasing.
- Presentational techniques (improved transparency): the list of documents becomes a presentation of the document list; content-based access rather than title-based access. Techniques: clustering.
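The transformational techniques named on this slide (lemmatization, phrasing, anti-phrasing) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny word lists, the function names, and the order of the steps are hypothetical and are not taken from the FAST system, which would rely on large language-specific resources.

```python
import re

# Hypothetical toy resources; a production system would use large dictionaries.
LEMMAS = {"engines": "engine", "cars": "car", "bought": "buy"}
ANTI_PHRASES = ["where can i find", "how do i", "information about"]
PHRASES = {("search", "engine"), ("new", "york")}


def strip_anti_phrases(query):
    """Anti-phrasing: drop query parts that carry no search content."""
    text = query.lower()
    for anti in ANTI_PHRASES:
        text = re.sub(r"\b" + re.escape(anti) + r"\b", " ", text)
    return " ".join(text.split())


def lemmatize(tokens):
    """Lemmatization: map inflected forms to a base form when one is known."""
    return [LEMMAS.get(token, token) for token in tokens]


def mark_phrases(tokens):
    """Phrasing: join adjacent tokens that form a known two-word phrase."""
    result, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in PHRASES:
            result.append(tokens[i] + " " + tokens[i + 1])
            i += 2
        else:
            result.append(tokens[i])
            i += 1
    return result


def transform_query(query):
    """Apply anti-phrasing, then lemmatization, then phrasing."""
    tokens = re.findall(r"\w+", strip_anti_phrases(query))
    return mark_phrases(lemmatize(tokens))


print(transform_query("Where can I find search engines in New York"))
```

Lemmatizing before phrase detection lets "search engines" match the phrase entry "search engine"; a real system might order or combine the steps differently.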

The FAST Experience
- Linguistics a small part of a large system
- Linguistics as behind-the-scenes technology
- Linguistics not a major breakthrough
- Linguistics is not easy: data-intensive; only statistical approaches feasible at the time
What happened to FAST?
- 2003: Internet part sold to Overture (Yahoo)
- 2008: Enterprise part sold to Microsoft

2. Semantics in Interoperability
Semantic Web:
- Adding semantics to data/services so that humans and computers can communicate better
- Ontology: explicit representation of a shared conceptualization (domain terminology model)
- Semantic markup languages for ontology building (OWL, RDF)
2003: Petromax IIP project for construction of an ontology for the oil & gas sector (based on ISO 15926)
2011: EU LinkedDesign project for use of ontologies in manufacturing processes

Silly Semantic Conflicts Prevent Data Harmonization
Even simple terms are misunderstood

OWL petroleum ontology
…
CHRISTMAS TREE: An artefact that is an assembly of pipes and piping parts, with valves and associated control equipment, that is connected to the top of a wellhead and is intended for control of fluid from a well.
…
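To make the idea of a shared terminology model concrete, the following sketch stores the CHRISTMAS TREE entry as a concept with an explicit definition and a parent concept. It is plain Python rather than OWL, and the identifiers (the `ex:` prefix, `Concept`, `find_by_label`, `ancestors`) and the placeholder definition of the parent are hypothetical; only the CHRISTMAS TREE definition text comes from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Concept:
    concept_id: str        # hypothetical identifier, not from ISO 15926
    label: str
    parent: Optional[str]  # identifier of the more general concept, if any
    definition: str


ONTOLOGY = {
    "ex:Artefact": Concept(
        "ex:Artefact", "ARTEFACT", None,
        "Placeholder definition (assumption, not from the slide).",
    ),
    "ex:ChristmasTree": Concept(
        "ex:ChristmasTree", "CHRISTMAS TREE", "ex:Artefact",
        "An artefact that is an assembly of pipes and piping parts, with "
        "valves and associated control equipment, that is connected to the "
        "top of a wellhead and is intended for control of fluid from a well.",
    ),
}


def find_by_label(label: str) -> Optional[Concept]:
    """Look a term up by its label, ignoring case and surrounding spaces."""
    wanted = label.strip().lower()
    for concept in ONTOLOGY.values():
        if concept.label.lower() == wanted:
            return concept
    return None


def ancestors(concept_id: str) -> List[str]:
    """Return the chain of more general concepts, nearest first."""
    chain = []
    parent = ONTOLOGY[concept_id].parent
    while parent is not None:
        chain.append(parent)
        parent = ONTOLOGY[parent].parent
    return chain


tree = find_by_label("christmas tree")
print(tree.definition)
print(ancestors(tree.concept_id))
```

Two systems that both resolve "Christmas tree" against the same entry agree that it denotes wellhead equipment, which is the kind of conflict an agreed ontology is meant to remove.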

Semantic Web Lessons Learned
Data integration and harmonization improved in the sector
But:
- Demanding and complex technologies
- Semantic Web technologies still immature and expensive
- So far few commercial solutions using semantic technologies (some work on ontology-driven search applications)

3. Ontologies in Process Mining
Process mining:
- Techniques and tools for discovering process flow, control, data, organizational and social structures from enterprise systems’ event logs
- Dynamic reporting for exposing real business flows and explaining interesting transaction patterns
Semantic process mining: using ontologies to improve the interpretation of event logs and the construction of business flows
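One common first step in discovering a process flow from an event log is to count which activity directly follows which within each case. The sketch below shows that step only; the log entries, the field layout (case id, timestamp, activity) and the function names are invented for illustration and do not describe any particular tool.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case id, timestamp, activity).
EVENT_LOG = [
    ("PO-1", 1, "Create order"), ("PO-1", 2, "Approve order"),
    ("PO-1", 3, "Receive goods"), ("PO-1", 4, "Pay invoice"),
    ("PO-2", 1, "Create order"), ("PO-2", 2, "Approve order"),
    ("PO-2", 3, "Pay invoice"),
    ("PO-3", 1, "Create order"), ("PO-3", 2, "Change order"),
    ("PO-3", 3, "Approve order"), ("PO-3", 4, "Receive goods"),
    ("PO-3", 5, "Pay invoice"),
]


def build_traces(log):
    """Group events by case and order each case by timestamp."""
    by_case = defaultdict(list)
    for case_id, timestamp, activity in log:
        by_case[case_id].append((timestamp, activity))
    return {
        case_id: [activity for _, activity in sorted(events)]
        for case_id, events in by_case.items()
    }


def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in build_traces(log).values():
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


for (a, b), n in sorted(directly_follows(EVENT_LOG).items()):
    print(f"{a} -> {b}: {n}")
```

The resulting counts form a directed graph of the observed flow; here the edge from "Approve order" straight to "Pay invoice" exposes a case that was paid without a goods receipt, the kind of transaction pattern dynamic reporting would surface.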

Semantic Process Mining
[Figure labels: Detected process flow; Formal definition of process terminology; Ontology]
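One way an ontology can improve the interpretation of an event log is by mapping differently worded activity labels to a single concept before the flow is constructed. The sketch below assumes a hypothetical label-to-concept table and invented traces; it illustrates the idea only and is not the method used in the Enterprise Visualization Suite.

```python
from collections import Counter

# Hypothetical mapping from raw log labels to ontology concepts.
LABEL_TO_CONCEPT = {
    "create po": "CreatePurchaseOrder",
    "purchase order created": "CreatePurchaseOrder",
    "approve po": "ApprovePurchaseOrder",
    "po release": "ApprovePurchaseOrder",
}

# Invented traces, as they might come from systems with different wording.
RAW_TRACES = [
    ["Create PO", "Approve PO"],
    ["Purchase order created", "PO release"],
    ["Create PO", "PO release"],
]


def normalize(label):
    """Replace a raw label by its concept; unknown labels are kept as-is."""
    return LABEL_TO_CONCEPT.get(label.strip().lower(), label)


def variants(traces):
    """Count how many cases follow each distinct activity sequence."""
    return Counter(tuple(trace) for trace in traces)


semantic_traces = [[normalize(label) for label in trace] for trace in RAW_TRACES]

print(len(variants(RAW_TRACES)), "variants before normalization")
print(len(variants(semantic_traces)), "variant after normalization")
```

Without the mapping the three cases look like three different flows; with it they collapse into one, which is a small-scale version of why a formal definition of the process terminology helps when constructing business flows.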

Commercialization of Technology
2004: Businesscape founded
Ongoing work on Enterprise Visualization Suite:
- Combines two challenging technologies (data mining and Semantic Web)
- Substantial improvement over traditional process mining (and traditional reporting tools)
- However: difficult to explain the complexity and capability of the solution to customers; few customers are competent enough to distinguish process mining from traditional reporting

4. Linguistics in News Reporting
Semantic approaches to news reporting:
- Extract content from news articles
- Validate content of articles
- Opinion mining from news articles and social sites
- Model user preferences for news recommendation
- Combine/aggregate knowledge from heterogeneous sources
Commercial potential uncertain

Conclusions
- Linguistics often a supporting technology
- Good linguistic resources tedious and expensive to develop
- Not always easy to justify inclusion of linguistics
Linguistics in our projects:
- Enable new services and products
- Enhance existing services and products