1 Language Technology in Industrial Applications (Språkteknologi i industrielle anvendelser)
Or: How we have commercialized linguistic technologies
1. Linguistics in search
2. Semantics for interoperability
3. Ontologies in process mining
4. Linguistics in news reporting
Jon Atle Gulla, Norwegian University of Science and Technology, Trondheim, Norway
Email: jag@idi.ntnu.no
Jon Atle Gulla, Språkteknologi og innovasjon (Language Technology and Innovation)

2 Who am I?
Professor, Information Systems group, IDI/NTNU
Education:
- Siv.ing./dr.ing. (information systems, NTH)
- Cand.philol. (linguistics, AVH)
- MSc (management, London Business School)
Work experience:
- Fast Search & Transfer, Munich (linguistics in search)
- Norsk Hydro, Brussels (enterprise systems)
- GMD, Darmstadt (information retrieval)
Fields of research:
- Search technologies
- Semantic Web
- Social Web
- Sentiment analysis and recommendations
Jon Atle Gulla, ICEIS 2008

3 1. The FAST Alltheweb.com site
2000: Alltheweb.com was one of the largest search engines on the Internet
FAST acquired Elexir Sprachtechnologie in Munich
Intended to add linguistics to the search engine
[Figure labels: Query; Retrieved documents]

4 Linguistic Techniques in FAST
Linguistics in search:
- Categorizing techniques (reduced search space)
- Transformational techniques (increased semantics)
- Presentational techniques (improved transparency)
Techniques listed: language identification, spam detection, topic categorization, lemmatization, phrasing, anti-phrasing, clustering
[Diagram labels: Documents; Categories of documents; Search options; Category-based selection; All selected; Relevant documents; Transformed documents; Query; Transformed query; Content-based search; Keyword-based search; List of documents; Presentation of document list; Content-based access; Title-based access]
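The transformational techniques named on this slide (lemmatization, phrasing, anti-phrasing) can be illustrated with a minimal sketch. This is not FAST's implementation: the word lists, the function names, and the choice to strip only a leading anti-phrase are assumptions made for illustration, and a real system would rely on large lexicons rather than three-entry tables.

```python
# Toy linguistic resources (hypothetical; real lexicons are far larger).
ANTI_PHRASES = [("where", "can", "i", "find"), ("what", "is"), ("how", "do", "i")]
LEMMAS = {"cars": "car", "engines": "engine", "hotels": "hotel"}
PHRASES = {("new", "york"), ("search", "engine")}


def remove_anti_phrase(tokens: list[str]) -> list[str]:
    """Anti-phrasing: drop a leading filler phrase that carries no search content."""
    for anti in ANTI_PHRASES:
        if tuple(tokens[: len(anti)]) == anti:
            return tokens[len(anti):]
    return tokens


def lemmatize(tokens: list[str]) -> list[str]:
    """Lemmatization: map inflected forms to their base form when known."""
    return [LEMMAS.get(token, token) for token in tokens]


def mark_phrases(tokens: list[str]) -> list[str]:
    """Phrasing: join known two-word phrases into one quoted search unit."""
    result, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in PHRASES:
            result.append(f'"{tokens[i]} {tokens[i + 1]}"')
            i += 2
        else:
            result.append(tokens[i])
            i += 1
    return result


def transform_query(query: str) -> list[str]:
    """Turn a raw query into a transformed query, one step at a time."""
    tokens = query.lower().split()
    return mark_phrases(lemmatize(remove_anti_phrase(tokens)))


print(transform_query("Where can I find hotels in New York"))
```

Each step is a separate function so that the order of the pipeline stays visible; here lemmatization runs before phrasing so that phrase lookup sees base forms.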

5 The FAST Experience
- Linguistics a small part of a large system
- Linguistics as behind-the-scenes technology
- Linguistics not a major breakthrough
- Linguistics is not easy:
  - Data-intensive
  - Only statistical approaches feasible at the time
What happened to FAST?
- 2003: Internet part sold to Overture (Yahoo)
- 2009: Enterprise part sold to Microsoft

6 2. Semantics in Interoperability
Semantic Web:
- Adding semantics to data/services so that humans and computers can communicate better
- Ontology: explicit representation of a shared conceptualization (domain terminology model)
- Semantic markup languages for ontology building (OWL, RDF)
2003: Petromax IIP project for construction of an ontology for the oil & gas sector (based on ISO 15926)
2011: EU LinkedDesign project for use of ontologies in manufacturing processes

7 Silly Semantic Conflicts Prevent Data Harmonization
Even simple terms are misunderstood

8 OWL petroleum ontology
…
CHRISTMAS TREE: An artefact that is an assembly of pipes and piping parts, with valves and associated control equipment that is connected to the top of a wellhead and is intended for control of fluid from a well.
…
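How a shared ontology entry can resolve a semantic conflict may be sketched as follows. Only the CHRISTMAS TREE definition is taken from the slide. The concept identifiers, the ARTEFACT placeholder definition, and the two source systems with their local terms are hypothetical, and plain Python dictionaries stand in for OWL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    concept_id: str  # hypothetical identifier, not an ISO 15926 ID
    label: str
    definition: str
    parent: Optional[str] = None


ONTOLOGY = {
    "ARTEFACT": Concept(
        "ARTEFACT", "ARTEFACT", "(placeholder definition for illustration)"
    ),
    "CHRISTMAS_TREE": Concept(
        "CHRISTMAS_TREE",
        "CHRISTMAS TREE",
        "An artefact that is an assembly of pipes and piping parts, with valves "
        "and associated control equipment that is connected to the top of a "
        "wellhead and is intended for control of fluid from a well.",
        parent="ARTEFACT",
    ),
}

# Hypothetical local vocabularies: two systems name the same concept differently.
LOCAL_TERMS = {
    ("system_a", "xmas tree"): "CHRISTMAS_TREE",
    ("system_b", "xt"): "CHRISTMAS_TREE",
}


def harmonize(system: str, term: str) -> Optional[str]:
    """Map a system-specific term to the shared concept, if one is registered."""
    return LOCAL_TERMS.get((system, term.strip().lower()))


def is_a(concept_id: str, ancestor_id: str) -> bool:
    """Walk the parent chain to check whether a concept specializes another."""
    current = ONTOLOGY.get(concept_id)
    while current is not None:
        if current.concept_id == ancestor_id:
            return True
        current = ONTOLOGY.get(current.parent) if current.parent else None
    return False


concept_id = harmonize("system_a", "Xmas tree")
print(concept_id, "->", ONTOLOGY[concept_id].label)
```

Once both systems resolve their local terms to the same concept, their data can be compared against one definition instead of two informal ones.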

9 Semantic Web Lessons Learned
Data integration and harmonization improved in the sector
But:
- Demanding and complex technologies
- Semantic Web technologies still immature and expensive
- So far few commercial solutions using semantic technologies (some work on ontology-driven search applications)

10 3. Ontologies in Process Mining
Process mining:
- Techniques and tools for discovering process flow, control, data, organizational and social structures from enterprise systems' event logs
- Dynamic reporting for exposing real business flows and explaining interesting transaction patterns
Semantic process mining: using ontologies to improve the interpretation of event logs and the construction of business flows
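One way to make the idea concrete is a directly-follows count, a common building block in process mining, computed once on raw activity names and once after mapping them to ontology concepts. This is a sketch under stated assumptions, not the method used in the projects described here: the event log, the activity names, and the concept mapping are all invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case id, timestamp, raw activity name).
EVENT_LOG = [
    ("PO-1", 1, "Create PO"),
    ("PO-1", 2, "Approve purchase order"),
    ("PO-1", 3, "Goods receipt"),
    ("PO-2", 1, "Purchase order created"),
    ("PO-2", 2, "PO approval"),
    ("PO-2", 3, "GR posted"),
    ("PO-2", 4, "Invoice received"),
]

# Hypothetical ontology mapping: raw activity names to shared process concepts.
ACTIVITY_CONCEPTS = {
    "Create PO": "CreateOrder",
    "Purchase order created": "CreateOrder",
    "Approve purchase order": "ApproveOrder",
    "PO approval": "ApproveOrder",
    "Goods receipt": "ReceiveGoods",
    "GR posted": "ReceiveGoods",
    "Invoice received": "ReceiveInvoice",
}


def directly_follows(log, concept_of=None):
    """Count how often one activity is directly followed by another per case.

    If concept_of is given, raw names are first replaced by ontology concepts,
    so differently named events for the same step are merged.
    """
    traces = defaultdict(list)
    for case_id, _timestamp, activity in sorted(log, key=lambda e: (e[0], e[1])):
        name = concept_of.get(activity, activity) if concept_of else activity
        traces[case_id].append(name)

    counts = Counter()
    for trace in traces.values():
        counts.update(zip(trace, trace[1:]))
    return counts


raw_flow = directly_follows(EVENT_LOG)
semantic_flow = directly_follows(EVENT_LOG, ACTIVITY_CONCEPTS)
print(len(raw_flow), "distinct transitions on raw names")
print(len(semantic_flow), "distinct transitions on ontology concepts")
```

On the raw names the two cases share no transitions, so no common flow emerges; after the mapping both cases follow the same path and the counts accumulate on it.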

11 Semantic Process Mining
[Figure labels: Detected process flow; Formal definition of process terminology; Ontology]

12 Commercialization of Technology
2004: Businesscape founded
Ongoing work on Enterprise Visualization Suite:
- Combines two challenging technologies (data mining and Semantic Web)
- Substantial improvement over traditional process mining (and traditional reporting tools)
- However:
  - Difficult to explain the complexity and capability of the solution to customers
  - Few customers competent enough to distinguish process mining from traditional reporting

13 4. Linguistics in News Reporting
Semantic approaches to news reporting:
- Extract content from news articles
- Validate content of articles
- Opinion mining from news articles and social sites
- Model user preferences for news recommendation
- Combine/aggregate knowledge from heterogeneous sources
Commercial potential uncertain

14 Conclusions
- Linguistics often a supporting technology
- Good linguistic resources tedious and expensive to develop
- Not always easy to justify inclusion of linguistics
- Linguistics in our projects:
  - Enable new services and products
  - Enhance existing services and products

