© Copyright 2013 ABBYY NLP PLATFORM FOR EU-LINGUAL DIGITAL SINGLE MARKET Alexander Rylov LTi Summit 2013 Confidential.

Presentation transcript:

1 © Copyright 2013 ABBYY NLP PLATFORM FOR EU-LINGUAL DIGITAL SINGLE MARKET Alexander Rylov LTi Summit 2013 Confidential

2 Market fragmentation: by domains and by languages

3 WHY SHOULD LT VENDORS SHARE THEIR RESOURCES?
●Many LT vendors have their own LT
●LTs are focused on particular domains and languages
●Resources are critical for enabling such technologies
●If vendors share their resources, they may lose competitive advantage

4 Technologies: abilities and restrictions
●Language specific = language centric = limited by language
●Difficulties - Controlled links
●Anaphora
●Long-distance links
●Ellipsis
●Ontologies, dictionaries, statistics = trained on a limited set of data = cover only a limited variety of meaning representations = a recall of 40% is sometimes considered a good result (NER, US DoD track)

5 WHAT IS BIGDATA…
●Multilingual
●Covers more than one domain
●85–90% is in unstructured text documents
●The same meaning can be expressed in language in countless ways

6 A FUNDAMENTAL NATURAL LANGUAGE TECHNOLOGY IS REQUIRED, SCALABLE BY DOMAINS AND LANGUAGES

7 ABBYY Compreno as proposal
●Interlingua approach:
  ●the semantic model is based on a universal, language-independent representation of both lexis and grammar
●Working languages:
  ●Russian, English: at the stage of terminological and collocation expansion
  ●German: full prototype (lexis, syntax) is completed; at the stage of main lexis expansion (from core to periphery)
  ●French: full prototype is completed (tested on a controlled MT task)
  ●Chinese: lexical system prototype is completed (a challenging task never carried out before)
●It is proved that Compreno is a scalable technology that can be used for any language
[Diagram: Universal Semantic Hierarchy; Statistics and machine learning; Syntactic and semantic analysis]

8 Complete syntactic and semantic analysis
Example: "The bank was located at the bank of the river; it was closed."
The complete analysis helps overcome linguistic problems in the text, if any.

9 Compreno current achievements

Russian syntax analysis 2011
            Precision  Recall  F
Compreno    0.95       0.98    0.97
System 2    0.93       0.98    0.96
System 3    0.90       0.98    0.94
System 4    0.89       0.95    0.92
System 5    0.86       0.98    0.92
System 6    0.86
System 7    0.79       0.98    0.87

Fact Extraction 2013
            Compreno  System 1  Compreno  System 2  Compreno  System 3
Precision   0.95      0.95      0.96      0.98      0.92      0.92
Recall      0.93      0.70      0.84      0.44      0.92      0.74
F-measure   0.94      0.81      0.90      0.61      0.92      0.82
ABBYY advantage: 14% (vs. System 1), 32% (vs. System 2), 10% (vs. System 3)
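
The figures on this slide can be sanity-checked with a few lines of Python. This is a minimal sketch, not ABBYY's evaluation code: it assumes that "F" is the balanced F-measure (harmonic mean of precision and recall) and that "ABBYY advantage" is the F-measure gap expressed as a fraction of Compreno's score. Neither definition is stated on the slide, and the function names are made up for illustration.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_advantage(ours: float, theirs: float) -> float:
    """F-measure gap as a fraction of our own score (assumed definition)."""
    return (ours - theirs) / ours


# Compreno, first Fact Extraction 2013 comparison: precision 0.95, recall 0.93.
compreno_f = f_measure(0.95, 0.93)

# (Compreno F-measure, competitor F-measure) pairs taken from the slide.
f_pairs = [(0.94, 0.81), (0.90, 0.61), (0.92, 0.82)]
advantages = [relative_advantage(ours, theirs) for ours, theirs in f_pairs]

if __name__ == "__main__":
    print(f"Compreno F-measure: {compreno_f:.2f}")
    for (ours, theirs), adv in zip(f_pairs, advantages):
        print(f"F {ours:.2f} vs {theirs:.2f}: advantage {adv:.1%}")
```

Under these assumptions the computed advantages come out at roughly 13.8%, 32.2% and 10.9%, close to the 14%, 32% and 10% shown on the slide; the small differences are probably due to rounding of the published scores.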

10 Applications
●BigData analytics: analysis of facts, extraction of objects
●Intelligence, eDiscovery (any kind)
●Search by meaning rather than by concepts
●Natural-language dialogue systems
●Translation

11 A few facts about Compreno
●18 years of development
●About 350 people involved
●More than 2000 man-years

12 Barriers to wide implementation
●At least 3 years per language
●At least 30 linguists per language
●At least 12M € per language
●Then support and improvement

13 EU project idea
●Describe ALL EU languages
●Describe major domains: healthcare, law, government, major industries
●ABBYY commitment:
  ●Methodology, management, instruments

14 EU BENEFITS – CREATE SINGLE DIGITAL LT MARKET
●Operate not with a language but with a universal model of it (interlingual approach)
●Describe one domain in one language, then apply it in all other languages
●A platform for LT vendors to create solutions and products that scale easily across languages and domains
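
The "describe one domain in one language, apply in all other languages" idea can be illustrated with a toy example. The sketch below is purely hypothetical and is not Compreno's data model or API: the lexicons, the concept identifiers (`DOCTOR`, `PRESCRIBE`, `DRUG_ASPIRIN`) and the function names are invented here. It only shows the general shape of an interlingual design, in which each language maps its words onto shared concepts and a domain rule is written once against those concepts.

```python
# Hypothetical per-language lexicons: surface word -> language-independent concept.
LEXICONS = {
    "en": {"doctor": "DOCTOR", "physician": "DOCTOR",
           "prescribes": "PRESCRIBE", "aspirin": "DRUG_ASPIRIN"},
    "de": {"arzt": "DOCTOR", "verschreibt": "PRESCRIBE",
           "aspirin": "DRUG_ASPIRIN"},
    "fr": {"médecin": "DOCTOR", "prescrit": "PRESCRIBE",
           "aspirine": "DRUG_ASPIRIN"},
}

# A healthcare-domain rule, written once over concepts rather than words.
PRESCRIPTION_RULE = ("DOCTOR", "PRESCRIBE", "DRUG_ASPIRIN")


def to_concepts(text: str, lang: str) -> list[str]:
    """Map whitespace-separated words to concepts, dropping unknown words."""
    lexicon = LEXICONS[lang]
    return [lexicon[word] for word in text.lower().split() if word in lexicon]


def matches_rule(text: str, lang: str, rule=PRESCRIPTION_RULE) -> bool:
    """True if the rule's concepts occur in the text in the given order."""
    remaining = iter(to_concepts(text, lang))
    # Each membership test consumes the iterator, so order is enforced.
    return all(concept in remaining for concept in rule)


if __name__ == "__main__":
    samples = [
        ("The doctor prescribes aspirin", "en"),
        ("Der Arzt verschreibt Aspirin", "de"),
        ("Le médecin prescrit aspirine", "fr"),
    ]
    for sentence, lang in samples:
        print(lang, matches_rule(sentence, lang))
```

In this toy setup, supporting a further language only requires a new lexicon entry in `LEXICONS`; the domain rule itself stays unchanged. A real system would of course need full syntactic and semantic analysis rather than word lookup.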

