Cultural Heritage Digitization meeting, Sofia, 30 Jan 2012
Experience with EC FP5, FP6, FP7 and Cultural Heritage projects
Vladimir Alexiev, PhD, PMP

Ontotext
Ontotext is a Bulgarian company with 65 staff: Sofia, Varna, Innsbruck, London, Connecticut, New Zealand
Started in 2000 as a research lab in Sirma Group. Spun off in 2008 with investment from NEVEQ
World leader in semantic technologies. 360-degree semtech: repository (OWLIM), text mining (KIM, GATE), web mining (WMF), Ontology and Linked Data Management
Revenue grew 210% in the last 3 years: 5M BGL in 2011, over 7M expected in 2012
Commercial revenue grew 10x in the last 3 years
Ontotext experience with FP and CH #2 30 Jan 2012

Completed FP5, FP6, FP7 Projects
OMM, BOR, OntoMap, On-To-Knowledge, SWWS, OntoWeb, VISION, DIP, SEKT, INFRAWEBS, PrestoSpace, MediaCampaign, RASCALLI, SemanticGov, SUPER, TAO, TripCom, LarKC, SOA4ALL
Typically Ontotext is a core technology partner
Typical size is 2-4M EUR (STREP) or 10-15M EUR (IP), over 3 years
Typical Ontotext share is k EUR
Ontotext is the most successful Bulgarian participant in EU FP research projects and received the prestigious Pitagoras award
Topics range from core semtech to web services, SOA, business processes, eGovernment, media, TV, life sciences…

Current EC FP7 Projects
Project cycle and continuity: (F)inishing, (M)iddle, (S)tarting
NoTube: Personalized creation, distribution, consumption of TV content (F)
Insemtives: Incentives for Semantics (F)
Cubist: Combining and Uniting Business Intelligence and Semantic Technologies (M)
Khresmoi: Knowledge Helper for Medical and Other Information (M)
Molto: Multilingual Online Translation (M)
Render: Reflecting Knowledge Diversity (M)
TrendMiner: Trend Mining & Summarisation of Real-time Media Streams (S)
AnnoMarket: Marketplace for Semantic Annotation Services (S)
Euclid: Educational Curriculum for Linked Data (S)

Ontotext Statistics from One Call
FP7 SME DCL was a call targeting SMEs
Topic: Digital Content and Languages
Purpose: work towards a linked data economy
2-phase call: short proposal (5p), full proposal ( )
Short proposals: 15 (initiated 1, active in 13, rode along in 1)
Full proposals: 6
Accepted: 2

Collaboration With Academia and Research
Ontotext collaborates extensively with universities and research centers all over Europe on EU FP projects
Ontotext has a long-standing collaboration with the University of Sheffield on text analysis and semantic technologies
2 professors work part-time at Ontotext: Kiril Simov (BAS IICT), Maurice Grinberg (NBU Cognitive Science)
2 PhDs working at Ontotext teach at university: Mariana Damova (NBU Semantic Technologies), Vladimir Alexiev (NBU and BAS IMI: IT Project Management)
Ontotext hires interns and doctoral students and offers opportunities for doctoral research abroad

Commercial Projects
UK 59%, US 18%, Global 9%, BG 7%, IT 3%, KR 2%, MX 2%, and now DE
Data providers 27% (jobs, food, cars), Publishing 26%, Government 18%, Life Sciences 11%, Cultural Heritage 10%, Telecom 4%
Regular SemTech training courses in London
Commercial revenue is close to 2/3 of total
  – EC projects are somewhat de-emphasized because they dilute our focus
  – We see great potential in Cultural Heritage, so we want to focus on that

Cultural Heritage Experience
OWLIM has a following in CH: Molto FP7 and Gothenburg Museum, Charisma FP7, 3D COFORM FP7, Dutch Public Library POC, Polish Digital National Museum, LODAC (JP)
KR-BG ITCC: semantic publishing to Europeana
British Museum: ResearchSpace project, funded by the Andrew Mellon Foundation. Collaborative web-based research for the cultural heritage scholarly community. Based on the CIDOC CRM ontology
The National Archives: Semantic Knowledge Base for the Government Web Archive. 780M documents (150M after de-duplication), annotated with over 10B facts

Semantic Technologies and Applicability to CH
Many CH institutions have a data integration problem: data about the same artifacts is scattered in separate silos: cataloging, acquisition, conservation, research/scientific…
If the Web (1.0) is a giant hyper-linked document, then the Semantic Web (3.0) is a giant linked database
Semantic Technologies are the best way to interconnect:
  – Ontologies and schemas ensure metadata interoperability (ESE, EDM, LIDO, CIDOC CRM, EAD, MODS…)
  – Linked Open Data provides additional context (DBpedia, GeoNames, FreeBase, WordNet, …)
  – Thesauri ensure consistent vocabulary (Getty ULAN, AAT, TGN; IconClass, RKD People, Concepts, etc)
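
The silo problem described above can be illustrated with a minimal sketch. It assumes each silo exports simple (subject, property, value) statements and that all silos agree on one URI per artifact. Every URI, property name and value below is made up for illustration, and plain Python sets stand in for a real semantic repository.

```python
from collections import defaultdict

# All URIs and property names below are hypothetical, for illustration only.
ARTIFACT = "http://example.org/object/123"
MAKER = "http://example.org/person/45"

# Each silo holds (subject, property, value) triples about the same artifact.
cataloging = {
    (ARTIFACT, "title", "Bronze figurine"),
    (ARTIFACT, "creator", MAKER),
}
conservation = {
    (ARTIFACT, "condition", "stable"),
}
# A thesaurus-like source that supplies a consistent name for the creator.
thesaurus = {
    (MAKER, "preferredName", "Example Workshop"),
}


def merge(*silos):
    """Union the triple sets; shared URIs are what link the silos together."""
    graph = set()
    for silo in silos:
        graph |= silo
    return graph


def describe(graph, subject):
    """Group every property of one subject into a dict of sorted value lists."""
    props = defaultdict(list)
    for s, p, o in graph:
        if s == subject:
            props[p].append(o)
    return {p: sorted(values) for p, values in props.items()}


def label(graph, uri):
    """Follow a link one hop; fall back to the URI itself if no name is known."""
    return describe(graph, uri).get("preferredName", [uri])[0]


graph = merge(cataloging, conservation, thesaurus)
record = describe(graph, ARTIFACT)
creator_name = label(graph, record["creator"][0])
print(record["title"][0], "|", record["condition"][0], "|", creator_name)
```

Because the silos share identifiers, merging is a plain set union and no per-pair mapping table is needed. In practice, agreeing on those identifiers and on the property vocabulary is the role the ontologies and thesauri listed above are meant to play.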