Our mission is to advance cutting-edge research and applications of knowledge technologies that support the analysis, modeling and management of knowledge and data. We have authored and edited numerous scientific books and coordinated several EU projects. Our technologies have been successfully applied to many practical problems. We are active in education and transfer of knowledge, and act as a bridge between science and industry in Slovenia and abroad.
Department of KNOWLEDGE TECHNOLOGIES
Jožef Stefan Institute
Contents
Basic information
Scientific highlights
Relevance highlights
Vision
Basic information
Research areas
–Data Mining
–Text, Web and Multimedia Mining
–Semantic Web
–Human Language Technologies
–Decision Support
–Knowledge Management
Application areas
–Ecology, Geology
–Medicine, Health care
–Biomedicine, Systems biology
–Agriculture, Forestry
–Telecommunications
–Digital libraries
–Cultural heritage
–eGov, eBusiness, eLearning
30 years of research tradition
–Founded as Department of Artificial Intelligence in 1979
–Department of Knowledge Technologies since 2004
–30 researchers, 20 students/external, 5 support staff
Collaborations
In Slovenia
–Center for Knowledge Transfer in IT at Jožef Stefan Institute (JSI)
–Jožef Stefan International Postgraduate School
–Spin-offs: Temida and Quintelligence
–Cycorp Europe established in 2007 at JSI
International
–Collaboration with over 100 academic and industrial partners in EU projects
–Strong ties with over 20 other partners, including CMU, Stanford University, NASA Ames, Microsoft Research and Osaka University
Industry
British Telecom, France Telecom, Siemens Business Solutions, Empolis/Bertelsmann, Atos Origin, Software AG, UN FAO, Dassault Aviation, BRGM (Bureau de recherches géologiques et minières), SINTEF (Norway), iSOCO (Spain), Ontoprise (Germany), SIRMA (Bulgaria), Helsinki Institute of Technology, FZI Karlsruhe, CRF FIAT
Education and knowledge transfer
Teaching
–MSc and PhD courses in major research areas of knowledge technologies
–Supervision of BSc, MSc, and PhD students
Institutions
–Jožef Stefan International Postgraduate School
–Universities of Nova Gorica, Maribor, Ljubljana, Primorska and Graz
videolectures.net
–World-leading video lectures Web portal
Summer schools
–Semantic Web Summer School 2004 – 100 attendees
–AI Summer School ACAI-05 – 100 attendees
High school student competitions
–Yearly Computer Science competitions
–Books of tasks and solutions
Selected publications before
Scientific Highlights
Scientific results
Publications in prestigious journals
–Journal of Machine Learning Research (3), Machine Learning (6), Decision Support Systems
–Ecological Modeling (10), Journal of Biomedical Informatics (3)
Awards
–Two elected ECCAI fellows (2007, 2008)
–Prešeren BSc thesis award (2007)
–Best software award (ESWC-2006)
PC chairs of major scientific events
–DS-06, ESSLLI-07, ILP-08
–ECML/PKDD-07, ECML/PKDD-09
High SCI citations of group members
Editors/authors of numerous books and proceedings
Scientific highlight: Subgroup discovery
New methods and systems
–Discovering interesting subgroups in tabular data: SD, CN2-SD, APRIORI-SD
–Discovering interesting subgroups in multi-relational data: RSD
Breakthrough technology
–Effective method for using ontologies in relational data mining
Using GO, KEGG, ENTREZ to form relational features
Successful discovery of new scientific knowledge in functional genomics
Journal papers
–MLJ 2004a and 2004b, JMLR 2004, …, IEEE TSMC 2006, MLJ 2007, JBI 2007, JMLR 2008, JBI
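As a rough illustration of what "interesting subgroup" can mean in tabular data, the sketch below scores candidate conditions with weighted relative accuracy (WRAcc), the rule quality measure commonly associated with CN2-SD. The toy table, the attribute names and the `wracc` helper are hypothetical; this is not code from SD, CN2-SD, APRIORI-SD or RSD.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def wracc(rows: List[Row],
          condition: Callable[[Row], bool],
          target: Callable[[Row], bool]) -> float:
    """Weighted relative accuracy: p(Cond) * (p(Class | Cond) - p(Class))."""
    n = len(rows)
    covered = [r for r in rows if condition(r)]
    if n == 0 or not covered:
        return 0.0  # an empty subgroup is never interesting
    p_cond = len(covered) / n
    p_class = sum(target(r) for r in rows) / n
    p_class_given_cond = sum(target(r) for r in covered) / len(covered)
    return p_cond * (p_class_given_cond - p_class)


# Hypothetical toy table (not real data).
rows = [
    {"smoker": "yes", "age": "old", "risk": "high"},
    {"smoker": "yes", "age": "young", "risk": "high"},
    {"smoker": "no", "age": "old", "risk": "low"},
    {"smoker": "no", "age": "young", "risk": "low"},
    {"smoker": "yes", "age": "old", "risk": "low"},
    {"smoker": "no", "age": "old", "risk": "high"},
]


def is_high(row: Row) -> bool:
    return row["risk"] == "high"


# Score two candidate subgroup descriptions against the target class.
score_smoker = wracc(rows, lambda r: r["smoker"] == "yes", is_high)
score_old = wracc(rows, lambda r: r["age"] == "old", is_high)
```

In this toy table the condition `smoker = yes` gets a positive score because the target class is over-represented among the rows it covers, while `age = old` scores zero because its class distribution matches the whole table.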
Scientific highlight: Equation discovery
New methods
–Integrating process-based domain knowledge and models
Breakthrough technology
–Integrating knowledge-based and data-driven modeling of dynamic systems
Systems: LAGRAMGE 2.0, IPM
Numerous successful applications
–Modeling aquatic ecosystems: Lake Bled, Ohrid, Kasumigaura, Greifensee, Glumsoe
Journal papers in MLJ, Ecological Modeling, …
State-of-the-art survey book
–Computational Discovery of Scientific Knowledge
Scientific highlight: Text mining and visualization
Content-Land
New methods
–Text processing, clustering, SVM, ontology construction, …
–Graph and text visualization
Breakthrough technology
–Open-source text mining software
Systems
–Text-Garden – text-mining library
–Document-Atlas – text visualization
–OntoGen – semi-automated ontology construction
Award winner at ESWC
Scientific highlight: Qualitative decision support
New methods
–Qualitative DS modeling
–Truly hierarchical, probabilistic
Systems: DEXi 2.0, proDEX
Monograph on Decision Support
Journal papers in Decision Support Systems, Journal of Operational Research, Ecological Modeling, …
Successful applications
–GM farming models and DS systems, Highway control, …
–SW Tools: SIGMEA Maize Coexistence Advisor (SMAC Advisor), ECOGEN Soil Quality Index (ESQI)
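To give a flavor of hierarchical qualitative modeling, the sketch below assumes the general idea behind DEX-style models: attributes take symbolic values, and each aggregate attribute is defined by a table of decision rules over its children. The attributes, scales and rule tables are hypothetical, the probabilistic extension is not modeled, and this is not the DEXi or proDEX file format or API.

```python
from typing import Dict, Tuple

# Hypothetical rule table: (biodiversity, fertility) -> soil quality.
SOIL_QUALITY_RULES: Dict[Tuple[str, str], str] = {
    ("low", "low"): "poor",
    ("low", "high"): "medium",
    ("high", "low"): "medium",
    ("high", "high"): "good",
}

# Hypothetical rule table: (soil quality, cost) -> overall assessment.
OVERALL_RULES: Dict[Tuple[str, str], str] = {
    ("poor", "low"): "unacceptable",
    ("poor", "high"): "unacceptable",
    ("medium", "low"): "acceptable",
    ("medium", "high"): "unacceptable",
    ("good", "low"): "good",
    ("good", "high"): "acceptable",
}


def evaluate(biodiversity: str, fertility: str, cost: str) -> str:
    """Evaluate the two-level hierarchy bottom-up using table lookups only."""
    soil_quality = SOIL_QUALITY_RULES[(biodiversity, fertility)]
    return OVERALL_RULES[(soil_quality, cost)]


# One hypothetical option described by its basic attribute values.
assessment = evaluate("high", "high", "low")
```

Because every aggregation step is a lookup in an explicit table, the reasoning behind an assessment can be traced rule by rule, which is the main appeal of qualitative models over numeric weighted sums.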
Relevance Highlights
Relevance highlight: European projects
“… Knowledge Technologies is the most successful Slovenian program in terms of EU projects.”
National Research Fund director F. Demšar, Oct. 25, 2007
FP6: 20 EU projects
–4 IP projects, 1 NoE, 3 SSA, 1 CA
–11 STREP projects
–Coordination of one STREP project (IQ)
In FP6 we acquired ~25% of Slovenian FP6-IST funds (5.1+ Mio EUR), i.e. ~7% of Slovenian FP6 funds
FP6 European projects
ALVIS - Superpeer Semantic Search Engine (STREP, 2004–06)
SIGMEA - Sustainable Introduction of GMOs into European Agriculture (STREP, 2004–07)
IMAGINATION - Image-based Navigation in Multimedia Archives (STREP, 2006–09)
SMART - Statistical Multilingual Analysis for Retrieval and Translation (STREP, 2006–09)
SWING - Semantic Web Services Interoperability for Geospatial Decision Making (STREP, 2006–09)
TAO - Transitioning Applications to Ontologies (STREP, 2006–09)
E.E.T Pipeline - European Embryonal Tumor Pipeline (STREP, 2007–09)
E4 - Extended Enterprise management in Enlarged Europe (STREP, 2006–08)
Tool-East - Open Source Enterprise Resource Planning and Order Management System for Eastern European Tool and Die Making Workshops (STREP, 2006–08)
IQ - Inductive Queries for Mining Patterns and Models (STREP, 2005–08), Coordinator
HEALTHREATS - Integrated Decision Support System for HEALTH THREATS and crises management (STREP, 2007–10)
SEKT - Semantically-Enabled Knowledge Technologies (IP, 2004–06)
ECOLEAD - European Collaborative Networked Organizations Leadership Initiative (IP, 2004–08)
NeOn - Lifecycle Support for Networked Ontologies (IP, 2006–10)
Co-Extra - GM and non-GM Supply Chains: Their Co-existence and Traceability (IP, 2008–09)
PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning (NoE, 2003–07)
CEC-WYS - Central European Centre for Women and Youth in Science (SSA, 2004–07)
IST-World - Knowledge Base for RTD Competencies (SSA, 2005–07)
WS DEBATE - Stimulating Policy Debate on Women and Science Issues in Central Europe (SSA, 2006–08)
KD-ubiq - A blue print for ubiquitous knowledge discovery systems (CA, 2005–08)
FP7 European projects (in 2008)
FP7
–3 IP, 2 STREP, 1 NoE, 1 CSA
–In FP7 we have acquired ~30% of Slovenian FP7-ICT funds (2.5+ Mio EUR)
Projects
–COIN - COllaboration and INteroperability for networked enterprises (IP, 2008–12)
–ACTIVE - Enabling the Knowledge Powered Enterprise (IP, 2008–11)
–EURIDICE - European Inter-Disciplinary Research on Intelligent Cargo for Efficient, Safe and Environment-friendly Logistics (IP, 2008–11)
–PASCAL2 - Pattern Analysis, Statistical Modelling and Computational Learning 2 (NoE, 2008–13)
–BISON - Bisociation Networks for Creative Information Discovery (FET, 2008–11)
–PHAGOSYS - Systems biology of phagosome formation and maturation - modulation by intracellular pathogens (STREP, 2008–11)
–MONDILEX - Conceptual Modelling of Networking of Centres for High-Quality Research in Slavic Lexicography and their Digital Resources (CSA, 2008–10)
Industrial participation in European projects
We have helped 11 Slovenian companies to become partners in FP6 and FP7 EU projects. The total EC contribution to these industrial partners exceeds 2 Mio EUR.
–Orodjarski grozd
–Avtomobilski grozd
–Grozd visokotehnološke opreme
–Kogast Grosuplje
–Emo orodjarna
–Valji Štore
–Tecos
–Quintelligence
–Cycorp
–Amebis
–Hermes Softlab
Relevance highlight: Slovene language and heritage
nl.ijs.si portal
Largest public repository of Slovene language resources
~10,000 requests/day
Annotated language corpora
Lexicons and dictionaries
On-line tools for language processing: concordancers, lemmatisers, taggers
eZISS digital library of critical editions of Slovene literature
Relevance highlight: Environmental data analysis
Applied projects
–Agriculture: modeling co-existence of genetically modified and conventional crops
–Forestry (automated forest mapping): from satellite images instead of LIDAR; cost reduction from 660 to 0.01 US$/km²
–Fire risk model: deployed in e-GIS UJME
Events
–ECEM/EAML-04: European Conf. on Ecological Modeling: Env. App. of Machine Learning
–Special issues of Ecological Modeling journal
Postgraduate education
–University of Nova Gorica
–Jožef Stefan International Postgraduate School
–University of Trento
Relevance highlight: Healthcare data analysis
Projects MediNet and MediNet+ for the Slovenian Ministry of Health
–Qualifications of physicians
Modeling and exception finding
–Planning of needs for physicians
–Accessibility of primary healthcare
Relevance highlight: Public portals
videolectures.net
World-leading video lectures Web portal
4,500+ videos of 3,000+ lectures at 150+ events
About 3,000 views/day
Collaboration with CMU, Cambridge, Oxford, Max Planck, Berkeley, INRIA
To include all MIT OpenCourseWare and CERN lectures base in
Relevance highlight: Public portals
World-leading Web portal for analyzing European science
90,000 RTD organizations, 68,000 RTD projects, 1.6 Mio experts and 2.5 Mio publications
About 15,000 visits/day
Extending coverage to Russia, India, SEE countries
Organization of events
Slovenian events
–Solomon seminar – regular public seminar, running for 9 years (200+ seminars)
–SiKDD – yearly Slovenian Conference on Data Mining
–Language Technologies – biennial conferences
International events
–ECEM/EAML-04: Eur. Conf. on Ecol. Modeling – 100 attendees
–European Semantic Web Conference 2006 – 350 attendees
–IDA 2007 – 100 attendees
–10+ international meetings and workshops (~50 attendees)
International events planned
–ECML/PKDD 2009 – est. 400 attendees
–WWW 2012 (in process) – est attendees
largest CS event to be organized in Slovenia
Vision
Future advances in basic research (1)
Data analytics
–Structured data analysis (structured prediction, bisociation analysis, …)
–Sensor network analysis, social network analysis (large graph data)
–Multi-modal data analysis (information fusion, different data types)
–Complex data visualization
Text analytics
–Extending Text-Garden to multimedia mining (text, image, Web) and social network analysis
–Advancing information extraction, machine translation
–Ontologies and Semantic Web
Future advances in basic research (2)
Human language technologies
–Semantic annotation of Slovene language corpora
–Integrated digital library of Slovene text-critical editions
–Slovene cultural heritage – processing old (19th century) language
Decision support
–Integration of qualitative and quantitative methods
–Handling incompleteness, uncertainty and imprecision
Knowledge management
–Web 2.0, Semantic Web services
–Networked organizations
–eLearning – videolectures.net
Impact on other sciences through applied research
Environmental sciences
–Ecology (aquatic ecology, modeling the response of ecosystems to climate change)
–Forestry
–Environmental epidemiology
Agriculture
Biomedicine
–Bioinformatics, Functional genomics, Systems biology
–Medicine
Linguistics, Humanities and Social sciences
Engineering
Impact will be achieved in collaboration with partners of EU projects
National relevance
Developing IT and building a knowledge-based society
–Through basic research
–Training competent researchers in this area
–Education (at graduate and post-graduate level)
Applied research
–Impact of the potential introduction of GM crops, environmental epidemiology of tick-borne diseases, introduction of ML technology for Slovene-English machine translation systems, analytic techniques for enterprise knowledge management, systems biology
Continue opening new high-tech jobs in Slovenia
–Through direct industrial applications
–Through inclusion of Slovenian industry in EU projects
Means for achieving our vision
Clear scientific focus on knowledge technologies
Excellent links with scientists abroad
Excellent links with industry
Young and visionary staff
Available equipment
Secured funding:
–About 25% from national long-term research program Knowledge Technologies and other national and international projects
–About 75% from EU funded projects
Principal researchers – project leaders
Nada Lavrač
Sašo Džeroski
Tomaž Erjavec
Dunja Mladenić
Head of Department
Marko Bohanec
Marko Debeljak
Marko Grobelnik
Mitja Jermol
Department members - Bohinj
Jožef Stefan Institute and Postgraduate School
Jožef Stefan was one of the most distinguished physicists of the 19th century. He originated the law of total radiation from a black body: $j = \sigma T^{4}$.
Founded in 1949, Jožef Stefan Institute is the leading Slovenian scientific research institute, covering a broad spectrum of basic and applied research. Its staff of more than 850 specializes in natural sciences, life sciences and engineering. The Department of Knowledge Technologies is one of the seven ICT departments of the institute. Other departments work in the areas of chemistry, biochemistry, ecotechnology, nanotechnology, physics, nuclear technology and safety.
Founded in 2004, Jožef Stefan International Postgraduate School offers MSc and PhD programs in ICT, nanotechnology and ecotechnology. Courses are taught in English.
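Stefan's law says that the power radiated per unit area of a black body, j, grows with the fourth power of its absolute temperature T, with the Stefan-Boltzmann constant σ as the proportionality factor. A minimal worked sketch follows; the function name and the example temperatures are illustrative choices, not part of the original slides.

```python
# Stefan-Boltzmann constant in W m^-2 K^-4 (CODATA 2018 value).
SIGMA = 5.670374419e-8


def radiant_emittance(temperature_kelvin: float) -> float:
    """Power radiated per unit area of an ideal black body, j = sigma * T^4."""
    if temperature_kelvin < 0:
        raise ValueError("absolute temperature cannot be negative")
    return SIGMA * temperature_kelvin ** 4


# A black body at 300 K radiates roughly 459 W per square metre.
j_300 = radiant_emittance(300.0)

# Doubling the temperature multiplies the radiated power by 2^4 = 16.
ratio = radiant_emittance(600.0) / radiant_emittance(300.0)
```

The fourth-power dependence is the notable part: a modest rise in temperature produces a large increase in radiated power.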