LT4eL - Language Technology for eLearning - Paola Monachesi, Lothar Lemnitzer, Kiril Simov, Alex Killing, Diane Evans, Cristina Vertan.


LT4eL - Language Technology for eLearning -1-
EU-IST-FP6 Project
The LT4eL project uses multilingual language technology tools and semantic web techniques for improving the retrieval of learning material. The developed technology will facilitate personalized access to knowledge within learning management systems and support decentralisation and co-operation in content management.

LT4eL - Language Technology for eLearning -2-
Start date: 1 December 2005
Duration: 30 months
EU financing: 1.5 million euro
Project type: STREP IST-4
Coordination: Paola Monachesi (Utrecht University)
Contact for information:

LT4eL - Partners
– Utrecht University (UU), The Netherlands
– University of Hamburg (UHH), Germany
– University “Al.I.Cuza” of Iasi (UAIC), Romania
– University of Lisbon (FFCUL), Portugal
– Charles University Prague (CUP), Czech Republic
– Institute for Parallel Processing, Bulgarian Academy of Sciences (IPP-BAS), Bulgaria
– University of Tübingen (UTU), Germany
– Institute of Computer Science, Polish Academy of Sciences (ICS-PAS), Poland
– Zürich University of Applied Sciences Winterthur (ZHW), Switzerland
– University of Malta (UOM), Malta
– University of Cologne (UCO), Germany
– Open University (OU), United Kingdom

LT4eL - Languages
Bulgarian, Czech, Dutch, German, Maltese, Polish, Portuguese, Romanian, English

LT4eL - Aims
– Improve retrieval of learning material
– Facilitate construction of user-specific courses
– Improve creation of personalized content
– Support decentralization of content management
– Allow for multilingual retrieval of content

LT4eL - Objectives -1-
Scientific and technological objectives
– Creation of an archive of learning objects and linguistic resources
– Integration of language technology resources in eLearning
– Integration of semantic knowledge in eLearning
– Integration of functionalities in an open source LMS
– Validation of the enhanced LMS

LT4eL - Objectives -2-
Political objectives
– Support multilinguality
– Knowledge transfer
– Awareness raising
– Exploitation of resources
– Facilitate access to education

LT4eL - Workpackages
▪ WP1 - Setting the scene - WP leader: University Al. I. Cuza of Iasi
▪ WP2 - Semi-automatic metadata generation driven by Language Technology resources - WP leader: University of Tübingen
▪ WP3 - Enhancing eLearning with semantic knowledge - WP leader: IPP, Bulgarian Academy of Sciences
▪ WP4 - Integration of the new functionalities in the ILIAS Learning Management System - WP leader: University of Cologne
▪ WP5 - Validation of new functionalities in the ILIAS Learning Management System - WP leader: Open University (England)
▪ Multilinguality - Leader: University of Hamburg

[Figure: LT4eL architecture. Components shown: user documents (PDF, DOC, HTML, SCORM, XML); Convertor 1 and Convertor 2 (HTML, SCORM and pseudo-structured documents to basic XML); linguistic processor (lemmatizer, POS tagger, partial parser); repository holding documents with linguistic annotation (XML) and metadata (keywords); ontology; lexicons for BG, CZ, DT, EN, GE, MT, PL, PT and RO; glossary; crosslingual retrieval; LMS with user profile.]

Collection of learning materials
Collection of the learning material (uploads and updates, password protected)
▪ IST domains for the LOs:
– 1. Use of computers in education, with sub-domains:
  ▪ 1.1 Teaching academic skills, with sub-domains:
    ▪ Academic skills
    ▪ Relevant computer skills for the above tasks (MS Word, Excel, PowerPoint, LaTeX, Web pages, XML)
    ▪ Basic computer skills (use of computers for beginners) (chats, e-mail, Internet)
  ▪ 1.2 e-Learning, e-Marketing
  ▪ 1.3 The I*Teach document (Leonardo project)
  ▪ 1.4 Impact of use of computers in society
  ▪ 1.5 Studies about use of computers in schools / high schools
  ▪ 1.6 Impact of e-Learning on education
– 2. Calimera documents (parallel corpus developed in the Calimera FP5 project)

Collection of learning materials and linguistic tools
Normalization of the learning material
▪ commonly agreed DTD and converters from HTML/TXT to a basic XML format
▪ Inventory and classification of existing tools relevant to:
– the integration of language technology resources in eLearning (WP2)
– the integration of semantic knowledge (WP3)
▪ Inventory and classification of existing language resources
▪ corpora and frequency lists:
▪ lexica:
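The conversion from plain text to a basic XML format could look roughly like the sketch below. The slides do not show the agreed DTD, so the element names (`document`, `title`, `body`, `par`) and the function name `text_to_basic_xml` are purely illustrative assumptions.

```python
import re
import xml.etree.ElementTree as ET


def text_to_basic_xml(text: str, title: str, lang: str) -> str:
    """Wrap plain text in a minimal XML structure.

    Element names are hypothetical; the project's real DTD is not shown
    in the slides. Each blank-line-separated block becomes one <par>.
    """
    root = ET.Element("document", {"lang": lang})
    ET.SubElement(root, "title").text = title
    body = ET.SubElement(root, "body")
    for block in re.split(r"\n\s*\n", text):
        block = " ".join(block.split())  # collapse line breaks and extra spaces
        if block:
            ET.SubElement(body, "par").text = block
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(text_to_basic_xml("First paragraph.\n\nSecond\nparagraph.", "Demo", "en"))
```

A uniform target format of this kind lets the later steps (linguistic annotation, keyword extraction) ignore whether a learning object started as HTML or plain text.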


WP2: Integration of language resources in eLearning
Aims of the workpackage
– supporting authors in the generation of metadata for LOs
– improving keyword-driven search for LOs
– supporting the development of glossaries for learning material

Metadata
– Metadata are essential to make LOs visible to larger groups of users
– Authors are reluctant, or not experienced enough, to supply them
– NLP tools are intended to help authors with that task
– The project uses the LOM metadata schema as a blueprint

Task 1: Identification of keywords
– Good keywords have a typical, non-random distribution in and across LOs
– Keywords tend to appear more often at certain places in texts (headings etc.)
– Keywords are often highlighted / emphasised by authors

Modelling Keywordiness
– Residual inverse document frequency is used to model the inter-text distribution of keywords
– Term burstiness is used to model the intra-text distribution of keywords
– Knowledge of text structure is used to identify salient regions (e.g. headings)
– Layout features of texts are used to identify emphasised words and weight them higher
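The two distributional measures can be sketched as follows. The slides give no formulas, so this uses one common formulation as an assumption: residual IDF as the observed IDF minus the IDF expected under a Poisson model, and burstiness as the mean number of occurrences per document that contains the term. The function name `keyword_statistics` is hypothetical.

```python
import math
from collections import Counter


def keyword_statistics(docs):
    """Compute (residual IDF, burstiness) per term.

    docs is a list of token lists, one per learning object.
    Formulas are a common textbook formulation, assumed here.
    """
    n_docs = len(docs)
    doc_freq = Counter()   # number of documents containing the term
    coll_freq = Counter()  # total occurrences in the whole collection
    for tokens in docs:
        counts = Counter(tokens)
        coll_freq.update(counts)
        doc_freq.update(counts.keys())

    stats = {}
    for term, df in doc_freq.items():
        cf = coll_freq[term]
        observed_idf = -math.log2(df / n_docs)
        # IDF a Poisson model would predict from the collection frequency
        expected_idf = -math.log2(1.0 - math.exp(-cf / n_docs))
        stats[term] = (observed_idf - expected_idf, cf / df)
    return stats


if __name__ == "__main__":
    sample = [["xml", "xml", "xml", "the"], ["the", "web"], ["the", "page"], ["the", "dtd"]]
    for term, (ridf, burst) in sorted(keyword_statistics(sample).items()):
        print(f"{term}: ridf={ridf:.3f} burstiness={burst:.1f}")
```

A term that clusters in few documents scores high on both measures, while a function word spread evenly over all documents scores low.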

Challenges
– Treating multi-word keywords (suffix arrays will be used to identify n-grams of arbitrary length)
– Assigning a combined weight which takes into account all the aforementioned factors
– Evaluation
  – the precision and recall of the keyword extractor will be measured against manually assigned keywords
  – inter-annotator agreement will be tested to get an upper bound for the keyword assignment task
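The evaluation step can be illustrated with a small sketch that compares extracted keywords against a manually assigned set. The function name and the sample keywords are illustrative only.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) of extracted keywords against gold keywords."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    system = ["xml", "dtd", "page"]
    annotator = ["xml", "dtd", "markup", "schema"]
    print(precision_recall(system, annotator))
```

Calling the same function on the keyword sets of two human annotators gives a rough picture of how far they agree, which is the kind of upper bound the slide refers to.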

Task 2: Identification of definitory contexts
– This task makes use of the linguistic annotation of LOs
– The approach is empirical
– Identification of definitory contexts is language specific
Workflow
– Definitory contexts will be searched and marked in LOs (manually)
– Local grammars will be drafted on the basis of these examples
– The linguistic annotation will be used for these grammars
– The grammars will be applied to new LOs
Integration
– The tools will be integrated as additional functions into the ILIAS LMS
– The tools will also be available for integration in other LMS
– We consider making the tools available as web services
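As a rough illustration of what a local grammar for definitory contexts captures, the sketch below uses a single regular expression for English "X is a Y" sentences. This is a deliberate simplification and an assumption: the project's grammars work on the linguistic annotation (lemmas, parts of speech) rather than on raw strings, and the names used here are hypothetical.

```python
import re

# Up to four words as the defined term, then a copula and an article.
DEFINITION = re.compile(
    r"^(?P<term>[\w-]+(?: [\w-]+){0,3}) (?:is|are) (?:a|an|the) (?P<definiens>.+)$",
    re.IGNORECASE,
)
LEADING_ARTICLE = re.compile(r"^(?:a|an|the)\s+", re.IGNORECASE)


def find_definitions(text):
    """Return (term, definiens) pairs for sentences matching the pattern."""
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        match = DEFINITION.match(sentence.rstrip(".!?"))
        if match:
            term = LEADING_ARTICLE.sub("", match.group("term"))
            results.append((term, match.group("definiens")))
    return results


if __name__ == "__main__":
    sample = ("A browser is a program that displays web pages. "
              "Click the button twice. XML is a markup language.")
    for term, definiens in find_definitions(sample):
        print(f"{term}: {definiens}")
```

A pattern this simple will over- and under-generate, which is one reason the workflow starts from manually marked examples and is drafted per language.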


WP3: Ontology-based cross-lingual retrieval
Generic approach
For each domain
– Using computers for beginners
– Impact of eLearning in society
we build a domain ontology
For consistency, we also consider an upper ontology (DOLCE)
Lexical material in all 9 languages is mapped onto the domain ontology and onto the upper ontology
According to
– the types of relations in the ontology
– use cases
– similarity (predefined ontological chunks)
we define search patterns for the user interface

Domain Ontology
– First built starting with English documents
– Concepts are based on:
  – keywords extracted in WP2
  – glossaries for the given domains
– Concepts have generic names with parts in English (for readability), e.g. C11_editors
– For each concept we provide labels with an explanation of the concept in English and, ideally, in all other languages
– Types of relations:
  – Is_a
  – Part_of
  – Here we need some information about what people are searching for
– The ontology will be encoded in OWL-DL
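The structure described above (concept identifiers, per-language labels, Is_a and Part_of relations) can be mimicked with a small in-memory model. This is not OWL-DL, only a plain Python sketch; apart from `C11_editors`, which appears on the slide, every identifier and label is made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: str
    labels: dict = field(default_factory=dict)   # language code -> label
    is_a: list = field(default_factory=list)     # ids of parent concepts
    part_of: list = field(default_factory=list)  # ids of containing concepts


def ancestors(ontology, concept_id):
    """Follow is_a links transitively and return all ancestor ids."""
    seen = []
    stack = list(ontology[concept_id].is_a)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(ontology[current].is_a)
    return seen


# Hypothetical fragment; only C11_editors is taken from the slides.
ONTOLOGY = {
    "C10_software": Concept("C10_software", {"en": "software"}),
    "C11_editors": Concept("C11_editors", {"en": "editor", "de": "Editor"},
                           is_a=["C10_software"]),
    "C12_text_editors": Concept("C12_text_editors", {"en": "text editor"},
                                is_a=["C11_editors"]),
}

if __name__ == "__main__":
    print(ancestors(ONTOLOGY, "C12_text_editors"))
```

Walking the Is_a chain is what allows a search for a general concept to also reach learning objects annotated with its more specific sub-concepts.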

Mapping multilingual resources on the domain ontology -1-
Trivial for words that have an exact correspondent in the ontology
Problems appear when:
1. One word in a language subsumes two or more concepts in the ontology
2. One word in a language subsumes two or more concepts in the ontology, but only in relation to some other concepts
3. One word has a much more restricted meaning that is not present in the ontology

Mapping multilingual resources on the domain ontology -2-
Solution to 1:
– Express the lexical items as OWL-DL expressions: disjunctions and conjunctions of classes (give example)
Solution to 2:
– Express the lexical items in OWL-DL using, together with operations on classes, relations between the concepts involved
Solution to 3:
– Insert a new concept in the ontology
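A loose illustration of solution 1, assuming a lexicon that maps a word to a disjunction or conjunction of concept identifiers and documents that carry concept annotations. The tuple encoding, the function names and all data below are hypothetical; real OWL-DL class expressions would be handled by an ontology toolkit and a reasoner.

```python
def matches(expression, concepts):
    """Check a document's concept set against an ("or" | "and", ids) expression."""
    operator, operands = expression
    if operator == "or":
        return any(c in concepts for c in operands)
    if operator == "and":
        return all(c in concepts for c in operands)
    raise ValueError(f"unknown operator: {operator}")


def search(lexicon, documents, lang, word):
    """Return ids of documents, in any language, that match the word's concepts."""
    expression = lexicon.get((lang, word))
    if expression is None:
        return []
    return [doc_id for doc_id, concepts in documents.items()
            if matches(expression, concepts)]


# Hypothetical data: one German word covering two ontology concepts.
LEXICON = {("de", "Programm"): ("or", ["C11_editors", "C20_browsers"])}
DOCUMENTS = {
    "doc_en_1": {"C11_editors"},
    "doc_pt_1": {"C20_browsers"},
    "doc_bg_1": {"C30_networks"},
}

if __name__ == "__main__":
    print(search(LEXICON, DOCUMENTS, "de", "Programm"))
```

Because matching happens on concept identifiers rather than on words, a query in one language retrieves documents written in others.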

Ontology enrichment
– If a word cannot be mapped directly onto the ontology, check whether a similar meaning can be found in some other languages
– If this does not seem to be an isolated case, insert the new concept in the ontology
– In any case, assign to each concept a label indicating the languages in which the concept is lexicalised
– The insertion of a new concept will be done with FaCT or RACER

Linking lexicon, domain ontology and upper ontology
– Domain ontology concepts will be mapped onto the upper ontology. This will ensure that all important properties of the main classes are considered.
– Senses of lexical items that are not relevant to the domain could also be mapped directly onto the upper ontology


WP4: Tasks
– Integration of the LT4eL tools for semi-automated metadata generation, definitory context extraction and ontology-supported extended data retrieval into a learning management system (prototype based on the ILIAS LMS)
– Developing and providing documentation for a standard-technology-based interface between the language technology tools and learning management systems

WP4: Objective - Fostering Re-Use of LT-Tools
[Diagram: the Language Technology Tools expose an LT-Interface (XML-RPC / Web Service) that is used by LMS 1 (ILIAS), LMS 2 (e.g. Moodle) and LMS 3 (e.g. ATutor)]
Simple-as-possible, well-documented and standards-based interface
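To make the idea of an XML-RPC interface between an LMS and the language technology tools concrete, here is a minimal sketch using Python's standard `xmlrpc` modules. The method name `suggest_keywords` and its frequency-based body are placeholders invented for this example; the slides do not specify the project's actual interface.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def suggest_keywords(text, max_keywords=5):
    """Placeholder extractor: most frequent words longer than three characters."""
    counts = {}
    for word in text.lower().split():
        word = word.strip(".,;:!?()")
        if len(word) > 3:
            counts[word] = counts.get(word, 0) + 1
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ranked[:max_keywords]


def call_over_xmlrpc(text):
    """Start a throwaway local server, call it once as an LMS would, stop it."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(suggest_keywords, "suggest_keywords")
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]  # port chosen by the operating system
        with ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
            return proxy.suggest_keywords(text)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(call_over_xmlrpc("XML editors edit XML files. Editors help authors."))
```

Since the client only needs an HTTP endpoint and a method name, the same service could be called from ILIAS, Moodle or ATutor, which is the re-use the slide aims at.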

WP4: Using LT-Tools in Learning Management Systems
Possible use case scenarios
– Author annotates learning object with keywords
– Author generates glossary for learning object
– Tutor searches for learning objects
– Learner searches for learning material in multiple languages
– Learner browses through learning material with ontology-based information

WP4: Example ILIAS-LT-Tools Use Case Scenario: Keyword Generation
1. Author adds new learning object to the LMS (e.g. HTML file)
2. ILIAS displays a form including input fields for title, language and filename
3. Author enters title and language, selects a local .pdf file and hits Upload File
4. ILIAS+LTTools display the (LOM) metadata input form, including a list of auto-generated, suggested keywords
5. Author selects some of the suggested keywords, enters some new keywords and hits Save
6. ILIAS saves the metadata


WP5: Validation of enhanced LMS
The challenge is to answer these questions:
– How does this compare with what can already be done with existing systems?
– What added value is there?
– What is the educational / pedagogic value of these functionalities?
The problem is to evaluate the functionality and separate it from issues of usability or unfamiliarity with the LMS platform.

How can we expect users to identify any benefit?
Present them with tasks to complete using the LMS
– With no project functionality
– With project functionality (partial, full)
Identify potential users
– Course creators
– Content authors or providers
– Teachers
– Students (studying in their own language, studying in a second language)

Create outline User Scenarios
We define scenarios, in this context, as
– a story focused on a user or group of users which provides information on the nature of the users, the goals they wish to achieve and the context in which the activities will take place.
– They are written in ordinary language, and are therefore understandable to various stakeholders, including users.
– They may also contain different degrees of detail.

Example Outline Scenario for a student
A student has just completed studying, in English, a topic on 'The use of computers in schools'. They are interested in finding more information on the use of this topic within their subject domain. Their first language is German.
Suggested search approaches might be:
– standard search as available within the LMS, not using any LT4eL functionality
– add in the lexicon
– add in the multilinguality
– add in the ontology
Users will be given guidance / familiarisation activities in using each of the tools beforehand.
▪ User scenarios are under development for all the identified users. Each scenario will focus on one or more of the new functionalities, depending on the role of the particular user.

Possible teacher / course creator tasks
– Add new content to a new course structure
– Search for existing content and add it to a course structure
– Add new content to an existing course
– Add supplementary content (could be in another language)
– Modify existing content
– Create new content and make it available to the system

Feedback from Users
Sessions will be used to gather some initial feedback using
– individual interviews
– group plenary
– questionnaires

Project plan
– Preparatory work in place (May 2006)
– Development of functionalities complete (November 2006)
– Integration of functionalities in LMS complete (May 2007)
– First cycle of integration of functionalities in LMS and their validation complete (November 2007)
– Second cycle of integration of functionalities in LMS and their validation complete (May 2008)