1 ESCRIRE: Embedded Structured Content Representation In Repositories Jérôme Euzenat INRIA Rhône-Alpes
2 ESCRIRE: Motivations Embedding a simplified but formal representation of content in documents: search on structured criteria; document comparison (genericity, similarity…); automatic classification and organisation.
3 Knowledge-based queries: (and book (about "Agatha Christie")) vs. book AND "Agatha Christie". (and flat (location "Alps")) …including those in Val d’Isère! (and bookshop (location "London")) …bookstore included.
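The difference between the two kinds of query can be sketched in a few lines of Python. This is only an illustration, not the ESCRIRE implementation: the tables `PART_OF` and `IS_A`, the helper names and the sample items are all hypothetical, and a real ontology would be far richer than two lookup tables.

```python
# Hypothetical toy ontology: geographic containment and concept generalization.
PART_OF = {"Val d'Isère": "Alps", "Chamonix": "Alps"}
IS_A = {"bookstore": "bookshop"}


def generalizations(term, relation):
    """Return the term followed by everything it generalizes to in `relation`."""
    chain = [term]
    while chain[-1] in relation:
        chain.append(relation[chain[-1]])
    return chain


def matches(item, concept, location):
    """Knowledge-based match: the item's type and location may be more
    specific than the ones named in the query."""
    return (concept in generalizations(item["type"], IS_A)
            and location in generalizations(item["location"], PART_OF))


# Hypothetical described items.
items = [
    {"id": "a", "type": "flat", "location": "Val d'Isère"},
    {"id": "b", "type": "flat", "location": "Paris"},
    {"id": "c", "type": "bookstore", "location": "London"},
]

# (and flat (location "Alps")) also finds the flat in Val d'Isère.
alps_flats = [i["id"] for i in items if matches(i, "flat", "Alps")]
# (and bookshop (location "London")) also finds the bookstore.
london_bookshops = [i["id"] for i in items if matches(i, "bookshop", "London")]
```

A plain keyword search for "Alps" would miss item `a`, whose description never mentions the word.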
4 Query languages: level 3, Semiotic; level 2, Semantic (F-logic, Escrire…); level 1, Structural (SQL, XQL); level 0, Full-text search.
5 ESCRIRE: Goals Comparison of several knowledge representation techniques in order to find the type of situation to which they are most suited (indexing, classifying, filtering…).
6 ESCRIRE: Consortium “Coordinated research action (ARC)” involving Acacia (Sophia-Antipolis): conceptual graphs Sherpa/Exmo (Rhône-Alpes): object-based representations Orpailleur (Lorraine): terminological logics. Usinor: application.
7 “Ontology” Description ESCRIRE: Acquisition Global analysis Individual analysis Integration XML document Document Tr-schema Tr-object
8 ESCRIRE: Queries “Ontology” Query helper XML document Tr-schema Tr-query Troeps XML document
9 ESCRIRE: Problem statement Given: a set of (HTML) documents annotated by a description of their content in a pivotal language; an ontology of the domain; a set of queries about the subject. Retrieve: the adequate documents.
10 ESCRIRE: Software variation Knowledge representation + query evaluation. Translated from a pivotal language into Conceptual graphs, Object-based representation, Description logic. Translated by hand into CG, OKR, DL.
11 ESCRIRE: Evaluation Two corpora Two (or three) sets (training and test) provided by an external user. Qualitative and quantitative evaluation (possibly external).
12 ESCRIRE: Quantitative criteria Precision: rate of correct answers among those returned; Recall: rate of expected answers actually returned; Accuracy = (precision + recall)/2; Performance in time; Coverage of the query language; Ordering of answers.
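As a worked illustration of the first three criteria, here is a minimal Python sketch. It assumes that answers are compared as sets of document identifiers; the function name `evaluate` and the sample identifiers are hypothetical.

```python
def evaluate(retrieved, relevant):
    """Compute precision, recall and their mean (the slide's 'accuracy')
    for one query, given retrieved and expected document identifiers."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    accuracy = (precision + recall) / 2
    return precision, recall, accuracy


# Four documents returned, two of them among the three expected ones.
p, r, a = evaluate(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
```

Here precision is 2/4, recall is 2/3, and the accuracy is their arithmetic mean, 7/12.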
13 ESCRIRE: Qualitative criteria Given by external users (query designers): Naturalness of queries Adequacy of answers Overall appreciation (aggregation).
14 ESCRIRE: Scaling Multiplying the size by orders of magnitude: Corpus Ontology Queries.
15 ESCRIRE: Reference comparisons Dublin core metadata Full-text search
16 ESCRIRE: Ontology elements (1) …
17 ESCRIRE: Ontology elements (2) … …
18 ESCRIRE: Content descriptions inhibition …
19 ESCRIRE: Knowledge embedding … … … … …
20 ESCRIRE: Queries Stated on objects, but results are documents (concerning these topics) Document similarity by content similarity
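The two ideas on this slide can be sketched as follows. This is an assumption-laden illustration: the `ANNOTATIONS` table and its contents are invented, and the Jaccard index is used here only as one possible measure of content similarity; the slides do not say which measure ESCRIRE uses.

```python
# Hypothetical annotations: each document lists the objects its content describes.
ANNOTATIONS = {
    "doc1": {"geneA", "geneB", "interaction1"},
    "doc2": {"geneA", "geneC"},
    "doc3": {"stress", "cortisol"},
}


def documents_about(objects):
    """A query is stated on objects; the answer is the documents describing them."""
    wanted = set(objects)
    return sorted(doc for doc, described in ANNOTATIONS.items()
                  if described & wanted)


def content_similarity(doc_a, doc_b):
    """Jaccard index over described objects (an assumed similarity measure)."""
    a, b = ANNOTATIONS[doc_a], ANNOTATIONS[doc_b]
    union = a | b
    return len(a & b) / len(union) if union else 0.0


docs = documents_about({"geneA"})
sim_related = content_similarity("doc1", "doc2")
sim_unrelated = content_similarity("doc1", "doc3")
```

With these annotations, `doc1` and `doc2` share one described object out of four, while `doc1` and `doc3` share none.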
21 ESCRIRE: Query language SELECT / FROM / WHERE / ORDERBY + AND / OR / NOT / ALL / EXISTS | IN ALIKE
22 ESCRIRE: Corpus 1 Subject: genetic interaction Text source: MedLine abstracts Annotations: manual Ontology: Knife knowledge base + other
23 ESCRIRE: Corpus 2 Subject: Psychological stress Text source: MedLine abstracts Annotation: manual annotations Ontology: UMLS/MeSH
24 ESCRIRE: Where are we? Building translators from pivot to actual formats. 1st part of Corpus 1 available (other data shall follow quickly).
25 ESCRIRE: Calls Other corpora; Natural language technology; Other representation systems; starting from September 2000.
26 For more information…