ICS-FORTH, August 1, 2008. Integrated Information Management and Access - new chances for museums, archives and libraries. Martin Doerr, Foundation for Research.

Presentation transcript:

Integrated Information Management and Access - new chances for museums, archives and libraries
Martin Doerr
Center for Cultural Informatics, Institute of Computer Science, Foundation for Research and Technology - Hellas (ICS-FORTH)
Singapore, August 1, 2008

Integrated Information Management: Overview
- Information Integration: a utility perspective
- Museum and Library Information
- Key-words, Finding Aids and Thesauri
- Do we talk about the same thing?
- Understanding events, contexts and stories
- CIDOC CRM, simple implementations

Information Integration Management: A Perspective of Utility
- Memory institutions maintain Digital Repositories ("Digital Memories"): information systems that preserve and provide access to primary information sources, scientific and scholarly information and literature, such as digital libraries of publications, indices of archives of social or scientific activities, or documentation of physical collections.
- Digital Repositories are necessarily heterogeneous, in order to optimize their function for different information forms and access needs, but the knowledge they contain forms a logical whole.
- To get information and learn from it we need uniform access, retrieval by human criteria, and connection of disparate information assets (e.g., painting and biography).

Information Integration Management: A Perspective of Utility
- Information integration provides a syntactically and semantically homogeneous layer on top, be it physical or virtual, manual or automated.
- Multiple standard formats can coexist if information can be transformed or merged. One format does not ensure that the information is connected!
- Standardization and transformation go hand in hand. For both, documentation (metadata) needs to be provided, adapted or "cleaned": legacy data brought to standard form, data converted from one standard to another, data "tuned" so that they can be transformed.
- Ultimate integration cost: manual creation or adaptation of metadata.
- Better integration is not always more work, but it needs more foresight. Bad decisions cost most.

Information Integration Management: A Perspective of Utility
Levels of integration. From one platform, I can...
1. read everything, if I have the ID: syntactic integration, the Web
2. get everything that refers to the words X, Y, Z: Google and others
3. get everything about a particular person, thing, place, fact, or concept
4. learn whether there are things or facts with given characteristics
5. learn about associations and contexts of things across documents
For instance:
- What species is this object?
- Which professions did the relatives of van Gogh have? Who were the clients for van Gogh's paintings?
- Were German soldiers in Russia before WWII?
- Which antique art objects may Michelangelo have seen? (A 25-year project!)

Information Integration Management: A Perspective of Utility
- The traditional library task:
  - Collect and preserve documents and provide finding aids.
  - The job is done when the (one, best) document is handed out: "All you need is in this document".
- But understanding depends on relationships. Museum information has complex relationships.
- Relationships may be categorical or factual:
  - Categorical (e.g., "smoking causes cancer"): richly exploited by Semantic Web technology. Use and integration are limited to research results. Not useful for primary research itself.
  - Factual associations concatenate information assets into meaningful ("epistemic") networks ("stories"). They support context-based hypothesis building, cross-disciplinary search etc. (e.g., "John smoked at 20", …, "John had lung cancer at 60").

Information Integration Management: Library, Archive, Museum Information
- The typical library contents: "The whole stories"
  - Secondary literature (research results)
  - Facts brought into causal context
  - Categorical: theories and hypotheses
  - Fiction
- The typical archive contents: "The needle in the haystack"
  - Primary sources, "bits and pieces" (letters, legal documents, administration acts, images, scientific records)
  - Factual, kept in the sequence of creation, as left by the creator or the responsible party
- The typical museum information: "Museum objects rarely talk"
  - Factual documentation of properties and context per object, references, classification
  - Highly heterogeneous, disparate

Museum Information: "A Monet is not like a Dinosaur"
Museum objects may be:
- Unique in form, valuable out of context. Valued art objects: "La Pie" by Monet, aesthetic minerals, exceptional life forms, curiosities.
- Unique by particular context, not valuable out of context, valuable only as illustration or symbol. Historical heirlooms, relics of saints, "John Lennon's T-Shirt".
- Not unique, not particularly valuable, used as an example of a category out of the particular context. Most objects in natural history, ethnology, archeology.
- Unique by rarity, valuable as evidence out of a particular context. Most objects in paleontology, many unique archeological objects: "6th left rib from a T. Rex".

Information Integration Management: The Museum Information Problem
- The ultimate goal of users seeking information is not to get an "object" but to understand a topic.
- Understanding depends on relationships:
  - objects are interpreted by context (e.g., bone finds in Evans's "bathtubs")
  - contexts are interpreted by objects (e.g., many arrowheads in Troy IV)
  - objects are interpreted by categories (e.g., Evans's Minoan "bathtubs")
  - categories are supported by examples (e.g., the shape of a kris)
  - categories may be based on rare evidence (e.g., a hominid tooth)
- We need to integrate museums, archives and libraries in a sensible way in order to find integrated knowledge and produce new knowledge, to provide evidence for new hypotheses, and to verify or challenge old ones.

Information Integration Management: Library and Museum Information
- Museum and library information has complex interrelations. The two overlap, but are otherwise different.
- Libraries document literature in order to facilitate access to it.
- Museum documentation classifies and describes museum objects, their context and relevance. It refers to literature. Museums regularly produce (secondary) literature.
- Museum objects are referred to and published in literature. Literature may describe museum objects, their context, and theories about and related to them. Literature describes concepts that are exemplified or illustrated by museum objects. There is no standard documentation format for that yet!
- Libraries may also produce literature. Libraries may document and curate rare objects as museums do. Most museums maintain libraries.

Information Integration Management: Archive, Library and Museum Information
[Diagram. Institutions: Libraries, Museums, Archives. Holdings: Books, Objects, primary Documents. Link labels: illustrate, exemplify; refer to; provide finding aids; are about; document features & context; provide finding aids; make narratives from; publish using.]

Key-words, Finding Aids and Thesauri: The second level of integration
Why is Google (i.e., search engines!) good?
- Low cost, no data tuning, scalable
- Easily finds secondary literature, especially if abundant
- Finds things by their usual category names
- No user training, no access language
=> Recommendation: You should always provide a good search engine!
Why is Google bad?
- The user must know all synonyms
- Names are not things: rare things are hidden under frequent names (e.g., "George Bush", a software product called "Volcano")
- Relations are found only by aggregation of terms appearing in the source (e.g., "First known Turkish - Greek marriage in Crete" (1635))
- No control of relevance, no statistics possible, no related sources

Key-words, Finding Aids and Thesauri: The second level of integration
Finding Aids:
- Assumption: the user knows a topic, characterized by a noun, or knows associations of the topic that are uncorrelated to the problem to be solved (e.g., "organic farming" for "host-parasite studies", an author for a topic, or a search for an object by date of acquisition because I don't remember the name).
- The Dublin Core Metadata Element Set makes 15 relationships to terms explicit (type, classification, creator, publisher, date, format etc.).
- It increases precision.
- It increases recall if additional terms are added to the metadata.

Key-words, Finding Aids and Thesauri: The second level of integration
Is Dublin Core better than Google?
- Literature search by author and title: Google is sufficient or better
- Type, format, subject, coverage: DC is only better if the terms are not in the content
- Relationship: DC is better if items are not connected by a relevant term cluster
- Non-verbose, non-digital objects: DC provides the minimal metadata!
- By Shakespeare or about Shakespeare: DC disambiguates!
What does Dublin Core not do?
- Not appropriate for museum objects (no place, finding info, material)
- No typed relationships, no context information
- No notion of identity (separation of URI and name, American library tradition)
=> DC has significant benefit for non-verbose digital objects.
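The "by Shakespeare or about Shakespeare" point can be illustrated with a minimal Python sketch. The two records and the helper names `keyword_search` and `field_search` are invented for illustration; only the element names `creator` and `subject` come from Dublin Core.

```python
# Hypothetical Dublin Core style records; the values are invented for illustration.
records = [
    {"title": "Hamlet", "creator": "Shakespeare, William", "subject": "tragedy"},
    {"title": "A Life of Shakespeare", "creator": "Doe, Jane",
     "subject": "Shakespeare, William"},
]


def keyword_search(records, word):
    """Search-engine style: match the word in any field of the record."""
    word = word.lower()
    return [r["title"] for r in records
            if any(word in str(value).lower() for value in r.values())]


def field_search(records, field, word):
    """Fielded search: match the word only in one named metadata element."""
    word = word.lower()
    return [r["title"] for r in records if word in r.get(field, "").lower()]


everything = keyword_search(records, "shakespeare")
by_shakespeare = field_search(records, "creator", "shakespeare")
about_shakespeare = field_search(records, "subject", "shakespeare")
```

The keyword search returns both records, while the fielded searches separate the work written by the author from the work written about him.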

Key-words, Finding Aids and Thesauri: The second level of integration
Thesauri of controlled terms (categories):
- Subjects, object types, place types, person roles, event types
- Good for secondary literature search and metadata fields (libraries!)
- Bad: a "new language" users must learn, expensive to create
- Invisible thesauri enhance search engines
"Museums do not like thesauri":
- Not suited for factual knowledge!
- Cultural terminology is a dynamic research tool ("every PhD a new typology") used to conclude from form to function, time etc.
- Only a few high-level terms are stable and useful for finding aids
Recommendation: Small thesauri for museums (that users can see on one page) increase the power of metadata and improve search results.
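One way to read "invisible thesauri enhance search engines" is query expansion: the user types one term and the engine silently adds synonyms and narrower terms. The following Python sketch assumes a tiny invented thesaurus and invented object descriptions; it is not taken from any particular system.

```python
# A tiny, invented thesaurus: each term lists synonyms and narrower terms.
thesaurus = {
    "vessel": {"synonyms": ["container"], "narrower": ["amphora", "bowl"]},
    "amphora": {"synonyms": [], "narrower": []},
    "bowl": {"synonyms": [], "narrower": []},
}


def expand(term, thesaurus):
    """Return the term together with its synonyms and all narrower terms."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        entry = thesaurus.get(current, {})
        stack.extend(entry.get("synonyms", []))
        stack.extend(entry.get("narrower", []))
    return seen


def search(docs, term, thesaurus):
    """Return the documents that mention the term or any expanded term."""
    terms = expand(term, thesaurus)
    return [d for d in docs if any(t in d.lower() for t in terms)]


docs = ["Attic amphora with lid", "Bronze sword", "Shallow bowl, glazed"]
hits = search(docs, "vessel", thesaurus)
```

None of the descriptions contains the word "vessel", yet the amphora and the bowl are found, and the user never has to learn the controlled vocabulary.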

Do we talk about the same Thing?
- Co-reference can connect documents!
- Such networks hide stories! (complementary information)

Do we talk about the same Thing? Integration by Factual Relations
Hypertext is wrong: documents contain links!
[Diagram. Layers: CIDOC CRM Core Ontology; instances of real world nodes (KOS); documents in Digital Libraries. Real world nodes: Ethiopia, Hadar, Johanson's Expedition, Discovery of Lucy, AL, Lucy, Donald Johanson, Cleveland Museum of Natural History. Legend: linking documents by co-reference; primary link corresponding to one document; deductions.]

Do we talk about the same Thing? Co-reference links via authority files
- Not scalable!
- Find "friends of a friend"
[Diagram. Source 1 and Source 2 each hold content with local ids. The local ids are matched against an authority service through link tables (local id to authority id), giving a dynamic link. Join across sources by transitivity of co-reference. Query input: "Martin"; output: "George"; shared node: "Κώστας" / "Kostas".]

Do we talk about the same Thing? Co-reference links without authority files
- Find "friends of a friend"
[Diagram. Source 1 and Source 2 each hold content with local ids. Matching local ids are connected directly by making a co-reference. Join across sources by transitivity of co-reference. Query input: "Martin"; output: "George"; shared node: "Κώστας" / "Kostas".]

Do we talk about the same Thing? The third level of integration
- Documents are connected if they refer to the same things, people, places or events: this is "co-reference". The hypertext model is wrong.
- Authority files cannot catch up. They simplify the procedure but do not solve the problem. The scale is incredible.
- Curation of direct co-reference links (co-reference clusters) is needed.
- It is not more expensive than a search engine index.
- Duplicate detection, data cleaning and Web 2.0 methods can help to generate co-reference links on a massive scale.
Recommendation: Prepare for co-reference in documentation practice! (tag names, link locally etc.)
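Co-reference clusters and the "friends of a friend" join across sources can be sketched with a union-find structure. This is a minimal Python illustration under stated assumptions: the class `CoreferenceClusters`, the two friend lists and the function `friends_of_friend` are hypothetical; only the names "Martin", "George" and "Κώστας" / "Kostas" come from the slides.

```python
class CoreferenceClusters:
    """Union-find over (source, local id) pairs; linked ids share one cluster."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        """Return the representative of the cluster that contains item."""
        self.parent.setdefault(item, item)
        while self.parent[item] != item:
            # Path halving keeps later lookups short.
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def link(self, a, b):
        """Record a curated statement that a and b denote the same thing."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def same(self, a, b):
        """True if a and b are co-referent, directly or by transitivity."""
        return self.find(a) == self.find(b)


# Hypothetical facts held in two independent sources.
source1_friends = [("Martin", "Κώστας")]
source2_friends = [("Kostas", "George")]

clusters = CoreferenceClusters()
clusters.link(("Source 1", "Κώστας"), ("Source 2", "Kostas"))


def friends_of_friend(name):
    """Join the two sources wherever their local ids are co-referent."""
    results = []
    for person, friend in source1_friends:
        if person != name:
            continue
        for other, their_friend in source2_friends:
            if clusters.same(("Source 1", friend), ("Source 2", other)):
                results.append(their_friend)
    return results


answer = friends_of_friend("Martin")
```

The join succeeds only because a single curated link states that the two local names denote the same person; no authority file is involved.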

Understanding Events, Contexts, Stories: The Fourth Level of Integration
- So far, integration has taught us nothing beyond what I manually collect from each source.
- Co-reference allows for tracing stories, but not for querying stories.
- Understanding depends on relationships.
- Is there a global model of relationships? (social, economic, material, geographic, biological relations…, thousands of documentation formats)
- Dominance of the mesoscopic, human activity scale.
- Identification, classification, part-whole, reference, participation in meetings => these relations integrate museum and library information!
- Confirmed by museums, e-science, historians.

Information Integration Management: Context as a network of related "meetings"
[Diagram. Axes: space (Greece, Rome, Germany) and time. World-lines: the "LAOKOON"; the "LAOKOON" (copy) (in Vatican museum); an unknown Roman; Winckelmann; Winckelmann's mother. Meetings: unknown Roman copies "Laokoon"; Winckelmann's birth; Winckelmann sees "Laokoon"; Winckelmann writes "…noble simplicity, silent grandeur…" (in a library); Winckelmann's death (archive information). Annotation: Published Inference (in a library).]
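The idea of context as a network of "meetings" in space and time suggests how a question such as "which objects may this person have seen?" could be answered: look for items whose presence at a place overlaps in time with the person's presence there. The Python sketch below is only an illustration; the class `Presence`, the item names and all years are invented placeholders, not historical data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Presence:
    """An item known to be at a place at some time within [start, end]."""
    item: str
    place: str
    start: int  # earliest possible year
    end: int    # latest possible year


def overlaps(a, b):
    """True if both presences are at the same place and their years intersect."""
    return a.place == b.place and a.start <= b.end and b.start <= a.end


def may_have_met(person, presences):
    """Items whose presence overlaps with any presence of the given person."""
    own = [p for p in presences if p.item == person]
    others = [p for p in presences if p.item != person]
    return sorted({o.item for o in others for p in own if overlaps(p, o)})


# Invented placeholder data.
presences = [
    Presence("scholar", "Rome", 1755, 1768),
    Presence("statue A", "Rome", 1506, 1900),
    Presence("statue B", "Athens", 1700, 1800),
    Presence("statue C", "Rome", 1800, 1850),
]

candidates = may_have_met("scholar", presences)
```

Only "statue A" qualifies: "statue B" was in a different place and "statue C" arrived too late. The result is a list of candidates for a meeting, not proof that one took place.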

The CIDOC CRM: The CIDOC Conceptual Reference Model (ISO 21127:2006)
- Developed by the CRM Special Interest Group of the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM), following an initiative of ICS-FORTH, Heraklion, Crete.
- It is an extensible core ontology describing the underlying semantics of over a hundred database schemata and structures from all museum disciplines, archives and libraries. (Now extended by FRBR OO, modeling IFLA's FRBR.)
- It is the result of 15 years of interdisciplinary work and agreement.
- In essence, it is a generic model for recording "what has happened" at human scale, i.e., a class of discourse.
- With it we can generate huge, meaningful networks of knowledge by a simple abstraction: history as meetings of people, things and information.
- It holds a surprise: minimal or no specialization is enough to cover new domains.

The CIDOC CRM: Historical Archives…
Metadata:
  Type: Text
  Title: Protocol of Proceedings of Crimea Conference
  Title.Subtitle: II. Declaration of Liberated Europe
  Date: February 11,
  Creator: The Premier of the Union of Soviet Socialist Republics; The Prime Minister of the United Kingdom; The President of the United States of America
  Publisher: State Department
  Subject: Postwar division of Europe and Japan
About… (Documents):
  "The following declaration has been approved: The Premier of the Union of Soviet Socialist Republics, the Prime Minister of the United Kingdom and the President of the United States of America have consulted with each other in the common interests of the people of their countries and those of liberated Europe. They jointly declare their mutual agreement to concert… ….and to ensure that Germany will never again be able to disturb the peace of the world……"

The CIDOC CRM: Images, non-verbose objects…
Metadata:
  Type: Image
  Title: Allied Leaders at Yalta
  Date: 1945
  Publisher: United Press International (UPI)
  Source: The Bettmann Archive
  Copyright: Corbis
  References: Churchill, Roosevelt, Stalin
About… (Photos, Persons)

The CIDOC CRM: Places and Objects
Metadata (place):
  TGN Id:
  Names: Yalta (C,V), Jalta (C,V)
  Types: inhabited place (C), city (C)
  Position: Lat: N, Long: E
  Hierarchy: Europe (continent) <- Ukrayina (nation) <- Krym (autonomous republic)
  Note: …Site of conference between Allied powers in WW II in 1945; ….
  Source: TGN, Thesaurus of Geographic Names
Metadata (image):
  Title: Yalta, Crimean Peninsula
  Publisher: Kurgan-Lisnet
  Source: Liaison Agency
About… (Places, Objects)

The CIDOC CRM: Explicit Events, Object Identity, Symmetry
[Diagram. Entities: E31 Document "Yalta Agreement"; E7 Activity "Crimea Conference"; E65 Creation Event *; E38 Image; E39 Actor; E53 Place; E52 Time-Span; E52 Time-Span February 1945. Properties: P14 performed; P11 participated in; P94 has created; P86 falls within; P7 took place at; P67 is referred to by; P81 ongoing throughout; P82 at some time within.]
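An event-centric graph of this kind can be sketched in Python as a list of (subject, property, object) statements. The property labels are those shown on the slide; which node each property connects is an assumed reading of the flattened diagram, and the node names, `build_index` and `connected` are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict, deque

# Statements as (subject, property, object). Property labels follow the slide;
# the pairing of nodes is an assumption, not the author's exact diagram.
statements = [
    ("Crimea Conference", "P7 took place at", "Yalta"),
    ("Crimea Conference", "P4 has time-span", "February 1945"),
    ("Crimea Conference", "P67 is referred to by", "Allied Leaders at Yalta (image)"),
    ("Allied leaders", "P11 participated in", "Crimea Conference"),
    ("Allied leaders", "P14 performed", "Creation of Yalta Agreement"),
    ("Creation of Yalta Agreement", "P94 has created", "Yalta Agreement (document)"),
    ("Creation of Yalta Agreement", "P4 has time-span", "creation time-span"),
    ("creation time-span", "P86 falls within", "February 1945"),
]


def build_index(statements):
    """Index every node by its neighbours, ignoring the direction of a link."""
    index = defaultdict(set)
    for subject, _prop, obj in statements:
        index[subject].add(obj)
        index[obj].add(subject)
    return index


def connected(start, statements):
    """Breadth-first walk: everything reachable from start through any link."""
    index = build_index(statements)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in index[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen


reachable = connected("Yalta", statements)
```

Starting from the place alone, the walk reaches both the document and the image, although their original metadata records share no field value. The explicit event is what connects them.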

The CIDOC CRM: Data Example (e.g., from Extraction)
Transfer of Epitaphios GE34604 (entity: E10 Transfer of Custody, E8 Acquisition Event). Multiple Instantiation!
  P28 custody surrendered by: Metropolitan Church of the Greek Community of Ankara (entity: E39 Actor)
  P23 transferred title from: Metropolitan Church of the Greek Community of Ankara (entity: E39 Actor)
  P29 custody received by: Museum Benaki (entity: E39 Actor)
  P22 transferred title to: Exchangeable Fund of Refugees (entity: E40 Legal Body)
    P2 has type: national foundation (entity: E55 Type)
  P14 carried out by: Exchangeable Fund of Refugees (entity: E39 Actor)
  P4 has time-span: GE34604_transfer_time (entity: E52 Time-Span)
    P82 at some time within: (entity: E61 Time Primitive)
  P7 took place at: Greece (entity: E53 Place)
    P2 has type: nation, republic (entity: E55 Type)
    P89 falls within: Europe (entity: E53 Place), P2 has type: continent (entity: E55 Type). TGN data
Epitaphios GE34604 (entity: E22 Man-Made Object)
  P30 custody transferred through, P24 changed ownership through: Transfer of Epitaphios GE34604
  P2 has type: (entity: E55 Type)

The CIDOC CRM: Top-level Entities relevant for Integration
[Diagram. Entities: E39 Actors, E55 Types, E28 Conceptual Objects, E18 Physical Thing, E2 Temporal Entities, E41 Appellations, E53 Places, E52 Time-Spans. Link labels: participate in; affect or / refer to; refer to / refine; refer to / identify; location; at; within.]

The CIDOC CRM: Example: The Temporal Entity Hierarchy

The CIDOC CRM: A Classification of its Relationships
- Identification of real world items by real world names.
- Classification of real world items.
- Part-decomposition and structural properties of Conceptual and Physical Objects, Periods, Actors, Places and Times.
- Participation of persistent items in temporal entities. This creates a notion of history: "world-lines" meeting in space-time.
- Location of periods in space-time and of physical objects in space.
- Influence of objects on activities and products and vice versa.
- Reference of information objects to any real-world item.

The CIDOC CRM: What is an ontology?
- Ontologies are formalized knowledge: clearly defined concepts and relationships about real, possible states of affairs of a domain. "Semantics" is the world they refer to ("ontological commitment"), and not a set of logical rules! (E.g., what is an event?)
- Ontologies describe a reality, independent of context and performance! Information models are not ontologies! They abbreviate, denormalize, select. E.g.: "DC.creator", "DC.Date", "birthday/birthplace", "destination" in the MIDAS schema (UK monuments records).
- Ontologies can be understood by people and processed by machines to enable data exchange, data integration and query mediation:
  - Local information systems may export information in a CRM compatible form (CRM Core or more).
  - Local information systems may answer queries by a subset of CRM concepts.
  - Exported information may be merged in another database ("data warehouse"). Complementary information can thus be easily integrated.
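The remark that information models "abbreviate" can be made concrete: a field such as "DC.creator" compresses a path through a creation event. The Python sketch below expands a Dublin Core style record into event-centric statements. The function `dc_to_crm`, the generated event name and the date value are assumptions for illustration; the property labels follow the style used elsewhere in these slides ("P94B" marking the backward direction of P94).

```python
def dc_to_crm(record_id, dc):
    """Expand abbreviated DC fields into explicit statements about a creation event."""
    event = f"creation of {record_id}"
    triples = [(record_id, "P94B was created by", event)]
    for creator in dc.get("creator", []):
        # DC.creator abbreviates: item - was created by - event - carried out by - actor
        triples.append((event, "P14 carried out by", creator))
    if "date" in dc:
        # DC.date abbreviates: item - was created by - event - has time-span - date
        triples.append((event, "P4 has time-span", dc["date"]))
    return triples


record = {
    "creator": [
        "The Premier of the Union of Soviet Socialist Republics",
        "The Prime Minister of the United Kingdom",
        "The President of the United States of America",
    ],
    "date": "February 1945",  # illustrative value
}

exported = dc_to_crm("Protocol of Proceedings of Crimea Conference", record)
```

Once exported in this form, the creation event becomes a node of its own that other sources can refer to, which a flat "creator" field cannot offer.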

Interoperability of Museum Information: towards a network of knowledge
- There cannot be one database schema for all ALM information. A global core ontology is a high-level explanation, not a format. It allows for automated correlation, mediation, transformation and generation of integrated views.
- A particular installation should have a core schema that is compatible with the core ontology, following an informed decision about its integration and access capabilities: for instance, CRM Core, MuseumDat, or a similar CRM-compatible schema. DC and CRM Core can be combined.
- With CRM, we know at any time what an extension to more functionality means, e.g., FRBRoo / FRBRCore. (DC extension simply failed!)
- CRM Core (or MuseumDat): a low-cost entry to CRM compatibility.
  - As easy as Dublin Core, but appropriate to relate ALM
  - Start with finding aids
  - Add co-reference: manual, automated, Web 2.0
  - Add NLP to recover more events
  - Add more sophisticated relationships

Interoperability of Museum Information: CRM Core metadata elements

Interoperability of Museum Information: Integration with CRM Core (Network View)
[Diagram. Nodes: E84 Information Carrier The "Monument to Balzac" (plaster); E84 Information Carrier The "Monument to Balzac" (S1296); E12 Production Rodin making "Monument to Balzac" in 1898; E12 Production Bronze casting "Monument to Balzac" in 1925; E21 Person Auguste Rodin; E21 Person Honoré de Balzac; E67 Birth Rodin's birth; E69 Death Rodin's death; E40 Legal Body Rudier (Vve Alexis) et Fils; E53 Place France (nation); E52 Time-Span 1840, 1898, 1917, 1925; E55 Type sculptors, plaster, bronze, companies. Properties: P108B was produced by; P62 depicts; P16B was used for; P134 continued; P120B occurs after; P98B was born; P100B died in; P14 carried out by; P7 took place at; P4 has time-span; P2 has type.]

Interoperability of Museum Information: Integration with CRM Core (Metadata View)

Work (CRM Core)
Category = E84 Information Carrier
Classification = sculpture (visual work)
Classification = plaster
Identification = The Monument to Balzac (plaster)
Description = Commissioned to honor one of France's greatest novelists, Rodin spent seven years preparing for Monument to Balzac. When the plaster original was exhibited in Paris in 1898, it was widely attacked. Rodin retired the plaster model to his home in the Paris suburbs. It was not cast in bronze until years after his death.
Event
  Role in Event = P108B was produced by
  Identification = Rodin making Monument to Balzac in 1898
  Event Type = E12 Production
  Participant
    Identification = Rodin, Auguste
    Identification = ID:
    Participant Type = artists
    Participant Type = sculptors
  Date = 1898
  Place = France (nation)
  Related event
    Role in Event = P134B was continued by
    Identification = Bronze casting Monument to Balzac in 1925
Event
  Role in Event = P16B was used for
  Identification = Bronze casting Monument to Balzac in 1925
  Event Type = E12 Production
  Participant
    Identification = Rudier (Vve Alexis) et Fils
    Participant Type = companies
  Thing Present
    Identification = The Monument to Balzac (S.1296)
    Thing Present Type = bronze
    Thing Present Type = sculpture (visual work)
  Date = 1925
  Related event
    Role in Event = P120B occurs after
    Identification = Rodin's death
Relation
  To = Honoré de Balzac
  Relation Type = refers to

Artist (CRM Core)
Category = E21 Person
Classification = artists
Classification = sculptors
Identification = Rodin, Auguste
Identification = ID:
Event
  Role in Event = P98B was born
  Identification = Rodin's birth
  Event Type = E67 Birth
  Date = 1840
Event
  Role in Event = P100B died in
  Identification = Rodin's death
  Event Type = E69 Death
  Date = 1917
  Related event
    Role in Event = P120 occurs before
    Identification = Bronze casting Monument to Balzac in 1925
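The benefit of merging such records can be sketched in a few lines of Python. The event identifications, types and dates are those of the Work and Artist records; the dictionary layout and the function `merge` are assumptions for illustration and not a CRM Core serialization.

```python
# Events exported from the Work record and from the Artist record.
work_events = [
    {"id": "Rodin making Monument to Balzac in 1898",
     "type": "E12 Production", "date": 1898},
    {"id": "Bronze casting Monument to Balzac in 1925",
     "type": "E12 Production", "date": 1925},
]
artist_events = [
    {"id": "Rodin's birth", "type": "E67 Birth", "date": 1840},
    {"id": "Rodin's death", "type": "E69 Death", "date": 1917},
]


def merge(*sources):
    """Merge event descriptions from several records by shared identification."""
    merged = {}
    for source in sources:
        for event in source:
            merged.setdefault(event["id"], {}).update(event)
    return merged


events = merge(work_events, artist_events)
timeline = sorted(events.values(), key=lambda e: e["date"])
order = [e["id"] for e in timeline]

# The "occurs after" relation can now be checked against the merged dates.
cast_after_death = (
    events["Bronze casting Monument to Balzac in 1925"]["date"]
    > events["Rodin's death"]["date"]
)
```

Neither record alone holds both dates, but after the merge the statement "P120B occurs after" between the bronze casting and Rodin's death can be verified from the data.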

The CIDOC CRM: Why an Integration Layer on Top?
- Information acquisition needs:
  - sequence and order, completeness, case-specific language and constraints to guide and control data entry
  - ergonomic documentation units, optimized to specialist needs
  - work-flow on series of analogous items, item-centric
  - low interoperability needs (capability to be mapped!)
- Integration / comprehension needs epistemic networks:
  - break up document boundaries, relate facts to wider context
  - match shared identifiers of items, aggregate alternatives
  - no preferred direction of search, no cardinality constraints
  - high interoperability needs (mapping to a global schema)
- Interpretation, story-telling, hypothesis building:
  - explore context, paths, analogies (orthogonal to data acquisition)
  - present in order, resolve alternatives (enforce constraints)
  - deduction and induction

Epistemic Networks on DLs: Metadata at sources and indirect co-reference links
[Diagram. Layers: Core Ontology (e.g., CIDOC CRM); surrogate nodes (Lucy, Johanson's Expedition, Donald Johanson, Hadar, Ethiopia?) joined by indirect co-reference links; extracted, normalized metadata; Sources.]
- Easy update
- Scalable, peer-to-peer
- Slow querying, concatenation of facts, alternatives management

Interoperability of Museum Information: Conclusions
- Historical information is factual and contextual. Metadata formats for cultural heritage data must be adequate to the scientific discourse.
- We need small thesauri for museums. It is better to invest in gazetteers (place names) and authority files.
- CRM Core already captures a first sensible Museum-Archive-Library connection. Immense benefit over Dublin Core, with similar effort.
- The co-reference problem is widely ignored (or even feared?). Its scale is extraordinary. Traditional KOS and data cleaning are not enough. We need Web 2.0 methods.
- The capacity to link and transform information is crucial for integrating information in the long term, beyond platforms. The CRM shows how to do that. Understand the historical perspective of information.