NSF Cyberinfrastructure Workshop
Metadata, semantic information and ontologies
Lead: Danielle Forsyth
Respondents: Jim Bonner, Bertram Ludaescher
Complex System Modeling and Verification requires vast amounts of Data…
A Discovery Model
Data: What we observe
Information: What we derive
Knowledge: How we explain what we observe and derive
Wisdom: When we ‘become one with’ the processes we collect data about
Does It Really Work That Way?
Does Wisdom Follow Data?
Data → Examination → Hypothesis
Does Data Confirm Wisdom?
Hypothesis → Experiment → Data
Is it a Circular Process?
Hypothesis → Experiment → Data → Examination → (back to Hypothesis)
Data and Information demand Attention
A Wealth of Data Demands a Wealth of Attention
A Wealth of Attention Devoted to Data results in a Dearth of Attention Devoted to Wisdom
Data = Wisdom?
[Diagram, built up over three slides. Labels on the first slide: Observed Reality, Observed Data, Model, Derived Data, Predicted Reality. The second slide adds a "difference" label. The third slide names the model as the Physical Model and adds a Socio-Economic Model, Policy Enablers, Desired Reality, and a second "difference" label.]
Ontologies and Metadata
Support a Data Search Metaphor
Collect metadata with, but not necessarily part of, the data.
Grow the metadata as data is used.
Index the metadata description for search.
Use a rich metadata description language to support inferencing and data mining.
Keep the data where it makes the most sense for collection and processing – distribute the search.
Manipulate the knowledge description (the metadata description) and the metadata at the same time.
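The points above can be sketched in a few lines of Python. This is a minimal illustration, not a design from the workshop: the names `MetadataRecord` and `MetadataIndex`, the file path, and the property keys are all hypothetical. It shows metadata held apart from the data it describes, growing as the data is used, and indexed so that a search reads only the descriptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Description of a dataset; the data itself stays at `location`."""
    location: str                                   # hypothetical path or URI
    properties: dict = field(default_factory=dict)  # free-form metadata


class MetadataIndex:
    """Inverted index over metadata values; the data is never copied here."""

    def __init__(self):
        self._records = []
        self._index = defaultdict(set)  # lowercase term -> record positions

    def add(self, record):
        """Register a record and index its current property values."""
        self._records.append(record)
        pos = len(self._records) - 1
        for value in record.properties.values():
            self._index_value(pos, value)
        return pos

    def annotate(self, pos, key, value):
        """Grow the metadata as the data is used.

        Terms from an overwritten value are not removed in this sketch.
        """
        self._records[pos].properties[key] = value
        self._index_value(pos, value)

    def _index_value(self, pos, value):
        for term in str(value).lower().split():
            self._index[term].add(pos)

    def search(self, term):
        """Return the records whose metadata mentions `term`."""
        hits = self._index.get(term.lower(), ())
        return [self._records[p] for p in sorted(hits)]


# Hypothetical usage: describe a dataset, then annotate it after it is used.
index = MetadataIndex()
pos = index.add(
    MetadataRecord("sensor-site-a/turbidity.csv", {"instrument": "turbidity probe"})
)
index.annotate(pos, "used_in", "sediment transport model")
```

Distributing the search, as the slide suggests, would mean running one such index next to each data store and merging the results; that step is left out here.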
Ontologies
Ontologies are dictionaries of categories and properties
–Dictionaries (namespaces) as policy and organization
–Categories (classes) as conceptual buckets
–Properties as descriptive elements
   Data Properties – serial number, weight, length …
   Relationship Properties – entered by, derived from …
Ontologies have both qualitative (my mud index = your turbidity scale) and quantitative components
–Standards driven by web based text applications and transaction systems do not necessarily meet scientific needs.
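As a rough illustration of the vocabulary above, the sketch below models a dictionary (namespace) holding categories, each with data properties and relationship properties, plus a qualitative mapping between two communities' terms. All names (`Category`, `Dictionary`, `mylab`, `yourlab`, `translate`) are hypothetical, and the quantitative side (converting values between scales) is deliberately omitted because the slide gives no conversion.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """A class in the ontology: a conceptual bucket."""
    name: str
    data_properties: set = field(default_factory=set)          # e.g. weight
    relationship_properties: set = field(default_factory=set)  # e.g. derived_from


@dataclass
class Dictionary:
    """A namespace that groups categories under one owner or policy."""
    namespace: str
    categories: dict = field(default_factory=dict)

    def add(self, category):
        self.categories[category.name] = category

    def qualified(self, name):
        """Return the namespace-qualified name of a known category."""
        return f"{self.namespace}:{self.categories[name].name}"


# Two hypothetical communities describing the same kind of measurement.
mine = Dictionary("mylab")
mine.add(Category("MudIndex", {"serial_number", "value"}, {"entered_by"}))

yours = Dictionary("yourlab")
yours.add(Category("TurbidityScale", {"serial_number", "value"}, {"derived_from"}))

# Qualitative mapping: "my mud index = your turbidity scale".
equivalent = {mine.qualified("MudIndex"): yours.qualified("TurbidityScale")}


def translate(term):
    """Return the equivalent term in the other dictionary, or the term itself."""
    return equivalent.get(term, term)
```

Here `translate("mylab:MudIndex")` resolves to the other community's term, while an unmapped term passes through unchanged.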
Approaches to Ontologies
Published dictionaries
–Fixed, periodically updated
–Flexible and evolving
My copy of the published dictionary
Shared copy of the dictionary
My own dictionary
My community’s dictionary
Approaches to Ontologies
Top Down
–Product
–Processing
–Community (e.g. SWEET)
Data/Problem Driven
–Application
–Problem
Design or build with search in mind.
Requirements
Industry standard, machine readable and semantically rich descriptions that support:
–machine based inferencing and reasoning
–community and researcher based knowledge building/sharing
–knowledge mapping and re-use
–an approach that allows for context appropriate and policy based access to data and knowledge
–access to data and knowledge by broader communities within government, industry and policy
Allow community to leverage broader industry efforts
–Problem/process centric approach
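To make "machine based inferencing" concrete, here is a small sketch that derives new provenance facts from stated ones by treating a relationship property as transitive. It is an illustration only: the function `infer_transitive`, the triple layout, and the example statements are assumptions, and the slides do not name a description language or reasoner.

```python
def infer_transitive(triples, predicate):
    """Return the triples plus every fact implied by a transitive `predicate`."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current links so `facts` can grow inside the loops.
        links = [(s, o) for s, p, o in facts if p == predicate]
        for s, mid in links:
            for mid2, o in links:
                if mid == mid2 and (s, predicate, o) not in facts:
                    facts.add((s, predicate, o))
                    changed = True
    return facts


# Hypothetical provenance statements as (subject, predicate, object) triples.
stated = {
    ("forecast", "derived_from", "model_output"),
    ("model_output", "derived_from", "sensor_readings"),
    ("sensor_readings", "entered_by", "field_team"),
}
inferred = infer_transitive(stated, "derived_from")
```

After the call, `inferred` also contains the fact that the forecast is derived from the sensor readings, which nobody stated directly. The `entered_by` statement is carried through untouched because only the named predicate is treated as transitive.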