
1 Brain Data & Knowledge Grid (or: Towards Services for Knowledge-Based Mediation of Neuroscience Information Sources)
National Center for Microscopy and Imaging Research (NCMIR): Mark Ellisman, Maryann Martone, Steve Peltier, Steve Lamont, ...
Data-Intensive Computing Environments, San Diego Supercomputer Center (SDSC): Reagan Moore, Chaitan Baru, Amarnath Gupta, Bertram Ludäscher, Richard Marciano, Arcot Rajasekar, Ilya Zaslavsky, ...
University of California, San Diego

2 Infrastructure for Sharing Neuroscience Data
SOURCES:
–NCMIR, U.C. San Diego
–Caltech Neuroimaging
–Center for Imaging Science, Johns Hopkins
–Center for Computational Biology (CCB), Montana State
–Laboratory of Neuro Imaging (LONI), UCLA
–Computational Neurobiology Laboratory (CNL), Salk Inst.
–Van Essen Laboratory, Washington University
–…
[Figure labels: CCB, Montana SU; surface atlas, Van Essen Lab; NCMIR, UCSD; stereotaxic atlas, LONI; MCell, CNL, Salk]
Data Management Infrastructure (DICE/NPACI):
–MIX: Mediation in XML
–MCAT: information discovery
–SRB: data handling
–HPSS: storage
–...
Knowledge-based GRID infrastructure: ? ? ? ?
Data Management Infrastructure (“Data Grid”): GTOMO, Telemicroscopy, Globus, SRB/MCAT, HPSS

3 Sharing Resources on the Brain Data Grid
Scientific groups...
–create data products (e.g., text data, images, simulation data, …)
–put them in collections
–add metadata (who created it, what the data is about, …)
–make them available for sharing (on the web, in data caches, in HPSS, …)
Technical challenges...
–size & packaging of data
–heterogeneity: data types, storage technologies, transport mechanisms, authentication, ...
–access levels: collection, object, fragment; data-specific functions (“data blades”)
Data Grid technologies can help...
–distributed data management, e.g., Storage Resource Broker/Metadata Catalog (SRB/MCAT), computing (Globus), ...
–focus is on resource sharing (data, networks, cycles)

4 Integration Issue: Semantic Integration/Mediation
??? SEMANTIC INTEGRATION ???
SYNTACTIC/STRUCTURAL Integration (MIX: Mediation of Information using XML)
–Integrated Views (Src-XML => Intgr-XML)
–Schema Integration (DTD => DTD)
–Wrapping, Data Extraction (Text => XML)
SYSTEM INTEGRATION (storage, query capabilities, protocols & services)
–SRB/MCAT, TCP/IP, grid-ftp, HTTP, Globus, JDBC, DOM, CORBA
–Distributed Query Processing

5 Standard Mediator/Wrapper Architecture
[Figure: Client/User-Query -> INTEGRATED VIEW (integration logic) -> Wrappers (protocol translation, XML Q/A) -> (Neuro)Science (Re)Sources: DB, Files, WWW at Lab1, Lab2, Lab3]
–structure, transport, syntax, storage: SRB/MCAT, DOM, X(ML)Query
–domain semantics: ???
–GRID federation services: ???

6 The Need for Semantic Integration
Example query: What is the cerebellar distribution of rat proteins with more than 70% homology with human NCS-1? Any structure specificity? How about other rodents?
[Figure: ??? Mediator ??? with ??? Integrated View ??? and ??? Integrated View Definition ??? over wrapped sources: protein localization, morphometry, neurotransmission, Web (CaBP, Expasy)]
–Data, relationships, constraints are modeled (CMs)
–Cross-source relationships are modeled
–Semantic (knowledge-based) mediation services
–Cross-source queries

7 Hidden Semantics: Protein Localization
[Figure: source record for RyR … with values spine 0, branchlet 30; annotations: Molecular layer of Cerebellar Cortex, Purkinje Cell layer of Cerebellar Cortex, Fragment of dendrite]

8 Hidden Semantics: Morphometry
[Figure: source measurements … 12.348 1.93 4.47 9.884 7.930 4.47 1.79 …]
–Branch level beyond 4 is a branchlet
–Must be dendritic because Purkinje cells don’t have somatic spines
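The two unstated rules on this slide can be written down explicitly, which is what a mediator needs before it can use them. The following is a minimal sketch, not the authors' implementation: the function names, the constant name, and the string labels are hypothetical, and only the threshold of 4 and the Purkinje-cell rule come from the slide.

```python
from typing import Optional

# Threshold taken from the slide: "Branch level beyond 4 is a branchlet".
BRANCHLET_LEVEL_THRESHOLD = 4


def is_branchlet(branch_level: int) -> bool:
    """Return True if a dendritic branch at this level counts as a branchlet."""
    return branch_level > BRANCHLET_LEVEL_THRESHOLD


def spine_compartment(cell_type: str) -> Optional[str]:
    """Infer which compartment a spine belongs to, when a rule is known.

    The slide states only one rule: Purkinje cells have no somatic spines,
    so a Purkinje-cell spine must be dendritic. For any other cell type this
    sketch returns None instead of guessing.
    """
    if cell_type == "purkinje_cell":  # hypothetical label
        return "dendrite"
    return None


if __name__ == "__main__":
    for level in (3, 4, 5):
        print(level, is_branchlet(level))
    print(spine_compartment("purkinje_cell"))
```

Keeping such rules as named, testable functions (or as logic rules, as the later slides do) is what turns "hidden" semantics into something a query processor can apply.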

9 Knowledge-Based (Semantic) Mediation
Multiple Worlds Integration Problem:
–compatible terms not directly joinable
–complex, indirect associations among attributes
–unstated integrity constraints
Approach:
–a “theory” under which terms can be “semantically joined”
=> lift mediation to the level of conceptual models (CMs)
=> formalize domain knowledge; ICs become rules over CMs
=> Knowledge-Based/Model-Based (Semantic) Mediation

10 XML-Based vs. Model-Based Mediation
XML-based mediation:
–Raw Data -> XML Elements -> XML Models
–Integrated-DTD := XML-QL(Src1-DTD, ...)
–Structural Constraints (DTDs): A = (B*|C),D; B = ...; Parent, Child, Sibling, ...
–No Domain Constraints
Model-based mediation:
–Raw Data -> (XML) Objects -> Conceptual Models
–Integrated-CM := CM-QL(Src1-CM, ...)
–Classes, Relations, is-a, has-a, ...; DOMAIN MAP [Figure: classes C1, C2, C3 and relation R]
–Logical Domain Constraints: IF ... THEN ...
CM ~ {Descr. Logic, ER, UML, RDF/XML(-Schema), …}
CM-QL ~ {F-Logic, OIL, DAML, …}

11 Knowledge-Based Mediator Prototype
[Figure, top to bottom]
–USER/Client: CM Queries & Results (exchanged in XML)
–Mediator: CM (Integrated View), Domain Map (DM), Integrated View Definition (IVD), Logic API (capabilities)
–Mediator Engine: FL rule proc., LP rule proc., Graph proc., XSB Engine
–CM Plug-In: GCM CM S1, GCM CM S2, GCM CM S3
–Wrappers: XML-Wrapper and CM-Wrapper for each source
–Sources: S1, S2, S3

12 Mediation Services: Source Registration (System Issues)
–Source Data Type: table, tree, file
–Access Protocol: SRB, HTTP, JDBC
–Query Capability: Selections, SPJ, SQL, XML QL, DOOD, ARC
–Result Delivery: Tuple-at-a-time, Set-at-a-time, Stream, Binary for Viewer
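A registration record along these four dimensions could look as follows. This is a sketch under stated assumptions: the class names, the `supports_join` helper, and the example source are hypothetical, the option values are the labels from the slide, and DOOD and ARC are carried as opaque labels because the slide does not define them.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    TABLE = "table"
    TREE = "tree"
    FILE = "file"


class AccessProtocol(Enum):
    SRB = "SRB"
    HTTP = "HTTP"
    JDBC = "JDBC"


class QueryCapability(Enum):
    SELECTIONS = "Selections"
    SPJ = "SPJ"  # select-project-join
    SQL = "SQL"
    XML_QL = "XML QL"
    DOOD = "DOOD"  # label from the slide, not interpreted here
    ARC = "ARC"    # label from the slide, not interpreted here


class ResultDelivery(Enum):
    TUPLE_AT_A_TIME = "Tuple-at-a-time"
    SET_AT_A_TIME = "Set-at-a-time"
    STREAM = "Stream"
    BINARY_FOR_VIEWER = "Binary for Viewer"


@dataclass(frozen=True)
class SourceRegistration:
    """What a source tells the mediator about itself (system level only)."""
    name: str
    data_type: DataType
    access_protocol: AccessProtocol
    query_capability: QueryCapability
    result_delivery: ResultDelivery


def supports_join(source: SourceRegistration) -> bool:
    """Conservative check: only SPJ and SQL are treated as join-capable here."""
    return source.query_capability in {QueryCapability.SPJ, QueryCapability.SQL}


if __name__ == "__main__":
    # Hypothetical registration, for illustration only.
    example = SourceRegistration(
        name="example_source",
        data_type=DataType.TABLE,
        access_protocol=AccessProtocol.JDBC,
        query_capability=QueryCapability.SQL,
        result_delivery=ResultDelivery.SET_AT_A_TIME,
    )
    print(example.name, supports_join(example))
```

A mediator can use such a record for capability-based rewriting: a join is pushed to the source only when `supports_join` holds, and is otherwise evaluated at the mediator.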

13 Mediation Services: Source Registration (Semantics Issues)
Domain Map Registration
–provide concept space/ontology
… as a private object (“myANATOM”)
… merge with others (give “semantic bridges”)
… and check for conflicts
Conceptual Model Registration
–schema: classes, associations, attributes
–domain constraints
–“put data into context” (linking data to the domain map)

14 ANATOM Domain Map
[Figure: ANATOM domain map]

15 Senselab (Yale) and NCMIR (UCSD) “Semantic Bridge”
anatom_dom(X) :- (ucsd_has_a(X,_) ; ucsd_has_a(_,X) ; ucsd_isa(X,_) ; ucsd_isa(_,X)).
senselab_dom(X) :- (sl_has_a(X,_) ; sl_has_a(_,X) ; sl_isa(X,_) ; sl_isa(_,X)).

% map Senselab anatom terms to equivalent UCSD ANATOM
sl2ucsd(X,X) :- senselab_dom(X), anatom_dom(X).
sl2ucsd('A',axon).
sl2ucsd('AH',axon).
sl2ucsd('Dad',spiny_branchlet).  % should map to a PATH not just the end of the path
sl2ucsd('Dam',main_branches).    % some of the main_branches based on the branch level
sl2ucsd('Dap',main_branches).
sl2ucsd('Dbd',spiny_branchlet).
sl2ucsd('Dbm',main_branches).
sl2ucsd('Dbp',main_branches).
sl2ucsd('Ded',spiny_branchlet).
sl2ucsd('Dem',main_branches).
sl2ucsd('Dep',main_branches).
sl2ucsd('T',axon).

% keep has_a edge if at least one node is known from UCSD
has_a(X,Y) :- sl2ucsd(_,X), ucsd_has_a(X,Y).
has_a(X,Y) :- sl2ucsd(_,Y), ucsd_has_a(X,Y).

% keep all and only UCSD is_a rels
isa(X,Y) :- ucsd_isa(X,Y).
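For readers less familiar with Prolog, the bridge rules can be restated over plain sets. This is a hedged sketch: the explicit term mappings are copied from the slide, while the function names and the small `ucsd_has_a` edge set are hypothetical toy data, not the real ANATOM map.

```python
# Explicit Senselab -> UCSD ANATOM term mappings, copied from the slide.
SL2UCSD = {
    "A": "axon", "AH": "axon", "T": "axon",
    "Dad": "spiny_branchlet", "Dbd": "spiny_branchlet", "Ded": "spiny_branchlet",
    "Dam": "main_branches", "Dap": "main_branches",
    "Dbm": "main_branches", "Dbp": "main_branches",
    "Dem": "main_branches", "Dep": "main_branches",
}


def bridged_terms(explicit, senselab_terms, ucsd_terms):
    """UCSD terms reachable through sl2ucsd/2.

    Mirrors the two kinds of sl2ucsd clauses: the identity mapping for terms
    present in both domains, plus the explicit facts.
    """
    return set(explicit.values()) | (set(senselab_terms) & set(ucsd_terms))


def bridge_has_a(ucsd_has_a, known):
    """Keep a UCSD has_a edge if at least one endpoint is a bridged term."""
    return {(x, y) for (x, y) in ucsd_has_a if x in known or y in known}


if __name__ == "__main__":
    # Toy edges for illustration only.
    ucsd_has_a = {
        ("purkinje_cell", "dendrite"),
        ("dendrite", "spiny_branchlet"),
        ("spiny_branchlet", "spine"),
        ("cerebellar_cortex", "purkinje_cell_layer"),
    }
    ucsd_terms = {term for edge in ucsd_has_a for term in edge}
    known = bridged_terms(SL2UCSD, {"dendrite"}, ucsd_terms)
    for edge in sorted(bridge_has_a(ucsd_has_a, known)):
        print(edge)
```

As in the Prolog version, is_a relationships are taken from UCSD unchanged, so the sketch needs no separate function for them.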

16 Refinement of a Domain Map (Ontology): Putting Data in Context via Registration of new Classes & Relationships
[Figure: domain map with nodes Neuron, Spiny Neuron, Medium Spiny Neuron, Compartment, Axon, Soma, Dendrite, Neurotransmitter, GABA, Substance P, Dopamine R, Neostriatum, Substantia Nigra Pc, Substantia Nigra Pr, Globus Pallidus Int., Globus Pallidus Ext., and the classes MyNeuron and MyDendrite; edge legend: OR, AND, ALL:has, = exp]

17 Mediation Services: Integrated View Definition
–declarative language (here: Frame-logic)
–provided by the domain expert and mediation engineer
DERIVE protein_distribution(Protein, Organism, Brain_region, Feature_name, Anatom, Value)
FROM
  I:protein_label_image[ proteins ->> {Protein};
                         organism -> Organism;
                         anatomical_structures ->> {AS:anatomical_structure[name->Anatom]} ],  % from PROLAB
  NAE:neuro_anatomic_entity[ name->Anatom;                                                    % from ANATOM
                             located_in->>{Brain_region} ],
  AS..segments..features[ name->Feature_name; value->Value ].
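Read operationally, this view is a join between PROLAB image objects and ANATOM entities on the shared anatomical name. The sketch below spells that out over nested dictionaries. It is an illustration only: the dictionary layout is a hypothetical encoding of the F-logic objects, the function name simply reuses the view name, and the sample records are toy data.

```python
def protein_distribution(images, anatom_entities):
    """Enumerate (Protein, Organism, Brain_region, Feature_name, Anatom, Value) tuples."""
    # name -> brain regions the entity is located in (the ANATOM side of the join)
    located_in = {}
    for entity in anatom_entities:
        located_in.setdefault(entity["name"], set()).update(entity["located_in"])

    rows = []
    for image in images:                                   # I:protein_label_image
        for protein in image["proteins"]:                  # proteins ->> {Protein}
            for struct in image["anatomical_structures"]:  # AS:anatomical_structure
                anatom = struct["name"]                    # join key: name -> Anatom
                for region in sorted(located_in.get(anatom, ())):
                    for segment in struct["segments"]:     # AS..segments
                        for feature in segment["features"]:  # ..features
                            rows.append((protein, image["organism"], region,
                                         feature["name"], anatom, feature["value"]))
    return rows


if __name__ == "__main__":
    # Toy records for illustration only.
    images = [{
        "proteins": ["RyR"],
        "organism": "rat",
        "anatomical_structures": [
            {"name": "spine",
             "segments": [{"features": [{"name": "protein_amount", "value": 0}]}]},
            {"name": "branchlet",
             "segments": [{"features": [{"name": "protein_amount", "value": 30}]}]},
        ],
    }]
    anatom_entities = [
        {"name": "spine", "located_in": ["molecular_layer"]},
        {"name": "branchlet", "located_in": ["molecular_layer"]},
    ]
    for row in protein_distribution(images, anatom_entities):
        print(row)
```

Structures whose name is unknown to ANATOM produce no rows, which matches the join semantics of the F-logic rule body.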

18 Example Query Evaluation (I)
Example: protein_distribution
–given: organism, protein, brain_region
–Use DOMAIN-KNOWLEDGE-BASE:
   recursively traverse the has_a_star paths under brain_region
   collect all anatomical_entities
–Source PROLAB:
   join with anatomical structures and collect the value of attribute “image.segments.features.feature.protein_amount”
   where “image.segments.features.feature.protein_name” = protein and “study_db.study.animal.name” = organism
–Mediator:
   aggregate over all parents up to brain_region
   report distribution
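The three evaluation steps can be sketched end to end as follows. Several things here are assumptions rather than facts from the slides: the aggregate is taken to be a sum (the slide only says "aggregate"), measurements are flattened to `(protein, organism, entity, amount)` tuples, and all names and sample data are hypothetical.

```python
from collections import defaultdict


def has_a_star(has_a, root):
    """All entities reachable from root via one or more has_a edges."""
    children = defaultdict(set)
    for parent, child in has_a:
        children[parent].add(child)
    seen, stack = set(), [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def protein_distribution(has_a, measurements, protein, organism, brain_region):
    """Sum protein amounts per entity, rolled up through parents to brain_region."""
    # Step 1 (domain knowledge base): entities under brain_region.
    scope = has_a_star(has_a, brain_region) | {brain_region}
    parents = defaultdict(set)
    for parent, child in has_a:
        if parent in scope and child in scope:
            parents[child].add(parent)

    totals = defaultdict(float)
    for m_protein, m_organism, entity, amount in measurements:
        # Step 2 (source): select by protein and organism, join on entity.
        if (m_protein, m_organism) != (protein, organism) or entity not in scope:
            continue
        # Step 3 (mediator): credit the entity and each ancestor exactly once.
        credited, stack = {entity}, [entity]
        while stack:
            for parent in parents[stack.pop()]:
                if parent not in credited:
                    credited.add(parent)
                    stack.append(parent)
        for node in credited:
            totals[node] += amount
    return dict(totals)


if __name__ == "__main__":
    # Toy domain map and measurements, for illustration only.
    has_a = {
        ("cerebellar_cortex", "molecular_layer"),
        ("molecular_layer", "branchlet"),
        ("branchlet", "spine"),
    }
    measurements = [
        ("RyR", "rat", "spine", 0.0),
        ("RyR", "rat", "branchlet", 30.0),
        ("RyR", "mouse", "branchlet", 99.0),
    ]
    print(protein_distribution(has_a, measurements, "RyR", "rat", "cerebellar_cortex"))
```

Crediting each ancestor once per measurement keeps the roll-up correct even when the domain map is a graph rather than a tree, since an entity may then be reachable from the same ancestor along several paths.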

19 Example Query Evaluation (II)
"How does the parallel fiber output (Yale/SENSELAB) relate to the distribution of Ryanodine Receptors (UCSD/NCMIR)?"
@SENSELAB: X1 := select output from parallel fiber;
@MEDIATOR: X2 := “hang off” X1 from Domain Map;
@MEDIATOR: X3 := subregion-closure(X2);
@NCMIR: X4 := select PROT-data(X3, Ryanodine Receptors);
@MEDIATOR: X5 := compute aggregate(X4);

20 Mediation Services: Client Registration
[Figure labels: Client; Update Client; Fat Result Viewer; Query Client; Check Data; Merge Before Insert; Derive Before Insert; Client-side Buffer; Client-side Processing; Navigate/Ad-hoc Query Capability; Query on Schema; Thin Result Viewer; Send Full Data; Server-side Buffer; Context Sensitive; Server-Push/Client-Pull]

21 Example Client: Query Formulation and Result Display
–combination of ad hoc and navigational queries
–client-side visualization (left)
–results are shown in semantic context (right)

22 Mediation Services: Semantic Annotation Tools
line drawing ==annotation==> (spatial) database for mediation

23 Mediator Architecture Blueprint
Mediator Layer (Mediation Services; components: Optimizer, Model Reasoner, Deductive Engine)
–Query formulation: user query, integrated view definition
–Query processing: view unfolding, semantic optimization, capability-based rewriting
–Source registration: domain knowledge, model & schema, query & computation capabilities
–Source model lifting: domain knowledge reconciliation, model transformation
Wrapper Layer
–Query interface (down API): SDLIP, SOAP, ...; (subsets of) SQL, X(ML)-Query, CPL, ...; DOM; SRB-based access
–Result delivery interface (up API): SDLIP, SOAP, ...; pull (tuple/set-at-a-time, DOM) vs. push (stream); synchronous/asynchronous; direct data/data reference
Sources: XML Sources, RDB Sources, File Sources, HTML Sources, Digital Libraries (Collections), Spatial Sources
[Figure labels: Boston Univ., NCMIR UCSD, Yale Univ., Montana Univ., SDLIP, ARC IMS]

24 Coming up: Knowledge-Based/Semantic Mediation of Brain Data
[Figure: sources (CCB, Montana SU; surface atlas, Van Essen Lab; NCMIR, UCSD; stereotaxic atlas, LONI; MCell, CNL, Salk) -> Knowledge-Based Mediation with ANATOM and PROTLOC -> Result (VML/SVG), Result (XML/XSLT)]

25 Some Open Issues
Data/Knowledge Modeling
–Extensibility: how to handle a source with new data types and operations?
   Temporal Data: instrument readings, video microscopy
   Spatial Data: integrating with spatial database systems
   Image database systems
–Conflict Management
   Grades of certainty
   Alternate hypotheses
Integrating Services
–Registration and warping of my image slice to a reference
Integrating into Larger Applications
–M-Cell simulation
–Telemicroscopy
–Visualization

26 References
–Model-Based Mediation with Domain Maps, Bertram Ludäscher, Amarnath Gupta, Maryann Martone, Intl. Conference on Data Engineering (ICDE), Heidelberg, 2001.
–Knowledge-Based Mediation of Heterogeneous Neuroscience Information Sources, Amarnath Gupta, Bertram Ludäscher, Maryann Martone, Intl. Conference on Scientific and Statistical Databases (SSDBM), Berlin, 2000.
–Model-Based Information Integration in a Neuroscience Mediator System, Bertram Ludäscher, Amarnath Gupta, Maryann Martone, Intl. Conference on Very Large Data Bases (VLDB), Cairo, 2000.

