Published by Easter Miller. Modified over 9 years ago.
1
Brain Data & Knowledge Grid (or: Towards Services for Knowledge-Based Mediation of Neuroscience Information Sources)
National Center for Microscopy and Imaging Research (NCMIR): Mark Ellisman, Maryann Martone, Steve Peltier, Steve Lamont, ...
Data-Intensive Computing Environments, San Diego Supercomputer Center (SDSC): Reagan Moore, Chaitan Baru, Amarnath Gupta, Bertram Ludäscher, Richard Marciano, Arcot Rajasekar, Ilya Zaslavsky, ...
University of California, San Diego
2
Infrastructure for Sharing Neuroscience Data
Figure labels: CCB, Montana SU; surface atlas, Van Essen Lab; NCMIR, UCSD; stereotaxic atlas, LONI; MCell, CNL, Salk
SOURCES:
– NCMIR, U.C. San Diego
– Caltech Neuroimaging
– Center for Imaging Science, Johns Hopkins
– Center for Computational Biology, Montana State
– Laboratory of Neuro Imaging (LONI), UCLA
– Computational Neurobiology Laboratory, Salk Inst.
– Van Essen Laboratory, Washington University
– …
Data Management Infrastructure (DICE/NPACI):
– MIX: Mediation in XML
– MCAT: information discovery
– SRB: data handling
– HPSS: storage
– ...
Knowledge-based GRID infrastructure ???
Data Management Infrastructure (“Data Grid”): GTOMO, Telemicroscopy, Globus, SRB/MCAT, HPSS
3
Sharing Resources on the Brain Data Grid
Scientific groups...
– create data products (e.g., text data, images, simulation data, …)
– put them in collections
– add metadata (who created it, what the data is about, …)
– make them available for sharing (on the web, in data caches, in HPSS, …)
Technical challenges...
– size & packaging of data
– heterogeneity: data types, storage technologies, transport mechanisms, authentication, ...
– access levels: collection, object, fragment; data-specific functions (“data blades”)
Data Grid technologies can help...
– distributed data management, e.g., Storage Resource Broker/Metadata Catalog (SRB/MCAT), computing (Globus), ...
– focus is on resource sharing (data, networks, cycles)
4
Integration Issue: Semantic Integration/Mediation
??? SEMANTIC INTEGRATION ???
SYNTACTIC/STRUCTURAL Integration (MIX: Mediation of Information using XML)
– Integrated Views (Src-XML => Intgr-XML)
– Schema Integration (DTD => DTD)
– Wrapping, Data Extraction (Text => XML)
SYSTEM INTEGRATION (storage, query capabilities, protocols & services)
– SRB/MCAT, TCP/IP, grid-ftp, HTTP
– Distributed Query Processing, Globus, JDBC, DOM, CORBA
5
Standard Mediator/Wrapper Architecture
Diagram labels:
– Client/User-Query
– INTEGRATED VIEW; integration logic; GRID federation services ???
– XML Q/A; protocol translation
– Wrapper
– (Neuro)Science (Re)Sources: DB, Files, WWW (Lab1, Lab2, Lab3)
– SRB/MCAT, DOM, X(ML)Query: structure, transport, syntax, storage
– domain semantics ???
6
The Need for Semantic Integration
Example query: What is the cerebellar distribution of rat proteins with more than 70% homology with human NCS-1? Any structure specificity? How about other rodents?
Sources (each behind a Wrapper): protein localization, morphometry, neurotransmission, Web (CaBP, Expasy)
??? Mediator ???
??? Integrated View ???
??? Integrated View Definition ???
– Data, relationships, constraints are modeled (CMs)
– Cross-source relationships are modeled
– Semantic (knowledge-based) mediation services
– Cross-source queries
7
Hidden Semantics: Protein Localization
Data excerpt: RyR …. spine 0 branchlet 30
Figure labels: Molecular layer of Cerebellar Cortex; Purkinje Cell layer of Cerebellar Cortex; Fragment of dendrite
8
Hidden Semantics: Morphometry
Data excerpt: … 12.348 1.93 4.47 9.884 7.930 4.47 1.79 …
– Branch level beyond 4 is a branchlet
– Must be dendritic because Purkinje cells don’t have somatic spines
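The two implicit rules on this slide can be stated explicitly so a mediator could apply them. The following is a minimal Python sketch, assuming branch levels are integers and that "beyond 4" means level 5 or higher; the function names and the cell-type string are hypothetical and do not come from the slides.

```python
from typing import Optional

# Assumption: "branch level beyond 4" is read as level >= 5.
BRANCHLET_MIN_LEVEL = 5


def is_branchlet(branch_level: int) -> bool:
    """Rule 1: a branch whose level is beyond 4 is a branchlet."""
    return branch_level >= BRANCHLET_MIN_LEVEL


def spine_compartment(cell_type: str) -> Optional[str]:
    """Rule 2: Purkinje cells have no somatic spines, so a spine is dendritic."""
    if cell_type == "purkinje_cell":
        return "dendrite"
    return None  # the slide states no rule for other cell types
```

Encoding the rules as functions makes the hidden knowledge inspectable; in the deck itself such knowledge is expressed as logic rules over conceptual models.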
9
Knowledge-Based (Semantic) Mediation
Multiple Worlds Integration Problem:
– compatible terms not directly joinable
– complex, indirect associations among attributes
– unstated integrity constraints
Approach:
– a “theory” under which terms can be “semantically joined”
=> lift mediation to the level of conceptual models (CMs)
=> formalize domain knowledge; ICs become rules over CMs
=> Knowledge-Based/Model-Based (Semantic) Mediation
10
XML-Based vs. Model-Based Mediation
XML-based mediation:
– Raw Data, XML Elements, XML Models
– Structural Constraints (DTDs): Parent, Child, Sibling, ...; e.g. A = (B*|C),D B = ...
– No Domain Constraints
– Integrated-DTD := XML-QL(Src1-DTD, ...)
Model-based mediation:
– Raw Data, (XML) Objects, Conceptual Models
– Classes, Relations, is-a, has-a, ...
– DOMAIN MAP (figure: concepts C1, C2, C3 and relation R)
– Logical Domain Constraints (IF ... THEN ...)
– Integrated-CM := CM-QL(Src1-CM, ...)
CM ~ {Descr.Logic, ER, UML, RDF/XML(-Schema), …}
CM-QL ~ {F-Logic, OIL, DAML, …}
11
Knowledge-Based Mediator Prototype
Diagram labels:
– USER/Client: CM Queries & Results (exchanged in XML)
– CM (Integrated View)
– Mediator Engine: FL rule proc., LP rule proc., Graph proc., XSB Engine
– Domain Map (DM)
– Integrated View Definition (IVD)
– Logic API (capabilities)
– CM Plug-In
– Sources S1, S2, S3, each with: XML-Wrapper, CM-Wrapper, GCM, CM
12
Mediation Services: Source Registration (System Issues)
Source
– Data Type: table, tree, file
– Access Protocol: SRB, HTTP, JDBC
– Query Capability: SQL, XML QL, DOOD, ARC, Selections, SPJ
– Result Delivery: Tuple-at-a-time, Set-at-a-time, Stream, Binary for Viewer
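A registration record along these lines could capture the system-level properties listed above. This is a minimal Python sketch, not the prototype's actual data model; the class names, field names and the example source are all hypothetical, and only the enumerated values are taken from the slide.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class DataType(Enum):
    TABLE = "table"
    TREE = "tree"
    FILE = "file"


class AccessProtocol(Enum):
    SRB = "SRB"
    HTTP = "HTTP"
    JDBC = "JDBC"


class ResultDelivery(Enum):
    TUPLE_AT_A_TIME = "tuple-at-a-time"
    SET_AT_A_TIME = "set-at-a-time"
    STREAM = "stream"
    BINARY_FOR_VIEWER = "binary for viewer"


@dataclass(frozen=True)
class SourceRegistration:
    """System-level description a source hands to the mediator."""
    name: str
    data_type: DataType
    protocol: AccessProtocol
    query_capabilities: FrozenSet[str]  # e.g. {"Selections", "SPJ", "SQL"}
    delivery: ResultDelivery

    def supports(self, capability: str) -> bool:
        """True if the source declared the given query capability."""
        return capability in self.query_capabilities


# Hypothetical example; the slides do not say what any real source declared.
example = SourceRegistration(
    name="example_lab_db",
    data_type=DataType.TABLE,
    protocol=AccessProtocol.JDBC,
    query_capabilities=frozenset({"Selections", "SPJ"}),
    delivery=ResultDelivery.SET_AT_A_TIME,
)
```

A mediator could consult `supports` when deciding which parts of a query to push to a source and which to evaluate itself.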
13
Mediation Services: Source Registration (Semantics Issues)
Domain Map Registration
– provide concept space/ontology
… as a private object (“myANATOM”)
… merge with others (give “semantic bridges”)
… and check for conflicts
Conceptual Model Registration
– schema: classes, associations, attributes
– domain constraints
– “put data into context” (linking data to the domain map)
14
ANATOM Domain Map (figure)
15
Senselab (Yale) and NCMIR (UCSD) “Semantic Bridge”

anatom_dom(X) :-
    ( ucsd_has_a(X,_) ; ucsd_has_a(_,X) ; ucsd_isa(X,_) ; ucsd_isa(_,X) ).
senselab_dom(X) :-
    ( sl_has_a(X,_) ; sl_has_a(_,X) ; sl_isa(X,_) ; sl_isa(_,X) ).

% map Senselab anatom terms to equivalent UCSD ANATOM
sl2ucsd(X,X) :- senselab_dom(X), anatom_dom(X).
sl2ucsd('A',axon).
sl2ucsd('AH',axon).
sl2ucsd('Dad',spiny_branchlet).   % should map to a PATH not just the end of the path
sl2ucsd('Dam',main_branches).     % some of the main_branches based on the branch level
sl2ucsd('Dap',main_branches).
sl2ucsd('Dbd',spiny_branchlet).
sl2ucsd('Dbm',main_branches).
sl2ucsd('Dbp',main_branches).
sl2ucsd('Ded',spiny_branchlet).
sl2ucsd('Dem',main_branches).
sl2ucsd('Dep',main_branches).
sl2ucsd('T',axon).

% keep has_a edge if at least one node is known from UCSD
has_a(X,Y) :- sl2ucsd(_,X), ucsd_has_a(X,Y).
has_a(X,Y) :- sl2ucsd(_,Y), ucsd_has_a(X,Y).

% keep all and only UCSD is_a rels
isa(X,Y) :- ucsd_isa(X,Y).
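For readers less familiar with Prolog, the has_a rules of the bridge can be paraphrased in Python. This is a sketch under stated assumptions: the sample UCSD edges are invented for illustration, only a few of the explicit sl2ucsd facts are copied, and the identity rule sl2ucsd(X,X) for shared terms is left out.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]

# A few of the explicit Senselab -> UCSD term mappings from the slide.
SL2UCSD: Dict[str, str] = {
    "A": "axon",
    "AH": "axon",
    "Dad": "spiny_branchlet",
    "Dam": "main_branches",
    "T": "axon",
}

# Hypothetical UCSD ANATOM has_a edges, for illustration only.
UCSD_HAS_A: Set[Edge] = {
    ("purkinje_cell", "axon"),
    ("dendrite", "spiny_branchlet"),
    ("dendrite", "main_branches"),
    ("cerebellum", "purkinje_cell_layer"),
}


def bridged_has_a(ucsd_edges: Set[Edge], mapping: Dict[str, str]) -> Set[Edge]:
    """Keep a UCSD has_a edge if at least one endpoint is a mapped term."""
    known = set(mapping.values())
    return {(x, y) for (x, y) in ucsd_edges if x in known or y in known}


bridge_edges = bridged_has_a(UCSD_HAS_A, SL2UCSD)
```

With the sample data, the edge between cerebellum and purkinje_cell_layer is dropped because neither endpoint is the target of a Senselab mapping, which mirrors the "at least one node is known" comment in the rules.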
16
Refinement of a Domain Map (Ontology): Putting Data in Context via Registration of new Classes & Relationships
Figure node labels: Neuron; Spiny Neuron; Medium Spiny Neuron; MyNeuron; Compartment; Axon; Soma; Dendrite; MyDendrite; Neurotransmitter; GABA; Dopamine R; Substance P; Neostriatum; Substantia Nigra Pc; Substantia Nigra Pr; Globus Pallidus Int.; Globus Pallidus Ext.
Figure edge legend: OR; ALL:has; AND; = exp
17
Mediation Services: Integrated View Definition

DERIVE protein_distribution(Protein, Organism, Brain_region, Feature_name, Anatom, Value)
FROM I:protein_label_image[ proteins ->> {Protein};
         organism -> Organism;
         anatomical_structures ->> {AS:anatomical_structure[name->Anatom]}],   % from PROLAB
     NAE:neuro_anatomic_entity[name->Anatom;                                   % from ANATOM
         located_in->>{Brain_region}],
     AS..segments..features[name->Feature_name; value->Value].

– provided by the domain expert and mediation engineer
– declarative language (here: Frame-logic)
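The view definition is essentially a join between PROLAB image records and ANATOM entities on the anatomical name. The Python sketch below paraphrases it over plain dictionaries; the record layout and the sample values are assumptions made for illustration, not the sources' actual schemas.

```python
from typing import Dict, Iterator, List, Tuple

Row = Tuple[str, str, str, str, str, float]


def protein_distribution(images: List[dict], entities: List[dict]) -> Iterator[Row]:
    """Yield (Protein, Organism, Brain_region, Feature_name, Anatom, Value) rows."""
    # Index ANATOM entities by name so the join on Anatom is a lookup.
    located_in: Dict[str, List[str]] = {}
    for entity in entities:
        located_in.setdefault(entity["name"], []).extend(entity["located_in"])

    for image in images:
        for structure in image["anatomical_structures"]:
            anatom = structure["name"]
            for region in located_in.get(anatom, []):
                for segment in structure["segments"]:
                    for feature in segment["features"]:
                        for protein in image["proteins"]:
                            yield (protein, image["organism"], region,
                                   feature["name"], anatom, feature["value"])


# Hypothetical sample records.
sample_images = [{
    "proteins": ["RyR"],
    "organism": "rat",
    "anatomical_structures": [{
        "name": "spiny_branchlet",
        "segments": [{"features": [{"name": "protein_amount", "value": 30.0}]}],
    }],
}]
sample_entities = [{"name": "spiny_branchlet", "located_in": ["cerebellum"]}]

rows = list(protein_distribution(sample_images, sample_entities))
```

The nested loops correspond to the multi-valued (->>) attributes and the path expression AS..segments..features in the F-logic rule.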
18
Example Query Evaluation (I)
Example: protein_distribution
– given: organism, protein, brain_region
– Use DOMAIN-KNOWLEDGE-BASE:
  recursively traverse the has_a_star paths under brain_region
  collect all anatomical_entities
– Source PROLAB:
  join with anatomical structures and collect the value of attribute “image.segments.features.feature.protein_amount”
  where “image.segments.features.feature.protein_name” = protein and “study_db.study.animal.name” = organism
– Mediator:
  aggregate over all parents up to brain_region
  report distribution
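The three evaluation steps above (traverse has_a paths, join with source rows, aggregate upward) can be sketched in Python as follows. The edge list, row layout and function names are hypothetical; summation is assumed as the aggregate, which the slide does not specify.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def has_a_star(has_a: Iterable[Tuple[str, str]], root: str) -> Set[str]:
    """All entities reachable from root via one or more has_a edges."""
    children: Dict[str, List[str]] = defaultdict(list)
    for parent, child in has_a:
        children[parent].append(child)
    seen: Set[str] = set()
    stack = [root]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def distribution(has_a, rows, brain_region, protein, organism) -> Dict[str, float]:
    """Protein amount per entity, rolled up over each entity's has_a subtree."""
    edges = list(has_a)
    entities = has_a_star(edges, brain_region) | {brain_region}
    # Join step: keep only source rows for the requested protein and organism.
    direct: Dict[str, float] = defaultdict(float)
    for row in rows:
        if (row["anatom"] in entities and row["protein"] == protein
                and row["organism"] == organism):
            direct[row["anatom"]] += row["amount"]
    # Aggregate step: every entity reports its own amount plus its descendants'.
    return {
        entity: sum(direct.get(d, 0.0) for d in has_a_star(edges, entity) | {entity})
        for entity in entities
    }
```

For example, with edges cerebellum → cortex → {purkinje_layer, molecular_layer} and amounts recorded only at the two layers, the cortex and cerebellum entries both report the sum of the layer amounts.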
19
Example Query Evaluation (II)
Query: "How does the parallel fiber output (Yale/SENSELAB) relate to the distribution of Ryanodine Receptors (UCSD/NCMIR)?"
@SENSELAB: X1 := select output from parallel fiber;
@MEDIATOR: X2 := “hang off” X1 from Domain Map;
@MEDIATOR: X3 := subregion-closure(X2);
@NCMIR: X4 := select PROT-data(X3, Ryanodine Receptors);
@MEDIATOR: X5 := compute aggregate(X4);
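The plan can be read as a sequence of steps, each run at a named site and each consuming the variables produced before it. Below is a minimal Python sketch of such a plan runner. The domain map, the source answers and the amounts are invented stand-ins, and summation is assumed for the final aggregate.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-ins for the domain map and the NCMIR protein data.
DOMAIN_MAP: Dict[str, List[str]] = {
    "purkinje_cell_dendrite": ["spiny_branchlet", "main_branches"],
}
RYR_AMOUNT: Dict[str, float] = {"spiny_branchlet": 30.0, "main_branches": 5.0}


def subregion_closure(roots: List[str]) -> Set[str]:
    """The roots plus everything below them in the domain map."""
    seen, stack = set(roots), list(roots)
    while stack:
        for child in DOMAIN_MAP.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


Step = Tuple[str, str, Callable[[dict], object]]  # (site, variable, function)

PLAN: List[Step] = [
    # Assumed answer from SENSELAB: the structure receiving parallel fiber output.
    ("SENSELAB", "X1", lambda env: ["purkinje_cell_dendrite"]),
    # Anchor the returned terms in the domain map (simplified to a key lookup).
    ("MEDIATOR", "X2", lambda env: [t for t in env["X1"] if t in DOMAIN_MAP]),
    ("MEDIATOR", "X3", lambda env: subregion_closure(env["X2"])),
    ("NCMIR", "X4", lambda env: {r: RYR_AMOUNT[r] for r in env["X3"] if r in RYR_AMOUNT}),
    ("MEDIATOR", "X5", lambda env: sum(env["X4"].values())),
]


def run_plan(plan: List[Step]) -> dict:
    """Execute the steps in order, binding each result to its variable."""
    env: dict = {}
    for _site, var, step in plan:
        env[var] = step(env)
    return env


result = run_plan(PLAN)
```

In a real mediator the SENSELAB and NCMIR steps would be remote calls through the sources' wrappers; here they are local lambdas so that the control flow stays visible.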
20
Mediation Services: Client Registration
Diagram labels:
– Client: Update Client, Query Client
– Update Client: Check Data, Merge Before Insert, Derive Before Insert
– Query Capability: Navigate/Ad-hoc, Query on Schema
– Fat Result Viewer: Client-side Buffer, Client-side Processing
– Thin Result Viewer: Send Full Data, Server-side Buffer, Context Sensitive
– Server-Push/Client-Pull
21
Example Client: Query Formulation and Result Display
– combination of ad hoc and navigational queries
– client-side visualization (left)
– results are shown in semantic context (right)
22
Mediation Services: Semantic Annotation Tools
line drawing ==annotation==> (spatial) database for mediation
23
Mediator Architecture Blueprint
Mediator Layer (Mediation Services):
– Query formulation: user query; integrated view definition
– Query processing: view unfolding; semantic optimization; capability-based rewriting
– Source model lifting: domain knowledge reconciliation; model transformation
– Source registration: domain knowledge; model & schema; query & computation capabilities
– Components: Optimizer, Model Reasoner, Deductive Engine
Wrapper Layer:
– Query interface (down API): SDLIP, SOAP, ...; (subsets of) SQL, X(ML)-Query, CPL, ...; DOM; SRB-based access
– Result delivery interface (up API): SDLIP, SOAP, ...; pull (tuple/set-at-a-time, DOM) vs. push (stream); synchronous/asynchronous; direct data/data reference
Sources: XML Sources, RDB Sources, File Sources, HTML Sources, Digital Libraries (Collections), Spatial Sources
Diagram labels: Boston Univ., NCMIR UCSD, Yale Univ., Montana Univ., SDLIP, ARC, IMS
24
Coming up: Knowledge-Based/Semantic Mediation of Brain Data
Figure labels: CCB, Montana SU; surface atlas, Van Essen Lab; NCMIR, UCSD; stereotaxic atlas, LONI; MCell, CNL, Salk
Knowledge-Based Mediation: ANATOM, PROTLOC
Results: Result (VML/SVG); Result (XML/XSLT)
25
Some Open Issues
Data/Knowledge Modeling
– Extensibility: how to handle a source with new data types and operations?
  Temporal Data: instrument readings, video microscopy
  Spatial Data: integrating with spatial database systems
  Image database systems
– Conflict Management
  Grades of certainty
  Alternate Hypothesis
Integrating Services
– Registration and warping of my image slice to a reference
Integrating into Larger Applications
– M-Cell simulation
– Telemicroscopy
– Visualization
26
References
– Model-Based Mediation with Domain Maps, Bertram Ludäscher, Amarnath Gupta, Maryann Martone, Intl. Conference on Data Engineering (ICDE), Heidelberg, 2001.
– Knowledge-Based Mediation of Heterogeneous Neuroscience Information Sources, Amarnath Gupta, Bertram Ludäscher, Maryann Martone, Intl. Conference on Scientific and Statistical Databases (SSDBM), Berlin, 2000.
– Model-Based Information Integration in a Neuroscience Mediator System, Bertram Ludäscher, Amarnath Gupta, Maryann Martone, Intl. Conference on Very Large Data Bases (VLDB), Cairo, 2000.