caBIG: the cancer Biomedical Informatics Grid. Ken Buetow, NCICB/NCI/NIH/DHHS.

1 caBIG: the cancer Biomedical Informatics Grid Ken Buetow NCICB/NCI/NIH/DHHS

2 NCI biomedical informatics
• Goal: A virtual web of interconnected data, individuals, and organizations redefines how research is conducted, care is provided, and patients/participants interact with the biomedical research enterprise

3 Trials Animal Models states context pathways ontologies agents therapeutics probes components genes genotypes gene expression proteins protein expression etiology, treatment, prevention

4 Molecular Pathology Clinical Trials caCORE access portals participating group nodes Cancer Genomics Mouse Models building common architecture, common tools, and common standards

5 Interoperability
Semantic interoperability; syntactic interoperability (Courtesy: Charlie Mead)
• in·ter·op·er·a·bil·i·ty
  - ability of a system...to use the parts or equipment of another system
  Source: Merriam-Webster web site
• interoperability
  - ability of two or more systems or components to exchange information and to use the information that has been exchanged
  Source: IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries, IEEE, 1990

6 Enterprise Vocabulary
• NCI Meta-Thesaurus (cross-maps standard vocabularies/ontologies, e.g. SNOMED, MedDRA, ICD)
  - Semantic integration, inter-vocabulary mapping
  - UMLS Metathesaurus extended with cancer-oriented vocabularies
  - 800,000 Concepts, 2,000,000 terms and phrases
  - Mappings among over 50 vocabularies
• NCI Thesaurus
  - Description logic-based
  - 18,000 "Concepts"
  - Concept is the semantic unit
  - One or more terms describe a Concept (synonymy)
  - Semantic relationships between Concepts
Diagram labels: biomedical objects, common data elements, controlled vocabulary
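
The idea that a Concept is the semantic unit and that several terms can name it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the concept code "X1", and the cross-vocabulary code "Y9" are hypothetical and are not taken from the NCI Thesaurus or from any EVS API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Concept:
    """One meaning, named by a preferred term plus any number of synonyms."""
    code: str                                   # hypothetical identifier
    preferred_name: str
    synonyms: Set[str] = field(default_factory=set)
    # Codes for the same meaning in other vocabularies (inter-vocabulary mapping).
    mappings: Dict[str, str] = field(default_factory=dict)

    def terms(self) -> Set[str]:
        """All terms that describe this concept, lower-cased for lookup."""
        return {self.preferred_name.lower()} | {s.lower() for s in self.synonyms}


class Thesaurus:
    """Resolves any term (preferred or synonym) to its single Concept."""

    def __init__(self) -> None:
        self._by_term: Dict[str, Concept] = {}

    def add(self, concept: Concept) -> None:
        for term in concept.terms():
            self._by_term[term] = concept

    def lookup(self, term: str) -> Optional[Concept]:
        return self._by_term.get(term.strip().lower())


# Illustrative data only; the codes are placeholders.
thesaurus = Thesaurus()
thesaurus.add(Concept("X1", "Neoplasm", {"Tumor"}, {"OTHER_VOCAB": "Y9"}))
```

Looking up either "Tumor" or "Neoplasm" returns the same Concept object, which is what lets two systems that use different words agree on one meaning.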

7 Common Data Elements
• Structured data reporting elements
• Precisely defining the questions and answers
  - What question are you asking, exactly?
  - What are the possible answers, and what do they mean?
Diagram labels: biomedical objects, common data elements, controlled vocabulary
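
A common data element pairs an exact question with a closed set of answers, each with a defined meaning. The sketch below shows one way to represent that; the element name, question wording, and permissible values are invented for illustration and do not come from caDSR.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CommonDataElement:
    """A question plus the closed set of answers it accepts."""
    name: str
    question: str
    permissible_values: Dict[str, str]  # answer code -> what the answer means

    def meaning_of(self, answer: str) -> str:
        """Return the defined meaning, or reject answers outside the value domain."""
        if answer not in self.permissible_values:
            raise ValueError(f"{answer!r} is not a permissible value for {self.name}")
        return self.permissible_values[answer]


# Hypothetical element, invented for illustration only.
VITAL_STATUS = CommonDataElement(
    name="vital_status",
    question="What is the participant's vital status at last contact?",
    permissible_values={
        "ALIVE": "Participant was alive at last contact",
        "DEAD": "Participant is known to have died",
        "UNKNOWN": "Status could not be determined",
    },
)
```

Rejecting answers outside the value domain at capture time is what makes data from different sites comparable later.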

8 Biomedical Information Objects
• Data service infrastructure developed using OMG's Model Driven Architecture approach
• Object models expressed in UML represent actual biomedical research entities such as genes, sequences, chromosomes, cellular pathways, ontologies, clinical protocols, etc.
• The object models form the basis for uniform APIs (Java, SOAP, HTTP-XML, Perl) that provide an abstraction layer and interfaces for developers to access information without worrying about the back-end data stores
Diagram labels: biomedical objects, common data elements, controlled vocabulary
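
The abstraction-layer idea, where client code works with domain objects and never touches the back-end store directly, can be sketched as follows. This is not the caBIO API: `Gene`, `GeneStore`, `InMemoryGeneStore`, and `GeneService` are hypothetical names, and the in-memory list stands in for whatever database a real deployment would use.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Gene:
    """Domain object: what the client sees, independent of storage."""
    symbol: str
    chromosome: str


class GeneStore(ABC):
    """Data access interface; each back end supplies its own implementation."""

    @abstractmethod
    def find_by_chromosome(self, chromosome: str) -> List[Gene]:
        ...


class InMemoryGeneStore(GeneStore):
    """Stand-in back end used here instead of a real research database."""

    def __init__(self, genes: List[Gene]) -> None:
        self._genes = list(genes)

    def find_by_chromosome(self, chromosome: str) -> List[Gene]:
        return [g for g in self._genes if g.chromosome == chromosome]


class GeneService:
    """Uniform API: callers depend on this class, not on the store behind it."""

    def __init__(self, store: GeneStore) -> None:
        self._store = store

    def symbols_on(self, chromosome: str) -> List[str]:
        return sorted(g.symbol for g in self._store.find_by_chromosome(chromosome))


# Small sample data set for the sketch.
store = InMemoryGeneStore([Gene("TP53", "17"), Gene("BRCA1", "17"), Gene("BRCA2", "13")])
service = GeneService(store)
```

Swapping `InMemoryGeneStore` for a database-backed implementation would leave `GeneService` and its callers unchanged, which is the point of the layer.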

9 Standards supporting infrastructure
• Enterprise Vocabulary Services (EVS)
  - Browsers
  - APIs
• cancer Bioinformatics Infrastructure Objects (caBIO)
  - Applications
  - APIs
• cancer Data Standards Repository (caDSR)
  - CDEs
  - Case Report Forms
  - Object models
  - ISO 11179 model

10 Data Access Objects Object Managers Domain Objects RMI Web Server Tomcat Servlets JSPs SOAP XML XSL/XSLT HTML (Browsers) SOAP Clients Java Applications Data Object Presentation Client Integrating Architecture HTML/XML Clients Meta-Data Perl Clients

11 Semantic Integration: Modeling Time Class Attributes EVS Concept for Attribute ‘agentName’ EVS Concept for Class ‘Agent’ EVS Concept for Attribute ‘id’... etc. EVS Concept for instance objects Object Mapping to EVS Concepts Done at Modeling Time
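
Mapping a class and each of its attributes to vocabulary concepts at modeling time can be pictured as a simple annotation table that is checked for gaps before the model is registered. In this sketch the class `Agent` and the attributes `agentName` and `id` come from the slide; `AnnotatedClass` and every concept code are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class AnnotatedClass:
    """A model class together with the concept codes assigned to it."""
    name: str
    concept_code: str                                   # concept for the class itself
    attribute_concepts: Dict[str, str] = field(default_factory=dict)

    def unmapped(self, attributes: Iterable[str]) -> List[str]:
        """Attributes that still lack a concept mapping."""
        return [a for a in attributes if a not in self.attribute_concepts]


# Placeholder codes; real mappings would point at EVS concepts.
agent = AnnotatedClass(
    name="Agent",
    concept_code="EVS:0001",
    attribute_concepts={"agentName": "EVS:0002", "id": "EVS:0003"},
)
```

A modeling tool could refuse to export a class while `unmapped(...)` is non-empty, so that every attribute carries a defined meaning.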

12 Semantic Integration: Metadata Registration Time UML model, including EVS Concept mappings ISO11179 mapping caDSR loading Curation: Data standards registration for instance data

13 Semantic Integration: Runtime Java Applications Data Access Objects (OJB) Object Managers Web Server Tomcat Servlets (XML XSL/XSLT) JSPs SOAP HTML/XML Clients (Browsers) SOAP Clients Data Object Presentation Client Perl Clients Domain Objects [Gene, Disease, Concept, DataElement] RMI Research DBs Research DBs

14 caGRID caCORE architecture extension caBIO server caBIO client OGSA-DAI + Globus OGSA-DAI caGRID extension (metadata) caGRID extension (caBIO adapter) caGRID extension (query) Client Grid Data Source caGRID extension (Concept Discovery) caGRID extension (Federated Query) caGRID extension (Integration of Discovery and Query Services)
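
A federated query sends the same request to several data sources and merges the answers while recording where each came from. The sketch below shows only that pattern in plain Python. It does not use OGSA-DAI or Globus, and the node names, record fields, and helper functions are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
DataSource = Callable[[str], List[Record]]


def make_node(records: List[Record]) -> DataSource:
    """Build an in-memory stand-in for one participating grid node."""
    def query(symbol: str) -> List[Record]:
        return [r for r in records if r["gene"] == symbol]
    return query


def federated_query(nodes: Dict[str, DataSource], symbol: str) -> List[Record]:
    """Run the same query on every node and tag each hit with its source."""
    merged: List[Record] = []
    for name in sorted(nodes):          # fixed order keeps results reproducible
        for record in nodes[name](symbol):
            merged.append({**record, "node": name})
    return merged


# Invented sample nodes and records.
nodes = {
    "center_b": make_node([{"gene": "TP53", "assay": "microarray"}]),
    "center_a": make_node([
        {"gene": "TP53", "assay": "proteomics"},
        {"gene": "BRCA1", "assay": "microarray"},
    ]),
}
```

Calling `federated_query(nodes, "TP53")` returns one record from each center, each labeled with its node. A real grid would add discovery of which nodes to ask, which the slide lists as a separate extension.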

15 NCICB applications:
• clinical trials support - C3DS
• molecular pathology - caArray
• cancer images - caImage
• pre-clinical models - caModelsDb
• laboratory support - caLIMS

16 Standards-based Data System for the conduct of clinical trials:
• C3D (Cancer Central Clinical Database)
  - WWW-based, eCRF-based primary data capture by protocol
• C3PR (Cancer Central Clinical Participant Registry)
  - WWW-based central registration of participants across protocols
• C3PA (Cancer Central Clinical Protocol Administration)
  - Scientific management system for clinical protocols
• C3TR (Cancer Central Clinical Tissue Repository)
  - Tissue repository
• C3DW (Cancer Central Clinical Data Warehouse)
  - De-identified patient information accessed via caBIO

19 Image Portal
• The NCICB has developed an image portal to allow researchers to search for mouse and human images and annotations
  - Human and mouse images and annotations were provided by the MMHCC

20 Pathway Database
• Enhance value of imperfect, but available, pathway knowledge
• Make biological assumptions explicit
• Combine sources of data (e.g. KEGG, BioCarta, ...)
• Merge data from separate pathways
• Build a causal framework to support (future) quantitative simulation/analysis

21 Cancer Biomedical Informatics Grid (caBIG)
• Common, widely distributed infrastructure permits cancer research community to focus on innovation
• Shared vocabulary, data elements, data models facilitate information exchange
• Collection of interoperable applications developed to common standard
• Raw published cancer research data is available for mining and integration

22 caBIG will facilitate sharing of infrastructure, applications, and data

23 caBIG action plan
• Establish pilot network of Cancer Centers
  - Groups agreeing to caBIG principles
  - Mixture of capabilities
  - Mixture of contributions
• Expanding collection of participants
• Establish consortium development process
  - Collecting and sharing expertise
  - Identifying and prioritizing community needs
  - Expanding development efforts
• Moving at the speed of the internet…

24 Three Domain Workspaces and two Cross Cutting Workspaces have been launched during the Pilot phase
• DOMAIN WORKSPACE 1, Clinical Trial Management Systems: addresses the need for consistent, open and comprehensive tools for clinical trials management.
• DOMAIN WORKSPACE 2, Integrative Cancer Research: provides tools and systems to enable integration and sharing of information.
• DOMAIN WORKSPACE 3, Tissue Banks & Pathology Tools: provides for the integration, development, and implementation of tissue and pathology tools.
• CROSS CUTTING WORKSPACE 1, Vocabularies & Common Data Elements: responsible for evaluating, developing, and integrating systems for vocabulary and ontology content, standards, and software systems for content delivery.
• CROSS CUTTING WORKSPACE 2, Architecture: developing architectural standards and architecture necessary for other workspaces.

25 Key deliverables of caBIG pilot
• Componentized, standards-based Clinical Trials Management System
  - e-IND filing/regulatory reporting with FDA
  - Electronic management of trials
  - Integration of diverse trials
• Tissue Management System
  - Systematic description and characterization of tissue resources
  - Ability to link tissue resources to clinical and molecular correlative descriptions
• "Plug and Play" analytic tool set
  - microarray
  - proteomics
  - pathways
  - data analysis and statistical methods
  - gene annotation
• Diverse library of raw, structured data

26 Cancer Molecular Analysis Project (CMAP) - a prototypic biomedical data integration effort biomedical objects common data elements controlled vocabulary Profiles, Targets, Agents, Clinical Trials CGAP NCBI UCSC (via DAS) BioCarta KEGG Gene Ontologies CTEP clinical trials CGAP gene expression NCI drug screening

39 caBIG community contributions
• Infrastructure
  - Ontologies
  - Databases
• Applications
  - Clinical trials support
  - Analytic tools
  - Data mining
• Data
  - Trials
  - Experimental outcomes: Genomic, Microarray, Proteomic

41 acknowledgements
• NCICB: Peter Covitz, Sue Dubman, Mary Jo Deering, Leslie Derr, Carl Schaefer, Christos Andonyadis, Mervi Heiskanen, Denise Hise, Kotien Wu, Fei Xu, Frank Hartel
• LPG/CCR: Michael Edmundson, Bob Clifford, Cu Nguyen
http://ncicb.nci.nih.gov
http://cmap.nci.nih.gov
http://caBIG.nci.nih.gov

