Published by Arthur Bryant. Modified over 9 years ago.
1
i2b2 National Center for Biomedical Computing
i2b2 Clinical Research Chart and Hive Architecture
Henry Chueh; Shawn Murphy; Isaac Kohane, PI
2
Summary
– Background
– Intro to the Clinical Research Chart (CRC)
– Hive / Cell Software Architecture
– More details on establishing and using the CRC
3
Background
– Clinical documentation is…clinical
– Lack of systematic approach for organizing clinical data for research
– Ownership issues are unique
– Consent issues are a challenge
4
Driving Biological Projects
– Asthma
– Hypertension
– Huntington’s Disease
– Diabetes
5
Clinical Research Chart (CRC)
– Organize and transform clinical data to maximize its utility for research
– Develop an Application and Database framework to serve this goal
– Establish an architecture that allows data from different studies done on this platform to be integrated
6
Design of Clinical Research Chart
[Diagram labels. Services: Ontology, Consent/Tracking, Application Pool, Management. Legend: data flowing; custom interfaces; SOAP/HTTP interfaces; a program. Data sources: HL7 (MSH|^/&|736401….. PID|102|3231285.….); text files; XML; database; clinical trials. Target: CRC DB.]
7
Design of Clinical Research Chart
[Diagram labels. Services: Ontology, Consent/Tracking, Application Pool, Management. Legend: data flowing; custom interfaces; SOAP/HTTP interfaces; a program. Components: Data pipeline/workflow application; Pheno/Genotype Database (CRC DB); Visualization and Analysis of database contents. Data sources: HL7 (MSH|^/&|736401….. PID|102|3231285.….); text files; XML; database; clinical trials.]
8
i2b2 Skeletal Data Flow
[Diagram labels. Clinical Research Chart: shared data; study specific data. Enterprise Systems: Registration, ADT, Labs, Reports, Clinical Notes, etc. Enterprise data source (RPDR). Local Systems: systems not gathered into Enterprise data warehouses. Annotation UI; EDC applications. i2b2 ETL workflow; Annotation Service; EDC Service; Analytic workflow.]
9
Overall Themes
– Framework to allow development of application services in a maximally decoupled fashion
– Linux and Windows OS support
– Java and C++ programming languages
– Use Cases for construction of CRC come from Driving Biology Projects and experience with clients of Partners Research Patient Data Registry
10
Focus on Workflow
– Necessary for both pre-CRC and post-CRC processes
– Needed for scientific flexibility
– Implies a consistent environment for data pipelining and flow control
11
i2b2 Hive
– Formed as a collection of interoperable Cells, or services
– Loosely coupled
– Makes no assumptions about proximity
– Connected by Web services
– Activated by a workflow engine that forms basis of choreography among Cells for complex interactions
12
Complex choreography
13
i2b2 Cell
– Behaves as a functional service
– Separates interactions conceptually into transactions and semantics
– Focuses on facilitating transactions with simple semantics (e.g., datatype)
– Leaves deep semantics to be defined by the services provided by a Cell
– Does not restrict language implementation
14
Target layer for i2b2
[Diagram layers: TCP/IP; Web Services; i2b2 platform; Semantic Objects]
15
Cell examples
– Concept extraction from clinical narratives
– Simple transformations; e.g., basic text format conversion
– Complex encoding; e.g., encoding MIAME in MAGE
– Microarray data normalization
– …
16
Exposing Cells
– Protocols layered on top of SOAP
– At the WSDL level for integrators; i.e., bioinformaticians & software engineers
– At a functional level for investigators
– i2b2 toolkits to allow integrators to expose controlled functionality to investigators (Automator)
17
Automator Approach
[Diagram labels: investigators; informaticians; Extend Kepler workflow engine; i2b2 Automator]
18
Bird’s eye view
[Diagram labels: Workflow engine; Investigator Portal; CRC Repository]
19
Current Implementation
– Extending Kepler workflow engine for i2b2
– Data model for CRC repository
– Defining protocols necessary for interaction (in addition to SOAP)
– Created Cell for concept extraction from narratives
– Early designs for Automator toolkit
20
i2b2 Architecture Key Points
– Leverage existing workflow standards and software
– Use Web services as basic form of interaction
– Assume unlimited choreography, but…
– Provide tools to distill complexity into basic automation for clinical investigators
21
SW Licensing and Distribution
– Commit to Open Source software
– Use GNU Lesser General Public License
– Establish local i2b2 repository exposed through i2b2 website
– Contribute to a more global NCBC SourceForge style repository if it emerges
  – ?NIH Forge
– Keep i2b2 protocols fully open
22
Interoperability across NCBC
– Strongly consider Web services as basic protocol for generic shared interactions
– Consider sharing datasets
– Promote diversity of approach and use of shared software (don’t impose uniformity)
– Facilitate/promote NCBC Open Source project teams
23
Pre-CRC Data Pipeline/Workflow
Populating the Clinical Research Chart (CRC)
24
Pre-CRC Data Pipeline/Workflow
– Use workflow framework to choreograph application services in specific sequences
– Used to extract, transform, conform, and load data and metadata into the CRC
25
Pre-CRC Data Pipeline/Workflow
[Diagram labels. Services: Ontology, Consent/Tracking, Application Pool, Management. Legend: data flowing; custom interfaces; SOAP/HTTP interfaces; a program (input, output). Annotations: "increasingly useful"; "Local or through SOAP service".]
26
Ontology Service
– Manages mappings of terms to common vocabularies
– Provides lists of acceptable (enumerated) values for various attribute and value slots
– Allows for management of hierarchies, groupings, and relationships between terms
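The three Ontology Service responsibilities can be pictured with a small in-memory sketch. This is not the i2b2 service interface: the class name `Ontology`, its methods, and the sample term map are hypothetical, and the value "Male" is an assumed example. The concept codes and paths are borrowed from the smoking use case later in the deck.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Ontology:
    """Hypothetical in-memory stand-in for an ontology service."""
    term_map: Dict[str, str] = field(default_factory=dict)      # local term -> common code
    paths: Dict[str, str] = field(default_factory=dict)         # code -> hierarchy path
    allowed: Dict[str, Set[str]] = field(default_factory=dict)  # slot -> enumerated values

    def map_term(self, local_term: str) -> Optional[str]:
        """Map a local term to a common vocabulary code, or None if unknown."""
        return self.term_map.get(local_term.strip().lower())

    def is_allowed(self, slot: str, value: str) -> bool:
        """Check a value against the enumerated list for a slot."""
        return value in self.allowed.get(slot, set())

    def descendants(self, path_prefix: str) -> List[str]:
        """Return the codes whose hierarchy path starts with the given prefix."""
        return sorted(c for c, p in self.paths.items() if p.startswith(path_prefix))


onto = Ontology(
    term_map={"smoker": "CT-A-SMK", "never smoked": "CT-A-NSK"},  # illustrative mapping
    paths={
        "CT-A-SMK": r"AsthV1\DRptNLP\Tobacco Use\Smoker",
        "CT-A-NSK": r"AsthV1\DRptNLP\Tobacco Use\Non smoker",
    },
    allowed={"Sex_cd": {"Female", "Male"}},  # "Male" is an assumed value
)

tobacco_codes = onto.descendants("AsthV1\\DRptNLP\\Tobacco Use\\")
```

Storing the hierarchy as a path string makes "everything under Tobacco Use" a simple prefix test, which is one plausible reason the deck's concept table carries a path column.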
27
Person Consent/Tracking Service
– Provides mappings between patient/subject identifiers
– Tracks patient/subject consent information
– Allows identification of the patient/subject based upon fuzzy demographic matches
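The slide does not say how fuzzy demographic matching is done. As one possible illustration, the sketch below scores a candidate against registry records by combining name similarity (Python's `difflib.SequenceMatcher`) with exact agreement on birth date and sex. The weights, the threshold, the field names, and the sample names are all assumptions made up for this example.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional

Person = Dict[str, str]


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range 0..1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(candidate: Person, record: Person) -> float:
    """Weighted score; the 0.5 / 0.3 / 0.2 weights are arbitrary assumptions."""
    name = name_similarity(candidate["name"], record["name"])
    dob = 1.0 if candidate["birth_date"] == record["birth_date"] else 0.0
    sex = 1.0 if candidate["sex"] == record["sex"] else 0.0
    return 0.5 * name + 0.3 * dob + 0.2 * sex


def best_match(candidate: Person, registry: List[Person],
               threshold: float = 0.85) -> Optional[Person]:
    """Return the highest-scoring record if it clears the threshold, else None."""
    if not registry:
        return None
    record = max(registry, key=lambda r: match_score(candidate, r))
    return record if match_score(candidate, record) >= threshold else None


# Made-up registry entry; only the id and birth date echo the deck's sample patient.
registry = [{"id": "Z234", "name": "Jane Doe", "birth_date": "1924-03-04", "sex": "F"}]

hit = best_match({"name": "Jane Do", "birth_date": "1924-03-04", "sex": "F"}, registry)
miss = best_match({"name": "John Smith", "birth_date": "1950-01-01", "sex": "M"}, registry)
```

A production matcher would likely use a probabilistic record-linkage method and send borderline scores for manual review. The point here is only that a misspelled name can still resolve to the right subject identifier.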
28
Application Pool (CVS) Service
– Stores programs/scripts used in pipeline
– Provides applications to be downloaded when needed
– Manages versioning of software
– Provides documentation
29
Management Service
– Stores workflow execution plan
– Starts and controls workflow execution
– Schedules workflow execution
– Monitors workflow execution and data locations
– Controls permissions associated with workflow execution
30
Data Pipeline/Workflow Application Use Case for Asthma Data
[Diagram labels. Services: Ontology, Consent/Tracking, Application Pool, Management. Legend: data flowing; custom interfaces; SOAP/HTTP interfaces; a program (input, output). Data stores: RPDR; CRC DB; AsthmaMart.]
Pipeline steps:
– Data retrieval
– Data de-identification
– Language processing
– Vocabulary matching
– Load Data into Mart
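The five pipeline steps can be sketched as plain functions run in sequence. This is a toy illustration under stated assumptions, not the Kepler-based workflow: every function name, the record layout, the keyword rule standing in for language processing, and the sample notes are hypothetical. The concept codes come from the smoking use case later in the deck.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Step = Callable[[List[Record]], List[Record]]

# Concept codes as they appear in the deck's smoking example.
CONCEPT_CODES = {"smoker": "CT-A-SMK", "non smoker": "CT-A-NSK"}


def make_retriever(source: List[Record]) -> Step:
    """Data retrieval: stand-in for pulling records from an enterprise source."""
    return lambda _: list(source)


def deidentify(records: List[Record]) -> List[Record]:
    """Data de-identification: keep only the encrypted id and the note text."""
    return [{"patient_id_e": r["patient_id_e"], "note": r["note"]} for r in records]


def language_processing(records: List[Record]) -> List[Record]:
    """Language processing: toy keyword rule in place of real concept extraction."""
    out = []
    for r in records:
        text = str(r["note"]).lower()
        if "never smoked" in text:
            finding = "non smoker"
        elif "smok" in text:
            finding = "smoker"
        else:
            finding = None
        out.append({**r, "finding": finding})
    return out


def vocabulary_matching(records: List[Record]) -> List[Record]:
    """Vocabulary matching: map findings to concept codes, dropping unmapped rows."""
    return [{**r, "concept_cd": CONCEPT_CODES[r["finding"]]}
            for r in records if r["finding"] in CONCEPT_CODES]


def make_loader(mart: List[Record]) -> Step:
    """Load: append coded facts to an in-memory list standing in for the mart."""
    def load(records: List[Record]) -> List[Record]:
        mart.extend({"patient_id_e": r["patient_id_e"], "concept_cd": r["concept_cd"]}
                    for r in records)
        return records
    return load


def run_pipeline(steps: List[Step]) -> List[Record]:
    """Run the steps in order, feeding each step's output to the next."""
    records: List[Record] = []
    for step in steps:
        records = step(records)
    return records


# Made-up source records for illustration only.
source = [
    {"patient_id_e": "Z234", "name": "made-up name", "note": "Patient is a current smoker."},
    {"patient_id_e": "Z999", "name": "made-up name", "note": "Never smoked."},
    {"patient_id_e": "Z777", "name": "made-up name", "note": "No relevant history."},
]
mart: List[Record] = []
run_pipeline([make_retriever(source), deidentify, language_processing,
              vocabulary_matching, make_loader(mart)])
```

Because each step takes and returns a list of records, any step could be swapped for a call to a remote service without changing the others, which is the decoupling the deck argues for.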
31
Data Pipeline/Workflow Implementation
– Define standard XML representation for workflow - MoML
– Define standards for SOAP services and resource discovery
– Adopt and extend open source workflow package (Kepler)
– Prototypes by July timeframe
– BIRN -> NAMIC and LONI collaboration
– Can follow construction details at http://diagon/i2b2
32
Phenotype/Genotype Database
33
Phenotype/Genotype Database Principles
– Analytical database schema that does not need to change with new data types and concepts
– Defined fundamental unit of data (atomic fact) = observation
– Defined metadata strategy
– Various levels of de-identification (reviewed and approved by IRB)
34
Phenotype/Genotype Database Architecture (see preprint)
35
Phenotype/Genotype Database Use Case
Smoking observations represented in database

Observations:
Patient_id_e | Concept_cd | Start_date | Provider_id | Confidence_num
Z234 | CT-A-SMK | 1/1/1997 | M0022303 | 3
Z234 | CT-A-SMK | 1/1/1998 | M0034125 | 9
Z234 | IC9-3051 | 1/1/2001 | M0022303 | 3
Z234 | CT-A-NSK | 1/1/2002 | M0034125 | 9

Patients:
Patient_id_e | Birth_date | Sex_cd | Race_cd | Death_date
Z234 | 3/4/1924 | Female | Black | 4/5/2003

Providers:
Provider_id | Provider_path | Name_char
M0022303 | MGH\Neurology\M0022303 | M0022303

Concepts:
Concept_cd | Concept_path | Name_char
CT-A-SMK | AsthV1\DRptNLP\Tobacco Use\Smoker | Smoking
IC9-3051 | V2\Diagnosis\Mental Disorders (290-319)\Non-psychotic disorders (300-316)\(305) Nondependent abuse of drugs\(305-1) Tobacco use disorder\(305-11) Tobacco use disorder, co~ | Tobacco Use Disorder, continuous use
CT-A-NSK | AsthV1\DRptNLP\Tobacco Use\Non smoker | Never smoked
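To show how observations and concept metadata work together, here is a minimal sketch using Python's built-in `sqlite3`. The column names follow the slide, but the table names `observation_fact` and `concept_dimension`, the ISO date strings, and the query are assumptions. The row values are one reading of the flattened slide table, in particular the split between provider id and confidence value.

```python
import sqlite3

# Hypothetical table names; column names follow the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE observation_fact (
        patient_id_e TEXT, concept_cd TEXT, start_date TEXT,
        provider_id TEXT, confidence_num INTEGER);
    CREATE TABLE concept_dimension (
        concept_cd TEXT PRIMARY KEY, concept_path TEXT, name_char TEXT);
""")

# One row per atomic fact (observation); dates stored as ISO strings so they sort.
conn.executemany("INSERT INTO observation_fact VALUES (?, ?, ?, ?, ?)", [
    ("Z234", "CT-A-SMK", "1997-01-01", "M0022303", 3),
    ("Z234", "CT-A-SMK", "1998-01-01", "M0034125", 9),
    ("Z234", "IC9-3051", "2001-01-01", "M0022303", 3),
    ("Z234", "CT-A-NSK", "2002-01-01", "M0034125", 9),
])

# Concept metadata; the third path is abbreviated here and is not the slide's full path.
conn.executemany("INSERT INTO concept_dimension VALUES (?, ?, ?)", [
    ("CT-A-SMK", r"AsthV1\DRptNLP\Tobacco Use\Smoker", "Smoking"),
    ("CT-A-NSK", r"AsthV1\DRptNLP\Tobacco Use\Non smoker", "Never smoked"),
    ("IC9-3051", r"V2\Diagnosis\Mental Disorders (290-319)",
     "Tobacco Use Disorder, continuous use"),
])


def observations_under(conn, patient_id, path_prefix):
    """Return (start_date, concept_cd, name_char) for facts under a concept path."""
    return conn.execute(
        """SELECT f.start_date, f.concept_cd, c.name_char
           FROM observation_fact AS f
           JOIN concept_dimension AS c ON c.concept_cd = f.concept_cd
           WHERE f.patient_id_e = ? AND c.concept_path LIKE ?
           ORDER BY f.start_date""",
        (patient_id, path_prefix + "%"),
    ).fetchall()


rows = observations_under(conn, "Z234", "AsthV1\\DRptNLP\\Tobacco Use\\")
```

Adding a new kind of observation only adds rows to both tables. Neither table gains a column, which matches the principle of a schema that does not change with new data types and concepts.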
36
Phenotype/Genotype Database Implementation
– Asthma CRC DB “primed” with data from 90,000 patients from Research Patient Data Registry
– Serves as fundamental data structure for i2b2 supported data Querying and Visualization Application Suite
– CRC DBs able to fuse seamlessly together
– Various levels of de-identification to be supported for data sharing and publication
37
Visualization and Analysis of CRC database
Post-CRC workflow
38
Visualization and Analysis Principles
– Supported application suite to query and view CRC database contents
– Outside applications for analysis and viewing able to plug in to application suite
– Pipeline/Workflow framework may be used for analysis and re-entry of derived data into CRC database
39
Visualization and Analysis Architecture
– Supported Applications, Querying and Visualization
  – Standard querying
  – Data exploration
40
Visualization and Analysis Architecture
– Supported Applications, ontology management
  – Ontology Management
– Integrate (outside?) population analysis applications
41
Visualization and Analysis Architecture
– Supported applications have plug-in architecture for outside analytic tools:
  – Standard web-link support with GET and POST oriented data transfer
  – Support transfer of specifically transformed data to outside applications
  – Complex analysis supported with workflow application
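As a rough picture of GET-oriented data transfer to an outside tool, the sketch below builds a web link whose query string carries a patient-set identifier and a list of concept codes. The base URL and the parameter names `patient_set` and `concept` are hypothetical; the deck does not specify the link format.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_plugin_link(base_url, patient_set_id, concept_codes):
    """Encode the selection as a query string on a hypothetical analysis tool URL."""
    query = urlencode(
        {"patient_set": patient_set_id, "concept": list(concept_codes)},
        doseq=True,  # repeat the 'concept' key once per code
    )
    return f"{base_url}?{query}"


# example.org is a placeholder host, not a real analysis service.
link = build_plugin_link("http://example.org/analysis", "42", ["CT-A-SMK", "CT-A-NSK"])

# The receiving tool can recover the same values from the query string.
received = parse_qs(urlsplit(link).query)
```

A link like this suits small selections. Larger or specifically transformed datasets would more plausibly go in a POST body or through the workflow application, as the other two bullets suggest.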
42
Visualization and Analysis Architecture - Query
43
Visualization and Analysis Architecture - Exploration
44
Visualization and Analysis Architecture – Ontology mgmt
45
Visualization and Analysis Use Case
46
Visualization and Analysis Implementation of analysis tools
– Workflow framework to accommodate external analytic applications
[Diagram labels: CRC DB; programs ProgID CA2.3, CX2.3, PN5.1, TH3.0, SN5.4, AA3.3, CN2.3, XN0.9; sample values SN8745, PA5683, SNOMED CODE, patient id 0000004, account # 347, subject id 4.]
47
Final Assembly
[Diagram labels: statistics application server; ownership manager; encryption; population registry database; microarray (encrypted); Gene-Chips; "Gene expression in APOE 4 Allele Alzheimer's"; "Outcomes calculated every week"; columns person, concept, date, raw value; sample concepts Seizures, ER visits, Clinic visits, Surgery, Trauma, Multiple sclerosis, CT Scan, Hemorrhage, Thalamus, Alzheimer’s, Diabetes; sample persons Z5937X, Z5956X; sample values 3/4, 3/9, 5/2, 4/6.]