i2b2 National Center for Biomedical Computing: i2b2 Clinical Research Chart and Hive Architecture. Henry Chueh, Shawn Murphy, Isaac Kohane (PI).


1 i2b2 National Center for Biomedical Computing: i2b2 Clinical Research Chart and Hive Architecture. Henry Chueh, Shawn Murphy, Isaac Kohane (PI).

2 Summary: Background; Intro to the Clinical Research Chart (CRC); Hive / Cell Software Architecture; More details on establishing and using the CRC.

3 Background: Clinical documentation is…clinical; there is no systematic approach for organizing clinical data for research; ownership issues are unique; consent issues are a challenge.

4 Driving Biological Projects: Asthma; Hypertension; Huntington’s Disease; Diabetes.

5 Clinical Research Chart (CRC): Organize and transform clinical data to maximize its utility for research; develop an application and database framework to serve this goal; establish an architecture that allows data from different studies done on this platform to be integrated.

6 Design of Clinical Research Chart [diagram]. Services: Ontology, Consent/Tracking, Application Pool, Management. Legend: data flowing; custom interfaces; SOAP/HTTP interfaces; a program. Elements shown: CRC DB; HL7 (MSH|^/&|736401….. PID|102|3231285.….); text files; XML.….; database; clinical trials.

7 Design of Clinical Research Chart [diagram]. Services: Ontology, Consent/Tracking, Application Pool, Management. Legend: data flowing; custom interfaces; SOAP/HTTP interfaces; a program. Elements shown: Data pipeline/workflow application; Pheno/Genotype Database; Visualization and Analysis of database contents; CRC DB; text files; XML.….; database; clinical trials; HL7 (MSH|^/&|736401….. PID|102|3231285.….).

8 i2b2 Skeletal Data Flow [diagram]. Elements shown: Enterprise Systems (Registration, ADT, Labs, Reports, Clinical Notes, etc.); Enterprise data source (RPDR); Local Systems (systems not gathered into Enterprise data warehouses); i2b2 ETL workflow; Clinical Research Chart (shared data; study specific data); Annotation UI; Annotation Service; EDC applications; EDC Service; Analytic workflow.

9 Overall Themes: Framework to allow development of application services in a maximally decoupled fashion; Linux and Windows OS support; Java and C++ programming languages; use cases for construction of the CRC come from the Driving Biology Projects and from experience with clients of the Partners Research Patient Data Registry.

10 Focus on Workflow: Necessary for both pre-CRC and post-CRC processes; needed for scientific flexibility; implies a consistent environment for data pipelining and flow control.

11 i2b2 Hive: Formed as a collection of interoperable Cells, or services; loosely coupled; makes no assumptions about proximity; connected by Web services; activated by a workflow engine that forms the basis of choreography among Cells for complex interactions.
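
The Hive idea on this slide can be illustrated with a minimal Python sketch. Everything named here (`Hive`, `register`, `run`, and the two toy Cells) is hypothetical and not part of i2b2; local function calls stand in for the Web service requests that would connect real Cells, and a plain list of Cell names stands in for a workflow engine's plan.

```python
from typing import Callable, Dict, List

# A Cell is modeled as a function from a message dict to a message dict.
# In a real Hive each call would be a Web service request; plain function
# calls stand in for that here (assumption made for illustration only).
Cell = Callable[[dict], dict]


class Hive:
    """Hypothetical registry of loosely coupled Cells."""

    def __init__(self) -> None:
        self._cells: Dict[str, Cell] = {}

    def register(self, name: str, cell: Cell) -> None:
        self._cells[name] = cell

    def run(self, plan: List[str], message: dict) -> dict:
        # The plan plays the role of the workflow engine's choreography:
        # each named Cell receives the previous Cell's output.
        for name in plan:
            message = self._cells[name](message)
        return message


def to_upper(msg: dict) -> dict:
    # Toy "simple transformation" Cell.
    return {**msg, "text": msg["text"].upper()}


def count_words(msg: dict) -> dict:
    # Toy analysis Cell that adds a derived field.
    return {**msg, "word_count": len(msg["text"].split())}


hive = Hive()
hive.register("to_upper", to_upper)
hive.register("count_words", count_words)
result = hive.run(["to_upper", "count_words"], {"text": "patient denies smoking"})
```

Because the Cells only share the message format, either one could be replaced or moved to another machine without changing the other, which is the decoupling the slide describes.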

12 Complex choreography

13 i2b2 Cell: Behaves as a functional service; separates interactions conceptually into transactions and semantics; focuses on facilitating transactions with simple semantics (e.g., datatype); leaves deep semantics to be defined by the services provided by a Cell; does not restrict language implementation.

14 Target layer for i2b2 [layer diagram]: TCP/IP; Web Services; i2b2 platform; Semantic Objects.

15 Cell examples: Concept extraction from clinical narratives; simple transformations, e.g., basic text format conversion; complex encoding, e.g., encoding MIAME in MAGE; microarray data normalization; …

16 Exposing Cells: Protocols layered on top of SOAP; at the WSDL level for integrators, i.e., bioinformaticians and software engineers; at a functional level for investigators; i2b2 toolkits to allow integrators to expose controlled functionality to investigators (Automator).

17 Automator Approach [diagram]: investigators; informaticians; extend Kepler workflow engine; i2b2 Automator.

18 Bird’s eye view [diagram]: Workflow engine; Investigator Portal; CRC Repository.

19 Current Implementation: Extending Kepler workflow engine for i2b2; data model for CRC repository; defining protocols necessary for interaction (in addition to SOAP); created Cell for concept extraction from narratives; early designs for Automator toolkit.

20 i2b2 Architecture Key Points: Leverage existing workflow standards and software; use Web services as basic form of interaction; assume unlimited choreography, but… provide tools to distill complexity into basic automation for clinical investigators.

21 SW Licensing and Distribution: Commit to Open Source software; use GNU Lesser General Public License; establish local i2b2 repository exposed through i2b2 website; contribute to a more global NCBC SourceForge-style repository if it emerges (?NIH Forge); keep i2b2 protocols fully open.

22 Interoperability across NCBC: Strongly consider Web services as basic protocol for generic shared interactions; consider sharing datasets; promote diversity of approach and use of shared software (don’t impose uniformity); facilitate/promote NCBC Open Source project teams.

23 Pre-CRC Data Pipeline/Workflow: Populating the Clinical Research Chart (CRC)

24 Pre-CRC Data Pipeline/Workflow: Use workflow framework to choreograph application services in specific sequences; used to extract, transform, conform, and load data and metadata into the CRC.

25 Pre-CRC Data Pipeline/Workflow [diagram]. Services: Ontology, Consent/Tracking, Application Pool, Management. Legend: data flowing; custom interfaces; SOAP/HTTP interfaces; a program (Input, Output). Annotations: increasingly useful; local or through SOAP service.

26 Ontology Service: Manages mappings of terms to common vocabularies; provides lists of acceptable (enumerated) values for various attribute and value slots; allows for management of hierarchies, groupings, and relationships between terms.
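
The three responsibilities listed for the Ontology service can be sketched as a small in-memory class. This is an illustration under stated assumptions, not the i2b2 interface: the class and method names are hypothetical, the synonym lists and the `Male` value are invented for the example, and only the concept codes and backslash-separated paths are taken from the smoking use case later in the deck.

```python
from typing import Dict, List, Optional, Set


class OntologyService:
    """Toy in-memory stand-in for an Ontology service (hypothetical API)."""

    def __init__(self) -> None:
        self._term_to_code: Dict[str, str] = {}
        self._code_to_path: Dict[str, str] = {}
        self._allowed: Dict[str, Set[str]] = {}

    def add_concept(self, code: str, path: str, terms: List[str]) -> None:
        # Register a concept, its place in the hierarchy, and its synonyms.
        self._code_to_path[code] = path
        for term in terms:
            self._term_to_code[term.lower()] = code

    def map_term(self, term: str) -> Optional[str]:
        # Map a free-text term to a common vocabulary code, if known.
        return self._term_to_code.get(term.strip().lower())

    def set_allowed_values(self, slot: str, values: Set[str]) -> None:
        # Define the enumerated values accepted for an attribute slot.
        self._allowed[slot] = set(values)

    def is_allowed(self, slot: str, value: str) -> bool:
        return value in self._allowed.get(slot, set())

    def codes_under(self, parent_path: str) -> List[str]:
        # Hierarchy lookup: every code whose path lies below parent_path.
        prefix = parent_path.rstrip("\\") + "\\"
        return sorted(
            code for code, path in self._code_to_path.items()
            if path.startswith(prefix)
        )


ontology = OntologyService()
ontology.add_concept(
    "CT-A-SMK", r"AsthV1\DRptNLP\Tobacco Use\Smoker",
    ["smoker", "current smoker"],  # synonyms are illustrative assumptions
)
ontology.add_concept(
    "CT-A-NSK", r"AsthV1\DRptNLP\Tobacco Use\Non smoker",
    ["non smoker", "never smoked"],
)
ontology.set_allowed_values("Sex_cd", {"Female", "Male"})
```

Storing the hierarchy as a path string keeps "everything under Tobacco Use" a simple prefix match, which is one plausible reason for path-style concept identifiers.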

27 Person Consent/Tracking Service: Provides mappings between patient/subject identifiers; tracks patient/subject consent information; allows identification of the patient/subject based upon fuzzy demographic matches.
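
The slide does not say how fuzzy demographic matching is done. The sketch below shows one simple possibility, assuming a weighted score over name similarity (via Python's standard `difflib`), birth date, and sex. The field names, weights, threshold, and sample people are all invented for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass(frozen=True)
class Person:
    subject_id: str
    name: str
    birth_date: str  # ISO 8601 date string
    sex: str


def name_similarity(a: str, b: str) -> float:
    # Ratio in [0, 1]; 1.0 means the lowercased names are identical.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(query: Person, candidate: Person) -> float:
    # Illustrative weights: name 0.6, birth date 0.3, sex 0.1.
    score = 0.6 * name_similarity(query.name, candidate.name)
    if query.birth_date == candidate.birth_date:
        score += 0.3
    if query.sex == candidate.sex:
        score += 0.1
    return score


def best_match(query: Person, registry: List[Person],
               threshold: float = 0.8) -> Optional[Person]:
    # Return the highest-scoring candidate, or None if nobody is close enough.
    if not registry:
        return None
    best = max(registry, key=lambda person: match_score(query, person))
    return best if match_score(query, best) >= threshold else None


# Entirely fictional registry entries.
registry = [
    Person("S001", "Jane Smith", "1924-03-04", "F"),
    Person("S002", "John Smyth", "1950-01-01", "M"),
]
query = Person("", "Jane Smyth", "1924-03-04", "F")
found = best_match(query, registry)
```

A production matcher would need blocking, phonetic encodings, and reviewed thresholds; the point here is only that a misspelled name can still resolve to one tracked subject.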

28 Application Pool (CVS) Service: Stores programs/scripts used in pipeline; provides applications to be downloaded when needed; manages versioning of software; provides documentation.

29 Management Service: Stores workflow execution plan; starts and controls workflow execution; schedules workflow execution; monitors workflow execution and data locations; controls permissions associated with workflow execution.

30 Data Pipeline/Workflow Application Use Case for Asthma Data [diagram]. Services: Ontology, Consent/Tracking, Application Pool, Management. Legend: data flowing; custom interfaces; SOAP/HTTP interfaces; a program (Input, Output). Elements shown: RPDR; CRC DB; AsthmaMart. Pipeline steps: data retrieval; data de-identification; language processing; vocabulary matching; load data into Mart.
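
The five pipeline steps named on this slide can be sketched as a chain of small Python functions. This is a toy under stated assumptions: the source rows, the pseudonym scheme, and the regular expressions are invented, and the keyword matching is far cruder than real clinical language processing. Only the step names and the concept codes `CT-A-SMK` and `CT-A-NSK` come from the deck.

```python
import re
from typing import Dict, List, Optional

# Hypothetical rows standing in for an extract from the source registry.
SOURCE_ROWS = [
    {"mrn": "0000004", "note": "Patient is a smoker."},
    {"mrn": "0000007", "note": "Never smoked."},
]

CONCEPT_CODES = {"Smoker": "CT-A-SMK", "Non smoker": "CT-A-NSK"}


def retrieve() -> List[Dict[str, str]]:
    # Step 1: data retrieval.
    return [dict(row) for row in SOURCE_ROWS]


def deidentify(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    # Step 2: replace each real identifier with a stable pseudonym.
    pseudonyms: Dict[str, str] = {}
    out = []
    for row in rows:
        alias = pseudonyms.setdefault(row["mrn"], f"P{len(pseudonyms) + 1}")
        out.append({"patient": alias, "note": row["note"]})
    return out


def extract_status(note: str) -> Optional[str]:
    # Crude keyword rules; negated phrases are checked first.
    text = note.lower()
    if re.search(r"\b(non[- ]?smoker|never smoked)\b", text):
        return "Non smoker"
    if re.search(r"\bsmok(er|es|ing)\b", text):
        return "Smoker"
    return None


def language_processing(rows: List[Dict[str, str]]) -> List[dict]:
    # Step 3: turn narrative text into a structured status.
    return [{"patient": r["patient"], "status": extract_status(r["note"])}
            for r in rows]


def vocabulary_matching(rows: List[dict]) -> List[Dict[str, str]]:
    # Step 4: map each recognized status to a concept code.
    return [{"patient": r["patient"], "concept_cd": CONCEPT_CODES[r["status"]]}
            for r in rows if r["status"] in CONCEPT_CODES]


def load(rows: List[Dict[str, str]], mart: List[Dict[str, str]]) -> List[Dict[str, str]]:
    # Step 5: load into the mart (a list stands in for the database).
    mart.extend(rows)
    return mart


def run_pipeline() -> List[Dict[str, str]]:
    rows = retrieve()
    for step in (deidentify, language_processing, vocabulary_matching):
        rows = step(rows)
    return load(rows, [])


mart = run_pipeline()
```

Each step takes and returns plain rows, so any one of them could be swapped for a remote service call without touching the others.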

31 Data Pipeline/Workflow Implementation: Define standard XML representation for workflow (MoML); define standards for SOAP services and resource discovery; adopt and extend open source workflow package (Kepler); prototypes by July timeframe; BIRN -> NAMIC and LONI collaboration; can follow construction details at http://diagon/i2b2

32 Phenotype/Genotype Database

33 Phenotype/Genotype Database Principles: Analytical database schema that does not need to change with new data types and concepts; defined fundamental unit of data (atomic fact) = observation; defined metadata strategy; various levels of de-identification (reviewed and approved by IRB).

34 Phenotype/Genotype Database Architecture (see preprint)

35 Phenotype/Genotype Database Use Case: Smoking observations represented in database

Patient_id_e | Concept_cd | Start_date | Provider_id | Confidence_num
Z234 | CT-A-SMK | 1/1/1997 | M0022303 | 3
Z234 | CT-A-SMK | 1/1/1998 | M0034125 | 9
Z234 | IC9-3051 | 1/1/2001 | M0022303 | 3
Z234 | CT-A-NSK | 1/1/2002 | M0034125 | 9

Patient_id_e | Birth_date | Sex_cd | Race_cd | Death_date
Z234 | 3/4/1924 | Female | Black | 4/5/2003

Provider_id | Provider_path | Name_char
M0022303 | MGH\Neurology\M0022303 | M0022303

Concept_cd | Concept_path | Name_char
CT-A-SMK | AsthV1\DRptNLP\Tobacco Use\Smoker | Smoking
IC9-3051 | V2\Diagnosis\Mental Disorders (290-319)\Non-psychotic disorders (300-316)\(305) Nondependent abuse of drugs\(305-1) Tobacco use disorder\(305-11) Tobacco use disorder, co~ | Tobacco Use Disorder, continuous use
CT-A-NSK | AsthV1\DRptNLP\Tobacco Use\Non smoker | Never smoked
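
The smoking use case can be tried out with Python's built-in `sqlite3` module. The sketch below is an assumption-laden illustration, not the CRC schema: the table names `observation` and `concept` are invented, the column names follow the slide, dates are rewritten as ISO strings so they sort correctly, and the row values reflect one reading of the slide's run-together text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_cd TEXT PRIMARY KEY,
    name_char  TEXT NOT NULL
);
CREATE TABLE observation (
    patient_id_e   TEXT NOT NULL,
    concept_cd     TEXT NOT NULL REFERENCES concept(concept_cd),
    start_date     TEXT NOT NULL,  -- ISO 8601, so text order is date order
    provider_id    TEXT NOT NULL,
    confidence_num INTEGER
);
""")

conn.executemany("INSERT INTO concept VALUES (?, ?)", [
    ("CT-A-SMK", "Smoking"),
    ("IC9-3051", "Tobacco Use Disorder, continuous use"),
    ("CT-A-NSK", "Never smoked"),
])

# One row per observation: the "atomic fact" of the schema.
conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?, ?)", [
    ("Z234", "CT-A-SMK", "1997-01-01", "M0022303", 3),
    ("Z234", "CT-A-SMK", "1998-01-01", "M0034125", 9),
    ("Z234", "IC9-3051", "2001-01-01", "M0022303", 3),
    ("Z234", "CT-A-NSK", "2002-01-01", "M0034125", 9),
])


def smoking_history(patient_id: str) -> list:
    # Join each observation to its concept name, oldest first.
    return conn.execute(
        """
        SELECT o.start_date, c.name_char
        FROM observation AS o
        JOIN concept AS c ON c.concept_cd = o.concept_cd
        WHERE o.patient_id_e = ?
        ORDER BY o.start_date
        """,
        (patient_id,),
    ).fetchall()


history = smoking_history("Z234")
```

Recording a new kind of fact only requires a new row in `concept` and new rows in `observation`; no table definition changes, which matches the stated principle of a schema that does not change with new data types.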

36 Phenotype/Genotype Database Implementation: Asthma CRC DB “primed” with data from 90,000 patients from the Research Patient Data Registry; serves as fundamental data structure for the i2b2 supported data Querying and Visualization Application Suite; CRC DBs are able to fuse seamlessly together; various levels of de-identification to be supported for data sharing and publication.

37 Visualization and Analysis of CRC database: Post-CRC workflow

38 Visualization and Analysis Principles: Supported application suite to query and view CRC database contents; outside applications for analysis and viewing able to plug in to application suite; pipeline/workflow framework may be used for analysis and re-entry of derived data into CRC database.

39 Visualization and Analysis Architecture: Supported applications, querying and visualization: standard querying; data exploration.

40 Visualization and Analysis Architecture: Supported applications, ontology management: Ontology Management; integrate (outside?) population analysis applications.

41 Visualization and Analysis Architecture: Supported applications have plug-in architecture for outside analytic tools: standard web-link support with GET and POST oriented data transfer; support transfer of specifically transformed data to outside applications; complex analysis supported with workflow application.
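
The GET and POST oriented hand-off to an outside tool can be sketched with Python's standard `urllib`. The endpoint URL and the parameter names `patient_set` and `concept` are hypothetical; the slide does not define them. The POST request is only constructed here, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_get_link(base_url: str, params: dict) -> str:
    # GET style: the data travels in the query string of a plain web link.
    return f"{base_url}?{urlencode(params)}"


def build_post_request(base_url: str, params: dict) -> Request:
    # POST style: the same data travels in a form-encoded request body.
    body = urlencode(params).encode("utf-8")
    return Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical outside analysis tool and parameters.
params = {"patient_set": 42, "concept": "CT-A-SMK"}
link = build_get_link("http://example.org/analysis", params)
post = build_post_request("http://example.org/analysis", params)
```

GET links are easy to embed in a page or bookmark but suit only small payloads; POST is the usual choice when a larger, specifically transformed data set has to be handed over.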

42 Visualization and Analysis Architecture - Query

43 Visualization and Analysis Architecture - Exploration

44 Visualization and Analysis Architecture – Ontology mgmt

45 Visualization and Analysis Use Case

46 Visualization and Analysis Implementation of analysis tools: Workflow framework to accommodate external analytic applications. [Diagram labels: CRC DB; ProgID CA2.3, CX2.3, PN5.1, TH3.0, SN5.4, AA3.3, CN2.3, XN0.9; SN8745; PA5683; SNOMED CODE; patient id 0000004; account # 347; subject id 4.]

47 Final Assembly [diagram]. Labels shown: statistics application server; Gene expression in APOE ε4 Allele; Alzheimer’s; Seizures; ER visits; Clinic visits; Outcomes calculated every week; Surgery; Trauma; Multiple sclerosis; Diabetes; CT Scan; Hemorrhage; Thalamus; microarray (encrypted); ownership manager; encryption; Gene-Chips; population registry; database. Table columns: person; concept; date; raw value. Sample values: Z5937X, Z5956X, Z5937X; 3/4, 3/9, 5/2, 4/6.
