Cancer surveillance network on a grid infrastructure
Paul De Vlieger
LPC CNRS - IN2P3 ERIM
EGEE-III INFSO-RI, Enabling Grids for E-sciencE
EGEE and gLite are registered trademarks

Outline
Introduction & Context
  – Cancer
  – Screening
  – Problems
The RSCA project
  – Idea
  – Architecture
  – Technical description
Specific issues
  – Security
  – Patient identification
Roadmap & Conclusion

Introduction & Context

Introduction
Cancer worldwide
  – Major public health problem
  – 25% of deaths in the US
  – Rising cancer care costs
In Europe
  – 2006: 1.2 million deaths
One third could be avoided:
  – lifestyle, diet, vaccination, screening
Breast cancer
  – Ranked #1 for women (25% of new cases)
  – Easy to investigate

Cancer challenges
Screening
  – “Earlier is better”
In the EU:
  – Breast cancer
      59 million women
      41% targeted in the 11 member states which deployed a nationwide screening program
  – Colon cancer
      136 million women & men
      43% targeted in 12 states
  – Cervical cancer
      109 million women
      51% targeted in 15 states
  – Melanomas (soon)
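
As a rough worked example, the targeted populations can be derived from the figures above. This is only a sketch: it assumes each percentage applies to the population listed with it, which the slide does not state explicitly, and the names `SCREENING` and `targeted_millions` are illustrative.

```python
# Figures quoted on the slide: (population in millions, share targeted).
# Assumption: each share applies to the population listed with it.
SCREENING = {
    "breast": (59, 0.41),
    "colon": (136, 0.43),
    "cervical": (109, 0.51),
}


def targeted_millions(population_millions, share):
    """Return the targeted population in millions, rounded to 2 decimals."""
    return round(population_millions * share, 2)


for cancer, (population, share) in SCREENING.items():
    print(f"{cancer}: {targeted_millions(population, share)} million targeted")
```

Under that assumption, the programs would target roughly 24.19, 58.48 and 55.59 million people respectively.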

Cancer fighting impact
Epidemiology
  – Publishes indicators on the mortality, incidence and prevalence of cancers
  – European, national and regional epidemiology structures exist
  – Evaluates the screening programs
Cancer statistics
  – Really hard to estimate
  – Relevance of information
      Needed to evaluate the impact of screening campaigns

Cancer incidence & mortality
[Figure: cases of cancer and deaths in the EU in 2006. Source: IARC]

Breast cancer incidence
[Figure: breast cancer incidence in EU states in 2004. Source: Cancer screening in European Union, Report 2007]

Breast cancer screening
[Diagram labels: Screening structure; Invitation: all women > 50; Mammography; Biopsy; Anatomical pathology report; Cancer treatment loop; Problem]

Problems
No automatic data transfer between anatomic pathology laboratories and cancer screening structures:
  – Data is requested by fax or carried by hand
Side effects:
  – Reinterpretation and retyping of data: a time-consuming activity
  – Invitation of persons who are already affected
  – Self-evaluation is biased by a lack of information about non-respondents:
      Are they already affected?
      Was a cancer diagnosed between 2 invitations?

Problems (2)
Anatomical pathologists
  – Refuse to share their data:
      The French union of cytopathologists asked its members not to transmit their data
  – The old procedure (CRISAP) has some drawbacks:
      Physicians have to do the job themselves
      Data is taken out of the production area
      Uses of the results do not mention the sources
      No patient disambiguation is done
  – Most CRISAPs are now idle:
      This impacts the quality of statistical indicators at different scales

Case of Auvergne
Region of France: 1.3 million inhabitants
2 screening structures: ARDOC & ABIDEC
Several anatomical pathology laboratories (private & public) (6)
  – #1 data holder on cancer
  – Perform biopsies and cancer analyses: stage, severity, metastases
  – NO cancer without an anatomic pathology report
CRISAP structure
  – Centralization of anatomic pathology data
Cancer registries

The RSCA Project

The RSCA project
“A grid-enabled surveillance network”
Query the databases of anatomical pathology laboratories on demand
  – Using grid technologies
  – No massive extraction of databases
Respectful of data ownership
  – Respectful of patient privacy
      Compliant with data processing laws
      Use of cryptographic algorithms
  – Guaranteeing all security requirements
      Using grid security layers
      Strong authentication methods
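
The idea of querying laboratories on demand, without bulk extraction, can be sketched as follows. This is a minimal illustration and not the RSCA implementation: `LabNode`, `count_reports`, `query_network` and the record fields are hypothetical, and the grid and security layers are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class LabNode:
    """Hypothetical stand-in for one laboratory's local database."""
    name: str
    reports: list = field(default_factory=list)  # records stay inside the lab

    def count_reports(self, cancer_type, year):
        # The query runs locally; only the resulting count leaves the lab.
        return sum(
            1
            for report in self.reports
            if report["cancer_type"] == cancer_type and report["year"] == year
        )


def query_network(labs, cancer_type, year):
    """Ask every lab on demand and combine the per-lab counts."""
    per_lab = {lab.name: lab.count_reports(cancer_type, year) for lab in labs}
    return {"per_lab": per_lab, "total": sum(per_lab.values())}


labs = [
    LabNode("lab_a", [{"cancer_type": "breast", "year": 2008},
                      {"cancer_type": "colon", "year": 2008}]),
    LabNode("lab_b", [{"cancer_type": "breast", "year": 2008}]),
]
result = query_network(labs, "breast", 2008)
```

Note that this naive sum would count a patient twice if two laboratories hold a report for the same person, which is why patient identification is treated as a specific issue.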

Sentinel network
[Diagram labels: Screening structure; Anatomic pathology lab; Screening structure; CRISAP database; National / regional epidemiology; GRID]

Architecture
[Architecture diagram]

Pandora Gateway
Designed in the Health-e-Child project by Maat-Gknowledge
  – Medical data accessibility
  – High level of security
  – Production status
Set of software designed as an SOA
  – Designed to enable secure access to the services needed by end-users
  – Complete abstraction of low-level business
      Databases, gLite, Globus…

Pandora Gateway architecture
[Diagram labels: Abstraction layer; Security; Workflow management; Logging; Domain logic; Business logic; Middleware: gLite / database / Globus]

Pandora Gateway: Security
[Diagram labels: Pandora Gateway; Authentication service; Service 1; Service 2; Service 3; VOMS. Source: Maat-Gknowledge]

Technical architecture
[Diagram labels: Certification Authority + GIPCPS]

Specific issues

Security requirements
On the business logic part:
  – Pandora Gateway
      PKI
      GSI
      VOMS
      Logging
On the authentication part:
  – Use of electronic health cards
      Contain a trusted certificate (X.509) and engage the physician's responsibility
      Available in France and soon in the EU
      Mandatory to obtain agreements for nominative medical data exchange

Patient identification
Major issue in a modern healthcare system
  – Difficulties in building the electronic health record
  – Use of existing identifiers is strictly supervised
How can persons be identified reliably while respecting their privacy?
  – Linking existing identifiers
  – Avoiding false matches
[Diagram labels: Data server; Cancer screening; Anatomic pathology lab; Data server]
  Patient X: Id= , Name= Martin, D= 17/01/65, ADDR= [...]
  Patient X: Id= S32AV48, Name= Martin, D= 17/01/65, ADDR= [...]
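
To illustrate what linking records while avoiding false matches could involve, here is a minimal sketch of a three-way comparison between two identity records. The rule, the field names (`name`, `birth_date`, `address`) and the functions `normalize` and `compare_records` are assumptions made for illustration; the slides do not describe the actual matching rule.

```python
import unicodedata


def normalize(text):
    """Lowercase, strip surrounding spaces and remove accents."""
    decomposed = unicodedata.normalize("NFKD", text.strip().lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def compare_records(a, b):
    """Return 'match', 'doubt' or 'no_match' for two identity records.

    Illustrative rule: all three fields agree -> match; name and birth
    date agree but the address differs -> doubt; anything else -> no match.
    """
    same_name = normalize(a["name"]) == normalize(b["name"])
    same_birth = a["birth_date"] == b["birth_date"]
    same_address = normalize(a["address"]) == normalize(b["address"])
    if same_name and same_birth and same_address:
        return "match"
    if same_name and same_birth:
        return "doubt"
    return "no_match"


screening = {"name": "Martin", "birth_date": "17/01/65", "address": "1 rue A"}
lab = {"name": " MARTIN", "birth_date": "17/01/65", "address": "1 Rue A"}
outcome = compare_records(screening, lab)
```

Keeping an explicit "doubt" outcome, instead of forcing a yes/no answer, leaves room for a human decision on borderline cases.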

Proposed solution
Add a specific identifier for the sentinel network
  – UUID type: 110E8400-E29B-11D4-A
When a data provider uploads data to the sentinel network:
  – A request is sent to the central server (Gateway) with the patient's identifying information
  – The central server routes the request to all data sources and returns an identification number
      Either an existing number if the patient is already known
      Or a new one
  – In case of doubt (incomplete match), manual intervention is needed to settle the problem
  – The retrieved identifier is then encrypted with the local public key of the data provider
When a user asks for patient information:
  – The request is straightforward: as the sentinel network contains only well-identified data, statistical queries are free of duplicates
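
The upload steps above can be sketched roughly as follows. This is a simplified illustration under several assumptions: the class `IdentificationServer` and its `resolve` method are hypothetical, a single in-memory dictionary stands in for routing the request to all data sources, "doubt" is reduced to missing identifying information, and the final encryption with the provider's public key is omitted.

```python
import uuid


class IdentificationServer:
    """Hypothetical central server mapping identities to sentinel identifiers."""

    def __init__(self):
        self._known = {}         # (name, birth_date) -> sentinel identifier
        self.manual_review = []  # identities awaiting a human decision

    def resolve(self, name, birth_date):
        """Return a sentinel identifier, or None when a human must decide."""
        if not name or not birth_date:
            # Incomplete identifying information: do not guess.
            self.manual_review.append((name, birth_date))
            return None
        key = (name.strip().lower(), birth_date)
        if key not in self._known:
            # Patient not known yet: issue a new random (version 4) UUID.
            self._known[key] = str(uuid.uuid4())
        return self._known[key]


server = IdentificationServer()
first = server.resolve("Martin", "17/01/65")
again = server.resolve(" martin ", "17/01/65")   # same patient, same identifier
unknown = server.resolve("", "17/01/65")         # queued for manual review
```

Because a known patient always gets the same identifier back, records uploaded by different providers can be counted once in statistical queries.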

Sentinel network
[Diagram labels: Identification server; Central security server; Virtual identification database; Screening; Firewall; GRID server; Data I/O; Data provider; Private server; Physician; Firewall; DATA input; GRID server; Virtual identification database; Private information; Public information; Private server; pubkey; privkey]
For a patient X:
  Id association = folder XXXXX
  Id anapath = folder XXXXX
  Id sentinel = A47X…A251 -> B123….A231 -> 71D3….E127 -> 77A2….82F1

Roadmap & Conclusion

Roadmap
03/2008: start of discussions
12/2008: list of requirements
02/2009: developer team agreements
03/2009: creation of the RSCA association
  – Manager of the sentinel network
07/2009: prototype alpha
  – Installation in one lab and one screening structure in Clermont-Ferrand
  – Smart card APIs ready
  – Focus on breast cancer
12/2009: prototype beta & first exchanges
2010-?: debugging & extension

Conclusion
Possibilities
  – Medical data exchange
      Nominative patient sheets
      Reliable patient identification system
  – Statistics & epidemiology
      Global view of cancer data sources
Future extensions
  – Medical images (mammograms)
      MDM already answered (gLite)
  – Other types of cancer
  – Other data sources, other clients
  – Calculation? Workflows?

Thanks