Presentation is loading. Please wait.

Presentation is loading. Please wait.

15th of June 2009Grids & e-Science 2009 Santander1 eScience activities in a brain imaging research network. David Rodríguez González SINAPSE collaboration.

Similar presentations


Presentation on theme: "15th of June 2009Grids & e-Science 2009 Santander1 eScience activities in a brain imaging research network. David Rodríguez González SINAPSE collaboration."— Presentation transcript:

1 15th of June 2009Grids & e-Science 2009 Santander1 eScience activities in a brain imaging research network. David Rodríguez González SINAPSE collaboration National e-Science Centre. School of Informatics & SFC Brain Imaging Research Centre, Division of Clinical Neuroscience University of Edinburgh On behalf of the SINAPSE Collaboration.

2 15th of June 2009Grids & e-Science 2009 Santander2 Contents  SINAPSE  eScience for SINAPSE  Data Protection  Data de-identification  Data Sharing  Portal  Status & Plans

3 15th of June 2009Grids & e-Science 2009 Santander3 Contents  SINAPSE  eScience for SINAPSE  Data Protection  Data de-identification  Data Sharing  Portal  Status & Plans

4 15th of June 2009Grids & e-Science 2009 Santander4 Slide by J. Wardlaw

5 15th of June 2009Grids & e-Science 2009 Santander5 Massive expansion in research imaging  All branches of medicine – particularly brain  Not just medicine – psychology, linguistics, engineering, parapsychology, etc.  In Scotland too!!! 8% UK population 12.5% of all highest rated departments. Highest concentration of biotech in Europe  Neuroscience – much larger than NIH But in 2006 there were machines, pockets of excellence, but little cohesion Slide by J. Wardlaw

6 15th of June 2009Grids & e-Science 2009 Santander6 The SINAPSE Project  Stands for Scottish Imaging Network: a Platform for Scientific Excellence.  Pooling initiative of six Scottish universities: Aberdeen, Dundee, Edinburgh, Glasgow, St. Andrews and Stirling.  Main objectives: develop imaging expertise, support multi-centre clinical research in conjunction with the Clinical Research Networks, improve the ability of neuroscientists to collaborate on clinical trials, have a direct impact on patient health.

7 15th of June 2009Grids & e-Science 2009 Santander7 SINAPSE – gluing it together  Networking CRNs – large patient populations CRFs – patient-focused research facilities Poolings – buys more science Individual projects – ageing cohorts, multicentre studies  Harmonise framework for image data management  Standardise imaging methods  Make available image processing methods  Ethics, good imaging research practice  Translation from bench to bedside Slide by J. Wardlaw

8 15th of June 2009Grids & e-Science 2009 Santander8 SINAPSE priority projects  Stroke, the brain and the blood-brain interface  Ageing brain to dementia  Novel molecular imaging markers for major psychiatric disorders  Innovative radiotracers for CNS inflammation

9 15th of June 2009Grids & e-Science 2009 Santander9 Slide by J. Wardlaw

10 15th of June 2009Grids & e-Science 2009 Santander10 Contents  SINAPSE  eScience for SINAPSE  Data Protection  Data de-identification  Data Sharing  Portal  Status & Plans

11 15th of June 2009Grids & e-Science 2009 Santander11 e-Science for SINAPSE  Sharing of research data and applications between centres is an important part of the SINAPSE project’s objectives The increasing amount of data acquired in modern imaging facilities and the distributed nature of SINAPSE require a proper data management strategy  National e-Science Centre actively involved in the SINAPSE collaboration Mainly through the IT & Image Analysis Committee

12 15th of June 2009Grids & e-Science 2009 Santander12 eScience project activities  Information governance & data de- identification Networking Development of de-identification tool  Data sharing infrastructure Facilitating multi-centre studies  Portal for brain imaging Improving usability  Other Analysis methods

13 15th of June 2009Grids & e-Science 2009 Santander13 Contents  SINAPSE  eScience for SINAPSE  Data Protection  Data de-identification  Data Sharing  Portal  Status & Plans

14 15th of June 2009Grids & e-Science 2009 Santander14 Data Protection Act  UK’s Data Protection Act (1998). Implements the European Community Data Protection Directive 1995.  Establish individuals’ rights on data held about them and obligations for organisations or people processing personal data.  Personal data must be processed in a fair and lawful manner. 8 DPA principles.  Other legislation pieces apply to medical data. Common law: duty of confidentiality. Human Rights Act 1998 (article 8).

15 15th of June 2009Grids & e-Science 2009 Santander15 DPA in research  The DPA does not define the term “research purposes” apart from clarifying that it includes statistical or historical purposes.  Data processing for research should be ‘compatible’ with the purpose for which the data were originally obtained.  The data subjects should be aware that their personal information will be used for research purposes.

16 15th of June 2009Grids & e-Science 2009 Santander16 Anonymous Data  Coded (pseudonymised or linked anonymised) data: the identifiable information has been substituted by alphanumerical sequences with no plain meaning. The data is anonymous to the research team. The key to reverse the transformation shall be held securely by a third party to avoid falling into the DPA.  (Fully) Anonymised data: all personal identifiers or codes have been irreversibly removed.

17 15th of June 2009Grids & e-Science 2009 Santander17 MIDAS meeting (18 th March 2009)  Medical Imaging Data Access and Sharing  Hosted in the e-Science Institute  Brought together representatives from the NHS Scotland & the universities  Successful meeting with useful discussion Came out with a roadmap for improving the data sharing between both sides Report produced now being circulated between attendees

18 15th of June 2009Grids & e-Science 2009 Santander18 Contents  SINAPSE  eScience for SINAPSE  Data Protection  Data de-identification  Data Sharing  Portal  Status & Plans

19 15th of June 2009Grids & e-Science 2009 Santander19 SINAPSE DICOM De-Identification Toolkit  Implemented in Java.  Configurable for each site.  The idea is to deploy it as near as possible to the data acquisition.  Privacy Policy configurable using XML documents. Different projects can apply different policies. The policy specifies the classes that will execute the transformation of the data. Graphical tool for editing the policies.  These classes will be distributed in signed jars, and their authenticity will be checked using their hash. For data provenance checks and auditing purposes the classes’ version will be tracked.

20 15th of June 2009Grids & e-Science 2009 Santander20 Data De-Identification National PACS CHI Transformation Service SINAPSE Anonymiser Local Storage Anonymous research data Link Table NHSResearch Centre Local RIS

21 15th of June 2009Grids & e-Science 2009 Santander21 CHI Transformation Service  CHI (Community Health Index) is the National unique identifier for NHS (Scotland) patients Used in any health related communication As it identifies the patient it is sensitive information  It is composed of 10 digits that include Date of birth Gender Control digit  Possibilities Reversible / Irreversible transformation Unique for all SINAPSE / Unique for each Data Controller

22 15th of June 2009Grids & e-Science 2009 Santander22 Data input De- identification Metadata extraction Data output Anonymiser workflow File systemReceiverFile systemSFTPContentProvenanceStructureCatalogue

23 15th of June 2009Grids & e-Science 2009 Santander23 DICOM standard DICOM library DICOM library adaptor Anonymiser library FieldTransformer ApplicationPolicy Builder FieldTransformer Privacy policy SINAPSE Anonymiser components

24 15th of June 2009Grids & e-Science 2009 Santander24 FieldTransformers  Classes implementing an interface that are used for atomic transformations of the contents of fields.  Specified in run time by the used Privacy Policy.  Format independent. Only work with the content.  Examples: DatesTransformer StudyIDTransformer InformationOverwriter …

25 15th of June 2009Grids & e-Science 2009 Santander25 Privacy Policies  XML documents containing the rules for anonymising the data  Specify: The target fields The class used for the transformation including:  Version  Digest  Location (jar file) Parameters

26 15th of June 2009Grids & e-Science 2009 Santander26 Policy Editor  A graphical tool to help building policy documents.  DICOM dictionary.  Searches for “FieldTransformer” classes in jar files.

27 15th of June 2009Grids & e-Science 2009 Santander27 Policy Editor

28 15th of June 2009Grids & e-Science 2009 Santander28 Registry  A catalogue containing privacy policies.  The application can work without this.  But it helps to set a coherent set of policies. Transformer classes,  and the corresponding jar files.

29 15th of June 2009Grids & e-Science 2009 Santander29 Contents  SINAPSE  eScience for SINAPSE  Data Protection  Data de-identification  Data Sharing  Portal  Status & Plans

30 15th of June 2009Grids & e-Science 2009 Santander30 Data Sharing e-Infrastructure  For enabling multi-centre clinical research through data sharing  Some features of the proposed of the SINAPSE e- infrastructure project are: De-Identification, automatic compliance with data protection policies; Security, advanced authentication and authorisation within projects; Usability, providing a user friendly environment to access data and applications; Modularity, conforming to relevant standards and use of existing components; Centralisation, leveraging existing compute clusters and storage.

31 15th of June 2009Grids & e-Science 2009 Santander31 Benefits  Easier Data Protection compliance for users  Enables secure data sharing  Coherent view of available data (single point of access)  Roadmap for end-of-project data publication & data curation

32 15th of June 2009Grids & e-Science 2009 Santander32 Data Storage & Access  Centralised model adopted: cheaper, easier, allows to reduce the IT burden undertaken by research staff. Although there are several grid projects that provide DICOM functionalities.  The research data will be encrypted before storing it.  Data organised per project Access control using groups & roles.  Authentication using Shibboleth due to usability concerns regarding X.509 certificates.

33 15th of June 2009Grids & e-Science 2009 Santander33 Centralised Architecture (pros & cons)  Simpler Deployment  Easier middleware release control  Lesser impact in participant centres  Easier to manage and use  No default resilience A second centre would be needed But this is only necessary for critical services With a good support a reasonable service can be provided using a single centre

34 15th of June 2009Grids & e-Science 2009 Santander34 Deployment Plan  ECDF (http://www.is.ed.ac.uk/ecdf/) ‏http://www.is.ed.ac.uk/ecdf/ A singular facility along Scotland  Disk space and CPU time will be rented depending on the necessities. 1456 CPU cores 275 TB of disk  Also SINAPSE owned server to be hosted by ECDF: ECDF will provide basic hardware + software support SINAPSE services to be hosted in it:  Portal  Data Catalogue  Research Data encryption service  OGSA-DAI  Projects’ customised databases  RAPID…

35 15th of June 2009Grids & e-Science 2009 Santander35 Data Files University Authentication Service Metadata Catalogue SINAPSE Storage Data Catalogue Data Upload Local Storage Portal Data Upload Service Metadata extraction Data Encryption Data Storage

36 15th of June 2009Grids & e-Science 2009 Santander36 Metadata Catalogue SINAPSE Storage Data Catalogue Portal Data Decryption Clinical Information Data Access

37 15th of June 2009Grids & e-Science 2009 Santander37 RESOURCES DATA PROVIDER SERVICES SINAPSE SERVICES CPUsStorage NetworkLocal AuthSINAPSE Anonymiser CHI Transformation Service VOMSJSSMetadata CatalogueRD Key StoragePortalBasic WSShibbolethRAPIDRD EncryptionOGSA-DAI SINAPSEEXTERNALLOCAL APPLICATIONS Storage AgeingPsychiatryStroke… CPUs

38 15th of June 2009Grids & e-Science 2009 Santander38 Authentication  Shibboleth federated authentication Single sign-on Delegated to home universities Users will continue using a method they are already familiar with  X.509 certificates are usual in Grids But can be a handicap for some users

39 15th of June 2009Grids & e-Science 2009 Santander39 Authorisation  Dynamic Virtual Organisations Members should be added/removed easily New VOs creation for new projects/studies VO role management  Role based access Allows different access levels to information for different users

40 15th of June 2009Grids & e-Science 2009 Santander40 Catalogues  Data Catalogue for keeping track of the files in the system  Metadata Catalogue storing key attributes extracted from the DICOM headers It will also keep information on de- identification process for data provenance  Clinical Information databases and customised metadata databases can be deployed by the different projects  OGSA-DAI will be used to provide access to these resources

41 15th of June 2009Grids & e-Science 2009 Santander41 Contents  SINAPSE  eScience for SINAPSE  Data Protection  Data de-identification  Data Sharing  Portal  Status & Plans

42 15th of June 2009Grids & e-Science 2009 Santander42 Portal  A gridsphere based portal will give access to the resources.  Basic functionality to be provided by SINAPSE Data uploading Catalogues querying …  Different subprojects will develop their own customised portlets to be integrated in the portal

43 15th of June 2009Grids & e-Science 2009 Santander43 A portal for brain imaging (MSc project by Albert Heyrovský)  Motivations to facilitate the usage of complex software packages to provide access to large computing resources like ECDF  First application: brain perfusion imaging analysis  Easily extensible to other brain imaging applications  Portlets generated using the Rapid system developed at NeSC

44 15th of June 2009Grids & e-Science 2009 Santander44 Contents  SINAPSE  eScience for SINAPSE  Data Protection  Data de-identification  Data Sharing  Portal  Status & Plans

45 15th of June 2009Grids & e-Science 2009 Santander45 Status  The proposal was adopted by the SINAPSE IT & Image Analysis committee  Grant application to support pilot project (including hardware & storage resources) rejected Considering resubmission  SINAPSE De-Identification Toolkit deployed SBIRC (Edinburgh) Aberdeen Used for anonymising acute stroke study data

46 15th of June 2009Grids & e-Science 2009 Santander46 Plans  Development of new components started: Catalogues.  Portal for brain imaging to be kickstarted with the MsC in eScience student´s project  Collaboration with other centres CRIC (Edinburgh) TMRC (Dundee)

47 15th of June 2009Grids & e-Science 2009 Santander47 Questions


Download ppt "15th of June 2009Grids & e-Science 2009 Santander1 eScience activities in a brain imaging research network. David Rodríguez González SINAPSE collaboration."

Similar presentations


Ads by Google