INFSO-RI
Enabling Grids for E-sciencE
A Grid Approach to Distributed Image Analysis for Early Diagnosis of Alzheimer Disease
Livia Torterolo
University of Genoa / DIST
EGEE User Forum, CERN
Agenda
Introduction to SPM analysis and to the development of a web-based SPM service
GRID implementation of the SPM service
Current Status and Plans
Partners
The first part of the work was carried out during the Italian Neuroinformatics OCSE Node project, in collaboration with:
– San Raffaele Hospital of Milano
– University of Milano-Bicocca
The GRID implementation was carried out by Bio-Lab at DIST, University of Genoa, during the Italian Grid.it project.
The application is a fairly stable work in progress: it served as a demonstrator for the Grid.it project, and further developments and extensions are still under way inside the GILDA testbed.
Introduction to Alzheimer Disease
Alzheimer Disease (AD) is the most common form of dementia, accounting for 50% to 70% of all dementias in elderly people (around 3 million people in the EU25).
Clinically, AD is characterized by a progressive loss of cognitive abilities.
Memory loss is typically the earliest sign of AD.
Analysis of PET-SPECT images
SPM Results
Results of an SPM analysis on a PET study of glucose metabolism in a patient with dementia, showing a hypometabolic pattern in the frontal cortex: design matrix (top right), statistically significant clusters on a glass brain in three orthogonal planes (top left) and on a 3D brain rendering (bottom).
Why a portal?
Remote access to SPM analysis could give doctors in peripheral hospitals an invaluable tool: they can run the analysis remotely using only a standard web browser.
No particular hardware resources or computer knowledge are required.
To avoid errors during the analysis, only selected users should access and use the service.
http://
SPM: Why GRID?
A large set of images of normal subjects is required for comparison.
PET and SPECT studies on normal subjects are very rare.
Patient images are subject to privacy and security constraints, so they cannot be freely moved over the network or published by the centre that performed the analysis.
Contribution to GILDA
LCG Grid Environment
Machines dedicated to the LCG node: 1 CE, 1 SE, 3 WNs, 1 UI, 1 LCG install server (12 CPUs, 7 GB of RAM, ≈ 350 GB of storage)
LCG tools used
Data Management and File Access tools: to access data from the User Interface using Logical File Names (LFNs)
LCG File Catalog (LFC): to register data in the catalog
AMGA Metadata Catalog: to attach related metadata to the data
Data Management and File Access
LCG File Catalog (LFC)
Several functionalities:
– Use of lcg_utils
– Use of GFAL calls
– Use of GSI certificates
– Access to grid files in SEs from “anywhere”
– Several replicas of files at different sites
– Copy of data between the local file system and the GRID
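The copy-and-register operation listed above is commonly done with the lcg-cr command from lcg_utils. The following is a minimal Python sketch of how such a command line could be assembled. The VO name, SE hostname, local path and LFN are hypothetical placeholders, not values from this project, and the command is only built, never executed.

```python
from pathlib import PurePosixPath


def build_lcg_cr_command(local_file, lfn, storage_element, vo="gilda"):
    """Build the argv list for an lcg-cr call (copy to an SE and register the LFN).

    Every value passed in is an illustrative assumption; nothing is executed.
    """
    local_path = PurePosixPath(local_file)
    if not local_path.is_absolute():
        raise ValueError("an absolute local path is needed to form a file:// URL")
    if not lfn.startswith("lfn:"):
        lfn = "lfn:" + lfn
    return [
        "lcg-cr",
        "--vo", vo,                    # virtual organisation (assumed name)
        "-d", storage_element,         # destination Storage Element
        "-l", lfn,                     # Logical File Name to register
        "file://" + str(local_path),   # source file on the User Interface
    ]


cmd = build_lcg_cr_command(
    "/home/user/images/subject01.img",         # hypothetical local image
    "/grid/gilda/spm/normals/subject01.img",   # hypothetical LFN
    "se.example.org",                          # hypothetical SE hostname
)
print(" ".join(cmd))
```

Returning an argument list instead of a shell string keeps the sketch safe to pass to a process launcher without quoting problems.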
AMGA: ARDA Metadata Grid Application
Works side by side with a File Catalogue (LFC) to provide file metadata
Access control to resources on the Grid is done via VOMS
Several front ends:
– mdcli & mdclient and C++ API
– Java Client API and command-line tools mdjavaclient.sh & mdjavacli.sh (also under Windows!)
– Python Client API
Strong security requirements:
– patient data is sensitive
– metadata access must be restricted to authorized users
GRID design of the application
With reference to the LCG tools mentioned above, the application code has been modified as follows:
– Registration and storage: data files (PET/SPECT images) are stored on the available SEs using lcg_utils, interacting with LFC and AMGA.
– Image access: a C program using the GFAL C API accesses the distributed images through their LFNs and extracts the information needed for the SPM analysis, without copying the images locally.
– Job submission: a JDL file is created to submit the executable containing the GFAL calls (and not the images) to the GRID.
– Statistical analysis: the SPM analysis is run on the results returned by the submitted job. The statistical analysis is performed outside the GRID environment.
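The job submission step can be illustrated with a small Python sketch that writes a JDL description shipping only the executable, while the images are referenced by LFN. The executable name, the VO name, the LFNs and the choice of passing LFNs as command-line arguments are all assumptions made for illustration; the original JDL file is not shown in these slides.

```python
def jdl_string_list(items):
    """Format a Python list as a JDL list of quoted strings."""
    return "{" + ", ".join('"%s"' % item for item in items) + "}"


def build_jdl(executable, lfns, vo="gilda"):
    """Return the text of a simple JDL file for one job.

    Only the executable goes into the InputSandbox; the images stay on
    the Storage Elements and are named by their LFNs (assumed here to be
    passed as arguments to the executable).
    """
    for value in [executable, vo] + list(lfns):
        if '"' in value:
            raise ValueError("double quotes are not allowed in JDL values here")
    lines = [
        'Executable = "%s";' % executable,
        'Arguments = "%s";' % " ".join(lfns),
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        "InputSandbox = %s;" % jdl_string_list([executable]),
        "OutputSandbox = %s;" % jdl_string_list(["std.out", "std.err"]),
        'VirtualOrganisation = "%s";' % vo,
    ]
    return "\n".join(lines) + "\n"


jdl_text = build_jdl(
    "extract_info",                                  # hypothetical GFAL-based program
    ["lfn:/grid/gilda/spm/normals/subject01.img"],   # hypothetical LFN
)
print(jdl_text)
```

Because the sandbox carries only the program, the sensitive image data is never transferred with the job description itself.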
Final GRID structure
Application Running: Setting Parameters
Setting SPM parameters
Selection of normal subjects
QUERIES to AMGA!
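The selection of normal subjects by metadata can be pictured with the following Python sketch. It does not use the AMGA client API: an in-memory list stands in for the metadata catalog, and the attribute names (diagnosis, modality, age) as well as the sample records are hypothetical, chosen only to show how a metadata query returns the LFNs of matching images.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """One catalog entry: an LFN plus metadata (attribute names are assumed)."""
    lfn: str
    diagnosis: str
    modality: str
    age: int


def select_normal_subjects(records, modality, min_age, max_age):
    """Return the LFNs of normal-subject images matching modality and age range."""
    return [
        r.lfn
        for r in records
        if r.diagnosis == "normal"
        and r.modality == modality
        and min_age <= r.age <= max_age
    ]


# Made-up placeholder entries standing in for the metadata catalog.
CATALOG = [
    ImageRecord("lfn:/grid/gilda/spm/img001", "normal", "PET", 68),
    ImageRecord("lfn:/grid/gilda/spm/img002", "normal", "SPECT", 71),
    ImageRecord("lfn:/grid/gilda/spm/img003", "dementia", "PET", 74),
    ImageRecord("lfn:/grid/gilda/spm/img004", "normal", "PET", 59),
]

selected = select_normal_subjects(CATALOG, modality="PET", min_age=60, max_age=80)
print(selected)
```

Only LFNs are returned, so the images themselves stay on their Storage Elements until the submitted job reads them.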
Application running: JOB SUBMISSION
SPM Analysis
RESULTS
Issues & Problems
Pros:
– Data distributed over different SEs and accessed through the GFAL API.
– Integration of a Grid application into an existing environment with hard constraints and several integration problems to solve.
– Application behaviour unchanged from the user's point of view.
Cons:
– The GFAL API is not well documented, and there are compatibility issues and library conflicts (but good support from the GILDA staff).
– Too many components (Zope + Plone, Python, shell scripts, Matlab, C/C++ compiled code, … & Grid!).
– Software maintenance is hard.
Current Status and Plans
Application
– pre-processing parallelization (further functions to port to the Grid)
Interface
– move to Genius 3.1
Overall framework
– improve performance and scalability
– A Grid environment devoted to the analysis of microarrays, derived from this project, will be tested within the European project BioinfoGRID (Bioinformatics Application for Life Science)
– A bioinformatics data storage and analysis platform with metadata support will be developed under the Italian MIUR-FIRB project LITBIO (Laboratory for Interdisciplinary Technologies in Bioinformatics)
Thanks for your attention!