1
AHM September 2004
Grid Services Supporting the Usage of Secure Federated, Distributed Biomedical Data
Dr Richard Sinnott
Technical Director, National e-Science Centre
Deputy Director Technical, Bioinformatics Research Centre
University of Glasgow
3rd September 2004
2
Overview of BRIDGES
Biomedical Research Informatics Delivered by Grid Enabled Services (BRIDGES)
  NeSC (Edinburgh and Glasgow) and IBM
  www.brc.dcs.gla.ac.uk/projects/bridges
Supporting project for CFG project
  Generating data on hypertension
  Rat, Mouse, Human genome databases
Variety of tools used
  BLAST, BLAT, Gene Prediction, visualisation, …
Variety of data sources and formats
  Microarray data, genome DBs, project partner research data, medical records, …
Aim is integrated infrastructure supporting
  Data federation
  Security
3
Grids & Life Sciences
Extensive Research Community
  >1000 per research university
Extensive Applications
  Many people care about them
  Health, Food, Environment, …
Interacts with many disciplines
  Physics, Chemistry, Maths/Statistics, Nano-engineering, …
Huge and expanding number of databases relevant to bioinformatics community
  Heterogeneity, Interdependence, Complexity, Change, Dirty…
  Linking in co-ordinated, secure manner full of open issues to be addressed
Compute demands growing as more in-silico research undertaken
4
Database Growth
[Figure: PDB Content Growth]
DBs growing exponentially!!!
  Bibliographic (MedLine, PubMed, …)
  Amino Acid Seq (SWISS-PROT, …)
  3D Molecular Structure (PDB, …)
  Nucleotide Seq (GenBank, EMBL, …)
  Biochemical Pathways (KEGG, WIT, …)
  Molecular Classifications (SCOP, CATH, …)
  Motif Libraries (PROSITE, Blocks, …)
5
Complexity of Biological Data
  Nucleotide sequences
  Nucleotide structures
  Gene expressions
  Protein structures
  Protein functions
  Protein-protein interaction (pathways)
  Cell
  Cell signalling
  Tissues
  Organs
  Physiology
  Organisms
  Populations
+ links to plant/crops, environmental, health, … information sources
6
More genomes …
  Arabidopsis thaliana
  mouse
  rat
  Caenorhabditis elegans
  Drosophila melanogaster
  Mycobacterium leprae
  Vibrio cholerae
  Plasmodium falciparum
  Mycobacterium tuberculosis
  Neisseria meningitidis Z2491
  Helicobacter pylori
  Xylella fastidiosa
  Borrelia burgdorferi
  Rickettsia prowazekii
  Bacillus subtilis
  Archaeoglobus fulgidus
  Campylobacter jejuni
  Aquifex aeolicus
  Thermotoga maritima
  Chlamydia pneumoniae
  Pseudomonas aeruginosa
  Ureaplasma urealyticum
  Buchnera sp. APS
  Escherichia coli
  Saccharomyces cerevisiae
  Yersinia pestis
  Salmonella enterica
  Thermoplasma acidophilum
7
Bio e-Science Projects
8
Bridges Project
[Diagram labels: Synteny Grid Service; blast; + VO Authorisation; Information Integrator; OGSA-DAI]
9
Grid Security
OGSA security
  Single sign-on based on (X.509) digital certificates to establish credentials
  – Certification authority based (RAL in UK)
Services (and clients) have APIs for fine grained security
  Based on GSS-API
Provides for authentication but need authorisation
  Various technologies for authorisation including PERMIS, CAS, …
Collaborating with PrivilEge and Role Management Infrastructure Standards Validation (PERMIS) team
  Led by Prof David Chadwick, University of Salford
  – (www.permis.org)
10
Security Authorisation
PERMIS allows roles to be defined for who can do what on what
  Policy = { Role x Target x Action }
  – Can user X invoke service Y and access or change data Z?
    » Policies created with PERMIS PolicyEditor (output is XML based policy)
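The Policy = { Role x Target x Action } idea can be sketched in a few lines of Python. This is only an illustration of the set-of-triples model, not the PERMIS API or its XML policy format; the role, target and action names below are hypothetical and do not come from the BRIDGES policies.

```python
from typing import Iterable, NamedTuple


class Rule(NamedTuple):
    """One permitted (role, target, action) triple."""
    role: str
    target: str
    action: str


# Hypothetical policy: every name here is an assumption made for illustration.
POLICY = frozenset({
    Rule("cfg_scientist", "blast_service", "invoke"),
    Rule("cfg_scientist", "cfg_research_data", "read"),
    Rule("cfg_curator", "cfg_research_data", "read"),
    Rule("cfg_curator", "cfg_research_data", "write"),
})


def is_permitted(roles: Iterable[str], target: str, action: str,
                 policy: frozenset = POLICY) -> bool:
    """Return True only if some held role is explicitly granted the action.

    Anything not listed in the policy is denied.
    """
    return any(Rule(role, target, action) in policy for role in roles)


if __name__ == "__main__":
    # "Can user X invoke service Y and access or change data Z?"
    user_roles = {"cfg_scientist"}
    print(is_permitted(user_roles, "blast_service", "invoke"))
    print(is_permitted(user_roles, "cfg_research_data", "write"))
```

Modelling the policy as a set of explicit triples makes the default outcome a denial, which matches the question on the slide: a user may invoke service Y or change data Z only when a matching triple exists for one of their roles.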
11
Security Authorisation
PERMIS Privilege Allocator then used to sign policies
  Associates roles with specific users
  Policies stored as attribute certificates in LDAP server
When is authorisation done? Two main choices
  Portal personalised for users based on their policies
  – If not allowed to invoke service then they do not get to see it
  Actions of users (with given role) are authorised every time the service is invoked
  – They can see the service but potentially not be allowed to invoke it
    » Performance issues… but more likely scenario for authorisation
  In both cases, if not explicitly agreed in policy then rejected and logged!
  – Both cases being explored
Plan to exploit the GGF SAML AuthZ specification
  Based on GT3.3 – currently have BLAST service in GT3.2 Final
  – Identified issues with standards…
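The two enforcement points described on this slide can be contrasted in a minimal Python sketch. It assumes a simple in-memory table of allowed (role, service) pairs; the table, role names, service names and function names are all hypothetical, and neither PERMIS, LDAP nor the Globus Toolkit is used here.

```python
import logging
from typing import Callable, Iterable, List

logger = logging.getLogger("authz_sketch")

# Hypothetical policy: (role, service) pairs that are explicitly allowed.
ALLOWED = {
    ("cfg_scientist", "blast"),
    ("cfg_curator", "blast"),
    ("cfg_curator", "data_upload"),
}


def allowed(roles: Iterable[str], service: str) -> bool:
    """Deny unless some held role is explicitly granted the service."""
    return any((role, service) in ALLOWED for role in roles)


def visible_services(roles: Iterable[str], services: Iterable[str]) -> List[str]:
    """Choice 1: personalise the portal so users only see what they may invoke."""
    held = set(roles)
    return [s for s in services if allowed(held, s)]


def invoke(user: str, roles: Iterable[str], service: str,
           handler: Callable[[], object]) -> object:
    """Choice 2: authorise on every invocation; reject and log if not allowed."""
    if not allowed(roles, service):
        logger.warning("DENIED user=%s service=%s", user, service)
        raise PermissionError(f"{user} may not invoke {service}")
    return handler()


if __name__ == "__main__":
    print(visible_services({"cfg_scientist"}, ["blast", "data_upload"]))
    print(invoke("alice", {"cfg_scientist"}, "blast", lambda: "blast result"))
```

The per-invocation check repeats the policy lookup on each call, which is where the performance concern on the slide comes from. The portal filter runs once per page but does nothing to stop a caller who reaches the service by another route.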
12
Where we are today!
Information Integrator DB repository established and populated
  … with public data sets (OMIM, HUGO, RGD, SWISS-PROT)
  … linked to relevant resources (ENSEMBL - rat, human, mouse, MGI)
GT3 based Grid services developed (BLAST) using own meta-scheduler
  General usage of ScotGrid and local Condor pool
Portal developed using IBM WebSphere
Genome visualisation browsers
  SyntenyVista – for viewing synteny between local/remote data sets
  MagnaVista – for exploring genetic information across multiple (remote) resources
Gaining experience with security technologies
  Setting up policies with Grid security authorisation software etc
Rolled-out Alpha version of system to CFG group July '04
13
Lessons learned
Public data resources openness
  Often cannot query directly
  Often not easy/possible to find schemas
Joint Data Standards Study investigating this
  Started on 1st June and involves
  – Digital Archiving Consultancy
  – Bioinformatics Research Centre (Glasgow)
  – NeSC (Edinburgh and Glasgow)
  Look at technical, political, social, ethical etc issues involved in accessing and using public life science resources
  – Will liaise with NDCC
  – Interview relevant scientists, data curators/providers
  8 month project with final report in January
  – Funded by MRC, BBSRC, Wellcome Trust, JISC, NERC, DTI
GT3 not without pain! (… understatement!!!!)
  Hopefully GT4 will be better?
15
www.nesc.ac.uk