BioGrids in the US: Current status and future opportunities (Craig A. Stewart, 15 April 2004)


1 BioGrids in the US: Current status and future opportunities
Craig A. Stewart, 15 April 2004, stewart@iu.edu
Director, Research and Academic Computing
Director, Information Technology Core, Indiana Genomics Initiative

2 License Terms
Please cite this presentation as: Stewart, C.A. BioGrids in the US: Current status and future opportunities. 2004. Presentation. Presented at: International School on Physics and Industry workshop on Particle Accelerators and Detectors: from Physics to Medicine (Ettore Majorana Foundation and Center for Scientific Culture, Erice, Italy, 15 Apr 2005). Available from: http://hdl.handle.net/2022/14780
Portions of this document that originated from sources outside IU are shown here and used by permission or under licenses indicated within this document. Items indicated with a © are under copyright and used here with permission. Such items may not be reused without permission from the holder of copyright except where license terms noted on a slide permit reuse.
Except where otherwise noted, the contents of this presentation are copyright 2004 by the Trustees of Indiana University. This content is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – to copy, distribute and transmit the work and to remix – to adapt the work under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.

3 What is a grid?
A grid is a system including computational resources, data storage resources, visualization resources, and specialized instruments, tied together by high-performance networks
Why grids?
–Transcend limits of location
–Use resources that would otherwise not be accessible
–To do things that would otherwise not be possible

4 Types of grids
By area of focus:
Collaboration grid
Computational grid
–Supercomputer grids
–Cycle scavenging
Data grids
Hybrid grids
Not included as part of this classification system: openness of software or organizational structure

5 Computational Grid: TeraGrid
Key US national grid effort
Based on Globus infrastructure
Attempts to solve grid technology challenges in a very general fashion
Currently 9 sites
Little application thus far specifically in the area of biology
Construction project: first we build it, then…

6 Special purpose Computational Grid: IU/HLRS 2003 HPC Challenge
Global analysis of Arthropod evolution
One application: fastDNAml
8 types of systems; 641 processors; 6 continents
200 trees analyzed

7 Cycle scavenging Computational Grids
Folding at home (www.stanford.edu/group/pandegroup/folding/)
Fight AIDS at home (fightaidsathome.scripps.edu/)
Evolution@home (www.evolutionary-research.net/)
http://www.stanford.edu/group/pandegroup/folding/results.html

8 Data grids in biology
Research data grids
–Centralized Life Sciences Data Service
–TeraGrid (www.teragrid.org)
Research and clinical data grids
–SPIN (Shared Pathology Informatics Network)
–Central Indiana Hospitals

9 Centralized Life Sciences Data Service at Indiana University
Goal: transparent and integrated access to multiple data sources
Federated database approach focuses on establishing glue between existing databases
“Private” databases stay where they are, under local control
“Public” databases may be replicated locally for performance
Queries are entered as standard SQL
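The federated idea on this slide can be sketched in a few lines. This is only an illustration, not the CLSD implementation: it uses Python's built-in sqlite3 module, with two attached in-memory databases standing in for a "private" lab database and a replicated "public" one. The schema names (private_lab, public_ref), the tables, and the gene values are all hypothetical.

```python
import sqlite3


def build_federation():
    """Create one connection that spans two separate (hypothetical) databases."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates an independent in-memory database under its own schema name.
    conn.execute("ATTACH DATABASE ':memory:' AS private_lab")
    conn.execute("ATTACH DATABASE ':memory:' AS public_ref")
    conn.execute("CREATE TABLE private_lab.expression (gene TEXT, level REAL)")
    conn.execute("CREATE TABLE public_ref.location (gene TEXT, chromosome TEXT)")
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO private_lab.expression VALUES (?, ?)",
        [("geneA", 2.5), ("geneB", 0.4)],
    )
    conn.executemany(
        "INSERT INTO public_ref.location VALUES (?, ?)",
        [("geneA", "chr1"), ("geneC", "chr7")],
    )
    return conn


def genes_with_location(conn, min_level):
    """One standard SQL query that joins data held in both databases."""
    query = """
        SELECT e.gene, e.level, l.chromosome
        FROM private_lab.expression AS e
        JOIN public_ref.location AS l ON l.gene = e.gene
        WHERE e.level >= ?
        ORDER BY e.gene
    """
    return conn.execute(query, (min_level,)).fetchall()


conn = build_federation()
print(genes_with_location(conn, 1.0))
```

The point of the sketch is that the person writing the query sees a single SQL interface, while the underlying tables remain in separate stores.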

10 [Diagram of CLSD components. Labels: BLAST data sources (NR, EST, Swiss-Prot); BLAST engine; CLSD engine (IBM II); public data sources (LIGAND, BIND, ENZYME, dbSNP); MS SQL Server; IUSM workgroup databases; custom web application; portal]

11 CLSD: Finding Genes
Queries multiple databases, linking expression data (local and remote) and location data
Built by research lab in IUSM
Portal, built with CLSD as a grid back end
Hereditary Diseases and Family Studies Division, Dept. of Medical and Molecular Genetics, IU School of Medicine. Supported in part by NIH R01 NS37167.

12 Understanding Microarray Data
The Microarray Data Portal was created by the Center for Medical Genomics at IU School of Medicine. Supported in part by the 21st Century Research & Technology Fund and the Indiana Genomics Initiative. The Indiana Genomics Initiative is supported in part by a grant from Lilly Endowment, Inc.

13 Clinical data grids in Indiana
SPIN (Shared Pathology Informatics Network)
–Provides a distributed database of anonymized data about pathology specimens
–Data in compliance with US privacy regulations
–SPIN software runs at participating institutions
Regenstrief Institute
–From data vaults to data grids
–Hundreds of millions of patient records
–Clinical service grid serving central Indiana hospitals

14 Semantic requirements for BioData Grids
Interoperability of nomenclature and metadata is a critical challenge!
“A biologist would rather use another biologist’s toothbrush than another biologist’s terminology” – Thomas Kaufman
Consistent semantics are required!
Example projects:
–GO: Gene Ontology
–SBML: Systems Biology Markup Language
–MAGE-ML: MicroArray Gene Expression Markup Language
–SNOMED CT: SNOMED Clinical Terms
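A small sketch may help show why consistent semantics matter for data grids. It assumes two labs that describe the same biology with different local terms and maps both onto one shared vocabulary before comparing annotations. The identifiers (EX:0001, EX:0002), the lab synonym tables, and the gene names are hypothetical and are not taken from GO, SNOMED CT, or any other real ontology.

```python
# Hypothetical shared vocabulary: identifiers and labels are invented for illustration.
SHARED_TERMS = {
    "EX:0001": "programmed cell death",
    "EX:0002": "cell division",
}

# Each lab keeps its own local terminology, mapped to the shared identifiers.
LAB_A_TERMS = {"apoptosis": "EX:0001", "mitosis": "EX:0002"}
LAB_B_TERMS = {"pcd": "EX:0001", "cell cycle division": "EX:0002"}


def to_shared_id(local_term, mapping):
    """Return the shared identifier for a lab's local term, or None if unmapped."""
    return mapping.get(local_term.strip().lower())


def normalize(annotations, mapping):
    """Convert {gene: local term} into {gene: shared identifier}, skipping unmapped terms."""
    normalized = {}
    for gene, term in annotations.items():
        shared_id = to_shared_id(term, mapping)
        if shared_id is not None:
            normalized[gene] = shared_id
    return normalized


def agreeing_genes(a, b):
    """Genes that both labs annotate with the same shared identifier."""
    return sorted(g for g in a.keys() & b.keys() if a[g] == b[g])


lab_a = normalize({"geneA": "Apoptosis", "geneB": "mitosis"}, LAB_A_TERMS)
lab_b = normalize({"geneA": "PCD", "geneB": "unknown process"}, LAB_B_TERMS)
print(agreeing_genes(lab_a, lab_b))  # prints ['geneA']
```

Without the shared identifiers, "Apoptosis" and "PCD" would look like unrelated annotations; with them, the two records for geneA can be recognized as equivalent.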

15 Hybrid Grids
SCrAPS
–Advanced Photon Source at Argonne National Laboratory
–“Better than being there” functionality
–Real time integration of remote instruments, collaboration, computation, and visualization
–Near real time data movement
BIRN
–Key NIH funded biogrid
–Includes data, computation, visualization
Encyclopedia of Life
eDiamond

16 Where are we today?
By area of focus:
Collaboration grid
Computational grid
–Supercomputer grids
–Cycle scavenging
Data grids
Hybrid grids
By status:
Very general construction projects
Handcrafted grid solutions
Special projects (heroic efforts involved)
Ongoing production services

17 Looking Ahead
Access to computing power via grids is still largely experimental
Access to data via grids has transformed biomedical research and is transforming clinical practice
Access to instruments is still experimental
Great opportunities to advance biomedical research through use of grids
Biology is different
–Data is always collected somewhere
–Affinities between grid structure and future software structure
Sometimes grids are not the answer

18 Acknowledgments
This research was supported in part by the Indiana Genomics Initiative. The Indiana Genomics Initiative of Indiana University is supported in part by Lilly Endowment Inc.
This work was supported in part by Shared University Research grants from IBM, Inc. to Indiana University, and in particular by IU’s relationship with IBM as an IBM Life Sciences Institute of Innovation.
This material is based upon work supported by the National Science Foundation under Grant No. 0116050 and Grant No. CDA-9601632. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).
UITS staff: Mary Papakhian, Stephen Simms, Richard Repasky, Matt Link, John Samuel, Eric Wernert, Anurag Shankar, Andrew Arenson, John Herrin, Malinda Lingwall, W. Les Teach

19 Thank you!
Further information available at: http://about.uits.iu.edu/divisions/rac/cv_stewart.html

