1 Global Analysis of Arthropod Evolution – a successful grid project Craig A. Stewart, Rainer Keller, Matthias Hess, Uwe Woessner, Martin Aumüller, Matthias.

Presentation transcript:

1 Global Analysis of Arthropod Evolution – a successful grid project Craig A. Stewart, Rainer Keller, Matthias Hess, Uwe Woessner, Martin Aumüller, Matthias Müller, Richard Repasky, David Hart, Huian Li, Donald K. Berry University Information Technology Services, Indiana University High Performance Computing Center Stuttgart And many other contributors… © Copyright Trustees of Indiana University 2004

License Terms Please cite this presentation as: Stewart, C.A., R. Keller, M. Hess, U. Wössner, M. Aumüller, M. Müller, R. Repasky, D. Hart, H. Li and D.K. Berry. Global grid analysis of arthropod evolution – a successful grid project Presentation. Presented at: 7th HLRS Metacomputing and GRID Workshop (Stuttgart, Germany, 26 Apr 2004). Available from: Portions of this document that originated from sources outside IU are shown here and used by permission or under licenses indicated within this document. Items indicated with a © or denoted with a source url are under copyright and used here with permission. Such items may not be reused without permission from the holder of copyright except where license terms noted on a slide permit reuse. Except where otherwise noted, the contents of this presentation are copyright 2004 by the Trustees of Indiana University. This content is released under the Creative Commons Attribution 3.0 Unported license. This license includes the following terms: You are free to share – to copy, distribute and transmit the work and to remix – to adapt the work under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.

3 Outline The SCxy conference and the HPC Challenge The biological problem The software used The global grid The results! Acknowledgements

4 The SCxy conference and the HPC Challenge Supercomputing Conference (sponsored by ACM and IEEE) High Performance Computing (HPC) Challenge –demonstrates new capabilities in advanced computing systems –(or sometimes silly supercomputer tricks)

5 Biological problem Are Hexapods a single evolutionary group? Are ecdysozoans a single evolutionary group?

6 A partial bestiary All organism illustrations copyright Jennifer Fairman, Used by agreement

7 Software and data analysis Non-grid preparatory work –Download sequences from NCBI (67 taxa, 12,162 bp, mitochondrial genes for 12 proteins) –Align sequences with Multi-Clustal –Determine rate parameters with TreePuzzle Grid preparatory work –Analyze performance of fastDNAml with Vampir –Meetings via Access Grid & COVISE The grid software –PACX-MPI – Grid/MPI middleware –COVISE – Collaboration and visualization –fastDNAml – Maximum Likelihood phylogenetics

8 A project of HLRS (High Performance Computing Center Stuttgart) PACX-MPI (PArallel Computer eXtension) enables seamless execution of MPI-conforming parallel applications on a Grid. The application is recompiled and linked with PACX-MPI. Communication between MPI processes within one machine is done with the vendor MPI, while communication to other parts of the Metacomputer is done via the connecting network. Key advantages: –Optimized vendor MPI library is used. –Two daemons (MPI processes) take care of communication between systems – allows bundling of communication.
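The routing rule described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PACX-MPI code: the names build_offsets, locate and route are hypothetical, the layout of 4, 2 and 3 application processes is made up, and the split into one outgoing and one incoming daemon per machine is an assumption (the slide only says that two daemons handle communication between systems).

```python
from bisect import bisect_right
from itertools import accumulate


def build_offsets(app_procs_per_machine):
    """Cumulative start offsets of global application ranks, one per machine."""
    return [0] + list(accumulate(app_procs_per_machine))


def locate(global_rank, offsets):
    """Map a global application rank to (machine index, local rank)."""
    if not 0 <= global_rank < offsets[-1]:
        raise ValueError("rank out of range")
    machine = bisect_right(offsets, global_rank) - 1
    return machine, global_rank - offsets[machine]


def route(src, dst, offsets):
    """List the hops a message takes between two global ranks.

    Same machine: a single hop over the vendor MPI.
    Different machines: via the daemons and the connecting network
    (outgoing/incoming daemon roles are an assumption of this sketch).
    """
    src_machine, _ = locate(src, offsets)
    dst_machine, _ = locate(dst, offsets)
    if src_machine == dst_machine:
        return [f"vendor MPI on machine {src_machine}"]
    return [
        f"vendor MPI to outgoing daemon on machine {src_machine}",
        f"network link from machine {src_machine} to machine {dst_machine}",
        f"vendor MPI from incoming daemon on machine {dst_machine}",
    ]


# Hypothetical metacomputer: three machines with 4, 2 and 3 application processes.
offsets = build_offsets([4, 2, 3])
print(route(0, 3, offsets))  # both ranks live on machine 0
print(route(0, 8, offsets))  # rank 8 lives on machine 2
```

Because every inter-machine message passes through the same pair of daemons, several small messages bound for the same remote machine can be bundled into one network transfer, which is the advantage the slide points to.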

9 COVISE COllaborative VIsualization and Simulation Environment A project of HLRS (High Performance Computing Center Stuttgart) Focus on collaborative and interactive use of supercomputers Interactive startup of calculation on a Computational Grid Real-Time visualization of the results and the performance of computation.

10 fastDNAml Maximum likelihood (ML) analysis of phylogenetic trees based on DNA sequences Foreman/worker MPI program Heuristic search for best trees For 67 taxa: about 2.12 × 10^109 possible trees Goal: 300 bootstraps, 10 jumbles per bootstrap – 3000 executions (more than 3x typical!)
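The size of the search space for 67 taxa can be checked with the standard count of unrooted, fully bifurcating trees, (2n-5)!! = 3 × 5 × ... × (2n-5). The sketch below assumes that this is the quantity the slide refers to. The function name is hypothetical and nothing here is taken from fastDNAml itself.

```python
def unrooted_tree_count(n_taxa):
    """Number of distinct unrooted bifurcating trees for n_taxa labelled tips.

    Uses the double factorial (2n-5)!!, the product of the odd numbers
    from 3 up to 2n-5 (an empty product, i.e. 1, when n_taxa == 3).
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for factor in range(3, 2 * n_taxa - 4, 2):
        count *= factor
    return count


trees = unrooted_tree_count(67)
print(f"{trees:.2e} possible trees for 67 taxa")

# Workload stated on the slide: 300 bootstraps, 10 jumbles for each.
bootstraps, jumbles = 300, 10
print(bootstraps * jumbles, "fastDNAml executions")
```

For 67 taxa the count is a 110-digit number, on the order of 10^109, which is why a heuristic search is used rather than exhaustive enumeration.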

11 Why this project on a grid? Important & time-sensitive biological question requiring massive computer resources A biologically-oriented code that scales well Grid middleware environment & collaboration tool well suited to the task at hand Opportunity to create a grid spanning every continent on earth (except Antarctica)

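12 The metacomputers
Metacomputer One: Origin (Spain); Linux cluster, 64 processors (Japan); Linux cluster, 12 processors (Australia); IBM SP, 32 processors (US)
Metacomputer Two: T3E, 128 processors (Germany); IBM SP, 64 processors (US); DEC Alpha, 4 processors (Brazil); Sun Fire (Singapore)
Metacomputer Three: Hitachi SR (Germany); Cray T3E, 128 processors (UK); Cray T3E, 32 processors (US); IBM SP (Blue Horizon), 32 processors (US)
Metacomputer Four: DEC Alpha (Lemieux), 64 processors (US)
Metacomputer Five: Linux system, 1 processor (Tunisia)
Five functional units; 8 types of systems (several on Top500 list); 6+ vendors; 641 processors; 9 countries, 6 continents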

13

14 The results ~200 trees were analyzed during the course of the week The biological results are still being analyzed Our HPC challenge project was awarded the prize for “Most geographically distributed application”

15 Things we learned Matching the coarseness of the parallelism to the network speeds was important. There was real value to the use of the metacomputer concept within the overall grid. You can distribute a lot of machine computations, but less of the human work (=> simplicity is a virtue). Today there are few large-scale grids delivering computational services for biological computation in a persistent fashion. The temporary grid we created ranks as one of the larger grids ever created for biological computing.
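The first lesson, on task coarseness versus network speed, can be made concrete with a rough cost model for one foreman/worker task. The model and every number below are assumptions chosen for illustration. None of them are measurements from this project.

```python
def worker_efficiency(compute_s, message_bytes, latency_s, bandwidth_bytes_per_s):
    """Fraction of a worker's time spent computing for one task.

    Assumed model: each task costs one round trip to the foreman
    (task out, result back), and each direction pays the link latency
    plus the transfer time for message_bytes.
    """
    one_way = latency_s + message_bytes / bandwidth_bytes_per_s
    return compute_s / (compute_s + 2 * one_way)


# Hypothetical intercontinental link: 150 ms latency, 1 MB/s, 10 kB messages.
link = dict(message_bytes=10_000, latency_s=0.15, bandwidth_bytes_per_s=1_000_000)

coarse = worker_efficiency(compute_s=30.0, **link)   # one long task per message
fine = worker_efficiency(compute_s=0.05, **link)     # many tiny tasks
print(f"coarse-grained tasks: {coarse:.1%} of time computing")
print(f"fine-grained tasks:   {fine:.1%} of time computing")
```

Under this model a worker stays busy only when each task computes for much longer than the round trip takes, so slow wide-area links call for coarse-grained work units.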

16 For further information fastDNAml: PACX-MPI: COVISE: HLRS: UITS: uits.iu.edu Center for Genomics and Bioinformatics: SCxy: about.uits.iu.edu/divisions/rac/index.html about.uits.iu.edu/divisions/rac/pubsstaff.html ingen.iu.edu it.iu.edu

17 Acknowledgments This research was supported in part by the Indiana Genomics Initiative. The Indiana Genomics Initiative of Indiana University is supported in part by Lilly Endowment Inc. This work was supported in part by Shared University Research grants from IBM, Inc. to Indiana University. This material is based upon work supported by the National Science Foundation under Grant No and Grant No. CDA Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF). Assistance with this presentation: John Herrin, Malinda Lingwall, W. Les Teach Thanks to the SciNet team and SC2003 organizers! This project was an outcome of a kind invitation from Prof. Dr. Michael Resch and HLRS to Craig Stewart last year.

18 Our partners

19
Rainer Keller, Matthias Hess: HLRS, University of Stuttgart
Richard Repasky: UITS, Indiana University
John Colbourne: Center for Genomics and Bioinformatics, Indiana University
Craig Stewart, David Hart: UITS, Indiana University
Jennifer Steinbachs: Center for Genomics and Bioinformatics, Indiana University
Uwe Woessner: HLRS, University of Stuttgart
Donald Berry: UITS, Indiana University
Matthias Mueller: HLRS, University of Stuttgart
Huian Li: UITS, Indiana University
Gary W. Stuart: Center for Genomics and Bioinformatics, Indiana University
Michael Resch: HLRS, University of Stuttgart
Eric Wernert: UITS, Indiana University
Martin Aumüller, Ulrich Lang: HLRS, University of Stuttgart
Markus Buchhorn: Australian National University
Hiroshi Takemiya: National Institute of Advanced Industrial Science & Technology, Japan
Rim Belhaj: ISET'Com, Tunisia
Wolfgang E. Nagel: ZHR, Technical University of Dresden
Sergiu Sanielevici: Pittsburgh Supercomputing Center
Sergio Takeo Kofuji: LCCA/CCE-USP
David Bannon: Victorian Partnership for Advanced Computing, Australia
Norihiro Nakajima: Japan Atomic Energy Research Institute
Rosa Badia: CEPBA-IBM Research Institute
Mark A. Miller: San Diego Supercomputer Center
Hyungwoo Park: Korea Institute of Science and Technology Information
Rick Stevens: Argonne National Laboratory
Fang-Pang Lin: National Center for High Performance Computing
John Brooke: Manchester Computing
David Moffett: Purdue University
Tan Tin Wee: National University of Singapore
Greg Newby: Arctic Region Supercomputing Center
J.C.T. Poole: CACR, Caltech
Ramched Hamza: Sup'com, Tunisia
Mary Papakhian, John N. Huffman: UITS, Indiana University
Leigh Grundhoeffer: UITS, Indiana University
Ray Sheppard: UITS, Indiana University
Peter Cherbas: Center for Genomics and Bioinformatics, Indiana University
Stephen Pickles, Neil Stringfellow: CSAR, University of Manchester

20 Thank you! Questions?