CI Updates and Planning Discussion


CI Updates and Planning Discussion
Dan Stanzione

CI Portfolio
- Technology Evaluation Projects
- Software Systems Prototyping/Design Activities
- Production Software, Infrastructure, and Services

First Release DEs

First Release
First iPTOL release this March. Some science, but largely the first release of many aspects of the new architecture, e.g. the first release of:
- The component model
- The authentication framework
- Provenance tracking
- Remote execution on TeraGrid and other large systems
- Integration of existing bioinformatics tools as components (sketched below)
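To make "integration of existing bioinformatics tools as components" concrete, here is a minimal sketch of wrapping a command-line aligner so it could be registered in a component model. The "description plus run()" shape, the component name, and the file paths are illustrative assumptions, not the actual iPlant component interface.

```python
# Minimal sketch of wrapping an existing command-line bioinformatics tool
# (muscle v3, a multiple sequence aligner) as a reusable component.
# The DESCRIPTION/run() shape is an assumed component contract for
# illustration only, not the real iPlant component model.
import subprocess

DESCRIPTION = {
    "name": "muscle-align",
    "inputs": {"sequences": "FASTA file of unaligned sequences"},
    "outputs": {"alignment": "FASTA file of aligned sequences"},
}

def run(sequences: str, alignment: str) -> str:
    """Invoke the existing tool; a framework would stage the input path in advance."""
    subprocess.run(["muscle", "-in", sequences, "-out", alignment], check=True)
    return alignment
```

The point is that an existing tool needs only a thin, declared wrapper to participate in workflows, provenance tracking, and remote execution.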

Architecture Evolution
The architecture has evolved rapidly in direct response to working group discussions:
- Move away from static workflows for G2P; focus instead on a dynamic, visual programming model (with implications for the release). Support a more "exploratory" mode.
- Accelerate release of the API, as this will be a bigger key than even I anticipated (see the sketch below). Let me come back to this after describing a little more.
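As a concrete illustration of why an early API matters, here is a hypothetical sketch of a client discovering toolkit components and submitting a job over a REST-style interface. The base URL, endpoint paths, and JSON fields are invented for illustration and are not the actual iPlant API.

```python
# Hypothetical REST-style client: list registered components, then submit a job.
# The base URL, endpoints, and field names are illustrative assumptions only.
import requests

BASE = "https://api.example.org/iplant"        # placeholder, not a real endpoint
HEADERS = {"Authorization": "Bearer <token>"}  # token from the authentication framework

# Discover which components the toolkit currently exposes.
components = requests.get(f"{BASE}/components", headers=HEADERS).json()
for c in components:
    print(c["id"], c["name"])

# Submit a run of one component against a hosted dataset; execution happens remotely.
job = requests.post(
    f"{BASE}/jobs",
    headers=HEADERS,
    json={"component": "glm-scan",
          "inputs": {"dataset": "/iplant/home/someuser/traits.csv"}},
).json()
print("submitted job:", job["id"])
```

An API like this is what lets third parties script "exploratory" analyses instead of being limited to pre-built static workflows.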

Technology Eval Activities
This is where we are "experimenting" with things that might work and are relevant. A couple of examples:
- Exploring alternate implementations of QTL mapping algorithms
- Experimental reproducibility (a small sketch follows below)
Also some are more hardware oriented:
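For the reproducibility item, here is one minimal sketch of what "experimental reproducibility" can mean in practice: record the inputs, parameters, random seed, and software environment next to every run so it can be repeated exactly. The file names and manifest fields are illustrative assumptions, not a committed iPlant format.

```python
# Write a per-run manifest (input checksum, parameters, seed, environment)
# so an analysis can be re-executed and verified later.
# The file names and manifest fields are illustrative assumptions.
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone

def file_checksum(path):
    """SHA-256 of an input file, so a rerun can confirm it used the same data."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(input_path, params, seed, out_path="run_manifest.json"):
    manifest = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "python": sys.version,
        "platform": platform.platform(),
        "input": {"path": input_path, "sha256": file_checksum(input_path)},
        "parameters": params,
        "random_seed": seed,
    }
    with open(out_path, "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest

# Example: a QTL scan run with its settings captured alongside the output.
write_manifest("traits.csv", {"model": "glm", "permutations": 1000}, seed=20100315)
```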

Experimental Systems
We are experimenting with some newer technologies to plug gaps in the existing lineup for demonstrated needs (also leveraging some other funding):
- New model for shared memory (ScaleMP cluster to be deployed this quarter). Will support 100s of GB of RAM for *existing* codes.
- "Cloud storage" models to reduce archive cost and increase capacity (HDFS system on a commodity cluster to be deployed this quarter). Will also support Hadoop data processing (a minimal sketch follows below).
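To show the kind of Hadoop data processing the HDFS cluster could support, here is a minimal Hadoop Streaming sketch in which the mapper and reducer are plain Python scripts; the task (tallying FASTQ read lengths) is an illustrative example, not a committed iPlant workload.

```python
#!/usr/bin/env python
# mapper.py -- Hadoop Streaming mapper: emit "read_length<TAB>1" per FASTQ record.
# FASTQ input and the read-length tally are illustrative assumptions.
import sys

for i, line in enumerate(sys.stdin):
    if i % 4 == 1:  # the second line of each 4-line FASTQ record is the sequence
        print(f"{len(line.strip())}\t1")
```

```python
#!/usr/bin/env python
# reducer.py -- Hadoop Streaming reducer: sum the counts for each read length.
import sys

current_key, count = None, 0
for line in sys.stdin:
    key, value = line.rstrip("\n").split("\t")
    if key != current_key:
        if current_key is not None:
            print(f"{current_key}\t{count}")
        current_key, count = key, 0
    count += int(value)
if current_key is not None:
    print(f"{current_key}\t{count}")
```

These would be launched with the Hadoop Streaming jar (roughly `hadoop jar hadoop-streaming.jar -input reads/ -output lengths/ -mapper mapper.py -reducer reducer.py`; exact flags depend on the deployed Hadoop version).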

Prototyping/Design
As some of you have seen, a number of our staff are engaged in prototyping as part of the design effort. Examples:
- Bernice and Greg prototyping potential workflows with VisTrails
- Liya building a prototype reference implementation with GLM (a minimal sketch of the underlying model fit follows below)
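For context on the GLM reference implementation, here is a minimal sketch of the kind of model fit involved: a single-marker trait association using statsmodels. The simulated data and the single-marker setup are illustrative; they are not Liya's actual prototype.

```python
# Minimal GLM fit: regress a quantitative trait on one genotype marker.
# Data are simulated here; a real run would read trait/marker matrices
# from iPlant-hosted datasets.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
genotype = rng.integers(0, 3, size=n)           # 0/1/2 minor-allele counts
trait = 0.5 * genotype + rng.normal(size=n)     # simulated additive effect

X = sm.add_constant(genotype.astype(float))     # intercept + marker column
result = sm.GLM(trait, X, family=sm.families.Gaussian()).fit()

print(result.params)     # intercept and marker-effect estimates
print(result.pvalues)    # association p-values
```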

Systems and Services
Put the infrastructure back in cyberinfrastructure. We aren't just releasing application tools:
- Provide access to large-scale systems, storage, and service hosting.
- Make resources available through the "web applications" as discussed yesterday.

Existing Systems
We have made resources available to iPlant users from a number of TeraGrid and local systems (and have applied to TG for a larger allocation):
- Ranger (TG, large-scale supercomputer)
- Stampede (TACC, high throughput)
- Eucalyptus/VM system (UA, "cloud")
- Longhorn (TG, remote visualization and GPU)

Storage Services
We have also begun offering storage to a number of projects connected to the grand challenges in some way, as well as to iPlant internal projects:
- iRODS interface
- Corral at TACC, and a local storage array at UA
(A minimal sketch of iRODS data movement follows below.)
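For readers unfamiliar with the iRODS interface, here is a minimal sketch of staging data through it using the standard icommands driven from Python; the zone and collection paths are illustrative assumptions, and `iinit` is assumed to have been run once to cache credentials.

```python
# Stage data in and out of an iRODS collection with the standard icommands.
# The collection path below is an illustrative assumption.
import subprocess

collection = "/iplant/home/someuser/projectA"   # hypothetical collection

# Upload a local result file into the collection.
subprocess.run(["iput", "results.tar.gz", collection], check=True)

# List what the collection now contains.
subprocess.run(["ils", collection], check=True)

# Later, pull the file back down to local scratch space.
subprocess.run(["iget", f"{collection}/results.tar.gz", "/scratch/restore/"], check=True)
```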

What the Architecture Looks Like Now
The Discovery Environments are still a place to explore datasets, as always. Users will come to the web site and will see:
- Toolkits from which they can select components and build their own workflows
- Opportunities to use "gold standard" workflows and tested workflows of others
Resources will come from remote computing, increasing scale. Some uses may be simple: select a dataset, run a massive GLM run, and send me the data when it's done (a hypothetical sketch follows below). More infrastructure than application.
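To illustrate the "select dataset, run a massive GLM, send me the data when it's done" pattern, here is a hypothetical sketch of a workflow expressed against a component-model style interface. Every class, method, and name below is invented for illustration; none of it is the actual Discovery Environment code.

```python
# Hypothetical sketch: chain toolkit components into a saved, shareable workflow,
# run it on a dataset, and notify the user when results are ready.
# All names here are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Component:
    """A reusable analysis step drawn from the toolkit."""
    name: str
    parameters: dict = field(default_factory=dict)

@dataclass
class Workflow:
    """An ordered chain of components that can be saved and shared."""
    name: str
    steps: list

    def run(self, dataset, notify=None):
        result = dataset
        for step in self.steps:
            print(f"running {step.name} on {result} with {step.parameters}")
            result = f"{result} -> {step.name}"   # stand-in for remote execution
        if notify:
            print(f"notifying {notify}: results ready at {result}")
        return result

# A "gold standard" workflow tested by someone else, reused on a new dataset.
glm_scan = Workflow(
    name="ruths-glm-scan",
    steps=[
        Component("filter-markers", {"min_maf": 0.05}),
        Component("glm-association", {"permutations": 1000}),
        Component("plot-manhattan"),
    ],
)
glm_scan.run("/iplant/home/someuser/traits.csv", notify="someuser@example.org")
```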

Near Term
- While finishing the iPTOL release, continue developing the component model and API.
- Starting now, based on the WG summary, begin building the components described in the NGS documents.
- Target a 2nd-quarter release of the basic visual programming model with a "toolbox" of NextGen Seq tools.
- Rapid release of tools thereafter, with occasional "infrastructure" updates.

Near Term
- Start releasing "gold standard" or "approved" workflows once the tools are out (e.g. save and share "Ruth's workflow").
- Begin training this summer in how others can integrate 3rd-party tools into the infrastructure.
- Keep releasing tools/components as specified by the working groups (will build the schedule based on the WG final docs).

Next Steps
WG progress looks fantastic:
- Priorities are getting defined; the iPlant people covering each group have a good idea of where to go next.
- Some groups are ahead of others, but all are heading that way.
Perhaps it is time to move to a new "mode": finalize the docs and declare the first version "done", then either move into "beta test" mode as things come back (maybe after a brief meeting hiatus), or start on "version 2", the next wishlist.

Discussion

Staffing
- 16 staff supporting working groups, prototyping, and enhancing existing codes
- 8 staff at UA and 2 at TACC in production software and integration
- 3 staff in systems support
- 3 in project management and coordinating roles
- 9 faculty in CI advisory/tech eval roles
- Roughly 10 students/postdocs supporting eval projects