Slide 1: SciDAC-2 Petascale Data Storage Institute
Presented by Philip C. Roth, Future Technologies Group, Computer Science and Mathematics, Oak Ridge National Laboratory

Slide 2: The petascale storage problem at ORNL
- Petascale computing makes petascale demands on storage: performance, capacity, concurrency, reliability, availability, and manageability
- Parallel file systems are barely keeping pace at terascale; the challenges will be much greater at petascale
[Images: Cray XT3 and Cray X1E systems]

Slide 3: Petascale Data Storage Institute
- The PDSI is an institute in the Department of Energy (DOE) Office of Science's Scientific Discovery through Advanced Computing (SciDAC-2) program
- Using diverse expertise with applications and with file and storage systems, members will collaborate on requirements, standards, algorithms, and analysis tools
Led by Dr. Garth Gibson, Carnegie Mellon University

Slide 4: Participating Institutions
- Carnegie Mellon University
- Lawrence Berkeley National Laboratory / NERSC
- Los Alamos National Laboratory
- Pacific Northwest National Laboratory
- Sandia National Laboratories
- Oak Ridge National Laboratory
- University of California at Santa Cruz
- University of Michigan at Ann Arbor

Slide 5: Petascale Data Storage Institute agenda
Main thrusts and their projects:
- Collection: performance data collection; failure data collection
- Dissemination: community building; standards and APIs
- Innovation: IT automation; novel storage mechanisms

Slide 6: Collection: Performance analysis
- Performance data collection and analysis
- Workload characterization (see the sketch below)
- Benchmark collection and publication
Led by William Kramer, National Energy Research Scientific Computing Center (NERSC)
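
As a rough illustration of the workload characterization listed above, the following is a minimal sketch (not a PDSI tool) that summarizes an I/O trace by operation type. The trace file name (io_trace.csv) and its columns (op, bytes) are assumptions made for illustration only.

    # Minimal workload-characterization sketch (illustrative only, not a PDSI tool).
    # Assumes a hypothetical CSV I/O trace with columns: op (read/write) and bytes.
    import csv
    from collections import Counter, defaultdict

    counts = Counter()              # requests per operation type
    bytes_moved = defaultdict(int)  # total bytes per operation type

    with open("io_trace.csv", newline="") as f:  # hypothetical trace file
        for row in csv.DictReader(f):
            op = row["op"]
            counts[op] += 1
            bytes_moved[op] += int(row["bytes"])

    for op in sorted(counts):
        avg = bytes_moved[op] / counts[op]
        print(f"{op}: {counts[op]} requests, {bytes_moved[op]} bytes, "
              f"{avg:.0f} bytes/request on average")

A production trace at petascale would of course call for streaming or parallel aggregation rather than a single pass over one CSV file; the point here is only the kind of per-operation summary such characterization produces.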

Slide 7: Collection: Failure analysis
- Capture and analyze failure, error, and usage data from high-end computing systems (see the sketch below)
- Initial example: Los Alamos failure data available for 22 systems over 9 years, with extensive analysis by Bianca Schroeder, Carnegie Mellon University
[Chart: failures per month versus months in production use, broken down by cause: hardware, software, network, environment, human, unknown]
Led by Gary Grider, Los Alamos National Laboratory
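
The failure-rate breakdown shown in the chart above (failures per month by root cause) could be computed along the following lines. This is a minimal sketch; the log file name (failure_log.csv) and its columns (month_in_production, cause) are assumptions, not the actual format of the Los Alamos data.

    # Minimal failure-analysis sketch (illustrative only).
    # Assumes a hypothetical CSV failure log with columns: month_in_production and cause
    # (e.g., hardware, software, network, environment, human, unknown).
    import csv
    from collections import defaultdict

    failures = defaultdict(lambda: defaultdict(int))  # month -> cause -> count

    with open("failure_log.csv", newline="") as f:  # hypothetical log file
        for row in csv.DictReader(f):
            month = int(row["month_in_production"])
            failures[month][row["cause"]] += 1

    for month in sorted(failures):
        breakdown = ", ".join(f"{cause}: {n}"
                              for cause, n in sorted(failures[month].items()))
        print(f"month {month}: {breakdown}")

Grouping by months in production, rather than by calendar date, is one way to see how a system's failure rate evolves over its lifetime.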

Slide 8: Dissemination: Outreach
Goal: to disseminate information about techniques, mechanisms, best practices, and available tools
- Our approach:
  - Workshops (SC06 PDSI workshop, November 17!)
  - Tutorials and course materials
  - Online, open repository with documents, tools, and performance and failure data
- Target audience:
  - Computational scientists
  - Academia (professors and students)
  - Industry (storage researchers and developers)
Led by Dr. Garth Gibson, Carnegie Mellon University

Slide 9: Dissemination: Standards and APIs
Goals: to facilitate standards development and deployment, and to validate and demonstrate new extensions and protocols
Some work underway:
- POSIX extensions (e.g., support for weak data and metadata consistency)
- Parallel Network File System (pNFS)
  - In the IETF NFSv4.1 standard draft
  - University of Michigan Center for Information Technology Integration is producing a reference implementation
Led by Gary Grider, Los Alamos National Laboratory

Slide 10: Innovation
Two project areas:
- IT automation applied to high-end computing systems and problems. Led by Dr. Garth Gibson, Carnegie Mellon University
- Novel mechanisms for core high-end computing storage problems. Led by Darrell Long, University of California at Santa Cruz
Topics include:
- Storage system instrumentation for machine learning
- Data layout and access planning
- Automated diagnosis, tuning, and failure recovery
- WAN/global storage access
- High-performance collective operations
- Rich metadata at scale
- Integration with system virtualization technology

Slide 11: Summary
- The Petascale Data Storage Institute brings together individuals with expertise in file and storage systems, applications, and performance analysis
- PDSI will be a focal point for computational scientists, academia, and industry for storage-related information and tools, both within and outside SciDAC

Slide 12: ORNL contact
Philip C. Roth
Future Technologies Group
Computer Science and Mathematics
(865)