Oct 15, 2006. PETASCALE DATA STORAGE INSTITUTE: The Drive to Petascale Computing. Faster computers need more data, faster.


1 Oct 15, 2006 http://www.pdsi-scidac.org/ PETASCALE DATA STORAGE INSTITUTE
The Drive to Petascale Computing
Faster computers need more data, faster.
2001: 10 TF -- 2005: 100 TF -- 2008: 1 PF -- 2011: 10 PF -- 2015: 100 PF
PDSI Thrusts:
- Data Capture
- Education & Dissemination
- Innovation
Everything Must Scale with Compute:
- Checkpoint at Terabytes/sec
- Petabyte files
- Billions of files
- Revisit programming for Input/Output
- Data center automation
- Acceleration for search
[Chart: Application Performance by Year ('00, '04, '08, 2012). Labels shown: Computing Speed, Parallel I/O, Network Speed, Memory, Disk, Archival Storage, Metadata Inserts/sec. Units shown: TFLOP/s, GigaBytes/sec, Gigabits/sec, TeraBytes, PetaBytes. Tick values are not recoverable from the transcript.]

2 Steeped in Terascale Experience
- Phoenix & XFS
- Lightning & PanFS
- Q & PFS
- MPP2 & Lustre
- Red Storm & Lustre
- Roadrunner & PanFS
- Jaguar & Lustre
- Blue Mountain & XFS
- Seaborg & GPFS

3 Strategic Plan
Scale: from Tera-Bytes, Giga-B/sec, Mega-files, Kilo-CPUs to Peta-Bytes, Tera-B/sec, Giga-files, Mega-CPUs
Thrusts: Data Capture, Education & Dissemination, Innovation
- Failure Data: Capture & publish; Computer Failure Data Repository (e.g. LANL's outages by root cause)
- App Workloads: INCITE resources; Trace & replay tools (e.g. BLAST, CCSM, Calore, EVH1, MCNP, GYRO, Sierra, QCD and other Scidacs)
- Education: Workshops; Tutorials; Course materials
- Outreach: Storage-research-list; Collaboration w/ other Scidacs
- API Standards: POSIX API; Rich metadata; Compute-in-disk; Archive API; Quality of Storage
- HPC NFS: Parallel NFS; Secure NFS; IETF Standard
- IT Automation: Instrumentation; Visualization; Machine Learning; Diagnosis; Adaptation
- Scaling Further: Global/WAN access; Federated security; Metadata at scale; Para-virtualization
[Diagram, pNFS: HPC Apps; pNFS; Driver: 1. SBC (blocks), 2. OSD (objects), 3. NFS (files); pNFS server; Storage Manager; Layout grant & revoke; NFSv4 extended w/ layouts]
[Diagram, automation: Storage bricks; pNFS MDS; Mechanical tier; I/O replies & requests; Automation Agents; supervisor; Monitoring info; Configuration settings; Managerial tier; Goal specifications & complaints; Statistics & predictions; Administrator]

4 Participating Organizations
- Carnegie Mellon University: Garth Gibson (PI)
- University of California, Santa Cruz: Darrell Long (co-PI)
- University of Michigan, Ann Arbor: Peter Honeyman (co-PI)
- Los Alamos National Laboratory: Gary Grider (co-PI)
- Lawrence Berkeley National Laboratory: Bill Kramer (co-PI)
- Oak Ridge National Laboratory: Philip Roth (co-PI)
- Pacific Northwest National Laboratory: Evan Felix (co-PI)
- Sandia National Laboratory: Lee Ward (co-PI)

5 Programming for Storage
The Need for Training Programmers for Storage
- HPC IT managers work for users who program apps
- Performance of apps/workflows often depends on storage
- Many times the best solution would be to change the program
- Reality is that app specialists are intolerant of requests to reprogram for better storage performance
- That is, reprogramming for storage performance often doesn't get done
Approach: Create tools and training to help a priori
- Give programmers libraries and performance debugging tools that avoid or detect poor storage patterns
- Give tutorials, case studies, and help pages showing weak programming approaches and how to improve them

6 Example from BioInformatics
Pseudo code example from IT manager -- single thread:

    for (I = 0; I < 1000; I++) {
      for (J = 0; J < 1000; J++) {
        buf = compute(I, J);
        f = open("file_foo");           /* reopened on every iteration */
        lseek(f, offset(I, J));
        write(f, buf, lengthof(buf));   /* one small write per record */
        close(f);
      }
    }

- buf turns out to be small, unaligned, fixed length
- Obvious fixes:
  - Open/close outside both loops
  - Malloc 1000000*lengthof(buf), copy into it in memory, one write at end
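The two fixes named on this slide can be sketched in C roughly as follows. This is a minimal illustration, not the deck's own code. It assumes that offset(I, J) lays records out contiguously in row-major order, which the slide does not state, and that each record is 24 bytes. The names REC_LEN, compute and write_all_records are hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

enum { N = 1000, REC_LEN = 24 };  /* REC_LEN is an assumed record size */

/* Hypothetical stand-in for the slide's compute(I, J): fills one record. */
static void compute(int i, int j, char *rec) {
    memset(rec, 'a' + (i + j) % 26, REC_LEN);
}

/* Assumption: offset(I, J) == (I * N + J) * REC_LEN, i.e. records are
   contiguous and row-major, so they can be staged in one buffer. */
static int write_all_records(const char *path) {
    size_t total = (size_t)N * N * REC_LEN;
    char *big = malloc(total);
    if (big == NULL) return -1;

    /* Compute every record directly into its slot in memory. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            compute(i, j, big + ((size_t)i * N + j) * REC_LEN);

    /* Open once, outside both loops. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { free(big); return -1; }

    /* write() may be partial, so loop until the whole buffer is out. */
    size_t done = 0;
    int rc = 0;
    while (done < total) {
        ssize_t n = write(fd, big + done, total - done);
        if (n <= 0) { rc = -1; break; }
        done += (size_t)n;
    }
    if (close(fd) != 0) rc = -1;
    free(big);
    return rc;
}
```

Calling write_all_records("file_foo") replaces one million open/lseek/write/close sequences with one open, a few large sequential writes, and one close. If the real offset(I, J) is not contiguous, the open/close hoisting still applies, but the single-buffer write does not.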

