
Slide 1: Extreme Scalability Working Group (XS-WG): Status Update
Nick Nystrom, Director, Strategic Applications
Pittsburgh Supercomputing Center
May 20, 2010

Slide 2: Extreme Scalability Working Group (XS-WG): Purpose
Meet the challenges and opportunities of deploying extreme-scale resources into the TeraGrid, maximizing both scientific output and user productivity.
– Aggregate, develop, and share wisdom
– Identify and address needs that are common to multiple sites and projects
– This may require assembling teams and obtaining support for sustained effort
XS-WG benefits from the active involvement of all Track 2 sites, Blue Waters, tool developers, and users.
XS-WG leverages and combines the RPs' interests to deliver greater value to the computational science community.

Slide 3: XS-WG Participants
Nick Nystrom, PSC (XS-WG lead)
Jay Alameda, NCSA
Martin Berzins, Univ. of Utah (U)
Paul Brown, IU
Lonnie Crosby, NICS (IO/Workflows lead)
Tim Dudek, GIG EOT
Victor Eijkhout, TACC
Jeff Gardner, U. Washington (U)
Chris Hempel, TACC
Ken Jansen, RPI (U)
Shantenu Jha, LONI
Nick Karonis, NIU (G)
Dan Katz, U. of Chicago
Ricky Kendall, ORNL
Byoung-Do Kim, TACC
Scott Lathrop, GIG (EOT AD)
Vickie Lynch, ORNL
Amit Majumdar, SDSC (TG AUS AD)
Mahin Mahmoodi, PSC (Tools lead)
Allen Malony, Univ. of Oregon (P)
David O'Neal, PSC
Dmitry Pekurovsky, SDSC
Wayne Pfeiffer, SDSC
Raghu Reddy, PSC (Scalability lead)
Sergiu Sanielevici, PSC
Sameer Shende, Univ. of Oregon (P)
Ray Sheppard, IU
Alan Snavely, SDSC
Henry Tufo, NCAR
George Turner, IU
John Urbanic, PSC
Joel Welling, PSC
Nick Wright, NERSC (P)
S. Levent Yilmaz, CSM, U. Pittsburgh (P)
Key: U = user; P = performance tool developer; G = grid infrastructure developer; * = joined XS-WG since last TG-ARCH update

Slide 4: Technical Challenge Area #1: Scalability and Architecture
Algorithms, numerical methods, multicore performance, etc.
– Robust, scalable infrastructure (libraries, frameworks, languages) for supporting applications that scale to O(10^4–10^6) cores
– Numerical stability and convergence issues that emerge at scale
– Exploiting systems' architectural strengths
– Fault tolerance and resilience
Contributors
– POC: Raghu Reddy (PSC)
Recent and ongoing activities: hybrid performance
– Raghu submitted a technical paper to TG10 with Annick Pouquet
– Synergy with AUS; work by Wayne Pfeiffer and Dmitry Pekurovsky
– Emphasis on documenting and disseminating guidance from Raghu's work on the HOMB benchmark and from Pfeiffer, Pekurovsky, and others (a minimal hybrid sketch follows below)
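The hybrid-performance work above concerns mixing MPI across nodes with OpenMP threads within a node, the pattern the HOMB benchmark exercises. The following C sketch is illustrative only, not HOMB itself: it runs one threaded Jacobi-style relaxation sweep per rank and reports the slowest rank's time. Halo exchange is noted but elided for brevity.

```c
/* Minimal hybrid MPI+OpenMP sketch (illustrative only; not the HOMB code).
 * Pattern: one MPI rank per node/socket, OpenMP threads within each rank. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1024  /* local rows per rank (halo rows omitted for brevity) */

int main(int argc, char **argv)
{
    int provided, rank, size;
    /* Request thread support adequate for OpenMP regions between MPI calls */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *u    = malloc((size_t)N * N * sizeof *u);
    double *unew = malloc((size_t)N * N * sizeof *unew);
    for (long i = 0; i < (long)N * N; i++) u[i] = (double)(i % 7);

    double t0 = MPI_Wtime();

    /* Threaded Jacobi-style sweep over this rank's local block */
    #pragma omp parallel for schedule(static)
    for (int i = 1; i < N - 1; i++)
        for (int j = 1; j < N - 1; j++)
            unew[i*N + j] = 0.25 * (u[(i-1)*N + j] + u[(i+1)*N + j] +
                                    u[i*N + j - 1] + u[i*N + j + 1]);

    /* In a real benchmark, halo exchange between neighboring ranks
     * (e.g., MPI_Sendrecv) would go here. */

    double t = MPI_Wtime() - t0, tmax;
    MPI_Reduce(&t, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("%d ranks x %d threads: sweep %.3f s\n",
               size, omp_get_max_threads(), tmax);

    free(u); free(unew);
    MPI_Finalize();
    return 0;
}
```

Built with, e.g., `mpicc -fopenmp` and run with OMP_NUM_THREADS set per rank, a sketch like this lets one vary the rank/thread split at fixed total core count, which is the basic experiment behind hybrid-performance guidance.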

Slide 5: Technical Challenge Area #2: Tools
Performance tools, debuggers, compilers, etc.
– Evaluate strengths and interactions; ensure adequate installations
– Analyze/address gaps in programming environment infrastructure
– Provide advanced guidance to RP consultants
Contributors
– POC: Mahin Mahmoodi (PSC)
Recent and ongoing activities: reliable tool installations
– Nick and Mahin visited NICS in December to give a seminar on performance engineering and tool use
– Mahin and NICS staff developed efficient, sustainable procedures and policies for keeping tool installations up to date and functional
– Ongoing application of performance tools at scale to complex applications, ensuring that the tools function correctly and identifying and removing problems
– Nick, Sameer, Rui Liu, and Dave Cronk co-presented a performance engineering tutorial at LCI10 (March 8, 2010, Pittsburgh)

Slide 6: Collaborative Performance Engineering Tutorials
SC09: Productive Performance Engineering of Petascale Applications with POINT and VI-HPS (November 16, 2009)
– Allen Malony and Sameer Shende (Univ. of Oregon), Rick Kufrin (NCSA), Brian Wylie and Felix Wolf (JSC), Andreas Knuepfer and Wolfgang Nagel (TU Dresden), Shirley Moore (UTK), Nick Nystrom (PSC)
– Addressed performance engineering of petascale scientific applications with TAU, PerfSuite, Scalasca, and Vampir
– Included hands-on exercises using a Live-DVD containing all of the tools, helping prepare participants to apply modern methods for locating and diagnosing typical performance bottlenecks in real-world parallel programs at scale
LCI10: Using POINT Performance Tools: TAU, PerfSuite, PAPI, Scalasca, and Vampir (March 8, 2010)
– Sameer Shende (Univ. of Oregon), David Cronk (Univ. of Tennessee, Knoxville), Nick Nystrom (PSC), and Rui Liu (NCSA)
– Targeted multicore performance issues
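Several of the tools taught in these tutorials (TAU, PerfSuite, Scalasca) read hardware counters through PAPI. As an illustrative sketch only, not tutorial material, the following C fragment shows the basic low-level PAPI pattern those tools build on: counting cycles and instructions around a kernel. Event availability is platform-dependent; the two presets used here are widely supported.

```c
/* Minimal PAPI hardware-counter sketch (illustrative; tools such as TAU or
 * PerfSuite wrap this interface). Most error checks omitted for brevity. */
#include <papi.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int eventset = PAPI_NULL;
    long long counts[2];

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) {
        fprintf(stderr, "PAPI init failed\n");
        return 1;
    }
    PAPI_create_eventset(&eventset);
    PAPI_add_event(eventset, PAPI_TOT_CYC);   /* total cycles */
    PAPI_add_event(eventset, PAPI_TOT_INS);   /* total instructions */

    /* Kernel to be measured: a simple vector update */
    enum { N = 1 << 20 };
    double *a = malloc(N * sizeof *a);
    for (int i = 0; i < N; i++) a[i] = i;

    PAPI_start(eventset);
    for (int i = 0; i < N; i++) a[i] = 2.5 * a[i] + 1.0;
    PAPI_stop(eventset, counts);   /* values returned in order added */

    printf("cycles: %lld  instructions: %lld  IPC: %.2f\n",
           counts[0], counts[1], (double)counts[1] / counts[0]);
    free(a);
    return 0;
}
```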

Slide 7: Technical Challenge Area #3: Workflow, Data Transport, Analysis, Visualization, and Storage
Coordinating massive simulations, analysis, and visualization
– Data movement between RPs involved in complex simulation workflows; staging data from HSM systems across the TeraGrid
– Technologies and techniques for in situ visualization and analysis
Contributors
– POC: Lonnie Crosby (NICS)
Current activities
– Extreme Scale I/O and Data Analysis Workshop

Slide 8: Extreme Scale I/O and Data Analysis Workshop
March 22–24, 2010, Austin
– http://www.tacc.utexas.edu/petascale-workshop/
Sponsored by the Blue Waters Project, TeraGrid, and TACC
Builds on preceding Petascale Application Workshops
– December 2007, Tempe, and June 2008, Las Vegas: petascale applications
– March 2009, Albuquerque: fault tolerance and resilience; included significant participation from NNSA, DOE, and DoD
48 participants from 30 institutions
2 days of presentations plus lively discussion
– Application requirements; filesystems; I/O libraries and middleware; large-scale data management

Slide 9: Extreme Scale I/O and Data Analysis Workshop: Some Observations & Findings
Users are doing parallel I/O using a variety of means
– Rolling their own, HDF, netCDF, MPI-IO, ADIOS, ...: no one size fits all (see the MPI-IO sketch below)
Data volumes can exceed the capability of analysis resources
– E.g., ~0.5–1.0 TB per wall-clock day for certain climate simulations
The greatest complaint was large variability in I/O performance
– 2–10x slowdown cited as common; 300x observed
– The causes are well understood; how to avoid them is not.
Potential research direction: extending schedulers to combine file information supplied by submitted jobs with detailed knowledge of parallel-filesystem characteristics might enable I/O quality of service and effective workload optimization.
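Of the approaches listed above, MPI-IO is the common substrate beneath several others (HDF5 and parallel netCDF layer over it). As an illustrative sketch only, the following C fragment shows a collective MPI-IO write in which each rank writes its contiguous block of one shared file; the file name and block size are arbitrary choices for the example. Collective writes of this kind are where the MPI-IO layer aggregates requests, and hence where much of the tuning against filesystem variability happens.

```c
/* Minimal collective MPI-IO sketch (illustrative only): every rank writes
 * its own contiguous block of one shared file. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int count = 1 << 20;                 /* doubles per rank */
    double *buf = malloc(count * sizeof *buf);
    for (int i = 0; i < count; i++) buf[i] = rank + 0.001 * i;

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank's block starts rank * count doubles into the file */
    MPI_Offset offset = (MPI_Offset)rank * count * sizeof(double);

    /* Collective write: allows the MPI-IO layer to aggregate requests */
    MPI_File_write_at_all(fh, offset, buf, count, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    if (rank == 0)
        printf("wrote %d doubles per rank from %d ranks\n", count, size);
    free(buf);
    MPI_Finalize();
    return 0;
}
```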

Slide 10: Questions?

