
1 Extreme Scalability Working Group (XS-WG): Status Update
Nick Nystrom, Director, Strategic Applications, Pittsburgh Supercomputing Center
October 22, 2009

2 Extreme Scalability Working Group (XS-WG): Purpose
Meet the challenges and opportunities of deploying extreme-scale resources into the TeraGrid, maximizing both scientific output and user productivity.
– Aggregate, develop, and share wisdom
– Identify and address needs that are common to multiple sites and projects
– This may require assembling teams and obtaining support for sustained effort
XS-WG benefits from the active involvement of all Track 2 sites, Blue Waters, tool developers, and users.
XS-WG leverages and combines RPs' interests to deliver greater value to the computational science community.

3 XS-WG Participants
Nick Nystrom, PSC (XS-WG lead); Jay Alameda, NCSA; Martin Berzins, Univ. of Utah (U); Paul Brown, IU; Shawn Brown, PSC; Lonnie Crosby, NICS (IO/Workflows lead); Tim Dudek, GIG EOT; Victor Eijkhout, TACC; Jeff Gardner, U. Washington (U); Chris Hempel, TACC; Ken Jansen, RPI (U); Shantenu Jha, LONI; Nick Karonis, NIU (G); Dan Katz, LONI; Ricky Kendall, ORNL; Byoung-Do Kim, TACC; Scott Lathrop, GIG (EOT AD); Vickie Lynch, ORNL; Amit Majumdar, SDSC (TG AUS AD); Mahin Mahmoodi, PSC (Tools lead); Allen Malony, Univ. of Oregon (P); David O'Neal, PSC; Dmitry Pekurovsky, SDSC; Wayne Pfeiffer, SDSC; Raghu Reddy, PSC (Scalability lead); Sergiu Sanielevici, PSC; Sameer Shende, Univ. of Oregon (P); Ray Sheppard, IU; Alan Snavely, SDSC; Henry Tufo, NCAR; George Turner, IU; John Urbanic, PSC; Joel Welling, PSC; Nick Wright, SDSC (P); S. Levent Yilmaz*, CSM, U. Pittsburgh (P)
Key: U = user; P = performance tool developer; G = grid infrastructure developer; * = joined XS-WG since the last TG-ARCH update

4 Technical Challenge Area #1: Scalability and Architecture
Algorithms, numerics, multicore, etc.
– Robust, scalable infrastructure (libraries, frameworks, languages) for supporting applications that scale to O(10^4)–O(10^6) cores
– Numerical stability and convergence issues that emerge at scale
– Exploiting systems' architectural strengths
– Fault tolerance and resilience
Contributors
– POC: Raghu Reddy (PSC)
– Members: Reddy, Majumdar, Urbanic, Kim, Lynch, Jha, Nystrom
Current activities
– Understanding performance tradeoffs in hierarchical architectures, e.g., partitioning between MPI and OpenMP for different node architectures, interconnects, and software stacks; candidate codes for benchmarking: HOMB, WRF, perhaps others
– Characterizing bandwidth-intensive communication performance (see the sketch after this slide)
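To make "bandwidth-intensive communication" concrete, the sketch below is a minimal MPI ping-pong measurement in C. It is an illustration only, not an XS-WG benchmark; the message size and iteration count are arbitrary choices, and it must be run with at least two ranks.

/* Minimal MPI ping-pong bandwidth sketch (illustrative only).
 * Ranks 0 and 1 exchange a large message repeatedly and report
 * the achieved bandwidth. Run with at least 2 MPI ranks. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int iters  = 100;
    const int nbytes = 8 * 1024 * 1024;   /* 8 MB: "large message" regime */
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char *buf = malloc(nbytes);

    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("bandwidth ~ %.1f MB/s\n",
               2.0 * iters * nbytes / (t1 - t0) / 1.0e6);

    free(buf);
    MPI_Finalize();
    return 0;
}

Sweeping the message size in such a loop is the usual way to map out an interconnect's bandwidth curve across the regimes an application actually exercises.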

5 Investigating the Effectiveness of Hybrid Programming (MPI+OpenMP)
Begun in XS-WG; extended through the AUS effort in collaboration with Amit Majumdar
Examples of applications with hybrid implementations: WRF, POP, ENZO
To exploit more memory per task, threading offers clear benefits. But what about performance?
– Prior results are mixed; pure MPI often seems at least as good.
– Historically, systems had fewer cores per socket and fewer cores per node than we have today, and far fewer than they will have in the future.
– Have OpenMP versions been as carefully optimized?
Reasons to look into hybrid implementations now
– Current Track 2 systems have 8–16 cores per node.
– Are we at the tipping point where threading offers a win? If not, is there one, and at what core count, and for which kinds of algorithms?
– What is the potential for performance improvement?
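For context, the MPI-vs.-OpenMP balance discussed above is fixed at launch: the same hybrid binary can be run as, say, 12 ranks x 1 thread or 2 ranks x 6 threads on a 12-core node. A minimal hybrid C sketch that simply reports the decomposition it was given (an illustration, not one of the applications named above):

/* Report the hybrid decomposition: MPI ranks x OpenMP threads.
 * On a 12-core node this binary might be launched as 12 ranks x 1
 * thread, 2 ranks x 6 threads, or 1 rank x 12 threads. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nranks, nthreads = 0;

    /* FUNNELED: only the master thread of each rank makes MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    #pragma omp parallel
    {
        #pragma omp master
        nthreads = omp_get_num_threads();
    }

    if (rank == 0)
        printf("%d MPI ranks x %d OpenMP threads = %d cores in use\n",
               nranks, nthreads, nranks * nthreads);

    MPI_Finalize();
    return 0;
}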

6 Hybrid OpenMP-MPI Benchmark (HOMB)
Developed by Jordan Soyke while a student intern at PSC; subsequently enhanced by Raghu Reddy
Simple benchmark code
Permits systematic evaluation by
– Varying the computation-to-communication ratio
– Varying message sizes
– Varying the MPI vs. OpenMP balance
Allows characterization of performance bounds
– Characterizing the potential hybrid performance of an actual application is possible with adequate understanding of its algorithms and their implementations.

7 Characteristics of the Benchmark
Perfectly parallel with both MPI and OpenMP
Perfectly load balanced
Distinct computation and communication sections
Only nearest-neighbor communication
– Currently no reduction operations
– No overlap of computation and communication
Can easily vary the computation/communication ratio
Current tests are with large messages
(A sketch of this loop structure follows.)
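The sketch below shows the kind of loop structure these characteristics describe, assuming a 1-D slab decomposition and a Jacobi-style sweep; it illustrates the pattern, not HOMB's actual source.

/* Sketch of one hybrid iteration: an OpenMP-threaded stencil sweep
 * (computation section) followed by a nearest-neighbor MPI halo
 * exchange (communication section). 1-D slab decomposition. */
#include <mpi.h>

#define NX 1024    /* local interior rows (plus 2 ghost rows)        */
#define NY 4096    /* row length; also controls the message size     */

static double u[NX + 2][NY], unew[NX + 2][NY];

int main(int argc, char **argv)
{
    int provided, rank, nranks;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    int up = (rank > 0)          ? rank - 1 : MPI_PROC_NULL;
    int dn = (rank < nranks - 1) ? rank + 1 : MPI_PROC_NULL;

    for (int iter = 0; iter < 100; iter++) {
        /* Computation section: threads split the local slab. */
        #pragma omp parallel for
        for (int i = 1; i <= NX; i++)
            for (int j = 1; j < NY - 1; j++)
                unew[i][j] = 0.25 * (u[i-1][j] + u[i+1][j] +
                                     u[i][j-1] + u[i][j+1]);

        /* Communication section: exchange ghost rows with neighbors. */
        MPI_Sendrecv(unew[1],      NY, MPI_DOUBLE, up, 0,
                     unew[NX + 1], NY, MPI_DOUBLE, dn, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(unew[NX],     NY, MPI_DOUBLE, dn, 1,
                     unew[0],      NY, MPI_DOUBLE, up, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* Copy back (a pointer swap would be cheaper; kept simple here). */
        #pragma omp parallel for
        for (int i = 0; i < NX + 2; i++)
            for (int j = 0; j < NY; j++)
                u[i][j] = unew[i][j];
    }

    MPI_Finalize();
    return 0;
}

Adjusting NY (message size), the number of interior sweeps per exchange (computation/communication ratio), and the rank/thread split at launch gives the knobs described on the previous slide.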

8 Preliminary Results on Kraken: MPI vs. MPI+OpenMP, 12 Threads per Node
Hybrid could also be beneficial for other reasons:
– The application has limited scalability because of its decomposition
– The application needs more memory
– The application has dynamic load imbalance
The hybrid approach provides an increasing performance advantage as the communication fraction increases
– … for the current core count per node.
– Non-threaded sections of an actual application would have an Amdahl's Law effect; these results constitute a best-case limit.
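For reference, the Amdahl's Law effect mentioned above can be written as a bound on the per-rank speedup from threading; here f denotes the threadable fraction of a rank's work and t the number of threads per rank (symbols chosen for illustration, not taken from the slides):

% Upper bound on the speedup of one MPI rank's work when a fraction f
% of it is threaded across t OpenMP threads (Amdahl's Law):
S(t) = \frac{1}{(1 - f) + \dfrac{f}{t}}
% e.g., f = 0.95 and t = 12 gives S(12) = 1 / (0.05 + 0.95/12), about 7.7,
% well short of the ideal 12x; a perfectly threaded benchmark therefore
% represents a best-case limit for real applications.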

9 Technical Challenge Area #2: Tools
Performance tools, debuggers, compilers, etc.
– Evaluate strengths and interactions; ensure adequate installations
– Analyze and address gaps in programming environment infrastructure
– Provide advanced guidance to RP consultants
Contributors
– POC: Mahin Mahmoodi (PSC)
– Members: Mahmoodi, Wright, Alameda, Shende, Sheppard, Brown, Nystrom
Current activities
– Focus on testing debuggers and performance tools at large core counts
– Ongoing, excellent collaboration between SDCI tool projects, plus consideration of complementary tools
– Submission of a joint POINT/IPM tools tutorial for TG09
– Installing and evaluating strengths of tools as they apply to complex production applications

10 Collaborative Performance Engineering Tutorials
TG09: Using Tools to Understand Performance Issues on TeraGrid Machines: IPM and the POINT Project (June 22, 2009)
– Karl Fuerlinger (UC Berkeley), David Skinner (NERSC/LBNL), Nick Wright (then SDSC), Rui Liu (NCSA), Allen Malony (Univ. of Oregon), Haihang You (UTK), Nick Nystrom (PSC)
– Analysis and optimization of applications on the TeraGrid, focusing on Ranger and Kraken
SC09: Productive Performance Engineering of Petascale Applications with POINT and VI-HPS (Nov. 16, 2009)
– Allen Malony and Sameer Shende (Univ. of Oregon), Rick Kufrin (NCSA), Brian Wylie and Felix Wolf (JSC), Andreas Knuepfer and Wolfgang Nagel (TU Dresden), Shirley Moore (UTK), Nick Nystrom (PSC)
– Addresses performance engineering of petascale scientific applications with TAU, PerfSuite, Scalasca, and Vampir
– Includes hands-on exercises using a Live-DVD containing all of the tools, helping to prepare participants to apply modern methods for locating and diagnosing typical performance bottlenecks in real-world parallel programs at scale

11 Technical Challenge Area #3: Workflow, Data Transport, Analysis, Visualization, and Storage
Coordinating massive simulations, analysis, and visualization
– Data movement between RPs involved in complex simulation workflows; staging data from HSM systems across the TeraGrid
– Technologies and techniques for in situ visualization and analysis
Contributors
– POC: Lonnie Crosby (NICS)
– Members: Crosby, Welling, Nystrom
Current activities
– Focus on I/O profiling and determining platform-specific recommendations for obtaining good performance for common parallel I/O scenarios (see the sketch after this slide)
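As one example of the common parallel I/O scenarios mentioned above, the sketch below shows a collective MPI-IO write in C in which each rank contributes one contiguous block to a shared file. The file name and sizes are arbitrary, and this illustrates the pattern rather than an XS-WG recommendation.

/* Each rank writes its contiguous block of doubles to a shared file
 * with a collective MPI-IO call. Illustrative sketch only. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int nlocal = 1 << 20;              /* doubles per rank */
    int rank;
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *buf = malloc(nlocal * sizeof(double));
    for (int i = 0; i < nlocal; i++)
        buf[i] = rank;                       /* dummy data */

    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &fh);

    /* Collective write: the MPI-IO layer may aggregate the ranks'
     * blocks into larger, better-aligned file system requests. */
    MPI_Offset offset = (MPI_Offset)rank * nlocal * sizeof(double);
    MPI_File_write_at_all(fh, offset, buf, nlocal, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}

Collective calls such as MPI_File_write_at_all are typically where platform-specific tuning (e.g., file system striping and MPI-IO aggregation settings) enters, which is what makes per-platform recommendations worthwhile.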

12 Co-organized a Workshop on Enabling Data-Intensive Computing: From Systems to Applications
July 30-31, 2009, University of Pittsburgh
http://www.cs.pitt.edu/~mhh/workshop09/index.html
Two days of presentations and breakout discussions on
– architectures
– software frameworks and middleware
– algorithms and applications
Speakers
– John Abowd, Cornell University
– David Andersen, Carnegie Mellon University
– Magda Balazinska, University of Washington
– Roger Barga, Microsoft Research
– Scott Brandt, University of California, Santa Cruz
– Mootaz Elnozahy, IBM
– Ian Foster, Argonne National Laboratory
– Geoffrey Fox, Indiana University
– Dave O'Hallaron, Intel Research
– Michael Wood-Vasey, University of Pittsburgh
– Mazin Yousif, The University of Arizona
– Taieb Znati, National Science Foundation
From R. Kouzes et al., "The Changing Paradigm of Data-Intensive Computing," IEEE Computer, January 2009

13 Next TeraGrid/Blue Waters Extreme-Scale Computing Workshop
Will focus on parallel I/O for petascale applications, addressing:
– multiple levels of applications, middleware (HDF, MPI-IO, etc.), and systems
– requirements for data transfers to/from archives and remote processing and management facilities
Tentatively scheduled for the week of March 22, 2010, in Austin
Builds on the preceding Petascale Application Workshops
– December 2007, Tempe: general issues of petascale applications
– June 2008, Las Vegas: further general issues of petascale applications
– March 2009, Albuquerque: fault tolerance and resilience; included significant participation from NNSA, DOE, and DoD

