Co-Design Breakout
A. Maccabe & M. Sato
Park Vista Hotel, Gatlinburg, Tennessee, September 5-6, 2014
Participants
- Dr. Mitsuhisa Sato, RIKEN
- Barney Maccabe, ORNL
- Robert Harrison, BNL
- Galen Shipman, ORNL
- Scott Klasky, ORNL
- Shaun Gleason, ORNL
- Dr. Kenji Ono, RIKEN AICS
- Jeff Larkin, Nvidia
- Bronson Messer, ORNL
- Tjerk Straatsma, ORNL (Materials)
- Hooney Park, ORNL (Life Sciences)
- Sadaf Alam, CSCS (Life Sciences)
- Pat Worley, ORNL (Global Change)
- Seung-Hwan Lim, ORNL (Engineering)
- Jeremy Archuleta, ORNL (Engineering)
- David Bernholdt, ORNL (Fusion)
- Miwako Tsuji, RIKEN

Break-out Agenda
- 8:30  Overview of the breakout; send out emissaries (Life Sciences: Hooney Park; Materials by Design: Tjerk Straatsma; Global Change Prediction: Pat Worley; Computational Engineering: Jeremy Archuleta; Nuclear Sciences: Miwako Tsuji; Fusion Sciences: David Bernholdt)
- 8:45  Robert Harrison, Scott Klasky, Kenji Ono, Mitsuhisa Sato
- 10:00 Break
- 10:30 Introductions of remaining participants, with reactions to the overview talks
- 11:30 Identification of key challenges from a computer science perspective

Break-out Agenda (continued)
- Noon  Lunch
- 1:00  Re-write charge questions
- 1:30  Draft answers to charge questions
- 3:00  Break
- 3:30  Finalize answers to charge questions
- 4:30  Integrate issues identified by emissaries
- 5:30  Adjourn

Co-design
Models for co-design:
- Embedding
  - Embed computer scientists in each application team
  - Embed application scientists in each computer science team
- Define and provide abstractions
  - Possibly subject to abuse
  - Obsolescence

Symmetry: what's good for apps is good for CS:
- CS needs abstract representations of applications: benchmarks, mini-apps, skeletons, motifs, etc.
- Apps need abstract representations of machines: programming models and languages (parcels, X10, Chapel), simulators, performance-prediction tools (e.g., Aspen), etc.

The role of software engineering?
- What is the research challenge?

Why co-design now?
- Computation is changing
- Explore flexibility on both sides, apps and systems; re-examine boundaries
- Need to ensure continuity into the future
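One concrete form of the "abstract representations of applications" mentioned above is a skeleton or mini-app: a few dozen lines that reproduce an application's dominant compute pattern so computer scientists can study it without the full code base. A minimal sketch, using a 1-D Jacobi stencil as a stand-in pattern (the kernel and its parameters are illustrative, not taken from any of the workshop applications):

```python
# Minimal "skeleton app": a 1-D Jacobi relaxation kernel standing in
# for a real application's dominant computation/communication pattern.

def jacobi_sweep(u):
    """One Jacobi sweep with fixed boundary values u[0] and u[-1]."""
    interior = [(u[i - 1] + u[i + 1]) / 2.0 for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]

def run(u, sweeps):
    """Apply repeated sweeps; converges to a linear profile."""
    for _ in range(sweeps):
        u = jacobi_sweep(u)
    return u

if __name__ == "__main__":
    # Boundaries 0 and 1; the interior relaxes toward u[i] = i/9.
    print(run([0.0] * 9 + [1.0], 1000))
```

A skeleton this small is enough to exercise memory-access patterns, vectorization, and (in a distributed version) halo exchange, which is precisely what makes mini-apps useful currency between application and CS teams.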

Examples of systems co-design questions
- Hardware/NIC
  - What if the NIC can inject messages directly into an L1 cache?
  - What if the NIC provides computational capabilities (collectives, matching, atomics, etc.)?
- Streaming remote data to memory (bypassing the storage system)?
- Application profiles to guide compiler optimizations?
- How to manage different types of memory/storage?
  - What is the memory access pattern?
  - How much cache? How much NVRAM?
- Which data-reduction strategies can be deployed in storage or communication?
- Where can we tolerate overheads (OS noise, etc.)?
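Questions like "what if the NIC runs the collectives?" are usually explored first with back-of-envelope analytic models before anyone builds hardware. A minimal sketch of such a model for a tree-based allreduce, where NIC offload is modeled simply as a smaller per-hop software overhead; all constants here are invented for illustration:

```python
import math

def allreduce_time(p, latency_us, per_hop_overhead_us):
    """Binomial-tree allreduce cost model: ceil(log2(p)) communication
    steps, each paying network latency plus per-hop software overhead."""
    steps = math.ceil(math.log2(p))
    return steps * (latency_us + per_hop_overhead_us)

# Hypothetical numbers: offloading matching/reduction to the NIC is
# modeled as shrinking the per-hop software overhead from 2.0 to 0.2 us.
host_based = allreduce_time(p=2**20, latency_us=1.0, per_hop_overhead_us=2.0)
nic_offload = allreduce_time(p=2**20, latency_us=1.0, per_hop_overhead_us=0.2)
print(host_based, nic_offload)  # roughly 60 vs 24 microseconds
```

Even a toy model like this frames the co-design conversation: it says where the payoff of NIC offload comes from (per-hop overhead, multiplied by tree depth) and therefore what measurements the application teams need to supply.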

Programming Models
Programming models must be more holistic:
- Encompass more than nodes and machines: not MPI+X, but integration of I/O, storage, and the rest of the world
- End-to-end design: expose the whole context for computation so we can address the real problem; there is no point doing co-design in areas that aren't the bottleneck

We need a hierarchy of abstractions:
- MPI+X: good for nodes and machines, but misses the broader context
- Hadoop: encompasses the full data context and enables moving computation to data, but is too inefficient for scientific applications

Network to the outside world: on-node interconnect, in-machine interconnect, machine-room interconnect, "rest of world"

Multiuser, interactive supercomputing
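The "moving computation to data" idea credited to Hadoop above can be shown in miniature: ship a small map function to each data partition and combine only the small partial results, instead of moving all the data to one node. A toy sketch (the partitions and word-count task are invented for illustration):

```python
from collections import Counter
from functools import reduce

# Each string stands in for a data partition resident on some node.
partitions = [
    "exascale systems need holistic programming models",
    "programming models must encompass storage and io",
]

def map_partition(text):
    """The function that would be shipped to where the data lives."""
    return Counter(text.split())

# Only the small Counter objects travel; the raw data stays put.
partials = [map_partition(p) for p in partitions]
totals = reduce(lambda a, b: a + b, partials)
print(totals["programming"])  # 2
```

The slide's criticism still stands: the model is attractive for its data locality and fault tolerance, but the per-record overheads of general map-reduce frameworks are too high for tightly coupled scientific kernels.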

Commodity: why it's hard to define the programming model
What is the right commodity?
- CPU + Cell did not become the commodity
- CPU + GPU has proven to be a much more sustainable commodity

What is the right level for commodity?
- Commodity within a node (individual components/processors)
- Commodity in software layers (e.g., Linux)
- Commodity in infrastructure (e.g., cloud technologies)

Breakout Charge Questions
1. What technical breakthroughs in science and engineering research can be enabled by exascale platforms and are attractive targets for Japan-US collaboration over the next 10 years?

Exascale challenges:
- Support for many-core architectures
- Communication layers, e.g., PGAS
- Application composition (beyond communication)
- Memory/storage hierarchy (including staging and file systems)
- Workflow management
- Lightweight/micro kernels
- Energy-efficient scheduling
- Performance measurement, modeling, and prediction
- Resilience: fault models, and minimizing duplication in detection and response
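The resilience challenge listed above is most commonly attacked with checkpoint/restart: periodically snapshot application state so a fault costs only the work done since the last checkpoint. A toy sketch of that control loop, with the fault injection and state entirely invented for illustration:

```python
import pickle

def run_with_checkpoints(steps, ckpt_every, fail_at=None):
    """Deterministic toy computation with periodic checkpoints and
    one optional injected fault, recovered by rolling back."""
    state = {"step": 0, "value": 0}
    ckpt = pickle.dumps(state)              # initial checkpoint
    while state["step"] < steps:
        if fail_at is not None and state["step"] == fail_at:
            fail_at = None                  # inject one fault, then recover
            state = pickle.loads(ckpt)      # roll back to last checkpoint
            continue
        state["step"] += 1
        state["value"] += state["step"]     # the "work" for this step
        if state["step"] % ckpt_every == 0:
            ckpt = pickle.dumps(state)      # commit a checkpoint
    return state

print(run_with_checkpoints(steps=10, ckpt_every=3, fail_at=7))
# Fault at step 7 rolls back to the step-6 checkpoint; the final
# result matches the fault-free run: {'step': 10, 'value': 55}
```

The exascale concern on the slide is exactly the tension this loop exposes: checkpoint frequency trades I/O overhead against recomputation, and duplicated detection/response machinery in every layer multiplies that overhead.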

Breakout Charge Questions, continued
2. What is the representative suite of application systems in your research area, available today, which should form the basis of your co-design communication with computer architects and application teams?
a) How are these applications currently constrained by compute and data resources, programming models, or available software tools?
b) What are the gaps in available applications and application workflows, and what is required to fill those gaps?
c) Which of these are ripe for collaboration within the context of Japan-US cooperation?
- I/O systems (and frameworks), e.g., ADIOS
- Communication layers, e.g., PGAS, UCCS
- Application composition, e.g., COMPOSE/Hobbes
- Visualization and analysis services
- Simulation, emulation, and modeling tools

Breakout Charge Questions, continued
3. How can the application systems research community, represented by a topical breakout at this workshop, constructively engage the vendor community in co-design?
a) How should the various aspects of the application and architecture be optimized for effective utilization of exascale compute and data resources?
b) Consider all aspects of exascale application: formulation and basic algorithms, programming models and environments, data analysis and management, and hardware characteristics.
- Full engagement of vendors and application teams
- Connected to the procurement process
  - CAAR at ORNL (one-way: applications adapt to the architecture)
  - Co-design for post-K (two-way: applications influence the architecture and vice versa)

Breakout Charge Questions, continued
4. How can you best manage the "conversations" with computer designers/architects and application developers around co-design such that (1) they are practical for computer design, and (2) the results are correctly interpreted within both communities?
a) What are the useful performance benchmarks from the perspective of your domain?
b) Are mini-apps an appropriate and/or feasible approach to capture your needs for communication to the computer designers?
c) Are there examples of important full applications that are an essential basis for communication with computer designers?
d) Can these be simplified into skeleton apps or mini-apps to simplify and streamline the co-design conversation?

Breakout Charge Questions, continued
5. Describe the most important programming models and environments in use today within your community and characterize them as sustainable or unsustainable.
a) Do you have appropriate methods and models to expose application parallelism in a high-performance, portable manner?
b) Are best practices in software engineering often or seldom applied?
c) Going forward, what are the critically important programming languages?
d) On which libraries and/or domain-specific languages (DSLs) is your research community dependent?
e) Are new libraries or DSLs needed in your research domain?
f) Are these aspects of your programming environment sustainable, or are new models needed to ensure their availability into the future?

Breakout Charge Questions, continued
6. Does your community have mature workflow tools, implemented within leadership computing environments, to assist with program composition, execution, analysis, and archiving of results? If not, what are your needs, and is there opportunity for added value?
a) For example, do you need support for real-time, interactive workflows to enable integration with real-time data flows?

Breakout Charge Questions, continued
7. What are the new programming models, environments, and tools that need to be developed to achieve our science goals with sustainable application software?

Breakout Charge Questions, continued
8. Is there a history, a track record, in your research community of co-design for HPC systems installed in the past, and have any co-design studies been done for those systems to document the effectiveness of co-design?