Facilitating Collaboration in High-Performance Computing Projects with an Interaction Room. Matthias Book (1), Morris Riedel (1, 2), Helmut Neukirchen (1), Markus Götz (1, 2). (1) University of Iceland, (2) Jülich Supercomputing Centre, Germany. {book, helmut}@hi.is, {m.riedel, m.goetz}@fz-juelich.de. The research leading to these results has received funding from the European Union Horizon 2020 – the Framework Programme for Research and Innovation (2014-2020) under Grant Agreement n°754304.
The Nature of Software Development: Social Learning. “Because software is embodied knowledge, and that knowledge is initially dispersed, tacit, latent, and incomplete, software development is a social learning process.” Howard Baetjer, Jr.: Software as Capital. IEEE Computer Society Press, 1998. Helmut Neukirchen: Facilitating Collaboration in HPC with an Interaction Room
Typical Software Development in High-Performance Computing (HPC): Development often starts with a domain expert writing HPC code, e.g. a science PhD student with no formal education in HPC or software engineering (SE). Problems: Code often outlives the academic scientists who wrote it, because they are on temporary contracts; the code then needs to be handed over to new scientists, and implicit assumptions and knowledge are lost. Sooner or later, the domain expert struggles with HPC problems; the code then needs to be handed to HPC experts, e.g. at a supercomputing centre, and implicit assumptions and knowledge need to be transferred.
The Interaction Room (IR): Successfully used in business information system development. An Interaction Room is a dedicated room for the project team where domain experts and software development experts feel at home, with large whiteboards (paper or digital canvases) on the walls but without a classic conference table. It is used to visualize and discuss key project aspects informally instead of going over tedious documents. M. Book, S. Grapenthin, and V. Gruhn: “Seeing the forest and the trees: focusing team interaction on value and effort drivers”, Proc. ACM SIGSOFT 20th Intl. Symp. on Foundations of Software Engineering, 2012.
Example: Process canvas for an information system, with annotations to capture implicit assumptions/knowledge.
Interaction Room: A Pragmatic Approach to Conceptualizing Software. Informal, high-level sketches of software models: they sacrifice formality and completeness (e.g. no strict UML) in favor of pragmatism and interdisciplinary understanding; they serve as a catalyst for the identification, understanding and discussion of the most critical aspects; they are annotated to capture implicit assumptions/knowledge.
Interaction Room for HPC: Facilitate collaboration of experts from the science and HPC or software engineering domains! The Interaction Room needs to be adapted by providing HPC-specific canvases that address crucial interdisciplinary discussion points and typical HPC software development phases.
Crucial Interdisciplinary Communication Points in HPC Simulation Science Projects: Domain experts need to help HPC experts understand, e.g.: What research question are we trying to answer? What assumptions/simplifications are made? HPC experts need to validate technical decisions with domain experts, e.g.: domain decomposition, HPC cluster HW architecture.
Software Development Phases in HPC Simulation Science Projects: Understand the problem domain, both in general (goal and scope) and in detail (what would serial code for the problem domain look like?). Perform appropriate domain decomposition, choose appropriate communicators, etc. Implement communication between processes/threads; mix it into the code of the problem domain. Test and validate the simulation model and code. Optimize accuracy, tune performance (specific to the HW architecture). Slide labels: Domain concepts to be supported by IR; HPC concepts to be supported by IR.
Interaction Room Canvases for HPC Simulation Science Projects. Problem canvas: Goal and scope of the research question (the domain). Real-world canvas: Description of the pertinent aspects of the science domain. Decomposition canvas: Breakdown of the scientific model into parallelizable units. Architecture canvas: Implementation of the simulation on suitable HPC technology/HW architecture. The canvases are not necessarily used sequentially; their contents are refined iteratively.
Problem Canvas: Goal and scope of the research question about the domain. Domain experts collect: research question, boundary conditions, assumptions, abstractions, quality requirements. Example: Heat dissipation problem. Research question: “What will the temperature in a room be after running an air conditioner on one side and a heater on the other for several hours?” Boundary conditions: Starting temp., A/C and heater settings. Abstractions: Consider heat transfer by air flow / convection only, not by radiation. Assumptions: No moving objects in the room, no windows/doors. Quality requirements: Temp. must be determined as a double-precision float.
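The sections of the problem canvas could also be kept in a small data structure, for example to see which sections are still empty after a session. The Python sketch below is purely illustrative: the class `ProblemCanvas`, its field names, and the method `missing_sections` are hypothetical and not part of the Interaction Room method. The entries are taken from the heat dissipation example, with the quality requirements deliberately left out to show the check.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class ProblemCanvas:
    """Hypothetical record of the five problem canvas sections."""
    research_question: str = ""
    boundary_conditions: List[str] = field(default_factory=list)
    abstractions: List[str] = field(default_factory=list)
    assumptions: List[str] = field(default_factory=list)
    quality_requirements: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


heat = ProblemCanvas(
    research_question="What will the temperature in a room be after running "
                      "an air conditioner on one side and a heater on the "
                      "other for several hours?",
    boundary_conditions=["Starting temp.", "A/C and heater settings"],
    abstractions=["Heat transfer by air flow / convection only, not radiation"],
    assumptions=["No moving objects in the room", "No windows/doors"],
)
print(heat.missing_sections())  # lists the sections still to be discussed
```

A structure like this only mirrors what is on the wall; the informal discussion around the canvas remains the point of the exercise.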
Real-World Canvas: Description of the pertinent aspects of the domain. Domain experts sketch static properties of the simulation space: physical laws, spatial setup. Domain experts sketch dynamic properties of the simulation process: forces, events. Example: Heat dissipation problem. Room geometry, heater and A/C location, how convection forces/air flows work, appropriate formulae for the physical laws.
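As an illustration of the kind of formula that might be noted on the real-world canvas for the heat dissipation example, one common model for a temperature field in moving air is the convection-diffusion equation below. Whether this is the law the domain experts would actually choose is an assumption; the symbols T (temperature), u (air velocity field) and α (thermal diffusivity) are introduced here for illustration only.

```latex
\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T = \alpha \, \nabla^{2} T
```

Under the further simplifying assumptions of still air (u = 0) and a steady state (∂T/∂t = 0), this reduces to the Laplace equation ∇²T = 0, which is the kind of simplification that belongs on the canvas as an explicit annotation.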
Decomposition Canvas: Breakdown of the scientific model into parallelizable units. HPC experts sketch a parallel model reflecting the real-world model. Static HPC aspects: domain decomposition, data structures. Dynamic HPC aspects: communication patterns, halos/ghosts, adaptive mesh refinement, iterative numerical methods. Example: Heat dissipation problem. Adaptive mesh refinement, halo creation and communication strategy, Cartesian communicator, iterative method to solve the formula of the physical heat transfer law.
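To make the terms domain decomposition and halo more concrete, the following Python sketch emulates them serially on a one-dimensional strip of cells. It is a minimal illustration under several assumptions, not the project's implementation: plain lists stand in for processes, the mesh is uniform (no adaptive refinement), and all function names are hypothetical. In a real MPI code, the neighbours would typically come from a Cartesian communicator (e.g. created with `MPI_Cart_create`) and the halo copies would be messages between processes.

```python
def split_with_halos(cells, parts):
    """Split the interior of `cells` into `parts` contiguous blocks.

    cells[0] and cells[-1] are fixed boundary values. Each block holds
    its owned cells plus one halo (ghost) cell on either side.
    """
    interior = len(cells) - 2
    base, extra = divmod(interior, parts)
    blocks, start = [], 1
    for rank in range(parts):
        size = base + (1 if rank < extra else 0)
        blocks.append(cells[start - 1:start + size + 1])
        start += size
    return blocks


def exchange_halos(blocks):
    """Copy edge cells of each block into the halo of its neighbour."""
    for left, right in zip(blocks, blocks[1:]):
        left[-1] = right[1]   # right halo <- neighbour's first owned cell
        right[0] = left[-2]   # left halo  <- neighbour's last owned cell


def jacobi_sweep(block):
    """Update owned cells to the mean of their neighbours; keep the ends."""
    inner = [0.5 * (block[i - 1] + block[i + 1])
             for i in range(1, len(block) - 1)]
    return [block[0]] + inner + [block[-1]]


def gather(blocks, left_boundary, right_boundary):
    """Reassemble the global strip from the owned cells of all blocks."""
    owned = [value for block in blocks for value in block[1:-1]]
    return [left_boundary] + owned + [right_boundary]


# Illustrative values only: cool wall at 18, warm wall at 30, air at 20.
strip = [18.0] + [20.0] * 10 + [30.0]

blocks = split_with_halos(strip, 3)
for _ in range(50):
    exchange_halos(blocks)
    blocks = [jacobi_sweep(block) for block in blocks]
decomposed = gather(blocks, strip[0], strip[-1])

serial = strip
for _ in range(50):
    serial = jacobi_sweep(serial)

print(decomposed == serial)
```

The comparison at the end is the sort of check domain experts and HPC experts can agree on: the decomposed computation must reproduce the serial reference.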
Architecture Canvas: Implementation of the simulation on suitable HPC technology. HPC experts sketch the mapping of the parallel model to the actual cluster HW. Static aspects: cluster architecture, memory model, tools. Dynamic aspects: communication protocols, I/O operations, checkpointing. Example: Heat dissipation problem. 1024 cores, hybrid MPI/OpenMP, Jacobi solver, parallel I/O, checkpoints every 2,000 iterations.
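The interplay of a Jacobi solver and periodic checkpoints can be sketched in a few lines of serial Python. This is an illustration under stated assumptions rather than the project's code: it solves a steady-state Laplace problem on a small 2D grid, keeps checkpoints in an in-memory dictionary instead of using parallel I/O, and ignores the hybrid MPI/OpenMP mapping entirely. The function names, the temperatures, the fixed-temperature walls, and the short checkpoint interval (10 instead of 2,000 iterations) are all chosen for illustration.

```python
def make_room(rows, cols, start_temp, ac_temp, heater_temp):
    """Grid at start_temp with the A/C wall on the left, heater on the right."""
    grid = [[start_temp] * cols for _ in range(rows)]
    for row in grid:
        row[0] = ac_temp
        row[-1] = heater_temp
    return grid


def jacobi_step(grid):
    """Return a new grid; each interior cell becomes the mean of its 4 neighbours."""
    new = [row[:] for row in grid]
    for r in range(1, len(grid) - 1):
        for c in range(1, len(grid[0]) - 1):
            new[r][c] = 0.25 * (grid[r - 1][c] + grid[r + 1][c]
                                + grid[r][c - 1] + grid[r][c + 1])
    return new


def run(grid, iterations, checkpoint_every, checkpoints, start=0):
    """Iterate from `start` up to `iterations`, snapshotting periodically."""
    for it in range(start + 1, iterations + 1):
        grid = jacobi_step(grid)
        if it % checkpoint_every == 0:
            checkpoints[it] = [row[:] for row in grid]  # independent copy
    return grid


room = make_room(6, 8, start_temp=20.0, ac_temp=18.0, heater_temp=30.0)
checkpoints = {}
full = run(room, iterations=40, checkpoint_every=10, checkpoints=checkpoints)

# Restart from the snapshot taken at iteration 20 instead of from scratch.
resumed = run(checkpoints[20], iterations=40, checkpoint_every=10,
              checkpoints={}, start=20)
print(sorted(checkpoints), resumed == full)
```

Restarting from a checkpoint and arriving at the same result as the uninterrupted run is the property that makes the checkpoint interval a trade-off worth discussing on the canvas: shorter intervals cost more I/O, longer ones lose more work after a failure.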
Summary and Outlook: The Interaction Room facilitates collaboration of domain experts (scientists) and HPC/software experts. It helps to capture implicit knowledge/assumptions. Canvases are tailored to HPC simulation science project phases: scientific problem, real-world scientific processes, parallel decomposition, HPC architecture. Outlook: More precise definition of canvases and annotations. Supporting HPC simulation science and HPC data science. Validation in HPC projects, e.g. porting HPC code to the European Exascale prototype cluster DEEP-EST (Dynamical Exascale Entry Platform – Extreme Scale Technologies), http://www.deep-est.eu