1
Programming Models for SimMillennium
Kathy Yelick
NSF Infrastructure Site Visit
March 2, 1998
2
Talk Outline
Programming problems in SimMillennium
Overview of software tools
Facilities for research in programming systems
Titanium project
3
Programming Challenges
Large scale computations
Optimized simulation algorithms are complex
Use of hierarchical parallel machine
Constructing services must be simple
Cost-conscious programming
Minimization algorithms
Unstructured meshes
Adaptive meshes
4
Infrastructure for Programming Systems
High end machines converging on CLUMPs
Network bandwidth needed for applications
  Many non-local accesses (20-50% of grid points for AMR)
  Few floating point operations per element
Having machine in the building provides low-threshold access to hardware
Access to visualization facility crucial
  observations in applications
  debugging
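A rough way to see how a figure in the 20-50% range can arise is a surface-to-volume estimate. The sketch below assumes a cubic patch of n x n x n grid points in which only the outermost layer of points needs data from neighboring patches. Both the patch shape and the one-point layer are illustrative assumptions, not measurements from the talk.

```python
def boundary_fraction(n: int, ghost: int = 1) -> float:
    """Fraction of points in an n x n x n patch within `ghost` points of a face.

    Hypothetical model: these boundary points stand in for the grid points
    that need non-local accesses.
    """
    if n <= 2 * ghost:
        return 1.0  # patch too small to have any interior
    interior = (n - 2 * ghost) ** 3
    return 1.0 - interior / n ** 3


for n in (8, 16, 32):
    print(n, round(boundary_fraction(n), 3))
```

Under these assumptions, patches of roughly 10 to 28 points on a side fall in the 20-50% range, and smaller patches push the fraction higher.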
5
Programming Tools for SimMillennium
Basic tools installed and supported
  1 billion bytes of code in the “software warehouse” exported
  MPI, C/C++/Fortran compilers, threads, numerical libraries
Novel systems based on user demand
  Parallel Matlab, Khoros, HPF, DOE2000 Tools (PETSc, etc.)
Research systems developed here
  Communication substrates: Active Messages (Culler)
  Languages: Split-C (Culler & Yelick), Titanium (Aiken, Graham, Hilfinger, Yelick)
  Service building tools (Brewer, Culler, and Joseph)
6
Titanium Approach
Performance is primary goal, expressiveness second
Parallelism model
  SPMD
  Global address space with global/local distinction
Based on safe language: Java
  Safety simplifies programming and compiler analysis
  Multidimensional arrays added
  Immutable classes added
Optimizing compiler
Domain-specific language extensions
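The SPMD model with a shared address space can be sketched in a few lines. This is not Titanium code: it is a Python illustration using threads, where every "process" runs the same function, keeps its own local slice of the data, and publishes one value into a shared array. The names `spmd_run`, `program`, `partial`, and `result` are made up for this example.

```python
import threading


def spmd_run(nprocs, data):
    """Run the same program on nprocs threads; each owns one slice of data."""
    partial = [0] * nprocs   # shared array: any rank may read any slot
    result = [0] * nprocs
    barrier = threading.Barrier(nprocs)

    def program(rank):
        local = data[rank::nprocs]     # local data: touched only by this rank
        partial[rank] = sum(local)     # each rank writes only its own slot
        barrier.wait()                 # every rank reaches the same barrier
        result[rank] = sum(partial)    # reads of other ranks' slots are safe now

    threads = [threading.Thread(target=program, args=(r,)) for r in range(nprocs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


print(spmd_run(4, list(range(100))))
```

The distinction between `local` and `partial` loosely mirrors the global/local distinction on the slide: data known to be local can be accessed without any communication, while shared data needs synchronization before it is read.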
7
New Compiler Analyses for Parallelism
Analysis of synchronization
  finds unmatched barriers, parallel code blocks
  extends traditional control flow analysis
Analysis of communication
  reorder and pipeline memory operations without observed effect
  extends traditional dependence analysis
Analyses extended to domain-specific constructs
  arrays indexed by domains of points
  looping constructs provide summary information
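To give a flavor of what "finding unmatched barriers" means, here is a toy checker. It is a deliberately simplified sketch and not the Titanium compiler's analysis: programs are nested Python lists, a conditional is the tuple `("if", then_block, else_block)`, and the checker conservatively assumes that different processes may take different branches of every conditional. All names are hypothetical.

```python
def barrier_count(block):
    """Return the number of barriers a block executes.

    Raises ValueError if the two branches of any conditional execute a
    different number of barriers, since processes taking different
    branches would then wait at mismatched barriers.
    """
    total = 0
    for stmt in block:
        if stmt == "barrier":
            total += 1
        elif isinstance(stmt, tuple) and stmt[0] == "if":
            _, then_block, else_block = stmt
            then_n = barrier_count(then_block)
            else_n = barrier_count(else_block)
            if then_n != else_n:
                raise ValueError(f"unmatched barriers: {then_n} vs {else_n}")
            total += then_n
        # any other statement is ordinary work and contains no barrier
    return total


ok = ["work", ("if", ["barrier"], ["work", "barrier"]), "barrier"]
bad = [("if", ["barrier"], [])]

print(barrier_count(ok))
try:
    barrier_count(bad)
except ValueError as err:
    print("rejected:", err)
```

A real analysis would be less conservative, for example by accepting conditionals whose test is known to have the same value on every process.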
8
Titanium Status
Runs on NOW and SMPs
Sequential performance competitive with C/F77
  preliminary optimizations
  within 40% for many problems
  3D multigrid 13% faster on Pentium
Parallel efficiency good
  EM3D (unstructured kernel)
  3D AMR limited by algorithm
[Figure: speedup versus number of processors]
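For reference, speedup and parallel efficiency are conventionally defined as S(p) = T(1) / T(p) and E(p) = S(p) / p, where T(p) is the running time on p processors. A minimal sketch follows; the timings are made-up placeholders, not Titanium measurements.

```python
def speedup(t1: float, tp: float) -> float:
    """Speedup: time on one processor divided by time on p processors."""
    return t1 / tp


def efficiency(t1: float, tp: float, p: int) -> float:
    """Parallel efficiency: speedup divided by the processor count."""
    return t1 / (p * tp)


# Hypothetical timings in seconds, chosen only to illustrate the formulas.
t1, t8 = 100.0, 16.0
print(speedup(t1, t8), efficiency(t1, t8, 8))
```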
9
Support for SimMillennium Applications
Long-standing collaboration in fluids and AMR
  astrophysics (McKee), combustion (Colella), turbulence (Marcus),…
Planned collaborations in unstructured meshes and sparse solvers
  earthquake modeling (Fenves), TCAD (Neureuther),...
Proposed solution: extend Titanium
  Linguistic support for unstructured data
  Development of new analyses and optimizations
10
SimMillennium Machines
CLUMPs add new level in hierarchy
  algorithms currently optimize for caches on SMPs
  communication optimizations for distributed memories
  need to simultaneously optimize both
Need for multiprotocol communication
  Active Messages and MPI
  Eliminating protocols during compilation
Locality and load balance trade-off
  understood for flat machine models
  different within and between SMPs
11
Programming in the Economy
New optimization criterion: cost
  Mapping performance data to cost models
  Use of performance models in algorithm and system design is a common theme of UCB research
  Need tools to map performance measurements to these models
Building services
  Need to lower the threshold
  Service building packages provide functionality, but are too low level
  Titanium language provides easy integration
12
Measuring Success
Complete research agendas
  high performance
  reasonable programmability
Users
  on SimMillennium using tools, including research languages and systems
Services
  new functionality provided
  “making money” through outside customers