Scalability Study of S3D using TAU Sameer Shende


1 Scalability Study of S3D using TAU Sameer Shende tau-team@cs.uoregon.edu

2 Acknowledgements (slide footer: TAU Performance System, S3D Scalability Study)
• Alan Morris [UO]
• Kevin Huck [UO]
• Allen D. Malony [UO]
• Kenneth Roche [ORNL]
• Bronis R. de Supinski [LLNL]
The performance data presented here is available at: http://www.cs.uoregon.edu/research/tau/s3d

3 TAU Parallel Performance System
• http://www.cs.uoregon.edu/research/tau/
• Multi-level performance instrumentation
• Multi-language automatic source instrumentation
• Flexible and configurable performance measurement
• Widely-ported parallel performance profiling system
• Computer system architectures and operating systems
• Different programming languages and compilers
• Support for multiple parallel programming paradigms
• Multi-threading, message passing, mixed-mode, hybrid

4 Scalability Study
• Harness testcase
• Platform: Jaguar Cray XT3 at ORNL
• Process counts: 1p, 8p, 64p, 512p
• Goal: to evaluate scaling properties of code regions
• Scalability of MPI operations

5 Introduction to ParaProf: Main Window
% paraprof *.ppk
Loads all 1p, 8p, 64p, 512p profile datasets together.
[Screenshot callouts: click left mouse button; click right mouse button]

6 ParaProf: MFLOPs sorted by Exclusive Time

7 Source Code View

8 Comparison Window: Inclusive Time

9 Comparing Level 1 Data Cache Misses

10 CPU Resource Stalls

11 ParaProf: 3D view for 512 cpus - Jagged Edges!

12 MPI_Wait - Jagged Edges Seen in 3D Window (512 cpus): pattern repeats every 8 cpus!
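
A repeating pattern like the one noted on slide 12 can also be checked numerically from per-rank MPI_Wait times. The following is a minimal sketch, not part of TAU or ParaProf: the function names are hypothetical, and the per-rank values are synthetic (the slide does not give the actual shape of the pattern). It picks the candidate period for which ranks sharing the same `rank % period` have the most similar values.

```python
from statistics import pvariance


def within_residue_variance(values, period):
    """Average variance of values whose rank indices share rank % period."""
    groups = [values[r::period] for r in range(period)]
    return sum(pvariance(g) for g in groups) / period


def best_period(values, candidates):
    """Return the smallest candidate period with the lowest within-residue variance."""
    return min(sorted(candidates), key=lambda p: within_residue_variance(values, p))


# Synthetic MPI_Wait times (seconds) for 512 ranks: an assumption for
# illustration only, where every 8th rank waits longer than the others.
waits = [12.0 if rank % 8 == 0 else 10.0 for rank in range(512)]

period = best_period(waits, [2, 4, 8, 16])
print("most regular period:", period)
```

Multiples of the true period fit equally well, which is why the sketch prefers the smallest candidate among ties.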

13 MPI_Wait - Histogram (Bins) View

14 Comparing MPI_Wait: MPI_Wait time increases steadily with processors!

15 PerfDMF: Performance Data Mgmt. Framework

16 PerfExplorer - Comparative Analysis
• Relative speedup, efficiency
• Total runtime, by event, one event, by phase
• Breakdown of total runtime
• Group fraction of total runtime
• Correlating events to total runtime
• Timesteps per second
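
The relative speedup and efficiency metrics listed above can be sketched as follows for a weak-scaling run, where the per-process problem size is fixed and ideal runtime stays constant. This is an illustration under stated assumptions, not PerfExplorer's implementation: the function names are hypothetical, the formulas are the common weak-scaling definitions (efficiency = T_base / T_p, speedup = efficiency × p / p_base), and the runtimes are made-up values rather than S3D measurements.

```python
def relative_efficiency_weak(times, base=None):
    """times maps process count -> runtime in seconds.

    Under weak scaling the ideal runtime is constant, so efficiency
    relative to the base run is T_base / T_p.
    """
    base = min(times) if base is None else base
    t_base = times[base]
    return {p: t_base / t for p, t in sorted(times.items())}


def relative_speedup_weak(times, base=None):
    """Scaled speedup: efficiency multiplied by the growth in process count."""
    base = min(times) if base is None else base
    eff = relative_efficiency_weak(times, base)
    return {p: e * (p / base) for p, e in eff.items()}


# Hypothetical runtimes (seconds) at the process counts used in the study.
times = {1: 100.0, 8: 110.0, 64: 125.0, 512: 160.0}

efficiency = relative_efficiency_weak(times)
speedup = relative_speedup_weak(times)
for p in times:
    print(f"{p:4d}p  efficiency={efficiency[p]:.3f}  speedup={speedup[p]:.1f}")
```

The same functions can be applied per event by passing the time spent in one event at each process count instead of the total runtime.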

17 PerfExplorer [diagram labels: S3D; TAU's PerfDMF database]

18 PerfExplorer: Select Experiment & Analysis

19 Relative Efficiency By Event

20 Relative Efficiency For S3D - Weak Scaling

21 Relative Speedup

22 Relative Efficiency & Speedup for One Event

23 Data Mining: Event Correlation to Total Time (r = 1 implies direct correlation)
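
The r on slide 23 is presumably the Pearson correlation coefficient between an event's time and the total runtime across the runs; that reading is an assumption, as the slide does not name the statistic. A minimal sketch with hypothetical timings:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values (seconds) at 1p, 8p, 64p, 512p; not S3D measurements.
total_runtime = [100.0, 110.0, 125.0, 160.0]
event_time = [2.0, 12.0, 27.0, 62.0]  # grows exactly in step with the total

r = pearson_r(event_time, total_runtime)
print(f"r = {r:.3f}")
```

An event whose time rises in lockstep with the total runtime gives r close to 1, which makes it a candidate for explaining the loss of scalability.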

24 MPI Scaling

25 Total Runtime Breakdown by Events

26 S3D - Building with TAU
• Change the compiler names in build/make.XT3:
    ftn => tau_f90.sh
    cc => tau_cc.sh
• Set compile-time environment variables:
    setenv TAU_MAKEFILE /spin/proj/perc/TOOLS/tau_latest/xt3/lib/Makefile.tau-callpath-multiplecounters-mpi-papi-pdt-pgi
  This stub makefile chooses callpath profiling, PAPI counters, MPI profiling, and PDT for source instrumentation.
    setenv TAU_OPTIONS '-optTauSelectFile=select.tau -optPreProcess'
  The selective instrumentation file eliminates instrumentation in lightweight routines; -optPreProcess pre-processes the Fortran source code using cpp before compiling.
• Set runtime environment variables for instrumentation control and PAPI event counter selection in the job submission script:
    export TAU_THROTTLE=1
    export COUNTER1=GET_TIME_OF_DAY
    export COUNTER2=PAPI_FP_INS
    export COUNTER3=PAPI_L1_DCM
    export COUNTER4=PAPI_RES_STL
    export COUNTER5=PAPI_L2_DCM

27 Selective Instrumentation in TAU
% cat select.tau
    BEGIN_EXCLUDE_LIST
    MCADIF
    GETRATES
    TRANSPORT_M::MCAVIS_NEW
    MCEDIF
    MCACON
    CKYTCP
    THERMCHEM_M::MIXCP
    THERMCHEM_M::MIXENTH
    THERMCHEM_M::GIBBSENRG_ALL_DIMT
    CKRHOY
    MCEVAL4
    THERMCHEM_M::HIS
    THERMCHEM_M::CPS
    THERMCHEM_M::ENTROPY
    END_EXCLUDE_LIST
    BEGIN_INSTRUMENT_SECTION
    loops routine="#"
    END_INSTRUMENT_SECTION
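
To check which routines a selective instrumentation file excludes before rebuilding, the file can be read with a few lines of Python. This is a hypothetical helper, not a TAU tool, and it only understands the BEGIN_.../END_... blocks that appear on the slide; the sample text is a shortened copy of that file.

```python
def parse_select_file(text):
    """Group the lines of a select.tau-style file by their BEGIN_/END_ section."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("BEGIN_"):
            current = line[len("BEGIN_"):]
            sections[current] = []
        elif line.startswith("END_"):
            current = None
        elif current is not None:
            sections[current].append(line)
    return sections


# Shortened sample based on the select.tau shown on the slide.
SAMPLE = """\
BEGIN_EXCLUDE_LIST
MCADIF
GETRATES
THERMCHEM_M::MIXCP
END_EXCLUDE_LIST
BEGIN_INSTRUMENT_SECTION
loops routine="#"
END_INSTRUMENT_SECTION
"""

sections = parse_select_file(SAMPLE)
print("excluded routines:", sections["EXCLUDE_LIST"])
print("instrument directives:", sections["INSTRUMENT_SECTION"])
```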

28 Getting Access to TAU on Jaguar
• set path=(/spin/proj/perc/TOOLS/tau_latest/x86_64/bin $path)
• Choose stub makefiles (TAU_MAKEFILE env. var.) from /spin/proj/perc/TOOLS/tau_latest/xt3/lib/Makefile.*
    Makefile.tau-mpi-pdt-pgi (flat profile)
    Makefile.tau-mpi-pdt-pgi-trace (event trace, for use with Vampir)
    Makefile.tau-callpath-mpi-pdt-pgi (single metric, callpath profile)
• Binaries of S3D can be found in ~sameer/scratch/S3D-BINARIES:
    withtau (papi, multiplecounters, mpi, pdt, pgi options)
    without_tau

29 Concluding Discussion
• Performance tools must be used effectively
• More intelligent performance systems for productive use
• Evolve to application-specific performance technology
• Deal with scale by "full range" performance exploration
• Autonomic and integrated tools
• Knowledge-based and knowledge-driven process
• Performance observation methods do not necessarily need to change in a fundamental sense
• More automatically controlled and more efficient use
• Develop next-generation tools and deliver to community
• Open source with support by ParaTools, Inc.
• http://www.cs.uoregon.edu/research/tau

30 Support Acknowledgements
• Department of Energy (DOE)
• Office of Science
• LLNL, LANL, ORNL, ASC
• PERI

