Published by Tiffany Chatten. Modified over 9 years ago.
1
Copyright, HiPERiSM Consulting, LLC, http://www.hiperism.com
George Delic, Ph.D.
HiPERiSM Consulting, LLC
(919)484-9803
P.O. Box 569, Chapel Hill, NC 27514
george@hiperism.com
http://www.hiperism.com
2
CHOOSING A COMPILER FOR AQM APPLICATIONS ON LINUX
George Delic, Ph.D.
Models-3 User’s Workshop
October 27-29, 2003
RTP, NC
3
Overview
1. Introduction
2. Choice of Hardware
3. Choice of Compilers
4. Choice of Benchmarks
5. Comparing Execution Times
6. Evaluation of SSE Results
7. Tests for AQMs
8. Conclusions
4
Introduction
Motivation
  Air quality models (AQMs) are migrating to COTS hardware
  Linux is preferred
  Rich choice of compilers is now available
  Need to learn about portability issues
What is known about compilers for IA-32?
  CMAQ releases switch compilers without comment
  Where is the analysis of differences in
    Performance?
    Numerical accuracy & stability?
    Portability problems?
5
Choice of Hardware & Compilers
Hardware
  Intel Pentium III (933 MHz, dual processor) with SSE extensions and 256 KB L2 cache
  Linux 2.4.20 kernel
Fortran compilers for IA-32
  Absoft 8.0
  Intel 7.1
  Lahey 5.6
  Portland CDK 4.0
6
Choice of Benchmarks
Kallman Integer and Logical Algorithm
  Uses only integer (I) and logical (L) operations with bit intrinsics
  Negligible I/O and memory operations
  Six cases with problem size scaling
Stommel Ocean Model (SOM): single-precision (sp) Floating Point Algorithm
  Jacobi iteration sweep over 2-D physical domain
  Regular loops optimal for testing vectorization
  Six cases in the range N = 2x10^3 to 7x10^3, with N^2 = 4 to 49 million data points
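A Jacobi iteration sweep over a 2-D domain, as named for the Stommel benchmark, can be sketched as follows. This is a minimal illustration on a generic Poisson problem, not the benchmark's actual code: the function names `jacobi_sweep` and `solve`, the grid size, the boundary values, and the tolerance are all assumptions chosen for the example.

```python
def jacobi_sweep(u, f, h):
    """Return a new grid after one Jacobi sweep for laplacian(u) = f.

    Every interior point is updated from the OLD values of its four
    neighbours, which is what makes the loop nest regular and vectorizable.
    """
    n_rows, n_cols = len(u), len(u[0])
    new = [row[:] for row in u]  # boundary values are carried over unchanged
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            new[i][j] = 0.25 * (
                u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
                - h * h * f[i][j]
            )
    return new


def solve(u, f, h, tol=1e-6, max_iter=10_000):
    """Repeat sweeps until the largest point-wise change drops below tol."""
    for iteration in range(1, max_iter + 1):
        new = jacobi_sweep(u, f, h)
        change = max(
            abs(new[i][j] - u[i][j])
            for i in range(len(u))
            for j in range(len(u[0]))
        )
        u = new
        if change < tol:
            return u, iteration
    return u, max_iter


# Hypothetical toy problem: 5 x 5 grid, boundary fixed at 1.0, zero source term.
n = 5
grid = [[1.0 if i in (0, n - 1) or j in (0, n - 1) else 0.0 for j in range(n)]
        for i in range(n)]
source = [[0.0] * n for _ in range(n)]
solution, iterations = solve(grid, source, h=1.0 / (n - 1))
print(f"converged after {iterations} sweeps")
```

The benchmark cases use N in the thousands, so the inner loop runs over millions of points per sweep; the toy grid here only shows the update pattern.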
7
Choice of Benchmarks (cont.)
Princeton Ocean Model (POM): double-precision (dp) Floating Point (FP) Algorithm
  Example of “real-world” code that is numerically unstable with sp arithmetic!
  500+ vectorizable loops to exercise compilers
  9 procedures account for 85% of CPU time
  2-Day simulation for two cases:
    Small problem: 65 x 49 x 21
    Large problem: 100 x 40 x 15
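Why single-precision (sp) arithmetic can be a problem for long-running codes can be illustrated with a small accumulation experiment. The sketch below is not taken from POM; it only emulates sp rounding in Python by packing each intermediate result into a 4-byte float. The helper names `to_single` and `accumulate`, the step 0.1, and the count 100,000 are assumptions for the example.

```python
import struct


def to_single(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, count, single):
    """Add `step` to a running total `count` times, optionally rounding to sp each time."""
    if single:
        step = to_single(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single:
            total = to_single(total)  # emulate storing the sum in an sp variable
    return total


exact = 10_000.0  # 0.1 added 100,000 times in exact arithmetic
sp_total = accumulate(0.1, 100_000, single=True)
dp_total = accumulate(0.1, 100_000, single=False)
sp_error = abs(sp_total - exact)
dp_error = abs(dp_total - exact)
print(f"sp error: {sp_error:.3e}, dp error: {dp_error:.3e}")
```

The sp error is larger than the dp error by many orders of magnitude, which is the kind of drift that can destabilize an iterative model run over many time steps.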
8
Comparing Execution Times: Kallman compiler switches
Compiler and version   Compiler command and selected switches
Absoft 8.0             f90 -O3 -ffixed
Intel 7.1              ifc -O3 -tpp6 -FI
Lahey 5.6              lf95 -tpp -fix
Portland 4.0           pgf90 -fast
9
Comparing Execution Times: Kallman (seconds)
N     Absoft      Intel       Lahey       Portland
30        0.21        0.36        0.48        0.60
44       40.38       80.19       98.45      135.29
48        6.44       13.15       16.16       22.52
52       23.03       48.20       59.30       83.28
56      197.78      412.83      509.31      712.42
60    12891.58    26734.09    32833.08    45451.38
10
Comparing Execution Times: Kallman (log10 seconds)
11
Comparing Execution Times: Kallman (ratio to Absoft time)
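The ratio to Absoft time named in this slide title can be recomputed from the Kallman timings reported on slide 9. The sketch below uses those reported seconds; the dictionary layout and the function name `ratios_to_baseline` are assumptions made for the example.

```python
# Kallman execution times in seconds, as reported on slide 9.
kallman_seconds = {
    30: {"Absoft": 0.21, "Intel": 0.36, "Lahey": 0.48, "Portland": 0.60},
    44: {"Absoft": 40.38, "Intel": 80.19, "Lahey": 98.45, "Portland": 135.29},
    48: {"Absoft": 6.44, "Intel": 13.15, "Lahey": 16.16, "Portland": 22.52},
    52: {"Absoft": 23.03, "Intel": 48.20, "Lahey": 59.30, "Portland": 83.28},
    56: {"Absoft": 197.78, "Intel": 412.83, "Lahey": 509.31, "Portland": 712.42},
    60: {"Absoft": 12891.58, "Intel": 26734.09, "Lahey": 32833.08, "Portland": 45451.38},
}


def ratios_to_baseline(times, baseline="Absoft"):
    """Divide every compiler's time by the baseline compiler's time, per problem size."""
    return {
        n: {compiler: seconds / row[baseline] for compiler, seconds in row.items()}
        for n, row in times.items()
    }


ratios = ratios_to_baseline(kallman_seconds)
for n, row in ratios.items():
    cells = "  ".join(f"{compiler}={value:.2f}" for compiler, value in row.items())
    print(f"N={n}: {cells}")
```

A ratio above 1 means the compiler's executable ran slower than the Absoft executable for that problem size.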
12
Comparing Execution Times: SOM (POM) compiler switches (without SSE)
Compiler and version   Compiler command and selected switches
Absoft 8.0             f90 -s -cpu:p6 -O3 (-N113) -ffixed
Intel 7.1              ifc -O3 (-r8) -tpp6 -FI
Lahey 5.6              lf95 -tpp (-dbl) -fix
Portland 4.0           pgf90 -fast (-r8) -Mvect
13
Comparing Execution Times: SOM without SSE (seconds)
N       Absoft    Intel    Lahey    Portland
2000      50.0     38.8     36.4      41.4
3000     110.5     94.4     87.7      92.7
4000     197.7    159.6    150.3     163.3
5000     305.3    224.3    246.8     253.1
6000     443.4    320.0    332.0     388.5
7000     586.5    427.6    477.9     524.4
14
Comparing Execution Times: SOM (without SSE)
15
Statistics for four compilers: SOM (without SSE)
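The transcript does not say which statistics this slide reported. As one plausible reading, the sketch below summarizes the spread across the four compilers for each SOM problem size using the mean, the sample standard deviation, and their ratio (coefficient of variation). The timings are the SOM values without SSE from slide 13; the choice of statistics and the name `compiler_spread` are assumptions.

```python
from statistics import mean, stdev

# SOM execution times without SSE, in seconds, as reported on slide 13.
som_seconds = {
    2000: {"Absoft": 50.0, "Intel": 38.8, "Lahey": 36.4, "Portland": 41.4},
    3000: {"Absoft": 110.5, "Intel": 94.4, "Lahey": 87.7, "Portland": 92.7},
    4000: {"Absoft": 197.7, "Intel": 159.6, "Lahey": 150.3, "Portland": 163.3},
    5000: {"Absoft": 305.3, "Intel": 224.3, "Lahey": 246.8, "Portland": 253.1},
    6000: {"Absoft": 443.4, "Intel": 320.0, "Lahey": 332.0, "Portland": 388.5},
    7000: {"Absoft": 586.5, "Intel": 427.6, "Lahey": 477.9, "Portland": 524.4},
}


def compiler_spread(times):
    """Per problem size: mean, sample standard deviation, and coefficient of variation."""
    summary = {}
    for n, row in times.items():
        values = list(row.values())
        average = mean(values)
        deviation = stdev(values)  # sample standard deviation over the four compilers
        summary[n] = {"mean": average, "stdev": deviation, "cv": deviation / average}
    return summary


spread = compiler_spread(som_seconds)
for n, stats in spread.items():
    print(f"N={n}: mean={stats['mean']:.1f} s, "
          f"stdev={stats['stdev']:.1f} s, cv={stats['cv']:.3f}")
```

The coefficient of variation is dimensionless, so it allows the compiler-to-compiler variability to be compared across problem sizes.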
16
Comparing Execution Times: POM (without SSE)
Case    Absoft    Intel    Lahey    Portland
1        909.1    826.4    728.8     836.3
2        825.1    786.9    671.2     755.3
17
Statistics for four compilers: Variability vs. problem size
18
Evaluation of SSE Results
IA-32 Hardware
  Intel Pentium III+ supports Streaming Single-Instruction-Multiple-Data Extensions (SSE)
  Linux 2.4.20 kernel supports SSE
Fortran compilers that enable SSE
  Intel 7.1
  Portland CDK 4.0
19
Comparing Execution Times: SOM (POM) compiler switches (with SSE)
Compiler and version   Compiler command and selected switches
Intel 7.1              ifc -O3 -xK (-r8) -tpp6 -FI
Portland 4.0           pgf90 -fast (-r8) -Mvect=sse
20
Comparing Execution Times: SOM (with SSE)
21
Comparing Execution Times: POM (with SSE)
22
Evaluation of SSE Results
Fortran compilers with SOM (sp)
  Intel 7.1: average speedup of 1.44
  Portland CDK 4.0: average speedup of 1.70
Fortran compilers with POM (dp)
  Intel 7.1: average speedup of 1.25
  Portland CDK 4.0: average speedup of 1.19
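The average speedups above are presumably the execution time without SSE divided by the time with SSE, averaged over the test cases; the transcript does not state the exact averaging method, and the with-SSE timings are not reproduced in it. The sketch below shows that calculation under the assumption of an arithmetic mean of per-case ratios. The function name `average_speedup` and the input numbers are hypothetical placeholders, not measurements from the talk.

```python
def average_speedup(seconds_without_sse, seconds_with_sse):
    """Arithmetic mean of per-case speedups (time without SSE / time with SSE)."""
    if len(seconds_without_sse) != len(seconds_with_sse):
        raise ValueError("both timing lists must cover the same cases")
    speedups = [
        without / with_sse
        for without, with_sse in zip(seconds_without_sse, seconds_with_sse)
    ]
    return sum(speedups) / len(speedups)


# Hypothetical placeholder timings in seconds, NOT data from the presentation.
without_sse = [100.0, 200.0]
with_sse = [80.0, 125.0]
result = average_speedup(without_sse, with_sse)
print(f"average speedup: {result:.3f}")
```

A value above 1 means the SSE build ran faster on average; other conventions, such as the ratio of total times, would give a different number for the same data.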
23
Tests for AQMs
Next steps for CMAQ with four compilers:
  Report on portability issues
  Re-compilation of all libraries
  Performance instrumentation & analysis
  Numerical & stability analysis
  OpenMP performance study
Please propose scenarios worth using for these tests!
24
Conclusions
Hardware: COTS is the way to go but …….
Linux: Operating System is popular but …..
Programming Environment: rich in choices
Consequences for AQM: the combination of hardware, Linux, and programming environment needs careful ongoing evaluation.
HiPERiSM is ready for this task!
25
HiPERiSM’s URL
http://www.hiperism.com
Talk to us about your requirements