Published by Logan Roberts. Modified over 9 years ago.
1
How fast are fast computers? Xing Cai October 26, 1998
2
Overview
Modern fast computers at a glance
Fast computers & scientific computing
A closer look at SC performance
Current situation & future trends
Concluding remarks
3
An indirect answer
The slowest fast computer is faster than the fastest slow computer.
4
Performance ranking of the world’s 500 most powerful computers
LINPACK benchmark (floating-point intensive)
J. Dongarra, H. Meuer, E. Strohmaier
Report every 6 months since June 1993
A good correction of peak performance
http://www.top500.org
KFlops, MFlops, GFlops, TFlops
5
http://www.top500.org 6/98
6
ASCI Red TFLOPS
85 cabinets, 9216 Intel Pentium Pro processors
http://www.sandia.gov/ASCI/Red/main.html
7
Some “high-end” computers
SGI Cray T3E 1200
SGI Cray Origin 2000
Fujitsu VPP 700
NEC SX-4
IBM RS/6000 SP
8
Vendor overview
http://www.netlib.org/utk/people/JackDongarra/top500-698/
9
Vendor overview
http://www.netlib.org/utk/people/JackDongarra/top500-698/
10
Scientific computing 50 years
ENIAC - world’s 1st electronic computer for scientific computing
11
Advance in hardware
Rapid advance of microprocessor technology
World’s most powerful computer
  – ENIAC: 330 Flops, 1946
  – Digital Alpha-21164 processor: 1.2 GFlops, 1997
World’s most powerful computing site
  – ONR: 583.73 KFlops, 1956
  – NSA: 4,088.76 GFlops, 1998-Oct-14
http://www.cnct.com/~gunter
“If car industry had made equal progress, you could buy a car for a few $, drive across US in a few minutes, and park it in your pocket!”
12
Scientific computing today
http://www.psc.edu/science/projects.html
Earth & environment
DNA modelling & medical research
13
Grand challenge
“Fundamental problem in science or engineering, with potentially broad economic, political and/or scientific impact, that could be advanced by applying high performance computing resources.”
Keyword: simulation
14
Numerical simulation
Diagram: physical phenomenon, mathematical model, algorithm, software, hardware
3rd paradigm of science!
15
Advance in numerics
Solution of Poisson’s equation
Linear system with sparse matrices
For “standard” size n = 10^6 (100x100x100)
  – Multigrid: 14.42 seconds, 56 MBytes
  – Banded LU: 232.96 days, 160 GBytes
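The 160 GBytes figure for banded LU can be reproduced with a back-of-the-envelope estimate. The sketch below is not from the slides: it assumes a 7-point stencil with natural ordering (half bandwidth b = m^2 on an m x m x m grid), 8-byte doubles, factors stored across a band of width 2b, and the textbook operation count of roughly 2*n*b^2 for banded LU without pivoting. The function names are hypothetical.

```python
# Rough cost model for banded LU on the 3D Poisson problem.
# Assumptions (not stated on the slide): 7-point stencil, natural ordering,
# 8-byte doubles, band of width 2*b stored for the factors.

def banded_lu_storage_bytes(m: int, bytes_per_value: int = 8) -> int:
    """Storage for the LU factors of an m x m x m grid problem."""
    n = m ** 3   # number of unknowns
    b = m ** 2   # half bandwidth under natural ordering
    return n * 2 * b * bytes_per_value


def banded_lu_flops(m: int) -> int:
    """Approximate operation count 2 * n * b^2 for banded LU."""
    n = m ** 3
    b = m ** 2
    return 2 * n * b * b


if __name__ == "__main__":
    print(banded_lu_storage_bytes(100) / 1e9, "GBytes")
    print(banded_lu_flops(100), "floating-point operations")
```

Under these assumptions the storage comes out at 160 GBytes for m = 100, matching the slide. The operation count of 2 x 10^14 would take on the order of 230 days at a sustained 10 MFlops; that rate is a guess, since the slide does not say which machine the timings refer to.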
16
How fast (and big) should fast computers be?
Global weather prediction
Navier-Stokes on 3D grid for the earth
100 m cells, 100 levels - 5x10^12 cells
5 variables per cell - 200 TBytes
100 Flops/cell/minute
Required performance: 8 TFlops
There is never enough computing power?
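The weather-prediction numbers follow from simple arithmetic, sketched below. Two inputs are assumptions rather than slide content: an Earth surface area of about 5.1 x 10^14 m^2 and 8 bytes per stored value (double precision). All constant names are hypothetical.

```python
# Back-of-the-envelope check of the global weather prediction estimate.
EARTH_SURFACE_M2 = 5.1e14        # assumption: approximate surface area of the Earth
CELL_SIDE_M = 100.0              # from the slide: 100 m cells
LEVELS = 100                     # from the slide: 100 vertical levels
VARIABLES_PER_CELL = 5           # from the slide
BYTES_PER_VALUE = 8              # assumption: double precision
FLOPS_PER_CELL_PER_MINUTE = 100  # from the slide

# Number of cells implied by the grid resolution (about 5.1e12).
cells_estimate = EARTH_SURFACE_M2 / CELL_SIDE_M**2 * LEVELS

# The slide rounds this to 5e12 cells; use that figure for the rest.
cells = 5e12
memory_tbytes = cells * VARIABLES_PER_CELL * BYTES_PER_VALUE / 1e12
required_tflops = cells * FLOPS_PER_CELL_PER_MINUTE / 60 / 1e12

if __name__ == "__main__":
    print(f"cells (estimated): {cells_estimate:.2e}")
    print(f"memory: {memory_tbytes:.0f} TBytes")
    print(f"required performance: {required_tflops:.1f} TFlops")
```

With the rounded cell count this gives 200 TBytes of memory and a little over 8 TFlops, in line with the slide.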
17
Electrical potential depolarization in human heart
Grid node spacing 2 nodes/mm
Estimated 3D grid - 4,200,000 nodes
Estimated CPU time - one processor
  – CPU time per node: 3.3 seconds
  – total: 4,200,000 x 3.3 seconds = 160 days
Elapsed physical time: 300 ms
http://www.ifi.uio.no/~xingca/HEART/
We need parallel computing
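The 160-day estimate is a unit conversion from seconds to days. The sketch below reproduces it and adds a hypothetical best case: the run time on P processors under ideal linear speedup. That assumption is not from the slide, and real codes fall short of it because of communication and load imbalance.

```python
SECONDS_PER_DAY = 86_400


def serial_days(nodes: int, cpu_seconds_per_node: float) -> float:
    """Total single-processor CPU time, expressed in days."""
    return nodes * cpu_seconds_per_node / SECONDS_PER_DAY


def ideal_parallel_days(nodes: int, cpu_seconds_per_node: float,
                        processors: int) -> float:
    """Run time assuming perfect linear speedup (an idealization)."""
    return serial_days(nodes, cpu_seconds_per_node) / processors


if __name__ == "__main__":
    total = serial_days(4_200_000, 3.3)
    print(f"one processor: {total:.1f} days")
    print(f"128 processors (ideal): "
          f"{ideal_parallel_days(4_200_000, 3.3, 128):.2f} days")
```

The processor count of 128 is only an illustration; the slide does not name a target machine for the heart simulation.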
18
Parallel computing
We are approaching the limit of single microprocessor performance
We want to run larger simulations
We want shorter simulation time
More cost-effective computing
19
Oil reservoir simulation
Simulation of 1000 days of gas injection
Single-processor workstation simulation
  – one day for 80,000 unknowns
  – 10 days for 800,000 unknowns
  – 200 days for 32,000,000 unknowns (impossible)
Efficient parallel computing
  – 128-processor IBM SP
  – 23 minutes for 32,000,000 unknowns (PETSc)
Importance of efficient parallel computing!
http://www.mcs.anl.gov/petsc/petsc.html
20
Main question
The actual performance of real-life SC applications is well below peak performance. Why?
21
LINPACK benchmark revisited
Direct solution of dense matrix systems
Limited application in SC
Simple data structure
Close to an artificial test problem
Only a more realistic upper bound on achievable performance - 20% of reported performance can be expected
22
Characteristics of SC
Data intensive computing
  – 1 GFlops needs memory bandwidth 24 GB/s (example: DAXPY)
  – Memory hierarchy
Complex data structure
  – Sparse matrices
  – Structured grid vs unstructured grid
  – Adaptive grid refinement
Communication & synchronization
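The DAXPY example illustrates why scientific computing is bound by memory bandwidth. The sketch below is a plain-Python stand-in for the BLAS routine, not the library call itself. It counts memory traffic under two assumptions that the slide does not state: 8-byte doubles, and no cache reuse, so every element update loads x[i], loads y[i], and stores y[i].

```python
def daxpy(a: float, x: list, y: list) -> list:
    """Compute y <- a*x + y element by element, updating y in place."""
    for i in range(len(x)):
        y[i] = a * x[i] + y[i]   # one multiply and one add per element
    return y


BYTES_PER_DOUBLE = 8                      # assumption: double precision
# Memory traffic per element update: load x[i], load y[i], store y[i].
bytes_per_update = 3 * BYTES_PER_DOUBLE   # 24 bytes

# If the processor completes one element update per nanosecond,
# memory must deliver 24 bytes per nanosecond.
updates_per_second = 1e9
bandwidth_gb_s = bytes_per_update * updates_per_second / 1e9

if __name__ == "__main__":
    print(daxpy(2.0, [1.0, 2.0], [10.0, 20.0]))
    print(f"required bandwidth: {bandwidth_gb_s:.0f} GB/s")
```

One reading of the slide's 24 GB/s figure is that each multiply-add counts as a single operation. Counting the multiply and the add separately would give 12 bytes per floating-point operation instead.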
23
Multigrid method
Well suited to large sparse systems
  – asymptotically optimal operation count
  – fewer than 100 floating-point operations per unknown
Complex data structure
Relatively low performance
Stals & Rüde - Techniques for improving the data locality of iterative methods
24
Architecture bottleneck
Imbalance between processor speed and memory access speed
  – Processor speed annual increase >= 60%
  – Memory access speed annual increase 5%-10%
Inter-processor communication latency & bandwidth
Memory size
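Compounding shows how quickly the two growth rates drift apart. The sketch below takes the slide's annual rates at face value and applies them over a 10-year horizon; the horizon is an arbitrary choice for illustration, not a figure from the talk.

```python
def compound_growth(annual_rate: float, years: int) -> float:
    """Overall growth factor after compounding annual_rate for years."""
    return (1.0 + annual_rate) ** years


YEARS = 10  # illustrative horizon, not from the slide

cpu_factor = compound_growth(0.60, YEARS)          # processor speed
mem_factor_fast = compound_growth(0.10, YEARS)     # memory, optimistic rate
mem_factor_slow = compound_growth(0.05, YEARS)     # memory, pessimistic rate

# How much further the processor pulls ahead of memory over the horizon.
gap_best_case = cpu_factor / mem_factor_fast
gap_worst_case = cpu_factor / mem_factor_slow

if __name__ == "__main__":
    print(f"processor speed grows {cpu_factor:.0f}x")
    print(f"memory speed grows {mem_factor_slow:.1f}x to {mem_factor_fast:.1f}x")
    print(f"gap widens {gap_best_case:.0f}x to {gap_worst_case:.0f}x")
```

Even with the more optimistic memory rate, the processor-to-memory gap widens by a factor of more than 40 in a decade, which is why data locality matters so much for the numerical software discussed in the talk.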
25
SC software today
Inefficient (not very cache-aware)
Not very portable
Not very easy to maintain
Not very user-friendly
Hard to program real-life applications
Limited compiler parallelism
  – Hard to program parallel codes
26
O-O numerical software
Better representation of mathematics
Manpower effective
Stable code, easy maintenance
Good flexibility & extensibility
Structured & efficient parallelization
Need care for efficiency
Standard is not settled yet
27
Trend in architecture
http://www.netlib.org/utk/people/JackDongarra/top500-698/
28
Trend in CPU technology
http://www.netlib.org/utk/people/JackDongarra/top500-698/
29
Future trends
Progress of semiconductor technology
  – over 10^9 transistors per chip in future
  – increased on-chip parallelism
Architecture changes are needed
Impact on scientific computing
  – Rüde: Technological trends and their impact on the future of supercomputers
Different levels of parallelism
30
Metacomputing
Demand for enormous computing power
  – US Air Force battle simulation (8 US supercomputing centers)
  – Unicore project (link supercomputers in Germany and US)
Better utilization of idle computing power
“Seamless web” - heterogeneous computing
Need a balanced system connected by high-speed networks
Need a scalable distributed operating system
31
Supercomputers in future
ASCI Option White - IBM 10 TFlops
100 TFlops computers in near future
Petaflops (10^15)
  – 10,000-1,000,000 processors
  – feasible and “affordable” in 2010?
32
Some observations
HPSC is a small but exciting field
Supercomputers adopt commodity technology
Affordable parallel systems available
  – SMP, distributed shared memory
  – cluster of shared memory machines
  – parallel computing standard appearing
Scientific software industry is still in its early stage
33
Challenges for SC
Numerics
  – faster algorithms
  – good data locality
  – low communication requirement
Software
  – efficient (performance, manpower)
  – high-level problem solving environment
Hardware
  – changes of architecture
34
Some citations
‘Intentions of the scientific users strongly differ from the industrial users.’ Ulrich Trottenberg, GMD
‘There’s a future for high-performance parallel computing out there.’ Tony Hey, Univ. Southampton
‘Allow datastructures and algorithms to guide us to the appropriate architecture.’ John Vrolyk, SGI senior vice president
35
The whole picture
We are in the same boat...
Diagram: supercomputer vendor, scientific computing, industry, government, general public?
36
Concluding remarks
Huge potential of scientific computing
More real-life applications to come
Growing demand for computing power
Scientific computing needs advances in
  – numerical algorithms
  – software technology
  – hardware
37
Quiz
What was the world’s fastest computer on June 2nd, 1998?
‘It was a HP notebook used on Space shuttle “Discovery” to compute orbital position. The speed was 17,500 mph.’ Jack Dongarra