1
Varying Memory Size with TPC-C: Performance and Resource Effects
Jay Veazey and Blaine Gaither
Hewlett-Packard
Jay.Veazey@hp.com  Blaine.Gaither@hp.com
CAECW 2008 -- Salt Lake City
2
Motivation --- why is this interesting?
More memory increases performance
→ How much?
→ Why exactly?
→ Reveal and quantify the underlying causes
Focus is R&D tradeoffs
→ Performance, cost, schedule, power
→ How much memory to design into a commercial server?
→ Is memory latency more important than memory size?
3
Experimental Design
Vary memory from 32 to 192 GBytes
→ Measure throughput
→ Measure resource utilization – CPU, disk I/O, memory bandwidth, CPI, OS context switches
HP Integrity rx6600
→ Itanium 2 9050 CPUs (2 sockets / 4 cores)
→ About 750 disk drives
TPC-C workload
→ Resource intensive
→ Standard, “coin of the realm”…easy to communicate
→ Unofficial results
4
Throughput
Increase of 48% in throughput as memory grows from 32 to 192 GBytes
5
Resource Utilization
I/O reduction accounts for 20% of the 48% throughput improvement. Where’s the rest of it?
Disk I/O and CPU utilization

GB Mem   Thruput    CPU Util.   IOs / sec   Relative thruput   Approx. % insts. devoted to I/O
32       149,934    99.7%       71,068      1.00               31%
64       173,017    99.0%       58,907      1.15               24%
96       184,716    99.7%       50,574      1.23               20%
128      196,521    99.5%       44,397      1.31               17%
192      221,289    99.9%       29,422      1.48               11%
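The derived column can be recomputed directly from the measured numbers; below is a minimal sketch in Python using only the figures from the table above. The "IOs/sec per unit of work" quantity it prints is an added illustration, not a metric from the original slide.

```python
# Measured data from the table above: throughput and disk IOs per second
# at each memory size.  The quantities computed below (relative throughput,
# IOs per unit of relative throughput) are derived for illustration.
data = {  # memory (GB): (throughput, IOs per second)
    32:  (149_934, 71_068),
    64:  (173_017, 58_907),
    96:  (184_716, 50_574),
    128: (196_521, 44_397),
    192: (221_289, 29_422),
}

base_tput, _ = data[32]
for gb, (tput, ios) in data.items():
    rel = tput / base_tput        # matches the "Relative thruput" column
    ios_per_work = ios / rel      # I/O rate normalized to equal work done
    print(f"{gb:>3} GB  rel. thruput {rel:4.2f}  IOs/sec per unit of work {ios_per_work:8.0f}")

# From 32 GB to 192 GB the normalized I/O rate falls from ~71,000 to ~19,900,
# i.e. roughly 3.6x fewer disk I/Os for the same amount of work.
```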
6
CPI and Memory
As memory is added, CPU cycles are used more efficiently: CPI (cycles per retired instruction) falls. With CPU utilization already pinned near 100%, a lower CPI means more instructions, and therefore more transactions, complete each second.
But this is an effect, not a cause --- why does CPI fall?
7
CPI and Memory Bandwidth
CPI can change for many reasons, most of them irrelevant here
Memory accesses are the relevant ones – when a load misses in the caches, the resulting delay counts toward CPI
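To make the connection concrete, here is a minimal sketch (Python) of the standard memory-stall CPI model, CPI = base CPI + Σ (misses per instruction × miss penalty). The base CPI, penalty cycle counts, and miss rates below are hypothetical placeholders chosen for illustration; they are not measurements from this study.

```python
# Classic memory-stall CPI model: each miss serviced by a lower level of the
# hierarchy adds its penalty (in cycles) to the average cycles per instruction.
# ALL numbers below are hypothetical placeholders for illustration; they are
# not measurements from the rx6600 experiments on these slides.

def cpi(cpi_base, misses_per_insn, penalties):
    """cpi_base: CPI with a perfect memory hierarchy.
    misses_per_insn, penalties: dicts keyed by the level that services the miss."""
    return cpi_base + sum(misses_per_insn[lvl] * penalties[lvl] for lvl in penalties)

penalties = {"L2": 10, "L3": 50, "mem": 300}   # assumed penalty cycles per miss

# Assumed misses per instruction before and after adding memory: with more
# memory the working sets stay resident longer, so miss rates drop.
small_memory = {"L2": 0.020, "L3": 0.006, "mem": 0.002}
large_memory = {"L2": 0.015, "L3": 0.004, "mem": 0.001}

print(cpi(1.0, small_memory, penalties))   # 1.0 + 0.20 + 0.30 + 0.60 = 2.10
print(cpi(1.0, large_memory, penalties))   # 1.0 + 0.15 + 0.20 + 0.30 = 1.65
```

Under these assumed numbers, the drop in misses per instruction alone removes almost half a cycle from every instruction, which is the mechanism the deck points to for the falling CPI.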
8
Caches Stabilize with Increasing Memory
Units normalized for throughput – accesses (or misses) / sec / CPU / tpmC
L1 accesses imply that the registers also stabilize

Memory (GB)   L1 accesses   L1 misses   L2 misses   L3 misses
32            6901          1549        183         22
64            6219          1377        155         19
96            5943          1297        139         17
128           5683          1232        127         16
192           5122          1095        109         14
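A small companion sketch (Python), again using only the numbers from the table above, quantifying how far each normalized rate falls between the 32 GB and 192 GB configurations; the percentages it prints are derived ratios, not figures from the original presentation.

```python
# Normalized event rates (accesses or misses / sec / CPU / tpmC) copied from
# the table above for the smallest and largest memory configurations.
at_32gb  = {"L1 accesses": 6901, "L1 misses": 1549, "L2 misses": 183, "L3 misses": 22}
at_192gb = {"L1 accesses": 5122, "L1 misses": 1095, "L2 misses": 109, "L3 misses": 14}

for event, small in at_32gb.items():
    large = at_192gb[event]
    drop = 100 * (1 - large / small)
    print(f"{event:12s}: {small:5d} -> {large:5d}  ({drop:.0f}% lower per unit of work)")

# Roughly: L1 accesses fall ~26%, L1 misses ~29%, L2 misses ~40%, L3 misses ~36%
# once throughput is normalized out -- the cache hierarchy "stabilizes".
```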
9
OS Thread Switches and Memory
Reduced thread switching is probably the cause of the register / cache stabilization --- working sets stay around longer
10
Summary and Conclusions
Adding memory increases performance significantly
I/O is reduced, as is the I/O instruction pathlength
Context switches are reduced as a result of less I/O
– Fewer memory accesses
– Lower CPI
– More stable caches and registers