Download presentation
Presentation is loading. Please wait.
Published byFrank Watts Modified over 9 years ago
1
ECE/CS 752: Midterm 2 Review Instructor:Mikko H Lipasti Spring 2012 University of Wisconsin-Madison
2
Midterm 2 Review Topics Replacement policies (Lect 11) Advanced Caches (Lect 14) Prefetching (Lect 15) Main Memory (Lect 16) Advanced Microarchitecture (Lect 17) Multiple Threads (Lect 18)
3
3 Replacement Policies Replacement policies affect capacity and conflict misses Policies covered: Belady’s optimal replacement Least-recently used (LRU) Practical pseudo-LRU (tree LRU) Protected LRU LIP/DIP variant Set dueling to dynamically select policy Not-recently-used (NRU) or clock algorithm RRIP (re-reference interval prediction) Least frequently used (LFU) Contest results
4
Advanced Caches Coherent Memory Interface Evaluation methods Better miss rate: skewed associative caches, victim caches Reducing miss costs through software restructuring Higher bandwidth: Lock-up free caches, superscalar caches Beyond simple blocks Two level caches
5
Prefetching Prefetching anticipates future memory references –Software prefetching –Next-block, stride prefetching –Markov, Global history buffer prefetching Issues/challenges –Accuracy –Timeliness –Overhead (bandwidth) –Conflicts (displace useful data)
6
Main Memory DRAM chips Memory organization –Interleaving –Banking Memory controller design Hybrid Memory Cube Phase Change Memory (reading) Virtual memory TLBs Interaction of caches and virtual memory (Baer et al.)
7
Adv. Memory - Readings Read on your own: –Review: Shen & Lipasti Chapter 3 –W.-H. Wang, J.-L. Baer, and H. M. Levy. Organization of a two-level virtual-real cache hierarchy, Proc. 16th ISCA, pp. 140-148, June 1989 (B6) Online PDFOnline PDF –D. Kroft. Lockup-Free Instruction Fetch/Prefetch Cache Organization, Proc. International Symposium on Computer Architecture, May 1981 (B6). Online PDFOnline PDF –N.P. Jouppi. Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers, Proc. International Symposium on Computer Architecture, June 1990 (B6). Online PDFOnline PDF Discuss in class: –Review due 3/24/2010: Benjamin C. Lee, Ping Zhou, Jun Yang, Youtao Zhang, Bo Zhao, Engin Ipek, Onur Mutlu, Doug Burger, "Phase-Change Technology and the Future of Main Memory," IEEE Micro, vol. 30, no. 1, pp. 143-143, Jan./Feb. 2010 –Read Sec. 1, skim Sec. 2, read Sec. 3: Bruce Jacob, “The Memory System: You Can't Avoid It, You Can't Ignore It, You Can't Fake It,” Synthesis Lectures on Computer Architecture 2009 4:1, 1-77. 7
8
Adv Microarchitecture Instruction scheduling overview –Scheduling atomicity –Speculative scheduling –Scheduling recovery Complexity-effective instruction scheduling techniques –CRIB reading Scalable load/store handling –NoSQ reading Building large instruction windows –Runahead, CFP, iCFP Control Independence 3D die stacking
9
Adv Microarch -- Readings Read on your own: – Shen & Lipasti Chapter 10 on Advanced Register Data Flow – skim – I. Kim and M. Lipasti, “Understanding Scheduling Replay Schemes,” in Proceedings of the 10th International Symposium on High-performance Computer Architecture (HPCA-10), February 2004. – Srikanth Srinivasan, Ravi Rajwar, Haitham Akkary, Amit Gandhi, and Mike Upton, “Continual Flow Pipelines”, in Proceedings of ASPLOS 2004, October 2004. – Ahmed S. Al-Zawawi, Vimal K. Reddy, Eric Rotenberg, Haitham H. Akkary, “Transparent Control Independence,” in Proceedings of ISCA-34, 2007. To be discussed in class: – T. Shaw, M. Martin, A. Roth, “NoSQ: Store-Load Communication without a Store Queue, ” in Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, 2006. – Erika Gunadi, Mikko Lipasti: CRIB: Combined Rename, Issue, and Bypass, ISCA 2011. – Andrew Hilton, Amir Roth, "BOLT: Energy-efficient Out-of-Order Latency-Tolerant execution," Proceedings of HPCA 2010. – Loh, G. H., Xie, Y., and Black, B. 2007. Processor Design in 3D Die-Stacking Technologies. IEEE Micro 27, 3 (May. 2007), 31-48.
10
Multiple Threads - Topics Thread-level parallelism Synchronization Multiprocessors Explicit multithreading Implicit multithreading: Multiscalar Niagara case study
11
Multiple Threads - Readings Read on your own: –Shen & Lipasti Chapter 11 –G. S. Sohi, S. E. Breach and T.N. Vijaykumar. Multiscalar Processors, Proc. 22nd Annual International Symposium on Computer Architecture, June 1995. –Dean M. Tullsen, Susan J. Eggers, Joel S. Emer, Henry M. Levy, Jack L. Lo, and Rebecca L. Stamm. Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor, Proc. 23rd Annual International Symposium on Computer Architecture, May 1996 (B5) To be discussed in class: –Poonacha Kongetira, Kathirgamar Aingaran, Kunle Olukotun, Niagara: A 32-Way Multithreaded Sparc Processor, IEEE Micro, March-April 2005, pp. 21-29.
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.