01/26/2009CS267 - Lecture 2 1 Experimental Study of Memory (Membench)‏ Microbenchmark for memory system performance time the following loop (repeat many.

Slides:



Advertisements
Similar presentations
CS492B Analysis of Concurrent Programs Memory Hierarchy Jaehyuk Huh Computer Science, KAIST Part of slides are based on CS:App from CMU.
Advertisements

Lecture 19: Cache Basics Today’s topics: Out-of-order execution
1 Lecture 13: Cache and Virtual Memroy Review Cache optimization approaches, cache miss classification, Adapted from UCB CS252 S01.
1 CMPT 300 Introduction to Operating Systems Virtual Memory Sample Questions.
1 Parallel Scientific Computing: Algorithms and Tools Lecture #2 APMA 2821A, Spring 2008 Instructors: George Em Karniadakis Leopold Grinberg.
Components of a Computer
Practical Caches COMP25212 cache 3. Learning Objectives To understand: –Additional Control Bits in Cache Lines –Cache Line Size Tradeoffs –Separate I&D.
Quiz 4 Solution. n Frequency = 2.5GHz, CLK = 0.4ns n CPI = 0.4, 30% loads and stores, n L1 hit =0, n L1-ICACHE : 2% miss rate, 32-byte blocks n L1-DCACHE.
Caches Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University See P&H 5.1, 5.2 (except writes)
University of Amsterdam Computer Systems – cache characteristics Arnoud Visser 1 Computer Systems Cache characteristics.
Using one level of Cache:
08/29/2002CS267 Lecure 21 High Performance Programming on a Single Processor: Memory Hierarchies Katherine Yelick
CS252 Graduate Computer Architecture Lecture 22 Caching Optimizations April 19 th, 2010 John Kubiatowicz Electrical Engineering and Computer Sciences University.
Memory System Performance October 29, 1998 Topics Impact of cache parameters Impact of memory reference patterns –matrix multiply –transpose –memory mountain.
CS61C L22 Caches III (1) A Carle, Summer 2006 © UCB inst.eecs.berkeley.edu/~cs61c/su06 CS61C : Machine Structures Lecture #22: Caches Andy.
CS 61C L23 VM I (1) A Carle, Summer 2006 © UCB inst.eecs.berkeley.edu/~cs61c/su06 CS61C : Machine Structures Lecture #23: VM I Andy Carle.
CS 61C L35 Caches IV / VM I (1) Garcia, Fall 2004 © UCB Andy Carle inst.eecs.berkeley.edu/~cs61c-ta inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures.
Virtual Memory. Why do we need VM? Program address space: 0 – 2^32 bytes –4GB of space Physical memory available –256MB or so Multiprogramming systems.
©UCB CS 162 Ch 7: Virtual Memory LECTURE 13 Instructor: L.N. Bhuyan
1 Lecture 7: Caching in Row-Buffer of DRAM Adapted from “A Permutation-based Page Interleaving Scheme: To Reduce Row-buffer Conflicts and Exploit Data.
Virtual Memory Topics Virtual Memory Access Page Table, TLB Programming for locality Memory Mountain Revisited.
CS61C L32 Caches III (1) Garcia, Fall 2006 © UCB Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C.
©UCB CS 161 Ch 7: Memory Hierarchy LECTURE 24 Instructor: L.N. Bhuyan
Caches – basic idea Small, fast memory Stores frequently-accessed blocks of memory. When it fills up, discard some blocks and replace them with others.
1 Copyright © 2011, Elsevier Inc. All rights Reserved. Appendix B Authors: John Hennessy & David Patterson.
Systems I Locality and Caching
ECE Dept., University of Toronto
1 Minimizing Communication in Numerical Linear Algebra Introduction, Technological Trends Jim Demmel EECS & Math Departments,
Caches – basic idea Small, fast memory Stores frequently-accessed blocks of memory. When it fills up, discard some blocks and replace them with others.
CMP 301A Computer Architecture 1 Lecture 3. Outline zQuick summary zMultilevel cache zVirtual memory y Motivation and Terminology y Page Table y Translation.
1 Lecture 2 Single Processor Machines: Memory Hierarchies and Processor Features UCSB CS240A, Winter 2013 Modified from Demmel/Yelick’s slides.
Lecture 19: Virtual Memory
The Memory Hierarchy 21/05/2009Lecture 32_CA&O_Engr Umbreen Sabir.
Memory and cache CPU Memory I/O. CEG 320/52010: Memory and cache2 The Memory Hierarchy Registers Primary cache Secondary cache Main memory Magnetic disk.
Computer Architecture and Operating Systems CS 3230: Operating System Section Lecture OS-8 Memory Management (2) Department of Computer Science and Software.
CS194 Lecture 31 Memory Hierarchies and Optimizations Kathy Yelick
1 CENG 450 Computer Systems and Architecture Cache Review Amirali Baniasadi
CSE378 Intro to caches1 Memory Hierarchy Memory: hierarchy of components of various speeds and capacities Hierarchy driven by cost and performance In early.
Chapter 91 Logical Address in Paging  Page size always chosen as a power of 2.  Example: if 16 bit addresses are used and page size = 1K, we need 10.
Chapter 5 Memory III CSE 820. Michigan State University Computer Science and Engineering Miss Rate Reduction (cont’d)
Multilevel Caches Microprocessors are getting faster and including a small high speed cache on the same chip.
Advanced Topics: Prefetching ECE 454 Computer Systems Programming Topics: UG Machine Architecture Memory Hierarchy of Multi-Core Architecture Software.
CPE 779 Parallel Computing - Spring Memory Hierarchies & Optimizations Case Study: Tuning Matrix Multiply Based on slides by: Kathy Yelick
Memory Management memory hierarchy programs exhibit locality of reference - non-uniform reference patterns temporal locality - a program that references.
CS 162 Discussion Section Week 6. Administrivia Project 2 Deadlines – Initial Design Due: 3/1 – Review Due: 3/5 – Code Due: 3/15.
1 Lecture 20: OOO, Memory Hierarchy Today’s topics:  Out-of-order execution  Cache basics.
COSC3330 Computer Architecture
CS2100 Computer Organization
Caches in Systems Feb 2013 COMP25212 Cache 4.
Copyright © 2011, Elsevier Inc. All rights Reserved.
Architecture Background
Lecture 6 Memory Hierarchy
CS 105 Tour of the Black Holes of Computing
ECE 445 – Computer Organization
Lecture 21: Memory Hierarchy
Cache Memories September 30, 2008
Chapter 8 Digital Design and Computer Architecture: ARM® Edition
Memory Hierarchies.
Systems Architecture II
ECE Dept., University of Toronto
Lecture 22: Cache Hierarchies, Memory
Memory Hierarchy Memory: hierarchy of components of various speeds and capacities Hierarchy driven by cost and performance In early days Primary memory.
If a DRAM has 512 rows and its refresh time is 9ms, what should be the frequency of row refresh operation on the average?
Computer System Design Lecture 9
Lecture 21: Memory Hierarchy
Memory Hierarchy Memory: hierarchy of components of various speeds and capacities Hierarchy driven by cost and performance In early days Primary memory.
Cache Memory Rabi Mahapatra
Cache Memory and Performance
Overview Problem Solution CPU vs Memory performance imbalance
Memory Management & Virtual Memory
Presentation transcript:

01/26/2009CS267 - Lecture 2 1 Experimental Study of Memory (Membench)‏ Microbenchmark for memory system performance time the following loop (repeat many times and average)‏ for i from 0 to L load A[i] from memory (4 Bytes)‏ for array A of length L from 4KB to 8MB by 2x for stride s from 4 Bytes (1 word) to L/2 by 2x time the following loop (repeat many times and average)‏ for i from 0 to L by s load A[i] from memory (4 Bytes)‏ s 1 experiment

01/26/2009CS267 - Lecture 2 2 Membench: What to Expect Consider the average cost per load Plot one line for each array length, time vs. stride Small stride is best: if cache line holds 4 words, at most ¼ miss If array is smaller than a given cache, all those accesses will hit (after the first run, which is negligible for large enough runs)‏ Picture assumes only one level of cache Values have gotten more difficult to measure on modern procs s = stride average cost per access total size < L1 cache hit time memory time size > L1

01/26/2009CS267 - Lecture 2 3 Memory Hierarchy on a Sun Ultra-2i L1: 16 KB 2 cycles (6ns)‏ Sun Ultra-2i, 333 MHz L2: 64 byte line See for details L2: 2 MB, 12 cycles (36 ns)‏ Mem: 396 ns (132 cycles)‏ 8 K pages, 32 TLB entries L1: 16 B line Array length

01/26/2009CS267 - Lecture 2 4 Memory Hierarchy on a Pentium III L1: 32 byte line ? L2: 512 KB 60 ns L1: 64K 5 ns, 4-way? Katmai processor on Millennium, 550 MHz Array size

01/26/2009CS267 - Lecture 2 5 Memory Hierarchy on a Power3 (Seaborg)‏ Power3, 375 MHz L2: 8 MB 128 B line 9 cycles L1: 32 KB 128B line.5-2 cycles Array size Mem: 396 ns (132 cycles)‏

01/26/2009CS267 - Lecture 2 6 Memory Performance on Itanium 2 (CITRIS)‏ Itanium2, 900 MHz L2: 256 KB 128 B line.5-4 cycles L1: 32 KB 64B line.34-1 cycles Array size Mem: cycles L3: 2 MB 128 B line 3-20 cycles Uses MAPS Benchmark: