CS 230: Computer Organization and Assembly Language
Aviral Shrivastava, Department of Computer Science and Engineering, School of Computing and Informatics


CS 230: Computer Organization and Assembly Language
Aviral Shrivastava
Department of Computer Science and Engineering, School of Computing and Informatics, Arizona State University
Slides courtesy: Prof. Yann-Hang Lee, ASU; Prof. Mary Jane Irwin, PSU; Andy Carle, UCB

Announcements
Alternate Project
– Due today
Real examples
Finals
– Tuesday, Dec 08, 2009
– Please come on time (you'll need all the time)
– Open book, notes, and internet
– No communication with any other human

Time, Time, Time
Making a single-cycle implementation is very easy
– The difficulty, and the excitement, lie in making it fast
Two fundamental methods make computers fast
– Pipelining
– Caches
[Figure: single-cycle datapath with PC, instruction memory, register file, ALU, and data memory]

Effect of High Memory Latency
Single-cycle implementation
– Cycle time becomes very long
– Operations that do not need memory also slow down
[Figure: single-cycle datapath with PC, instruction memory, register file, ALU, and data memory]

Effect of High Memory Latency
Multi-cycle implementation
– A single-cycle memory access would still make the cycle time long
But
– Memory access can itself be made multi-cycle
– Instructions that do not use memory then avoid the penalty
[Figure: multi-cycle datapath with IR, MDR, A, B, and ALUout registers around a shared memory, register file, and ALU]

Effect of High Memory Latency
Pipelined implementation
– A single-cycle memory stage would still make the cycle time long
But
– Memory access can be made multi-cycle
– Instructions that do not use memory avoid the penalty
– Execution of other instructions can overlap with a memory operation
[Figure: pipeline stages IM, Reg, ALU, DM, Reg]

Kinds of Memory
From fastest/smallest to slowest/largest:
– CPU registers (flip-flops): 100s of bytes, <10s of ns
– SRAM: K bytes, ns access, $.00003/bit
– DRAM: M bytes, 50ns-100ns, $.00001/bit
– Disk: G bytes, ms access
– Tape: essentially infinite capacity, sec-min access

Memories
CPU registers, latches
– Flip-flops: very fast, but very small
SRAM (Static RAM)
– Very fast, low power, but small
– Data persists as long as power is supplied
DRAM (Dynamic RAM)
– Very dense
– Like vanishing ink: data disappears with time
– Contents need to be refreshed

Flip-Flops
Fastest form of memory
– Store data using logic gates only (combinational components with feedback)
SR, JK, T, and D flip-flops

SRAM Cell
[Figure: SRAM cell, computer scientist's view (cross-coupled inverters) and electrical engineering view (transistor-level cell with bit lines b and b')]

A 4-bit SRAM Word
[Figure: four SRAM cells on one word line, with write drivers on Din 0-3, a WrEn signal, and precharge logic]

A 16x4 Static RAM (SRAM)
[Figure: 16 words (Word 0 to Word 15) of four SRAM cells each; a 4-bit address A0-A3 feeds an address decoder that selects one word line; write drivers force Din 0-3 onto the bit lines when WrEn is asserted, and sense amps produce Dout 0-3]
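The read/write behavior of this 16x4 array can be sketched as a small behavioral model (a Python sketch, not the hardware itself; the class and method names are my own):

```python
class SRAM16x4:
    """Behavioral model of a 16x4 SRAM: 16 words of 4 bits each."""

    def __init__(self):
        self.words = [0] * 16  # one entry per word line

    def _decode(self, addr):
        # Address decoder: the 4-bit address A3..A0 selects one word line
        assert 0 <= addr < 16, "address must fit in 4 bits"
        return addr

    def write(self, addr, din):
        # WrEn asserted: write drivers force Din0..Din3 onto the bit lines
        self.words[self._decode(addr)] = din & 0xF  # keep only 4 bits

    def read(self, addr):
        # Sense amps recover Dout0..Dout3 from the selected word's cells
        return self.words[self._decode(addr)]


ram = SRAM16x4()
ram.write(5, 0b1010)
print(ram.read(5))  # -> 10
```

This is purely functional: it models the decoder's select behavior and the 4-bit word width, not timing, precharge, or the differential bit lines.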

Dynamic RAM (DRAM)
Value is stored on a capacitor
– The charge leaks away with time
– Needs to be refreshed regularly: a dummy read recharges the capacitor
Very high density
– The newest process technology is often tried on DRAMs first
Intel originally became prominent as the biggest vendor of DRAM

Why Not Only DRAM?
Not large enough for some things
– Backed up by storage (disk)
– Virtual memory, paging, etc.
– We will come back to this
Not fast enough for processor accesses
– Takes hundreds of cycles to return data
– OK in very regular applications, which can use SW pipelining and vectors
– Not OK in most other applications

Is There a Problem with DRAM?
Processor-DRAM memory gap (latency):
– Processor performance grows ~60%/yr (2X/1.5yr, "Moore's Law")
– DRAM performance grows ~9%/yr (2X/10yrs)
– The gap grows ~50% per year
[Figure: performance vs. time, CPU curve diverging from DRAM curve]

Memory Hierarchy Analogy: Library (1/2)
You're writing a term paper (Anthropology) at a table in Hayden
Hayden Library is equivalent to disk
– essentially limitless capacity
– very slow to retrieve a book
The table is memory
– smaller capacity: you must return a book when the table fills up
– easier and faster to find a book there once you've already retrieved it

Memory Hierarchy Analogy: Library (2/2)
Open books on the table are the cache
– smaller capacity: very few open books fit on the table; again, when the table fills up, you must close a book
– much, much faster to retrieve data
Illusion created: the whole library is open on the tabletop
– Keep as many recently used books open on the table as possible, since they are likely to be used again
– Also keep as many books on the table as possible, since that is faster than going to the library

Memory Hierarchy: Goals
Fact: large memories are slow; fast memories are small
How do we create a memory that gives the illusion of being large, cheap, and fast (most of the time)?

Memory Hierarchy: Insights
Temporal locality (locality in time):
=> Keep the most recently accessed data items closer to the processor
Spatial locality (locality in space):
=> Move blocks consisting of contiguous words to the upper levels
[Figure: upper-level memory holding Blk X next to the processor, lower-level memory holding Blk Y]
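To see why moving whole blocks pays off, here is a toy direct-mapped cache model (my own illustration; the sizes are arbitrary). With 4-word blocks, a sequential scan misses only on the first word of each block, so spatial locality alone yields a 75% hit rate, while re-reading one word gives an almost perfect hit rate from temporal locality:

```python
def hit_rate(trace, num_blocks=8, block_words=4):
    """Toy direct-mapped cache: one tag per frame, indexed by block address mod num_blocks."""
    tags = [None] * num_blocks
    hits = 0
    for addr in trace:
        block = addr // block_words   # which block this word belongs to
        index = block % num_blocks    # direct-mapped placement
        if tags[index] == block:
            hits += 1                 # word was brought in along with its block
        else:
            tags[index] = block       # miss: fetch the whole block from the lower level
    return hits / len(trace)


print(hit_rate(range(64)))   # sequential scan -> 0.75 (one miss per 4-word block)
print(hit_rate([0] * 64))    # repeated access -> 0.984375 (only the first access misses)
```

The model ignores timing and write behavior; it only counts how often an access finds its block already in the upper level.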

Memory Hierarchy: Solution
Upper level (faster, smaller) to lower level (larger, slower); each boundary has its own staging/transfer unit and manager:
– CPU registers: 100s of bytes, <10s of ns; unit: instructions/operands, 1-8 bytes, managed by program/compiler
– Cache: K bytes, ns access; unit: blocks, managed by the cache controller (our current focus)
– Main memory: M bytes, 200ns-500ns; unit: pages, 4K-16K bytes, managed by the OS
– Disk: G bytes, 10 ms (10,000,000 ns), 10^-5 to 10^-6 cents/bit; unit: files, Mbytes, managed by user/operator
– Tape: essentially infinite capacity, sec-min access

Memory Hierarchy: Terminology
Hit: data appears in some block in the upper level (Block X)
– Hit rate: fraction of memory accesses found in the upper level
– Hit time: time to access the upper level, which consists of RAM access time + time to determine hit/miss
Miss: data needs to be retrieved from a block in the lower level (Block Y)
– Miss rate = 1 - (hit rate)
– Miss penalty: time to replace a block in the upper level + time to deliver the block to the processor
Hit time << miss penalty
[Figure: upper-level memory holding Blk X next to the processor, lower-level memory holding Blk Y]
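These terms combine into the standard average memory access time formula, AMAT = hit time + miss rate × miss penalty (the formula is standard, though not spelled out on this slide):

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average Memory Access Time, in cycles."""
    return hit_time + miss_rate * miss_penalty


# Illustrative numbers (not from the slides):
# 2-cycle hit time, 10% miss rate, 100-cycle miss penalty
print(amat(2, 0.10, 100))  # -> 12.0
```

Note that every access pays the hit time; only the missing fraction additionally pays the penalty, which is why hit time << miss penalty makes caching worthwhile.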

Memory Hierarchy: Show Me Numbers
Consider an application where
– 30% of instructions are loads/stores
– memory latency = 100 cycles
– Time to execute 100 instructions = 70*1 + 30*100 = 3070 cycles
Add a cache with a 2-cycle latency
– Suppose the hit rate is 90%
– Time to execute 100 instructions = 70*1 + 27*2 + 3*100 = 70 + 54 + 300 = 424 cycles
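The slide's arithmetic can be checked directly, one instruction class at a time:

```python
# Without a cache: 70 non-memory instructions take 1 cycle each,
# and 30 loads/stores each pay the 100-cycle memory latency.
no_cache = 70 * 1 + 30 * 100
print(no_cache)  # -> 3070

# With a 2-cycle cache at a 90% hit rate: 27 of the 30 memory
# accesses hit (2 cycles each), the remaining 3 miss (100 cycles each).
with_cache = 70 * 1 + 27 * 2 + 3 * 100
print(with_cache)  # -> 424
```

Even a modest 90% hit rate cuts execution time by roughly 7x in this model, because the expensive 100-cycle accesses shrink from 30 to 3.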

Yoda says…
You will find only what you bring in