VMF Detailed Design Version 3 by Yehuda Afek and Alexander Matveev, 5/11/2015



Subjects
- Abstract Idea
- Transactional Memory:
  - Memory Model
  - Transaction
  - Illustration
- Requirements:
  - VMF
  - Transactional Memory
- Hardware Implementation:
  - Overview
  - VMF Prediction Table
    - Concept
    - Algorithm
  - VMF Stripe Cache
    - Concept
    - Algorithm
  - VMF Call Invoke Trigger
  - VMF Miss Prediction Handling
  - VMF Call Invoke implementation
  - TLB Modifications
  - VMF Code Segment: VMF_Start and VMF_End
  - Context Switch Handling

Abstract Idea
- Automate the transactification process.
- Drastically simplify the compiler.
- The hardware (HW) automatically does the work of the software transactional memory (STM).
- The result is like a HW-assisted STM.

TM Memory Model
- Virtual memory pages are divided into stripes.
- A stripe can hold one object or multiple objects, and an object can also be split across stripes.
- The TM algorithm works at stripe granularity, so the read-set and write-set entries are stripes.
- The TM algorithm is applied only to accesses to shared memory.
- The user marks which memory pages are shared and which are not shared.
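
As a rough illustration of this memory model, the sketch below maps an address to its page and stripe and tracks which pages the user has marked as shared. The page size, the stripe size, and all function names are assumptions made for the example; the slides do not specify them.

```python
PAGE_SIZE = 4096   # assumed page size in bytes (not given in the slides)
STRIPE_SIZE = 64   # assumed stripe size in bytes (not given in the slides)

shared_pages = set()  # page numbers the user has marked as shared


def page_of(addr):
    """Return the number of the page containing addr."""
    return addr // PAGE_SIZE


def stripe_of(addr):
    """Return the global stripe number containing addr."""
    return addr // STRIPE_SIZE


def mark_shared(addr):
    """Mark the page containing addr as shared."""
    shared_pages.add(page_of(addr))


def is_shared(addr):
    """Return True if addr lies in a page marked as shared."""
    return page_of(addr) in shared_pages


def stripes_touched(addr, size):
    """Stripes covered by an object of `size` bytes at addr (it may span stripes)."""
    return list(range(stripe_of(addr), stripe_of(addr + size - 1) + 1))
```

With these assumed sizes, an 8-byte object at address 60 straddles stripes 0 and 1, which is the "object split across stripes" case mentioned above.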

TM Transaction
- A transaction is a block of code annotated as {TxBegin, … block of code …, TxEnd}.
- Inside the transaction:
  - shared memory accesses must be preceded by a TxLD or TxST function call;
  - non-shared memory accesses proceed regularly.
- TxLD or TxST: the call records the shared memory access (in the read-set or write-set) and performs the TM-algorithm-specific code.
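
The recording done by TxLD and TxST can be pictured with a small sketch. The class and method names are hypothetical, the stripe size is an assumption, and the TM-algorithm-specific work is left as a comment because the slides do not define it.

```python
STRIPE_SIZE = 64  # assumed stripe size in bytes (not given in the slides)


class Transaction:
    """Hypothetical software view of one transaction's read-set and write-set."""

    def __init__(self):
        self.read_set = set()   # stripes read inside the transaction
        self.write_set = set()  # stripes written inside the transaction

    def tx_ld(self, addr):
        """Stand-in for TxLD: record the stripe being read."""
        self.read_set.add(addr // STRIPE_SIZE)
        # TM-algorithm-specific code would run here.

    def tx_st(self, addr):
        """Stand-in for TxST: record the stripe being written."""
        self.write_set.add(addr // STRIPE_SIZE)
        # TM-algorithm-specific code would run here.
```

Because the sets hold stripes rather than addresses, two loads that fall in the same stripe produce a single read-set entry.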

TM Illustration
[Figure: virtual memory divided into shared and non-shared pages, with stripes inside a page. A transaction (TxBegin … TxEnd) contains the code LD r, X1; ST Y, r; LD r, X1+1; LD r, X2. The accesses are labeled "TxLD invoked", "TxST invoked", or "no invocation".]

VMF - Requirements
- VMF Code Segment: a block of code annotated as:
  - {VMF_Start, … block of code …, VMF_End}
  - VMF_Start and VMF_End are defined in the following slides.
- VMF Invocation Trigger: within a VMF code segment, a VMF Call is invoked for a LD or ST instruction if all of the following conditions are true:
  1. The page of the accessed memory address is marked as shared.
  2. For LD: the stripe of the accessed address has not been accessed by a LD since the VMF_Start call.
  3. For ST: the stripe of the accessed address has not been accessed by a ST since the VMF_Start call.
- VMF Call Invoke: if a LD/ST instruction invokes VMF, the order of execution is:
  1. The VMF function (given the read/write memory address)
  2. The original LD/ST instruction

TM - Requirements
- VMF Call = TxLD/TxST for a LD/ST instruction.
- The VMF Code Segment is:
  - START = TxBegin
  - END = TxEnd
- TxBegin:
  1. Execute VMF_Start (defined in the following slides).
  2. Execute TM_Start.
- TxEnd:
  1. Execute TM_Commit using the constructed READ-SET and WRITE-SET.
  2. Execute VMF_End (defined in the following slides).
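
The ordering inside TxBegin and TxEnd can be sketched as follows. The helper `make_tx_hooks` and its four callback parameters are hypothetical; they only stand in for VMF_Start, TM_Start, TM_Commit, and VMF_End so that the call order is visible.

```python
def make_tx_hooks(vmf_start, tm_start, tm_commit, vmf_end):
    """Build TxBegin/TxEnd from the four primitives named on the slide."""

    def tx_begin():
        vmf_start()   # 1. Execute VMF_Start
        tm_start()    # 2. Execute TM_Start

    def tx_end(read_set, write_set):
        tm_commit(read_set, write_set)  # 1. TM_Commit with the constructed sets
        vmf_end()                       # 2. Execute VMF_End

    return tx_begin, tx_end
```

Note that the two halves mirror each other: VMF tracking is switched on before the TM starts and switched off only after the commit has run.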

Hardware Implementation Overview
- For each stripe, VMF Call Invoke is performed once for read access and once for write access during the transaction.
- To support this, the following changes will be made to the pipeline:
  - Fetch stage: the VMF Prediction Table (VMF PT) predicts VMF Call Invoke based on previous results for the current LD/ST instruction.
  - MEM stage: the VMF Stripe Cache (VMF SC) caches stripes that have already been accessed.
- New registers: upon a VMF Call we want to store:
  - VMF_ADDR_REG: the memory address accessed by the LD/ST instruction that invoked the VMF
  - VMF_PC_REG: the PC of the current LD/ST instruction, so that execution can return to it
- New instructions:
  - VMF_Start opcode
  - VMF_End opcode

[Figure: pipeline block diagram with the labels TLB, PC Reg, Addr Reg, inst addr, vmfF, Inst PC, Access Addr, VMF Ctrl, VMF PT, and SC.]

VMF Prediction Table Concept
- CONCEPT: we distinguish between three types of instruction occurrences that access shared data:
  - Multiple Invoke: an instruction that accesses a different stripe in each of its occurrences in a transaction.
  - Once Invoke: an instruction that accesses the same stripe throughout a transaction.
    - First Time:
    - Repeated: an instruction that accesses the same stripe during the transaction AND that stripe was accessed before by some LD/ST instruction during THE SAME transaction.
- Table entry: (instruction address, flags), which can be squeezed into 32 bits (the last 2 bits of the 32-bit instruction address can be used for the flags).
- Flags (2 bits):
  - 0: Once Invoke type (not yet invoked)
  - 1: Once Invoke type (repeated)
  - 2: Multiple Invoke type (not yet invoked)
  - 3: Multiple Invoke type (already invoked)
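
One way to read the 32-bit packing is sketched below. It assumes instructions are 4-byte aligned, so the low two bits of the instruction address are always zero and can hold the flags; the slides do not state this alignment, and the function names are hypothetical.

```python
FLAG_MASK = 0b11          # low 2 bits hold the flags
ADDR_MASK = 0xFFFFFFFC    # remaining 30 bits hold the instruction address


def pack_entry(inst_addr, flags):
    """Pack (instruction address, flags) into one 32-bit word."""
    if not 0 <= inst_addr <= 0xFFFFFFFF:
        raise ValueError("instruction address must fit in 32 bits")
    if inst_addr & FLAG_MASK:
        raise ValueError("instruction address must be 4-byte aligned (assumed)")
    if not 0 <= flags <= 3:
        raise ValueError("flags must fit in 2 bits")
    return inst_addr | flags


def unpack_entry(word):
    """Split a packed word back into (instruction address, flags)."""
    return word & ADDR_MASK, word & FLAG_MASK
```

For example, the entry (0x1000, flags = 2) packs to 0x1002 and unpacks back to the same pair.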

VMF Prediction Table Algorithm
1. COMMON CASE: the instruction triggered a VMF Call and the VMF Call was predicted:
   - If the entry's flags == 0 (Once Invoke, not yet invoked), update them to 1 (Once Invoke, already invoked).
   - Else if the entry's flags == 2 (Multiple Invoke, not yet invoked), update them to 3 (Multiple Invoke, already invoked).
   - Else the entry's flags are 3 (Multiple Invoke, already invoked); make no changes to the entry.
2. STUDY CASE (Once Invoke): the instruction triggered a VMF Call and the VMF Call was not predicted:
   - Create an entry with (current instruction address, flags = 1).
3. STUDY CASE (Multiple Invoke): the instruction triggered a VMF Call and the VMF Call was not predicted because flags == 1 for the entry:
   - Set the entry's flags to 3 (Multiple Invoke, already invoked).
4. RESTUDY CASE (not common): the instruction did not trigger a VMF Call and the VMF Call was predicted:
   - Remove the instruction's entry.
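
The four cases can be written as a small state machine. The sketch below models the table as a dictionary from instruction address to flags. The prediction rule in `predicts_call` is an assumption inferred from the cases (a call is predicted when an entry exists and its flags are not 1); the slides do not spell it out, and both function names are hypothetical.

```python
def predicts_call(pt, pc):
    """Assumed rule: predict a VMF Call unless the entry is missing or has flags == 1."""
    return pc in pt and pt[pc] != 1


def update_pt(pt, pc, triggered):
    """Apply the four cases to `pt` (dict: instruction address -> flags).

    Returns whether a VMF Call had been predicted for this instruction.
    """
    predicted = predicts_call(pt, pc)
    if triggered and predicted:          # 1. common case
        if pt[pc] == 0:
            pt[pc] = 1
        elif pt[pc] == 2:
            pt[pc] = 3
        # flags == 3: no change
    elif triggered and not predicted:
        if pc not in pt:                 # 2. study case, Once Invoke
            pt[pc] = 1
        else:                            # 3. study case, Multiple Invoke (flags == 1)
            pt[pc] = 3
    elif predicted:                      # 4. restudy case: predicted, not triggered
        del pt[pc]
    return predicted
```

Under this reading, an instruction that triggers twice in one transaction moves from "no entry" to flags 1 and then to flags 3, after which its calls are predicted.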

VMF Stripe Cache
- CONCEPT: the purpose of the VMF Stripe Cache is to record the stripes that were already accessed in the current transaction.
- ALGORITHM:
  - Every entry has: (stripe address, flags)
  - Flags can be:
    - 0: read access was performed
    - 1: write access was performed
  - If a shared memory access triggers VMF Call Invoke and the VMF PT flags are not equal to 3 (Multiple Invoke, already called), then an entry is created in the table holding the stripe's address and the access type (0 for LD, 1 for ST).

VMF Call Invoke Trigger
- VMF Call Invoke Trigger: VMF Call Invoke is triggered if all of the following conditions are met for the current LD/ST instruction:
  1. The page of the accessed memory address is marked as shared.
  2. The stripe entry for the accessed address, (stripe address, flags = 0 for LD or 1 for ST), is not in the VMF Stripe Cache.
- Updates upon trigger: if the VMF Call Invoke Trigger is true, then:
  - The VMF Stripe Cache is updated: a new entry is added.
  - The VMF Prediction Table is updated: a new entry is added, or the entry's flags are switched from 0 to 1, from 1 to 2, or from 2 to 3.
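
The trigger check and the Stripe Cache update can be sketched together. The page and stripe sizes, the function name, and the set-based cache are assumptions for illustration; the condition on the PT flags follows the Stripe Cache algorithm slide, and the Prediction Table update itself is not modeled here.

```python
PAGE_SIZE = 4096   # assumed page size in bytes
STRIPE_SIZE = 64   # assumed stripe size in bytes
LD, ST = 0, 1      # Stripe Cache flags: 0 = read access, 1 = write access


def vmf_trigger(addr, access, shared_pages, stripe_cache, pt_flags=None):
    """Return True if this LD/ST triggers VMF Call Invoke; update the Stripe Cache.

    stripe_cache is a set of (stripe address, flags) entries.
    pt_flags is the instruction's VMF PT flags, or None if it has no entry.
    """
    if addr // PAGE_SIZE not in shared_pages:   # condition 1: page marked as shared
        return False
    entry = (addr // STRIPE_SIZE, access)
    if entry in stripe_cache:                   # condition 2: entry not yet cached
        return False
    if pt_flags != 3:                           # no entry is created when PT flags == 3
        stripe_cache.add(entry)
    return True
```

A second load from the same stripe does not trigger, while the first store to that stripe still does, because reads and writes are cached as separate entries.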

VMF Miss Prediction Handling
If the current LD/ST instruction triggers VMF Call Invoke and the VMF PT did not predict it, then:
1. Store, for the current LD/ST instruction:
   - the memory access address (to a special register)
   - the instruction PC (to a special register)
2. Flush the pipeline.
3. Restart execution at PC = VMF Call Base Address.

VMF Call Invoke Hardware Implementation
- To invoke the VMF Call we need to pass parameters to it. The basic parameters are the current instruction's:
  - memory access address
  - instruction PC
- Therefore, upon VMF Call Invoke Trigger:
  - The memory access address is forwarded to VMF_ADDR_REG.
  - The instruction PC is forwarded to VMF_PC_REG.
  - The current LD/ST instruction is aborted.
  - (If there was a VMF miss prediction, a pipeline flush is performed.)
  - The VMF Call procedure is executed normally.
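
A toy model of these steps is sketched below. The `Core` class, the list standing in for the pipeline, and the value of `VMF_CALL_BASE` are assumptions for illustration only; real hardware would do this in the pipeline control logic.

```python
from dataclasses import dataclass, field

VMF_CALL_BASE = 0x8000  # assumed address of the VMF Call procedure


@dataclass
class Core:
    pc: int = 0
    vmf_addr_reg: int = 0   # models VMF_ADDR_REG
    vmf_pc_reg: int = 0     # models VMF_PC_REG
    pipeline: list = field(default_factory=list)  # in-flight instructions (toy model)
    flushes: int = 0


def invoke_vmf(core, access_addr, inst_pc, predicted):
    """Model the steps taken when a LD/ST triggers VMF Call Invoke."""
    core.vmf_addr_reg = access_addr   # memory access address -> VMF_ADDR_REG
    core.vmf_pc_reg = inst_pc         # instruction PC -> VMF_PC_REG
    if not predicted:                 # miss prediction: flush the pipeline
        core.pipeline.clear()
        core.flushes += 1
    # The LD/ST is aborted by redirecting fetch; VMF_PC_REG lets execution return to it.
    core.pc = VMF_CALL_BASE
```

Only the mispredicted path pays for a flush; when the Prediction Table predicted the call, the registers are loaded and fetch is redirected without discarding in-flight work.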

TLB Modifications
- Every page has an additional bit:
  - 0: not marked (not shared)
  - 1: marked (shared)
- TLB entry fields: page addr | Option Flags | S Flag

VMF Code Segment
- VMF_Start:
  - VMF PT:
    - If an entry's flag is 1, reset it to 0.
    - If an entry's flag is 3, reset it to 2.
  - VMF SC (Stripe Cache): remove entries.
  - Trigger VMF on.
- VMF_End:
  - Trigger VMF off.
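
The effect of the two instructions on the VMF state can be sketched as follows. The class name and the dictionary and set representations of the tables are assumptions made for the example.

```python
class VmfState:
    """Toy model of the VMF tables and the trigger enable bit."""

    def __init__(self):
        self.pt = {}               # instruction address -> flags (0..3)
        self.stripe_cache = set()  # (stripe address, access flag) entries
        self.enabled = False       # whether the VMF trigger is on

    def vmf_start(self):
        """VMF_Start: reset 'already invoked' flags, empty the SC, turn the trigger on."""
        for pc, flags in self.pt.items():
            if flags == 1:
                self.pt[pc] = 0
            elif flags == 3:
                self.pt[pc] = 2
        self.stripe_cache.clear()
        self.enabled = True

    def vmf_end(self):
        """VMF_End: turn the trigger off."""
        self.enabled = False
```

The Prediction Table keeps its learned instruction types across segments; only the per-transaction "invoked" state and the Stripe Cache contents are reset.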

Context Switch
- Upon context switch:
  - Flush the VMF tables (to preserve correctness).