CS 3410 Spring 2014 Prelim 2 Review

CS 3410 - Spring 2014 Prelim 2 Review

Prelim 2 Coverage: Calling Conventions, Linkers, Caches, Virtual Memory, Traps, Multicore Architectures, Synchronization

Calling Convention
Prelim 2, 2013sp, Q5: Translate the following C code to MIPS assembly:

    int foo(int a, int b, int c, int d, int e) {
        int tmp = (a|b) - (d&e);
        int q = littlefoo(tmp);
        int z = bigfoo(q,tmp,a,b,c);
        return tmp + q - z;
    }

Calling Convention (cont.)

Question 1: Which variables need callee-save registers, and which can stay in caller-save registers?
    Callee-save (the value is still needed after a function call): a, b, c, tmp, q
    Caller-save (the value does not need to survive a function call): d ($a3), e, z ($v0)

Question 2: How many outgoing arguments should we leave space for?
    5: bigfoo(q, tmp, a, b, c)

Question 3: What is the stack frame size?
    ra + fp + 5 callee-save + 5 outgoing args = 12 words = 48 bytes
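The frame-size arithmetic from Question 3 can be sketched in C. This is only a sketch: the function names are made up for illustration, and it assumes the convention used on these slides (4-byte words, $ra and $fp always saved, and an outgoing argument area of at least four words). Any stack alignment padding is not modeled.

```c
#include <assert.h>

/* Bytes per MIPS32 word. */
enum { WORD_BYTES = 4 };

/* Frame size in bytes for a non-leaf function: $ra, $fp, the callee-save
 * registers it uses, and the outgoing argument area. The argument area is
 * assumed to be at least 4 words (slots for $a0-$a3). */
int frame_bytes(int callee_saved_regs, int max_outgoing_args)
{
    assert(callee_saved_regs >= 0 && max_outgoing_args >= 0);
    int arg_words = max_outgoing_args < 4 ? 4 : max_outgoing_args;
    return (2 + callee_saved_regs + arg_words) * WORD_BYTES;
}

/* Offset from the callee's $sp (after its prolog) to incoming argument n
 * (0-based), which sits in the caller's outgoing argument area. */
int incoming_arg_offset(int own_frame_bytes, int n)
{
    assert(n >= 0);
    return own_frame_bytes + n * WORD_BYTES;
}
```

For foo, frame_bytes(5, 5) gives the 48 bytes computed above, and incoming_arg_offset(48, 4) gives 64, the offset at which foo finds its fifth argument e.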

Calling Convention (cont.)

# prolog
    ADDIU $sp, $sp, -48    # frame = 5 outgoing args, 5 $s registers, $ra, $fp
    SW    $ra, 44($sp)
    SW    $fp, 40($sp)
    SW    $s0, 36($sp)     # save old $s0; it will hold a
    SW    $s1, 32($sp)     # save old $s1; it will hold b
    SW    $s2, 28($sp)     # save old $s2; it will hold c
    SW    $s3, 24($sp)     # save old $s3; it will hold tmp = (a|b) - (d&e)
    SW    $s4, 20($sp)     # save old $s4; it will hold q = littlefoo(tmp)
    ADDIU $fp, $sp, 44

Calling Convention (cont.)

# initializing local variables
    MOVE $s0, $a0
    MOVE $s1, $a1
    MOVE $s2, $a2
    OR   $t0, $s0, $s1     # $t0 = (a|b)
    LW   $t1, 64($sp)      # $t1 = e; 64 = 48 (own frame) + 16 (5th arg slot in caller's frame)
    AND  $t1, $a3, $t1     # $t1 = (d&e)
    SUB  $s3, $t0, $t1     # $s3 = tmp = (a|b) - (d&e)

Calling Convention (cont.)

# calling littlefoo
    MOVE $a0, $s3          # $a0 = tmp
    JAL  littlefoo
    NOP
# calling bigfoo
    MOVE $s4, $v0          # $s4 = q = littlefoo(tmp)
    MOVE $a0, $s4          # $a0 = $s4 = q
    MOVE $a1, $s3          # $a1 = $s3 = tmp
    MOVE $a2, $s0          # $a2 = $s0 = a
    MOVE $a3, $s1          # $a3 = $s1 = b
    SW   $s2, 16($sp)      # 5th arg = $s2 = c
    JAL  bigfoo            # bigfoo(q,tmp,a,b,c)
    NOP                    # delay slot, as after JAL littlefoo

Calling Convention (cont.)

# generating return value
    ADD   $t0, $s3, $s4    # $t0 = tmp + q
    SUB   $v0, $t0, $v0    # $v0 = $t0 - z = (tmp + q) - z
# epilog
    LW    $s4, 20($sp)
    LW    $s3, 24($sp)
    LW    $s2, 28($sp)
    LW    $s1, 32($sp)
    LW    $s0, 36($sp)
    LW    $fp, 40($sp)
    LW    $ra, 44($sp)
    ADDIU $sp, $sp, 48
    JR    $ra
    NOP

Linkers and Program Layout
Prelim 2, 2012sp, Q2b: The global pointer, $gp, is usually initialized to the middle of the global data segment. Why the middle?

Load and store instructions use signed 16-bit offsets. Having $gp point to the middle of the data segment allows a full 2^16-byte range of memory to be accessed using positive and negative offsets from $gp.
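A small C check of that range, as a sketch. The helper names are made up, and the $gp value in the usage note is only an example, not something taken from the exam.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr can be written as offset($gp) with a signed 16-bit offset. */
bool gp_reachable(uint32_t gp, uint32_t addr)
{
    int64_t delta = (int64_t)addr - (int64_t)gp;
    return delta >= INT16_MIN && delta <= INT16_MAX;
}

/* Number of distinct byte addresses reachable from one $gp value. */
uint32_t gp_window_bytes(void)
{
    return (uint32_t)(INT16_MAX - INT16_MIN + 1);
}
```

With $gp = 0x10008000 (an example value), every address from 0x10000000 through 0x1000FFFF is reachable. If $gp pointed at the start of the segment instead, the negative half of the offsets would go unused.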

Linkers and Program Layout
Prelim 2, 2012sp, Q2c: Bob links his Hello World program against 9001 static libraries. Amazingly, this works without any collisions. Why?

The linker chooses addresses for each library and fills in all the absolute addresses in each with the numbers that it chose.
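The idea can be illustrated with a toy relocation pass in C. Everything here is hypothetical: struct module and toy_link are invented for the sketch and do not correspond to any real object file format or linker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy object module, not a real object file format. */
struct module {
    uint32_t *words;       /* image; addresses are relative to the module start */
    size_t n_words;
    const size_t *relocs;  /* indices of the words that hold absolute addresses */
    size_t n_relocs;
    uint32_t base;         /* address chosen by toy_link() */
};

/* Place the modules back to back starting at start, then patch every
 * recorded address by adding the base chosen for its module.
 * Returns the first address after the last module. */
uint32_t toy_link(struct module *mods, size_t n_mods, uint32_t start)
{
    uint32_t next = start;
    for (size_t m = 0; m < n_mods; m++) {
        mods[m].base = next;
        for (size_t r = 0; r < mods[m].n_relocs; r++) {
            size_t idx = mods[m].relocs[r];
            assert(idx < mods[m].n_words);
            mods[m].words[idx] += mods[m].base;
        }
        next += (uint32_t)(mods[m].n_words * sizeof(uint32_t));
    }
    return next;
}
```

Because each module receives its own non-overlapping base, two libraries that both refer to "address 4" internally end up with different absolute addresses after patching.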

Caches
Prelim 2, 2013sp, Q4: Assume that we have a byte-addressed 32-bit processor with 32-bit words (i.e., a word is 4 bytes). Assume further that we have a cache consisting of eight 16-byte lines.

Caches (cont.)
How many bits are needed for the tag, index, and offset for the following cache architectures?

    Organization             Tag   Index   Offset
    Direct mapped             25       3        4
    2-way set associative     26       2        4
    4-way set associative     27       1        4
    Fully associative         28       0        4

The offset is determined only by the size of the cache line. The index is determined by how the cache is organized. Tag = 32 - index - offset.
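The same calculation as a C sketch. The struct and function names are made up for illustration, and it assumes the line size and the number of sets are powers of two.

```c
#include <assert.h>

struct cache_bits { int tag, index, offset; };

/* log2 of a power of two. */
static int log2_exact(unsigned x)
{
    assert(x != 0 && (x & (x - 1)) == 0);
    int bits = 0;
    while (x > 1) { x >>= 1; bits++; }
    return bits;
}

/* Split an address of addr_bits bits for a cache with n_lines lines of
 * line_bytes bytes each, organized as ways-way set associative
 * (ways == 1: direct mapped, ways == n_lines: fully associative). */
struct cache_bits split_address(int addr_bits, unsigned n_lines,
                                unsigned line_bytes, unsigned ways)
{
    assert(ways != 0 && n_lines % ways == 0);
    struct cache_bits b;
    b.offset = log2_exact(line_bytes);
    b.index = log2_exact(n_lines / ways);   /* one index value per set */
    b.tag = addr_bits - b.index - b.offset;
    return b;
}
```

Calling split_address(32, 8, 16, ways) with ways = 1, 2, 4, 8 reproduces the four rows of the answer.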

Caches (cont.) For each access and for each specified cache organization, indicate whether there is a cache hit, a cold (compulsory) miss, conflict miss, or capacity miss.
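The exam's access list is not part of this transcript, so the following is only a sketch of how such a classification can be done for the direct-mapped organization. It uses the common textbook definitions, which are an assumption here: a miss is cold if the block was never referenced before, a capacity miss if a fully associative LRU cache of the same size would also miss, and a conflict miss otherwise. All names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { LINES = 8, LINE_BYTES = 16, MAX_BLOCKS = 64 };
enum result { HIT, COLD_MISS, CONFLICT_MISS, CAPACITY_MISS };

struct sim {                    /* zero-initialize before use */
    bool valid[LINES];
    uint32_t block[LINES];      /* direct-mapped cache: block held by each line */
    uint32_t lru[LINES];        /* fully associative LRU shadow, most recent first */
    int lru_count;
    uint32_t seen[MAX_BLOCKS];  /* every block referenced so far */
    int seen_count;
};

/* Access the fully associative LRU shadow cache; returns true on a hit. */
static bool shadow_access(struct sim *s, uint32_t blk)
{
    int pos = -1;
    for (int i = 0; i < s->lru_count; i++)
        if (s->lru[i] == blk) { pos = i; break; }
    bool hit = pos >= 0;
    if (!hit) {
        if (s->lru_count < LINES) s->lru_count++;
        pos = s->lru_count - 1;  /* new slot, or the least recently used one */
    }
    memmove(&s->lru[1], &s->lru[0], (size_t)pos * sizeof s->lru[0]);
    s->lru[0] = blk;
    return hit;
}

/* Record blk as referenced; returns true if it had been referenced before. */
static bool mark_seen(struct sim *s, uint32_t blk)
{
    for (int i = 0; i < s->seen_count; i++)
        if (s->seen[i] == blk) return true;
    assert(s->seen_count < MAX_BLOCKS);
    s->seen[s->seen_count++] = blk;
    return false;
}

/* Access addr in the direct-mapped cache and classify the outcome. */
enum result access_cache(struct sim *s, uint32_t addr)
{
    uint32_t blk = addr / LINE_BYTES;
    unsigned line = blk % LINES;
    bool seen_before = mark_seen(s, blk);
    bool shadow_hit = shadow_access(s, blk);
    bool hit = s->valid[line] && s->block[line] == blk;
    s->valid[line] = true;
    s->block[line] = blk;
    if (hit) return HIT;
    if (!seen_before) return COLD_MISS;
    return shadow_hit ? CONFLICT_MISS : CAPACITY_MISS;
}
```

As an example with made-up addresses, the trace 0x00, 0x80, 0x00 gives cold, cold, conflict: both blocks map to line 0, but a fully associative cache would have kept both.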

Virtual Memory (2012 Prelim3, Q4)

Virtual address: 32 bits
Page size: 16 kB
Single-level page table; each page table entry is 4 bytes.
Each process segment requires a separate physical page.

Memory layout of a single process: Stack 8 kB, Heap 8 kB, Data 8 kB, Code 8 kB

Bits for page offset? 16 kB = 2^14 B, so we need 14 bits.
Bits for page table index? 32 - 14 = 18 bits.
Physical memory?
    Segments: each segment is smaller than one page, so 4 * 16 kB = 64 kB
    Page table: 2^18 PTEs * 4 bytes = 1 MB
    Total: 64 kB + 1 MB
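The single-level numbers can be checked with a short C sketch. The names are invented for illustration, and it assumes, as the question does, that each segment fits in one page.

```c
#include <assert.h>
#include <stdint.h>

struct single_level {
    int offset_bits;       /* bits of page offset */
    int index_bits;        /* bits of page table index */
    uint64_t table_bytes;  /* size of the whole page table */
};

/* log2 of a power of two. */
static int log2_exact(uint64_t x)
{
    assert(x != 0 && (x & (x - 1)) == 0);
    int bits = 0;
    while (x > 1) { x >>= 1; bits++; }
    return bits;
}

struct single_level single_level_layout(int va_bits, uint64_t page_bytes,
                                        uint64_t pte_bytes)
{
    struct single_level l;
    l.offset_bits = log2_exact(page_bytes);
    l.index_bits = va_bits - l.offset_bits;
    l.table_bytes = ((uint64_t)1 << l.index_bits) * pte_bytes;
    return l;
}

/* Physical memory for one process: one page per segment plus the page table. */
uint64_t single_level_total(uint64_t n_segments, uint64_t page_bytes,
                            uint64_t table_bytes)
{
    return n_segments * page_bytes + table_bytes;
}
```

For a 32-bit address, 16 kB pages and 4-byte entries this gives 14 offset bits, 18 index bits and a 1 MB table, so the total is 64 kB + 1 MB = 1,114,112 bytes.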

Virtual Memory (2012 Prelim3, Q4)

Two-level page table. Assume there are enough page table entries to fill a second-level page table (that is, every entry in a second-level page table is used).

Bits for page offset? 14 bits
Bits for second-level page table index? 16 kB / 4 B = 2^12 entries, so we need 12 bits.
Bits for page directory index? 32 - 14 - 12 = 6 bits
Physical memory (each process segment requires a separate second-level page table)?
    1st level: 2^6 * 4 B < 2^14 B, so the page directory takes one 16 kB page
    2nd level: 4 * 16 kB
    Pages: 4 * 16 kB
    Total: 16 kB + 4 * 16 kB + 4 * 16 kB ……
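A C sketch of the two-level split, using the 6/12/14 field widths derived above. The names are made up, and two_level_total simply adds the three terms listed in the answer (one directory page, one second-level table per segment, one data page per segment).

```c
#include <assert.h>
#include <stdint.h>

enum { OFFSET_BITS = 14, TABLE_BITS = 12, DIR_BITS = 6 };
enum { PAGE_BYTES = 1 << OFFSET_BITS };

struct va_fields { uint32_t dir, table, offset; };

/* Split a 32-bit virtual address into page directory index,
 * second-level table index, and page offset. */
struct va_fields split_va(uint32_t va)
{
    struct va_fields f;
    f.offset = va & ((1u << OFFSET_BITS) - 1);
    f.table = (va >> OFFSET_BITS) & ((1u << TABLE_BITS) - 1);
    f.dir = va >> (OFFSET_BITS + TABLE_BITS);
    return f;
}

/* Bytes of physical memory: page directory + second-level tables + pages. */
uint32_t two_level_total(uint32_t n_segments)
{
    return (uint32_t)PAGE_BYTES * (1 + n_segments + n_segments);
}
```

With four segments the three terms add up to 9 pages of 16 kB, i.e. 144 kB, compared with 64 kB + 1 MB for the single-level table.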

Syscall

User Program:
    main() { … syscall(arg1,arg2); }
User Stub:
    syscall(arg1,arg2) { trap; return }
        Hardware Trap
Kernel Stub:
    handler() {
        copy arguments from user memory
        check arguments
        syscall(arg1,arg2);
        copy return value into user memory
        return
    }
Kernel:
    syscall(arg1,arg2) { do operation }
        Trap Return

Exceptions

On an interrupt or exception:
    CPU saves PC of exception instruction (EPC)
    CPU saves cause of the interrupt/privilege (Cause register)
    Switches the SP to the kernel stack
        Saves the old (user) SP value
        Saves the old (user) PC value
        Saves the old privilege mode
    Sets the new privilege mode to 1
    Sets the new PC to the kernel interrupt/exception handler

Kernel interrupt/exception handler handles the event:
    Saves all registers
    Examines the cause
    Performs operation required
    Restores all registers
    Performs a "return from interrupt" instruction, which restores the privilege mode, SP and PC

Syscall vs. Exceptions

Concurrency (2012 Prelim3, Q5)

    mutex_lock(&m)
    operation
    mutex_unlock(&m)

mutex_lock:
    try: LI   $t1, 1
         LL   $t0, 0($a0)
         BNEZ $t0, try
         SC   $t1, 0($a0)
         BEQZ $t1, try

mutex_unlock:
    SW $zero, 0($a0)

Load-link returns the current value of a memory location, while a subsequent store-conditional to the same memory location will store a new value only if no updates have occurred to that location since the load-link. Together, this implements a lock-free atomic read-modify-write operation.
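C has no direct LL/SC, so the sketch below swaps in C11 compare-and-swap from <stdatomic.h> to get the same "store 1 only if the lock word is still 0" behavior. The type and function names are made up for illustration, and this is a plain spin lock, not a production mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef atomic_int spin_mutex;   /* 0 = free, 1 = held */

/* One attempt: atomically change 0 -> 1. Returns true if the lock was taken. */
bool spin_try_lock(spin_mutex *m)
{
    int expected = 0;
    return atomic_compare_exchange_strong(m, &expected, 1);
}

/* Spin until the lock is taken, like the try: loop in the MIPS version. */
void spin_lock(spin_mutex *m)
{
    while (!spin_try_lock(m)) {
        /* lock was held, or another thread won the race: retry */
    }
}

/* Release the lock, like SW $zero, 0($a0). */
void spin_unlock(spin_mutex *m)
{
    atomic_store(m, 0);
}
```

The failed compare-and-swap plays the role of both retry branches in the assembly: it fails if the lock word was nonzero (BNEZ) and also if another thread changed it between the read and the write (BEQZ after SC).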

Concurrency (2012 Prelim3, Q5)

Critical section: x = max(x, y)
x: global variable, shared; y: local variable
&x: $a1, y: $a2
Implement the critical section using LL/SC, without using mutex_lock and mutex_unlock.

    try:  LL   $t0, 0($a1)
          BGE  $t0, $a2, next
          NOP
          MOVE $t0, $a2
    next: SC   $t0, 0($a1)
          BEQZ $t0, try
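The same lock-free update can be sketched in C11, again using compare-and-swap in place of LL/SC. The function name is made up. One difference from the assembly: when x is already at least y, this version skips the store, whereas the MIPS code writes the old value back.

```c
#include <assert.h>
#include <stdatomic.h>

/* Atomically performs x = max(x, y) and returns the resulting maximum. */
int atomic_max(atomic_int *x, int y)
{
    int old = atomic_load(x);
    while (old < y && !atomic_compare_exchange_weak(x, &old, y)) {
        /* CAS failed: old now holds the value currently in x, so the
         * loop condition re-checks whether y is still larger. */
    }
    return old < y ? y : old;
}
```

A failed compare-and-swap corresponds to SC returning 0: some other thread touched x in between, so the comparison has to be redone on the fresh value.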

Concurrency (Homework2 Q8)

A runs, then B => 0
    A: LW    $t0, 0($s0)
    A: ADDIU $t0, $t0, 2
    A: SW    $t0, 0($s0)
    B: SW    $zero, 0($s0)

B runs, then A => 2
    B: SW    $zero, 0($s0)
    A: LW    $t0, 0($s0)
    A: ADDIU $t0, $t0, 2
    A: SW    $t0, 0($s0)

B runs between A's load and A's store => 3
    A: LW    $t0, 0($s0)
    A: ADDIU $t0, $t0, 2
    B: SW    $zero, 0($s0)
    A: SW    $t0, 0($s0)
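The interleavings can be enumerated with a small C model. The homework statement is not in this transcript, so the initial value of the shared word is an assumption: the "=>3" outcome is consistent with it starting at 1, and that is what the usage note uses. The function name is made up.

```c
#include <assert.h>

/* Simulates thread A (LW, ADDIU +2, SW) with thread B's single
 * SW $zero placed before A's step number b_position:
 *   0: before LW, 1: before ADDIU, 2: before SW, 3: after SW.
 * Returns the final value of the shared word at 0($s0). */
int run_interleaving(int initial, int b_position)
{
    assert(b_position >= 0 && b_position <= 3);
    int x = initial;   /* shared word at 0($s0) */
    int t0 = 0;        /* thread A's private $t0 */
    for (int step = 0; step <= 3; step++) {
        if (step == b_position) x = 0;   /* B: SW $zero, 0($s0) */
        if (step == 0) t0 = x;           /* A: LW $t0, 0($s0) */
        else if (step == 1) t0 += 2;     /* A: ADDIU $t0, $t0, 2 */
        else if (step == 2) x = t0;      /* A: SW $t0, 0($s0) */
    }
    return x;
}
```

With an assumed initial value of 1, positions 0 through 3 give 2, 3, 3 and 0, which matches the three distinct outcomes listed above.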

Concurrency(Homework2 Q8)