Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS444/544 Operating Systems II System Calls and Page Fault

Similar presentations


Presentation on theme: "CS444/544 Operating Systems II System Calls and Page Fault"— Presentation transcript:

1 CS444/544 Operating Systems II System Calls and Page Fault
Yeongjin Jang 04/30/19

2 About Lab 1 & Tokens There will be no Token deduction from Lab 1
You all have 5 tokens to use for Lab 2 and onward Missing lab1-final tag -5 pts (max 50), i.e., missing the tag will get 9% if you got 50/50

3 About Lab 1 Missing student.info COMMIT your info for lab2,
wuyo, Yuchia, wiltons, gaomin, jmanosu, xiongz, para, jaocalex, Mythcell, kulkarnr, REK, silverrhealm COMMIT your info for lab2, And send me an about the information ASAP GET 0 pts if the info is not presented

4 Recap: System Call (User)
User calls a function cprintf -> calls sys_cputs() sys_cputs() at user code will call syscall() (lib/syscall.c) This syscall() is at lib/syscall.c Set args in the register and then int $0x30 Now kernel execution starts…

5 Recap: System Call (Kernel)
CPU gets software interrupt TRAPHANDLER_NOEC(T_SYSCALL…) _alltraps() trap() trap_dispatch() Get registers that store arguments from struct Trapframe *tf Call syscall() using those registers This syscall() is at kern/syscall.c

6 Recap: System Call (return to User)
Finishing handling of syscall (return of syscall()) trap() calls env_run() Get back to the user environment! env_pop_tf() Runs iret Back to Ring 3! Restore trap frame from the stack

7 Recap: A High-level Overview of User/Kernel Execution
User Level (Ring 3) printf() A library call in ring 3 Libraries sys_write() A system call, From ring 3 to ring 0 OS Kernel (Ring 0) do_sys_write() A kernel function

8 Recap: A High-level Overview of User/Kernel Execution
User Level (Ring 3) printf() A library call in ring 3 ret (ring 3) Libraries sys_write() iret (ring 0 to ring 3) A system call, From ring 3 to ring 0 OS Kernel (Ring 0) do_sys_write() A kernel function

9 Today’s Topic More about System Call Page Fault
How an OS handle a fault and resume the execution?

10 Library Calls vs. System Calls
Ring 3 App Library Calls Library Calls APIs available in Ring 3 DO NOT include operations in Ring 0 Could be a wrapper for some computation or Could be a wrapper for system calls E.g., printf() internally uses write(), which is a system call Some system calls are available as library calls As wrappers in Ring 3 OS Syscalls Hardware

11 Library Calls vs. System Calls
APIs available in Ring 0 OS’s abstraction for hardware interface for userspace Called when Ring 3 application need to perform Ring 0 operations Ring 3 cannot do arbitrary Ring 0 operations Entries are only system calls (cannot call arbitrary kernel functions) Kernel checks all the arguments at the entrance A secure method to control access to Ring 0!

12 How an OS Protects System Call?
Call gate System call can be invoked only with trap handler int $0x30 – in JOS int $0x80 – in Linux (32-bit) int $0x2e – in Windows (32-bit) sysenter/sysexit (32-bit) syscall/sysret (64-bit) OS performs checks if userspace is doing a right thing Before performing important ring 0 operations E.g., accessing hardware.. Ring 3 App Library Calls int $0x30 CHECK!! OS Syscalls Hardware

13 An Example of Check at Syscall Entry
How can we protect ‘read()’ system call? read(int fd, void *buf, size_t count) Read count bytes from a file pointed by fd and store those in buf Usage

14 An Example of Check at Syscall Entry
Problem: what will happen if we call… This will overwrite kernel code with your keystroke typing.. Changing kernel code from Ring 3 is possible!

15 Call Gate We can hook all syscalls from Ring 3 at our syscall trap handler Algorithm Get the trap for syscall Check for the syscall number (3 for read in Linux) Check for its arguments (fd, buf, size) buf must points to Ring 3 writable address, not the kernel address If passed, perform the ‘read’ operation Otherwise, return -EFAULT

16 Call Gate We can hook all syscalls from Ring 3 at our syscall trap handler Algorithm Get the trap for syscall Check for the syscall number (3 for read in Linux) Check for its arguments (fd, buf, size) buf must points to Ring 3 writable address, not the kernel address If passed, perform the ‘read’ operation Otherwise, return -EFAULT Ring 3 App Library Calls int $0x30 CHECK!! OS Syscalls Hardware

17 Test

18 Check How System Calls are Invoked
Use strace in Linux

19 Check How Library Calls are Invoked
Use ltrace in Linux

20 Page Fault: A Case of Handling Faults
Occurs when paging fails !(pde&PTE_P) or !(pte&PTE_P): invalid translation Write access but !(pte&PTE_W): access violation Access from user but !(pte&PTE_U): protection violation Faults Faulting instruction has not executed (e.g., page fault) Resume the execution after handling the fault

21 Page Fault: an Example Accessing a Kernel address from User

22 Page Fault: What Does CPU Do?
CPU would like to let OS know why and where such a page fault happened CR2: stores the address of the fault Error code: stores the reason of the fault 1 0xf

23 CPU/OS Execution Example
User program accesses 0xf CPU generates page fault (pte&PTE_U == 0) Calls page fault handler in IDT OS: page_fault_handler Read CR2 (address of the fault, 0xf ) Read error code (contains the reason of the fault) Resolve error (if not, generate SIGSEGV, segmentation fault!) Continue user execution User: resume on that instruction (or stop by SIGSEGV)

24 Fault Resume Example: Stack Overflow
inc/memlayout.h We allocate one (1) page for the user stack If you use a large local variable on the stack Stack overflow (stack grows down…) NOT MAPPED!

25 Some Idea: Allocating New Stack Automatically
Can we detect such an access and allocate a new page for the stack automatically? Yes We will utilize ‘Page Fault’ Observations Stack overflow would be sequential (access pages adjacent to the stack) We should catch both read/write access (both should fault)

26 Example: New Stack Allocation by Fault (User)
Stack starts at 0xeebfe000 Suppose the current value of esp (stack) is 0xeebfe010 User program creates a new variable: char buf[32] buf = 0xeebfdff0 Buffer range: 0xeebfdff0 ~ 0xeebfe010 On accessing buf[0] = ‘1’; movb $0x31, (%eax) eax = 0xeebfdff0 No translation for 0xeebfd000

27 Example: New Stack Allocation by Fault (CPU)
Lookup page table No translation! Store 0xeebfdff0 to CR2 Set error code “The fault was caused by a non-present page!” Call page fault handler (interrupt #14) NOT MAPPED!

28 Example: New Stack Allocation by Fault (OS)
CPU invoked the Page_fault_handler() Read CR2 0xeebfdff0, it seems like the page right next to current stack page end Current end is: 0xeebfe000 Read error code “The fault was caused by a non-present page!” Let’s allocate a new page for the stack!

29 Example: New Stack Allocation by Fault (OS)
Allocate a new page for the stack Struct PageInfo *pp = page_alloc(ALLOC_ZERO); Get a new page, and wipe it to have all zero as its contents page_insert(env_pgdir, pp, 0xeebfd000, PTE_U|PTE_W); Map a new page to that address! iret!

30 Example: New Stack Allocation by Fault (User-Return)
On accessing buf[0] = ‘1’; movb $0x31, (%eax) eax = 0xeebfdff0 No translation for 0xeebfd000 Execute the faulting instruction again: buf[0] = ‘1’; eax = 0xeebfdff0 Now translation is valid! Continue to execute the loop..

31 Other Useful Examples of Using Page Fault
Copy-on-Write (CoW) Share pages read-only Create a private copy when the first write access happens Memory Swapping Limited amount of physical memory We have a bigger storage: SSD, Hard Disk, online storage, etc. Can we store some ‘currently unused but will be used later’ part into the disk?

32 Copy-on-Write Think about vm-jos server
Will run many /bin/bash, /usr/bin/gdb, /usr/bin/tmux, etc. Each of you will run those programs!! How can we build an OS to efficiently load them and minimize memory usage?

33 A Program .text .rodata .data .bss .bss (RW-)
Code area. Read-only and executable .rodata Data area, Read-only and not executable .data Data area, Read/Writable (not executable) Initialized by some values .bss Initialized as 0 .data (RW-) .rodata (R--) .text (R-X)

34 Page Cache Store one copy of file in the memory Process 2
.bss (RW-) Do we need to copy the same data for each process creation? Store one copy of file in the memory .data (RW-) Process 1 .bss (RW-) .bss (RW-) .rodata (R--) .data (RW-) .data (RW-) .text (R-X) .rodata (R--) .rodata (R--) .text (R-X) .text (R-X)

35 Sharing by Read-only Set page table to map the same physical address to share contents Process 1 Process 2 .bss (RW-) .bss (R--) .bss (R--) .data (RW-) .data (R--) .data (R--) .rodata (R--) .rodata (R--) .rodata (R--) .text (R-X) .text (R-X) .text (R-X)

36 OK for Read-only Sections
How can Process 1 write on .bss?? Process 1 .bss (RW-) .bss (R--) Write Page fault! .data (RW-) .data (R--) .rodata (R--) .rodata (R--) .text (R-X) .text (R-X)

37 Page Fault Handler Read CR2 Read Error code ToDo
An address that is in the page cache Hmm… a fault from one of the shared location! Read Error code Write on read-only memory Hmm… the process requires a private copy! ToDo Map a new physical page (page_alloc, page_insert) Copy the contents Mark it read/write Resume…

38 Copy–on-Write How can Process 1 write on .bss?? .bss (RW-) Process 1
MAP! .bss (RW-) .bss (RW-) .bss (R--) Write Page fault! .data (RW-) .data (R--) .rodata (R--) .rodata (R--) .text (R-X) .text (R-X)

39 Benefits? Can reduce time for copying contents that is already in some physical memory (page cache) Can reduce actual use of physical memory by sharing code/read-only data among multiple processes 1,000,000 processes, requiring only 1 copy of .text/.rodata At the same time Can support sharing of writable pages (if not has written at all) Can create private pages seamlessly on write

40 Memory Swapping Memory Hierarchy FAST Expensive Reg (KB) Cache (MB)
DISK (TB? PB?) Main Memory (GB) Cache (MB) Reg (KB) Memory Hierarchy SIZE

41 Challenge Suppose you have 8GB of main memory
Can you run a program that its program size is 16GB? Yes, you can load them part by part This is because we do not use all of data at the same time Can your OS do this execution seamlessly to your application?

42 Swapping PT Virtual Memory Physical Memory pgdir PT

43 Swapping – Remove a page…
PT Virtual Memory Physical Memory Page Fault! Access pgdir PT DISK

44 Swapping - OS Page fault handler
Read CR2 (get address) Read error code If error code says that the fault is caused by non-present page and The faulting page of the current process is stored in the disk Load that page into physical memory Map it!

45 Swapping – Remove a page…
PT Virtual Memory Physical Memory Create new map! Page Fault! Access pgdir Allocate New page! PT DISK READ from DISK

46 Page Fault A fault is meant to be handled by OS, and after that, it is meant to be resuming the execution from the faulting point Page fault handler is useful for implementing Automatic stack allocation Copy-on-write (will do in Lab4) Swapping


Download ppt "CS444/544 Operating Systems II System Calls and Page Fault"

Similar presentations


Ads by Google