Low level Programming.


Linux ABI: System Calls
Everything distills into a system call
  /sys, /dev, /proc → read() & write() syscalls
What is a system call?
  Special-purpose function call
  Elevates privilege
  Executes function in kernel
But what is a function call?

What is a system call?
Mechanism for an app to interact with the OS
  Similar to function calls
  Code securely implemented in the OS
Follows a predefined interface
  Called the “ABI” (Application Binary Interface)
  Functions referenced by predefined number
Example: a syscall triggers a “trap” instruction
  Special “syscall” instruction
  Forced memory exception

Examples
getpid()
  Returns the process’s ID
  Function 39 in 64-bit Linux, 20 in FreeBSD
brk()
  Sets or queries the current “top” of the heap (the program break)
  Function 12 in 64-bit Linux; in FreeBSD, break is 17 and sbrk is 69
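
The idea that a system call is "a function referenced by a predefined number" can be tried from user space. The sketch below assumes a Linux-like libc that provides syscall() and the SYS_getpid constant; raw_getpid is a hypothetical helper name, not part of the slides.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid: the system call number */
#include <unistd.h>        /* syscall(), getpid() */

/* Ask the kernel for our process ID by system call number,
 * bypassing the usual getpid() wrapper (hypothetical helper). */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

Calling raw_getpid() should give the same value as getpid(), since both end up in the same kernel handler.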

What is a function call?
Special form of jmp
  Execute a block of code at a given address
  Special instruction: call <fn-address>
  Why not just use jmp?
What do function calls need?
    int foo(int arg1, char * arg2);
  Location: foo()
  Arguments: arg1, arg2, …
  Return code: int
Must be implemented at hardware level
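
To make "execute a block of code at a given address" concrete, here is a minimal C sketch that calls a function through its address. The body given to foo and the helper call_at are assumptions made for illustration; the slides only give foo's signature.

```c
#include <stddef.h>

/* Hypothetical body for the foo() signature used on the slide. */
int foo(int arg1, char *arg2)
{
    return (arg2 != NULL) ? arg1 : -1;
}

/* Run whatever code lives at the address held in fn. Unlike a plain jmp,
 * a call also records where to resume, so the result can come back. */
int call_at(int (*fn)(int, char *), int arg1, char *arg2)
{
    return fn(arg1, arg2);   /* typically compiled to an indirect call */
}
```

The three needs listed above are all visible here: a location (fn), arguments (arg1, arg2), and a return code (the int result).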

Hardware implementation

    int foo(int arg1, char * arg2) {
        return 0;
    }

    0000000000000107 <foo>:
     107:  55                push   %rbp
     108:  48 89 e5          mov    %rsp,%rbp
     10b:  89 7d fc          mov    %edi,-0x4(%rbp)
     10e:  48 89 75 f0       mov    %rsi,-0x10(%rbp)
     112:  b8 00 00 00 00    mov    $0x0,%eax
     117:  c9                leaveq
     118:  c3                retq

Location
  Address of function + ret instruction
Arguments
  Passed in registers (which ones? And why those?)
Return code
  Stored in register: EAX
To understand this we need to know about assembly programming…

Assembly basics
What makes up assembly code?
  Instructions
    Architecture specific
  Operands
    Registers
    Memory (specified as an address)
    Immediates
  Conventions
    Rules of the road and/or behavior models

Registers
General purpose
  16 bit: AX, BX, CX, DX, SI, DI
  32 bit: EAX, EBX, ECX, EDX, ESI, EDI
  64 bit: RAX, RBX, RCX, RDX, RSI, RDI + others
Environmental
  RSP, RIP
  RBP = frame pointer, defines local scope
Special uses
  Calling conventions
    RAX == return code
    RDI, RSI, RDX, RCX… == ordered arguments
  Hardware defined
    Some instructions implicitly use specific registers
    RSI/RDI → string instructions
    RBP → leaveq

Memory
x86 provides complex memory addressing capabilities
(In AT&T syntax a memory address takes no $ prefix; $ marks an immediate value.)
  Immediate (absolute) addressing
      mov %rsi, 0xfff000
  Direct (register-indirect) addressing
      mov %rsi, (%rbp)
  Offset addressing
      mov %rsi, 0x8(%rax)
  Base + (Index * Scale) + Displacement
    A.K.A. SIB
    Occasionally seen, hardly ever used by hand
      movl %ebp, (%rdi,%rsi,4)
      Address = rdi + rsi * 4
  A more complicated example
      segment:disp(base, index, scale)
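
The Base + (Index * Scale) + Displacement form is the same arithmetic a C compiler performs for array indexing. The sketch below spells it out; effective_address and address_matches_indexing are hypothetical helper names, and segment handling is left out.

```c
#include <stdint.h>

/* Effective address for disp(base, index, scale):
 * base + index * scale + disp. */
uintptr_t effective_address(uintptr_t base, uintptr_t index,
                            unsigned scale, intptr_t disp)
{
    return base + index * scale + (uintptr_t)disp;
}

/* Check that &table[i] is exactly base + i * sizeof(element),
 * i.e. what (%rdi,%rsi,4) computes for a table of 4-byte entries. */
int address_matches_indexing(const int32_t *table, uintptr_t i)
{
    return effective_address((uintptr_t)table, i, sizeof table[0], 0)
           == (uintptr_t)&table[i];
}
```

For example, effective_address(0x1000, 2, 4, 8) is 0x1000 + 2 * 4 + 8 = 0x1010.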

8/16/32/64 bit operands
Programmer explicitly specifies operand length with the instruction suffix
Example: mov reg, reg
  8 bits:  movb %al, %bl
  16 bits: movw %ax, %bx
  32 bits: movl %eax, %ebx
  64 bits: movq %rax, %rbx
What about “movl %ebx, (%rdi)”?
  The suffix sizes the data (32 bits are stored); the address held in %rdi is still 64 bits wide
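
The effect of a 32-bit store through a 64-bit address can be imitated in portable C. This is only a sketch: store_low32 is a hypothetical helper, and the result shown assumes a little-endian machine such as x86-64.

```c
#include <stdint.h>
#include <string.h>

/* Overwrite only the first 4 bytes of an 8-byte slot, much like
 * "movl %ebx, (%rdi)" when %rdi points at a 64-bit value. */
uint64_t store_low32(uint64_t old, uint32_t value)
{
    uint64_t slot = old;
    memcpy(&slot, &value, sizeof value);  /* 4-byte store at the slot's address */
    return slot;                          /* the other 4 bytes are untouched */
}
```

On a little-endian machine, storing 0x11223344 into a slot holding all ones leaves 0xFFFFFFFF11223344: the upper half is unchanged.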

Function call implementation
We can now decode what is going on here

    int foo(int arg1, char * arg2) {
        return 0;
    }

    0000000000000107 <foo>:
     107:  55                push   %rbp
     108:  48 89 e5          mov    %rsp,%rbp
     10b:  89 7d fc          mov    %edi,-0x4(%rbp)
     10e:  48 89 75 f0       mov    %rsi,-0x10(%rbp)
     112:  b8 00 00 00 00    mov    $0x0,%eax
     117:  c9                leaveq
     118:  c3                retq

Location
  Address of function + ret instruction
Arguments
  Passed in registers (which ones? And why those?)
Return code
  Stored in register: EAX

OS development requires assembly programming
OS operations are not typically expressible in a higher-level language
  Examples: atomic operations, page table management, configuring segments, system calls(!)
How to mix assembly with OS code (in C)
  Assemble separately and link with C code
    .S files assembled with gas
  Inline w/ compiler support
    .c files compiled with gcc

Implementing assembler functions
C functions: location, args, return code
ASM functions: location only
  Programmer must implement everything else
    Arguments, context, return values
    Everything in foo() from before + function body
  Programmer takes the place of the compiler
    Must match calling conventions

Calling assembler functions
Programmer implements calling convention
  Behaves just like a regular function
Only need location
  Linker takes care of the rest
  .globl defines a global symbol

foo.S:

    .globl foo
    foo:
        push %rbp
        mov %rsp, %rbp
        …

main.c:

    extern int foo(int, char *);

    int main() {
        int x = foo(1, "test");
    }
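
As a rough, self-contained illustration of the same idea, the sketch below defines an assembly function with file-scope asm in a single C file instead of a separate .S file. It assumes GCC or Clang on x86-64 Linux (System V calling convention: arguments in %edi and %esi, result in %eax); asm_add is a hypothetical name, and a plain C fallback is used on other targets.

```c
#if defined(__x86_64__) && defined(__linux__)
/* The programmer plays compiler here: read the arguments from the
 * registers the calling convention assigns, leave the result in %eax. */
__asm__(
    ".text\n"
    ".globl asm_add\n"
    ".type asm_add, @function\n"
    "asm_add:\n"
    "    movl %edi, %eax\n"      /* first argument  */
    "    addl %esi, %eax\n"      /* second argument */
    "    ret\n"
    ".size asm_add, .-asm_add\n"
);
/* C only needs the location (symbol name) and the signature. */
int asm_add(int a, int b);
#else
/* Fallback so the sketch still builds on other targets. */
int asm_add(int a, int b) { return a + b; }
#endif
```

From C, asm_add(2, 3) is called like any other function; the linker resolves the symbol.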

Inline
OS only needs a few full-blown assembly functions
  Context switches, interrupt handling, a few others
Most of the time just need to execute a single instruction
  e.g. set a bit in a control register
GCC provides the ability to incorporate inline assembly instructions into a regular .c file
  Not a function
  Compiler handles argument marshaling

Overview
Inline assembly includes 2 components
  Assembly code
  Compiler directives for operand marshaling

    asm ( assembler template
        : output operands                /* optional */
        : input operands                 /* optional */
        : list of clobbered registers    /* optional */
        );

Inline assembly execution
Sequence of individual assembly instructions
  Can execute any hardware instruction
  Can reference any register or memory location
  Can reference specified variables in C code
3 stages of execution
  Load C variables into correct registers or memory
  Execute assembly instructions
  Copy register and memory contents into C variables

Specifying inline operands
How does the compiler copy C variables to/from registers?
  C variables and registers are explicitly linked in the asm specification
    Sections for input and output operands
  Compiler handles copying to and from variables before and after the assembly executes
  Assembly code references marshaled values (index of operand) instead of raw registers

Operand Codes
Wide range of operand codes (“constraints”) are available
  Input: “code”(c-variable)
  Output: “=code”(c-variable)

Explicit register codes
  a = %rax, %eax, %ax
  b = %rbx, %ebx, %bx
  c = %rcx, %ecx, %cx
  d = %rdx, %edx, %dx
  S = %rsi, %esi, %si
  D = %rdi, %edi, %di

Other operand codes
  r = any register
  q = a, b, c, d regs
  m = memory operand
  f = floating point reg
  i = immediate
  g = anything

And many more…
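
Explicit register codes matter most when an instruction uses fixed registers. One such case is cpuid, which takes its input in EAX/ECX and writes EAX, EBX, ECX and EDX. The sketch below assumes GCC or Clang on x86-64; cpu_vendor is a hypothetical helper, and on other targets it simply returns zeros.

```c
#include <stdint.h>
#include <string.h>

/* Writes the 12-character CPU vendor string plus a NUL into out[13] and
 * returns the highest basic CPUID leaf (0 in the non-x86 fallback). */
uint32_t cpu_vendor(char out[13])
{
    uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
#if defined(__x86_64__)
    /* "a", "b", "c", "d" pin each C variable to the register cpuid uses. */
    __asm__ volatile("cpuid"
                     : "=a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
                     : "a"(0u), "c"(0u));
#endif
    memcpy(out + 0, &ebx, 4);   /* vendor string order is EBX, EDX, ECX */
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return eax;
}
```

With a generic code such as "r" the compiler would be free to pick any register, which would not line up with what the instruction implicitly reads and writes.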

Register example
What does this do?

    int foo(int arg1, char * arg2) {
        int a=10, b;
        asm ("movl %1, %%ecx;\n"
             "movl %%ecx, %0;\n"
             : "=b"(b)    /* output */
             : "a"(a)     /* input */
             : );
        return 0;
    }

    0000000000000107 <foo>:
     107:  55                      push   %rbp
     108:  48 89 e5                mov    %rsp,%rbp
     10b:  53                      push   %rbx
     10c:  89 7d e4                mov    %edi,-0x1c(%rbp)
     10f:  48 89 75 d8             mov    %rsi,-0x28(%rbp)
     113:  c7 45 f0 0a 00 00 00    movl   $0xa,-0x10(%rbp)
     11a:  8b 45 f0                mov    -0x10(%rbp),%eax
     11d:  89 c1                   mov    %eax,%ecx
     11f:  89 cb                   mov    %ecx,%ebx
     121:  89 d8                   mov    %ebx,%eax
     123:  89 45 f4                mov    %eax,-0xc(%rbp)
     126:  b8 00 00 00 00          mov    $0x0,%eax
     12b:  5b                      pop    %rbx
     12c:  c9                      leaveq
     12d:  c3                      retq

Memory example
x86 can also use memory (SIB, etc.) operands
  “m” operand code

    int foo(int arg1, char * arg2) {
        int a=10, b;
        asm ("movl %1, %%ecx;\n"
             "movl %%ecx, %0;\n"
             : "=m"(b)
             : "m"(a)
             : );
        return 0;
    }

    0000000000000107 <foo>:
       0:  55                      push   %rbp
       1:  48 89 e5                mov    %rsp,%rbp
       4:  89 7d ec                mov    %edi,-0x14(%rbp)
       7:  48 89 75 e0             mov    %rsi,-0x20(%rbp)
       b:  c7 45 fc 0a 00 00 00    movl   $0xa,-0x4(%rbp)
      12:  8b 4d fc                mov    -0x4(%rbp),%ecx
      15:  89 4d f8                mov    %ecx,-0x8(%rbp)
      18:  b8 00 00 00 00          mov    $0x0,%eax
      1d:  c9                      leaveq
      1e:  c3                      retq

Input/output operands
Sometimes input and output operands are the same variable
  Transform input variable in some way

    int foo(int arg1, char * arg2) {
        int a=10, b=5;
        asm ("addl %1, %0;\n"
             : "=r"(b)
             : "m"(a), "0"(b)
             : );
        return 0;
    }

    0000000000000107 <foo>:
       0:  55                      push   %rbp
       1:  48 89 e5                mov    %rsp,%rbp
       4:  89 7d ec                mov    %edi,-0x14(%rbp)
       7:  48 89 75 e0             mov    %rsi,-0x20(%rbp)
       b:  c7 45 f8 0a 00 00 00    movl   $0xa,-0x8(%rbp)
      12:  c7 45 fc 05 00 00 00    movl   $0x5,-0x4(%rbp)
      19:  8b 45 fc                mov    -0x4(%rbp),%eax
      1c:  03 45 f8                add    -0x8(%rbp),%eax
      1f:  89 45 fc                mov    %eax,-0x4(%rbp)
      22:  b8 00 00 00 00          mov    $0x0,%eax
      27:  c9                      leaveq
      28:  c3                      retq

Input/output operands (2)
Input/output operands can also be specified with “+”

    int foo(int arg1, char * arg2) {
        int a=10, b=5;
        asm ("addl %1, %0;\n"
             : "+r"(b)
             : "m"(a)
             : );
        return 0;
    }

    0000000000000107 <foo>:
       0:  55                      push   %rbp
       1:  48 89 e5                mov    %rsp,%rbp
       4:  89 7d ec                mov    %edi,-0x14(%rbp)
       7:  48 89 75 e0             mov    %rsi,-0x20(%rbp)
       b:  c7 45 f8 0a 00 00 00    movl   $0xa,-0x8(%rbp)
      12:  c7 45 fc 05 00 00 00    movl   $0x5,-0x4(%rbp)
      19:  8b 45 fc                mov    -0x4(%rbp),%eax
      1c:  03 45 f8                add    -0x8(%rbp),%eax
      1f:  89 45 fc                mov    %eax,-0x4(%rbp)
      22:  b8 00 00 00 00          mov    $0x0,%eax
      27:  c9                      leaveq
      28:  c3                      retq
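
A version of the "+" form that returns its result makes the read-modify-write behavior easy to observe. This sketch assumes GCC or Clang on x86-64 and falls back to plain C elsewhere; add_in_place is a hypothetical name, and the "cc" clobber is added here because addl changes the flags.

```c
/* Adds a to b using one addl instruction. "+r" marks b as both read
 * and written, so the compiler loads it before and stores it after. */
int add_in_place(int b, int a)
{
#if defined(__x86_64__)
    __asm__("addl %1, %0"
            : "+r"(b)      /* %0: input and output, any register */
            : "m"(a)       /* %1: input, read from memory */
            : "cc");       /* addl modifies the condition flags */
#else
    b += a;                /* fallback for non-x86 targets */
#endif
    return b;
}
```

add_in_place(5, 10) mirrors the slide's values of b=5 and a=10.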

Clobbered list
We cheated earlier…

    int foo(int arg1, char * arg2) {
        int a=10, b;
        asm ("movl %1, %%ecx;\n"
             "movl %%ecx, %0;\n"
             : "=m"(b)
             : "m"(a)
             : );
        return 0;
    }

How does the compiler know to save/restore ECX?
  It doesn’t
  We must explicitly tell the compiler what registers have been implicitly messed with
    In this case ECX, but other instructions have implicit operands (CHECK THE MANUALS)
Second set of constraints to inline assembly
  Clobber list: operands not used as either input or output but still must be saved/restored by the compiler

Why clobber list?
Why do we need this?
  Compilers try to optimize performance
    Cache intermediate values and assume values don’t change
  Compiler cannot inspect ASM behavior outside scope of compiler
Clobber lists tell compiler:
  “You cannot trust the contents of these resources after this point”
  Or “Do not perform optimizations that span this block on these resources”

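Using clobber lists

    int foo(int arg1, char * arg2) {
        int a=10, b;
        asm ("movl %1, %%ecx;\n"
             "movl %%ecx, %0;\n"
             : "=m"(b)
             : "m"(a)
             : "ecx", "memory" );
        return 0;
    }

ECX is used implicitly so its value must be saved/restored
What about “memory”?
  It tells the compiler the assembly may read or write memory beyond the listed operands, so values cached in registers must be written back before the block and reloaded after it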
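
Wrapping the same statement in a function that returns b gives something that can be called and checked. A minimal sketch, assuming GCC or Clang on x86-64 with a plain C fallback elsewhere; copy_through_ecx is a hypothetical name.

```c
/* Copies a into b by way of ECX. Listing "ecx" as clobbered stops the
 * compiler from keeping a live value (or an operand address) in it. */
int copy_through_ecx(int a)
{
    int b = 0;
#if defined(__x86_64__)
    __asm__ volatile("movl %1, %%ecx\n\t"
                     "movl %%ecx, %0"
                     : "=m"(b)             /* %0: output in memory */
                     : "m"(a)              /* %1: input in memory  */
                     : "ecx", "memory");   /* scratch register + memory barrier */
#else
    b = a;                                 /* fallback for non-x86 targets */
#endif
    return b;
}
```

Without the "ecx" entry the code may still appear to work in a small function like this one, which is why the omission is easy to miss.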

Back to system calls
Function calls are not that special
  Just an abstraction built on top of hardware
System calls are basically function calls
  With a few minor changes
    Privilege elevation
    Constrained entry points
  Functions can call to any address
  System calls must go through “gates”

Implementing system calls
System calls are implemented as a single function call: syscall()
  read() and write() actually just invoke syscall()
What does syscall do?
  Enters into the kernel at a known location
  Elevates privilege
  Instantiates kernel-level environment
Once inside the kernel, an appropriate system call handler is invoked based on the arguments to syscall()

x86 and Linux
A number of different mechanisms for implementing syscall
  Legacy: int 0x80 (invokes a single interrupt handler)
  32 bit: SYSENTER (special instruction that sets up a preset kernel environment)
  64 bit: SYSCALL (64-bit version of SYSENTER)
All jump to a preconfigured execution environment inside kernel space
  Either interrupt context or OS-defined context
What about arguments?
  syscall(int syscall_num, args…)
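
The 64-bit SYSCALL path can be exercised directly with inline assembly. This is a sketch under stated assumptions: x86-64 Linux, GCC or Clang, system call number passed in RAX with the result returned in RAX, and RCX and R11 overwritten by the instruction. getpid_via_syscall_insn is a hypothetical name, and other targets fall back to getpid().

```c
#include <unistd.h>   /* getpid() for the fallback path */

/* Issue getpid with the raw SYSCALL instruction instead of a libc call. */
long getpid_via_syscall_insn(void)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile("syscall"
                     : "=a"(ret)                  /* result comes back in RAX */
                     : "a"(39L)                   /* 39 = getpid on 64-bit Linux */
                     : "rcx", "r11", "memory");   /* SYSCALL overwrites RCX and R11 */
    return ret;
#else
    return (long)getpid();
#endif
}
```

The clobber list does real work here: the hardware itself overwrites RCX and R11 as part of the instruction.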

Specific system calls
Each system call has a number assigned to it
  Index into a system call table
  Function pointers referencing each syscall handler
syscall(int syscall_num, args…)
  Sets up kernel environment
  Invokes syscall_table[syscall_num](args…);
  Returns to user space: resets environment to state before call
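
The table lookup described above can be sketched in ordinary C. This is a user-space toy, not kernel code: the handlers sys_add and sys_max, the two-argument signature, and the error value are all made up for illustration, and the privilege switch is not modeled.

```c
#include <stddef.h>

/* Every handler shares one signature so it can live in one table. */
typedef long (*syscall_handler)(long arg0, long arg1);

/* Hypothetical handlers standing in for real system calls. */
static long sys_add(long a, long b) { return a + b; }
static long sys_max(long a, long b) { return a > b ? a : b; }

/* The system call number is simply an index into this table. */
static const syscall_handler syscall_table[] = { sys_add, sys_max };

#define SKETCH_ENOSYS (-38L)   /* stand-in error for an unknown number */

/* Validate the number, then invoke syscall_table[num](args...). */
long dispatch_syscall(long num, long arg0, long arg1)
{
    size_t count = sizeof syscall_table / sizeof syscall_table[0];
    if (num < 0 || (size_t)num >= count)
        return SKETCH_ENOSYS;
    return syscall_table[num](arg0, arg1);
}
```

The bounds check matters: the number comes from unprivileged code, so it cannot be trusted as an index.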