Buffer Overflows Many of the following slides are based on those from

Slides:



Advertisements
Similar presentations
Recitation 4 Outline Buffer overflow –Practical skills for Lab 3 Code optimization –Strength reduction –Common sub-expression –Loop unrolling Reminders.
Advertisements

University of Washington Procedures and Stacks II The Hardware/Software Interface CSE351 Winter 2013.
Copyright 2014 – Noah Mendelsohn UM Macro Assembler Functions Noah Mendelsohn Tufts University Web:
Machine/Assembler Language Putting It All Together Noah Mendelsohn Tufts University Web:
David Brumley Carnegie Mellon University Credit: Some slides from Ed Schwartz.
Assembly תרגול 8 פונקציות והתקפת buffer.. Procedures (Functions) A procedure call involves passing both data and control from one part of the code to.
September 22, 2014 Pengju (Jimmy) Jin Section E
Recitation: Bomb Lab June 5, 2015 Dipayan Bhattacharya.
University of Washington CSE 351 : The Hardware/Software Interface Section 5 Structs as parameters, buffer overflows, and lab 3.
Security Exploiting Overflows. Introduction r See the following link for more info: operating-systems-and-applications-in-
CAP6135: Malware and Software Vulnerability Analysis Buffer Overflow : Example of Using GDB to Check Stack Memory Cliff Zou Spring 2011.
Assembly, Stacks, and Registers Kevin C. Su 9/26/2011.
Buffer Overflow Computer Organization II 1 © McQuain Buffer Overflows Many of the following slides are based on those from Complete Powerpoint.
University of Washington Today More on procedures, stack etc. Lab 2 due today!  We hope it was fun! What is a stack?  And how about a stack frame? 1.
University of Washington Today Happy Monday! HW2 due, how is Lab 3 going? Today we’ll go over:  Address space layout  Input buffers on the stack  Overflowing.
Machine-Level Programming III: Procedures Topics IA32 stack discipline Register-saving conventions Creating pointers to local variables CS 105 “Tour of.
Buffer Overflows Many of the following slides are based on those from
Carnegie Mellon 1 Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition Machine-Level Programming V: Buffer overflow Slides.
Reminder Bomb lab is due tomorrow! Attack lab is released tomorrow!!
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition Carnegie Mellon Instructor: San Skulrattanakulchai Machine-Level Programming.
CS 3214 Computer Systems Godmar Back Lecture 7. Announcements Stay tuned for Project 2 & Exercise 4 Project 1 due Sep 16 Auto-fail rule 1: –Need at least.
CAP6135: Malware and Software Vulnerability Analysis Buffer Overflow : Example of Using GDB to Check Stack Memory Cliff Zou Spring 2014.
Spring 2016Assembly Review Roadmap 1 car *c = malloc(sizeof(car)); c->miles = 100; c->gals = 17; float mpg = get_mpg(c); free(c); Car c = new Car(); c.setMiles(100);
Buffer Overflow Buffer overflows are possible because C doesn’t check array boundaries Buffer overflows are dangerous because buffers for user input are.
CS 177 Computer Security Lecture 9
Credits and Disclaimers
Recitation 5: Attack Lab
Machine-Level Programming I: Basics
Buffer Overflows Many of the following slides are based on those from
Credits and Disclaimers
The Hardware/Software Interface CSE351 Winter 2013
Machine-Level Programming V: Miscellaneous Topics
CSCE 212 Computer Architecture
Recitation: Bomb Lab _______________ 18 Sep 2017.
More GDB, Intro to x86 Calling Conventions, Control Flow, & Lab 2
Recitation: Attack Lab
Buffer Overflows Many of the following slides are based on those from
Recitation: Bomb Lab _______________ 06 Feb 2017.
Run-time organization
Introduction to Compilers Tim Teitelbaum
The Stack & Procedures CSE 351 Spring 2017
Machine-Level Programming III: Procedures
Recitation: Attack Lab
Buffer Overflows Many of the following slides are based on those from
CMSC 414 Computer and Network Security Lecture 21
Machine-Level Programming III: Procedures /18-213/14-513/15-513: Introduction to Computer Systems 7th Lecture, September 18, 2018.
Carnegie Mellon Machine-Level Programming III: Procedures : Introduction to Computer Systems October 22, 2015 Instructor: Rabi Mahapatra Authors:
Recitation: Attack Lab
CSE 351 Section 10 The END…Almost 3/7/12
Assembly Language Programming II: C Compiler Calling Sequences
The Stack & Procedures CSE 351 Winter 2018
Machine Level Representation of Programs (IV)
CAP6135: Malware and Software Vulnerability Analysis Buffer Overflow : Example of Using GDB to Check Stack Memory Cliff Zou Spring 2015.
Week 2: Buffer Overflow Part 1.
Roadmap C: Java: Assembly language: OS: Machine code: Computer system:
Ithaca College Machine-Level Programming VII: Procedures Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2017.
Machine-Level Representation of Programs (x86-64)
Ithaca College Machine-Level Programming VII: Procedures Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2017.
Instructors: Majd Sakr and Khaled Harras
CAP6135: Malware and Software Vulnerability Analysis Buffer Overflow : Example of Using GDB to Check Stack Memory Cliff Zou Spring 2016.
Computer Organization and Assembly Language
Ithaca College Machine-Level Programming VII: Procedures Comp 21000: Introduction to Computer Systems & Assembly Lang Spring 2017.
Credits and Disclaimers
Credits and Disclaimers
Credits and Disclaimers
CAP6135: Malware and Software Vulnerability Analysis Buffer Overflow : Example of Using GDB to Check Stack Memory Cliff Zou Spring 2013.
Exploitation Part 1.
Computer Architecture and System Programming Laboratory
Return-to-libc Attacks
Presentation transcript:

Buffer Overflows Many of the following slides are based on those from This presentation is based on my old notes for the buffer lab project and CMU's notes for Recitation 5, CS 15213 Fall 2015. Many of the following slides are based on those from Complete Powerpoint Lecture Notes for Computer Systems: A Programmer's Perspective (CS:APP) Randal E. Bryant and David R. O'Hallaron http://www.cs.cmu.edu/afs/cs/academic/class/15213-f15/www/schedule.html The book is used explicitly in CS 2505 and CS 3214 and as a reference in CS 2506.

Agenda Stack review Attack lab overview Phases 1-3: Buffer overflow attacks Phases 4-5: ROP attacks Phases YTC: Heap spraying attacks

x86-64 Registers %rax %r8 %rbx %r9 %rcx %r10 %rdx %r11 %rsi %r12 %rdi Return %rax %eax %r8 %r8d Arg 5 %rbx %ebx %r9 %r9d Arg 6 Arg 4 %rcx %ecx %r10 %r10d Arg 3 %rdx %edx %r11 %r11d Arg 2 %rsi %esi %r12 %r12d Arg 1 %rdi %edi %r13 %r13d It's important to note the roles played by %rax, %rsp, and the argument registers. AMD's documentation says the following about the frame pointer (rbp): The conventional use of %rbp as a frame pointer for the stack frame may be avoided by using %rsp (the stack pointer) to index into the stack frame. This technique saves two instructions in the prologue and epilogue and makes one additional general-purpose register (%rbp) available. It's not really necessary to mention this, but it does explain some differences between the behavior of 32-bit and 64-bit code... see Notes page for slide 13. Stack ptr %rsp %esp %r14 %r14d %rbp %ebp %r15 %r15d

x86-64: Register Conventions Arguments are passed in registers (default): %rdi, %rsi, %rdx, %rcx, %r8, %r9 Return value: %rax Callee-saved: %rbx, %r12, %r13, %r14, %rbp, %rsp Caller-saved: %rdi, %rsi, %rdx, %rcx, %r8, %r9, %rax, %r10, %r11 Stack pointer: %rsp Instruction pointer: %rip

x86-64: The Stack %rsp points to top of stack 0x7fffffffffff %rsp Grows downward towards lower memory addresses %rsp points to top of stack %rsp Top Bottom 0x7fffffffffff pushq %reg subtract 8 from %rsp, put val in %reg at (%rsp) popq %reg put val at (%rsp) in %reg, add 8 to %rsp All of my stack diagrams follow the following conventions: - each "row" represents 8 bytes of data - if I show individual bytes, addresses within a "row" increase from left to right

x86-64: Stack Frames Every function call has its own stack frame. Think of a frame as a workspace for each call. - local variables - callee & caller-saved registers - optional arguments for a function call I put minor emphasis on the issues of saved registers and the argument build... I doubt either arises in this assignment. I do make the observation that storing the return address on the stack makes it potentially vulnerable.

x86-64: Function Call Setup and Return Going in… Caller: - allocates stack frame large enough for saved registers, optional arguments - save any caller-saved registers in frame - save any arguments beyond the first six (in reverse order) in frame - call foo: push %rip to stack, jump to label foo Callee: - push any callee-saved registers, decrease %rsp to make room for new frame …coming back Callee: - increase %rsp - pop any callee-saved registers (in reverse order) - execute ret: pop %rip

x86-64: Function Return Two vulnerabilties: max: . . . leave # %rsp <- %rbp ret # pop %rbp # pop %rip . . . older frame ptr ?? return-to address rbp old frame pointer rbp - 4 rsp caller's frame max's frame Two vulnerabilties: - if old frame pointer is corrupted, the stack is broken - if return-to address is corrupted, the function will "return" to the wrong instruction: ret: pop %rip

Attack Lab Overview: Phases 1-3 Exploit x86-64 by overwriting the stack Overflow a buffer, overwrite return address Execute injected code (code placed into the victim's buffer on the stack) Key Advice Brush up on your x86-64 conventions! Use objdump –d to determine relevant offsets Use GDB to determine stack addresses

Buffer on the Stack 0xAABBCCDDEEFFGGHH 0xFFFFFFFFFFFFFFFF Suppose that a function creates a buffer (array of some sort, char is particularly inviting) in its stack frame: 0xAABBCCDDEEFFGGHH 0xFFFFFFFFFFFFFFFF return address frame for function F() buffer buf

Buffer on the Stack 0xAABBCCDDEEFFGGHH 0xFFFFFFFFFFFFFFFF And, suppose the running program gives you the opportunity to write data into the buffer: 0xAABBCCDDEEFFGGHH 0xFFFFFFFFFFFFFFFF … and nothing prevents you from writing beyond the end of the buffer… ... and here return address So, you are passed a pointer to the beginning of the buffer… … and you write data into the buffer… You write here ... buffer buf

Buffer Overflows 0xAABBCCDDEEFFGGHH 0xFFFFFFFFFFFFFFFF Exploit C String library vulnerabilities to overwrite important info on stack When this function returns, where will it begin executing? ret:pop %rip What if we want to inject new code to execute? 0xAABBCCDDEEFFGGHH 0xFFFFFFFFFFFFFFFF old return address buf

Exploiting a Buffer Overflow Step 1: Put malicious machine code bytes into the caller's buffer: 0xAABBCCDDEEFFGGHH 0xFFFFFFFFFFFFFFFF Malicious code is written to buffer return address buf

Exploiting a Buffer Overflow Step 2: How do we trick the running program to execute our code? Overwrite the old return address with the address where our malicious code begins. address of buffer Padding bytes are written to "push" the return address we want to the correct position on the stack Malicious code is written to buffer new return address Now, when this function completes, it will: ret:pop %rip So, it will "return" to our code. buf

Std Library Enables the Attack One might be tempted to say that gets() is no longer relevant, but it's still available and it was widely used in legacy code that has not been rewritten to fix the issue. Implementation of Unix function gets No way to specify limit on number of characters to read // Get string from stdin char *gets(char *dest) { int c = getc(); char *p = dest; while (c != EOF && c != '\n') { *p++ = c; c = getc(); } *p = '\0'; return dest; So bad it's officially deprecated, but it has been used widely, is still used, and gcc still supports it.

Vulnerable Buffer Code int main() { printf("Type a string:"); echo(); return 0; } // Echo Line void echo() { char buf[4]; // Dangerously small! // Stored on the stack!! gets(buf); // Call to oblivious fn puts(buf); }

Buffer Overflow Executions When compiled to 32-bit code, I get a segfault with a string of length 5 (not counting the terminator). That is partly due to the difference in frame layouts, and partly due to the fact that the saved frame pointer ebp is more critical in that context. CentOS > ./bufdemo Type a string:abcd abcd CentOS > ./bufdemo Type a string:abcdefghijklmnopqrst abcdefghijklmnopqrst Type a string:abcdefghijklmnopqrstuvwx abcdefghijklmnopqrstuvwx Segmentation fault (core dumped)

Buffer Overflow Mechanics The next slide explains just how I know the layout of the frame for echo(). Remind them that %rip is the program counter; so the backup of it stores the address of the instruction to which we want to return when echo() is finished. // Echo Line void echo() { char buf[4]; // Way too small! gets(buf); puts(buf); } echo: pushq %rbp # setup stack frame movq %rsp, %rbp subq $16, %rsp leaq -16(%rbp), %rax # calculate buf movq %rax, %rdi # set param for buf movl $0, %eax call gets # call gets . . . buf %rbp Saved %rip Saved %rbp Stack Frame for main frame for echo

Buffer Overflow Mechanics I used gdb, with a breakpoint set at echo() and the command "info f" to examine the stack layout. I also printed the value of rbp to verify the beginning of echo()'s frame, and the address of buf[0] to determine just where buf begins on the stack. buf: %rbp: Saved %rip Saved %rbp Stack Frame for main 00007fffffffdff0 00007fffffffdfe8 00007fffffffdfe0 00007fffffffdfd0 (gdb) info f Stack level 0, frame at 0x7fffffffdff0: rip = 0x4005ec in echo (echo.c:4); saved rip 0x4005dd called by frame at 0x7fffffffe000 source language c. Arglist at 0x7fffffffdfe0, args: Locals at 0x7fffffffdfe0, Previous frame's sp is 0x7fffffffdff0 Saved registers: rbp at 0x7fffffffdfe0, rip at 0x7fffffffdfe8 (gdb) p/a $rbp $6 = 0x7fffffffdfe0 (gdb) p/a &buf[0] $7 = 0x7fffffffdfd0

Buffer Overflow Example #1 Nothing was corrupted... no issues can arise. buf: %rbp: Saved %rip Saved %rbp Stack Frame for main 00007fffffffdff0 00007fffffffdfe8 00007fffffffdfe0 00007fffffffdfd0 Before call to gets() Entered string was "abc" buf: %rbp: Saved %rip Saved %rbp Stack Frame for main 'a' 'b' 'c' '\0' 00007fffffffdff0 00007fffffffdfe8 00007fffffffdfe0 00007fffffffdfd0 After call to gets() No problem

Buffer Overflow Example #2 Here, the saved value of rbp is corrupted, but no runtime error occurs since the stack manipulations, when echo() returns, do not depend on that. buf: %rbp: Saved %rip Saved %rbp Stack Frame for main 00007fffffffdff0 00007fffffffdfe8 00007fffffffdfe0 00007fffffffdfd0 Before call to gets() Entered string was "abcd...qrst" buf: %rbp: Saved %rip Stack Frame for main 'a' 'b' 'c' 'd' 00007fffffffdff0 00007fffffffdfe8 00007fffffffdfe0 00007fffffffdfd0 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm' 'n' 'o' 'p' 'q' 'r' 's' 't' '\0' After call to gets() No problem??

Buffer Overflow Example #3 Now, the return address has been corrupted... in this case, that results in a segfault when echo() returns. QTP: Which byte of the return address was overwritten, the high byte or the low byte? Why? Hint: how are multi-byte values stored? (little-endian) buf: %rbp: Saved %rip Saved %rbp Stack Frame for main 00007fffffffdff0 00007fffffffdfe8 00007fffffffdfe0 00007fffffffdfd0 Before call to gets() Entered string was "abcd...qrstuvwxyz" buf: %rbp: Stack Frame for main 'a' 'b' 'c' 'd' 00007fffffffdff0 00007fffffffdfe8 00007fffffffdfe0 00007fffffffdfd0 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm' 'n' 'o' 'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' After call to gets() 'z' '\0' Problem!!

Demonstration: Generating Byte Codes One could also write the exploit in C (perhaps)... The code used here serves no particular logical purpose. Use gcc and objdump to generate machine code bytes for assembly instruction sequences: CentOS > cat exploit.s movq %rbx, %rdx addq $5, %rdx movq %rdx, %rax retq CentOS > gcc -c exploit.s CentOS > objdump -d exploit.o exploit.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <.text>: 0: 48 89 da mov %rbx,%rdx 3: 48 83 c2 05 add $0x5,%rdx 7: 48 89 d0 mov %rdx,%rax a: c3 retq

Demonstration: Trying the Exploit exploit.hex is an edited version of the objdump output shown on the previous slide. See the assignment specification for the requirements. hex2raw is necessary because we need to put the actual binary values of the machine code bytes into the buffer, not the characters shown here. FWIW, 0x48 is the code for "H" and none of the other bytes correspond to printable characters. The students should use the –q switch (after rtarget) if they want to avoid contacting the server; it's necessary in any case if they want to run this on their own CentOS machines. Since my exploit string was essentially crap, it fails to solve the problem. Edit the objdump output as shown. hex2raw will generate an appropriate binary version of that for input to ctarget: CentOS > cat exploit.hex 48 89 da /* mov %rbx,%rdx */ 48 83 c2 05 /* add $0x5,%rdx */ 48 89 d0 /* mov %rdx,%rax */ c3 /* retq */ CentOS > hex2raw -i exploit.hex H??H??H??? CentOS > hex2raw -i exploit.hex | rtarget -q Cookie: 0x754e7ddd Type string:No exploit. Getbuf returned 0x1 Normal return (The strange output from hex2raw is due to the presence of bytes in the machine code that correspond to unprintable ASCII characters.)

Modern Systems exploit.hex is an edited version of the objdump output shown on the previous slide. See the assignment specification for the requirements. hex2raw is necessary because we need to put the actual binary values of the machine code bytes into the buffer, not the characters shown here. FWIW, 0x48 is the code for "H" and none of the other bytes correspond to printable characters. The students should use the –q switch (after rtarget) if they want to avoid contacting the server; it's necessary in any case if they want to run this on their own CentOS machines. Since my exploit string was essentially crap, it fails to solve the problem. Modern systems will defeat the sort of exploit shown here because instructions cannot be fetched from the range of addresses holding the stack. So… can we trick the victim of our buffer overflow into doing what we want by fetching instructions legally? We know the victim can legally fetch instructions that are part of the victim's own executable code…

Attack Lab Overview: Phases 4-5 Utilize return-oriented programming to execute arbitrary code - Useful when stack is non-executable or randomized Find gadgets, string together to form injected code Key Advice - Use mixture of pop & mov instructions + constants to perform specific task

ROP Example Draw a stack diagram and ROP exploit to: - pop the value 0xBBBBBBBB into %rbx, and - move it into %rax void foo(char *input){ char buf[32]; ... strcpy (buf, input); return; } Gadgets: address1: pop %rbx; ret address2: mov %rbx, %rax; ret Inspired by content created by Professor David Brumley

ROP Example: Solution More ROP entries... Gadgets: Consider what happens when echo() completes: - rsp winds up pointing to the location of the old return address - ret pops the stack, putting Address 1 into rip, and leaving rsp at the location of the constant - the first gadget is "called", pops the constant into rbx, and leaves rsp pointing at Address 2 - the first gadget executes ret, which pops the stack, putting Address 2 into rip - the second gadget is "called", copying rbx into rax - the second gadget executes ret, therefore we must be concerned about what's above Address 2... Gadgets: Address 1: pop %rbx; ret Address 2: mov %rbx, %rax; ret More ROP entries... Address 2 0x00000000BBBBBBBB Address 1 0xFFFFFFFFFFFFFFFF old return address was here void foo(char *input){ char buf[32]; ... strcpy (buf, input); return; } buf

ROP Demonstration: Looking for Gadgets See the beginning of my gadget farm on the following slide. Make the point that one can "subset" part(s) of longer machine instructions when forming a gadget. Show the students the encoding tables in the assignment specification. How to identify useful gadgets in your code: - a gadget must come from the supplied gadget "farm" - a gadget must end with the byte c3 (corresponding to retq) - pop and mov instructions are useful - so are carefully-chosen constants You must use objdump to examine farm.o for candidate gadgets. Then, you must determine the correct virtual addresses for those gadgets within rtarget. Use objdump to examine rtarget...

ROP Demonstration: Looking for Gadgets Creative "subsetting" of the machine code bytes may be useful: 0000000000000000 <start_farm>: 0: b8 01 00 00 00 mov $0x1,%eax 5: c3 retq 0000000000000006 <getval_168>: 6: b8 48 89 c7 94 mov $0x94c78948,%eax b: c3 retq 000000000000000c <getval_448>: c: b8 7e 58 89 c7 mov $0xc789587e,%eax 11: c3 retq 0000000000000012 <getval_387>: 12: b8 48 89 c7 c3 mov $0xc3c78948,%eax 17: c3 retq 0000000000000018 <getval_247>: 18: b8 18 90 90 90 mov $0x90909018,%eax 1d: c3 retq 000000000000001e <addval_452>: 1e: 8d 87 48 89 c7 c3 lea -0x3c3876b8(%rdi),%eax 24: c3 retq

ROP Demonstration: Looking for Gadgets Address 1: pop %rbx; ret Address 2: mov %rbx, %rax; ret 5b c3 48 89 d8 c3 4018d2: c7 07 5b c3 4f 58 4018d8: c3 . . . 4018e4: 48 89 d8 4018e7: c3 QTP: What's the address of the first gadget? Why?

Tools objdump –d ./hex2raw gdb gcc –c View byte code and assembly instructions, determine stack offsets ./hex2raw Pass raw ASCII strings to targets gdb Step through execution, determine stack addresses gcc –c Generate object file from assembly language file

More Tips Draw stack diagrams Be careful of byte ordering (little endian) READ the assignment specification! A video presentation of CMU's version is available as Recitation 5 at the link below: https://scs.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=60c65748-2026-463f-8c57-134fd6661cdf The video may be useful, so be sure to mention it. But... most of the lecture videos there correspond to our CS 3214, not CS 2506.