1 Machine-Level Programming V: Advanced Topics
Andrew Case
Slides adapted from Jinyang Li, Randy Bryant & Dave O’Hallaron
2 Today
- Structures and Unions
- Memory Layout
- Buffer Overflow
  - Vulnerability
  - Protection
3 Structure Allocation
    struct rec {
        int a[3];
        int i;
        struct rec *n;
    };
Memory layout (IA32): a at offset 0, i at offset 12, n at offset 16
- Struct members are laid out contiguously in memory
- The offset of each struct member is determined at compile time
- Members may be of different types
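The compile-time offsets can be checked directly with the standard `offsetof` macro. The following is a minimal sketch, not part of the original slides; the function name `print_rec_layout` is made up for illustration, and the sketch assumes a 4-byte `int`.

```c
#include <stddef.h>
#include <stdio.h>

struct rec {
    int a[3];
    int i;
    struct rec *n;
};

/* Print the compile-time offset of each member and the total size.
   print_rec_layout is a hypothetical helper, not from the slides. */
void print_rec_layout(void) {
    printf("a: offset %zu\n", offsetof(struct rec, a));
    printf("i: offset %zu\n", offsetof(struct rec, i));
    printf("n: offset %zu\n", offsetof(struct rec, n));
    printf("sizeof(struct rec) = %zu\n", sizeof(struct rec));
}
```

Calling `print_rec_layout()` from `main` prints the layout for whatever target the code was compiled for. On a 64-bit build the pointer `n` is typically 8 bytes, so the total size is usually 24 rather than the 20 bytes of IA32, while the member offsets stay the same.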
4 Structure Access
    struct rec {
        int a[3];
        int i;
        struct rec *n;
    };
Layout: a at r+0, i at r+12, n at r+16
Accessing a structure member
- Pointer indicates first byte of structure
- Access elements with offsets
    void set_i(struct rec *r, int val)
    {
        r->i = val;
    }
IA32 Assembly
    # %edx = val, %eax = r
    movl %edx, 12(%eax)    # Mem[r+12] = val
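To make the "pointer plus offset" idea concrete, here is a small sketch that writes the same field two ways: through `r->i`, and through explicit byte arithmetic on the structure's base address. `set_i_by_offset` is a hypothetical helper added for illustration; it is one way of spelling out what the `movl %edx, 12(%eax)` instruction does, not code from the slides.

```c
#include <stddef.h>

struct rec {
    int a[3];
    int i;
    struct rec *n;
};

/* The accessor from the slide: the compiler turns r->i into a fixed offset from r. */
void set_i(struct rec *r, int val) {
    r->i = val;
}

/* Hypothetical equivalent that spells out the address arithmetic. */
void set_i_by_offset(struct rec *r, int val) {
    char *base = (char *)r;                                 /* first byte of the structure */
    int *field = (int *)(base + offsetof(struct rec, i));   /* r + 12 when int is 4 bytes */
    *field = val;                                           /* Mem[r + offset of i] = val */
}
```

Both functions leave `a` and `n` untouched and store `val` in the same four bytes.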
5 Following Linked List
    struct rec {
        int a[3];
        int i;
        struct rec *n;
    };
Layout: a at offset 0, i at offset 12, n at offset 16; r->i selects element i of a
    void set_all_values(struct rec *r, int val)
    {
        while (r) {
            int i = r->i;
            r->a[i] = val;
            r = r->n;
        }
    }

    # %edx = r, %ecx = val
    .L17:                            # loop:
        movl  12(%edx), %eax         #   r->i
        movl  %ecx, (%edx,%eax,4)    #   r->a[i] = val
        movl  16(%edx), %edx         #   r = r->n
        testl %edx, %edx             #   Test r
        jne   .L17                   #   If != 0 goto loop
6 Structures & Alignment
    struct S1 {
        char c;
        int i[2];
        double v;
    } *p;
Unaligned data
    c at p, i[0] at p+1, i[1] at p+5, v at p+9, end at p+17
Aligned data
    c at p+0, 3 bytes padding, i[0] at p+4, i[1] at p+8, 4 bytes padding, v at p+16, end at p+24
    (i starts at a multiple of 4, v at a multiple of 8)
- For a primitive data type of K bytes, the address must be a multiple of K
- Important for memory management (paging, etc.): an aligned primitive never spans two pages
- Done by the compiler
- Can use space inefficiently
7 Satisfying Alignment with Structures
    struct S1 {
        char c;
        int i[2];
        double v;
    } *p;
Alignment requirements:
  1. Each element of a struct must be aligned
  2. The initial address and the structure length must be multiples of the biggest alignment among the struct’s elements
Layout: c at p+0, 3 bytes padding, i[0] at p+4 (multiple of 4), i[1] at p+8, 4 bytes padding, v at p+16 (multiple of 8), end at p+24
Biggest alignment of elements: 8
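The two alignment rules can be observed with `offsetof`, `sizeof`, and C11 `_Alignof`. This is a rough sketch; `print_s1_layout` is a name chosen here for illustration. The exact numbers depend on the ABI. Assuming 8-byte alignment for `double`, as on the slide and on typical x86-64 Linux builds, the output matches the p+0 / p+4 / p+16 / p+24 picture.

```c
#include <stddef.h>
#include <stdio.h>

struct S1 {
    char c;
    int i[2];
    double v;
};

/* Report each member's offset, plus the size and alignment of the whole struct.
   print_s1_layout is a hypothetical helper, not from the slides. */
void print_s1_layout(void) {
    printf("c: offset %zu\n", offsetof(struct S1, c));
    printf("i: offset %zu\n", offsetof(struct S1, i));
    printf("v: offset %zu\n", offsetof(struct S1, v));
    printf("sizeof = %zu, alignof = %zu\n",
           sizeof(struct S1), _Alignof(struct S1));
}
```

Whatever the platform, each offset is a multiple of that member's alignment, and the struct size is a multiple of the struct's own alignment, which is what lets arrays of `struct S1` stay aligned.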
8 Saving Space
- Define a struct to put large data types first
- Note: may not be worth optimizing
    struct S4 {
        char c;
        int i;
        char d;
    } *p;
  Layout: c, 3 bytes padding, i, d, 3 bytes padding (12 bytes total)
    struct S5 {
        int i;
        char c;
        char d;
    } *p;
  Layout: i, c, d, 2 bytes padding (8 bytes total)
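The effect of member ordering shows up directly in `sizeof`. A minimal sketch, assuming a 4-byte `int` with 4-byte alignment; `bytes_saved` and `print_sizes` are illustrative names, not from the slides.

```c
#include <stddef.h>
#include <stdio.h>

struct S4 {   /* small, large, small: padding after c and after d */
    char c;
    int i;
    char d;
};

struct S5 {   /* large first: the two chars share one padded word */
    int i;
    char c;
    char d;
};

/* Bytes saved by putting the largest member first (hypothetical helper). */
size_t bytes_saved(void) {
    return sizeof(struct S4) - sizeof(struct S5);
}

void print_sizes(void) {
    printf("sizeof(struct S4) = %zu\n", sizeof(struct S4));
    printf("sizeof(struct S5) = %zu\n", sizeof(struct S5));
    printf("saved = %zu bytes\n", bytes_saved());
}
```

The saving is per object, so it matters mainly for large arrays of such structs, which is one reason the slide notes that the optimization may not be worth it.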
9 Union Allocation
- Unions can store different kinds of data in one memory allocation (but only one type at a time)
- Allocated according to the largest element
    union U1 {
        char c;
        int i[2];
        double v;
    } *up;
  Layout: c, i[0], and v all start at up+0; i[1] is at up+4; the union ends at up+8
    struct S1 {
        char c;
        int i[2];
        double v;
    } *sp;
  Layout: c at sp+0, 3 bytes padding, i[0] at sp+4, i[1] at sp+8, 4 bytes padding, v at sp+16, end at sp+24
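A short sketch can confirm that union members overlap while struct members do not. The helper names `union_members_overlap` and `print_union_vs_struct` are assumptions made for this example.

```c
#include <stdio.h>

union U1 {
    char c;
    int i[2];
    double v;
};

struct S1 {
    char c;
    int i[2];
    double v;
};

/* Return 1 if every union member starts at the union's own address. */
int union_members_overlap(void) {
    union U1 u;
    return (void *)&u.c == (void *)&u
        && (void *)&u.i == (void *)&u
        && (void *)&u.v == (void *)&u;
}

/* Compare the storage needed by the union and by the struct with the same members. */
void print_union_vs_struct(void) {
    printf("sizeof(union U1)  = %zu\n", sizeof(union U1));
    printf("sizeof(struct S1) = %zu\n", sizeof(struct S1));
}
```

With a 4-byte `int` and an 8-byte `double`, the union typically needs 8 bytes, the size of its largest member, while the struct needs room for every member plus padding.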
10 Byte Ordering Example
    union {
        unsigned char c[8];
        unsigned short s[4];
        unsigned int i[2];
        unsigned long l[1];
    } dw;
Diagrams (not reproduced): how c, s, i, and l overlap on 32-bit and 64-bit machines
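One way to use such a union is to fill the byte array with recognizable values and then read the same storage back through the wider members. The sketch below gives the union a tag (`dword`) so it can be passed to functions; that tag and all function names are assumptions added here. It also assumes a 2-byte `short` and a 4-byte `int`.

```c
#include <stdio.h>

union dword {
    unsigned char c[8];
    unsigned short s[4];
    unsigned int i[2];
    unsigned long l[1];
};

/* Fill the bytes with 0xf0..0xf7 so each byte's position is visible in wider reads. */
void fill_bytes(union dword *dw) {
    for (int j = 0; j < 8; j++)
        dw->c[j] = (unsigned char)(0xf0 + j);
}

/* Return 1 on a little-endian machine, 0 on a big-endian one. */
int is_little_endian(void) {
    union dword dw;
    fill_bytes(&dw);
    /* On little-endian, c[0] is the low-order byte of s[0]. */
    return (dw.s[0] & 0xff) == 0xf0;
}

/* Show the same eight bytes viewed as short, int, and long. */
void print_views(void) {
    union dword dw;
    fill_bytes(&dw);
    printf("s[0] = 0x%x\n", (unsigned)dw.s[0]);
    printf("i[0] = 0x%x\n", dw.i[0]);
    printf("l[0] = 0x%lx\n", dw.l[0]);
}
```

On a little-endian machine such as x86, `s[0]` reads as 0xf1f0 and `i[0]` as 0xf3f2f1f0, because the lowest-addressed byte is the least significant. The width of `l[0]` depends on whether `long` is 4 or 8 bytes, which is the 32-bit versus 64-bit difference the slide points at.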
11 Today
- Structures and Unions
- Memory Layout
- Buffer Overflow
  - Vulnerability
  - Protection
12 IA32 Linux Memory Layout
- Stack
  - Runtime stack (8MB limit)
  - E.g., local variables
- Heap
  - Dynamically allocated storage
  - Used when calling malloc(), calloc(), new()
- Data
  - Statically allocated data
  - E.g., global variables
- Text
  - Executable machine instructions
  - Read-only
Diagram, from high to low addresses: Stack (0xFF******, 8MB), Heap, Data, Text (0x08******), down to 0x00******
13 Memory Allocation Example
    #include <stdlib.h>

    char big_array[1<<24];   /* 16 MB */
    char huge_array[1<<28];  /* 256 MB */

    int beyond;
    char *p1, *p2, *p3, *p4;

    int useless() { return 0; }

    int main()
    {
        p1 = malloc(1 << 28);  /* 256 MB */
        p2 = malloc(1 << 8);   /* 256 B */
        p3 = malloc(1 << 28);  /* 256 MB */
        p4 = malloc(1 << 8);   /* 256 B */
        /* Some print statements... */
    }
Where does everything go?
Diagram: address space from FF (top) to 00 (bottom), with Stack, Heap, Data, and Text (starting at 08)
14 IA32 Example Addresses
address range ~2^32
    $esp            0xffffbcd0
    p3              0x
    p1              0x
    p4              0x1904a110
    p2              0x1904a008
    &p2             0x
    &beyond         0x
    big_array       0x
    huge_array      0x
    main()          0x080483c6
    useless()       0x
    final malloc()  0x006be166
malloc() is dynamically linked; its address is determined at runtime
Diagram: address space from FF (top) to 00 (bottom), with Stack, Heap, Data, and Text
15 x86-64 Example Addresses
address range ~2^47
    $rsp            0x00007ffffff8d1f8
    p3              0x00002aaabaadd010
    p1              0x00002aaaaaadc010
    p4              0x
    p2              0x
    &p2             0x a60
    &beyond         0x a44
    big_array       0x a80
    huge_array      0x a50
    main()          0x
    useless()       0x
    final malloc()  0x ae6a170
malloc() is dynamically linked; its address is determined at runtime
Diagram (not drawn to scale): Stack, Heap, Data, and Text
16 Today
- Structures and Unions
- Memory Layout
- Buffer Overflow
  - Vulnerability
  - Protection
17 Internet Worm
- November 1988: the Internet Worm attacks thousands of Internet hosts
- How did it happen?
18 String Library Code
- Implementation of Unix function gets()
  - No way to specify a limit on the number of characters to read
    /* Get string from stdin */
    char *gets(char *dest)
    {
        int c = getchar();
        char *p = dest;
        while (c != EOF && c != '\n') {
            *p++ = c;
            c = getchar();
        }
        *p = '\0';
        return dest;
    }
- Similar problems with other library functions
  - strcpy, strcat: copy strings of arbitrary length
  - scanf, fscanf, sscanf, when given %s conversion specification
19 Vulnerable Buffer Code
    /* Echo Line */
    void echo()
    {
        char buf[4];  /* Way too small! */
        gets(buf);
        puts(buf);
    }

    void call_echo()
    {
        echo();
    }

    unix>./bufdemo
    Type a string:
    unix>./bufdemo
    Type a string:
    Segmentation Fault
    unix>./bufdemo
    Type a string: ABC
    Segmentation Fault
20 Buffer Overflow Disassembly
echo:
    80485c5:  55              push   %ebp
    80485c6:  89 e5           mov    %esp,%ebp
    80485c8:  53              push   %ebx
    80485c9:  83 ec 14        sub    $20,%esp
    80485cc:  8d 5d f8        lea    -8(%ebp),%ebx
    80485cf:  89 1c 24        mov    %ebx,(%esp)
    80485d2:  e8 9e ff ff ff  call
    d7:       89 1c 24        mov    %ebx,(%esp)
    80485da:  e8 05 fe ff ff  call   80483e
    df:       83 c4 14        add    $20,%esp
    80485e2:  5b              pop    %ebx
    80485e3:  5d              pop    %ebp
    80485e4:  c3              ret
call_echo:
    80485eb:  e8 d5 ff ff ff  call   80485c
    f0:       c9              leave
    80485f1:  c3              ret
21 Buffer Overflow Stack
    /* Echo Line */
    void echo()
    {
        char buf[4];  /* Way too small! */
        gets(buf);
        puts(buf);
    }

    echo:
        pushl %ebp              # Save %ebp on stack
        movl  %esp, %ebp
        pushl %ebx              # Save %ebx
        subl  $20, %esp         # Allocate stack space
        leal  -8(%ebp), %ebx    # Compute buf as %ebp-8
        movl  %ebx, (%esp)      # Push buf on stack
        call  gets              # Call gets
        ...
Stack before call to gets, from high to low addresses:
    Stack frame for main
    Return address
    Saved %ebp          <- %ebp
    Saved %ebx
    [3][2][1][0] buf
    (rest of stack frame for echo)
22 Buffer Overflow Stack Example
    unix> gdb bufdemo
    (gdb) break echo
    Breakpoint 1 at 0x80485c9
    (gdb) run
    Breakpoint 1, 0x80485c9 in echo ()
    (gdb) print /x $ebp
    $1 = 0xffffd678
    (gdb) print /x *(unsigned *)$ebp
    $2 = 0xffffd688
    (gdb) print /x *((unsigned *)$ebp + 1)
    $3 = 0x80485f0
Stack before call to gets, from high to low addresses:
    Stack frame for main    (main's %ebp value: 0xffffd688)
    Return address          0x80485f0
    Saved %ebp              0xffffd688   <- %ebp = 0xffffd678
    Saved %ebx
    buf                     xx
    (rest of stack frame for echo)
Return point in call_echo:
    80485eb:  call 80485c
    f0:       leave
23 Buffer Overflow Example #1
Overflow buf, and corrupt %ebx
Input \0
Stack diagrams (not reproduced): the frames for main and echo before the call to gets and after the input is stored, with %ebp = 0xffffd678, saved %ebp = 0xffffd688, then saved %ebx and buf below
24 Buffer Overflow Example #2
Input \0
Stack diagrams (not reproduced): the frames for main and echo before the call to gets and after the input is stored, with %ebp = 0xffffd678, saved %ebp = 0xffffd688, then saved %ebx and buf below
    echo:
        ...
        call
        leave    # Reset %ebp to corrupted value
        ret
25 Buffer Overflow Example #3
Input ABC\0
Stack diagrams (not reproduced): the frames for main and echo before the call to gets and after the input is stored, with %ebp = 0xffffd678, saved %ebp = 0xffffd688, then saved %ebx and buf below
    echo:
        ...
        call
        leave    # Reset %ebp to corrupted value
        ret
26 Malicious Use of Buffer Overflow
    void foo() {
        bar();
        ...
    }

    int bar() {
        char buf[64];
        gets(buf);
        ...
        return ...;
    }
Stack after call to gets(), from high to low addresses:
    foo stack frame
    return address A, now overwritten with B    (data written by gets())
    pad                                         (data written by gets())
    exploit code, starting at address B         (data written by gets())
    (rest of bar stack frame)
- Input string contains byte representation of executable code
- Overwrite return address A with address of buffer B
- When bar() executes ret, it will jump to the exploit code
27 Exploits Based on Buffer Overflows
- Buffer overflow bugs allow remote machines to execute arbitrary code on victim machines
- Internet worm
  - Early versions of the finger server (fingerd) used gets() to read the argument sent by the client:
      finger
  - Worm attacked fingerd server by sending phony argument:
      finger “exploit-code padding new-return-address”
  - Exploit code: executed a root shell on the victim machine with a direct TCP connection to the attacker
28 Code Red Exploit Code
- Exploited bug in Microsoft IIS web server
- Spread self
  - Generate random IP addresses & send attack string
  - Infected >300,000 hosts
- Attack
  - Send 98,304 packets; sleep for 4-1/2 hours; repeat
  - Denial of service attack
- Deface server’s home page
  - After waiting 2 hours
29 Avoiding Overflow Vulnerability
    /* Echo Line */
    void echo()
    {
        char buf[4];  /* Way too small! */
        fgets(buf, 4, stdin);
        puts(buf);
    }
- Use library routines that limit string lengths
  - fgets instead of gets
  - strncpy instead of strcpy (strncpy does not add a terminating null when the source fills the buffer)
  - Don’t use scanf with %s conversion specification
    - Use fgets to read the string
    - Or use %ns where n is a suitable integer (at most the buffer size minus 1, leaving room for the terminating null)
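The three recommendations above can be sketched as small helpers that always respect the buffer size. The function names (`read_line`, `copy_bounded`, `parse_word4`) are hypothetical and chosen for this example; only `fgets`, `strncpy`, `strcspn`, and `sscanf` are standard library calls.

```c
#include <stdio.h>
#include <string.h>

/* Read one line into buf (capacity size) and drop the newline.
   Input that does not fit is left in the stream for the next read.
   Returns 0 on success, -1 on EOF or error. */
int read_line(char *buf, size_t size, FILE *in) {
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Copy src into dst (capacity size), truncating if needed and
   always null-terminating, which strncpy alone does not guarantee. */
void copy_bounded(char *dst, size_t size, const char *src) {
    if (size == 0)
        return;
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';
}

/* Parse one whitespace-delimited word of at most 3 characters
   into a 4-byte buffer, using a width-limited %s. */
int parse_word4(const char *text, char out[4]) {
    return sscanf(text, "%3s", out) == 1 ? 0 : -1;
}
```

With a 4-byte buffer, each helper stores at most three characters plus the terminating null, so a long input is truncated instead of overwriting the saved registers and return address.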
30 System-Level Protections
- Randomized stack offsets
  - At start of program, allocate random amount of space on stack
  - Makes it difficult for hacker to predict beginning of inserted code
    unix> gdb bufdemo
    (gdb) break echo
    (gdb) run
    (gdb) print /x $ebp
    $1 = 0xffffc638
    (gdb) run
    (gdb) print /x $ebp
    $2 = 0xffffbb08
    (gdb) run
    (gdb) print /x $ebp
    $3 = 0xffffc6a8
- Nonexecutable code segments
  - In traditional x86, can mark region of memory as either “read-only” or “writeable”
    - Can execute anything readable
  - x86-64 added explicit “execute” permission
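Stack randomization can also be observed without a debugger by printing the address of a local variable and running the program several times. This is only a sketch: `print_stack_address` is a made-up name, and whether the address changes between runs depends on the operating system's randomization settings.

```c
#include <stdio.h>

/* Print the address of a local variable, i.e. a location in the
   current stack frame. Returns the number of characters printed. */
int print_stack_address(void) {
    int local = 0;
    return printf("address of a stack variable: %p\n", (void *)&local);
}
```

If randomization is enabled, calling this from `main` in separate runs of the same binary should show varying addresses, much like the varying `$ebp` values in the gdb session above.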
31 Compile-Level Protection: Stack Canaries
- Idea
  - Place special value (“canary”) on stack just beyond buffer
  - Check for corruption before exiting function
- GCC implementation
  - -fstack-protector
  - -fstack-protector-all
    unix>./bufdemo-protected
    Type a string:
    unix>./bufdemo-protected
    Type a string:12345
    *** stack smashing detected ***
32 Protected Buffer Disassembly
echo:
    d:   55              push   %ebp
    e:   89 e5           mov    %esp,%ebp
    :    53              push   %ebx
    :    83 ec 14        sub    $20,%esp
    :    65 a            mov    %gs:0x14,%eax
    a:   89 45 f8        mov    %eax,0xfffffff8(%ebp)
    d:   31 c0           xor    %eax,%eax
    f:   8d 5d f4        lea    0xfffffff4(%ebp),%ebx
    :    89 1c 24        mov    %ebx,(%esp)
    :    e8 77 ff ff ff  call   80485e
    a:   89 1c 24        mov    %ebx,(%esp)
    d:   e8 ca fd ff ff  call   c
    :    8b 45 f8        mov    -8(%ebp),%eax
    :                    xor    %gs:0x14,%eax
    c:   74 05           je
    e:   e8 a9 fd ff ff  call   c
    :    83 c4 14        add    $20,%esp
    :    5b              pop    %ebx
    :    5d              pop    %ebp
    :    c3              ret
33 Setting Up Canary
    /* Echo Line */
    void echo()
    {
        char buf[4];  /* Way too small! */
        gets(buf);
        puts(buf);
    }

    echo:
        ...
        movl  %gs:20, %eax      # Get canary
        movl  %eax, -8(%ebp)    # Put on stack
        xorl  %eax, %eax        # Erase canary
        ...
Stack before call to gets, from high to low addresses:
    Stack frame for main
    Return address
    Saved %ebp          <- %ebp
    Saved %ebx
    Canary
    [3][2][1][0] buf
    (rest of stack frame for echo)
34 Checking Canary
    /* Echo Line */
    void echo()
    {
        char buf[4];  /* Way too small! */
        gets(buf);
        puts(buf);
    }

    echo:
        ...
        movl  -8(%ebp), %eax      # Retrieve from stack
        xorl  %gs:20, %eax        # Compare with canary
        je    .L24                # Same: skip ahead
        call  __stack_chk_fail    # ERROR
    .L24:
        ...
Stack before call to gets, from high to low addresses:
    Stack frame for main
    Return address
    Saved %ebp          <- %ebp
    Saved %ebx
    Canary
    [3][2][1][0] buf
    (rest of stack frame for echo)
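The set-then-check pattern can be imitated in portable C with a struct that places a guard word directly after a small buffer. This is a simulation for illustration only, not how GCC implements `-fstack-protector`. The struct, the function names, and the fixed `CANARY_VALUE` are all assumptions of this sketch, and it assumes a 4-byte `unsigned` so that no padding sits between `buf` and `canary`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a stack frame: a small buffer with a
   guard word right after it. */
struct guarded_frame {
    char buf[4];
    unsigned canary;
};

/* Fixed value for the demo; a real canary is chosen at run time. */
#define CANARY_VALUE 0x03e37d00u

/* "Function entry": clear the buffer and plant the canary. */
void frame_init(struct guarded_frame *f) {
    memset(f->buf, 0, sizeof f->buf);
    f->canary = CANARY_VALUE;
}

/* Deliberately unchecked copy into the frame, standing in for gets().
   The caller must keep len within sizeof(struct guarded_frame). */
void frame_write(struct guarded_frame *f, const char *data, size_t len) {
    memcpy((char *)f, data, len);
}

/* "Function exit": return 1 if the canary is unchanged, 0 otherwise. */
int frame_intact(const struct guarded_frame *f) {
    return f->canary == CANARY_VALUE;
}
```

Writing "123" plus its terminating null stays inside `buf` and the check passes. Writing "12345" plus its null spills into the guard word, and the check fails. On a little-endian machine, writing "1234" plus its null changes nothing, because the canary's lowest-addressed byte is already 00.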
35 Canary Example
    (gdb) break echo
    (gdb) run
    (gdb) stepi 3
    (gdb) print /x *((unsigned *) $ebp - 2)
    $1 = 0x3e37d00
Stack before call to gets, from high to low addresses:
    Stack frame for main
    Return address
    Saved %ebp          <- %ebp
    Saved %ebx
    Canary              03 e3 7d 00
    [3][2][1][0] buf
    (rest of stack frame for echo)
Stack after input 1234: same layout, with buf holding the four input characters and the canary still reading 03 e3 7d 00
Benign corruption! (allows programmers to make silent off-by-one errors)
- The terminating null byte of the input lands on the canary's low-order byte, which is already 00, so the check still passes