Hello ASM World: A Painless and Contextual Introduction to x86 Assembly rogueclown DerbyCon 3.0 September 28, 2013
who? security consultant by vocation mess around with computers, code, CTFs by avocation frustrated when things feel like a black box
what is assembly language?
not exactly machine language…but close
– instructions: mnemonics for machine operations
– normally a one-to-one correspondence between an ASM instruction and a machine instruction
varies by processor
– today, we will be discussing 32-bit x86
why learn assembly language? some infosec disciplines require it curious about lower-level details of memory or interfacing with an operating system it’s fun and challenging!
how does assembly language work?
hello memory what parts of computer memory does assembly language commonly access? how does assembly language access those parts of computer memory?
where is this memory? what one “normally” thinks of as memory – RAM – virtual memory CPU – registers
computer memory layout
heap
– dynamically allocated memory, requested at run time (global variables live in the .data and .bss sections instead)
– envision a bookshelf…that won’t let you push books together when you take one out
stack
– local, contextual variables
– envision a card game discard pile
– you will use this when coding ASM. a lot.
registers memory located on the CPU registers are awesome because they are fast. registers are a pain because they are tiny.
registers
general purpose registers
– alphabet soup: eax, ebx, ecx, edx (can address in parts: ax, ah, al)
– stack and base pointers: esp, ebp
– index registers: esi, edi
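To make "can address in parts" concrete, here is a minimal C sketch that pulls the ax, ah, and al portions out of a 32-bit value with masks and shifts. The function names are hypothetical and exist only for illustration; on the CPU these parts are simply other names for the same register bits.

```c
#include <stdint.h>

/* Bits 0..15 of a 32-bit register value: what ax is to eax. */
uint16_t part_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* Bits 8..15: what ah is to eax (the high byte of ax). */
uint8_t part_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* Bits 0..7: what al is to eax (the low byte of ax). */
uint8_t part_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }
```

With eax holding 0xfee1dead, the sketch gives ax = 0xdead, ah = 0xde, and al = 0xad. The upper 16 bits have no name of their own.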
registers instruction pointer – eip – records the next instruction for the program to follow other registers – eflags – segment registers
instructions
mov
– copies a value into a register or memory location
– can either specify a value, or specify a register where a value resides
syntax in assembly
– Intel syntax: mov ebx, 0xfee1dead
– AT&T syntax: mov $0xfee1dead, %ebx
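To show how close a mnemonic sits to machine code, the following C sketch builds the five bytes that encode a "mov r32, imm32" instruction: one opcode byte (0xB8 plus the register number) followed by the immediate in little-endian order. The enum and function names are hypothetical; the sketch covers only this one form of mov.

```c
#include <stdint.h>

/* Register numbers as they appear in x86 instruction encodings. */
enum reg32 { R_EAX = 0, R_ECX = 1, R_EDX = 2, R_EBX = 3,
             R_ESP = 4, R_EBP = 5, R_ESI = 6, R_EDI = 7 };

/* Writes the 5-byte encoding of "mov reg, imm" into out. */
void encode_mov_imm32(enum reg32 reg, uint32_t imm, uint8_t out[5])
{
    out[0] = (uint8_t)(0xB8 + reg);          /* opcode byte selects the destination */
    out[1] = (uint8_t)(imm & 0xFF);          /* immediate, least significant byte first */
    out[2] = (uint8_t)((imm >> 8) & 0xFF);
    out[3] = (uint8_t)((imm >> 16) & 0xFF);
    out[4] = (uint8_t)((imm >> 24) & 0xFF);
}
```

For mov ebx, 0xfee1dead this yields the bytes BB AD DE E1 FE, which is what a disassembler would show next to the instruction.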
instructions interrupt – int 0x80 – int 0x3 system calls – how a program interacts with the kernel of the OS
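As a rough illustration of a system call, this C sketch asks the Linux kernel to write bytes to stdout by syscall number. It is Linux-only and goes through libc's syscall() helper rather than executing int 0x80 itself; the function name raw_write_stdout is hypothetical. In 32-bit x86 assembly, the same request would put the syscall number in eax and the arguments in ebx, ecx, and edx before int 0x80.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Asks the kernel to write len bytes of msg to stdout (file descriptor 1).
   Returns the number of bytes written, or -1 on error. */
long raw_write_stdout(const char *msg, size_t len)
{
    return syscall(SYS_write, 1, msg, len);
}
```

Calling raw_write_stdout("hello\n", 6) should print the string and return 6 when stdout is writable.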
instructions
mathematical instructions
– add, sub, mul, div
    mov eax, 10
    cdq            ; edx is now 0
    mov ebx, 3     ; div takes a register or memory operand, not an immediate
    div ebx        ; eax is now 3, edx is now 1
– dec, inc – useful for looping
    mov ecx, 3
    dec ecx        ; ecx is now 2
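The 32-bit form of div divides the 64-bit value held in edx:eax, which is why edx has to be cleared first. The C sketch below models that behavior under simplifying assumptions; the struct and function names are hypothetical, and a return value of -1 stands in for the divide error the CPU would raise.

```c
#include <stdint.h>

struct regs { uint32_t eax; uint32_t edx; };

/* Models unsigned "div r32": divides edx:eax by divisor,
   leaving the quotient in eax and the remainder in edx.
   Returns 0 on success, -1 where the CPU would raise a divide error. */
int model_div(struct regs *r, uint32_t divisor)
{
    if (divisor == 0) return -1;                    /* divide by zero */
    uint64_t dividend = ((uint64_t)r->edx << 32) | r->eax;
    uint64_t quotient = dividend / divisor;
    if (quotient > UINT32_MAX) return -1;           /* quotient does not fit in eax */
    r->edx = (uint32_t)(dividend % divisor);
    r->eax = (uint32_t)quotient;
    return 0;
}
```

Starting from eax = 10 and edx = 0, dividing by 3 leaves eax = 3 and edx = 1, matching the comments on the slide.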
jumps jge, jg, jle, jl – work with a compare (cmp) instruction jz, jnz, js, jns – check zero flag or sign flag for jump
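A sketch of what happens between cmp and a conditional jump: cmp subtracts, throws the result away, and keeps the flags; each jump then tests those flags. The C model below covers only the zero, sign, and overflow flags, and all names in it are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct flags { bool zf; bool sf; bool of; };

/* Models "cmp a, b": computes a - b, discards it, keeps the flags. */
struct flags model_cmp(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;                         /* wraps like the hardware does */
    struct flags f;
    f.zf = (diff == 0);                              /* zero flag: operands were equal */
    f.sf = (diff >> 31) != 0;                        /* sign flag: top bit of the result */
    f.of = (((ua ^ ub) & (ua ^ diff)) >> 31) != 0;   /* signed overflow in a - b */
    return f;
}

/* Each function reports whether the named jump would be taken. */
bool takes_jz(struct flags f)  { return f.zf; }
bool takes_jnz(struct flags f) { return !f.zf; }
bool takes_js(struct flags f)  { return f.sf; }
bool takes_jns(struct flags f) { return !f.sf; }
bool takes_jl(struct flags f)  { return f.sf != f.of; }
bool takes_jge(struct flags f) { return f.sf == f.of; }
bool takes_jg(struct flags f)  { return !f.zf && f.sf == f.of; }
bool takes_jle(struct flags f) { return f.zf || f.sf != f.of; }
```

For example, takes_jl(model_cmp(3, 5)) is true, and takes_jg(model_cmp(5, 5)) is false while takes_jge is true for the same comparison.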
instructions
stack operations: push and pop
    mov eax, 10
    push eax       ; 10 on top of stack
    inc eax        ; eax is now 11
    push eax       ; 11 on top of stack
    pop ebx        ; ebx is now 11
    pop ecx        ; ecx is now 10
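The last-in, first-out behavior of push and pop can be sketched in C with an array and an index standing in for esp. This is a simplified model with hypothetical names: slots are array indices rather than byte addresses, but the index still moves downward on push, as esp does on x86.

```c
#include <stdint.h>

#define STACK_WORDS 16

struct stack {
    uint32_t mem[STACK_WORDS];
    int esp;                   /* index of the top item; STACK_WORDS means empty */
};

void stack_init(struct stack *s) { s->esp = STACK_WORDS; }

/* push: move esp down one slot, then store. Returns -1 if the stack is full. */
int stack_push(struct stack *s, uint32_t value)
{
    if (s->esp == 0) return -1;
    s->mem[--s->esp] = value;
    return 0;
}

/* pop: load from the top slot, then move esp back up. Returns -1 if empty. */
int stack_pop(struct stack *s, uint32_t *out)
{
    if (s->esp == STACK_WORDS) return -1;
    *out = s->mem[s->esp++];
    return 0;
}
```

Pushing 10 and then 11 and popping twice returns 11 first and 10 second, the same order as the ebx and ecx values on the slide.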
instructions
function access instructions
– call
  places the address of the next instruction on top of the stack
  moves execution to identified function
– ret
  returns to the memory address on top of the stack
  designed to work in tandem with the “call” instruction…but we’re hackers, yes?
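The pairing of call and ret, and why ret is interesting to an attacker, can be sketched with a toy model in C. Everything here is hypothetical (the struct, the function names, the addresses in the usage note); the point is only that ret jumps to whatever value is on top of the stack, whether or not call put it there.

```c
#include <stdint.h>

#define STACK_SLOTS 8

struct cpu {
    uint32_t eip;                  /* address of the next instruction */
    uint32_t stack[STACK_SLOTS];
    int top;                       /* number of items currently on the stack */
};

/* call: push the address of the instruction after the call, then jump.
   Returns -1 if the toy stack is full. */
int model_call(struct cpu *c, uint32_t return_addr, uint32_t target)
{
    if (c->top == STACK_SLOTS) return -1;
    c->stack[c->top++] = return_addr;
    c->eip = target;
    return 0;
}

/* ret: pop whatever is on top of the stack and jump there.
   Returns -1 if the toy stack is empty. */
int model_ret(struct cpu *c)
{
    if (c->top == 0) return -1;
    c->eip = c->stack[--c->top];
    return 0;
}
```

After model_call(&c, 0x1005, 0x2000), a model_ret sends eip back to 0x1005. If something overwrites c.stack[c.top - 1] with 0x3000 before the ret, execution continues at 0x3000 instead.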
sections of ASM code
.data – initialized variables and constants, set at compile time
.bss – declaration of variables that are set or changed during runtime
.text – executable instructions
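For readers coming from C, the sketch below shows where a typical compiler and linker tend to place things among these sections. The placement described in the comments is common toolchain behavior rather than a guarantee, and all names are hypothetical.

```c
int initialized_counter = 42;   /* has a starting value: typically placed in .data */
int scratch_buffer[64];         /* no starting value: typically placed in .bss, zero-filled at load */

/* The machine code for this function typically lands in .text. */
int bump_counter(void)
{
    scratch_buffer[0] = initialized_counter;   /* written at run time */
    return ++initialized_counter;
}
```

The first call to bump_counter() returns 43 and leaves 42 in scratch_buffer[0], which held 0 until then.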
instructions: how do they work?
putting it together time to take a bit of C code, and reimplement it in assembly language!
where does shellcode come in?
what is shellcode?
instructions injected into a running process
lacks some of the luxuries of writing a stand-alone program
– no laying out nice memory segments in a .bss or .data section
– basically, just one big .text section
a first stab at shellcode… this is going to look mostly familiar, except for how data is handled.
why did it fail?
bad characters
– shellcode is often passed to an application as a string.
– if a character makes a string act funny, you may not want it in your shellcode: 0x00, 0x0a, 0x0d, etc.
– use an encoder, or do it yourself
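One way to check for bad characters before sending shellcode anywhere is to scan the assembled bytes. The C sketch below looks for the three bytes named on the slide; the function name is hypothetical, and a real target may reject a different set of bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first bad byte in code, or -1 if there is none.
   The bad-byte list (NUL, line feed, carriage return) is an assumption
   taken from the slide; adjust it for the application being targeted. */
long first_bad_byte(const uint8_t *code, size_t len)
{
    static const uint8_t bad[] = { 0x00, 0x0a, 0x0d };
    for (size_t i = 0; i < len; i++)
        for (size_t j = 0; j < sizeof bad; j++)
            if (code[i] == bad[j]) return (long)i;
    return -1;
}
```

As an example, mov eax, 10 assembles to B8 0A 00 00 00, so the scan stops at index 1. A sequence such as xor eax, eax followed by mov al, 0x0b (31 C0 B0 0B) passes, which hints at what "do it yourself" can look like: choose instructions whose encodings avoid the bad bytes.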
try that shellcode again…
where can i learn more about assembly language?
suggested resources dead trees – “Hacking: The Art of Exploitation” by Jon Erickson – “Practical Malware Analysis” by Michael Sikorski and Andrew Honig – “Gray Hat Python” by Justin Seitz
suggested resources
the series of tubes
– quick and dirty opcode reference
– Netwide Assembler documentation
system calls
– Linux: /usr/include/asm/unistd.h, man 2 $syscall
– Windows: Windows API reference
how to find me IRC: #derbycon, #misec, or #burbsec on Freenode or, just wave me down at the con