Integrity & Malware Dan Fleck CS469 Security Engineering Some of the slides are modified with permission from Quan Jia. Coming up: Integrity – Who Cares? 1111
Integrity – Who Cares? IARPA – Funded GMU (and others) to research software validation. Securely Taking On New Executable Software Of Uncertain Provenance (STONESOUP) is a multi-year Intelligence Advanced Research Projects Activity (IARPA) Overall Goal: Eliminate effects of software vulnerabilities in code. GMU’s Part: Determine if a program has been compromised (taken over) by malware to do something it shouldn’t be doing and report to another component How we do it: Binary Instrumentation! Coming up: Program Compromise: SQL Injection 3322
Program Compromise: SQL Injection SQL Injection – by inserting code into the data, a poorly written program can accidentally run unexpected SQL: name: Dan Fleck’; update personnel set password=‘abc123’; Code executes like this: select account number from accounts where name=Dan Fleck’; update personnel set password=‘abc123’; Return oriented programming (more direct) Coming up: Program Compromise: Buffer Overflow 4433
Program Compromise: Buffer Overflow Overflowing: Input data to a program that gets stored in a local variable (variable “A” for example) No bounds checking… can overwrite the Return Address strcpy in C does not check bounds! Coming up: Program Exploit: Buffer Overflow 5544
Program Exploit: Buffer Overflow Exploiting: Get a program to store information on the heap somehow Add to the stack: NOP sled Shell Code Return address into the NOP sled Coming up: Protections 6655
Protections Windows Data Execution Protection – don’t execute code on the stack! W X protection – no memory address can be both writeable and executable (Circa 2004 – Win XP sp 2) gcc –fstack-protector --- add in stack canaries Can we do it without executing code on the stack? Return oriented programming (ROP) Coming up: Return Oriented Programming 7766
Return Oriented Programming Re-use code that already exists in the system. Steps: Take control on the stack (somehow). Put variables you need on the stack for a specific function call, and jump to the function – Solar 1997 Nergal 2001 – Phrack article – how to chain multiple function calls Defense: hardware supported non-executable segments introduced. No longer could store function arguments on the stack, must be in registers. Stealth 2005 – use chunks of functions that start by copying values from the stack into registers. DEPLib created which automates this approach. Loops and conditionals unsupported. Shacham introduces “return oriented programming” which allows loops and conditionals (These chunks are “gadgets”) Coming up: Return Oriented Programming 8877
Return Oriented Programming Gadget: small piece of binary code that does some simple operation and returns: add r1, r2 return Find lots of these by disassembling the binary program to create a gadget library (use automated tools (e.g. ROPGadget)) Chain them together by modifying the stack Ref: Coming up: ROP 9988
ROP Normal program uses instruction pointer ROP program uses stack pointer Coming up: Defenses 10 99
Defenses Many groups are trying to stop ROP. One student we work with did okay But what about the next thing? and the next? Can we check the integrity of the program based on behavior? Coming up: Monitoring Behavior 11 10
Monitoring Behavior Another approach is to monitor the behaviors of an application and determine if it’s out of “normal” Challenges: What is normal? Speed versus security tradeoff others… Our specific part is resource-based attacks: Does the program use more resources than it should? Disk, CPU, memory, network, semaphores. Coming up: PIN Tools 12 11
PIN Tools Intel’s PIN tool is a dynamic binary instrumentation tool Lets you run a program and instrument it while it is running. For example, find all the function calls and add in your own code before/after the call. Lets see an example: Pin/html/index.html#EXAMPLES Pin/html/index.html#EXAMPLES Coming up: What is Instrumentation? 13 12
What is Instrumentation? A technique that inserts extra code into a program to collect runtime information. Program analysis : performance profiling, error detection, capture & replay Architectural study : processor and cache simulation, trace collection Binary translation : Modify program behavior, emulate unsupported instructions 13Coming up: Instrumentation Approaches 14 13
Instrumentation Approaches Source Code Instrumentation (SCI) – instrument source programs Binary Instrumentation (BI) – instrument binary executable directly 14Coming up: SCI Example (Code Coverage) 15 14
SCI Example (Code Coverage) Original Program void foo() { bool found=false; for (int i=0; i<100; ++i) { if (i==50) break; if (i==20) found=tru e; } printf("foo\n"); } Instrumented Program char inst[5]; void foo() { bool found=false; inst[0]=1; for (int i=0; i<100; ++i) { if (i==50) { inst[1]=1;break;}if (i==20) { inst[2]=1;found=true;} inst[3]=1; } printf("foo\n"); inst[4]=1; } 15Coming up: Binary Instrumentation (BI) 16 15
Binary Instrumentation (BI) Static binary instrumentation – inserts additional code and data before execution and generates a persistent modified executable Dynamic binary instrumentation – inserts additional code and data during execution without making any permanent modifications to the executable. 16Coming up: BI Example – Instruction Count 17 16
BI Example – Instruction Count 17 sub$0xff, %edx cmp%esi, %edx jle mov$0x1, %edi add$0x10, %eax counter++; Coming up: BI Example – Instruction Trace 18 17
BI Example – Instruction Trace 18 sub$0xff, %edx cmp%esi, %edx jle mov$0x1, %edi add$0x10, %eax Print(ip); Coming up: Advantages 19 18
Advantages Binary instrumentation Language independent Machine-level view Instrument legacy/proprietary software Dynamic instrumentation No need to recompile or relink Discover code at runtime Handle dynamically-generated code Attach to running processes 19Coming up: Advantages of Pin Instrumentation 20 19
Advantages of Pin Instrumentation Easy-to-use Instrumentation: Uses dynamic instrumentation - Do not need source code, recompilation, post-linking Programmable Instrumentation: Provides rich APIs to write in C/C++ your own instrumentation tools (called Pintools) Multiplatform: Supports x86, x86-64, Itanium, Xscale OS’s: Windows, Linux, OSX, Android Robust: Instruments real-life applications: Database, web browsers, … Instruments multithreaded applications Supports signals Efficient: Applies compiler optimizations on instrumentation code 20Coming up: Widely Used and Supported 21 20
21 Widely Used and Supported Large user base in academia and industry 30,000+ downloads citations Active mailing list (Pinheads) Actively developed at Intel Intel products and internal tools depend on it Nightly testing of binaries on 15 platforms Coming up: Using Pin 22 21
22 Using Pin Launch and instrument an application $ pin –t pintool.so –- application Instrumentation engine (provided in the kit) Instrumentation tool (write your own, or use one provided in the kit) Attach to and instrument an application $ pin –t pintool.so –pid 1234 Coming up: Pin and Pintools 23 22
Pin and Pintools Pin – the instrumentation engine Pintool – the instrumentation program Pin provides the framework and API, Pintools run on Pin to perform meaningful tasks. Pintools – Written in C/C++ using Pin APIs – Many open source examples provided with the Pin kit – Certain Do’s and Don’ts apply 23Coming up: Pin Instrumentation Capabilities 24 23
Replace application functions with your own. Fully examine any application instruction – insert a call to your instrumenting function whenever that instruction executes. Pass a large set of supported parameters to your instrumenting function. Register values (including IP), Register values by reference (for modification) Memory addresses read/written by the instruction Full register context Track function calls including syscalls and examine/change arguments. Track application threads. Intercept signals. Instrument a process tree. ……… Pin Instrumentation Capabilities 24 Coming up: Pintool 1: Instruction Count 25 24
Pintool 1: Instruction Count 25 sub$0xff, %edx cmp%esi, %edx jle mov$0x1, %edi add$0x10, %eax counter++; Coming up: Pintool 1: Invocation See icount example
Pintool 1: Invocation Windows examples: > pin.exe -t inscount0.dll -- dir.exe > pin.exe -t inscount0.dll -o incount.out -- gzip.exe FILE Linux examples: $ pin -t inscount0.so -- /bin/ls $ pin -t inscount0.so -o incount.out -- gzip FILE 26Coming up: Pintool 1: Invocation 27 26
27 ManualExamples/inscount0.cpp instrumentation routine analysis routine #include #include "pin.h" UINT64 icount = 0; void docount() { icount++; } void Instruction(INS ins, void *v) { INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END); } void Fini(INT32 code, void *v) { std::cerr << "Count " << icount << endl; } int main(int argc, char * argv[]) { PIN_Init(argc, argv); INS_AddInstrumentFunction(Instruction, 0); PIN_AddFiniFunction(Fini, 0); PIN_StartProgram(); return 0; } switch to pin stack save registers call docount restore registers switch to app stack Pintool 1: Coming up: Pin Instrumentation APIs 28 27
Pin Instrumentation APIs Basic APIs are architecture independent: Provide common functionalities like determining: Control-flow changes Memory accesses Architecture-specific APIs E.g., Info about segmentation registers on IA32 Call-based APIs: Instrumentation routines Analysis routines Coming up: Pintool 2: Instruction Trace 29 28
Pintool 2: Instruction Trace 29 sub$0xff, %edx cmp%esi, %edx jle mov$0x1, %edi add$0x10, %eax Print(ip); Coming up: Pintool 2: Instruction Trace 30 29
ManualExamples/itrace.cpp argument to analysis routine analysis routine instrumentation routine #include #include "pin.H" FILE * trace; void printip(void *ip) { fprintf(trace, "%p\n", ip); } void Instruction(INS ins, void *v) { INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)printip, IARG_INST_PTR, IARG_END); } void Fini(INT32 code, void *v) { fclose(trace); } int main(int argc, char * argv[]) { trace = fopen("itrace.out", "w"); PIN_Init(argc, argv); INS_AddInstrumentFunction(Instruction, 0); PIN_AddFiniFunction(Fini, 0); PIN_StartProgram(); return 0; } Pintool 2: Coming up: Examples of Arguments to Analysis Routine 31 30
31 Examples of Arguments to Analysis Routine IARG_INST_PTR Instruction pointer (program counter) value IARG_UINT32 An integer value IARG_REG_VALUE Value of the register specified IARG_BRANCH_TARGET_ADDR Target address of the branch instrumented IARG_MEMORY_READ_EA Effective address of a memory read And many more … (refer to the Pin manual for details) Coming up: Instrumentation Points 32 31
Instrumentation Points Instrument points relative to an instruction: Before (IPOINT_BEFORE) After: Fall-through edge (IPOINT_AFTER) Taken edge (IPOINT_TAKEN) cmp%esi, %edx jle mov$0x1, %edi : mov $0x8,%edi count() Coming up: Instrumentation Granularity 33 32
33 Instruction Basic block A sequence of instructions terminated at a control-flow changing instruction Single entry, single exit Trace A sequence of basic blocks terminated at an unconditional control-flow changing instruction Single entry, multiple exits Instrumentation Granularity sub$0xff, %edx cmp%esi, %edx jle mov$0x1, %edi add$0x10, %eax jmp 1 Trace, 2 BBs, 6 insts Instrumentation can be done at three different granularities: Coming up: Alternative Hw # jumpmix example – which types of jump instructions are called?
Alternative Hw #2 Instead of the given HW #2 you can write a PIN tool Task: Write a PIN tool that monitors which files are opened by a program and stores it to a log. Turn in your program and the output of it running in lieu of Hw#2. Note: This is much harder than the original Hw#2, but more fun. 34Coming up: Lessons 35 34
Lessons Integrity of programs can be violated in many ways Many defenses exist Monitoring programs for normal can be done through binary instrumentation PIN is one example of a powerful binary instrumentation tool Coming up: References 35
References to-return-oriented-programming/ to-return-oriented-programming/ Stealth: Nergel, End of presentation 36