CSC-682 Advanced Computer Security
Automatic Patch-Based Exploit Generation: techniques and implications
Based on an article by: Pongsin Poosankam, David Brumley, Dawn Song, Jiang Zheng
Presented by: Pompi Rotaru
Formulating the problem
Given a program P and a patched version P', automatically generate an exploit for the potentially unknown vulnerability that is present in P but fixed in P'
These types of attacks are possible because patch rollouts are usually staggered over hours or longer
One study found that with Windows Update it takes 24 hours for 80% of PCs to check for updates
This study focuses on automatically generating exploits for 5 Microsoft programs from patches delivered via Windows Update
Definitions
Φ = the safety policy: a first-order logic Boolean predicate that maps the program's state space to one of two values, safe or unsafe
x = an input
P = the version of the program that contains the vulnerability
P' = the version of the program in which the vulnerability is patched
F = the weakest precondition, expressed as a constraint formula
Goal: generate an input x such that
  Φ(P(x)) = unsafe
  Φ(P'(x)) = safe
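As a toy illustration of these definitions, the sketch below models P and P' as Python functions that return a final program state, and Φ as a predicate on that state. Everything in it is an assumption made for the example (the names `run_P`, `run_P_patched`, `phi`, `is_exploit`, and the 8-byte buffer); none of it comes from the article.

```python
BUF_SIZE = 8  # hypothetical fixed buffer size, chosen only for this example


def run_P(x: bytes) -> dict:
    """Unpatched program P: copies x into the buffer without a length check."""
    return {"rejected": False, "bytes_written": len(x)}


def run_P_patched(x: bytes) -> dict:
    """Patched program P': the added sanitization check rejects oversized input."""
    if len(x) > BUF_SIZE:
        return {"rejected": True, "bytes_written": 0}
    return {"rejected": False, "bytes_written": len(x)}


def phi(state: dict) -> str:
    """Safety policy: unsafe if the run wrote past the end of the buffer."""
    return "unsafe" if state["bytes_written"] > BUF_SIZE else "safe"


def is_exploit(x: bytes) -> bool:
    """x meets the goal if it is unsafe on P and safe on P'."""
    return phi(run_P(x)) == "unsafe" and phi(run_P_patched(x)) == "safe"
```

With these toy definitions, `is_exploit(b"A" * 9)` holds, while an 8-byte input is safe on both versions.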
Challenges
The code is often available only in binary form
What has changed from P to P'?
Inputs that exploit the vulnerability in the original, unpatched program must be generated automatically
How quickly can exploits be generated from patches? The answer is needed to design adequate security defences
How it works (hacker's perspective)
A new patch reveals some information about an existing vulnerability, so early access to a patch may give an attacker an advantage over hosts that have not yet received it
An exploit is generated within a few minutes
Polymorphic exploit variants can be generated
The tool handles both the program binary and the libraries
If the attack succeeds, the attacker can:
  crash the program and cause denial of service
  hijack control of the program
How it works (victim's perspective)
The user visits a malicious web site that supplies inputs which take advantage of the unpatched vulnerability
Or the user visits a legitimate web site that has been hacked
General concept
APEG (automatic patch-based exploit generation) is based on the observation that input-validation bugs are usually fixed by adding the missing sanitization checks
The approach: step by step
1. Identify the new sanitization checks added in P' by computing the differences between the two versions
2. Generate a candidate exploit x that fails a new check in P':
  Calculate the weakest precondition F for failing the new check in P'
  Use a solver to find x such that F(x) = true; x is the candidate exploit
3. Verify that the candidate exploit is a real exploit by checking that Φ(P(x)) = unsafe
4. (Optional) Generate polymorphic variants
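The steps can be sketched end to end on a toy program. This is a minimal sketch under stated assumptions, not the authors' tool: machine words are 8 bits wide so that exhaustive search can stand in for a solver, and the size computation, the added check `n <= 250`, and all function names are hypothetical. The check is deliberately more conservative than necessary, so some candidates fail it without being real exploits, which is why step 3 is needed.

```python
MASK = 0xFF  # toy 8-bit words (assumption) so exhaustive search is instant


def alloc_size(n: int) -> int:
    """Size computation shared by P and P'; the addition wraps at 8 bits."""
    return (n + 2) & MASK if n % 2 == 0 else (n + 3) & MASK


def new_check(n: int) -> bool:
    """Step 1: the check found only in P' (hypothetical, overly conservative)."""
    return n <= 250


def F(n: int) -> bool:
    """Step 2: constraint formula, true when the input fails the new check."""
    return not new_check(n)


def candidates(formula, domain=range(256)):
    """Stand-in for a solver: enumerate inputs that satisfy the formula."""
    return (x for x in domain if formula(x))


def phi_of_P(n: int) -> str:
    """Step 3: run P on n and apply the safety policy.
    P writes n bytes into a buffer of alloc_size(n) bytes."""
    return "unsafe" if n > alloc_size(n) else "safe"


def apeg():
    """Return the first candidate that is verified as a real exploit."""
    for x in candidates(F):
        if phi_of_P(x) == "unsafe":
            return x
    return None


print(apeg())  # 253: candidates 251 and 252 fail the check but are safe on P
```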
Differencing two binaries
Look for the differences between P and P'
The differencing is a purely syntactic analysis of the disassembled binaries
Since not every difference leads to an exploit, the authors prioritized new checks that appear in procedures that have changed very little
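A purely syntactic diff of two disassembly listings can be illustrated with Python's standard `difflib` module. The two listings below are made up for the example, and the helper names and the "compare or jump" filter are assumptions; the slides do not say which differencing tool was used.

```python
import difflib

# Hypothetical disassembly of one procedure before and after the patch.
P_ASM = [
    "mov eax, [ebp+8]",
    "add eax, 2",
    "push eax",
    "call realloc",
]
P_PATCHED_ASM = [
    "mov eax, [ebp+8]",
    "add eax, 2",
    "cmp eax, [ebp+8]",
    "jbe error",
    "push eax",
    "call realloc",
]


def added_instructions(old, new):
    """Instructions present in `new` but not in `old`, by line-level diff."""
    return [line[2:] for line in difflib.ndiff(old, new) if line.startswith("+ ")]


def new_checks(old, new):
    """Keep only added compare, test, or jump instructions."""
    return [ins for ins in added_instructions(old, new)
            if ins.startswith(("cmp", "test", "j"))]


def similarity(old, new):
    """Ratio in [0, 1]; values near 1 mean the procedure changed very little."""
    return difflib.SequenceMatcher(None, old, new).ratio()


print(new_checks(P_ASM, P_PATCHED_ASM))   # ['cmp eax, [ebp+8]', 'jbe error']
print(similarity(P_ASM, P_PATCHED_ASM))   # 0.8
```

Sorting procedures by `similarity` in descending order is one simple way to put the least-changed procedures first.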
Generate candidate exploit - Dynamic
Generates a constraint formula from a sample execution
Considers a single path at a time
The number of exploitable paths is typically only a fraction of all possible execution paths
Typically produces the smallest formulas of the three approaches
Usually the fastest approach for producing candidate exploits
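The idea of a single-path formula can be shown on a toy version of P'. In this sketch (all names and the 8-bit program are assumptions), running a sample input records the branch conditions along its path; the formula is the conjunction of those conditions with the negation of the new check, and exhaustive search stands in for the solver.

```python
MASK = 0xFF  # toy 8-bit words (assumption)


def trace_path(n: int):
    """Run toy P' on a sample input; record the branch conditions taken
    and the new check as it appears on that path."""
    path = []
    if n % 2 == 0:
        path.append(lambda v: v % 2 == 0)
        size = lambda v: (v + 2) & MASK
    else:
        path.append(lambda v: v % 2 != 0)
        size = lambda v: (v + 3) & MASK
    check = lambda v: size(v) > v  # hypothetical new check reached on this path
    return path, check


def dynamic_formula(sample: int):
    """Formula: follow the sample's path, then fail the new check."""
    path, check = trace_path(sample)
    return lambda v: all(cond(v) for cond in path) and not check(v)


def solve(formula, domain=range(256)):
    """Stand-in for a solver: first satisfying input, or None."""
    return next((v for v in domain if formula(v)), None)


print(solve(dynamic_formula(4)))  # 254, found on the even-input path
print(solve(dynamic_formula(7)))  # 253, found on the odd-input path
```

Each formula covers only the path of its sample input, which keeps it small but means other exploitable paths are missed unless more samples are traced.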
Generate candidate exploit - Static
Generates a constraint formula over a control flow graph (CFG)
Encompasses multiple paths without enumerating them individually
Program chopping is performed on the program CFG to create a CFG that includes only paths to the new check
Computing one formula over the CFG is more efficient than computing a separate formula for each path
Formulas are typically larger and therefore take longer to solve, because they include all instructions in the CFG fragment
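A weakest-precondition computation that covers several paths at once can be sketched over a tiny structured program. This is an illustrative toy, not the authors' implementation: the statement encoding, the `wp` helpers, and the chopped fragment `CHOP` are all assumptions, and predicates are plain Python functions on a state dictionary rather than symbolic formulas.

```python
MASK = 0xFF  # toy 8-bit words (assumption)


def wp(stmts, post):
    """Weakest precondition of a statement list with respect to `post`."""
    for stmt in reversed(stmts):
        post = wp_stmt(stmt, post)
    return post


def wp_stmt(stmt, post):
    kind = stmt[0]
    if kind == "assign":
        _, var, expr = stmt
        # post must hold in the state updated by the assignment
        return lambda st: post({**st, var: expr(st)})
    if kind == "if":
        _, cond, then_b, else_b = stmt
        wp_then, wp_else = wp(then_b, post), wp(else_b, post)
        # one predicate covers both branches without listing paths
        return lambda st: wp_then(st) if cond(st) else wp_else(st)
    raise ValueError(f"unknown statement kind: {kind}")


# Hypothetical chopped fragment: only the code leading to the new check.
CHOP = [
    ("if", lambda st: st["n"] % 2 == 0,
        [("assign", "s", lambda st: (st["n"] + 2) & MASK)],
        [("assign", "s", lambda st: (st["n"] + 3) & MASK)]),
]

fails_check = lambda st: not (st["s"] > st["n"])  # negation of the new check
F = wp(CHOP, fails_check)

exploit_inputs = [n for n in range(256) if F({"n": n})]
print(exploit_inputs)  # [253, 254, 255]
```

The single predicate `F` finds inputs on both the even and the odd branch, which a single-path formula would not.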
Generate candidate exploit - Combined
A combination of dynamic and static analysis
Combines information about code paths that are known to be executable via known inputs with additional code paths explored using static analysis
Considers a subset of paths so that the generated formula is small enough for the solver to produce a candidate exploit
Verifying the exploits
Goal: an input that takes on a satisfying assignment makes the program execution reach the new check and fail it
The authors used STP (a decision procedure that supports bit-level operations) as the solver to generate candidate exploits from the constraint formula
After a predefined timeout, they move on and build another constraint formula that covers other paths
Verification uses an off-the-shelf exploit detector in the style of dynamic taint analysis, which returns unsafe when the candidate exploit is verified
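The timeout strategy can be illustrated with a small driver loop. This sketch does not use STP; a time-limited exhaustive search stands in for the solver, and the function names, the one-second default, and the example formulas are assumptions.

```python
import time


def solve_with_timeout(formula, domain, timeout_s):
    """Search `domain` for a satisfying input; give up after `timeout_s` seconds."""
    deadline = time.monotonic() + timeout_s
    for x in domain:
        if time.monotonic() > deadline:
            return None  # timed out on this formula
        if formula(x):
            return x
    return None  # formula has no solution in the domain


def first_candidate(formulas, domain, timeout_s=1.0):
    """Try each (name, formula) pair in turn; move on when one yields nothing."""
    for name, formula in formulas:
        x = solve_with_timeout(formula, domain, timeout_s)
        if x is not None:
            return name, x
    return None
```

For example, given `[("path A", lambda v: False), ("path B", lambda v: v * v == 49)]` over `range(256)`, the driver abandons path A and returns `("path B", 7)`.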
Generating polymorphic exploits
A vulnerability can potentially be exploited by many different inputs; each individual exploit is called a polymorphic variant
To generate a variant, find a new exploit input x' and a new formula F' such that F'(x') = true
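One simple way to build F', assumed here for illustration, is to conjoin the original formula with a constraint that excludes every exploit already found, so that F'(x') = F(x') and x' differs from the earlier solutions. The toy formula, the helper names, and the exhaustive-search solver below are assumptions for the example.

```python
MASK = 0xFF  # toy 8-bit words (assumption)


def F(n: int) -> bool:
    """Toy formula: the wrapped size computation fails the check size > n."""
    size = (n + 2) & MASK if n % 2 == 0 else (n + 3) & MASK
    return size <= n


def solve(formula, domain=range(256)):
    """Stand-in for a solver: first satisfying input, or None."""
    return next((v for v in domain if formula(v)), None)


def polymorphic_variants(formula, limit: int):
    """Repeatedly solve F'(v) = formula(v) and v not among earlier solutions."""
    found = []
    while len(found) < limit:
        excluded = tuple(found)
        f_prime = lambda v, ex=excluded: formula(v) and v not in ex
        x = solve(f_prime)
        if x is None:
            break  # no further variants exist
        found.append(x)
    return found


print(polymorphic_variants(F, 5))  # [253, 254, 255]
```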
Evaluation
Performed on 5 vulnerable Microsoft programs for which patches are available
Name of the affected program or routine
The vulnerability that is exploited (memory allocation, information disclosure)
Time required to generate an exploit (sec … min)
Count of the functions that were changed or added (21 chg. + 5 add.)
Select one of the functions for attack
Generate the exploit
Additional details (repeatedly launching an attack until the memory layout matches what the exploit expects)
How to mitigate such attacks
3 ways to mitigate this type of attack:
Make it hard to find the new checks (through obfuscation)
Make it so everyone can download the update before anyone can apply it (using encryption)
Make it so everyone can download the patch at the same time (using P2P)
Conclusions
APEG is possible in several real-world cases
An exploit can be generated within a few minutes
Each analysis approach proved useful in certain situations
Polymorphic exploit variants can be generated
The technique described may not work in all cases
Current patch distribution schemes are insecure and should be redesigned to defend against APEG
Questions?