1 The Project of this year Mariano Ceccato FBK - Fondazione Bruno Kessler
2 Traditional computer security Most computer security research: protect the integrity of a benign host (and its data) from attacks by malicious client programs. Basis of the Java security model. Threats: a downloaded applet or a virus-infested application. Restrict the actions that the client is allowed to perform. Software isolation: a program is not able to write outside of a designated area (sandbox).
3 More recent computer security Interest in mobile agents changed the view of computer security: benign client code is threatened by the host on which it has been downloaded or installed. Defending a client is much more difficult than defending a host. To defend the host, all that is needed is to restrict the client. Once the client code is on the host, the host can use any technique to violate its integrity. Threats: software piracy, reverse engineering, software tampering.
4 Problem: Malicious Reverse Engineering A valuable piece of code is extracted from an application and incorporated into a competitor's code.
5 Obfuscation Obfuscation transforms a program into a new program which: Has the same semantics Is harder to reverse engineer
6 Example public class Fibonacci { public int fib ( int n ) { if ( n <= 2 ) return 1; else return fib( n - 1 ) + fib( n - 2 ); } }
7 Example: Obfuscation public class x {public int x ( int x ) { return x <=2 ? 1 : x(x-1)+x(x-2); }}
8 What is obfuscation? It is a software protection technique. Transforms the application into one that is functionally identical to the original but is more difficult to reverse engineer. Can never completely protect an application from malicious reverse engineering. Given sufficient time and resources, an adversary can reverse engineer any obfuscated code.
9 Potential application domains Good ones … Obscure program logic. Hide ownership information (e.g. watermarks, discussed by Mariano). Bad ones … Development of polymorphic viruses or of code that contains an obfuscated malicious payload. Code plagiarism!
10 Defining Obfuscation Let P → P' be a transformation from a source program P to a target program P'. P → P' is an obfuscating transformation if P and P' have the same observable behaviour, i.e. the following two conditions hold (Collberg and Thomborson): If P fails to terminate or terminates with an error, then P' may or may not terminate. Otherwise, P' must terminate and produce the same output as P. Two important conditions need to be preserved: functionality, i.e. the obfuscated program should have the same input/output behaviour as the input program (semantics-preserving transformation), and unintelligibility, i.e. the obfuscated program should be unintelligible to the adversary in some sense.
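The "same observable behaviour" condition can be spot-checked by running both versions on sample inputs. The sketch below reuses the Fibonacci example from the earlier slides; the class name BehaviourCheck and the helper sameOutput are hypothetical names introduced only for this illustration. Agreement on a finite set of inputs is evidence, not a proof, of equivalence.

```java
// Sketch: compares an original method with its layout-obfuscated version
// on a range of sample inputs (names are illustrative assumptions).
final class BehaviourCheck {
    // Original method from the Fibonacci example.
    static int fib(int n) {
        if (n <= 2) return 1;
        return fib(n - 1) + fib(n - 2);
    }

    // Obfuscated version: same logic, uninformative identifiers.
    static int x(int x) {
        return x <= 2 ? 1 : x(x - 1) + x(x - 2);
    }

    // Returns true if both versions agree for every n in 1..limit.
    static boolean sameOutput(int limit) {
        for (int n = 1; n <= limit; n++) {
            if (fib(n) != x(n)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameOutput(20));
    }
}
```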
11 Goals of obfuscation … Ideal obfuscator (Boaz Barak, PhD, 2004): Should simulate the black-box property. Fails if there exists at least one program that cannot be obfuscated by this method, i.e. an adversary can learn something from an examination of the obfuscated version of this program that cannot be learned by merely executing the program repeatedly. Practical obfuscator (what we have now): Use transformations whose reversal requires more resources than attackers can afford.
12 Taxonomy of Obfuscations Layout obfuscation: Changes or removes useful information from the IL without affecting real instructions. E.g. comment stripping, identifier renaming. Data Obfuscation: Targets data and data structures in the program. E.g. changing data encoding, splitting/merging arrays. Control-flow obfuscation: Affects the control-flow within the code. E.g. Reordering statements, introducing dummy control-flow.
13 Layout Obfuscation Changes or removes useful information from the IL without affecting real instructions. E.g. comment stripping, identifier renaming. Used in commercial obfuscators like DashO for Java and Dotfuscator for MSIL … both from PreEmptive Corp.
15 Data Obfuscations Variable Encoding
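A minimal sketch of variable encoding, assuming a linear encoding e = 8*i + 3 of a loop counter i (the constants 8 and 3, and the class and method names, are chosen only for illustration). Every use of i is replaced by the decoding expression (e - 3) / 8, and the loop bound and increment are rewritten accordingly. The sketch assumes arrays small enough that 8 * a.length + 3 does not overflow an int.

```java
// Sketch: the loop counter i is stored in encoded form e = 8*i + 3.
final class VariableEncoding {
    // Original loop with a plain counter.
    static int sumOriginal(int[] a) {
        int sum = 0;
        int i = 0;
        while (i < a.length) {
            sum += a[i];
            i++;
        }
        return sum;
    }

    // Same loop after encoding the counter.
    static int sumEncoded(int[] a) {
        int sum = 0;
        int e = 3;                        // encodes i = 0
        while (e < 8 * a.length + 3) {    // encodes i < a.length
            sum += a[(e - 3) / 8];        // decode before indexing
            e += 8;                       // encodes i++
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        System.out.println(sumOriginal(data) == sumEncoded(data));
    }
}
```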
16 Data Obfuscations Variable splitting and merging Arrays can be split into several sub-arrays, two or more arrays can be merged into one bigger array, folded so as to increase the number of dimensions, or flattened to decrease the number of dimensions.
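Array splitting can be sketched as follows, assuming one logical array is stored as two sub-arrays holding the even-indexed and odd-indexed elements. The class SplitArray and its get/set accessors are hypothetical; a real obfuscator would typically rewrite the index expressions in place rather than introduce a wrapper class.

```java
// Sketch: one logical int array stored as two physical sub-arrays.
final class SplitArray {
    private final int[] even;  // elements at logical indices 0, 2, 4, ...
    private final int[] odd;   // elements at logical indices 1, 3, 5, ...

    SplitArray(int length) {
        even = new int[(length + 1) / 2];
        odd = new int[length / 2];
    }

    // Reads the element at logical index i.
    int get(int i) {
        return (i % 2 == 0) ? even[i / 2] : odd[i / 2];
    }

    // Writes the element at logical index i.
    void set(int i, int value) {
        if (i % 2 == 0) {
            even[i / 2] = value;
        } else {
            odd[i / 2] = value;
        }
    }

    public static void main(String[] args) {
        SplitArray a = new SplitArray(7);
        for (int i = 0; i < 7; i++) a.set(i, i * 10);
        System.out.println(a.get(3) + " " + a.get(6));
    }
}
```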
18 Control-flow Obfuscations Aggregation/De-Aggregation: The original control-flow logic is disturbed by coalescing unrelated methods or splitting related methods. E.g. DOJ (Design Obfuscator for Java). Method inlining, outlining, cloning, and loop transformations also fall in this class. Ordering: This category reorders statements, loops, and expressions to disturb the locality of related information. Spurious Computations: This type of obfuscation modifies the real control-flow by adding spurious computation blocks. E.g. opaque predicates.
19 Opaque Predicates An opaque predicate is a conditional expression (thus called a predicate) whose value is known to the obfuscator but is difficult for the adversary to deduce by statically analysing the code (thus called opaque). The opacity of the predicates determines the resilience of control-flow transformations, i.e. the more opaque a predicate, the greater the difficulty in determining its outcome by static analysis.
20 Opaque Predicates P^T / P^F: always evaluates to T/F (opaquely true/false predicate). P^?: may sometimes evaluate to T and sometimes to F (opaquely unknown predicate).
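The two kinds of predicate can be sketched as below. The arithmetic predicate x*(x+1) % 2 == 0 is an assumption chosen because it is easy to verify (a product of two consecutive integers is even, also under Java int overflow); it is far too simple to resist a real analyser and only illustrates the idea. With an opaquely true predicate the else branch is dummy code that never runs; with an opaquely unknown predicate both branches must compute the same result.

```java
// Sketch: opaquely true and opaquely unknown predicates guarding real code.
final class OpaqueKinds {
    // Opaquely true: x*(x+1) is always even, so this always returns true.
    static boolean alwaysTrue(int x) {
        return (x * (x + 1)) % 2 == 0;
    }

    // Real code in the "then" branch, dummy code in the dead "else" branch.
    static int doubled(int v, int x) {
        if (alwaysTrue(x)) {
            return v + v;
        } else {
            return v * 3 - 7;  // dummy code, never executed
        }
    }

    // Opaquely unknown: either branch may run, so both compute 2*v.
    static int doubledUnknown(int v, int x) {
        if (x % 2 == 0) {
            return v + v;
        } else {
            return v << 1;     // different form, same result
        }
    }

    public static void main(String[] args) {
        System.out.println(doubled(21, 5) + " " + doubledUnknown(21, 5));
    }
}
```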
21 Embedding of opaque predicates (Dummy Code insertion)
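22 Embedding of opaque predicates (Loop condition extension) i = 1; while (i < 100) { … i++; } Can be transformed into: i = 1; j = 100; while ((i < 100) && (j*j*(j+1)*(j+1)%4 == 0)) { … i++; j = j*i+3; } The added condition is opaquely true: one of j and j+1 is even, so j*j*(j+1)*(j+1) is always divisible by 4.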
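The loop transformation above can be checked with a small runnable sketch. The class name LoopExtension and the iteration counter are additions for the purpose of the check; the loop headers and the update of j follow the slide. The predicate also holds when j overflows, because int arithmetic is performed modulo 2^32 and 4 divides 2^32.

```java
// Sketch: original loop versus the loop with an extended condition.
final class LoopExtension {
    // Counts the iterations of the original loop.
    static int original() {
        int count = 0;
        int i = 1;
        while (i < 100) {
            count++;
            i++;
        }
        return count;
    }

    // Counts the iterations of the loop with the opaquely true condition.
    static int extended() {
        int count = 0;
        int i = 1;
        int j = 100;
        while ((i < 100) && (j * j * (j + 1) * (j + 1) % 4 == 0)) {
            count++;
            i++;
            j = j * i + 3;  // keeps j changing so the predicate looks live
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(original() == extended());
    }
}
```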
23 Opaque Predicates based on aliasing Aliasing occurs when two variables refer to the same memory location. In the presence of aliasing, inter-procedural static analysis is intractable. This intractability property of pointer aliasing can be used to construct opaque predicates. Construction based on the fact that it is impossible for approximate static analysers to detect all aliases all of the time. The basic idea: Construct a dynamic data structure and maintain a set of pointers on it. Make opaque predicates from these pointers. Insert code for manipulating these pointer locations, yet maintain the invariant condition.
24 Opaque Predicates based on aliasing
Alias based opaque predicates

Original code:

    class A {
        int f1;
        int f2;
        void m() {
            int tmp;
            f1 = 1;
            f2 = f1++;
            tmp = f1;
            tmp = tmp - f1;
            f1 = f1 + f2;
        }
    }

Basic blocks split and guarded by opaque predicates:

    class A {
        int f1;
        int f2;
        void m() {
            int tmp;
            if (f == g) {
                f1 = 1;
                f2 = f1++;
            } else { }
            if (g != h) {
                tmp = f1;
                tmp = tmp - f1;
                f1 = f1 + f2;
            } else { }
        }
    }

Random code added to the else branches:

    class A {
        int f1;
        int f2;
        void m() {
            int tmp;
            if (f == g) {
                f1 = 1;
                f2 = f1++;
            } else {
                tmp = f1 + f2 / 5;
                f1 = f2 - tmp;
            }
            if (g != h) {
                tmp = f1;
                tmp = tmp - f1;
                f1 = f1 + f2;
            } else {
                f1 = tmp / f2;
                tmp = f2 % 59 + f2;
            }
        }
    }

Aliases: f == g, g != h
Update: g = g.left(), f = g.left().move()

Update statements added:

    class A {
        int f1;
        int f2;
        void m() {
            int tmp;
            if (f == g) {
                f1 = 1;
                g = g.left();
                f2 = f1++;
            } else {
                g = g.left();
                tmp = f1 + f2 / 5;
                f1 = f2 - tmp;
            }
            if (g != h) {
                f = g.left().move();
                tmp = f1;
                tmp = tmp - f1;
                g = g.left();
                f1 = f1 + f2;
            } else {
                f1 = tmp / f2;
                tmp = f2 % 59 + f2;
                f = g.left().move();
            }
        }
    }
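A self-contained sketch of the same idea, under assumptions that differ from the slide: the pointer-intensive data structure is taken to be two disjoint circular lists, and the update statements simply advance each pointer with next (the slide's left() and move() operations are not defined here, so they are not reproduced). The names AliasOpaque, Node and ring are hypothetical. The invariants are f == g (both always point at the same node of the first ring) and g != h (h lives in a separate ring).

```java
// Sketch: opaque predicates built from pointers into two disjoint rings.
final class AliasOpaque {
    static final class Node {
        Node next;
    }

    // Builds a circular list with the given number of nodes.
    static Node ring(int size) {
        Node first = new Node();
        Node cur = first;
        for (int k = 1; k < size; k++) {
            cur.next = new Node();
            cur = cur.next;
        }
        cur.next = first;
        return first;
    }

    Node f, g, h;
    int f1, f2;

    AliasOpaque() {
        Node a = ring(5);
        f = a;
        g = a;        // invariant: f == g
        h = ring(3);  // separate ring, so g != h
    }

    void m() {
        int tmp = 0;
        if (f == g) {              // opaquely true
            f1 = 1;
            f2 = f1++;
        } else {                   // dummy code, never executed
            tmp = f1 + f2 / 5;
            f1 = f2 - tmp;
        }
        // Update statements: move every pointer, preserving both invariants.
        g = g.next;
        f = f.next;
        h = h.next;
        if (g != h) {              // opaquely true
            tmp = f1;
            tmp = tmp - f1;
            f1 = f1 + f2;
        } else {                   // dummy code, never executed
            f1 = tmp / (f2 + 1);
            tmp = f2 % 59 + f2;
        }
    }

    public static void main(String[] args) {
        AliasOpaque a = new AliasOpaque();
        a.m();
        System.out.println(a.f1 + " " + a.f2);
    }
}
```

As in the unobfuscated method, a call to m() leaves f1 equal to 3 and f2 equal to 1; proving this statically requires knowing that the two rings never share a node.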
26 JSnapScreen Open-source Java project (2k LoC). It takes a snapshot of the current screen.
27 Resources Java grammar for Txl. JSnapScreen code: separated sources; all the sources in a single file (merged). JSnapScreen class diagram. Pointer-intensive data structure. List of update expressions. List of opaque predicates.
28 Mandatory requirements Work on the merged file Break basic blocks into many sub-parts Add opaque predicates Add random code Add update statements Txl rules must be briefly commented Deliver a readme describing how to run the obfuscator
29 Optional requirements Work on separated source files. Transformation is non-deterministic: if applied twice, it gives different results. The changed code compiles. The changed code runs.
30 Delivery The project must be delivered one week (7 days) before the date of the exam