Published by Jocelyn Stevenson. Modified over 9 years ago.
1
Using software metrics for estimating code similarities in binaries
Saša Stojanović, Miloš Cvetanović, Zaharije Radivojević
School of Electrical Engineering, Belgrade University
15th Workshop “Software Engineering Education and Reverse Engineering”
Bohinj, Slovenija, 23-30 August 2015
2
15th Workshop SEE and RE 2/34
Agenda
Estimating Code Similarities in Binaries
Theory of Communication
Mapping to Code Similarity
Results
Conclusions
3
Estimating Code Similarities
How can we tell whether a given binary originates from a particular source code?
4
Estimating Code Similarities
5
Estimating Code Similarities: Source code, Device
6
Estimating Code Similarities
7
Estimating Code Similarities: Live chip
8
Estimating Code Similarities: Connection device, Live chip
9
Estimating Code Similarities
10
Estimating Code Similarities
11
Estimating Code Similarities: Binary code!
12
Estimating Code Similarities
Source code -> Compiler -> Binary code
What compiler?
13
Estimating Code Similarities
Source code -> Compiler 1 -> Binary code 1
Source code -> Compiler 2 -> Binary code 2
Source code -> Compiler 3 -> Binary code 3
14
Problem schema
Source code -> Compiler -> Destination code
15
Problem schema
Source -> Disturbance -> Destination
16
Problem schema
Source -> Noise -> Destination
Information theory!
17
Information theory
Information Source -> Message -> Transmitter -> Signal -> [Noise Source] -> Received Signal -> Receiver -> Message -> Destination
18
Information theory
An information source that produces a message
A transmitter that operates on the message to create a signal which can be sent through a channel
A channel, which is the medium over which the signal, carrying the information that composes the message, is sent
A receiver, which transforms the signal back into the message intended for delivery
A destination, which can be a person or a machine, for whom or which the message is intended
(Source: Wikipedia)
19
Information theory
Information Source -> Message -> Transmitter -> Signal -> [Noise Source] -> Received Signal -> Receiver -> Message -> Destination
20
Information theory & Code detection
Message -> Procedure
Symbol -> Instruction
Measure of information: entropy!
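If a procedure is treated as a message and each instruction as a symbol, the entropy of a procedure can be estimated from how often each opcode occurs. The following is a minimal sketch of that idea, not the authors' implementation: the function name `opcode_entropy` and the x86-style mnemonics are assumptions made for illustration, and only opcodes (not operands) are counted.

```python
import math
from collections import Counter


def opcode_entropy(opcodes):
    """Shannon entropy, in bits per instruction, of an opcode sequence."""
    if not opcodes:
        return 0.0
    counts = Counter(opcodes)
    total = len(opcodes)
    # H = -sum(p * log2(p)) over the observed opcode frequencies.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical procedure; the mnemonics are illustrative only.
procedure = ["push", "mov", "add", "mov", "pop", "ret"]
print(opcode_entropy(procedure))
```

A procedure built from one repeated opcode has zero entropy under this estimate, while a procedure whose opcodes are all equally frequent reaches the maximum for its alphabet size.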
21
Information theory & Code detection
How to solve the code detection problem?
Increase the amount of information with positive influence.
Decrease the amount of information with negative influence.
22
Approach
23
Information theory & Code detection
Message -> Procedure
Symbol -> Instruction
Lossy compression -> Metrics
Measure of information: entropy
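The lossy-compression view can be illustrated by reducing a procedure to a handful of metrics and observing that different instruction sequences collapse to the same summary. This is a sketch under stated assumptions: the slides do not list the metrics used, so the four metrics below, the function name `metrics`, and the branch mnemonics are all hypothetical.

```python
from collections import Counter


def metrics(opcodes):
    """Compress an opcode sequence into a small, order-insensitive summary."""
    counts = Counter(opcodes)
    return {
        "instructions": len(opcodes),
        "distinct_opcodes": len(counts),
        "calls": counts["call"],
        # Assumed set of branch mnemonics, for illustration only.
        "branches": sum(counts[m] for m in ("jmp", "je", "jne")),
    }


a = ["mov", "add", "call", "je", "ret"]
b = ["add", "mov", "je", "call", "ret"]  # same instructions, different order
print(metrics(a) == metrics(b))  # prints True: ordering information is lost
```

The summary cannot be inverted back into the original sequence, which is what makes it lossy; the hope is that the discarded detail is mostly compiler-induced noise rather than information about the source.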
24
Approach
25
Inline Optimization
26
Opcode Sequences
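The slide gives only the title, so how opcode sequences are compared is not stated. One common way to compare them, shown here purely as an assumed illustration, is to split each procedure into opcode n-grams and measure the overlap with Jaccard similarity. The names `opcode_ngrams` and `jaccard`, the choice n = 3, and the sample sequences are hypothetical.

```python
def opcode_ngrams(opcodes, n=3):
    """Return the set of length-n opcode subsequences (n-grams)."""
    return {tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical output of two compilers for the same source procedure.
build_1 = ["push", "mov", "sub", "mov", "call", "add", "pop", "ret"]
build_2 = ["mov", "sub", "mov", "call", "add", "ret"]
print(jaccard(opcode_ngrams(build_1), opcode_ngrams(build_2)))
```

Unlike plain opcode counts, n-grams keep some local ordering, so they are more sensitive to instruction scheduling differences between compilers.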
27
Filtering Stack Instructions
28
Filtering Transfer Instructions
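Both filtering steps amount to dropping a class of instructions before metrics are computed, on the assumption that those instructions mostly reflect compiler choices rather than the source code. The sketch below is illustrative only: the slides do not list which instructions fall into each class, so the x86-style sets `STACK_OPS` and `TRANSFER_OPS` and the helper `filter_opcodes` are assumptions.

```python
# Assumed instruction classes (x86-style mnemonics, for illustration only).
STACK_OPS = {"push", "pop"}
TRANSFER_OPS = {"mov", "movsx", "movzx", "xchg"}


def filter_opcodes(opcodes, excluded):
    """Return the opcode sequence with every opcode in `excluded` removed."""
    return [op for op in opcodes if op not in excluded]


procedure = ["push", "mov", "add", "mov", "call", "pop", "ret"]

without_stack = filter_opcodes(procedure, STACK_OPS)
without_both = filter_opcodes(procedure, STACK_OPS | TRANSFER_OPS)

print(without_stack)
print(without_both)
```

The filtered sequence can then be fed to whatever metric or sequence comparison is in use, so the filters compose with the other steps of the approach.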
29
Approach
30
Results (STAMP + BusyBox)
31
Results (STAMP + BusyBox)
32
Results (STAMP + BusyBox)
33
Conclusion
Code similarity can be viewed from an information theory perspective.
Estimating code similarity with software metrics can be seen as lossy compression.
Filtering stack instructions makes the largest contribution to ranking.
34
Thank you! Radivojevic Zaharije