DBPD: A Dynamic Birthmark-based Software Plagiarism Detection Tool
  Zhenzhou Tian
  zztian@stu.xjtu.edu.cn
  MOE Key Lab for Intelligent Networks and Network Security
  Xi’an Jiaotong University, China
  2017/4/23
Introduction
  Software plagiarism has been a serious threat to the healthy development of the software industry
  Licenses are violated for commercial interests or unwittingly
  Weak code protection awareness
  Powerful automated code obfuscation tools
  Software is distributed in binary form
Introduction
  Many software birthmark based techniques have been proposed
    Static birthmarks: CVFV, SMC, IS, UC…
    Dynamic birthmarks: WPP, SCSSB, SCDG, DKISB…
  Few tools are publicly available
  Dynamic birthmarks are believed to perform better than static birthmarks
  Available tools (columns: Tool, Static/Dynamic, Language)
    Sandmark: Static, Java bytecode
    Stigmata
    Birthmarking: Dynamic
    JPlag: Source code
  Note: constant values in field variables, sequence of method calls, inheritance structure and used classes
Design Overview
  Software Birthmark: A set of characteristics extracted from a program that reflects intrinsic properties of the program, and which can be used to identify the program uniquely.
  Figure: Framework of DBPD
Three Dynamic Birthmarks
  Three birthmark approaches implemented:
  DKISB: Dynamic Key Instruction Sequence Birthmark
    Generated using the k-gram algorithm from dynamic key instructions (instructions that are both value updating and input correlated).
  SCSSB: System Call Short Sequence Birthmark
    Extracted by splitting the system call sequence into short sub-sequences.
  SODB: Stack Operation Dynamic Birthmark
    Generated by analyzing the behavior of stack operations, using the push and pop patterns of the call stack to uniquely identify a program.
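Both DKISB and SCSSB cut a recorded execution trace into short, overlapping sub-sequences. The sketch below illustrates that k-gram step only; it is not DBPD's implementation. The function name `kgram_birthmark`, the choice of `k = 3`, the use of frequency counts, and the sample system call trace are all assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def kgram_birthmark(trace: Sequence[Hashable], k: int = 3) -> Counter:
    """Count every window of k consecutive items in an execution trace.

    Hypothetical helper: the slides only say that the trace is split into
    k-grams / short sub-sequences; counting occurrences is an assumption.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # A trace shorter than k yields no windows, so the result is empty.
    return Counter(tuple(trace[i:i + k]) for i in range(len(trace) - k + 1))


# Made-up system call trace, used only to show the windowing.
syscalls = ["open", "read", "read", "write", "close"]
birthmark = kgram_birthmark(syscalls, k=3)

for gram, count in birthmark.items():
    print(gram, count)
```

The same helper could be applied to a sequence of dynamic key instructions instead of system calls, since it only requires the trace items to be hashable.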
Demonstration: Independently implemented software with similar functionalities
Demonstration: Plagiarism Using Different Compilers and Optimization Levels
Demonstration: Plagiarism Using Specific Obfuscation Tools
Demonstration: Cross-Platform Plagiarism Scenario
Some Definitions