1
NICIAR Local Site Visit
Annapolis Junction, MD, January 25, 2008
Process Coloring: an Information Flow-Preserving Approach to Malware Investigation
Eugene Spafford, Dongyan Xu, Ryan Riley
Department of Computer Science and Center for Education and Research in Information Assurance and Security (CERIAS), Purdue University
Xuxian Jiang
Department of Computer Science, George Mason University
2
Outline
Project overview
Results in 1st and 2nd quarters
Open problems and potential solutions
Plan for 3rd quarter
3
Motivation
Internet malware remains a top threat
Malware: viruses, worms, rootkits, spyware, bots…
4
Motivation
5
Technical Approach: Process Coloring (PC)
Key idea: propagating and logging malware break-in provenance information (“colors”) along OS-level information flows
Existing tools only consider direct causality relations without preserving and exploiting break-in provenance information
[Figure labels: Virtual Machine, Log Monitor, Log, MySQL, DNS, Sendmail, Apache, Logger, Guest OS, Virtual Machine Monitor (VMM)]
6
New Capabilities Enabled by PC
Capability 1: Color-based malware warning
Capability 2: Color-based identification of malware break-in point
Capability 3: Color-based log partition for contamination analysis
[Figure labels: initial coloring of services started by init/rc (s30sendmail, s45named, s55sshd, s80httpd); coloring diffusion through httpd, /bin/sh, wget, netcat, Rootkit, local files, and /etc/shadow (confidential info); Syscall Log]
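Capabilities 2 and 3 can be illustrated with a small sketch. The log record layout below (`LogEntry` with `seq`, `colors`, `process`, `syscall`) is hypothetical and is not the prototype's log format; only the color names come from the slide.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, NamedTuple, Set


class LogEntry(NamedTuple):
    """Hypothetical colored syscall log record (not the prototype's format)."""
    seq: int                 # position of the entry in the log
    colors: FrozenSet[str]   # colors carried by the process at syscall time
    process: str
    syscall: str


def partition_by_color(log: Iterable[LogEntry]) -> Dict[str, List[LogEntry]]:
    """Capability 3: group log entries by each color they carry."""
    parts: Dict[str, List[LogEntry]] = defaultdict(list)
    for entry in log:
        for c in entry.colors:
            parts[c].append(entry)
    return dict(parts)


def break_in_candidates(log: Iterable[LogEntry], process: str) -> Set[str]:
    """Capability 2: colors seen on a suspicious process point back to
    the initially colored service(s) it may have entered through."""
    found: Set[str] = set()
    for entry in log:
        if entry.process == process:
            found |= entry.colors
    return found
```

Given a log in which `wget` only ever carries `s80httpd`, `break_in_candidates(log, "wget")` returns that single color, and `partition_by_color(log)["s80httpd"]` is the slice of the log an investigator would need to read.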
7
Task I: Color Diffusion Model Definition (Month 1-6)
Operation   Diffusion                                               Syscalls
CREATE      create <s1, o1>: color(o1) = color(s1)                  create, mkdir, link
            create <s1, s2>: color(s2) = color(s1)                  fork, vfork, clone
READ        read <s1, o1>:   color(s1) = color(s1) ∪ color(o1)      read, readv, recv
            read <s1, s2>:   color(s1) = color(s1) ∪ color(s2)      ptrace
WRITE       write <s1, o1>:  color(o1) = color(s1) ∪ color(o1)      write, writev, send
            write <s1, s2>:  color(s2) = color(s1) ∪ color(s2)      ptrace, wait, signal
DESTROY     destroy <s1, o1>                                        unlink, rmdir, close
            destroy <s1, s2>                                        exit, kill
Status: Color diffusion model instantiated in prototype.
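The four diffusion rules can be written down as a minimal sketch over sets of colors. This is an illustration of the model only, not the prototype's kernel code; the `colors` table and the function names are assumptions made for this example.

```python
from typing import Dict, Set

# Hypothetical in-memory color table: entity name (process or file) -> colors.
colors: Dict[str, Set[str]] = {}


def color(entity: str) -> Set[str]:
    """Return the entity's color set, creating an empty one if needed."""
    return colors.setdefault(entity, set())


def create(subject: str, target: str) -> None:
    """CREATE: the new file or child process inherits the creator's colors."""
    colors[target] = set(color(subject))


def read(subject: str, source: str) -> None:
    """READ: color(s1) = color(s1) ∪ color(source)."""
    color(subject).update(color(source))


def write(subject: str, target: str) -> None:
    """WRITE: color(target) = color(s1) ∪ color(target)."""
    color(target).update(color(subject))


def destroy(target: str) -> None:
    """DESTROY: no diffusion; the entity's colors are dropped."""
    colors.pop(target, None)
```

For example, if `httpd` starts with `{"s80httpd"}`, then `create("httpd", "sh")` followed by `write("sh", "/tmp/x")` leaves `/tmp/x` colored `s80httpd`, and any process that later reads `/tmp/x` picks up the same color.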
8
Task II: PC for Server Side Malware Investigation (Month 2-6)
Consolidated server with independent server applications
“Clustered” information flows partitioned by server application colors
Color mixing highly unlikely between applications
Status: Server-side malware investigation supported by prototype.
9
Task III: PC for Client Side Malware Investigation (Month 6-18)
Inter-dependent client applications (e.g., text editor → compiler; latex → dvips → ps2pdf)
More inter-application information flows
Color saturation a serious concern
Status: Causes of color saturation identified and being studied.
10
PC Prototype
Basic release packaged
On Xen 3.0.4_1
Targeting server apps
For testing: Good
For production: Bad
Bugs and un-optimized
The prototype has been released. It still has a few bugs, but it is suitable for testing out the basic concept of process coloring on server-side applications. I wouldn’t recommend it on production systems for real data gathering, though. It is also totally unoptimized, and its performance hasn’t been tested.
11
PC Prototype Challenge - Logging
Log data “trafficking”
Harder than you think
Not optimal
Problem in Xen community
I thought I would share one of the technical challenges related to the project. Interestingly, this isn’t where I would have expected to find one. Xen doesn’t currently have a good mechanism for getting lots of arbitrary data out of a guest and into the control domain. A lot of people have noticed this, and there’s some work being done on it, but right now nothing good is in place. This meant that I had to write that portion of the system myself. The problem is difficult because we need to ensure that no logging data is lost and that the logging mechanism doesn’t cause undue performance overhead.
12
PC Prototype Challenge - Logging
Details of our solution
Shared pages
Ring buffers
Host kernel buffers
2 or 3 copies (to be optimized…)
Some fun details: basically, I set up a structure where the two virtual domains share some memory pages and then signal each other with interrupts to show that there’s new data available. There’s lots of buffering on both ends (which is bad), and the host kernel ends up being the one to store most of the data. My current implementation is far from optimal, mainly because the data is getting copied so many times, but it gets the job done. Now that I’ve done it once, I’d love to rewrite that portion of the mechanism and get something cleaner and faster. Plus, a related project called XenSocket finally released its code, and it has some good tricks I’d like to use.
13
Ring Buffer-based Solution
[Figure: log entries moving between DomU (Guest) and Dom0 (Host) through shared memory on top of the Xen VMM, with steps numbered 1 to 6]
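The ring-buffer idea can be sketched in a few lines. This is a plain in-process model, assuming a single producer (the guest logger) and a single consumer (the host); it does not use or imitate any Xen API, and the class name `LogRing` and its methods are invented for this illustration. The one property it is meant to show is the no-loss requirement: a full ring refuses new entries instead of overwriting old ones.

```python
from typing import List, Optional


class LogRing:
    """Fixed-size single-producer/single-consumer ring, standing in for
    the shared memory pages. Not Xen code; an illustrative model only."""

    def __init__(self, slots: int) -> None:
        self.slots: List[Optional[bytes]] = [None] * slots
        self.head = 0  # total entries written; advanced only by the producer
        self.tail = 0  # total entries read; advanced only by the consumer

    def put(self, entry: bytes) -> bool:
        """Guest side. Returns False when the ring is full so the caller
        can buffer and retry rather than drop or overwrite a log entry."""
        if self.head - self.tail == len(self.slots):
            return False
        self.slots[self.head % len(self.slots)] = entry
        self.head += 1
        return True

    def drain(self) -> List[bytes]:
        """Host side. Called after a notification; returns every pending
        entry in the order it was written."""
        out: List[bytes] = []
        while self.tail < self.head:
            entry = self.slots[self.tail % len(self.slots)]
            assert entry is not None
            out.append(entry)
            self.tail += 1
        return out
```

In the setup described in the talk, the notification would be an interrupt between the two domains; here the host simply calls `drain()` directly.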
14
Prototype Status
First version completed w/ simple documentation
Initial experiments with
  Malware (e.g., Lion worm)
  Untrusted application (e.g., Skype)
Demo!
15
Observation of Skype
16
Research Challenge: PC on Client Side
Focus of ongoing investigation
Much harder than server side
Work-in-progress
  Observing client-side color mixing via experiments
  Identifying root causes of color saturation
  Investigating approaches to mitigation
So, the big thing for today is actually a discussion of what we’ve been discovering as we rework process coloring for a client-side environment. We’ve identified two main problems, and walking you through them will hopefully help you understand what we’re working on right now and how we’re approaching it. I also hope that you guys will have some insights to spur our own thinking, because these are definitely the “what we’re working on right now and don’t necessarily have a good solution for yet” problems.
17
Two Root Causes of Color Saturation
Important observation from experiments
It’s all about sinks:
  Sink processes
  Sink files
Guidance to mitigating color saturation
As we’ve been doing basic experiments, we’ve observed two major hurdles at this stage. Really, all of it is about “sinks”: a file or process that ends up receiving and propagating a lot of color, usually needlessly from a security perspective.
18
Sink Process – 1st Example
The “GUI” experiment
[Figure labels: GUI Manager, Firefox, Thumbnailer, Document.gif, Document.pdf]
So, the first problem is fairly easy to understand. I call it the “sink” process. Basically, it’s a process that ends up communicating with so many different processes on a system that it causes color to flow a bit too freely. All of the ones I’ve found so far are related to the system’s GUI. Let me give you an example…
19
Sink Process – 2nd Example
The “File Preview” experiment
So that’s one problem. Interestingly, I consider that to be an easier problem. Let’s look at two tough ones. But let’s take them one at a time. Here’s a screenshot from one of the color graphs produced by our prototype. This illustrates a problem I’m calling the file preview problem. Let me show it to you in a slightly easier-to-understand format…
20
Sink Process – 2nd Example
The “File Preview” experiment
[Figure labels: Employees.doc, Finances.doc, OpenOffice; edges labeled Preview, Read, Write]
So, the basic concept here is that I start up my office application and then open a document, finances.doc. The problem is that while I’m navigating the directory to get to it, the office app is reading other documents in the directory in order to give me a preview or some sort of information about them. That means that my office application has been tainted by those other files even though I never asked to open them!
21
Sink File Example
The “Configuration File” experiment
So, here’s another problem I think is really tough: configuration files. Sometimes the color flows through the config files of a single application, and sometimes through config files shared between apps. This picture shows a PDF reader writing to recently-used and the office application reading from it later. Let me show you a simplified version of this problem involving only one application…
22
Sink File Example
The “Configuration File” experiment
[Figure labels: Employees.doc, Finances.doc, OpenOffice, Recently Used]
The key point to see in this animation is that OpenOffice writes out the config files and then re-reads them later, and the color diffuses through the configs. This means that every time the application is started, it gets tainted by every color it has ever touched.
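The saturation effect of a sink file can be reproduced with a toy simulation. The sketch below assumes one application, one config file, and one opened document per session; the function name `run_sessions` and the color names in the usage note are made up for the illustration.

```python
from typing import Dict, List, Set


def run_sessions(docs: Dict[str, Set[str]], sessions: List[str]) -> List[Set[str]]:
    """Return the application's color set at the end of each session.

    docs maps a document name to its colors; sessions lists which document
    is opened in each run of the application.
    """
    recently_used: Set[str] = set()  # colors accumulated on the config file
    history: List[Set[str]] = []
    for doc in sessions:
        app: Set[str] = set()        # a freshly started process has no color
        app |= recently_used         # READ: the app loads its config at startup
        app |= docs[doc]             # READ: the app opens the requested document
        recently_used |= app         # WRITE: the app saves its config on exit
        history.append(set(app))
    return history
```

With `docs = {"Finances.doc": {"red"}, "Employees.doc": {"blue"}}` and the sessions `["Finances.doc", "Employees.doc", "Finances.doc"]`, the third session ends with both colors even though only Finances.doc was opened: the config file carried `blue` across the restart.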
23
1st Possible Approach
Insulate the sink
Example: Sink process
  Cannot inherit
  Cannot pass on
  Children insulated until exec()
Issues
  Implicit trust
  Easier covert channels
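A minimal sketch of insulation, under one reading of the bullets above: an insulated sink neither receives nor forwards color, and a child forked from it stays insulated until it calls exec(). The `Tracker` class and its method names are hypothetical and are not taken from the prototype.

```python
from typing import Dict, Set


class Tracker:
    """Toy color tracker with insulated sinks (illustrative only)."""

    def __init__(self) -> None:
        self.colors: Dict[str, Set[str]] = {}
        self.insulated: Set[str] = set()

    def _colors_of(self, entity: str) -> Set[str]:
        return self.colors.setdefault(entity, set())

    def flow(self, src: str, dst: str) -> None:
        """Diffuse color from src to dst unless either end is insulated
        ("cannot inherit", "cannot pass on")."""
        if src in self.insulated or dst in self.insulated:
            return
        self._colors_of(dst).update(self._colors_of(src))

    def fork(self, parent: str, child: str) -> None:
        """The child copies the parent's colors and its insulation."""
        self.colors[child] = set(self._colors_of(parent))
        if parent in self.insulated:
            self.insulated.add(child)

    def exec_image(self, proc: str) -> None:
        """After exec() the process runs a new program: insulation ends."""
        self.insulated.discard(proc)
```

The sketch also makes the listed issues concrete: every flow through an insulated entity is silently trusted, so anything an attacker can route through the sink goes untracked.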
24
1st Possible Approach – Covert Channels
.recently-used
<?xml version="1.0"?>
<RecentFiles>
  <RecentItem>
    <URI>file:///DTO_Purdue_ ppt</URI>
    <Mime-Type>application/vnd.ms-powerpoint</Mime-Type>
    <Timestamp> </Timestamp>
    <Groups>
      <Group>openoffice.org</Group>
      <Group>staroffice</Group>
      <Group>starsuite</Group>
    </Groups>
  </RecentItem>
</RecentFiles>
25
2nd Possible Approach
Use program execution context information
Instead of always insulating a sink
  Diffuse color if [context predicate] holds
Program execution context
  System call type and parameter
  System call timing
  Call stack
Issues
  Requires execution context information
  Feasible? Fake-able?
26
2nd Possible Approach - Call Stack
Example: Imagine we require… main() -> terminate() -> write_config() -> write() Can an attacker fake this call stack?
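A call-stack predicate of this kind could be evaluated as in the sketch below. It only checks whether an observed stack, given as function names with the outermost frame first, equals the required chain; how the stack is captured and what the tracker does when the predicate holds are left open, and the names `REQUIRED_CHAIN` and `stack_matches` are assumptions for the example.

```python
from typing import Sequence, Tuple

# The chain from the slide: main() -> terminate() -> write_config() -> write()
REQUIRED_CHAIN: Tuple[str, ...] = ("main", "terminate", "write_config", "write")


def stack_matches(call_stack: Sequence[str],
                  required: Sequence[str] = REQUIRED_CHAIN) -> bool:
    """Return True when the observed call stack (outermost frame first)
    is exactly the required chain of function names."""
    return tuple(call_stack) == tuple(required)
```

A write issued from `main -> save_document -> write` would not match. The check is only as trustworthy as the stack it is given, which is exactly the question the slide raises.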
27
3rd Possible Approach
Leverage program-level information flows
Available from peer NICIAR projects
Determining if data “really” used
Passing this information to PC
Diffusing color if process really uses data
In talk with SWRI+UT team
Issues
  Requires program source code
28
4th Possible Approach
Leverage application-level virtualization (e.g., MS SoftGrid, Trustware BufferZone)
Group, isolate, and confine related processes
Issues
  Handling valid interaction
  Usability
29
Discussion
Potential points for discussion
  Universality?
  Trust in application?
  Features vs. information flow?
  Will flow problems continue to increase?
The goal of this slide is basically to discuss anything we’ve talked about so far. As you can tell, we don’t have clean, easy solutions for all of these problems. We’d love your thoughts on where we’re going with it and anything else.
30
Plan for 3rd Quarter
Detailed design of PC for client side
Implement some proposed approaches
Evaluation of each approach
Observing color mixing in a client
Experimenting with malware instances (e.g., bots)
We are cautiously optimistic that color saturation will NOT happen, contrary to common wisdom.
31
Thank you!
For more information about the Process Coloring project: