1 Enhancing Security of Real-World Systems with a Better Understanding of Threats Shuo Chen Candidate of Ph.D. in Computer Science Center for Reliable.

Presentation transcript:

1 Enhancing Security of Real-World Systems with a Better Understanding of Threats
Shuo Chen, Ph.D. Candidate in Computer Science
Center for Reliable and High Performance Computing, Coordinated Science Laboratory
University of Illinois at Urbana-Champaign

2 My Dissertation
Security Threat Analysis and Mitigations in Real-World Systems
 – How do errors in hardware and software pose security threats to real-world systems? (Any common characteristics?)
 – How effective are current defense techniques? (Any substantial deficiencies?)
 – How can better defenses be built?
Analysis-centric research approach
 – Emulated hardware memory errors → impact on system security
 – Software vulnerabilities in Bugtraq and CERT databases
 – Source code of vulnerable applications
 – Current attack methods and defense techniques
 – Analysis results motivate the development of new defense techniques.
Many areas in computer science are related
 – Security, dependability, formal methods, programming languages, operating systems and computer architecture.

3 Analyzing and Identifying Security Threats on Real-World Software

4 How Real Systems are Susceptible to Hardware Memory Errors
[Diagram: Attacker, Firewall (IPChains and Netfilter), Target host] Due to hardware memory errors, packets can penetrate firewalls.
[Diagram: Attacker, Network server (FTP and SSH)] Due to hardware memory errors, users can log in with arbitrary passwords.
Emulate random hardware memory errors
Use a stochastic model to estimate the threats in real environments
Motivate other researchers to conduct physical fault injection experiments
 – Java type system subverted due to memory errors.

5 Significance of Software Memory Vulnerabilities
CERT Advisories: ≈66% of vulnerabilities are low-level memory errors in software.
Widely exploited by attackers, worms and viruses.

6 State Machine Model: WU-FTP Server Attack
Attack steps (diagram labels): Embed malicious contents in input; Overwrite a return address; Execute malicious code
Server states (diagram labels): get an FTP command; Authentication; x = user ID; repeat FTP_service(); seteuid(x); SITE_EXEC(fn); printf(fn,…); seteuid(0); exec("/bin/sh")

7 State Machine Model: NULL-HTTP Server Attack
Attack steps (diagram labels): Corrupt heap structure; Overwrite function pointer foo; Execute malicious code
Server states (diagram labels): process HTTP header; p=malloc(…); repeat HTTP_service(); HTTP_POST(); recv(p,…); free(p); *foo(); seteuid(0); exec("/bin/sh")

8 Control Data Attack: Well-Known, Dominant
Control data attack (i.e., overwriting control data): the most dominant form of memory corruption attacks (CERT and Microsoft Security Bulletin)
Control data:
 – data used as targets of call, return and jump.
 – widely understood as security-critical elements
Many current defense techniques enforce program control flow integrity for security.
Non-control-data attacks
 – Currently very rare in reality
 – One instance was suggested by Young and McHugh.
 – No study on the applicability of such attacks against real-world software.

9 An Important Question
Are attackers really incapable of mounting non-control-data attacks against many real systems?
 – Probably not.
 – Random hardware memory errors can subvert the security of real-world systems with a non-negligible probability.
 – Software vulnerabilities are more deterministic and more amenable to attacks.
 – Each attack exploiting software vulnerabilities is composed of multiple primitive components. This allows potentially polymorphic attacks. Dangerous.

10 Our Claim: General Applicability of Non-control-data Attacks
We claim:
 – Many real-world software applications are susceptible to non-control-data attacks.
 – The severity of the attack consequences is equivalent to that due to control data attacks.
Validate the claim by constructing non-control-data attacks to get the root privilege on major network servers: FTP, HTTP, SSH and Telnet servers
 – Over 1/3 of vulnerabilities in CERT advisories
Non-control-data attacks are realistic threats.

11 Non-control-hijacking attack on WU-FTP Server (via a format string bug)
    int x;
    FTP_service(...) {
      authenticate();
      x = user ID of the authenticated user;
      seteuid(x);
      while (1) {
        get_FTP_command(...);
        if (a data command?)
          getdatasock(...);
      }
    }
    getdatasock(...) {
      seteuid(0);
      setsockopt(...);
      seteuid(x);
    }
Execution trace (slide annotations):
 – x uninitialized, run as EUID 0
 – x=109, run as EUID 0
 – x=109, run as EUID 109. Lose the root privilege!
 – Get a special SITE EXEC command. Exploit a format string vulnerability.
 – x=0, still run as EUID 109.
 – Get a data command (e.g., PUT)
 – x=0, run as EUID 0
 – When returning to the service loop, the server still runs as EUID 0 (root).
 – Allow me to upload /etc/passwd. I can grant myself the root privilege!
Only an integer is corrupted; this is not a control data attack.
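The privilege logic traced on this slide can be sketched as a small simulation. This is a minimal model, not WU-FTP code: the struct and all `sim_*` names are hypothetical, the effective UID is just an integer field, and the format string bug is replaced by a direct write to the cached user ID.

```c
#include <assert.h>

/* Simulated process state; no real privileges are touched. */
struct sim_server {
    int euid;        /* simulated effective user ID */
    int cached_uid;  /* plays the role of the global x on the slide */
};

/* authenticate(); x = user ID; seteuid(x) */
void sim_login(struct sim_server *s, int uid) {
    s->cached_uid = uid;
    s->euid = uid;                 /* drop the root privilege */
}

/* Stand-in for the format string bug: an attacker-controlled write to x. */
void sim_corrupt_cached_uid(struct sim_server *s, int value) {
    s->cached_uid = value;
}

/* getdatasock(): regain root briefly, then trust the cached x again. */
void sim_getdatasock(struct sim_server *s) {
    s->euid = 0;                   /* seteuid(0) */
    /* setsockopt(...) would run here */
    s->euid = s->cached_uid;       /* seteuid(x) */
}

/* Runs the slide's scenario and returns the final simulated EUID. */
int sim_run(int corrupt) {
    struct sim_server s = { 0, -1 };   /* starts as root, x not yet set */
    sim_login(&s, 109);
    assert(s.euid == 109);             /* root privilege was dropped */
    if (corrupt)
        sim_corrupt_cached_uid(&s, 0);
    sim_getdatasock(&s);
    return s.euid;
}
```

Calling `sim_run(0)` leaves the simulated server at EUID 109, while `sim_run(1)` leaves it at EUID 0 without any code pointer having been modified.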

12 Non-control-hijacking attack on NULL-HTTP Server (via a heap overflow bug)
Attack the configuration string of the CGI-BIN path.
Mechanism of CGI
 – suppose server name =
 – CGI-BIN = /usr/local/httpd/exe
 – Requested URL =
 – The server executes /usr/local/httpd/exe/bar
Our attack
 – Exploit the vulnerability to overwrite CGI-BIN to /bin
 – Request URL
 – The server executes /bin/sh
The server gives me a root shell!
Only four characters in the CGI-BIN string are overwritten. Not a control data attack.

13 Non-control-hijacking attack on SSH Communications SSH Server (via an integer overflow bug)
    void do_authentication(char *user,...) {
      int auth = 0;
      ...
      while (!auth) {
        /* Get a packet from the client */
        type = packet_read();
        switch (type) {
          ...
          case SSH_CMSG_AUTH_PASSWORD:
            if (auth_password(user, password)) auth = 1;
          case ...
        }
        if (auth) break;
      }
      /* Perform session preparation. */
      do_authenticated(…);
    }
Execution trace (slide annotations):
 – auth = 0
 – auth = 1
 – Password incorrect, but auth = 1
 – auth = 1
 – Logged in without correct password

14 More non-control-hijacking attacks
Against NetKit Telnet server (default Telnet server of Redhat Linux)
 – Exploit a heap overflow bug
 – Overwrite two strings:
     /bin/login -h foo.com -p (normal scenario)
     /bin/sh -h -p -p (attack scenario)
 – The server runs /bin/sh when it tries to authenticate the user.
Against GazTek HTTP server
 – Exploit a stack buffer overflow bug
     – Send a legitimate URL
     – The server checks that "/.." is not embedded in the URL
     – Exploit the bug to change the URL to
     – The server executes /bin/sh

15 Implications of Non-Control-Data Attacks
Defense techniques that only ensure control flow integrity are not able to protect major network servers.
Non-control-data attacks are specific to application semantics, but many types of non-control data are critical to software security
 – E.g., user identity data, configuration data, user input data and decision-making data.
Once attackers have the incentive, they are likely to succeed in non-control-data attacks. A new challenge to defense techniques.

16 Re-Examining Current Defense Techniques
They were mainly tested against control-data attacks.
 – Many of them are based on control flow integrity
     – Monitor system call sequence
     – Protect control data
     – Non-executable stack and heap
 – Pointer encryption (PointGuard)
     – Needs to encrypt pointers in libraries to be effective (challenging because of insufficient type information, frequent type casting, and performance cost).
 – Address space randomization
     – Good idea. In each run of the program, the memory layout is different.
     – Challenging to deploy on all program segments.
     – Even if every segment is randomized, a recent paper shows that deployment on a 32-bit address space doesn't provide enough entropy.
 – StackGuard, Libsafe and FormatGuard
     – They are specific to defeating stack smashing attacks and format string attacks. Not generic solutions.
Building a generic and secure defense technique to defeat memory corruption attacks is still an open problem.
Future defense research should consider non-control-data attacks more seriously.

17 Pointer Taintedness Detection: Towards a Better Security Protection for Real-World Systems

18 Pointer Taintedness: a pointer value, including a return address, is derived from user input.
Most memory corruption attacks are due to pointer taintedness.
 – It allows users to arbitrarily specify the memory locations to read, write or transfer control to. Usually a pathological program behavior.
Pointer taintedness provides a unifying perspective for reasoning about a significant number of security vulnerabilities.

19 Most Memory Corruption Attacks are Due to Pointer Taintedness
Format string attack
 – Taint an argument pointer of functions such as printf, fprintf, sprintf and syslog.
Stack buffer overflow (stack smashing)
 – Taint a function frame pointer or a return address.
Heap corruption
 – Taint the free-chunk doubly-linked list of the heap.
Glibc globbing attack
 – User input resides in a location that is used as a pointer by the parent function of glob().

20 Internals of Stack Buffer Overflow Attacks
Vulnerable code: char buf[100]; strcpy(buf,user_input);
Stack layout (high to low address; the stack grows toward low addresses): Return addr | Frame pointer | buf[99] … buf[1] | buf[0]
user_input is copied into buf. The frame pointer or return address can be tainted.
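The layout on this slide can be explored safely with a byte-array model of the frame. This is a sketch under stated assumptions: a 4-byte word size, a frame consisting only of buf, the frame pointer and the return address, and hypothetical names throughout. It copies inside the model only, so nothing on the real stack is overwritten.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 100, WORD = 4, FRAME_SIZE = BUF_SIZE + 2 * WORD };

/* Modeled frame, low to high address: buf[0..99], frame pointer, return address. */
struct frame_model {
    unsigned char bytes[FRAME_SIZE];
    unsigned char tainted[FRAME_SIZE];   /* 1 if the byte came from user input */
};

/* Mimics strcpy(buf, user_input) inside the model.
   Writes are clipped at the end of the modeled frame. */
void model_strcpy(struct frame_model *f, const char *input) {
    size_t n = strlen(input) + 1;        /* include the terminating NUL */
    if (n > FRAME_SIZE)
        n = FRAME_SIZE;
    memset(f, 0, sizeof *f);
    memcpy(f->bytes, input, n);
    memset(f->tainted, 1, n);
}

/* Returns 1 if any byte in [off, off + len) is tainted. */
int range_tainted(const struct frame_model *f, size_t off, size_t len) {
    assert(off + len <= FRAME_SIZE);
    for (size_t i = 0; i < len; i++)
        if (f->tainted[off + i])
            return 1;
    return 0;
}

int frame_pointer_tainted(const struct frame_model *f) {
    return range_tainted(f, BUF_SIZE, WORD);
}

int return_address_tainted(const struct frame_model *f) {
    return range_tainted(f, BUF_SIZE + WORD, WORD);
}
```

An input shorter than 100 bytes leaves both saved words untainted; a 106-character input taints the frame pointer and part of the return address in the model.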

21 Internals of Format String Attacks
Vulnerable code: recv(buf); printf(buf); /* should be printf("%s",buf) */
In vfprintf(), if (fmt points to "%n") then **ap = (character count)
Attack input: \xdd \xcc \xbb \xaa %d %d %d %n
Stack layout (high to low address; the stack grows toward low addresses): … | %n | %d | %d | %d | 0xaabbccdd
fmt: format string pointer; ap: argument pointer
*ap is a tainted value.

22 Internals of Heap Corruption Attacks
Vulnerable code: buf = malloc(1000); recv(sock,buf,1024); free(buf);
Diagram labels: Free chunk A; Free chunk B (fd=A, bk=C); Free chunk C; allocated buffer buf; user input
In free(): B->fd->bk=B->bk; B->bk->fd=B->fd;
When B->fd and B->bk are tainted, the effect of free() is to write a user-specified value to a user-specified address.
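The two pointer writes in free() can be reproduced on an ordinary doubly-linked list. The sketch below uses a hypothetical `struct chunk` with only the fd and bk fields; real allocator chunks carry more metadata. The checked variant is an illustrative consistency test added here as an assumption, not something described in the talk.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal free-chunk header: forward and backward links only. */
struct chunk {
    struct chunk *fd;
    struct chunk *bk;
};

/* The two writes from the slide. If b->fd and b->bk are attacker-chosen,
   each write stores an attacker-chosen value at an attacker-chosen address. */
void unlink_unchecked(struct chunk *b) {
    b->fd->bk = b->bk;
    b->bk->fd = b->fd;
}

/* Unlinks b only if both neighbours still point back at b.
   Returns 0 on success, -1 if the list looks corrupted. */
int unlink_checked(struct chunk *b) {
    assert(b != NULL);
    if (b->fd == NULL || b->bk == NULL)
        return -1;
    if (b->fd->bk != b || b->bk->fd != b)
        return -1;
    unlink_unchecked(b);
    return 0;
}
```

With a well-formed list (B->fd = A, B->bk = C, A->bk = B, C->fd = B), `unlink_checked(&b)` joins A and C. If B's links are redirected to chunks that do not point back at B, the function refuses and performs no write.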

23 Building Defense Techniques based on Pointer Taintedness
Static code analysis: analyze the source code to extract the conditions under which the possibility of pointer taintedness exists.
 – To uncover potential vulnerabilities
Runtime detection: monitor at runtime whether a tainted value is dereferenced as a pointer.
 – To defeat memory corruption attacks (both control-hijacking and non-control-hijacking attacks)

24 Static Analysis about Pointer Taintedness: To Extract Security Specifications of Library Functions Appears in IFIP International Information Security Conference 2004

25 Library function specifications are crucial to secure programming
Library function specifications are currently ad hoc. Many of them are specified after real attacks are discovered.
 – printf(fmt,…): fmt must not be a user-specified string
 – strcpy(d,s): the length of string s must not exceed the size of buffer d, and d and s must not overlap.
 – free(p): p must be a pointer obtained from a previous malloc; p must not have been freed already.
 – glob(), strtok(), savestr(), ….
A unified reason why these specifications are required
 – They are required to eliminate the possibility of pointer taintedness.
Extraction of security specifications of a function is reduced to a theorem proving problem:
 – Under which conditions can a function eliminate the possibility of pointer taintedness?

26 Semantics of Pointer Taintedness
Formal definition of program semantics is required for theorem proving.
Taintedness-aware memory model
 – The logic framework defines operations to fetch the content and test the taintedness (true/false) of each memory location.
Incorporate pointer taintedness into program semantics
 – (Extend the equational semantic definition proposed by Goguen and Malcolm)
 – Define program semantics at the assembly level to reason about memory layout.
 – Load/Store/ALU instructions: propagate taintedness from source data to destination data.
 – Input functions (scanf, recv and recvfrom)
     – Axiom: The memory locations in the receiving buffer are tainted immediately after these function calls.

27 Extracting Function Specifications by Theorem Prover
Workflow (diagram):
 – C source code of a library function
 – Automatically translated to formal semantic representation
 – Theorem generation: for each pointer dereference in an assignment, generate a theorem stating that the pointer is not tainted
 – Theorem proving
 – Result: a set of sufficient conditions that imply the validity of the theorems. They are the security specifications of the analyzed function.

28 Example: vfprintf()
    int vfprintf (FILE *s, const char *format, va_list ap) {
      char *p, *q;
      int done, data, n, state;
      char buf[10];
      p = format; done = 0;
      if (p == NULL) return 0;
      state = NO_PENDING;
      while (*p != 0) {
        if (state == NO_PENDING) {
          if (*p == '%') state = PENDING;
          else outchar(s, *p);
        } else {
          switch (*p) {
            case '%': outchar(s, '%'); break;
            case 'd': data = va_arg(ap, int);
                      if (data < 0) { outchar(s, '-'); data = -data; }
                      n = 0;
                      while (data > 0 && n < 10) { buf[n] = data%10 + '0'; data /= 10; n++; }
                      while (n > 0) { n--; outchar(s, buf[n]); }
                      break;
            case 's': q = va_arg(ap, char *);
                      if (q == NULL) break;
                      while (*q != 0) { outchar(s, *q); q++; }
                      break;
            case 'n': q = va_arg(ap, void*);
                      *(int*) q = done;
                      break;
            default:  outchar(s, *p);
          }
          state = NO_PENDING;
        }
        p++;
      }
      return done;
    }
Theorem1 (for the assignment to buf[n]): buf+n should not be a tainted value
Theorem2 (for the assignment to *(int*) q): q should not be a tainted value

29 Extracting the Specifications of vfprintf()
Try to prove the two theorems.
The theorem prover cannot complete the proof initially
 – The theorems are only valid under certain preconditions.
Add these preconditions as axioms to the theorem prover.
Repeat until all theorems are proved.
Finally, the following four preconditions are added, which are the specifications of vfprintf (FILE *s, const char *format, va_list ap)
 – ap never points to any location within the current function frame.
 – *ap never points to the location of variable ap, i.e., *ap ≠ &ap
 – Suppose the memory segment that ap sweeps over is called ap_activity_range; then *ap never points to any location within ap_activity_range.
 – No locations within ap_activity_range are tainted before vfprintf() is called.
The preconditions suggest the scenario of format string vulnerability.

30 Other Studied Examples
Function strcpy()
 – Four security specifications indicating buffer overflow, buffer overlapping and buffer underflow scenarios causing pointer taintedness.
Function free() of a heap management system
 – Seven security specifications are extracted, including several specifications indicating heap corruption vulnerabilities.
Socket read functions of Apache HTTPD and NULL HTTPD
 – The Apache function is proven to be free of pointer taintedness.
 – Two (known) vulnerabilities are exposed in the theorem proving process of the NULL HTTPD function.

31 Runtime Detection of Pointer Taintedness: To Defeat Memory Corruption Attacks To appear in IEEE Conference on Dependable Systems and Networks, 2005.

32 The Technique
A processor architectural level mechanism to detect pointer taintedness
 – Implemented on the SimpleScalar simulator
Extend the memory system to be taintedness-aware
Enhance load, store and ALU instructions to propagate taintedness bits in memory
Untaint data that are checked by compare instructions
Enhance input system calls to initialize taintedness
Detect security attacks when tainted data are dereferenced, and stop the process.
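The propagate-and-check idea on this slide can be illustrated with a toy word-addressed memory that carries one taint flag per word. This is a software sketch under stated assumptions, not the SimpleScalar extension: all `tmem_*` names are hypothetical, addresses are wrapped into a 64-word memory, and the untainting by compare instructions is left out.

```c
#include <assert.h>
#include <stddef.h>

enum { MEM_WORDS = 64 };

/* Toy memory: each word has a value and a taint flag. */
struct tmem {
    unsigned value[MEM_WORDS];
    unsigned char taint[MEM_WORDS];
    int alarm;                       /* set when a tainted pointer is dereferenced */
};

void tmem_init(struct tmem *m) {
    for (size_t i = 0; i < MEM_WORDS; i++) {
        m->value[i] = 0;
        m->taint[i] = 0;
    }
    m->alarm = 0;
}

/* Input system call model: received data is tainted. */
void tmem_input(struct tmem *m, size_t addr, unsigned v) {
    assert(addr < MEM_WORDS);
    m->value[addr] = v;
    m->taint[addr] = 1;
}

/* Storing a program constant yields untainted data. */
void tmem_store_const(struct tmem *m, size_t addr, unsigned v) {
    assert(addr < MEM_WORDS);
    m->value[addr] = v;
    m->taint[addr] = 0;
}

/* ALU model: the result is tainted if either operand is tainted. */
void tmem_add(struct tmem *m, size_t dst, size_t a, size_t b) {
    assert(dst < MEM_WORDS && a < MEM_WORDS && b < MEM_WORDS);
    m->value[dst] = m->value[a] + m->value[b];
    m->taint[dst] = (unsigned char)(m->taint[a] | m->taint[b]);
}

/* Store through the pointer held in word ptr_addr.
   Returns 0 if the store happened, -1 if it was blocked and the alarm raised. */
int tmem_store_via(struct tmem *m, size_t ptr_addr, unsigned v) {
    assert(ptr_addr < MEM_WORDS);
    if (m->taint[ptr_addr]) {
        m->alarm = 1;
        return -1;
    }
    size_t target = m->value[ptr_addr] % MEM_WORDS;
    m->value[target] = v;
    m->taint[target] = 0;
    return 0;
}
```

A pointer built from constants dereferences normally, while a pointer computed from an input word is blocked at the dereference, regardless of whether the target would have been control data.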

33 Evaluations on Real-World Software
Evaluation
 – Effectiveness of detection
     – Run real network server programs on the enhanced SimpleScalar architecture.
     – Both control-hijacking and non-control-hijacking attacks are detected (in a uniform way).
 – No false alarm in any application tested
     – Network servers run without any alarm when there is no attack.
     – No alarm during normal executions of SPEC 2000 benchmarks. These are big applications such as GCC, BZIP2 and GZIP.
 – Transparent to applications
     – Run precompiled binaries on the architecture.
 – A small number of potential attack scenarios remain undetected.
     – They are rare, and not defeated by current generic defense techniques either.
Pointer taintedness detection is a technique that can be applied to the whole program (application + library) of real software, which offers a substantial improvement in security protection.

34 Conclusions

35 Conclusions
Our analysis shows that many real-world software applications can be compromised by corrupting non-control data. Non-control-data attacks represent a realistic threat.
 – It is insufficient to rely on control flow integrity for software security.
Pointer taintedness is a common characteristic of most memory corruption attacks, including control data attacks and non-control-data attacks.
Reasoning about pointer taintedness is a promising direction to enhance security on real-world systems
 – A theorem proving based code analysis approach is designed to reason about possibilities of pointer taintedness.
     – E.g., to formally extract security specifications of library functions.
 – A runtime pointer taintedness detection mechanism is designed. It can effectively detect most memory corruption attacks.

36 Future Directions
Short term goals
 – Provide a higher degree of automation for the theorem proving technique.
 – Reduce the intrusiveness of the runtime pointer taintedness detection technique
     – Combine with the theorem proving technique. The processor only checks function preconditions.
Long term goals
 – Extract programming styles that are susceptible to security attacks, e.g., long lifetime of security-critical data is a big problem. Can compilers detect bad programming styles?
 – Identify a broader range of non-traditional security threats other than non-control-hijacking attacks.
 – Study historical data about how security vulnerabilities were discovered, reported and patched. How successfully were we able to mitigate security threats?
 – Decompose the behaviors of viruses, worms and rootkits into a number of basic building blocks. Better understanding of their real capabilities, not disguised by their current forms.

37 Summary of My Research Methodology
Analysis-centric approach
 – A significant amount of effort in my dissertation is on analysis.
I like doing analysis on real data and incidents
 – Tedious? Sometimes, but it is a step toward a lot of fun.
 – Rewarding? Definitely. Especially important for systems research.
 – Goal: strongly motivate research topics that solve real problems.

38 Backup Slides

39 Static and Dynamic Approaches
Static approaches (avoid producing memory vulnerabilities in programs)
 – Writing code in a type-safe language
 – Compiler techniques to uncover memory vulnerabilities
 – Compiler instruments source code according to program annotations.
 – Challenges: legacy code and low-level code, compatibility and performance.
 – Fact: Memory vulnerabilities are still constantly discovered and exploited.
Intrusion detection techniques (defeat attacks, given the existence of vulnerabilities)
 – Specialized techniques
     – Defeat stack buffer overflow and format string attacks.
 – Generic defense techniques
     – Most techniques are designed to defeat control-hijacking attacks: host intrusion detection systems and control flow integrity protection techniques. A very active research area.
     – Others have constraints and difficulties in their deployment (pointer encryption and address randomization).

40 One-Slide Intro to Equational Logic
Use term rewriting to establish proofs of theorems.
Natural number addition expressed in the Maude system:
  0 : Natural.
  s_ : Natural -> Natural.
  _+_ : Natural Natural -> Natural.
  vars N M : Natural
  Axiom: N + 0 = N.
  Axiom: N + s M = s (N + M).
Derivation:
  (s s s 0) + (s s 0)
  = s ((s s s 0) + (s 0))
  = s (s ((s s s 0) + 0))
  = s (s (s s s 0))
  = s s s s s 0
Intuitively, this is a proof of “3 + 2 = 5” in natural number algebra.
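The derivation on this slide can be replayed mechanically. The following is a minimal Python sketch, not Maude itself: the term encoding (`"0"`, `("s", t)`, `("+", a, b)`) and the helper names `rewrite_step`, `normalize`, and `to_int` are assumptions made for illustration. It applies the two axioms one step at a time and records each intermediate term.

```python
# Terms: "0" is zero, ("s", t) is the successor of t, ("+", a, b) is a sum.
# This encoding is an assumption of the sketch, not Maude syntax.
ZERO = "0"

def s(t):
    return ("s", t)

def plus(a, b):
    return ("+", a, b)

def rewrite_step(term):
    """Apply one axiom once; return None if the term is already a numeral."""
    if term == ZERO:
        return None
    if term[0] == "s":
        inner = rewrite_step(term[1])
        return None if inner is None else ("s", inner)
    _, n, m = term                     # term has the shape N + M
    if m == ZERO:                      # Axiom: N + 0 = N
        return n
    if m[0] == "s":                    # Axiom: N + s M = s (N + M)
        return ("s", ("+", n, m[1]))
    return ("+", n, rewrite_step(m))   # right operand is a sum: reduce it first

def normalize(term):
    """Rewrite until no axiom applies; return every term along the way."""
    trace = [term]
    while True:
        nxt = rewrite_step(trace[-1])
        if nxt is None:
            return trace
        trace.append(nxt)

def to_int(numeral):
    """Count the successor applications in a numeral such as s s s 0."""
    n = 0
    while numeral != ZERO:
        numeral = numeral[1]
        n += 1
    return n

three = s(s(s(ZERO)))
two = s(s(ZERO))
trace = normalize(plus(three, two))
```

Here `trace` holds the four terms of the slide's derivation in order, starting at `(s s s 0) + (s s 0)` and ending at the numeral for 5.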

41 Taintedness-Aware Memory Model
A store represents a snapshot of the memory state at a point in the program execution.
For each memory location, we can evaluate two properties: content and taintedness (true/false).
Operations on memory locations:
–The fetch operation Ftch(S,A) gives the content of the memory address A in store S.
–The location-taintedness operation LocT(S,A) gives the taintedness of the location A in store S.
Operations on expressions:
–The evaluation operation Eval(S,E) evaluates expression E in store S.
–The expression-taintedness operation ExpT(S,E) computes the taintedness of expression E in store S.
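One way to picture the store is as two maps over addresses, one for content and one for taintedness. The sketch below is only an illustration: the `Store` class, the snake-case names `ftch` and `loc_t`, and the defaults for unwritten locations (content 0, untainted) are assumptions, not part of the formal model.

```python
class Store:
    """Snapshot of memory: a content per address plus a set of tainted addresses."""
    def __init__(self, content=None, tainted=None):
        self.content = dict(content or {})
        self.tainted = set(tainted or ())

def ftch(store, addr):
    # Ftch(S, A): content of address A in store S (assumed 0 if never written).
    return store.content.get(addr, 0)

def loc_t(store, addr):
    # LocT(S, A): taintedness of location A in store S.
    return addr in store.tainted

# Location 100 holds user-supplied data; location 104 holds a program constant.
S = Store(content={100: 65, 104: 7}, tainted={100})
```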

42 Axioms of Eval and ExpT operations
Note: ^ is the dereference operator; ^100 gives the content of location 100.
Eval axioms:
  Eval(S, I) = I    // I is an integer constant
  Eval(S, ^ E1) = Ftch(S, Eval(S,E1))
  Eval(S, E1 + E2) = Eval(S, E1) + Eval(S, E2)
  Eval(S, E1 - E2) = Eval(S, E1) - Eval(S, E2)
  …
ExpT axioms:
  ExpT(S, I) = false
  ExpT(S, ^ E1) = LocT(S, Eval(S,E1))
  ExpT(S, E1 + E2) = ExpT(S,E1) or ExpT(S,E2)
  ExpT(S, E1 - E2) = ExpT(S,E1) or ExpT(S,E2)
  …
E.g., is the expression (^100) - 2 tainted in store S?
  ExpT(S, (^100) - 2)
  = ExpT(S, (^100)) or ExpT(S, 2)
  = LocT(S,100) or false
  = LocT(S,100)
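The axioms translate almost line for line into two recursive functions. In this sketch the store is a plain dict with `"content"` and `"tainted"` entries, and expressions are encoded as integers or tuples such as `("^", e)` and `("-", a, b)`; both encodings and the function names are assumptions chosen for illustration.

```python
def ftch(S, addr):
    return S["content"].get(addr, 0)      # unwritten locations assumed to hold 0

def loc_t(S, addr):
    return addr in S["tainted"]

def eval_(S, e):
    if isinstance(e, int):                # Eval(S, I) = I
        return e
    op = e[0]
    if op == "^":                         # Eval(S, ^ E1) = Ftch(S, Eval(S, E1))
        return ftch(S, eval_(S, e[1]))
    if op == "+":
        return eval_(S, e[1]) + eval_(S, e[2])
    if op == "-":
        return eval_(S, e[1]) - eval_(S, e[2])
    raise ValueError(f"unknown operator: {op!r}")

def exp_t(S, e):
    if isinstance(e, int):                # ExpT(S, I) = false
        return False
    op = e[0]
    if op == "^":                         # ExpT(S, ^ E1) = LocT(S, Eval(S, E1))
        return loc_t(S, eval_(S, e[1]))
    if op in ("+", "-"):                  # taint of either operand taints the result
        return exp_t(S, e[1]) or exp_t(S, e[2])
    raise ValueError(f"unknown operator: {op!r}")

# The slide's example: (^100) - 2, once with location 100 tainted, once without.
expr = ("-", ("^", 100), 2)
tainted_store = {"content": {100: 7}, "tainted": {100}}
clean_store = {"content": {100: 7}, "tainted": set()}
```

As in the worked example, `exp_t(S, expr)` reduces to the taintedness of location 100, so it differs between the two stores while `eval_` gives the same value in both.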

43 Semantics of My Assembly Language
The following instructions are defined:
–mov [Exp1] <- Exp2
–branch (Condition) Label
–call FuncName(Exp1,Exp2,…)
Axioms defining mov instruction semantics
–Specify the effects of applying the mov instruction on a store.
–Allow taintedness to propagate from Exp2 to [Exp1].
  Ftch((S ; mov [E1] <- E2),X1) = Eval(S,E2) if (Eval(S,E1) is X1).
  Ftch((S ; mov [E1] <- E2),X1) = Ftch(S,X1) if not (Eval(S,E1) is X1).
  LocT((S ; mov [E1] <- E2),X1) = ExpT(S,E2) if (Eval(S,E1) is X1).
  LocT((S ; mov [E1] <- E2),X1) = LocT(S,X1) if not (Eval(S,E1) is X1).
Axioms defining the semantics of recv (similarly scanf and recvfrom: user input functions)
–Specify the memory locations tainted by the recv call.
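The four mov axioms define a new store in terms of the previous one, which suggests representing `(S ; mov [E1] <- E2)` as a node that points back at `S`. The sketch below follows that reading; the classes `Init` and `Mov`, the tuple encoding of expressions, and the default content 0 are assumptions of the sketch, and recv is not modeled because its axioms are not given on the slide.

```python
class Init:
    """Initial store: explicit contents and a set of tainted addresses."""
    def __init__(self, content, tainted):
        self.content, self.tainted = dict(content), set(tainted)

class Mov:
    """The store (S ; mov [E1] <- E2), kept symbolic as in the axioms."""
    def __init__(self, prev, e1, e2):
        self.prev, self.e1, self.e2 = prev, e1, e2

def eval_(S, e):
    if isinstance(e, int):
        return e
    if e[0] == "^":
        return ftch(S, eval_(S, e[1]))
    a, b = eval_(S, e[1]), eval_(S, e[2])
    return a + b if e[0] == "+" else a - b

def exp_t(S, e):
    if isinstance(e, int):
        return False
    if e[0] == "^":
        return loc_t(S, eval_(S, e[1]))
    return exp_t(S, e[1]) or exp_t(S, e[2])

def ftch(S, x1):
    if isinstance(S, Init):
        return S.content.get(x1, 0)
    if eval_(S.prev, S.e1) == x1:         # the mov wrote to X1
        return eval_(S.prev, S.e2)
    return ftch(S.prev, x1)               # X1 is untouched by this mov

def loc_t(S, x1):
    if isinstance(S, Init):
        return x1 in S.tainted
    if eval_(S.prev, S.e1) == x1:         # taint flows from E2 into [E1]
        return exp_t(S.prev, S.e2)
    return loc_t(S.prev, x1)

S0 = Init({200: 9}, {200})                # location 200 holds tainted input
S1 = Mov(S0, 300, ("^", 200))             # mov [300] <- ^200
S2 = Mov(S1, 300, 5)                      # mov [300] <- 5
```

In `S1` location 300 inherits both the value and the taint of location 200; in `S2` the constant 5 overwrites it and the taint is cleared, while location 200 stays tainted throughout.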

44 Example: strcpy()
Pipeline: translate to formal semantics, theorem generation, theorem proving.
C source:
  char * strcpy (char * dst, char * src) {
     char * res;
  0: res = dst;
     while (*src != 0) {
  1:    *dst = *src;
        dst++;
        src++;
     }
  2: *dst = 0;
     return res;
  }
Translation to the assembly language:
  0: mov [res] <- ^ dst
     lbl(#while#6)
     branch (^ ^ src is 0) #ex#while#6
  1: mov [^ dst] <- ^ ^ src
     mov [dst] <- (^ dst) + 1
     mov [src] <- (^ src) + 1
     branch true #while#6
     lbl(#ex#while#6)
  2: mov [^ dst] <- 0
     mov [ret] <- ^ res
Generated theorems:
a) Suppose S1 is the store before Line L1, then LocT(S1,dst) = false
b) If S0 is the store before Line L0, and S2 is the store after Line L1, then ∀I: LocT(S2,I) = LocT(S0,I)
c) Suppose S3 is the store before Line L2, then LocT(S3,dst) = false
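To see how the translated listing behaves on a concrete store, it can be run through a small interpreter that tracks content and taintedness together. This is a sketch for experimentation, not the theorem-proving approach of the slides: the tuple encoding of instructions, the shortened label names `while` and `exit`, the use of variable names as addresses, and the concrete addresses 100 and 200 are all assumptions.

```python
def run(program, content, tainted):
    """Execute the instruction list, mutating `content` (dict) and `tainted` (set)."""
    def ev(e):
        if isinstance(e, (int, str)):     # integer constant or a variable's address
            return e
        op = e[0]
        if op == "^":
            return content.get(ev(e[1]), 0)
        if op == "+":
            return ev(e[1]) + ev(e[2])
        if op == "is":
            return ev(e[1]) == ev(e[2])
        raise ValueError(f"unknown operator: {op!r}")

    def taint(e):
        if isinstance(e, (int, str)):
            return False
        if e[0] == "^":
            return ev(e[1]) in tainted
        return taint(e[1]) or taint(e[2])

    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "lbl"}
    pc = 0
    while pc < len(program):
        ins = program[pc]
        if ins[0] == "mov":
            # Evaluate everything in the old store before updating it.
            addr, value, is_tainted = ev(ins[1]), ev(ins[2]), taint(ins[2])
            content[addr] = value
            if is_tainted:
                tainted.add(addr)
            else:
                tainted.discard(addr)
        elif ins[0] == "branch" and ev(ins[1]):
            pc = labels[ins[2]]
            continue
        pc += 1

STRCPY = [
    ("mov", "res", ("^", "dst")),
    ("lbl", "while"),
    ("branch", ("is", ("^", ("^", "src")), 0), "exit"),
    ("mov", ("^", "dst"), ("^", ("^", "src"))),
    ("mov", "dst", ("+", ("^", "dst"), 1)),
    ("mov", "src", ("+", ("^", "src"), 1)),
    ("branch", True, "while"),
    ("lbl", "exit"),
    ("mov", ("^", "dst"), 0),
    ("mov", "ret", ("^", "res")),
]

# src points at the tainted string "hi" stored at 100..102; dst points at 200.
content = {"dst": 200, "src": 100, 100: ord("h"), 101: ord("i"), 102: 0}
tainted = {100, 101}
run(STRCPY, content, tainted)
```

After the run, the two copied bytes at 200 and 201 are tainted, the terminating 0 at 202 is not, and the pointer variable `dst` itself remains untainted because it is only ever updated from its own untainted value plus a constant.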

45 Specifications Extracted
Suppose that when function strcpy() is called, the size of the destination buffer (dst) is dstsize and the length of the user input string (src) is srclen.
Specifications that are extracted by the theorem proving approach:
–srclen <= dstsize
–The buffers src and dst do not overlap in such a way that the buffer dst covers the string terminator of the src string.
–The buffers dst and src do not cover the function frame of strcpy.
–Initially, dst is not tainted.
Slide annotations on these specifications: "Documented in Linux man page" and "Not documented".
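The first specification can be explored with a byte-level simulation of strcpy that tracks taint. This is a hedged sketch under an invented memory layout: the helper names `strcpy_taint` and `neighbor_tainted`, the addresses, and the "saved pointer" placed directly after the dst buffer are all hypothetical.

```python
def strcpy_taint(mem, tainted, dst, src):
    """Copy a 0-terminated string from src to dst, propagating taint per byte."""
    while mem[src] != 0:
        mem[dst] = mem[src]
        if src in tainted:
            tainted.add(dst)
        else:
            tainted.discard(dst)
        dst += 1
        src += 1
    mem[dst] = 0                  # the terminator is a constant, hence untainted
    tainted.discard(dst)

def neighbor_tainted(srclen, dstsize=4):
    """Return True if the location right after the dst buffer ends up tainted."""
    src, dst = 100, 200
    neighbor = dst + dstsize      # hypothetical pointer stored just past dst
    mem = {src + i: ord("A") for i in range(srclen)}
    mem[src + srclen] = 0
    mem[neighbor] = 0x1234
    tainted = set(range(src, src + srclen))   # every input byte is tainted
    strcpy_taint(mem, tainted, dst, src)
    return neighbor in tainted
```

With `dstsize=4`, input lengths 3 and 4 leave the neighboring location untainted, while length 5 taints it. In this sketch a length equal to dstsize still lets the terminating 0 overwrite the neighbor, but that byte is a constant and therefore untainted; this is an observation about the sketch, not a statement about the original proof.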

46 Internships in Industrial Labs
Summer’01, Avaya Labs, Basking Ridge, NJ
–Libsafe is a software package originally invented for Linux to detect stack buffer overflow attacks. I implemented it on Windows NT/2000.
Summer’02, Bell Labs, Holmdel, NJ
–Mitigate network congestive denial of service attacks by detecting TCP-unfriendly flows
Summer’03, Microsoft Research, Redmond, WA
–Audit-enhanced authentication in Kerberos
Summer’04, Microsoft Research, Redmond, WA
–A tracing technique to identify the dependencies of Windows applications on Administrator privileges