1 Enhancing Security of Real-World Systems with a Better Understanding of Threats Shuo Chen Ph.D. Candidate in Computer Science Center for Reliable and High Performance Computing University of Illinois at Urbana-Champaign

2 Security Threat Analysis and Mitigations in Real-World Systems
–How do errors in hardware and software impose security threats on real-world systems? (common characteristics?)
–How effective are current defense techniques? (substantial deficiencies?)
–How to build better defenses?
Analysis-centric research approach
–Study hardware memory errors and their impact on system security
–Software vulnerabilities reported in Bugtraq and CERT databases, source code of vulnerable applications
–Current attack methods and defense techniques
–Analysis results motivate the development of new defense techniques.
Many areas related to my dissertation
[Figure: related research areas around "My Dissertation"]

3 I as a System Hacker/Builder
Summer’01, Avaya Labs, Basking Ridge, NJ
–Port Libsafe to Windows NT/2000.
Summer’02, Bell Labs, Holmdel, NJ
–Detection of network denial of service attacks
–Hack FreeBSD TCP/IP, network card drivers
Summer’03, Microsoft Research, Redmond, WA
–Audit-enhanced authentication in Kerberos
–NTOS security subsystem, Kerberos, LSA, NTDLL
Summer’04, Microsoft Research, Redmond, WA
–A tracing technique to identify the dependencies of Windows applications on Administrator privileges
–NTOS security subsystem, access/privilege checking, application interactions with NTOS

4 Outline
Analyses: Analyzing and Identifying Security Threats on Real-World Systems
–Security compromises due to HW/SW memory corruptions
–A type of memory corruption attack currently believed to be rare is a realistic threat.
–Deficiencies of current defense techniques
Solutions: New Defense Techniques Towards a Better Security Protection
–A common characteristic of memory corruption attacks: pointer taintedness
–A theorem proving based program analysis
–A runtime detection technique

5 Analyzing and Identifying Security Threats on Real-World Systems

6 Threat of Hardware Memory Errors
[Figure: attacker, firewall (IPChains and Netfilter), target host. Due to hardware memory errors, packets can penetrate firewalls.]
[Figure: attacker, network server (FTP and SSH). Due to hardware memory errors, users can log in with arbitrary passwords.]
Emulate random hardware memory errors
A stochastic model to estimate such threats in real environments
Motivate other researchers to conduct physical fault injections
–Java type system subverted due to random hardware memory errors.

7 Threat of Software Vulnerabilities
CERT Advisories: 66% of vulnerabilities are low-level memory errors in software.
Widely exploited by attackers, worms and viruses.

8 State Machine Model: WU-FTP Server Attack
[Figure: state machine of FTP_service() (authentication; x = user ID; seteuid(x); repeat: get an FTP command; SITE_EXEC(fn); printf(fn,…)) with the attack steps: embed malicious contents in input; overwrite a return address; execute malicious code: seteuid(0); exec(“/bin/sh”).]

9 State Machine Model: NULL-HTTP Server Attack
[Figure: state machine of HTTP_service() (repeat: process HTTP header; p=malloc(…); HTTP_POST(); recv(p,…); free(p)) with the attack steps: corrupt heap structure; overwrite function pointer foo; execute malicious code via *foo(): seteuid(0); exec(“/bin/sh”).]

10 Control Data Attack: Well-Known, Dominant
Control data:
–data used as targets of call, return and jump.
–widely understood as security-critical elements
Control data attack: the most dominant form of memory corruption attack [CERT and Microsoft Security Bulletin]
Many current defense techniques enforce program control flow integrity to provide security.

11 Non-control-data attacks
Currently very rare in reality.
One instance suggested by Young and McHugh in 1987.
How applicable are such attacks against real-world software?
–Not studied yet, but important.

12 An Important Question
Are attackers in general incapable of mounting non-control-data attacks against many real systems?
–PROBABLY NOT!
–Random hardware memory errors can subvert the security of real-world systems with a non-negligible probability.
–Software vulnerabilities are more deterministic and more amenable to attacks.
–Each attack exploiting software vulnerabilities is composed of multiple primitive components. This allows potentially polymorphic attacks, which is dangerous.

13 Our Claim: General Applicability of Non-control-data Attacks
We claim:
–Many real-world software applications are susceptible to non-control-data attacks.
–The severity of the attack consequences is equivalent to that of control data attacks.
Validate the claim by constructing non-control-data attacks to get the root privilege on major network servers
–FTP, HTTP, SSH and Telnet servers
–Over 1/3 of vulnerabilities in CERT advisories
Non-control-data attacks are realistic threats.

14 Non-control-data attack against WU-FTP Server (via a format string bug)
int x;
FTP_service(...) {
    authenticate();
    x = user ID of the authenticated user;
    seteuid(x);
    while (1) {
        get_FTP_command(...);   // vulnerable
        if (a data command?)
            getdatasock(...);
    }
}
getdatasock(...) {
    seteuid(0);
    setsockopt(...);
    seteuid(x);
}
Attack walkthrough:
–Before authentication: x uninitialized, run as EUID 0.
–After authentication: x=109, run as EUID 0.
–After seteuid(x): x=109, run as EUID 109. Lose the root privilege!
–Get a special SITE EXEC command. Exploit a format string vulnerability: x=0, still run as EUID 109.
–Get a data command (e.g., PUT). In getdatasock(), after seteuid(0): x=0, run as EUID 0. After seteuid(x) and the return to the service loop, the server still runs as EUID 0 (root).
–Allow me to upload /etc/passwd. I can grant myself the root privilege!
Only corrupt an integer, not a control data attack.
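The privilege logic on this slide can be replayed without touching real credentials. The following is a minimal sketch, assuming a simulated process record in place of the kernel's EUID; the names sim_proc, sim_seteuid, sim_getdatasock and sim_session are hypothetical and are not taken from WU-FTPD, and the format string write is replaced by a plain assignment.

```c
#include <assert.h>

/* Simulated process state: no real privileges are touched. */
struct sim_proc {
    int euid;       /* simulated effective user ID */
    int saved_uid;  /* plays the role of x on the slide */
};

void sim_seteuid(struct sim_proc *p, int uid) { p->euid = uid; }

/* Mirrors getdatasock(): raise to root, do the work, drop back to saved_uid. */
void sim_getdatasock(struct sim_proc *p) {
    sim_seteuid(p, 0);
    /* setsockopt(...) would run here with root privilege */
    sim_seteuid(p, p->saved_uid);
}

/* Replays one session and returns the EUID in effect when control returns
   to the service loop. If attack is nonzero, saved_uid is overwritten with 0
   between login and the data command (a stand-in for the format string
   write; the real exploit is not modeled). */
int sim_session(int login_uid, int attack) {
    struct sim_proc p = { 0, -1 };   /* the server starts as root */
    assert(login_uid > 0);           /* an unprivileged login is assumed */
    p.saved_uid = login_uid;
    sim_seteuid(&p, p.saved_uid);    /* drop privilege after authentication */
    if (attack)
        p.saved_uid = 0;             /* only an integer is corrupted */
    sim_getdatasock(&p);             /* data command, e.g. PUT */
    return p.euid;
}
```

In this sketch sim_session(109, 0) ends at EUID 109, while sim_session(109, 1) ends at EUID 0, although no return address or function pointer was modified.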

15 Non-control-hijacking attack against NULL-HTTP Server (via a heap overflow bug)
Attack the configuration string of CGI-BIN path.
Mechanism of CGI
–Suppose server name = www.foo.com, CGI-BIN = /usr/local/httpd/exe
–Requested URL = http://www.foo.com/cgi-bin/bar
–The server executes /usr/local/httpd/exe/bar
Our attack
–Exploit the vulnerability to overwrite CGI-BIN to /bin
–Request URL http://www.foo.com/cgi-bin/sh
–The server executes /bin/sh
The server gives me a root shell!
Only overwrite four characters in the CGI-BIN string. Not a control data attack.
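The CGI mechanism above comes down to joining a configuration string with the requested program name. The following is a minimal sketch, assuming a hypothetical http_config structure and helper names (resolve_cgi, simulate_corruption) that are not taken from the NULL-HTTP source; the heap overflow itself is replaced by a direct overwrite of the configuration string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical server configuration; the field name is illustrative. */
struct http_config {
    char cgi_bin[64];
};

/* Build the path the server would execute for a /cgi-bin/<name> request.
   Returns 0 on success, -1 if the result does not fit in out. */
int resolve_cgi(const struct http_config *cfg, const char *name,
                char *out, size_t out_size) {
    assert(out_size > 0);
    int n = snprintf(out, out_size, "%s/%s", cfg->cgi_bin, name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Stand-in for the memory corruption: replaces the configuration string. */
void simulate_corruption(struct http_config *cfg, const char *value) {
    snprintf(cfg->cgi_bin, sizeof cfg->cgi_bin, "%s", value);
}
```

With cgi_bin set to /usr/local/httpd/exe, a request for bar resolves to /usr/local/httpd/exe/bar. After simulate_corruption(&cfg, "/bin"), a request for sh resolves to /bin/sh, and the code path that builds and runs the command is unchanged.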

16 Non-control-data attack against SSH Communications SSH Server (via an integer overflow bug)
void do_authentication(char *user, ...) {
    int auth = 0;
    ...
    while (!auth) {
        /* Get a packet from the client */
        type = packet_read();
        switch (type) {
        ...
        case SSH_CMSG_AUTH_PASSWORD:
            if (auth_password(user, password))
                auth = 1;
        case ...
        }
        if (auth) break;
    }
    /* Perform session preparation. */
    do_authenticated(…);
}
Attack walkthrough:
–At initialization: auth = 0.
–After the memory corruption: auth = 1.
–Password incorrect, but auth = 1.
–The loop exits with auth = 1.
–Logged in without correct password.

17 More non-control-hijacking attacks
Against NetKit Telnet server (default Telnet server of Redhat Linux)
–Exploit a heap overflow bug
–Overwrite two strings:
  /bin/login –h foo.com -p (normal scenario)
  /bin/sh –h –p -p (attack scenario)
–The server runs /bin/sh when it tries to authenticate the user.
Against GazTek HTTP server
–Exploit a stack buffer overflow bug
  Send a legitimate URL http://www.foo.com/cgi-bin/bar
  The server checks that “/..” is not embedded in the URL
  Exploit the bug to change the URL to http://www.foo.com/cgi-bin/../../../../bin/sh
  The server executes /bin/sh
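The GazTek case is a check followed by a corruption followed by a use. The following is a minimal sketch, assuming hypothetical helper names (url_is_clean, handle_request) that do not come from the GazTek source; the stack overflow is replaced by an explicit, bounded overwrite of the already-validated path.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if "/.." does not appear in the path, mirroring the slide's check. */
int url_is_clean(const char *path) {
    return strstr(path, "/..") == NULL;
}

/* Hypothetical request handler.
   Returns -1 if the request is rejected by the check,
            1 if the path that would be used is still clean,
            0 if a traversal path slipped past the check.
   corrupt_with, when non-NULL, stands in for the overflow that rewrites
   the path after validation. */
int handle_request(char *path, size_t cap, const char *corrupt_with) {
    assert(path != NULL);
    if (!url_is_clean(path))
        return -1;
    if (corrupt_with != NULL && strlen(corrupt_with) < cap)
        strcpy(path, corrupt_with);   /* fits: length checked on the line above */
    return url_is_clean(path);
}
```

The check itself is correct in this sketch. The result still becomes 0 once the validated data is changed before use, which is why protecting only control data does not help here.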

18 Implications of Non-Control-Data Attacks
Control flow integrity is not a sufficiently accurate approximation to software security.
Many types of non-control data are critical to security.
Once attackers have the incentive, they are likely to succeed in non-control-data attacks.

19 Re-Examining Current Defense Techniques
Many of them are based on control flow integrity
–Monitor system call sequences
–Protect control data
–Non-executable stack and heap
Pointer encryption: PointGuard
Address space randomization
StackGuard, Libsafe and FormatGuard
Building a generic and secure defense technique: still an open problem.

20 Pointer Taintedness Detection: Towards a Better Security Protection for Real-World Systems

21 Pointer Taintedness: a pointer value, including a return address, is derived from user input.
Most memory corruption attacks are due to pointer taintedness.
Pointer taintedness: a unifying perspective for reasoning about many security attacks.

22 Most Memory Corruption Attacks are Due to Pointer Taintedness
Format string attack
–Taint an argument pointer of functions such as printf, sprintf and syslog.
Stack buffer overflow (stack smashing)
–Taint a frame pointer or a return address.
Heap corruption
–Taint the free-chunk doubly-linked list maintaining the heap structure.
Globbing attack
–User input resides in a location that is used as a pointer by the parent function of glob().

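23 Internals of Stack Buffer Overflow Attacks
Vulnerable code:
    char buf[100];
    strcpy(buf, user_input);
Stack layout, from high to low addresses (the stack grows toward low addresses):
    Return addr
    Frame pointer
    buf[99]
    …
    buf[1]
    buf[0]
user_input is copied into buf starting at buf[0] and continues toward higher addresses.
Frame pointer or return address can be tainted.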
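The layout can be explored safely with a byte array that stands in for the frame. The following is a minimal sketch, assuming a simulated frame with an 8-byte buffer and 4-byte saved words; the structure and function names are hypothetical, and the copy is clamped to the simulated frame so that the example itself has no undefined behavior.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { BUF_LEN = 8, WORD = 4 };

/* Simulated frame, low to high: buf, saved frame pointer, return address. */
struct sim_frame {
    uint8_t bytes[BUF_LEN + 2 * WORD];
};

uint32_t read_word(const struct sim_frame *f, size_t off) {
    uint32_t w;
    assert(off + sizeof w <= sizeof f->bytes);
    memcpy(&w, f->bytes + off, sizeof w);
    return w;
}

void init_frame(struct sim_frame *f, uint32_t fp, uint32_t ret) {
    memset(f->bytes, 0, sizeof f->bytes);
    memcpy(f->bytes + BUF_LEN, &fp, WORD);
    memcpy(f->bytes + BUF_LEN + WORD, &ret, WORD);
}

/* Copies n input bytes starting at buf[0] without checking BUF_LEN, like
   strcpy(buf, user_input); clamped to the simulated frame only. */
void sim_unchecked_copy(struct sim_frame *f, const uint8_t *in, size_t n) {
    if (n > sizeof f->bytes)
        n = sizeof f->bytes;
    memcpy(f->bytes, in, n);
}

uint32_t saved_return(const struct sim_frame *f) {
    return read_word(f, BUF_LEN + WORD);
}
```

A 4-byte input leaves the saved return address intact. A 16-byte input of 0x41 bytes turns it into 0x41414141, that is, a value derived entirely from user input.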

24 Internals of Format String Attacks
Vulnerable code:
    recv(buf);
    printf(buf);   /* should be printf("%s",buf) */
In vfprintf(), if (fmt points to “%n”) then **ap = (character count)
[Figure: stack from high to low addresses (the stack grows toward low addresses). The user-supplied buffer holds \xdd \xcc \xbb \xaa %d %d %d %n; fmt is the format string pointer and ap is the argument pointer, which reaches the word 0xaabbccdd when %n is processed.]
*ap is a tainted value.
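The fix noted in the code comment on this slide is to pass user data as an argument and never as the format. The following is a minimal sketch of that fix, assuming a hypothetical helper render_safely; it uses snprintf so the result can be inspected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Renders untrusted text as data. The format string is the constant "%s",
   so conversion specifiers inside untrusted are copied, not interpreted.
   Returns what snprintf returns: the length the full output would have. */
int render_safely(char *out, size_t out_size, const char *untrusted) {
    assert(out_size > 0);
    return snprintf(out, out_size, "%s", untrusted);
}
```

With this helper the input "%d %d %d %n" is emitted literally. No argument pointer is advanced over the stack and no write through *ap takes place.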

25 Internals of Heap Corruption Attacks
Vulnerable code:
    buf = malloc(1000);
    recv(sock, buf, 1024);
    free(buf);
[Figure: heap containing free chunk A, free chunk B (fd=A, bk=C), the allocated buffer buf and free chunk C; user input is written into buf.]
In free():
    B->fd->bk = B->bk;
    B->bk->fd = B->fd;
When B->fd and B->bk are tainted, the effect of free() is to write a user specified value to a user specified address.
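The two assignments in free() can be exercised on ordinary structures. The following is a minimal sketch, assuming a simplified chunk with only the two link fields; real allocators keep more metadata, and the name unlink_chunk is illustrative.

```c
#include <assert.h>
#include <stddef.h>

struct chunk {
    struct chunk *fd;  /* forward link */
    struct chunk *bk;  /* backward link */
};

/* The two writes performed when chunk b is taken off the free list. */
void unlink_chunk(struct chunk *b) {
    assert(b->fd != NULL && b->bk != NULL);
    b->fd->bk = b->bk;
    b->bk->fd = b->fd;
}
```

On a well-formed list (B.fd = A, B.bk = C) the call simply links A and C to each other. If both fields of B come from input, the first assignment stores the chosen value B.bk at the chosen location &B.fd->bk, which is the write primitive described on the slide.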

26 Building Defense Techniques based on Pointer Taintedness
Static code analysis: analyze the source code to extract the conditions under which pointer taintedness is possible.
–To uncover potential vulnerabilities
Runtime detection: monitor at runtime whether a tainted value is dereferenced as a pointer.
–To defeat memory corruption attacks

27 Static Analysis about Pointer Taintedness: To Extract Security Specifications of Library Functions IFIP International Information Security Conference 2004

28 Library function specifications are crucial to secure programming
Library function specifications are specified empirically
–printf(fmt,…), strcpy(d,s), free(p), glob(p), strtok(s,del), savestr(p), ….
A unified reason why these specifications are required
–Required to eliminate pointer taintedness.
Extraction of security specifications of a function is reduced to a theorem proving task
Formal and complete specifications are required by compiler techniques to check application source code for security.

29 Semantics of Pointer Taintedness
Formal definition of program semantics is required for theorem proving.
–Currently defined using an equational logic framework
Taintedness-aware memory model
–The logic framework defines operations to fetch the content and test the taintedness (true/false) of each memory location.
Incorporate pointer taintedness into program semantics
–Define program semantics at the assembly level to reason about memory layout.
–Load/Store/ALU instructions: propagate taintedness from source data to destination data.
–Input functions (scanf, recv and recvfrom)
  Axiom: The memory locations in the receiving buffer are tainted immediately after these function calls.
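The memory model and propagation rules above can be illustrated with a small executable model. The following is a minimal sketch in C, not the equational logic framework used in the work; the byte-granular taint bit, the OR rule for the ALU operation and all names (taint_mem, sim_recv, move_byte, add_bytes) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 64 };

/* Each memory location carries a content byte and a taintedness bit. */
struct taint_mem {
    uint8_t value[MEM_SIZE];
    uint8_t tainted[MEM_SIZE];
};

void mem_init(struct taint_mem *m) { memset(m, 0, sizeof *m); }

/* Input axiom: the receiving buffer is tainted right after the call. */
void sim_recv(struct taint_mem *m, size_t addr, const uint8_t *in, size_t n) {
    assert(addr <= MEM_SIZE && n <= MEM_SIZE - addr);
    memcpy(m->value + addr, in, n);
    memset(m->tainted + addr, 1, n);
}

/* Storing a program constant yields an untainted location. */
void store_const(struct taint_mem *m, size_t addr, uint8_t v) {
    assert(addr < MEM_SIZE);
    m->value[addr] = v;
    m->tainted[addr] = 0;
}

/* Load/store: content and taintedness move together. */
void move_byte(struct taint_mem *m, size_t dst, size_t src) {
    assert(dst < MEM_SIZE && src < MEM_SIZE);
    m->value[dst] = m->value[src];
    m->tainted[dst] = m->tainted[src];
}

/* ALU: the result is tainted if either operand is tainted (assumed rule). */
void add_bytes(struct taint_mem *m, size_t dst, size_t a, size_t b) {
    assert(dst < MEM_SIZE && a < MEM_SIZE && b < MEM_SIZE);
    m->value[dst] = (uint8_t)(m->value[a] + m->value[b]);
    m->tainted[dst] = m->tainted[a] | m->tainted[b];
}

/* Test operation of the memory model: is this location tainted? */
int is_tainted(const struct taint_mem *m, size_t addr) {
    assert(addr < MEM_SIZE);
    return m->tainted[addr] != 0;
}
```

In this model a value received from input stays tainted when it is moved or combined with a constant, while a sum of two constants stays untainted.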

30 Extracting Function Specifications by Theorem Prover
[Figure: workflow]
–C source code of a library function
–Automatically translated to formal semantic representation
–Theorem generation: for each pointer dereference in an assignment, generate a theorem stating that the pointer is not tainted
–Theorem proving
–Result: a set of sufficient conditions that imply the validity of the theorems. They are the security specifications of the analyzed function.

31 Example: vfprintf()
int vfprintf (FILE *s, const char *format, va_list ap)
{
    char *p, *q;
    int done, data, n, state;
    char buf[10];
    p = format;
    done = 0;
    if (p == NULL) return 0;
    state = NO_PENDING;
    while (*p != 0) {
        if (state == NO_PENDING) {
            if (*p == '%') state = PENDING;
            else outchar(s, *p);
        } else {
            switch (*p) {
            case '%':
                outchar(s, '%');
                break;
            case 'd':
                data = va_arg(ap, int);
                if (data < 0) { outchar(s, '-'); data = -data; }
                n = 0;
                while (data > 0 && n < 10) {
                    buf[n] = data % 10 + '0';   /* Theorem1 */
                    data /= 10;
                    n++;
                }
                while (n > 0) { n--; outchar(s, buf[n]); }
                break;
            case 's':
                q = va_arg(ap, char *);
                if (q == NULL) break;
                while (*q != 0) { outchar(s, *q); q++; }
                break;
            case 'n':
                q = va_arg(ap, void *);
                *(int *) q = done;              /* Theorem2 */
                break;
            default:
                outchar(s, *p);
            }
            state = NO_PENDING;
        }
        p++;
    }
    return done;
}
Theorem1: buf+n should not be a tainted value
Theorem2: q should not be a tainted value

32 Extracting the Specifications of vfprintf()
Try to prove the two theorems
The theorem prover cannot complete the proof initially
–The theorems are only valid under certain preconditions.
Add these preconditions as axioms to the theorem prover.
Repeat until both theorems are proved.
Four preconditions are added: the specifications of vfprintf (FILE *s, const char *format, va_list ap)
–ap never points to any location within the current function frame.
–*ap never points to the location of variable ap, i.e., *ap ≠ &ap
–Suppose the memory segment that ap sweeps over is called ap_activity_range, then *ap never points to any location within ap_activity_range.
–No locations within ap_activity_range are tainted before vfprintf() is called.
Suggests the scenario of format string vulnerability.

33 Other Studied Examples
Function strcpy()
–Four security specifications indicating buffer overflow, buffer overlapping and buffer underflow scenarios causing pointer taintedness.
Function free() of a heap management system
–Seven security specifications are extracted, including several specifications indicating heap corruption vulnerabilities.
Socket read functions of Apache HTTP Server and NULL HTTP Server
–Apache function is proven to be free of pointer taintedness.
–Two (known) vulnerabilities are exposed in the theorem proving process of NULL HTTP Server function.
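Two of the strcpy() scenarios named above, overflow and overlap, can be expressed as a caller-side check. The following is a minimal sketch, assuming the caller knows the destination size; it does not reproduce the four extracted specifications, and the function name strcpy_args_ok is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if copying src, including its terminator, into dst[0..dst_size)
   neither overflows dst nor overlaps the bytes being read; 0 otherwise.
   NULL arguments are treated as a programming error. */
int strcpy_args_ok(const char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL);
    size_t need = strlen(src) + 1;
    if (need > dst_size)
        return 0;                     /* would overflow the destination */
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    /* The ranges [d, d+need) and [s, s+need) must be disjoint. */
    if (d < s)
        return s - d >= need;
    return d - s >= need;
}
```

A call such as strcpy_args_ok(dst, sizeof dst, input) before strcpy(dst, input) rejects both an oversized input and a source that aliases the destination.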

34 Runtime Pointer Taintedness Detection: To Defeat Memory Corruption Attacks To appear in IEEE Conference on Dependable Systems and Networks, 2005.

35 The Technique
A processor architectural level mechanism to detect pointer taintedness
–On SimpleScalar simulator
  Implemented a taintedness-aware memory system
  Extended instructions to track taintedness
–To show the validity of pointer taintedness concept on whole programs of real applications
  Network servers
  SPEC 2000 integer benchmarks
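The detection rule, raising an alert when a tainted value is used as an address, can be shown with a toy machine. The following is a minimal sketch, assuming word-granular taint bits and a 16-word memory; it is not the SimpleScalar extension, and all names (treg, sim_cpu, sim_load, sim_store) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SIM_WORDS = 16 };

/* A simulated register: a value plus one taintedness bit. */
struct treg {
    uint32_t value;
    int tainted;
};

struct sim_cpu {
    uint32_t mem[SIM_WORDS];
    int mem_tainted[SIM_WORDS];
    int alarm;   /* set when a tainted value is about to be dereferenced */
};

void sim_reset(struct sim_cpu *cpu) { memset(cpu, 0, sizeof *cpu); }

/* Load a word through addr. Returns 0 on success, -1 if the address is
   tainted (alarm raised) or out of range. */
int sim_load(struct sim_cpu *cpu, struct treg addr, struct treg *dst) {
    assert(cpu != NULL && dst != NULL);
    if (addr.tainted) { cpu->alarm = 1; return -1; }
    if (addr.value >= SIM_WORDS) return -1;
    dst->value = cpu->mem[addr.value];
    dst->tainted = cpu->mem_tainted[addr.value];   /* taint follows the data */
    return 0;
}

/* Store a word through addr, with the same check on the address. */
int sim_store(struct sim_cpu *cpu, struct treg addr, struct treg src) {
    assert(cpu != NULL);
    if (addr.tainted) { cpu->alarm = 1; return -1; }
    if (addr.value >= SIM_WORDS) return -1;
    cpu->mem[addr.value] = src.value;
    cpu->mem_tainted[addr.value] = src.tainted;
    return 0;
}
```

Tainted data may be stored and loaded freely in this model. The alarm is raised only when a tainted value is itself used as the address of a load or store.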

36 Evaluations on Real-World Software
Evaluation
–Effectiveness of detection
–No false alarm in any application evaluated
–Transparent to applications
–A small number of potential attack scenarios undetected.
Pointer taintedness detection can be applied to the whole program of real software
–offers a substantial improvement on security protection.

37 Conclusions

38 Conclusions
Many real-world software applications can be compromised by corrupting non-control data.
–It is insufficient to rely on control flow integrity for software security.
Pointer taintedness is a unifying perspective to reason about most memory corruption vulnerabilities/attacks.
Reasoning about pointer taintedness is a promising direction to enhance security on real-world systems
–A theorem proving based code analysis approach
–A runtime pointer taintedness detection mechanism

39 Future Directions
Short term goals
–Provide a higher degree of automation for the theorem proving technique.
–Reduce the intrusiveness of the runtime pointer taintedness detection technique
  Combine with the theorem proving technique. The processor only checks function preconditions.
Long term goals
–Extract programming styles susceptible to security attacks. Can compilers detect bad programming styles?
–Identify a broader range of non-traditional security threats.
–Study historical data about how security vulnerabilities were discovered, reported and patched.
–Decompose the behaviors of viruses, worms and rootkits into a number of basic building blocks.

40 Summary of My Research Methodology
Analysis-centric approach
–A significant amount of effort in my dissertation is on analysis.
–Starting from reality (usually a mess) to define problems!
I am a data analysis person
–Excited to analyze real data and incidents
–Tedious? Sometimes, but it is a step toward a lot of fun.
–Rewarding? Definitely. Especially important for systems research.
–Goal: strongly motivate research topics that solve real problems.
