ScriptGard: Automatic Context-Sensitive Sanitization for Large-Scale Legacy Web Applications
Prateek Saxena (UC Berkeley), David Molnar (Microsoft Research), Ben Livshits (Microsoft Research)
Large-Scale Legacy Applications
Step-up in Scale
– Half a Million LOC
– Shared Development by teams of 100+
What’s The Difference?
– Shifting Platforms isn’t practical
– Long Program Paths, Many Sanitizers Applied
How to Secure Legacy Apps?
XSS in Large-Scale Applications
Small-Scale Apps: Buggy Sanitizer, Missing Sanitization
– [Pixy’06, PhpTaint’06, Cqual’04, Merlin’09, Securifly’05, PhpAspis’11, Saner’08, Bek’11]
Large-Scale Applications: New Sanitization Errors
    String Img.RenderControl() { Write(userimg); }
    String Img.RenderControl() { Write(Sanitize(userimg)); }
– [CCS’11] ScriptGard
Contributions
Does Sanitization Defense Fail In Practice?
– 7 Commercial Applications, 400 KLOC
2 New Classes of Errors in Sanitizer Use
– How Often & Why
ScriptGard: Automated Sanitizer Use Analysis
– Legacy .NET, Minimal Specs, Concrete Test Cases
– Can Auto-Correct Sanitization During Deployment
Error #1: Context-Mismatched Sanitization (CMS)
Which Sanitizer To Apply Where?
– Example output: Diapers; var name='Stewie';
– Contexts: HTML Tag Context, JS String Context
– Sanitizers: HtmlEncode, JSStringEncode
– Attack input: \r\n; alert(document.cookie);
1,207 (4.7%) are CMS errors!
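To make the mismatch concrete, here is a minimal Python sketch. It assumes `html.escape` as a stand-in for HtmlEncode, and a hypothetical `js_string_encode` that rewrites the characters `'`, `"`, `&`, `\n`, `\r`, `(`, `)` and `\` as `\u00XX`; neither is the .NET implementation. It shows that each sanitizer leaves the other context's dangerous characters untouched.

```python
import html

# Characters the hypothetical JS-string sanitizer rewrites as \u00XX (assumed set).
JS_SPECIAL = "'\"&\n\r()\\"


def html_encode(s: str) -> str:
    """Stand-in for HtmlEncode: entity-encodes &, <, >, double and single quotes."""
    return html.escape(s, quote=True)


def js_string_encode(s: str) -> str:
    """Stand-in for JSStringEncode: rewrites JS_SPECIAL characters as \\u00XX."""
    return "".join(f"\\u{ord(c):04x}" if c in JS_SPECIAL else c for c in s)


attack = "\r\n; alert(document.cookie);"

# Mismatch 1: HtmlEncode used in a JS string context changes nothing here,
# so the line terminators and parentheses reach the JavaScript parser as-is.
html_in_js = html_encode(attack)

# The matching sanitizer for a JS string context encodes them.
js_in_js = js_string_encode(attack)

# Mismatch 2: JSStringEncode used in an HTML context leaves tags intact.
js_in_html = js_string_encode("<script>")
```

In this sketch `html_in_js` is identical to the attack input, while `js_in_js` contains no raw line break; conversely `js_in_html` is still `<script>`.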
Why Does Context-Mismatch Happen?
[Diagram: a sanitizer (San) on a path to the Output Sink]
– Context is a Global Path-Sensitive Property
– But developers select Sanitizers Locally
Error #2: Inconsistent Multiple Sanitization (IMS)
[Diagram: Attack Input passes through San 1 and San 2 before the Output Sink]
– Safe?
– Does the Order Matter?
Inconsistent Multiple Sanitization (IMS): Does it Really Happen?
[Diagram: Attack Input passes through HtmlEncode and JSStringEncode]
285 (8%) of multiple sanitizations are errors!
Why Does IMS Happen?
Server-side output at the Output Sink:
    document.write('<a href=" userlink ">');
Why Does IMS Happen: Nested Contexts
    document.write('<a href=" userlink ">');
– userlink sits in a URL Attribute Context nested inside a JS String Context
– JS Parser: JS Unicode Decode (\u0022 → ")
– HTML Parser: Html-Entity Decode (&quot; → ")
Why Does IMS Happen: Nested Contexts
– Decoding order in the browser: JS Parser (JS Unicode Decode), then HTML Parser (Html-Entity Decode)
– Correct Sanitizer Order: \u0026quot; → &quot; → "
– Wrong Sanitizer Order: \u0022 → "
Nested Contexts Cause Developer Confusion!
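The effect of sanitizer order can be simulated in a few lines of Python. This is a sketch under stated assumptions: `html.escape` stands in for HtmlEncode, `js_string_encode` and `js_unicode_decode` are hypothetical helpers that only handle `\uXXXX` escapes, and `reaches_html_parser` models just the first decoding step, where the JS parser decodes the string literal before `document.write` hands it to the HTML parser.

```python
import html
import re

JS_SPECIAL = "'\"&\n\r()\\"  # assumed character set for the JS-string sanitizer


def html_encode(s: str) -> str:
    return html.escape(s, quote=True)


def js_string_encode(s: str) -> str:
    return "".join(f"\\u{ord(c):04x}" if c in JS_SPECIAL else c for c in s)


def js_unicode_decode(s: str) -> str:
    """Decode \\uXXXX escapes, as the JS parser does for a string literal."""
    return re.sub(r"\\u([0-9a-fA-F]{4})", lambda m: chr(int(m.group(1), 16)), s)


def reaches_html_parser(server_output: str) -> str:
    """What document.write passes on after the JS parser has decoded the literal."""
    return js_unicode_decode(server_output)


attack = '" onmouseover="alert(1)'

# Inner context (HTML attribute) first, outer context (JS string) last.
correct = js_string_encode(html_encode(attack))

# Reversed order: HtmlEncode finds nothing left to encode after JSStringEncode.
wrong = html_encode(js_string_encode(attack))

safe_for_html = reaches_html_parser(correct)   # still entity-encoded
unsafe_for_html = reaches_html_parser(wrong)   # raw double quote is back
```

With the correct order the HTML parser sees `&quot;`, which it decodes to a quote inside the attribute value; with the wrong order it sees a raw `"` that closes the `href` attribute.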
How Common Are Nested Contexts?
Nesting Depth: Up to 4
Take-Aways…
Small-Scale Apps: Buggy Sanitizer, Missing Sanitization
– [Pixy’06, PhpTaint’06, Cqual’04, Merlin’09, Securifly’05, PhpAspis’11, Saner’08, Bek’11]
Large-Scale Applications: Shared Paths lead to… CMS & IMS
– Developers apply correct sanitizers wrongly
How Do We Find Sanitization Errors In Legacy Applications At Scale?
ScriptGard Analysis
[Diagram labels: HTTP Requests, Inconsistently Sanitized Test Cases, Instrumented Server-side DLLs, Legacy .NET, Sanitizer Specification]
ScriptGard Analysis: Key Ideas
– Path-Sensitive Positive Taint-Tracking (Path 1, Path 2, Path 3, Path 4)
– Determine Contexts (Browser Model)
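Positive taint-tracking marks what is trusted and treats everything else as untrusted. The sketch below illustrates that idea with a hypothetical `TrackedStr` wrapper that keeps one trust flag per character; the class and its methods are assumptions made for illustration, not the tool's instrumentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackedStr:
    """A string with one trust flag per character (hypothetical helper)."""
    text: str
    trusted: tuple

    @staticmethod
    def constant(text: str) -> "TrackedStr":
        # Program constants are positively marked as trusted.
        return TrackedStr(text, (True,) * len(text))

    @staticmethod
    def external(text: str) -> "TrackedStr":
        # Anything not positively marked is untrusted by default.
        return TrackedStr(text, (False,) * len(text))

    def __add__(self, other: "TrackedStr") -> "TrackedStr":
        # Concatenation preserves the per-character flags.
        return TrackedStr(self.text + other.text, self.trusted + other.trusted)

    def untrusted_spans(self) -> list:
        """Return (start, end) index pairs of untrusted runs in the output."""
        spans, start = [], None
        for i, ok in enumerate(self.trusted):
            if not ok and start is None:
                start = i
            elif ok and start is not None:
                spans.append((start, i))
                start = None
        if start is not None:
            spans.append((start, len(self.trusted)))
        return spans


out = (TrackedStr.constant("<img src='")
       + TrackedStr.external("Sunset.gif")
       + TrackedStr.constant("'> "))
spans = out.untrusted_spans()
```

The untrusted span identifies which part of the output needs a context-appropriate sanitizer once the context at that position is known.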
ScriptGard Analysis: Key Ideas
– Path-Sensitive Positive Taint-Tracking (Path 1, Path 2, Path 3, Path 4)
– Trusted? (per path): +, -, +, -
– Sanitizer Sequence (per path): HtmlAttributeEncode, JSStringEncode; HtmlEncode, JSStringEncode; HtmlAttributeEncode; JSStringEncode, HtmlEncode
– Errors flagged: CMS, IMS
– Determine Contexts
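One way to read the comparison of sanitizer sequences against contexts is sketched below. The context names, the `REQUIRED` mapping, and the classification rule (same sanitizers in a different order counts as IMS, anything else that differs counts as CMS) are simplifying assumptions of this sketch, not the tool's actual algorithm.

```python
# Hypothetical mapping from a parsing context to the sanitizer it calls for.
REQUIRED = {
    "html_content": "HtmlEncode",
    "html_attribute": "HtmlAttributeEncode",
    "js_string": "JSStringEncode",
}


def expected_sequence(contexts_outer_to_inner: list) -> list:
    """Sanitizers in application order: innermost context first, outermost last."""
    return [REQUIRED[c] for c in reversed(contexts_outer_to_inner)]


def classify(contexts_outer_to_inner: list, applied: list) -> str:
    """Label a path as OK, IMS (wrong order) or CMS (wrong sanitizer)."""
    expected = expected_sequence(contexts_outer_to_inner)
    if applied == expected:
        return "OK"
    if sorted(applied) == sorted(expected):
        return "IMS"
    return "CMS"


# An HTML attribute written from inside a JavaScript string literal.
nested = ["js_string", "html_attribute"]

ok = classify(nested, ["HtmlAttributeEncode", "JSStringEncode"])
ims = classify(nested, ["JSStringEncode", "HtmlAttributeEncode"])
cms = classify(nested, ["HtmlEncode", "JSStringEncode"])
```

The innermost-first rule mirrors the browser's decoding order: the outermost parser decodes first, so its sanitizer has to be applied last.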
Precise Context Determination: Browser Parser Model
[Figure: contexts in the browser parser model]
How Can We Correct Sanitization Errors Automatically?
ScriptGard: Can We Auto-Patch Sanitization Errors?
The Bad News: Large slowdown
Observation: Less than 10% of paths are problematic
Can We Detect When A Problematic Path Is Executed? Yes!
– Preferential Path Profiling [POPL’06]
– Negligible Overhead
ScriptGard Auto-Correction
– Pre-Release Analysis: ScriptGard, Sanitization Cache, Sanitizer Patch
– Deployment: Preferential Path Profiler, Server Code With Light-weight Instrumentation, Sanitizer Patch
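The deployment side can be pictured as a lookup keyed by the executed path. The sketch below is an assumption-heavy illustration: the path identifiers, the `sanitization_cache` contents, and the `emit` helper are hypothetical, and the sketch simply chooses a sanitizer sequence at the sink rather than patching sanitizers already applied by legacy code.

```python
import html

JS_SPECIAL = "'\"&\n\r()\\"  # assumed character set for the JS-string sanitizer


def html_encode(s: str) -> str:
    return html.escape(s, quote=True)


def js_string_encode(s: str) -> str:
    return "".join(f"\\u{ord(c):04x}" if c in JS_SPECIAL else c for c in s)


SANITIZERS = {"HtmlEncode": html_encode, "JSStringEncode": js_string_encode}

# Hypothetical result of pre-release analysis: only problematic paths appear,
# each mapped to the corrected sanitizer sequence for that path.
sanitization_cache = {"path-3": ["HtmlEncode", "JSStringEncode"]}


def emit(path_id: str, value: str, default_sequence: list) -> str:
    """Apply the cached correction if this path is known to be problematic."""
    sequence = sanitization_cache.get(path_id, default_sequence)
    for name in sequence:
        value = SANITIZERS[name](value)
    return value


# path-3 was found to apply its sanitizers in the wrong order; the cache fixes it.
patched = emit("path-3", '"', ["JSStringEncode", "HtmlEncode"])

# Paths absent from the cache keep the developer's original sequence.
untouched = emit("path-1", '"', ["HtmlEncode"])
```

Because fewer than 10% of paths are problematic, most calls take the cache-miss branch and pay only for the lookup.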
Conclusions
2 New Patterns of Errors in Sanitizer Use
ScriptGard
– Effective Analysis Tool
– Auto-Correction with Negligible Overhead
You have been a wonderful audience
…you stayed…
Prateek Saxena
Sanitizer Correction is Challenging
[Diagram: a sanitizer (San), here HtmlEncode, on a path to the Output Sink]
– Can We Just Replace HtmlEncode with another Sanitizer?
– Contexts Vary By Path Executed
Context Determination: An Abstract Browser Model
– Sub-languages: HTML, URI, JavaScript, CSS, …
– Transitions between them include document.write and javascript:
– Example payload: alert()
Browser Contexts
    String Img.RenderControl() {
        Write("<img src='");
        Write(userimg);
        Write("'> ");
    }
– Output: <img src='Sunset.gif'>
– Parsing “Context” states: Start, Img Tag, Src Attribute, Attribute Value
– Parser expectations: Expect <, Expect URL, Expect '
In a Scripting Attack…
    String Img.RenderControl() {
        Write("<img src='");
        Write(userimg);
        Write("'> ");
    }
– Output: <img src='' onerror=alert("XSS"):…
– Parsing “Context” states: Start (Expect <), Img Tag, Src Attribute, Attribute Value (Expect URL)
– Malicious string closes enclosing parsing context
– Malicious string introduces new parsing context: javascript: alert("XSS"); (JS URL Context)
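The context break-out can be reproduced with Python's standard `html.parser`, used here as a rough stand-in for a browser's HTML parser (an assumption; real browsers are more lenient). `render_img` mirrors the `RenderControl` pseudocode, and the `sanitize` flag, `AttrCollector`, and `parsed_tags` are hypothetical additions for illustration.

```python
import html
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Records each start tag with its attributes as the parser sees them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


def render_img(userimg: str, sanitize: bool = False) -> str:
    """Mirror of RenderControl: writes userimg into a single-quoted src attribute."""
    if sanitize:
        userimg = html.escape(userimg, quote=True)  # entity-encodes quotes too
    return "<img src='" + userimg + "'> "


def parsed_tags(markup: str) -> list:
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags


attack = "x' onerror='alert(1)"

benign = parsed_tags(render_img("Sunset.gif"))
broken_out = parsed_tags(render_img(attack))
contained = parsed_tags(render_img(attack, sanitize=True))
```

Without sanitization the quote closes the attribute value and the parser reports a separate `onerror` attribute; with the quotes entity-encoded, the whole payload stays inside `src`. Entity-encoding alone does not address `javascript:` URLs in this context.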
Sanitizers & Contexts
Example output: Diapers; var name='Stewie';
– Quoted resource attribute: Html-entity encode quotes (&quot; for "), neuter javascript: URI
– CSS attribute: Prevent moz-bindings, behavior: URLs
– Html Content: Convert <, &, ", ' to Html-entities
– JS String Literal: Encode ',",&,\n,\r,(,),,\ to Unicode encoding \u00XX
Insight #1: Why does it happen…
– Nested Contexts
– Browser Model is Intricate
[Diagram: HTML Parser, JavaScript Parser, HTML Parser]
Challenges
How to Secure Against XSS?
Non-Solutions
– “Rewrite The Application…”
– “Use Favorite Static Auditing Tool…”
– “Write Interface Specifications…”
[Diagram labels: Code, Specifications]
Observation #2: The Browser Model Complexity
[Figure: contexts in the browser parser model]
Can we Expect Developers To Retain This Model Mentally?
Contexts & Sanitizers
Example output: Diapers; var name='Stewie';
– Quoted URI attribute: Html-entity encode quotes (&quot; for "), neuter javascript: URI
– CSS attribute: Prevent moz-bindings, behavior: URLs
– Html Content: Convert <, &, ", ' to Html-entities
– JS String Literal: Encode ',",&,\n,\r,(,),,\ to Unicode encoding \u00XX