Adventures in Large Scale HTTP Header Abuse Zachary Wolff
About Me
SIEM Deployments & Research Engineer for LogRhythm (Labs) | 2+ years
Threat Research Analyst for Webroot Software | 2+ years
Let's Talk About HTTP Headers: Browser ↔ Web Server
HTTP Header Basics
Standard fields for request & response are defined in the RFC (RTFRFC)
GET / HTTP/1.0 is a legitimate request but may not return the expected results
The RFC sets no limits on header size, field name, value, or number of headers
Most web servers now impose their own limits:
IIS: v4 – 2 MB, v5 – 128 KB–16 KB*, v6 – 16 KB*, v7 – 16 KB*
Apache v2.3 – 8 KB*
*Per header field
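To make those limits concrete, here is a minimal sketch (not from the talk) of a raw HTTP/1.0 request and a check against a per-field cap like Apache's default of roughly 8 KB; the helper name and the exact limit constant are mine:

```python
# A minimal HTTP/1.0 request: the RFC places no limit on header count
# or length, so any cap is server-specific.
request = (
    "GET / HTTP/1.0\r\n"
    "Host: example.com\r\n"
    "User-Agent: Mozilla/5.0\r\n"
    "\r\n"
)

APACHE_FIELD_LIMIT = 8190  # Apache's default LimitRequestFieldSize, in bytes


def field_within_limit(name, value, limit=APACHE_FIELD_LIMIT):
    """Check whether a single 'Name: value' header line fits a server's cap."""
    return len(f"{name}: {value}") <= limit


print(field_within_limit("User-Agent", "Mozilla/5.0"))  # True
print(field_within_limit("User-Agent", "A" * 9000))     # False
```

A request that exceeds such a cap is typically rejected with a 4xx before the application ever sees it; anything under the cap flows straight through to whatever parses the header.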
Existing/Past HTTP Header Attacks
The Premise I want to break some logging applications!
For example, to begin
Round 1: Begin
Original premise: a GET request returns 302, 200 (a valid response); then send a second GET with a malicious User-Agent string* to see if we can get a 500 response
1. Crawler to collect URLs
2. Python script to send the attack/test UA string
3. Store results in a SQLite3 DB
4. Profit!
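The actual crawler and request code isn't shown on the slides; the sketch below reconstructs only the triage logic of those steps under my own assumptions (the backtick User-Agent string, the status-code classification, and the table layout are all illustrative):

```python
import sqlite3

# Hypothetical Round 1 attack string: a backtick appended to a normal UA.
ATTACK_UA = "Mozilla/5.0 (compatible)`"


def looks_interesting(clean_status, attack_status):
    """A URL is interesting when the clean GET succeeds (200/301/302)
    but the malicious User-Agent flips the server into a 5xx error."""
    return clean_status in (200, 301, 302) and attack_status >= 500


# Step 3: store results in SQLite3.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE results (url TEXT, clean INT, attack INT)")
db.execute("INSERT INTO results VALUES (?, ?, ?)",
           ("http://example.com/", 200, 500))
hits = [row[0] for row in
        db.execute("SELECT url FROM results WHERE attack >= 500")]
```

Filtering on "clean request OK, malicious request 5xx" is what separates servers that mishandle header input from servers that were simply down or broken to begin with.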
Round 1: Results
Data set: 400K URLs
Lots of 500s! Lots of smaller, low-traffic sites; some bigger, high-traffic sites
Various different errors…
Round 1: Results
Custom 500s…
Regular expression parsing errors….
Le yawn… non-verbose IIS errors…
Not as boring: generic Apache 500s and the x.x.gov sites…?
Round 1: Conclusion
What did we find?
Some SQL-injectable 500s
Possible application-level DoS
Lots of websites are not expecting malicious header requests…
Further exploration is warranted
The Question
How extensive is the problem of improper HTTP header handling?
Round 2: Begin
1. Need a more effective way to identify vulnerabilities
2. Let's attack/audit more than just the User-Agent header
3. Expand beyond the backtick: additional attack strings
4. Larger sample set: 1.6 million URLs
5. Must be able to store and access a very large set of result data efficiently (Shodan is amazing)
Round 2: Vulnerability Identification
500s are OK, but much too broad. What is a good indication of a possible SQLi vulnerability?
Run a regular expression against the HTML response data to match on "you have an error in your sql syntax"
Round 2: Vulnerability Identification
Improved error detection: basic SQLi & beyond
*Thanks for contributing to the regex list
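The slides don't reproduce the full contributed regex list, so the sketch below uses the one signature quoted in the talk plus two common variants I've added for illustration; the function name is mine:

```python
import re

# Error signatures to match against response HTML. Only the first
# comes from the talk; the other two are assumed, common variants.
SQL_ERROR_PATTERNS = [
    re.compile(r"you have an error in your sql syntax", re.IGNORECASE),
    re.compile(r"warning:\s*mysql_fetch", re.IGNORECASE),
    re.compile(r"unclosed quotation mark", re.IGNORECASE),
]


def match_sql_error(html):
    """Return the index of the first matching error regex, or None."""
    for i, pattern in enumerate(SQL_ERROR_PATTERNS):
        if pattern.search(html):
            return i
    return None
```

Returning the pattern index rather than a boolean is what makes the later "breakdown by regex #" findings slides possible: each hit can be bucketed by which signature fired.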
Beyond Regex-Based Error Detection
Byte Anomaly Detection added (--bad): compare the content length of the response data from the original/clean GET to the data from the malicious GET.*
*Alert margin set to 150 bytes above and 150 bytes below the clean request; log results (including HTML response data) to file
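The comparison itself is simple; a minimal sketch of the --bad check as described on the slide (the function name is mine, the 150-byte margin is from the talk):

```python
MARGIN = 150  # bytes above and below the clean response length


def byte_anomaly(clean_len, attack_len, margin=MARGIN):
    """Flag a response when the malicious GET's content length drifts
    more than `margin` bytes from the clean GET's, in either direction."""
    return abs(attack_len - clean_len) > margin
```

This catches error pages and behavior changes that no error-string regex would ever match, at the cost of false positives from naturally dynamic pages, which is why the flagged HTML is logged for manual review.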
Round 2: Additional Header Fields
Let's test: Host, From*, X-Forwarded-For, Referer, User-Agent, a non-existent header
Smart mode (-s): looks at all header fields returned by the server and tests those (minus a whitelist of rarely dynamic headers)
Cookies!
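The slide doesn't enumerate the smart-mode whitelist, so the contents below are my guess at "rarely dynamic" fields; the selection logic itself is as described:

```python
# Headers a smart mode (-s) might skip because servers almost never
# feed them into application logic (whitelist contents are assumed).
STATIC_HEADER_WHITELIST = {
    "date", "server", "content-type", "content-length", "connection",
}


def headers_to_test(response_headers):
    """Smart mode: test every header the server returned,
    minus the whitelist of rarely dynamic fields."""
    return [h for h in response_headers
            if h.lower() not in STATIC_HEADER_WHITELIST]
```

Driving the test set from the server's own response headers keeps the probe list small while still covering whatever custom headers a given site actually emits.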
Cookie Support
Cookie support added. The server sends us this: pyLobster responds with this: And the server says?
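The slide's request/response screenshots aren't reproduced here, so this is only a plausible sketch of the exchange: take the server's Set-Cookie, keep the name=value pair, and echo it back with a test payload appended to the value. The helper name and default payload are hypothetical:

```python
def attack_cookie(set_cookie, payload="'"):
    """Build a Cookie header value from the server's Set-Cookie,
    appending a test payload to the cookie's value."""
    pair = set_cookie.split(";", 1)[0]   # drop path/expires attributes
    name, _, value = pair.partition("=")
    return f"{name.strip()}={value}{payload}"


print(attack_cookie("PHPSESSID=abc123; path=/"))  # PHPSESSID=abc123'
```

Cookies are a natural target because many applications write them to a session store or database on every request, often with less scrutiny than query parameters get.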
Round 2: Updates
Updated testing values: ", ;, %00, %00'
Round 2: Design
"I improved the crawler to harvest 500K+ URLs a day. You should put my picture in your whitepaper." – Mark Vankempen, LogRhythm Labs
Output additions (beyond SQLite):
Elasticsearch indexing support added (fast, efficient, JSON to web interface)
Flat-file logging
More Improvements
Added Footprint mode (-g):
1. Generate a random(ish) hash or value
2. Save it to a key.txt file in the same directory as pylobster.py
3. Activate Footprint mode: ./pylobster.py -g
pyLobster will now send your unique string/hash with each request. Then, wait for it… days, weeks, months.
Google/Bing/DuckDuckGo your hash/string to discover unprotected log directories ;)
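The real -g implementation isn't shown in the deck; the steps above can be sketched as follows, with the hashing scheme and function name being my assumptions:

```python
import hashlib
import os


def make_footprint_key(path="key.txt"):
    """Step 1 & 2: generate a random(ish) hash and persist it to
    key.txt so later searches can be matched against the same value."""
    key = hashlib.sha1(os.urandom(16)).hexdigest()
    with open(path, "w") as fh:
        fh.write(key)
    return key


# The returned key would then be sent as the value of each tested header.
key = make_footprint_key()
```

The point of persisting the key is repeatability: months later, a search-engine query for that exact string will only ever surface places where your probe traffic was logged and exposed.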
pyLobster's Maiden Voyage
Ready? Begin! pyLobster is currently a single-threaded tool, so I divided my 1.6 million URLs into 78 unique lists and spawned 78 instances:
#!/bin/bash
nohup python pyLobster.py -f a --bad -s -l -g &
nohup python pyLobster.py -f b --bad -s -l -g &
nohup python pyLobster.py -f c --bad -s -l -g &
nohup python pyLobster.py -f d --bad -s -l -g &
And so on…
pyLobster's Maiden Voyage: Results
Sending a null byte in your HTTP headers will catch a fair bit of IDS attention ;)
Grep the response HTML in the regex error match directory to find patterns and specific component/module/application/CMS vulnerabilities. (Highest-value finding: one vulnerable component can lead to many others; shared DBs as well)
Various vulnerable components identified
Findings: Breakdown by Regex #
*Out of 1.6 million unique URLs, 14,500 error regexes matched!
*0, 1 & 2 are MySQL errors; 18 & 19 are PHP
Findings: Error Breakdown by Test String
Of the 14,500 error regex matches
Findings: Error breakdown by HTTP Header *Cookies: 1584
Findings: Error #0, Breakdown by Header Field
Error #0: "you have an error in your SQL syntax"
Findings: Footprint Mode Footprint Mode 12/13/2012 02/25/2013
Footprint Mode 3/27/2013
Findings: (--bad) Byte Anomaly Detection Results
Work to be done…
Grep over the directory for [wordpress|joomla|error|pass.*=|cms|.*?|]
Sort response files by size to group like errors
Sort by status code response & file size
Defending Against HTTP Header Attacks
Raise developer awareness: any dynamically handled header value must be treated as user input and processed accordingly
Audit your site's HTTP header processing (pyLobster is on GitHub; SQLmap now supports custom header testing too. Bingo!)
Proactively review/monitor your web logs
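The "treat headers as user input" advice reduces, in practice, to never interpolating header values into SQL. A minimal sketch (table and function names are mine, not from the talk) using parameter binding, which neutralizes the very backtick and quote payloads tested above:

```python
import sqlite3


def log_request(db, user_agent, referer):
    # Header values are attacker-controlled input: bind them as
    # parameters instead of formatting them into the SQL string.
    db.execute("INSERT INTO access_log (ua, referer) VALUES (?, ?)",
               (user_agent, referer))


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE access_log (ua TEXT, referer TEXT)")
log_request(db, "Mozilla/5.0`", "http://evil.example/'--")
stored = db.execute("SELECT ua FROM access_log").fetchone()[0]
```

With binding, the payload is stored verbatim as data; the same values built into the query via string formatting are exactly what produced the "you have an error in your SQL syntax" hits in the findings.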
This: Creates this Log trail:
The End
Thank you!