What would you do with a pointer and a size?
Why do we need a new detection framework?
Attacks have switched from server attacks to client attacks
Common attack vectors are easily obfuscated
  JavaScript
  Compression
File formats are made by insane people
  Looking at you, Flash and OLE guy…
Back-channel systems are increasingly difficult to detect
Inline systems must emulate the processing of 1000s of desktops
Detection of many backchannels is most successful with statistical evaluation of network traffic
Broadly speaking, IDS systems deal with packet-by-packet inspection with some level of reassembly
Broadly speaking, AV systems typically target indicators of known bad files or system states
A system is needed that can handle varied detection needs
A system is needed that is extensible, open and scalable
A system is needed that facilitates incident response, not just triggers it
So…
Near-Realtime Detection Framework or: “Anything is Possible”
Provide entry to the system for any arbitrary data type
Determine and manage detection based on a registered Deep Inspection Nugget
Provide alerting to any framework-capable system
Provide verbose, detailed logging on the findings of the Nugget Farm
Make intelligent use of all data discovered during the evaluation process
The heart of the NRT system
APIs to handle:
  Deep Inspection Nugget registration
  Data Handler registration
  Detection requests
  Alerting
  Full analysis logging
  Output to API-compliant systems
Database driven
Implements a database to provide a centralized set of data chunk information
Handles incoming queries for Data Handlers that have failed local cache hits
Handles detection requests from both Data Handlers and DINs
Handles incoming results from Deep Inspection Nuggets
Handles database updates based on DIN data
Writes out verbose logging based on DIN data
Provides alerting to Data Handlers
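To make the Dispatcher duties listed above a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the class name, the method names, the in-memory dicts standing in for the database, and the idea that a nugget is a plain callable. It is not the NRT API, just one way the registration, detection request, and logging flow could hang together.

```python
import hashlib
from collections import defaultdict


class Dispatcher:
    """Toy in-memory stand-in for the database-driven Dispatcher (hypothetical API)."""

    def __init__(self):
        self.nuggets = defaultdict(list)  # data type -> registered nugget callables
        self.results = {}                 # MD5 hex digest -> (is_bad, messages)
        self.log = []                     # verbose record of every fresh evaluation

    def register_nugget(self, data_type, nugget):
        """A Deep Inspection Nugget announces which data type it can inspect."""
        self.nuggets[data_type].append(nugget)

    def request_detection(self, data_type, data, metadata=None):
        """Route data to every nugget registered for its type and store the outcome."""
        md5 = hashlib.md5(data).hexdigest()
        if md5 in self.results:
            # Already evaluated: answer from the central store, no re-inspection.
            return self.results[md5]
        messages = []
        for nugget in self.nuggets[data_type]:
            finding = nugget(data, metadata or {})
            if finding:
                messages.append(finding)
        verdict = (bool(messages), messages)
        self.results[md5] = verdict
        self.log.append((md5, data_type, metadata, verdict))
        return verdict
```

A Data Handler (or a nugget that wants a subcomponent inspected) would call request_detection() and get back a verdict plus the messages it can turn into alerts.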
Capture data and metadata
Contact dispatcher for handling
  Has this data been evaluated before?
  Where should I send it?
Pass that data set to a Deep Inspection Nugget
Accept feedback from the Dispatcher for detection request
Asynchronous alerting
Local cache of detection outcome
Data (in this case a file) is captured
Metadata is captured (in this case URL and filename)
A local cache of MD5 sums and URLs of files previously collected
A library to handle managing the initial file evaluation, cache checks and communication with the Dispatcher
Must handle data transfer from Data Handlers
Must communicate with Dispatcher
  Register detection capability
  Request for additional processing of subcomponents
  Provide alerting feedback to Dispatcher
Registers with the Dispatcher
Processes data provided by the Data Handlers, as instructed by the Dispatcher
An implementation of the NRT goals on a Snort platform
Target: Malicious PDF files
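The Data Handler flow above (capture, check the local cache, otherwise ask the Dispatcher) can be sketched roughly like this. The class, the ask_dispatcher callback, and the return values are all made up for illustration; the only parts taken from the slides are the MD5-keyed cache and the URL and filename metadata.

```python
import hashlib


class DataHandler:
    """Hypothetical Data Handler: local MD5 cache in front of the Dispatcher."""

    def __init__(self, ask_dispatcher):
        # ask_dispatcher is an assumed callable(data, metadata) -> bool (True = bad).
        self.ask_dispatcher = ask_dispatcher
        self.verdicts = {}   # MD5 hex digest -> cached detection outcome
        self.seen_urls = {}  # MD5 hex digest -> set of URLs the data was seen at

    def handle(self, data, url, filename):
        """Return (verdict, source), where source is 'cache' or 'dispatcher'."""
        md5 = hashlib.md5(data).hexdigest()
        self.seen_urls.setdefault(md5, set()).add(url)
        if md5 in self.verdicts:
            # Local cache hit: no round trip to the Dispatcher needed.
            return self.verdicts[md5], "cache"
        verdict = self.ask_dispatcher(data, {"url": url, "filename": filename})
        self.verdicts[md5] = verdict
        return verdict, "dispatcher"
```

The second time the same file shows up, even from a different URL, the handler answers from its own cache and the Dispatcher never hears about it.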
Let’s pretend that the PDF nugget already has the data…
Why are we passing back files?
MD5 is stored for all inspected data: packets, files, and subcomponents, both bad and good
Primarily this is used to avoid reprocessing data we’ve already looked at
But after an update to any DIN, all known-good entries are “tainted”
After an update to detection, previously analyzed files may be found to be bad
We don’t rescan all files
But if we see an MD5 match to a previous file, we will alert retroactively
When a subcomponent alerts, it is stored for logging in its fully normalized state
If a file is bad, when the DIN completes detection it passes the file to the Dispatcher
Response teams have the entire file as well as each portion that alerted in an easily analyzed format
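The tainting and retroactive alerting described above could look something like the sketch below. The store layout and method names are assumptions; the behavior it tries to capture is from the slides: known-good entries stop being trusted after a DIN update, nothing is rescanned up front, and a later bad verdict on the same MD5 points back at everyone who saw that file earlier.

```python
class VerdictStore:
    """Hypothetical MD5 verdict store with tainting of known-good entries."""

    def __init__(self):
        self.entries = {}  # md5 -> {"bad": bool, "tainted": bool, "sources": [..]}

    def record(self, md5, bad, source):
        """Store a verdict. Returns earlier sources that need a retroactive alert."""
        entry = self.entries.get(md5)
        retroactive = []
        if entry is not None and bad and not entry["bad"]:
            # Previously judged good, now bad: everyone who saw it before is affected.
            retroactive = list(entry["sources"])
        if entry is None:
            entry = self.entries[md5] = {"bad": bad, "tainted": False, "sources": []}
        entry["bad"] = bad
        entry["tainted"] = False
        entry["sources"].append(source)
        return retroactive

    def taint_known_good(self):
        """Call after any DIN update: known-good verdicts can no longer be trusted."""
        for entry in self.entries.values():
            if not entry["bad"]:
                entry["tainted"] = True

    def cached_verdict(self, md5):
        """Return the cached verdict, or None if unknown or tainted (re-inspect)."""
        entry = self.entries.get(md5)
        if entry is None or entry["tainted"]:
            return None
        return entry["bad"]
```

Known-bad entries are left alone by taint_known_good(), so a cache hit on a bad file still short-circuits detection after an update.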
Data sent back to the Data Handler should also be as verbose as possible
In this case we place data into the payload and provide a custom message to Snort so we can use established methods of handling Snort alerts

04/16-16:38: [**] [300: :1] URL:/users/pusscat/jbig2.pdf Hostname:metasploit.com Alert Info:Probable exploit of CVE (JBIG2) detected in object 8, declared as /Length 33/Filter [/FlateDecode/ASCIIHexDecode/JBIG2Decode ] [**] {TCP} :0 -> :0
04/16-16:38: :0:0:0:0:0 -> 0:0:0:0:0:0 type:0x800 len:0x
:0 -> :0 TCP TTL:240 TOS:0x10 ID:0 IpLen:20 DgmLen:1280
***AP*** Seq: 0x0 Ack: 0x0 Win: 0x0 TcpLen:
Payload (decoded text of the hex dump):
URL:/users/pusscat/jbig2.pdf Hostname:metasploit.com Alert Info:Probable exploit of CVE (JBIG2) detected in object 8, declared as /Length 33/Filter [/FlateDecode/ASCIIHexDecode/JBIG2Decode ]
Seriously, what would you do with a pointer and a size?
Create file format and/or packet protocol templates which parse out elements and provide you a data structure
Provide a full, common, scripting language interface to create rules (Ruby? Python? Both?)
Only do the heavy work (templating) once per file format / protocol
JBIG, ASCII Hex Decoding & Inflation
04/21-11:17: [**] [300: :1] URL:/wrl/first.pdf Hostname:wrl Alert Info:Probable exploit of CVE (JBIG2) detected in object 8, declared as /Length 29/Filter [/FlateDecode/ASCIIHexDecode/JBIG2Decode ] [**] {TCP} :0 -> :0
04/21-11:17: :0:0:0:0:0 -> 0:0:0:0:0:0 type:0x800 len:0x
:0 -> :0 TCP TTL:240 TOS:0x10 ID:0 IpLen:20 DgmLen:1280
***AP*** Seq: 0x0 Ack: 0x0 Win: 0x0 TcpLen:
Payload (decoded text of the hex dump):
URL:/wrl/first.pdf Hostname:wrl Alert Info:Probable exploit of CVE (JBIG2) detected in object 8, declared as /Length 29/Filter [/FlateDecode/ASCIIHexDecode/JBIG2Decode ]
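The alert above names a filter chain of /FlateDecode then /ASCIIHexDecode in front of the JBIG2 data. As a rough sketch of the "inflation and ASCII hex decoding" step, the Python below peels off those two layers in order and stops at the first filter it has no decoder for. The function names and the "hand back what is left" convention are assumptions, not the PDF nugget's actual code; the decoding rules themselves (whitespace ignored, '>' ends the data, an odd trailing digit is padded with 0) follow the PDF specification's description of ASCIIHexDecode.

```python
import binascii
import zlib


def ascii_hex_decode(data):
    """Decode a PDF ASCIIHexDecode stream."""
    hex_digits = data.split(b">", 1)[0]        # '>' marks end of data
    hex_digits = b"".join(hex_digits.split())  # whitespace between digits is ignored
    if len(hex_digits) % 2:
        hex_digits += b"0"                     # odd final digit is treated as X0
    return binascii.unhexlify(hex_digits)


# Filters this sketch knows how to undo; anything else is left for a specialist.
DECODERS = {
    "FlateDecode": zlib.decompress,
    "ASCIIHexDecode": ascii_hex_decode,
}


def normalize_stream(data, filters):
    """Apply filters in declared order; return (data, filters still to apply)."""
    remaining = list(filters)
    while remaining and remaining[0] in DECODERS:
        data = DECODERS[remaining.pop(0)](data)
    return data, remaining
```

For the chain in the alert, this would return the raw JBIG2 bytes along with ["JBIG2Decode"], which is the point where a JBIG2-specific check would take over.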
What is that JavaScript up to?
[**] [300: :1] URL:/wrl/first.pdf Hostname:wrl Alert Info:The JavaScript variables in object 6, declared as /Length 5994/Filter [/FlateDecode/ASCIIHexDecode ], show a high degree of entropy [**]

You tell me, does this string of variable names look weird to you?
EvctenMNtrWDQVBKGrwGxrxKfMiZoYziRxAFEfjMdXRzjGNqVZYEAqogviSvzHp
GpCkihcVtXRWcHphvhAnPOXnrxmTXJEUIkcYzelWZUCuIyKArtJvcEQXzUjHEzu
SjGEJugOyFQnaSplNWwQsqOoV

[**] [300: :1] URL:/wrl/first.pdf Hostname:wrl Alert Info:Found in the Javascript block, while searching object 6: unescape [**]

Wait, did someone say unescape…
Sig up some common GetEIP techniques…
Heuristically hunt down shellcode decoder stubs
Decode and parse shellcode
Give back some REAL data.
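The slides do not say how the "high degree of entropy" in the variable names is measured. One plausible reading, and it is only an assumption, is Shannon entropy over the characters of each name, averaged and compared against a threshold. The threshold of 4.0 bits below is a made-up number for illustration, not a tuned value.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of the characters in text, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_obfuscated(names, threshold=4.0):
    """Flag a set of variable names whose mean entropy exceeds the threshold."""
    if not names:
        return False
    mean = sum(shannon_entropy(name) for name in names) / len(names)
    return mean > threshold
```

Short, human-chosen names like "counter" can never score high because entropy is capped by the log of the name length, while long machine-generated names like the ones in the alert use most of the alphabet roughly evenly and land well above that.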
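As a taste of what "sigging up" GetEIP techniques might mean, here is a small sketch that looks for two widely known x86 idioms: a call to the next instruction followed by a pop into a register (E8 00 00 00 00 then 58 through 5F), and fnstenv [esp-0xc] (D9 74 24 F4). The function name and the output format are assumptions, and plain byte matching like this will produce false positives on arbitrary binary data; it is a starting point, not the heuristics the slides have in mind.

```python
CALL_NEXT = b"\xe8\x00\x00\x00\x00"  # call $+5: pushes the address of the next instruction
FNSTENV = b"\xd9\x74\x24\xf4"        # fnstenv [esp-0xc]: saves the last FPU instruction pointer


def find_getpc(data):
    """Return sorted (offset, technique) pairs for GetEIP idioms found in data."""
    hits = []
    start = 0
    while (i := data.find(CALL_NEXT, start)) != -1:
        nxt = i + len(CALL_NEXT)
        # pop eax..edi are the single-byte opcodes 0x58..0x5F
        if nxt < len(data) and 0x58 <= data[nxt] <= 0x5F:
            hits.append((i, "call $+5 / pop reg"))
        start = i + 1
    start = 0
    while (i := data.find(FNSTENV, start)) != -1:
        hits.append((i, "fnstenv [esp-0xc]"))
        start = i + 1
    return sorted(hits)
```

The offsets it returns are where a decoder-stub hunt could begin, since the GetEIP step usually sits right in front of the decoding loop.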
DNS is our friend
05/11-16:09: [**] [300:12:1] Fast Flux DNS [**] {TCP} :0 -> :0
05/11-16:09: :0:0:0:0:0 -> 0:0:0:0:0:0 type:0x0 len:0x
:0 -> :0 TCP TTL:240 TOS:0x10 ID:0 IpLen:20 DgmLen:1280
***AP*** Seq: 0x0 Ack: 0x0 Win: 0x0 TcpLen:
Payload (decoded text of the hex dump):
DNS PARSING MODULE: Fast-flux DNS reply (TTL = ) detected from 4.2.2.1

Example here is based off of record TTL
Could have a cache of queries / responses to check for rapidly changing responses
Whatever technique you want
05/11-16:09: [**] [300:11:1] Query to unauthorized DNS server [**] {TCP} :0 -> :0
05/11-16:09: :0:0:0:0:0 -> 0:0:0:0:0:0 type:0x0 len:0x
:0 -> :0 TCP TTL:240 TOS:0x10 ID:0 IpLen:20 DgmLen:1280
***AP*** Seq: 0x0 Ack: 0x0 Win: 0x0 TcpLen:
Payload (decoded text of the hex dump):
DNS PARSING MODULE: Rogue server contacted

Some malware resets victim hosts to use custom DNS servers to get around blacklists
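Both ideas on this slide, the TTL check and the cache of responses, fit in a few lines. The sketch below is an assumption about how they might be combined: the class name, the 300 second TTL cutoff, and the limit of 5 distinct addresses per name are all invented for illustration. A low TTL on its own also matches plenty of legitimate load-balanced services, so treat it as a hint rather than a verdict.

```python
from collections import defaultdict


class FastFluxTracker:
    """Hypothetical fast-flux heuristic: low record TTL plus churn in answers."""

    def __init__(self, ttl_threshold=300, ip_threshold=5):
        self.ttl_threshold = ttl_threshold  # seconds; assumed cutoff
        self.ip_threshold = ip_threshold    # distinct addresses tolerated per name
        self.seen = defaultdict(set)        # query name -> every address answered so far

    def observe(self, name, ttl, addresses):
        """Record one DNS reply and return the list of reasons it looks suspicious."""
        reasons = []
        if ttl < self.ttl_threshold:
            reasons.append("low TTL")
        self.seen[name].update(addresses)
        if len(self.seen[name]) > self.ip_threshold:
            reasons.append("rapidly changing responses")
        return reasons
```

Feeding every parsed DNS reply through observe() gives the nugget something to put in the alert payload beyond "this looked odd".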
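The unauthorized DNS server check is about as simple as detection gets: compare the destination of a DNS query against the resolvers hosts are supposed to use. In this sketch the function name is hypothetical and the allowlist holds a documentation-range address as a placeholder; a real deployment would load its own resolver list.

```python
import ipaddress

# Placeholder allowlist; 192.0.2.53 is a documentation address, not a real resolver.
AUTHORIZED_DNS = {ipaddress.ip_address("192.0.2.53")}


def is_unauthorized_dns_query(dst_ip, dst_port, authorized=AUTHORIZED_DNS):
    """True if a packet is a DNS query (port 53) to a server outside the allowlist."""
    return dst_port == 53 and ipaddress.ip_address(dst_ip) not in authorized
```

Parsing the address with ipaddress rather than comparing strings means different spellings of the same address still match the allowlist.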
What is that unescape up to….
[**] [300: :1] URL:/wrl/first.pdf Hostname:wrl Alert Info:Reverse TCP connectback shellcode detected. Connecting to on port 4444 [**]

Looking at the following:
10 d f6 d3 e c 7a f be a c 24 b0 b4 b1 3d 43 b d 4e 9b 7e d 12 f7 eb 4f 0d 7b 4a d5 1d 0b ff c6 c0 e3 03 f5 b3 b fd ba c b8 7b 30 d c 2a. bf a5 af 98 1d 1f e a 3a 5f 1f f0 87 c2 71 f1 e5 a0 77 f5 fe 94 fc d8 23 a d0 81 8e 37 a0 70 2f bc 79 0a a1 c c0 57 b9 37 a0 9f ef a2 71 a3 b8 a0 77 2c a fe 1f b5 87 c8 65 f5 ef 9e 1f f d1 a6 0a 37 a0 66 bc a2 75 a3 bc 9f 1d f a 0a 3a c9 b6 dc 29 4d b 75 f5

Gave us the shellcode type as well as the IP and port combination the connect back goes to.
Wouldn’t it be great if something knew to start listening?
Take that IP address and port, and auto-tcpdump when you get an alert
Watch everything the attacker does over that back channel on the fly
We have hosted on a package that contains:
  Snort Preprocessor for snagging .exe, .dll and .pdf files from live traffic
  A commented library that will allow you to thread calls to a detection function
  A “Dumb Nugget” to simply write these files to disk
  A “Clam Nugget” to pass these files to ClamAV
  Local cache system to reduce detection overhead
  Alerting system that fires Snort alerts with arbitrary data
Disclaimer
  For serious, this code was put together to pitch the idea to management. It is… well, it is what it is
  This project is a research project in the VRT; no timeline for release, either as open source or as a Sourcefire product, has been determined
  We’ll update it as we integrate the full dispatcher->data handler->deep inspection nugget code
  My DNS parsing code will go up in the next couple of weeks
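Here is a rough sketch of the auto-tcpdump idea. The function names, the interface "eth0", the output filename, and the example address are all assumptions; the tcpdump options used (-n, -i, -w) and the "host ... and port ..." filter are standard. The address and port are validated before they get anywhere near a command line, since they come out of attacker-controlled shellcode.

```python
import ipaddress
import subprocess


def build_tcpdump_command(ip, port, interface="eth0", outfile="backchannel.pcap"):
    """Build the tcpdump argv for capturing one connect-back channel."""
    ipaddress.ip_address(ip)  # raises ValueError on anything that is not an IP address
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError("port out of range: %d" % port)
    return ["tcpdump", "-n", "-i", interface, "-w", outfile,
            "host", ip, "and", "port", str(port)]


def start_capture(ip, port, **kwargs):
    """Launch tcpdump in the background; needs capture privileges on the sensor."""
    return subprocess.Popen(build_tcpdump_command(ip, port, **kwargs))
```

Called from the alert handler with the IP and port the shellcode parser extracted, for example start_capture("192.0.2.10", 4444), this would start recording the back channel as soon as the alert fires. Passing the arguments as a list rather than a shell string avoids any quoting games.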
Blog: Place we store bad ideas: