Performing Security Auditing In Hardware


1 Performing Security Auditing In Hardware
Fuzzing Processor: Performing Security Auditing In Hardware
Tony Fynn, Dustin Locke
TONY

2 Overview
What is fuzzing?
Project goals
Architecture details
Optimizations
Performance
Conclusion
TONY

3 What is Fuzzing?
Sending semi-random data to an application to try to make it misbehave
Used to detect vulnerabilities
TONY: Send data that looks good but can break the application: properly formatted, but with nasty values. Fuzzing is used for in-the-field vulnerability assessment and can detect various vulnerabilities (buffer overflows, format strings, integer overflows, etc.). Generally the fuzzer is an intermediary:
* Starts with some good data source
* Intercepts and modifies it
* Sends it on its way to the application
* Waits to see what happens (hopefully the application crashes)
In general, fuzzing is very effective and accounts for a large number of found vulnerabilities.

4 Types of Fuzzing
[Figure: TCP packet layout with fields Source port, Destination port, Sequence number, Acknowledgment number, Hdr length, Reserved/flags, Window size, Checksum, Urgent pointer, Options, Data]
Dumb fuzzing: naively fuzzes all data
Intelligent fuzzing: selectively fuzzes certain fields
TONY: There are two types of fuzzing: dumb (or naïve) fuzzing and intelligent fuzzing.
* Dumb fuzzing just mangles all available data (that is, it is naïve about what data it corrupts)
* Intelligent fuzzing uses knowledge of the data structure to fuzz only “interesting” fields and leaves other necessary fields alone (such as checksums, routing addresses, and ports)
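The difference between the two approaches can be sketched as a per-byte "fuzzable" mask over the fixed 20-byte TCP header. This is only an illustration: the `TCP_FIELDS` table, the `field_mask` helper, and the choice of which fields count as "interesting" are assumptions made for the example, not something taken from the slides.

```python
# Byte layout of the fixed 20-byte TCP header (options and data omitted).
# Each entry maps a field name to (byte offset, size in bytes).
TCP_FIELDS = {
    "source_port": (0, 2),
    "destination_port": (2, 2),
    "sequence_number": (4, 4),
    "acknowledgment_number": (8, 4),
    "hdr_length_flags": (12, 2),
    "window_size": (14, 2),
    "checksum": (16, 2),
    "urgent_pointer": (18, 2),
}
HEADER_BYTES = 20


def field_mask(fields, layout=TCP_FIELDS, length=HEADER_BYTES):
    """Return one boolean per header byte; True means the byte may be fuzzed."""
    mask = [False] * length
    for name in fields:
        offset, size = layout[name]
        for i in range(offset, offset + size):
            mask[i] = True
    return mask


# Dumb fuzzing: every byte is fair game, including ports and checksum.
dumb_mask = [True] * HEADER_BYTES

# Intelligent fuzzing: only selected fields are marked fuzzable
# (the selection below is a hypothetical example).
smart_mask = field_mask(["sequence_number", "window_size", "urgent_pointer"])
```

With the intelligent mask, the ports and checksum stay intact, so the packet still reaches the application and is not dropped before the interesting values are parsed.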

5 Goals
Ability to fuzz multiple types of data (robust)
Intelligent fuzzing: using structural knowledge to our advantage
High-speed
The goal would be to have a network protocol fuzzer that accepts packets on one side, mangles them, and sends them on their way through the other side
For our purposes, we perform the fuzzing operation on data from input files
[Figure: data bits (1011 0110) passing through the Fuzzer]
TONY: There are fuzzers out there, but they are almost exclusively software applications (SPIKE, Protos, Smudge, Peach, etc.). A company called Security Innovations in Florida makes a “fuzzing appliance” called Hydra, but it is just a Linux box with 2 network cards (fuzzing is still done in software). So, we want to implement a fuzzer in hardware. What do we want to do with this project?
* We don’t want to be limited in the type of data we can fuzz (e.g., we don’t want to build simply a TCP fuzzer)
* We want to build an “intelligent” fuzzer
* For live networked applications, we want it to be able to keep up with reasonably high-bandwidth line speeds
The canonical example would be a fuzzer that sits between two networked hosts and mangles data as it passes through. Our goal is not to build such a network fuzzer, but to create the hardware that would make such a fuzzer possible. Thus, our testing was done with input data files, not live network data.

6 Architecture
Register File – 256-bit registers, 32-bit mask
New Instructions – fzlw, fzsw, fuzz, mskh, mskl
Fuzzing Unit
256-bit SRAM
[Figure: datapath block diagram showing PC, IMEM, general registers, ALU, data memory, and MUXes, alongside the fuzzer SRAM (addr, wr_en, 256-bit data), the 256-bit fuzzing registers with 32-bit masks, and the fuzzing unit]

7 Fuzzing Unit
Takes as input a data word and a mask specifying which bytes are “fuzzable” in the data word
Generates a random number and XORs fuzzable data bytes with corresponding random number bytes
DUSTIN: The operation of the fuzzing unit is fairly simple:
* Generate a random number internally
* Get the mask over the bus and expand it to its 256-bit representation (just duplicate each bit 8 times)
* AND the random number with the mask expansion to get a temporary value
* Get the actual data value over the input bus, and XOR it with the temporary value to get the fuzzed result
* Send the result out over the output bus
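The steps of the fuzzing unit can be modeled in software as follows. This is a minimal behavioral sketch, not the hardware design itself: the function names are hypothetical, and it assumes that mask bit i controls byte i of the data word counting from the least significant byte, which the slides do not specify.

```python
WORD_BYTES = 32  # a 256-bit data word is 32 bytes


def expand_mask(mask32):
    """Expand a 32-bit per-byte mask to 256 bits by duplicating each bit 8 times."""
    expanded = 0
    for i in range(WORD_BYTES):
        if (mask32 >> i) & 1:
            expanded |= 0xFF << (8 * i)
    return expanded


def fuzz_word(data, mask32, rand256):
    """Return data with its fuzzable bytes XORed against the random word."""
    temp = rand256 & expand_mask(mask32)  # keep random bits only where fuzzable
    return data ^ temp                    # unmasked bytes pass through unchanged
```

For example, `fuzz_word(data, 0, r)` returns `data` unchanged for any random word `r`, because an all-zero mask marks no byte as fuzzable.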

8 Register File
256-bit word length
Parallel data/mask registers (256-bit data, 32-bit mask)
Read operation puts the data word as well as its corresponding mask on the data output lines
[Figure: register pairs Register 1/Mask 1 through Register 8/Mask 8]
DUSTIN: There are 8 parallel register pairs, each with 256-bit data and a 32-bit mask. Each bit in the mask corresponds to a byte of the data. When a fuzzing operation is initiated, both the data and its corresponding mask are sent out to the fuzzing unit. The data is loaded from the special SRAM. The mask is set manually by the programmer using the “mask high” and “mask low” instructions.
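A small software model of the register file may help make the data/mask pairing concrete. The class and method names are hypothetical, and the sketch assumes that `mskh` and `mskl` each carry a 16-bit immediate that fills the upper or lower half of the 32-bit mask; the slides name the instructions but do not give their encoding.

```python
class FuzzRegisterFile:
    """Behavioral model: 8 pairs of (256-bit data, 32-bit per-byte mask)."""

    NUM_REGS = 8

    def __init__(self):
        self.data = [0] * self.NUM_REGS
        self.mask = [0] * self.NUM_REGS

    def mskh(self, reg, imm16):
        """Set the upper 16 mask bits (assumed immediate width)."""
        self.mask[reg] = (self.mask[reg] & 0x0000FFFF) | ((imm16 & 0xFFFF) << 16)

    def mskl(self, reg, imm16):
        """Set the lower 16 mask bits (assumed immediate width)."""
        self.mask[reg] = (self.mask[reg] & 0xFFFF0000) | (imm16 & 0xFFFF)

    def load(self, reg, word256):
        """Load a 256-bit word, as a load from the fuzzer SRAM would."""
        self.data[reg] = word256 & ((1 << 256) - 1)

    def read(self, reg):
        """A read returns the data word together with its mask."""
        return self.data[reg], self.mask[reg]
```

Returning the pair from `read` mirrors the slide's point that a read puts both the data word and its mask on the output lines, so the fuzzing unit never needs a second access to fetch the mask.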

9 Optimizations
Mask in register file is per-byte, not per-bit
Each bit masks an entire byte in the data word
256-bit random number generated from 32 parallel 8-bit random numbers
Prevents an expensive 256-bit multiply
Drastically reduces gate delay of fuzzer
DUSTIN: One of our goals was to perform fuzzing at relatively high speeds. To ensure this happened, we introduced some optimizations.
First, our original design used a bit-to-bit data mask for the data to be fuzzed. Loading the mask registers would take many instructions without a compact representation. Logically, intelligent fuzzing is done based on “fields”, which are generally at byte granularity. So, we modified the mask registers to be bit-to-byte, and thus only 32 bits wide instead of 256. Now setting the mask takes two instructions (one for each 16-bit half, high and low).
Secondly, we need essentially a 256-bit random number to XOR our data with. The random number is generated using a multiply and an add, and doing a 256-bit multiply is prohibitively expensive. Instead, we have effectively 32 8-bit random number generators, which are combined to produce a single 256-bit random number. This increases hardware, but drastically reduces gate delay.
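The second optimization can be sketched as 32 independent 8-bit generators, each doing one multiply and one add, whose outputs are concatenated into a 256-bit word. A multiply-and-add generator is modeled here as a linear congruential generator; the constants `A` and `C` and the function names are illustrative assumptions, since the slides do not give the actual parameters or seeding scheme.

```python
LANES = 32      # one 8-bit generator per byte of the 256-bit word
A, C = 141, 3   # illustrative LCG constants, not taken from the slides


def step_lanes(states):
    """Advance every 8-bit lane by one multiply-and-add step (mod 256)."""
    return [(A * s + C) & 0xFF for s in states]


def combine(states):
    """Concatenate the lane outputs into one 256-bit word (lane 0 is the low byte)."""
    word = 0
    for i, s in enumerate(states):
        word |= s << (8 * i)
    return word


# Seed each lane differently so the bytes of the word are not identical.
states = list(range(LANES))
states = step_lanes(states)
rand256 = combine(states)
```

Each lane only needs an 8-bit multiplier and adder, and all lanes update in the same cycle, which is the trade described above: more hardware in exchange for a much shorter critical path than a single 256-bit multiply.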

10 Data Throughput
Fuzzing unit has maximum gate delay of 21 ns
Translates to maximum clock speed of about 48 MHz
Effectively fuzz 256 bits of data in 5 clock cycles (for large amounts of data and a full pipeline)
Resulting maximum throughput is ~2.5 Gbps for dedicated application
Able to keep up with line speed of OC-48 fiber line (~2.5 Gbps)
TONY: Synthesis of our individual units shows the fuzzer to be the bottleneck, at 21 ns. This means the maximum clock speed for our processor is about 48 MHz. It takes at most 5 instructions to fuzz a single 256-bit word of data (one load, two mask sets, a fuzz, and a store). This means our maximum throughput is about 2.5 gigabits per second, assuming a full pipeline and a dedicated application, which is enough to keep up with an OC-48 fiber line. Note that for block fuzzing (i.e., where the mask does not change), this will be faster.
[Figure: data bits (1011 0110) passing through the Fuzzer]
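The throughput figure follows directly from the numbers on the slide. The short calculation below reproduces it; the function name is hypothetical, and it assumes one instruction completes per clock cycle with a full pipeline, as the slide states.

```python
def fuzz_throughput_bps(gate_delay_ns, cycles_per_word, word_bits=256):
    """Return (clock in Hz, throughput in bits/s) for a given critical-path delay."""
    clock_hz = 1e9 / gate_delay_ns                 # 21 ns -> about 47.6 MHz
    words_per_second = clock_hz / cycles_per_word  # one word every N cycles
    return clock_hz, words_per_second * word_bits


clock_hz, per_word_mask = fuzz_throughput_bps(21, 5)

# Block fuzzing: if the mask does not change, the two mask-setting
# instructions drop out and a word takes 3 cycles instead of 5.
_, block_mode = fuzz_throughput_bps(21, 3)
```

With a 21 ns delay and 5 cycles per word this gives roughly 2.4 Gbps, which rounds to the ~2.5 Gbps quoted on the slide and is in the same range as the nominal OC-48 line rate of about 2.488 Gbps. The 3-cycle block-fuzzing case is an extrapolation from the note that an unchanged mask makes fuzzing faster.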

11 Conclusion/Summary
Able to fuzz multiple types of data? Yes
Able to perform intelligent fuzzing? Use of data mask allows selective fuzzing
High speed? Able to keep up with OC-48
It is entirely possible to perform intelligent, reconfigurable fuzzing in hardware at high speeds
TONY: Our goal was to show that fuzzing can be done in hardware efficiently, and that it can be done at high speeds. We were able to fuzz data at a high enough rate to keep up with an OC-48 fiber line. We also wanted the fuzzing to be as effective and robust as software applications. Intelligent fuzzing is enabled through mask registers, and control of how the data is fuzzed is given to the programmer.

12 Questions
TONY

