Faster File Matching using GPGPUs
Deephan Mohan
Professor: Dr. John Cavazos
University of Delaware
Clarification about the title: the current implementation targets NVIDIA GPUs only; extension to other platforms is in progress. Partial file matching support is included.
SAAHPC 2010
Presentation Outline
Introduction
MD6 Algorithm
CUDA MD6
Experiments and Results
Conclusion
Introduction
File matching
- Indispensable in fields like forensics and information security
- Relies on the robustness of the hashing algorithms used
Motivation
- Advent of GPU computing
- Faster file matching
- Faster hashing algorithms
Faster file matching
Hashing algorithms: MD4, MD5, SHA-1, SHA-2 (SHA-256, SHA-512), Tiger, Whirlpool
- Used in integrity checking, checksum calculation, message authentication, etc.
Existing file matching programs
- SSDEEP
- HASHDEEP
- Numerous proprietary file matching programs
The MD6 Algorithm
Merkle Tree
Computation proceeds from the bottom up
- Each leaf represents a data chunk
- Each intermediate node represents a compression node
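Not on the slide, but a useful sanity check: each MD6 compression consumes four 16-word chaining values (4 × 16 = 64 data words), so the tree has 4-to-1 fan-in and its height grows logarithmically in the number of chunks:

```latex
% Levels of the MD6 hash tree over N leaf chunks (4-to-1 fan-in):
\[
  \text{levels} \;=\; \lceil \log_4 N \rceil,
  \qquad \text{e.g. } N = 8 \;\Rightarrow\; \lceil \log_4 8 \rceil = 2 .
\]
```

This matches the execution walkthrough later in the deck, where 8 buffers collapse to 2 and then to the root.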
The MD6 Algorithm
MD6 inputs:
- M: the message to be hashed (mandatory)
- d: desired message digest length, in bits (mandatory)
- K: key value (optional)
- r: number of rounds (optional)
MD6 compression:
- MD6 word size: 8 bytes (64 bits)
- MD6 buffer size: 64 words (512 bytes)
- Each buffer is pre-processed into an 89-word block, compressed by f: W^89 → W^16 into a 16-word chaining value, and the result post-processed: W^64 → W^89 → W^16
- The final hash is exactly d bits in length
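For context (from the MD6 specification, not stated on this slide), the default round count r is derived from the digest length d, which is why the round-count experiment later in the deck matters for both speed and security margin:

```latex
% Default number of rounds as a function of digest length d (MD6 spec):
\[
  r \;=\; 40 + \frac{d}{4},
  \qquad d = 256 \;\Rightarrow\; r = 104 .
\]
```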
CUDA MD6
CUDA MD6 Implementation
Step (i): The host buffers in the contents of the source file
Step (ii): Allocate adequate memory on the device; if the file is too large, it is split into chunks
Step (iii): Invoke the three kernels in sequence:
- md6_compress_block() – preprocessing module
- md6_compress() – compression module
- md6_rewrite() – MD6 hash aggregation module
Step (iv): Repeat step (iii) N+1 times (once per tree level) to generate the final hash
Step (v): Perform the hash comparison
Step (vi): Store the hash in the hashdb
Throughout, the data stays in GPU memory for the whole computation rather than being offloaded to the host between kernels, which improves performance; see the host-side sketch below.
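The slides give the kernel names and launch geometry but not the host code. Below is a minimal host-side sketch of steps (i)–(iv) under those constraints; the kernel signatures, the buffer layout, and the rounding of the buffer count are illustrative assumptions (the kernels themselves are sketched on later slides).

```cuda
#include <cuda_runtime.h>
#include <cstdint>
#include <cstddef>

// Assumed signatures; names and launch geometry come from the slides.
__global__ void md6_compress_block(const uint64_t *data, uint64_t *blocks, int level);
__global__ void md6_compress(const uint64_t *blocks, uint64_t *cv, int rounds);
__global__ void md6_rewrite(const uint64_t *cv, uint64_t *data);

void cuda_md6_hash(const uint64_t *host_data, size_t nbuffers,
                   uint64_t digest[16], int rounds)
{
    uint64_t *d_data, *d_blocks, *d_cv;
    cudaMalloc(&d_data,   nbuffers * 64 * sizeof(uint64_t)); // 512 B chunks
    cudaMalloc(&d_blocks, nbuffers * 89 * sizeof(uint64_t)); // preprocessed blocks
    cudaMalloc(&d_cv,     nbuffers * 16 * sizeof(uint64_t)); // chaining values
    cudaMemcpy(d_data, host_data, nbuffers * 64 * sizeof(uint64_t),
               cudaMemcpyHostToDevice);

    // Walk the Merkle tree bottom-up; intermediate results never leave
    // the device, which is where much of the speedup comes from.
    for (int level = 0; ; ++level) {
        md6_compress_block<<<nbuffers, 1>>>(d_data, d_blocks, level); // W^64 -> W^89
        md6_compress<<<nbuffers, 16>>>(d_blocks, d_cv, rounds);       // W^89 -> W^16
        if (nbuffers == 1) break;
        md6_rewrite<<<(nbuffers + 3) / 4, 1>>>(d_cv, d_data); // pack 4 children/chunk
        nbuffers = (nbuffers + 3) / 4;
    }
    cudaMemcpy(digest, d_cv, 16 * sizeof(uint64_t), cudaMemcpyDeviceToHost);
    cudaFree(d_data); cudaFree(d_blocks); cudaFree(d_cv);
}
```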
CUDA MD6 kernels
md6_compress_block()
- Data preprocessing module, f: W^64 → W^89
- <<<Grid, Threads>>> = <<<total number of buffers, 1>>>
md6_compress()
- Performs the MD6 compression
- <<<Grid, Threads>>> = <<<total number of buffers, 16>>>
md6_rewrite()
- Performs the MD6 hash aggregation, f: W^89 → W^16
- <<<Grid, Threads>>> = <<<total number of buffers / 4, 1>>>
The three kernels run sequentially.
Preprocessing kernel
Transforms each MD6 buffer into an 89-word compression input:
- First 15 words: the constant vector Q (fixed constants defined by the MD6 spec)
- Next 8 words: the key K
- U, V: unique control words
- Last 64 words: the data chunk
Block layout: [ Q (15) | K (8) | U | V | data chunk (64) ]
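A minimal sketch of this kernel, assuming the layout above and the single-thread-per-buffer launch (<<<buffers, 1>>>) from the slides. The Q table is the √6-derived constant vector from the MD6 reference code; the U/V encoding here is a simplification of the spec's node-ID and control words.

```cuda
#include <cstdint>

__constant__ uint64_t Q[15] = {   // fractional digits of sqrt(6), per the MD6 spec
    0x7311c2812425cfa0ULL, 0x6432286434aac8e7ULL, 0xb60450e9ef68b7c1ULL,
    0xe8fb23908d9f06f1ULL, 0xdd2e76cba691e5bfULL, 0x0cd0d63b2c30bc41ULL,
    0x1f8ccf6823058f8aULL, 0x54e5ed5b88e3775dULL, 0x4ad12aae0a6d6031ULL,
    0x3e7f16bb88222e0dULL, 0x8af8671d3fb50c2cULL, 0x995ad1178bd25c31ULL,
    0xc878c1dd04c4b633ULL, 0x3b72066c7a1552acULL, 0x0d6f3522631effcbULL,
};

__global__ void md6_compress_block(const uint64_t *data, uint64_t *blocks,
                                   int level)
{
    const int b = blockIdx.x;                      // one buffer per CUDA block
    uint64_t *B = blocks + (size_t)b * 89;
    for (int k = 0; k < 15; ++k) B[k] = Q[k];      // constant vector Q
    for (int k = 0; k < 8;  ++k) B[15 + k] = 0;    // key K (unkeyed hashing)
    B[23] = ((uint64_t)level << 32) | (uint64_t)b; // U: node ID (simplified)
    B[24] = 0;                                     // V: control word (simplified)
    for (int k = 0; k < 64; ++k)                   // 64-word data chunk
        B[25 + k] = data[(size_t)b * 64 + k];
}
```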
CUDA MD6 Compression Kernel
For each CUDA block do
  Set index to blockID
  For each round, set i to n + 16·round + threadID:  /* 16 steps per round */
    x = S ⊕ A[i−n] ⊕ A[i−t0]
    x = x ⊕ (A[i−t1] ∧ A[i−t2]) ⊕ (A[i−t3] ∧ A[i−t4])
    x = x ⊕ (x ≫ r[threadID])
    A[i] = x ⊕ (x ≪ ℓ[threadID])
  exit CUDA block
exit CUDA kernel call
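Filling in the pseudocode, here is a hedged CUDA realization of the kernel under the slide's <<<buffers, 16>>> geometry. The tap offsets (t0…t4), shift tables, and round-constant recurrence follow the public MD6 reference code; the 128-word shared-memory ring buffer is one concrete reading of the "sliding window thread access" mentioned on the next slide, not necessarily the authors' exact scheme. Sixteen threads can safely compute the 16 steps of one round in parallel because the smallest tap (t0 = 17) always reaches back into earlier rounds.

```cuda
#include <cstdint>

#define MD6_N 89        // feedback distance: words of state per step
#define MD6_C 16        // chaining value size in words
#define T0 17
#define T1 18
#define T2 21
#define T3 31
#define T4 67

// Per-step right/left shift amounts (MD6 reference code).
__constant__ int rsh[16] = {10,5,13,10,11,12,2,7,14,15,7,13,11,7,6,12};
__constant__ int lsh[16] = {11,24,9,16,15,9,27,15,6,2,29,8,15,5,31,9};

__global__ void md6_compress(const uint64_t *blocks, uint64_t *cv, int rounds)
{
    __shared__ uint64_t A[128];     // sliding window over the step state
    const int j = threadIdx.x;      // this thread's step within each round
    const uint64_t *B = blocks + (size_t)blockIdx.x * MD6_N;

    for (int k = j; k < MD6_N; k += 16) A[k] = B[k];  // load 89 input words
    __syncthreads();

    uint64_t S = 0x0123456789abcdefULL;               // round constant S0
    const uint64_t Smask = 0x7311c2812425cfa0ULL;

    for (int r = 0; r < rounds; ++r) {
        const int i = MD6_N + r * 16 + j;             // absolute step number
        uint64_t x = S ^ A[(i - MD6_N) & 127] ^ A[(i - T0) & 127];
        x ^= A[(i - T1) & 127] & A[(i - T2) & 127];   // quadratic terms
        x ^= A[(i - T3) & 127] & A[(i - T4) & 127];
        x ^= x >> rsh[j];
        A[i & 127] = x ^ (x << lsh[j]);  // slot overwritten is 128 steps old
        S = ((S << 1) | (S >> 63)) ^ (S & Smask);     // next round constant
        __syncthreads();
    }
    // The last 16 words computed form the chaining value.
    const int last = MD6_N + rounds * 16;
    cv[(size_t)blockIdx.x * MD6_C + j] = A[(last - MD6_C + j) & 127];
}
```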
CUDA MD6 Compression optimizations
- Coalesced memory reads and writes
- Use of constant and shared memory within the kernel
- Compression function loop unrolled, preserving the integrity of the compression rounds
- Sliding-window thread access
CUDA MD6 Execution (first tree level)
STEP 1: Read in the data
- Call md6_compress_block() with <<<total buffers, threads>>> = <<<8, 1>>>
STEP 2: Compress the data
- Call md6_compress() with <<<total buffers, threads>>> = <<<8, 16>>>
(One thread block per buffer in the grid.)
CUDA MD6 Execution
STEP 3: Write each hash into the appropriate parent node
- Call md6_rewrite() with <<<total buffers, threads>>> = <<<2, 1>>>
CUDA MD6 Execution (next tree level)
STEP 1: Read in the data
- Call md6_compress_block() with <<<total buffers, threads>>> = <<<2, 1>>>
STEP 2: Compress the data
- Call md6_compress() with <<<total buffers, threads>>> = <<<2, 16>>>
CUDA MD6 Execution (final step)
STEP 3: Write the hash into the appropriate node
- Call md6_rewrite() with <<<total buffers, threads>>> = <<<2, 1>>>
Write out the final hash; end of the CUDA kernel calls.
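Tying the walkthrough together, a hypothetical driver for the 8-buffer example above, reusing the cuda_md6_hash() sketch from the implementation slide (the function name and signature are assumptions from that sketch):

```cuda
#include <cstdint>
#include <cstddef>

extern void cuda_md6_hash(const uint64_t *data, size_t nbuffers,
                          uint64_t digest[16], int rounds);

int main()
{
    static uint64_t data[8 * 64] = {0};   // 8 chunks x 512 B of padded input
    uint64_t digest[16];
    cuda_md6_hash(data, 8, digest, 104);  // r = 104 for a 256-bit digest
    return 0;
}
```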
CUDA MD6 for File Matching
Absolute file matching
- Each message digest is unique to its input
- The user can supply a predetermined set of hashes
- Input hashes are compared against the GPU-generated hashes (see the sketch below)
File matching can be done in two modes
- Direct hashing (single files)
- Recursive hashing (archives of files)
Hashing larger files
- Larger files are broken down into data chunks
- Each chunk is hashed, and the results are finally aggregated
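A hedged sketch of the comparison step: GPU-computed digests are looked up in the user-supplied set of known hashes (the hashdb of step (vi)). The hex encoding, the container, and truncation to the leading 256 bits are illustrative assumptions; the slides do not specify the storage format.

```cuda
#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_set>

// Encode the first dbits of a 16-word chaining value as lowercase hex.
// (The MD6 output-truncation convention is simplified here.)
static std::string to_hex(const uint64_t cv[16], int dbits)
{
    std::string s;
    char w[17];
    for (int i = 0; i < dbits / 64; ++i) {
        std::snprintf(w, sizeof w, "%016llx", (unsigned long long)cv[i]);
        s += w;
    }
    return s;
}

// True if the file's digest matches one of the predetermined hashes.
bool match_file(const uint64_t digest[16],
                const std::unordered_set<std::string> &hashdb)
{
    return hashdb.count(to_hex(digest, 256)) != 0;
}
```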
Experiments and Results
Benchmarking platform
GPU: NVIDIA GeForce 8800 GTX (112 cores), CUDA Toolkit 2.2
CPU: quad-core Intel Xeon E5335, running a sequential, iterative implementation of MD6
Experiment 1: Executing CUDA MD6 on single files
Experiment 2: Executing CUDA MD6 on an archive of files
Experiment 3: Executing CUDA MD6 with varying buffer sizes
Number of compression rounds vs. speedup
Wall-clock time vs. kernel execution time
Conclusion
- Speedups ranged from 2× to more than 250×
- Performance degraders: host-to-device data transfer, device initialization, idle threads
- Faster hashing also depends on hash integrity
- Speedup should scale with an increased number of GPU cores
Questions…
Thank you!!!