An automated pipeline balancing in the SRC Reconfigurable Computer and its application to the RC5 cipher breaking Hatim Diab 1, Miaoqing Huang 1, Kris.


1 An automated pipeline balancing in the SRC Reconfigurable Computer and its application to the RC5 cipher breaking
Hatim Diab 1, Miaoqing Huang 1, Kris Gaj 2, Tarek El-Ghazawi 1, Nikitas Alexandridis 1
1 The George Washington University
2 George Mason University

2 Objectives
–Implement a pipelined RC5 key breaker on a single chip
–Demonstrate automatic balancing of a pipeline by a compiler (SRC)
–Show the cost of the added pipeline
Diab1011/MAPLD'04

3 Requirements
Given:
–A matching pair of plaintext message (M) and ciphertext (C)
Find the corresponding secret key:
–Test the possible secret keys exhaustively
–Keys are 128 bits long, ranging from all 0's to all 1's
Requirements:
–The processing element (PE) is fed a new secret key (K_i) each cycle
–The output C_i corresponding to K_i is compared with C
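The search described above can be sketched in software as a plain loop over candidate keys. This is only an illustration of the requirement, not the hardware design: the names `search_key` and `toy_encrypt` are hypothetical, and `toy_encrypt` is a stand-in cipher (it is not RC5) used so the sketch runs on its own.

```python
from typing import Callable, Optional, Tuple

Block = Tuple[int, int]  # (upper 32-bit word, lower 32-bit word)


def search_key(encrypt: Callable[[int, Block], Block],
               plain: Block, cipher: Block,
               start: int = 0, stop: int = 1 << 128) -> Optional[int]:
    """Return the first key in [start, stop) that maps `plain` to `cipher`."""
    key = start
    while key < stop:
        # One candidate key K_i per iteration; the hardware does this once per clock.
        if encrypt(key, plain) == cipher:
            return key
        key += 1
    return None


def toy_encrypt(key: int, block: Block) -> Block:
    """Stand-in cipher for demonstration only (NOT RC5)."""
    mask = 0xFFFFFFFF
    a, b = block
    return ((a + key) & mask, (b ^ (key * 0x9E3779B9)) & mask)


if __name__ == "__main__":
    plain = (1, 2)
    target = toy_encrypt(0x1234, plain)
    print(hex(search_key(toy_encrypt, plain, target, stop=1 << 16)))
```

Passing the cipher as a function keeps the loop independent of the algorithm, so an RC5 routine could be substituted for the stand-in.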

4 RC5 Algorithm
Mixing in the secret key:
  // S[0..25] is the array to be mixed for RC5 encryption
  // L[0..3] is the array converted from the secret key K[0..15]
  i=j=0
  A=B=0
  do 3*max(26,4) times
    A=S[i]=(S[i]+A+B)<<<3;
    B=L[j]=(L[j]+A+B)<<<(A+B);
    i=(i+1) mod (26);
    j=(j+1) mod (4);
  // The output is the array S[0..25], which will be used to encrypt the plain text.
Encryption:
  LE=A+S[0]; // A is the upper part of the plain text
  RE=B+S[1]; // B is the lower part of the plain text
  for i=1 to 12 do
    LE=((LE ⊕ RE)<<<RE)+S[2*i];
    RE=((RE ⊕ LE)<<<LE)+S[2*i+1];
The processed LE is the upper part of the cipher text; the processed RE is the lower part.
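A minimal software sketch of the two phases above for RC5-32/12/16 follows. The slide does not show how S is initialized or how key bytes are packed into L; the magic constants `P32` and `Q32` and the little-endian packing used here come from the published RC5 description, not from this presentation. The function names are hypothetical.

```python
MASK = 0xFFFFFFFF                      # 32-bit words
P32, Q32 = 0xB7E15163, 0x9E3779B9      # RC5 magic constants (from the RC5 description)


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n (only the low 5 bits of n are used)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK


def expand_key(key: bytes, rounds: int = 12) -> list:
    """Mix the secret key into the table S[0..2*rounds+1]."""
    t = 2 * (rounds + 1)               # 26 for 12 rounds
    c = max(1, len(key) // 4)          # 4 words for a 16-byte key
    L = [int.from_bytes(key[4 * i:4 * i + 4], "little") for i in range(c)]
    S = [(P32 + i * Q32) & MASK for i in range(t)]
    A = B = i = j = 0
    for _ in range(3 * max(t, c)):     # 78 mixing steps
        A = S[i] = rotl((S[i] + A + B) & MASK, 3)
        B = L[j] = rotl((L[j] + A + B) & MASK, A + B)
        i = (i + 1) % t
        j = (j + 1) % c
    return S


def encrypt(S: list, A: int, B: int, rounds: int = 12) -> tuple:
    """Encrypt one 64-bit block given as two 32-bit words (A upper, B lower)."""
    LE = (A + S[0]) & MASK
    RE = (B + S[1]) & MASK
    for i in range(1, rounds + 1):
        LE = (rotl(LE ^ RE, RE) + S[2 * i]) & MASK
        RE = (rotl(RE ^ LE, LE) + S[2 * i + 1]) & MASK
    return LE, RE


if __name__ == "__main__":
    le, re_ = encrypt(expand_key(bytes(16)), 0, 0)
    print(f"{le:08X} {re_:08X}")
```

With an all-zero key and an all-zero plain text block, this reproduces the published RC5-32/12/16 test vector EEDBA521 6D8F4B15.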

5 Key-Breaking Flowchart

6 Condition & Implementation
RC5-32/12/16
–Cipher text: 32*2 bits = 64 bits
–12 rounds
–Key: 16 * 8 bits = 128 bits
Implement RC5 encryption using
–12 rounds of encryption macros, with 6 clocks of latency
–78 iterations of key generation macros, with 3 clocks of latency

7 Design & Bottleneck
Pipelined design
–Process one key every clock cycle in a pipelined fashion
Data dependencies
–RC5 makes extensive use of data-dependent rotations
–Each S value is needed again every 26th step
–Each L value is needed again every 4th step
Manual HDL-based realization of the pipeline proved to be time-consuming and error-prone.
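The "every 26th step" and "every 4th step" dependencies follow directly from the index updates in the key-mixing loop. The sketch below lists, for each of the 78 mixing steps, which S and L entries are touched and which earlier step last wrote them; the function name `schedule` is hypothetical.

```python
T, C, STEPS = 26, 4, 78   # size of S, size of L, number of mixing steps


def schedule(steps: int = STEPS, t: int = T, c: int = C) -> list:
    """Return one tuple per mixing step k:
    (S index, L index, step that last wrote that S entry, step that last wrote that L entry).
    A producer of None means the entry still holds its initial value."""
    rows = []
    for k in range(steps):
        s_producer = k - t if k >= t else None
        l_producer = k - c if k >= c else None
        rows.append((k % t, k % c, s_producer, l_producer))
    return rows


if __name__ == "__main__":
    for k, row in enumerate(schedule()):
        if k in (0, 4, 26, 77):
            print(k, row)
```

In a fully unrolled pipeline, each such producer-to-consumer distance is a value that has to be carried forward across the intermediate stages.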

8 Data Dependencies in Each Iteration

9 Solution
Implement on one FPGA chip concurrently
–78 key initialization macros
–12 encryption macros
Connect the macros in a linear pipeline.
The SRC compiler balances the pipeline by inserting delay channels so that all macros run synchronously.
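The idea of pipeline balancing can be sketched as a small graph computation: every input of a macro must arrive in the same cycle, so earlier-arriving inputs are padded with delay registers. This is a generic illustration under stated assumptions, not the SRC compiler's actual algorithm. The function `balance`, the node names, and the tiny example graph are hypothetical; the latencies 3 and 6 are taken from the slides and assumed here to apply per macro.

```python
from typing import Dict, List, Tuple


def balance(latency: Dict[str, int],
            edges: List[Tuple[str, str]]) -> Dict[Tuple[str, str], int]:
    """Return the number of delay registers needed on each edge.

    `latency` maps node -> clocks from input to output and must list the
    nodes in topological order (producers before consumers).
    """
    ready: Dict[str, int] = {}                  # cycle at which a node's output is valid
    delays: Dict[Tuple[str, str], int] = {}
    for node in latency:
        inputs = [(src, ready[src]) for src, dst in edges if dst == node]
        start = max((t for _, t in inputs), default=0)   # latest-arriving input
        for src, t in inputs:
            delays[(src, node)] = start - t              # pad the earlier inputs
        ready[node] = start + latency[node]
    return delays


if __name__ == "__main__":
    # Hypothetical 3-macro chain: two key macros and one encryption macro.
    latency = {"in": 0, "k0": 3, "k1": 3, "enc": 6}
    edges = [("in", "k0"), ("k0", "k1"), ("in", "k1"), ("k1", "enc"), ("in", "enc")]
    for edge, regs in balance(latency, edges).items():
        print(edge, regs)
```

In the example, the value that bypasses `k0` needs 3 registers to stay aligned with `k0`'s output, and the value that bypasses both key macros needs 6.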

10 Delay Channels Added by SRC Compiler
[Figure legend: Delay 1 = 1 reg; Delay 2 = 2 reg; Delay 5 = 5 reg; wire]

11 Detailed flow

12 Compilation Result
Device utilization summary:
Number of External IOBs          594 out of 1104     53%
Number of LOCed External IOBs    594 out of 594     100%
Number of Slices               33790 out of 33792    99%
Number of BUFGMUXs                 1 out of 16        6%
Maximum Clock Frequency

13 Effectiveness of the Benchmark
Cipher Text        Expected Key / Found Key             Time (SRC) (µs)  Time (PC) (µs)
EEDBA521 6D8F4B15  00000000 00000000 00000000 00000000           97,342  0
C53073A4 8AFAE310  00000000 00000000 00000000 00010000           98,028  359,000
07CEC757 C72BCAE9  00000000 00000000 00000000 10000000        2,781,980  1,847,105,000
2F68DC4A ADBFACC6  00000000 00000000 00000000 20000000        5,466,274  5,251,282,000
6643CACD D1EDD161  00000000 00000000 00000001 00000000       43,050,562  Too large to simulate
51C6514A 4EF0A99B  00000000 00000000 00000010 00000000      687,318,493  Too large to simulate
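The benchmark figures can be turned into a rough throughput estimate. The sketch below rests on two assumptions that the slides do not state: that the search starts at the all-zero key, so the found key's integer value approximates the number of keys tested, and that the SRC time for the all-zero key (97,342 µs) is a fixed overhead. The names used are hypothetical.

```python
# (keys tested, SRC time in µs, PC time in µs or None), taken from the benchmark table.
# ASSUMPTION: keys tested ~= integer value of the found key (search starts at zero).
ROWS = [
    (0x10000000, 2_781_980, 1_847_105_000),
    (0x20000000, 5_466_274, 5_251_282_000),
    (0x1_00000000, 43_050_562, None),
    (0x10_00000000, 687_318_493, None),
]
OVERHEAD_US = 97_342  # SRC time for the all-zero key, treated as fixed overhead (assumption)


def keys_per_us(keys: int, src_us: int, overhead_us: int = OVERHEAD_US) -> float:
    """Keys tested per microsecond after subtracting the assumed fixed overhead."""
    return keys / (src_us - overhead_us)


def speedup(src_us: int, pc_us: int) -> float:
    """Ratio of PC time to SRC time for the same search."""
    return pc_us / src_us


rates = [keys_per_us(k, t) for k, t, _ in ROWS]
speedups = [speedup(t, pc) for _, t, pc in ROWS if pc is not None]

for (k, t, _), r in zip(ROWS, rates):
    print(f"{k:>12} keys  {r:8.3f} keys/µs")
for s in speedups:
    print(f"speed-up over PC: {s:.0f}x")
```

Under these assumptions the rate comes out close to 100 keys per µs for every row, which would be consistent with one key per clock at roughly 100 MHz, and the speed-up over the PC is roughly 660x and 960x for the two large searches with measured PC times. The slides themselves do not give the clock frequency.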

14 Conclusion
–The objective was realized: one 128-bit variable is pushed into the processing chain every clock
–A speed-up of 1000x over SW and 300x over serial HW implementations was achieved
–Because the parameters of the RC5 algorithm are flexible, different map routines can be designed to fit distinct area and throughput requirements
–The automated pipeline balancing of the SRC compiler proved to substantially decrease the development time of complex pipelined designs

