doc.: IEEE /218r2 Submission July, 2002 Rene Struik, Certicom Corp.
Slide 1
Project: IEEE P Working Group for Wireless Personal Area Networks (WPANs)
Submission Title: [AES Mode Discussion]
Date Submitted: [15 May, 2002]
Source: [Rene Struik] Company [Certicom Corp.] Address [5520 Explorer Drive, 4th Floor, Mississauga, ON Canada L4W 5L1] Voice:[+1 (905) ], FAX: [+1 (905) ], Re: []
Abstract: [This document discusses trade-offs between different block-cipher modes of operation and their suitability within the IEEE High-Rate WPAN context.]
Purpose: [Highlight trade-offs that govern the choice of symmetric algorithms for the IEEE WPAN.]
Notice: This document has been prepared to assist the IEEE P It is offered as a basis for discussion and is not binding on the contributing individual(s) or organization(s). The material in this document is subject to change in form and content after further study. The contributor(s) reserve(s) the right to add, amend or withdraw material contained herein.
Release: The contributor acknowledges and accepts that this contribution becomes the property of IEEE and may be made publicly available by P
Slide 2
AES-Mode of Operation (and MAC-function) Discussion for IEEE WPANs
René Struik, Certicom Research
Slide 3
Outline
Block Cipher Modes of Operation:
- Cipher-Block Chaining (CBC Mode);
- Counter Mode (CTR Mode).
Message Authentication Codes:
- Based on Block Ciphers: CBC-MAC;
- Based on Un-Keyed Hash Functions: HMAC.
Implementation Issues:
- Hardware vs. software implementation;
- Computational efficiency considerations.
Slide 4
Block-Cipher Modes of Operation: CBC Mode (1)
[Figure: CBC encryption chain (blocks $x_1, \dots, x_m$ through $E_K$ to $c_1, \dots, c_m$) and CBC decryption chain (blocks $c_1, \dots, c_m$ through $D_K$ to $x_1, \dots, x_m$), each starting from $c_0 := IV$.]
Encryption algorithm: $c_j := E_K(x_j \oplus c_{j-1})$ for all $j > 0$; $c_0 := IV$.
Decryption algorithm: $x_j := D_K(c_j) \oplus c_{j-1}$ for all $j > 0$; $c_0 := IV$.
Slide 5
Block-Cipher Modes of Operation: CBC Mode (2)
Encryption algorithm: $c_j := E_K(x_j \oplus c_{j-1})$ for all $j > 0$; $c_0 := IV$.
Decryption algorithm: $x_j := D_K(c_j) \oplus c_{j-1}$ for all $j > 0$; $c_0 := IV$.
Security requirement:
- IV should be unpredictable.
Encryption computation:
- No parallelization of computation possible;
- Access to plaintext required for computation;
- Access to IV needed for computation.
Decryption computation:
- Full parallelization of computation possible;
- Access to ciphertext required for computation;
- Access to IV needed for computation.
Message size:
- Plaintext expansion might be needed (size is a multiple of the encryption block length).
Implementation:
- Both encryption function $E_K$ and decryption function $D_K$ need to be implemented.
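The CBC recurrences on Slides 4 and 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the proposed implementation: `toy_encrypt_block` and `toy_decrypt_block` are hypothetical stand-ins for the encryption and decryption functions (a small Feistel construction over SHA-256, not AES-128), and all function names are introduced here for illustration only.

```python
import hashlib

BLOCK = 16  # block length in bytes (128 bits)

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel network (assumption, not AES).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Hypothetical stand-in for E_K: 4-round Feistel permutation on 16 bytes.
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    # Hypothetical stand-in for D_K: undoes the rounds in reverse order.
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # c_j := E_K(x_j XOR c_{j-1}), c_0 := IV. Inherently sequential.
    if len(plaintext) % BLOCK != 0:
        raise ValueError("plaintext must be padded to a multiple of the block length")
    prev, out = iv, []
    for j in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[j:j + BLOCK], prev))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    # x_j := D_K(c_j) XOR c_{j-1}. Each block depends only on ciphertext
    # blocks, so the blocks could be processed in parallel.
    blocks = [iv] + [ciphertext[j:j + BLOCK] for j in range(0, len(ciphertext), BLOCK)]
    return b"".join(
        _xor(toy_decrypt_block(key, blocks[j]), blocks[j - 1])
        for j in range(1, len(blocks))
    )
```

Note that the sketch needs both block directions, which matches the implementation remark on Slide 5; padding is left to the caller.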
Slide 6
Block-Cipher Modes of Operation: CBC Mode (3)
Example: Use of CBC mode of operation in IEEE High-Rate WPAN.
Block cipher: AES-128 {block length: 128 bits}.
$IV = E_K(Id_A \,\|\, Nonce \,\|\, j)$, where
- $Id_A$: identifier of sender (64-bit field, right-adjusted);
- Nonce: inter-frame sequence number (48-bit field, right-adjusted);
- $j$: intra-frame sequence number (16-bit field, right-adjusted).
Motivation:
- IV is obtained via encryption, to ensure that it is unpredictable for outsiders;
- $Id_A$ is included to ensure logical separation between senders who have the same key (no re-use of the same IV value between different senders; no synchronization required);
- Nonce and $j$ are included to ensure no re-use of the same IV between different data frames (via increment of the Nonce value) and within data blocks in a frame (via increment of the $j$ value).
Combinatorial freedom:
- Maximum size of data frame = max. #blocks × encryption block size = $2^{16} \times 2^{7}$ bits = $2^{23}$ bits = 1 Mbyte;
- Maximum #data frames = max. #Nonce values = $2^{48}$ data frames. (At 1 Gbps data rate, exhaustion after roughly 1 year, if all data frames consist of only 1 block.)
NB: current max. frame length: $2^{14}$ bits = 2 kbytes; at 55 Mbps data rate, exhaustion after >20 yrs.
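The combinatorial-freedom figures on this slide can be re-derived with a few lines of Python. This is only an arithmetic check; the constant and function names are assumptions introduced here, and a year is taken as 365 days.

```python
BLOCK_BITS = 128   # AES block length in bits (2^7)
J_BITS = 16        # width of the intra-frame sequence number j
NONCE_BITS = 48    # width of the inter-frame sequence number (Nonce)
SECONDS_PER_YEAR = 365 * 24 * 3600

# Largest frame: 2^16 blocks of 2^7 bits each = 2^23 bits = 2^20 bytes.
max_frame_bits = (2 ** J_BITS) * BLOCK_BITS
max_frame_bytes = max_frame_bits // 8

# One Nonce value per data frame.
max_frames = 2 ** NONCE_BITS

def years_to_exhaust(rate_bps: float, frame_bits: int = BLOCK_BITS) -> float:
    """Years until all Nonce values are used when every frame carries
    frame_bits bits and the link runs continuously at rate_bps."""
    return max_frames * frame_bits / rate_bps / SECONDS_PER_YEAR

worst_case_1gbps = years_to_exhaust(1e9)     # single-block frames at 1 Gbps
worst_case_55mbps = years_to_exhaust(55e6)   # single-block frames at 55 Mbps
```

With single-block frames this gives a little over one year at 1 Gbps and a little over twenty years at 55 Mbps, in line with the slide; longer frames only extend the time to exhaustion.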
Slide 7
Block-Cipher Modes of Operation: CTR Mode (1)
[Figure: CTR encryption and decryption. In both directions each counter $t_j$ ($j = 1, \dots, m$) passes through $E_K$ and the result is XORed with the input block ($x_j$ for encryption, $c_j$ for decryption).]
Encryption algorithm: $c_j := E_K(t_j) \oplus x_j$ for all $j > 0$.
Decryption algorithm: $x_j := E_K(t_j) \oplus c_j$ for all $j > 0$.
Slide 8
Block-Cipher Modes of Operation: CTR Mode (2)
Encryption algorithm: $c_j := E_K(t_j) \oplus x_j$ for all $j > 0$.
Decryption algorithm: $x_j := E_K(t_j) \oplus c_j$ for all $j > 0$.
Security requirement:
- Counters $t_1, t_2, t_3, \dots$ shall all be distinct over the lifetime of key $K$.
Encryption computation:
- Full parallelization of computation possible;
- No access to plaintext required for computation of the key stream $E_K(t_j)$;
- Access to $t_1, t_2, t_3, \dots$ needed for computation.
Decryption computation:
- Full parallelization of computation possible;
- No access to ciphertext required for computation of the key stream $E_K(t_j)$;
- Access to $t_1, t_2, t_3, \dots$ needed for computation.
Message size:
- No plaintext expansion needed! (ciphertext can be truncated to plaintext length).
Implementation:
- Only encryption function $E_K$ needs to be implemented.
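A minimal Python sketch of CTR mode, assuming a hypothetical stand-in `toy_E` for the block encryption function (a truncated SHA-256 of key and counter, not AES-128). Since CTR mode never inverts the block function, a keyed one-way function is enough to illustrate the data flow; the names `toy_E` and `ctr_crypt` are introduced here for illustration only.

```python
import hashlib

BLOCK = 16  # block length in bytes (128 bits)

def toy_E(key: bytes, block: bytes) -> bytes:
    # Hypothetical stand-in for E_K (NOT AES-128, not even a permutation).
    return hashlib.sha256(key + block).digest()[:BLOCK]

def ctr_crypt(key: bytes, counters: list, data: bytes) -> bytes:
    """Encrypts or decrypts: output block j is E_K(t_j) XOR input block j.
    The same routine serves both directions."""
    n_blocks = -(-len(data) // BLOCK)  # ceiling division
    if len(counters) < n_blocks:
        raise ValueError("not enough counter blocks")
    if len(set(counters)) != len(counters):
        raise ValueError("counter blocks must be distinct")
    out = bytearray()
    for j in range(n_blocks):
        keystream = toy_E(key, counters[j])   # independent of the data
        chunk = data[j * BLOCK:(j + 1) * BLOCK]
        # zip() stops at the shorter input, so a short final chunk simply
        # uses a truncated key stream block: no padding is needed.
        out += bytes(a ^ b for a, b in zip(chunk, keystream))
    return bytes(out)

key = b"0123456789abcdef"
counters = [i.to_bytes(BLOCK, "big") for i in range(1, 3)]
message = b"twenty-byte message!"
ciphertext = ctr_crypt(key, counters, message)
recovered = ctr_crypt(key, counters, ciphertext)
```

Each loop iteration is independent of the others, which is what makes full parallelization and key stream precomputation possible.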
Slide 9
Block-Cipher Modes of Operation: CTR Mode (3)
Example: Use of CTR mode of operation in IEEE High-Rate WPAN.
Block cipher: AES-128 {block length: 128 bits}.
counter value = $(Id_A \,\|\, Nonce \,\|\, j)$, where
- $Id_A$: identifier of sender (64-bit field, right-adjusted);
- Nonce: inter-frame sequence number (48-bit field, right-adjusted);
- $j$: intra-frame sequence number (16-bit field, right-adjusted).
Motivation:
- $Id_A$ is included to ensure logical separation between senders who have the same key (no re-use of the same counter value between different senders; no synchronization required);
- Nonce and $j$ are included to ensure no re-use of the same counter value between different data frames (via increment of the Nonce value) and within blocks in a frame (via increment of the $j$ value).
Combinatorial freedom:
- Maximum size of data frame = max. #blocks × encryption block size = $2^{16} \times 2^{7}$ bits = $2^{23}$ bits = 1 Mbyte;
- Maximum #data frames = max. #Nonce values = $2^{48}$ data frames. (At 1 Gbps data rate, exhaustion after roughly 1 year, if all data frames consist of only 1 block.)
NB: current max. frame length: $2^{14}$ bits = 2 kbytes; at 55 Mbps data rate, exhaustion after >20 yrs.
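The 128-bit counter layout described on this slide might be packed as follows. This is a sketch under assumptions: big-endian byte order and the reading of "right-adjusted" as zero-padding on the left are choices made here, and `counter_block` is a hypothetical helper name.

```python
def counter_block(id_a: int, nonce: int, j: int) -> bytes:
    """Builds the 16-byte counter value Id_A || Nonce || j
    (64-bit sender identifier, 48-bit Nonce, 16-bit block index)."""
    if not 0 <= id_a < 2 ** 64:
        raise ValueError("Id_A must fit in 64 bits")
    if not 0 <= nonce < 2 ** 48:
        raise ValueError("Nonce must fit in 48 bits")
    if not 0 <= j < 2 ** 16:
        raise ValueError("j must fit in 16 bits")
    # Assumption: each field is stored big-endian, left-padded with zeros.
    return id_a.to_bytes(8, "big") + nonce.to_bytes(6, "big") + j.to_bytes(2, "big")

# Two senders, two frames each, three blocks per frame: all counters differ.
blocks = {
    counter_block(sender, frame, j)
    for sender in (0xA1, 0xB2)
    for frame in (0, 1)
    for j in range(3)
}
```

Because the three fields occupy disjoint byte ranges, two counter values collide only if sender, Nonce and block index all coincide, which is the separation property the slide argues for.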
Slide 10
MACs Based on Block-Ciphers: CBC-MAC (1)
[Figure: CBC-MAC chain over blocks $x_1, \dots, x_m$ through $E_K$ with IV := 0 and output $c_m$; Strengthened CBC-MAC chain with the same structure followed by $D_{K'}$ and $E_K$ to produce the MAC.]
CBC-MAC algorithm: $c_j := E_K(x_j \oplus c_{j-1})$ for $j = 1, \dots, m$; $c_0 := IV := 0$; $MAC := c_m$.
Strengthened CBC-MAC algorithm (Bellare, Kilian, Rogaway): $c_j := E_K(x_j \oplus c_{j-1})$ for $j = 1, \dots, m$; $c_0 := IV := 0$; $MAC := E_K(D_{K'}(c_m))$.
Slide 11
MACs Based on Block-Ciphers: CBC-MAC (2)
Security requirement:
- Keys $K$ and $K'$ should be independent; {This prevents chosen-text existential forgery attacks.}
- If $K = K'$, then Strengthened CBC-MAC reduces to ‘folklore’ CBC-MAC (which is only secure for fixed-length messages).
(Strengthened) CBC-MAC computation:
- No parallelization of computation possible;
- Management of two keys, $K$ and $K'$, required.
Data integrity field size:
- MAC value has size equal to encryption block length (truncated outputs possible, in exchange for reduced security level).
Implementation:
- Both encryption function $E_K$ and decryption function $D_{K'}$ need to be implemented.
Standard: FIPS Pub 113 for DES; unknown whether continued for AES-128.
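The two MAC constructions from Slides 10 and 11 can be sketched in Python as below. The block functions `E` and `D` are hypothetical stand-ins (a toy Feistel permutation over SHA-256, not AES-128), and all names are assumptions introduced here. The sketch also reproduces the well-known forgery that makes plain CBC-MAC unsuitable for variable-length messages.

```python
import hashlib

BLOCK = 16  # block length in bytes (128 bits)

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def E(key: bytes, block: bytes) -> bytes:
    # Hypothetical stand-in for E_K: 4-round Feistel permutation (not AES).
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def D(key: bytes, block: bytes) -> bytes:
    # Inverse of E under the same key.
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    # c_j := E_K(x_j XOR c_{j-1}), c_0 := 0, MAC := c_m.
    if not msg or len(msg) % BLOCK != 0:
        raise ValueError("message must be a non-empty multiple of the block length")
    c = bytes(BLOCK)
    for j in range(0, len(msg), BLOCK):
        c = E(key, _xor(msg[j:j + BLOCK], c))
    return c

def strengthened_cbc_mac(k: bytes, k2: bytes, msg: bytes) -> bytes:
    # MAC := E_K(D_K'(c_m)); with k2 == k this collapses to plain CBC-MAC.
    return E(k, D(k2, cbc_mac(k, msg)))

k, k2 = b"K" * 16, b"Q" * 16
m = b"one 16-byte blk!"
tag = cbc_mac(k, m)
# Forgery on plain CBC-MAC: m || (m XOR tag) has the same tag as m.
forged = m + _xor(m, tag)
```

The forged two-block message works because the second chaining step computes `E(k, (m XOR tag) XOR tag) = E(k, m) = tag`; the extra decrypt-encrypt step under an independent key is what the strengthened variant adds.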
Slide 12
MACs Based on Un-keyed Hash Functions: HMAC (1)
Security requirement:
- HMAC should use un-keyed hash function of same security level.
HMAC computation:
- No parallelization of computation possible;
- Management of 1 key, viz. $K$, required.
Data integrity field size:
- HMAC value has size equal to 1 encryption block length (truncated outputs possible, in exchange for reduced security level).
Implementation:
- Un-keyed hash function needs to be implemented.
Standard:
- Draft FIPS Pub #HMAC (specification of HMAC)
- Draft FIPS Pub (specification of SHA-256)
Slide 13
MACs Based on Un-keyed Hash Functions: HMAC (2)
HMAC-256:
- Building block: SHA-256. Block size: 512 bits.
- Operations on 32-bit words:
  - logical AND, XOR, NOT;
  - integer additions modulo $2^{32}$;
  - rotations, shifts.
- Storage:
  - temporary storage: roughly 10 words (of 32 bits);
  - permanent storage: roughly 8 words (of 32 bits).
- Computational overhead:
  - roughly same as SHA-256.
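For illustration, HMAC over SHA-256 can be written out with Python's standard `hashlib` and compared against the standard `hmac` module. The helper names are assumptions introduced here; the truncation to the leftmost 128 bits follows the recommendation on the Conclusion slide.

```python
import hashlib
import hmac

SHA256_BLOCK_BYTES = 64  # SHA-256 processes 512-bit blocks

def hmac_sha256_manual(key: bytes, msg: bytes) -> bytes:
    """HMAC written out explicitly: H((K ^ opad) || H((K ^ ipad) || msg))."""
    if len(key) > SHA256_BLOCK_BYTES:
        key = hashlib.sha256(key).digest()       # long keys are hashed first
    key = key.ljust(SHA256_BLOCK_BYTES, b"\x00")  # then zero-padded to a block
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + msg).digest()
    return hashlib.sha256(opad + inner).digest()

def truncated_tag(key: bytes, msg: bytes, n_bytes: int = 16) -> bytes:
    """HMAC-SHA-256 truncated to its leftmost n_bytes (default 128 bits)."""
    return hmac.new(key, msg, hashlib.sha256).digest()[:n_bytes]

key = b"shared piconet key (example)"
frame = b"example data frame payload"
full = hmac_sha256_manual(key, frame)
tag = truncated_tag(key, frame)
# A receiver would recompute the tag and compare in constant time.
ok = hmac.compare_digest(tag, truncated_tag(key, frame))
```

For long messages the two extra hash invocations are a fixed cost, so the per-bit workload is dominated by SHA-256 itself, consistent with the "roughly same as SHA-256" remark above.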
Slide 14
Implementation Issues (1)
Block cipher: AES-128 in one of the following modes: (1) CBC mode; (2) CTR mode.
Un-keyed hash function: SHA-256 {block length: 512 bits}.
Keyed hash function: (1) HMAC-256; (2) CBC-MAC function.
AES-128 implementation:
- CBC Mode: implement both AES-128 encryption and AES-128 decryption;
- CTR Mode: implement AES-128 encryption only.
- Lowest gate count: AES-128 in CTR mode.
SHA-256 cost during key agreement (if implemented in software):
- Full MQV with Key Confirmation: additional 15% workload compared to hardware only.
- Modified-ECIES TLS-Variant Key agreement: additional 30% workload compared to hardware only.
MAC implementation (in hardware):
- CBC-MAC: implement both AES-128 encryption and AES-128 decryption;
- HMAC: implement SHA-256.
- Lowest gate count: roughly equal!!! (if encr + auth computations not carried out in parallel)
* AES-OCB with Authentic Side Information: implement both AES-128 encryption and decryption.
(Note: Attractive if 55 Mbps 500 Mbps, since encr + auth computations carried out in parallel.)
Slide 15
Implementation Issues (2)
Gate count figures [1]:
- Option 1: AES-128 in CBC Mode + SHA-256: 30.7k (21.7k if slow SHA-256 used)
- Option 2: AES-128 in CTR Mode + SHA-256: 27k (18k if slow SHA-256 used)
- Option 3: AES-128 in CTR Mode + Strengthened CBC-MAC: 35.1k [26.7k]*
  Same but with ‘basic’ CBC-MAC: 26.7k [21.1k]*
- Option 4: AES-128 in OCB Mode: 32.1k [29.3k]*
Figures are based on the following:
1. Gate count for temporary and permanent storage (e.g., keying material and status info), both at cost of 6 gates/bit and 4 gates/bit {latter in square brackets *}.
2. Cost includes gate count of core algorithm plus mode gate count and the minimum number of registers to support base core function technology, based on 8051 processor, RAM/ROM, RF circuitry, and Register File.
4. Throughput rate: 50 Mbps; clock rate: 40 MHz.
The architecture of the system IC design could have impact on both the gate count and performance of the hardware macro modules.
[1] Estimates courtesy of Motorola Labs - Advanced Systems Architecture Security Technology Center. (Improved gate count figures using ‘slow’ SHA-256 courtesy of Certicom Embedded Systems Group.)
Slide 16
Implementation Issues (3)
Speed comparison of Message Authentication Codes
CBC-MAC function:
- Cost of message authentication ≈ cost of data encryption (per bit).
  [Message integrity over n blocks: n+2 block cipher applications; encryption over n blocks: n block cipher operations (discarding initialization of the IV)]
HMAC-256 based on the un-keyed hash function SHA-256:
- Cost of message authentication ≈ 14% of cost of data encryption (per bit).
  [At 40 MHz clock rate, the throughput figures are as follows (example): AES-128 in CTR mode: ±50 Mbps; SHA-256: ±350 Mbps]
Important Note (trade-off between gate count and efficiency of SHA-256 implementation) (!!!):
- Full speed HMAC-256: 5k+17.5k gates; Slow HMAC-256: 5k+4.6k gates.
  (‘Full speed’ HMAC-256 operates < 2× as fast as ‘Slow’ HMAC-256.)
- Note: ‘slow’ HMAC-256 is still much faster than AES-128 (so the slow-down is not noticeable).
Message authentication based on AES-OCB mode of operation:
- Cost of message authentication ≈ 2/L × cost of data encryption (per bit).
  [L is the message length in #encryption blocks]
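The relative costs quoted on this slide follow from simple ratios, sketched below in Python. The function names are assumptions introduced here, and the default throughput values are the example figures from the slide (AES-128 in CTR mode at about 50 Mbps, SHA-256 at about 350 Mbps, both at 40 MHz).

```python
def cbc_mac_cost_ratio(n_blocks: int) -> float:
    """Authentication cost relative to encryption cost for CBC-MAC:
    n+2 block cipher applications versus n."""
    return (n_blocks + 2) / n_blocks

def hmac_cost_ratio(aes_mbps: float = 50.0, sha_mbps: float = 350.0) -> float:
    """Per-bit cost of HMAC-256 relative to AES-128 encryption, taken as the
    inverse ratio of the two throughput figures."""
    return aes_mbps / sha_mbps

def ocb_cost_ratio(length_in_blocks: int) -> float:
    """Authentication cost relative to encryption cost for AES-OCB: 2/L."""
    return 2 / length_in_blocks

# Example: a 2-kbyte frame is 128 blocks of 128 bits.
frame_blocks = (2 * 1024 * 8) // 128
ratios = {
    "CBC-MAC": cbc_mac_cost_ratio(frame_blocks),
    "HMAC-256": hmac_cost_ratio(),
    "AES-OCB": ocb_cost_ratio(frame_blocks),
}
```

For frames of this size the CBC-MAC ratio stays close to 1, the HMAC ratio is about one seventh, and the OCB ratio shrinks further as frames get longer.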
Slide 17
Implementation Issues (4)
Side Effect of Choice of Message Authentication Codes
SHA-256 cost during key agreement (if implemented in software):
- Full MQV with Key Confirmation: additional 15% workload compared to hardware only.
- Modified-ECIES TLS-Variant Key agreement: additional 30% workload compared to hardware only.
SHA-256 cost during Implicit Certificate Verification (if implemented in software):
- Full MQV with Key Confirmation: additional 37% workload compared to hardware only.
- Modified-ECIES TLS-Variant Key agreement: additional 37% workload compared to hardware only.
SHA-256 cost during ACL Processing (if implemented in software):
- Additional 100% workload compared to hardware only.
Slide 18
Conclusion
Optimum Choice of Symmetric Encryption Algorithms:
- Block cipher: AES-128 in CTR Mode;
- Hash function: SHA-256;
- Message authentication code: HMAC-256 based on SHA-256, truncated to leftmost 128 output bits.
Required Changes to the Draft:
Replace § of document 02/210r0 by the following text:
§ CTR Mode of Operation: The counter mode of operation (CTR Mode) for block ciphers used in this security suite shall be preceded by a counter CTR and performed as specified in NIST Special Publication A [{xref} MODES}].
Recommendation to the Editor: define CTR as specified on Slide 9 of this presentation.