Inline Integrity plus Encryption
Source: Lucent Technologies, Inc.
Sarvar Patel and Ganesh Sundaram
Recommendation: Review and adopt

Lucent Technologies grants a free, irrevocable license to 3GPP2 and its Organizational Partners to incorporate text or other copyrightable material contained in the contribution and any modifications thereof in the creation of 3GPP2 publications; to copyright and sell in Organizational Partner's name any Organizational Partner's standards publication even though it may include all or portions of this contribution; and at the Organizational Partner's sole discretion to permit others to reproduce in whole or in part such contribution or the resulting Organizational Partner's standards publication. Lucent Technologies is also willing to grant licenses under such contributor copyrights to third parties on reasonable, non-discriminatory terms and conditions for purpose of practicing an Organizational Partner’s standard which incorporates this contribution.

This document has been prepared by Lucent Technologies to assist the development of specifications by 3GPP2. It is proposed to the Committee as a basis for discussion and is not to be construed as a binding proposal on Lucent Technologies. Lucent Technologies specifically reserves the right to amend or modify the material contained herein and to any intellectual property of Lucent Technologies other than provided in the copyright statement above.

Goal
Problem statement
–All message authentication proposals require RLP reassembly before tag verification because:
  They are serial.
  They do not allow out-of-order processing.
  They work at the block level, not the byte level.
  HMAC-SHA suffers from this. AES-CBC suffers from this.
Can we have one-pass encryption/decryption and integrity tag creation/verification?
–One movement of data from RAM to Phy and vice versa?
  This is also good for software, since it allows pre-computation for authentication.
–Use a published message authentication algorithm, but adapt it to LBC needs, so that security can be easily verified.

Approach
We will try to use a message integrity algorithm that allows out-of-order processing.
Assumptions:
–During encryption, an Application (App) packet’s RLP segments are sent for encryption without interleaving from other App pkt segments.
  No interleaving: no other App packet bits are sent until the first App pkt finishes.
  The only exceptions are:
  – retransmission of some previously sent RLP segment of other App pkts
  – tunneled pkts.
We focus in the following slides on the byte-numbered RLP case, but the scheme can also be adapted to packet-numbered RLP.

Encryption and Integrity algorithm
Counter-mode encryption
–Break the message into blocks M1, M2, …, Mn, each of 32 bits.
–Ci = Mi + (respective 32 bits of) AESk(0, cryptosync i/4)
Universal-hash-based integrity
–The Ai are also 32 bits.
–A1…An = AES(1, cryptosync 1)…AES(1, cryptosync n/4)
–P is a 32-bit prime e.g ; final summation mod
–CryptosyncOfLastByte = (last byte of RLP segment in last RLP segment of the application pkt)
–We first illustrate the case when RLP segments are a multiple of 4 bytes (4x). We deal with the non-multiple case later.

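The algorithm on this slide can be sketched in a few lines of Python. This is a minimal illustration under several assumptions that are not in the slides: a keyed BLAKE2b call stands in for AES (the standard library has no AES), the "+" in the encryption step is taken to be XOR, the modulus is chosen as 2^32 − 5 because the slide gives no usable value for p, the final tag is also reduced mod p, and the tag pad is taken from the first 4 bytes of the AES(2, ·) output. The names `prf_block`, `word_at`, `crypt` and `make_tag` are hypothetical.

```python
import hashlib

P = 2**32 - 5  # assumed 32-bit prime modulus; not taken from the slides


def prf_block(key: bytes, domain: int, cryptosync: int) -> bytes:
    """Stand-in for AES_k(domain, cryptosync); returns one 16-byte block."""
    data = bytes([domain]) + cryptosync.to_bytes(8, "big")
    return hashlib.blake2b(data, key=key, digest_size=16).digest()


def word_at(key: bytes, domain: int, byte_no: int) -> int:
    """32-bit word that covers byte_no (a multiple of 4) in its 16-byte block."""
    block_start = 16 * (byte_no // 16)  # first byte of the block is the cryptosync
    offset = byte_no - block_start
    block = prf_block(key, domain, block_start)
    return int.from_bytes(block[offset:offset + 4], "big")


def crypt(key: bytes, start: int, data: bytes) -> bytes:
    """Counter-mode encrypt or decrypt; with XOR the two are the same operation."""
    out = bytearray()
    for j in range(0, len(data), 4):
        word = int.from_bytes(data[j:j + 4], "big") ^ word_at(key, 0, start + j)
        out += word.to_bytes(4, "big")
    return bytes(out)


def make_tag(key: bytes, start: int, ciphertext: bytes) -> int:
    """Sum of A_i * C_i mod P, then add 32 bits of the domain-2 block as a pad."""
    acc = 0
    for j in range(0, len(ciphertext), 4):
        c_i = int.from_bytes(ciphertext[j:j + 4], "big")
        a_i = word_at(key, 1, start + j)
        acc = (acc + a_i * c_i) % P
    last_byte = start + len(ciphertext) - 1
    pad = int.from_bytes(prf_block(key, 2, last_byte)[:4], "big")
    return (acc + pad) % P


key = b"demo key, not for real use"
message = bytes(range(32))          # eight 32-bit words, as in the example slides
ciphertext = crypt(key, 8000, message)
tag = make_tag(key, 8000, ciphertext)
```

Because each term A_i * C_i depends only on the byte position and the ciphertext word, the terms can be computed in any order and summed later, which is the property the following slides rely on.
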
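Encryption Example
[Figure: counter-mode encryption starting at the 8000th byte in the flow. Plaintext words M1…M8 (32 bits each) are combined with the keystream blocks AES(0, 8000) and AES(0, 8016) to give ciphertext words C1…C8.]
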
Message Integrity tag creation
[Figure: ciphertext words C1…C8 are multiplied by A1…A8, taken from AES(1, 8000) and AES(1, 8016). The products are reduced mod p and summed, and the lsb of AES(2, 8028) are added (encryption of tag) to give the 32-bit authentication tag.]

Hardware encryption processing
For each RLP flow Id, keep a 32-bit tag accumulator.
1. An App pkt’s RLP segments reach the hardware non-interleaved.
2. The first RLP segment is encrypted and a partial tag is created.
   The byte-based cryptosync is used for both encryption and tag creation.
   The Ai can be created as follows: given the byte sequence number, create AES(1, 16*floor(byte number/16)) and take the respective values from the block. That is, the byte number of the first byte in the block is used as the cryptosync.
   The partial 32-bit tag is created and stored in the accumulator.
3. Non-last RLP segments are also encrypted and a partial tag is created.
   The partial 32-bit tag is added to the accumulator.
4. The last RLP segment is also encrypted and a partial tag is created.
   The partial 32-bit tag is added to the accumulator.
   The tag gets encrypted, i.e. 32 bits of AES(2, LastByte) are added to the accumulator.
   The tag is appended to the RLP segment and sent.
   The tag is also sent to the CP/RAM for storage in case of RLP retransmission.
   Only for the final RLP segment of an App pkt is the tag sent to RAM.
   The tag could be written by hardware into a small buffer that holds the last RLP segment’s byte number and the 32-bit tag. The CP can periodically read the buffer, so there is no waiting for access to the PC bus.

Hardware encryption - Retransmission
When the CP sends an RLP fragment to the hardware, it also sends a bit (not sent over the air) to indicate whether the fragment is a retransmission.
–Retransmitted segments are encrypted again, but the partial tag is not created again and is not added to a tag accumulator. This is because the tag calculation has already taken place for that App pkt.
–For retransmission of the last RLP segment of an App pkt:
  The segment is encrypted.
  The CP sends the authentication tag for the pkt, since it was already calculated and stored in RAM previously.

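The sender-side flow, including the retransmission bit, could be modelled roughly as below. This is a sketch under assumptions rather than the proposed hardware design: keyed BLAKE2b stands in for AES, "+" in encryption is taken as XOR, the modulus 2^32 − 5 and the mod-p tag encryption are assumed, segments are multiples of 4 bytes, and the class `FlowEncryptor` with its `tag_buffer` dictionary is a hypothetical stand-in for the accumulator and the small hardware buffer.

```python
import hashlib

P = 2**32 - 5  # assumed 32-bit prime modulus; not taken from the slides


def prf_block(key, domain, cryptosync):
    """Stand-in for AES_k(domain, cryptosync); returns one 16-byte block."""
    data = bytes([domain]) + cryptosync.to_bytes(8, "big")
    return hashlib.blake2b(data, key=key, digest_size=16).digest()


def word_at(key, domain, byte_no):
    """32-bit word for byte_no, taken from the block whose first byte is the cryptosync."""
    block_start = 16 * (byte_no // 16)
    offset = byte_no - block_start
    return int.from_bytes(prf_block(key, domain, block_start)[offset:offset + 4], "big")


class FlowEncryptor:
    """One instance per RLP flow Id: holds the 32-bit tag accumulator."""

    def __init__(self, key):
        self.key = key
        self.acc = 0
        self.tag_buffer = {}  # last byte number -> tag, read later by the CP

    def send_segment(self, start, plaintext, is_last,
                     retransmission=False, stored_tag=None):
        ciphertext = bytearray()
        partial = 0
        for j in range(0, len(plaintext), 4):
            m_i = int.from_bytes(plaintext[j:j + 4], "big")
            c_i = m_i ^ word_at(self.key, 0, start + j)
            ciphertext += c_i.to_bytes(4, "big")
            partial = (partial + word_at(self.key, 1, start + j) * c_i) % P
        if retransmission:
            # Encrypted again, but the accumulator is left untouched; for a
            # last segment the CP supplies the tag it stored earlier.
            return bytes(ciphertext), (stored_tag if is_last else None)
        self.acc = (self.acc + partial) % P
        if not is_last:
            return bytes(ciphertext), None
        last_byte = start + len(plaintext) - 1
        pad = int.from_bytes(prf_block(self.key, 2, last_byte)[:4], "big")
        tag = (self.acc + pad) % P
        self.tag_buffer[last_byte] = tag  # kept for possible retransmission
        self.acc = 0                      # ready for the next App pkt
        return bytes(ciphertext), tag


hw = FlowEncryptor(b"demo key")
seg1, no_tag = hw.send_segment(8000, bytes(16), is_last=False)
seg2, tag = hw.send_segment(8016, bytes(range(16)), is_last=True)
# Retransmit the last segment: same ciphertext, tag comes from the stored copy.
seg2_again, tag_again = hw.send_segment(
    8016, bytes(range(16)), is_last=True,
    retransmission=True, stored_tag=hw.tag_buffer[8031])
```

Splitting the App pkt into two segments gives the same ciphertext and tag as processing it in one piece, since the accumulator only ever adds partial sums mod p.
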
Hardware decryption
For each (non-last) RLP segment received:
–Create a 32-bit partial tag and decrypt the segment.
–Send both the segment and the tag to RAM.
  The tag could be added to the RLP header or added to the tail.
For the last segment received:
–Create the 32-bit partial tag, adding AES(2, cryptosyncOfLastByte) to the partial tag.
  Additions and subtractions on this slide are mod
–Subtract the received tag from the partial tag above.
–Decrypt the segment.
–Send both the segment and the tag to RAM.
CP
–When the CP assembles the App pkt, it also adds together the partial tags from all the segments.
–If the tag summation == 0 then accept the pkt, else reject the pkt. That is, if the sum of the calculated partial tags equals the received tag, the tag is verified.

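A rough model of the receive path, with segments arriving out of order, might look like this. It is a sketch only and rests on assumptions that the slides leave open: keyed BLAKE2b stands in for AES, decryption is XOR, all additions and subtractions are taken mod an assumed prime 2^32 − 5, and segments are multiples of 4 bytes. The function names `receive_segment` and `cp_verify` are hypothetical.

```python
import hashlib

P = 2**32 - 5  # assumed modulus for every addition and subtraction below


def prf_block(key, domain, cryptosync):
    """Stand-in for AES_k(domain, cryptosync); returns one 16-byte block."""
    data = bytes([domain]) + cryptosync.to_bytes(8, "big")
    return hashlib.blake2b(data, key=key, digest_size=16).digest()


def word_at(key, domain, byte_no):
    block_start = 16 * (byte_no // 16)
    offset = byte_no - block_start
    return int.from_bytes(prf_block(key, domain, block_start)[offset:offset + 4], "big")


def partial_tag(key, start, ciphertext):
    """Sum of A_i * C_i mod P over the words of one segment."""
    total = 0
    for j in range(0, len(ciphertext), 4):
        c_i = int.from_bytes(ciphertext[j:j + 4], "big")
        total += word_at(key, 1, start + j) * c_i
    return total % P


def receive_segment(key, start, ciphertext, received_tag=None):
    """Hardware side: returns (plaintext, partial tag) for one segment.

    Pass received_tag only for the last segment of the App pkt.
    """
    partial = partial_tag(key, start, ciphertext)
    if received_tag is not None:
        last_byte = start + len(ciphertext) - 1
        pad = int.from_bytes(prf_block(key, 2, last_byte)[:4], "big")
        partial = (partial + pad - received_tag) % P
    plaintext = bytearray()
    for j in range(0, len(ciphertext), 4):
        word = int.from_bytes(ciphertext[j:j + 4], "big") ^ word_at(key, 0, start + j)
        plaintext += word.to_bytes(4, "big")
    return bytes(plaintext), partial


def cp_verify(partials):
    """CP side: accept the App pkt when the partial tags sum to zero."""
    return sum(partials) % P == 0


key = b"demo key"
segments = [(8000, bytes(range(16))),       # ciphertext segments of one App pkt
            (8016, bytes(range(16, 24))),
            (8024, bytes(range(24, 32)))]
whole = b"".join(ct for _, ct in segments)
pad = int.from_bytes(prf_block(key, 2, 8031)[:4], "big")
tag = (partial_tag(key, 8000, whole) + pad) % P   # what the sender appended

arrival = [segments[2], segments[0], segments[1]]  # out-of-order arrival
partials = []
for start, ct in arrival:
    is_last = start == 8024
    _, partial = receive_segment(key, start, ct, tag if is_last else None)
    partials.append(partial)
accepted = cp_verify(partials)
```

The CP never touches the payload for verification; it only adds one 32-bit value per segment while it processes the RLP headers.
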
Computational Efficiency
CP involvement is kept to a bare minimum.
–RLP headers have to be processed anyway.
–One more field is added to the RLP header sent to RAM during decryption.
–During encryption, the CP has to receive the tag back from the hardware or read it from a small buffer in hardware.
This flexible out-of-order processing has only a small overhead.
–2 AES calls are needed per 128 bits, just like other modes, e.g. counter-mode encryption + CBC-MAC.
–A 32-bit multiply is needed per 32 bits.
Next we deal with the non-multiple of 4 bytes (4x) case.

RLP segments non-multiple of 32 bits
Given a byte sequence number, one can identify the beginning byte that is a multiple of 4 bytes.
–From that point, the universal hash can be used on each 4-byte (32-bit) value.
–We still need to handle the beginning bytes that come before the "multiple of 4 byte" boundary. We also need to do this for the ending bytes of the segment.
Beginning bytes of a non-first RLP segment
–The 32-bit ciphertext $C_i$ can be written as $C_i = C_{i,a} 2^{24} + C_{i,b} 2^{16} + C_{i,c} 2^{8} + C_{i,d} 2^{0}$.
  So $C_i A_i \bmod p$ can be rewritten as $(A_i C_{i,a} 2^{24} + A_i C_{i,b} 2^{16} + A_i C_{i,c} 2^{8} + A_i C_{i,d} 2^{0}) \bmod p$.
–Example: RLP 1 ends with 3 extra bytes after the last "4 byte" multiple; the next RLP segment (RLP 2) has the remaining byte before the "4 byte" multiple begins.
  – The partial tag of RLP 1 has added $(A_i C_{i,a} 2^{24} + A_i C_{i,b} 2^{16} + A_i C_{i,c} 2^{8}) \bmod p$.
  – The partial tag of RLP 2 has added $A_i C_{i,d} \bmod p$.
  – The figure on the next slide illustrates this.

Example of partial tag calculation for non-multiple of 32 bits RLP segments
[Figure: an Application packet in an RLP stream that begins at byte number 8000 is split into RLP segment 1 (7 bytes) and RLP segment 2 (5 bytes), each with its own RLP header, followed by further segments up to RLP N. After encryption the ciphertext bytes are C1,a C1,b C1,c C1,d C2,a C2,b C2,c C2,d C3,a C3,b C3,c ... and the matching bytes from AES(1, 8000) are A1,a A1,b A1,c A1,d A2,a A2,b A2,c A2,d A3,a A3,b A3,c ...]
Each 32-bit $C_i$ is made up of 4 bytes $C_{i,a}$, $C_{i,b}$, $C_{i,c}$ and $C_{i,d}$. $C_2$ has its first 3 bytes in RLP segment 1 and its last byte in RLP segment 2.
Similarly, the 32 pseudorandom bits $A_i$ can be broken into bytes $A_{i,a}$, $A_{i,b}$, $A_{i,c}$ and $A_{i,d}$. The first 3 bytes of $A_2$ are used to integrity protect RLP segment 1, and the 4th byte is used for RLP segment 2.
Segment sizes of 7 and 5 bytes are for example purposes only.
Partial tag RLP 1 = $(A_1 C_1 + A_2 C_{2,a} 2^{24} + A_2 C_{2,b} 2^{16} + A_2 C_{2,c} 2^{8}) \bmod p$
Partial tag RLP 2 = $(A_2 C_{2,d} + A_3 C_3) \bmod p$

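The 7-byte and 5-byte split can be checked numerically. The sketch below follows the decomposition of $C_i A_i$ into per-byte terms and confirms that the two partial tags add up to the hash of the whole 12 bytes. The values in `a_words`, the ciphertext bytes and the modulus 2^32 − 5 are illustrative assumptions, and `partial_tag` is a hypothetical helper name.

```python
P = 2**32 - 5  # assumed 32-bit prime modulus; not taken from the slides


def partial_tag(a_words, start, segment):
    """Partial tag of one RLP segment that may start or end mid-word.

    Each ciphertext byte contributes A_i * byte * 2^shift, where i is the
    index of the 32-bit word the byte belongs to (byte number // 4) and the
    shift is 24, 16, 8 or 0 for positions a, b, c, d within that word.
    """
    total = 0
    for k, value in enumerate(segment):
        byte_no = start + k
        shift = 8 * (3 - byte_no % 4)
        total += a_words[byte_no // 4] * (value << shift)
    return total % P


# Illustrative A1, A2, A3 for the words starting at bytes 8000, 8004, 8008.
a_words = {2000: 0x9E3779B9, 2001: 0x7F4A7C15, 2002: 0x85EBCA6B}
ciphertext = bytes(range(1, 13))            # 12 bytes: C1, C2, C3
rlp1, rlp2 = ciphertext[:7], ciphertext[7:]  # 7-byte and 5-byte segments

tag1 = partial_tag(a_words, 8000, rlp1)      # A1*C1 + first 3 bytes of C2
tag2 = partial_tag(a_words, 8007, rlp2)      # last byte of C2 + A3*C3

# Word-level hash of the unsegmented ciphertext, for comparison.
whole = sum(a_words[2000 + i] * int.from_bytes(ciphertext[4 * i:4 * i + 4], "big")
            for i in range(3)) % P
```

The equality `(tag1 + tag2) % P == whole` holds for any split point, which is why segment boundaries do not need to align with 32-bit words.
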
Variable length App Packet
Standard techniques like padding can be used to handle variable-length packets. Details need to be specified.
The last part of the last RLP segment in an App packet needs to be treated specially.
Padding option:
Last byte as cryptosync option:
–The above padding option is not necessary if we use the last byte of the App packet as cryptosync: AES(2, lastByte).
–App packets of differing length then end at different byte numbers, so the encryption of their tags is different. Two messages may create the same tag, for example a message M whose length is a multiple of 4 bytes and the message M followed by a byte of 0. However, each is encrypted differently, that is, a different random string is added for each message, and security is preserved.
–The beginning byte of the packet also needs to provide AES input. Details to be specified.

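The M versus M-followed-by-0 case can be made concrete with a short sketch. It assumes a byte-wise hash of the kind used for non-multiple-of-4 segments, a keyed BLAKE2b call standing in for AES(2, lastByte), an assumed modulus 2^32 − 5, and illustrative values in `a_words`; the helper names are hypothetical.

```python
import hashlib

P = 2**32 - 5  # assumed 32-bit prime modulus; not taken from the slides


def byte_hash(a_words, start, data):
    """Unencrypted hash: each byte contributes A_i * byte * 2^shift mod P."""
    total = 0
    for k, value in enumerate(data):
        byte_no = start + k
        total += a_words[byte_no // 4] * (value << (8 * (3 - byte_no % 4)))
    return total % P


def tag_pad(key, last_byte):
    """Stand-in for 32 bits of AES(2, lastByte), keyed by the last byte number."""
    data = bytes([2]) + last_byte.to_bytes(8, "big")
    digest = hashlib.blake2b(data, key=key, digest_size=16).digest()
    return int.from_bytes(digest[:4], "big")


a_words = {2000: 0x9E3779B9, 2001: 0x7F4A7C15, 2002: 0x85EBCA6B}  # illustrative
key = b"demo key"

m = bytes(range(1, 9))       # 8 bytes, occupying byte numbers 8000..8007
m0 = m + b"\x00"             # the same message followed by a zero byte

hash_m = byte_hash(a_words, 8000, m)
hash_m0 = byte_hash(a_words, 8000, m0)   # the zero byte adds nothing

last_m = 8000 + len(m) - 1               # 8007
last_m0 = 8000 + len(m0) - 1             # 8008

# Same hash, but each tag is masked with a pad derived from a different input.
tag_m = (hash_m + tag_pad(key, last_m)) % P
tag_m0 = (hash_m0 + tag_pad(key, last_m0)) % P
```

The two unencrypted hashes are equal, yet the pads come from different cryptosync values, so an observer of one tag learns nothing useful about the other.
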
Beginning and end of App packets
Previously we dealt with RLP segments that begin and end at non-multiples of 4 bytes (4x) but belong to the same App packet.
–We reused Ai at the ending bytes of an RLP segment and also at the beginning bytes of the next RLP segment, since the message authentication algorithm requires it.
An App pkt that ends at a non-multiple of 4 bytes, followed by an App pkt that begins at a non-multiple of 4 bytes
–We don’t want to reuse the Ai for both App pkts (the end of one and the beginning of another); otherwise the security argument may be harder to see.
–For the beginning bytes of an App pkt, we continue to use the 4-byte Ai based on the current i, i.e. i equal to floor(beginning byte/4).
–For the ending bytes (the 4x+1, 4x+2 or 4x+3 byte), we use 1 to 3 of the leftover bytes of the AES(2, lastbyte) output, since only the first 4 bytes of that output are used for tag encryption.

Options
64-bit tags can be used.
–Added computation.
Multiplication can be over a 32-bit Galois field.
Details for dealing with variable sizes need to be specified.
Methods to minimize the number of AES calls exist in the literature.
–But they require a tree hash and larger key sizes, and are more complex.

Security (I)
A universal hash plus a one-time pad is a secure message authentication algorithm.
–Proposed and proved in Wegman and Carter, “New hash functions and their use in authentication and set equality,” 1980
Multiplicative hash as universal hash
–Proposed in Carter and Wegman, “Universal classes of hash functions”
–Clear statement and proof of it as a delta-universal hash in Theorem 2 of Halevi and Krawczyk, “MMH: software message authentication in the Gbit/second rates”
We are using the above universal hash, so the security proofs apply directly.
–We only need to deal with our variations for handling variable lengths and non-multiple of 4 byte boundaries.
–Our security may be stronger in the sense that we use a different hash key for each message, whereas the same hash key would have been sufficient according to the above.

Packet number based RLP flows
The cryptosync for encryption is the packet number concatenated with a subcounter that changes for each 128-bit block, i.e. AES(0, pkt#, subcounter).
–The same packet number is used for the entire App packet, with only the subcounter changing.
We reuse AES(1, pkt#, subcounter) to create the Ai for message authentication.
–For tag encryption, we would use AES(2, pkt#, subcounter, byte number), where byte number is 1 to 16 depending on the last byte number.
–The rest is the same as with byte-numbered RLP.