Alexandra Constantin, James Cook, Anindya De
Computer Science, UC Berkeley
TPM – Trusted Platform Module
• Specified by the Trusted Computing Group (TCG)
• Stores secret keys to be used for cryptographic protocols and authentication
Release data only if the server is running Vista! How can we ascertain that the server is running Vista? Trust the TPM hardware and ask it for integrity measurements.
Attestation
• TPM hardware is trusted
• AIK (Attestation Identity Key) key pair
• AIK credential signed by a trusted third party (privacy CA)
Boot Process
• BIOS boot block = Core Root of Trust
• Chain of trust
◦ Boot block
◦ Rest of the BIOS
◦ OS, etc.
• Integrity measurements = hash of the code to be loaded
• Signed hash of the code is used to establish trust
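The chain of trust above can be sketched as a hash chain: each stage measures (hashes) the next stage before handing over control, and the measurement is folded into a running register. The sketch below assumes the TPM 1.2 style extend rule, new = SHA-1(old || measurement). The stage contents are placeholder byte strings, and the signing of the final value is left out.

```python
import hashlib

PCR_SIZE = 20  # SHA-1 digest length, as in TPM 1.2 (assumption for this sketch)

def measure(code: bytes) -> bytes:
    """Integrity measurement: hash of the code about to be loaded."""
    return hashlib.sha1(code).digest()

def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Fold a measurement into the register: new = H(old || measurement)."""
    return hashlib.sha1(pcr + measurement).digest()

def boot(stages):
    """Measure each stage in load order and return the final register value."""
    pcr = bytes(PCR_SIZE)  # register starts as all zeros
    for code in stages:
        pcr = extend(pcr, measure(code))
    return pcr

# Placeholder images standing in for the rest of the BIOS, a loader, and the OS.
good_chain = [b"rest of BIOS", b"boot loader", b"OS kernel"]
tampered_chain = [b"rest of BIOS", b"evil loader", b"OS kernel"]

expected = boot(good_chain)
print("good chain matches:", boot(good_chain) == expected)
print("tampered chain matches:", boot(tampered_chain) == expected)
```

Because every value depends on all earlier measurements, changing any stage (or the order of stages) changes the final register value.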
Alice Bob Ascertained that it is Bob
Alice Bob Pick K. Send Enc(K,PK)
Alice Bob Send Enc(K,Data)
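The exchange above can be sketched as a toy hybrid scheme: a session key K is encrypted under Bob's public key PK, and the data is then encrypted under K. Everything concrete here is an assumption made for illustration: the tiny textbook RSA key, the hash-based XOR cipher, and the reading that Alice sends both messages. None of it is secure or taken from the slides.

```python
import hashlib
import secrets

# Toy RSA key pair for Bob (assumed values; far too small to be secure).
N, E, D = 3233, 17, 2753  # N = 61 * 53, and E * D = 1 mod 60 * 52

def keystream(key: int, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the session key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return out[:length]

def sym_crypt(key: int, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Alice: pick K and send Enc(K, PK).
k = secrets.randbelow(N - 2) + 2  # session key in [2, N - 1]
wrapped_key = pow(k, E, N)

# Bob: recover K with his private key.
k_bob = pow(wrapped_key, D, N)

# The data is then sent encrypted under K.
ciphertext = sym_crypt(k, b"secret data")
plaintext = sym_crypt(k_bob, ciphertext)
```

Only the holder of the private key can recover K, which is why the first step (ascertaining that the key really belongs to Bob) matters.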
What TPM hardware is required to maintain efficiency in a system with many partitions? When should code be hashed? Some simulation results.
• Privacy: critical data has to be hashed, but its source must remain undisclosed
• Deflection attack: the server initially deflects communication to a TPM-based server and later takes over the communication itself
• Replay attacks: the server continues to use a certificate after switching OS
• Snoopy attacks: an attacker eavesdrops on the communication line for certificates and uses them as its own
• Some part of the kernel has to be trusted
• DRAM is unsafe: freeze the computation
• Snooping on the system bus
• Side-channel attacks
Trusted Hardware for Partitioned Multicore
• Efficiency issues: is the system reasonable when there are 20 cores and 120 partitions?
• Some partitions are trusted and some are untrusted
• Cannot even think of timestamping to prevent replay
• More privacy issues: it should not be possible to ascertain that two partitions are physically on the same computer
Virtual TPMs for Partitioned Multicore
• Multiple partitions hosting operating systems
• Virtual operating systems reside in virtual machines
• Changing partitions
• Virtualize TPMs
• Create one VTPM per partition
• Each VTPM has its own keys and resources and can replicate the functions of a real TPM
• A VTPM manager connects the VTPM instances and the OS partitions
• The VTPM manager collects integrity measurements of the VTPM instances
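One way to picture the arrangement above is a manager object that owns one VTPM instance per partition. This is only a minimal sketch: the class and method names are hypothetical, SHA-256 is an arbitrary choice, and "collecting integrity measurements" is interpreted here as reading each instance's measurement register.

```python
import hashlib
import secrets
from dataclasses import dataclass, field

@dataclass
class VTPM:
    """Hypothetical virtual TPM instance with its own key and register."""
    partition_id: int
    key: bytes = field(default_factory=lambda: secrets.token_bytes(32))
    pcr: bytes = bytes(32)

    def extend(self, code: bytes) -> None:
        """Fold the hash of newly loaded code into this instance's register."""
        measurement = hashlib.sha256(code).digest()
        self.pcr = hashlib.sha256(self.pcr + measurement).digest()

class VTPMManager:
    """Hypothetical manager connecting partitions to their VTPM instances."""

    def __init__(self):
        self._instances = {}

    def create(self, partition_id: int) -> VTPM:
        if partition_id in self._instances:
            raise ValueError("partition already has a VTPM")
        instance = VTPM(partition_id)
        self._instances[partition_id] = instance
        return instance

    def lookup(self, partition_id: int) -> VTPM:
        return self._instances[partition_id]

    def collect_measurements(self) -> dict:
        """Return the current register value of every instance, keyed by partition."""
        return {pid: v.pcr.hex() for pid, v in self._instances.items()}

manager = VTPMManager()
vtpm_a = manager.create(0)
vtpm_b = manager.create(1)
vtpm_a.extend(b"kernel image for partition 0")
```

Since each instance holds separate keys and state, activity in one partition leaves the other instance's register untouched.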
• Virtualizing the TPM takes care of privacy issues
• The chain of trust now goes through the virtual TPM
• The VTPM manager can give different privileges to different partitions
• Assurance on Quality of Service (QoS) can be given: we have a novel priority algorithm
• Compromise of one partition ≠ compromise of the entire system
[Figure: Secure Box; labels: TPM, Secure DRAM, CPU, Memory Encrypter]
• Security is unusually dependent on the correctness of the kernel
• Use the HiStar labeling mechanism
• There are categories and levels {0,1,2,3}; a label assigns a level to each category
• Rules for information flow are a function of the (category, level) tuples
• We have one HiStar category for information flow from the secure box to the rest of the world
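The information-flow rule can be illustrated with a simplified model in which a label maps each category to a level in {0,1,2,3} and data may flow from A to B only if A's level does not exceed B's level in every category. This sketch leaves out HiStar's ownership mechanism, assumes a default level of 1, and uses a hypothetical category name `secure_box`.

```python
DEFAULT_LEVEL = 1  # assumed default level for categories a label does not mention
VALID_LEVELS = (0, 1, 2, 3)

class Label:
    """Simplified label: a mapping from category name to level."""

    def __init__(self, levels=None, default=DEFAULT_LEVEL):
        self.levels = dict(levels or {})
        self.default = default
        for level in [default, *self.levels.values()]:
            if level not in VALID_LEVELS:
                raise ValueError(f"invalid level: {level}")

    def level(self, category):
        return self.levels.get(category, self.default)

def can_flow(src: Label, dst: Label) -> bool:
    """Flow is allowed only if src is no more restrictive than dst in every category."""
    categories = set(src.levels) | set(dst.levels)
    return src.default <= dst.default and all(
        src.level(c) <= dst.level(c) for c in categories
    )

# Data inside the secure box is tainted at level 3 in one dedicated category.
inside = Label({"secure_box": 3})
outside = Label()

print(can_flow(outside, inside))  # data may enter the box
print(can_flow(inside, outside))  # data may not leave without declassification
```

With a single dedicated category, every flow out of the secure box has to pass through whatever component is allowed to lower that category's level.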
RSA vs. ECC protocols
• Advantages of ECC: smaller key size
• Textbook RSA is a malleable encryption scheme, so it cannot be used as is for signing
• ECC arithmetic can be implemented very efficiently in hardware
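The malleability point can be made concrete with unpadded (textbook) RSA: the product of two ciphertexts decrypts to the product of the plaintexts, and the product of two signatures is a valid signature on the product of the messages. The key below is a toy example chosen for illustration and is far too small to be secure.

```python
# Toy textbook RSA parameters (assumed for illustration only).
N, E, D = 3233, 17, 2753  # N = 61 * 53

def encrypt(m: int) -> int:
    return pow(m, E, N)

def decrypt(c: int) -> int:
    return pow(c, D, N)

def sign(m: int) -> int:
    """Unpadded signature: apply the private exponent directly to the message."""
    return pow(m, D, N)

def verify(m: int, s: int) -> bool:
    return pow(s, E, N) == m % N

m1, m2 = 12, 7

# Malleable encryption: combine ciphertexts without knowing the private key.
combined_ciphertext = (encrypt(m1) * encrypt(m2)) % N
print(decrypt(combined_ciphertext) == (m1 * m2) % N)

# Forged signature on m1 * m2 built only from signatures on m1 and m2.
forged = (sign(m1) * sign(m2)) % N
print(verify(m1 * m2, forged))
```

Practical RSA signature schemes avoid this by hashing and padding the message before applying the private exponent.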
Java simulation of RSA and elliptic curve cryptography
Scheme              Time
ECC – GF(2^233)     83 milliseconds
ECC – GF(2^117)     18 milliseconds
RSA – [?] bits      186 milliseconds
RSA – [?] bits      25 milliseconds
ECC FPGA Coprocessors for Improved Performance [Rebeiro and Mukhopadhyay]
• 3 main modules: ALU, register bank, control unit
• ALU components
◦ 14 cascaded quad circuits, used for inversion
◦ Multiplier
◦ N x Squarer
◦ N x Adder
• Register bank: 233-bit dual-port registers; input to the registers = base point or output of the ALU
• Control unit: finite state machine for 32 control signals
• Replicate coprocessor components according to partitioned multicore performance requirements
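The ALU operations listed above (adder, squarer, multiplier, inversion) correspond to arithmetic in GF(2^233), which can be sketched in software with integers used as bit vectors of polynomial coefficients. The reduction polynomial x^233 + x^74 + 1 (the NIST choice for this field) is an assumption, since the slides do not name one. Inversion here uses plain repeated squaring and multiplication rather than the cascaded quad circuits of the coprocessor.

```python
M = 233
MODULUS = (1 << 233) | (1 << 74) | 1  # x^233 + x^74 + 1 (assumed reduction polynomial)

def gf_add(a: int, b: int) -> int:
    """Addition in GF(2^m) is coefficient-wise XOR."""
    return a ^ b

def gf_mul(a: int, b: int) -> int:
    """Shift-and-add multiplication with reduction after every shift."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> M:  # degree reached 233: subtract (XOR) the modulus
            a ^= MODULUS
    return result

def gf_square(a: int) -> int:
    return gf_mul(a, a)

def gf_inv(a: int) -> int:
    """Inverse as a^(2^m - 2), built as the product of a^(2^i) for i = 1..m-1."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    result = 1
    power = a
    for _ in range(M - 1):
        power = gf_square(power)
        result = gf_mul(result, power)
    return result

a = (0x1234567890ABCDEF << 100) | 0xDEADBEEF
print(gf_mul(a, gf_inv(a)) == 1)
```

In hardware the addition is a row of XOR gates and squaring is close to free, which is why inversion built from repeated squarings dominates the cycle count.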
Results from software simulation
Operation               Time
Product – GF(2^233)     0.239 milliseconds
Addition – GF(2^233)    0.001 milliseconds
Inverse – GF(2^233)     [?] milliseconds

Results from hardware simulation
Operation               Time       Clock cycles
Product – GF(2^233)     [?] μs     33
Inverse – GF(2^233)     68 μs      10306
• Efficient implementation of finite field primitives is of central importance
• Doubling a point on an elliptic curve: can be done in 3 clock cycles (9 field multiplications)
• Adding two points on an elliptic curve: can be done in 8 clock cycles (13 field multiplications)
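To show how point doubling and addition reduce to field primitives, the sketch below uses the textbook affine formulas for a binary curve y^2 + xy = x^3 + ax^2 + b over GF(2^233). It is not the multiplication-only variant counted on the slide, since affine formulas need a field inversion per operation. The reduction polynomial, the coefficient a = 1, and the sample point are all assumptions; b is chosen so that the sample point lies on the curve.

```python
M = 233
MODULUS = (1 << 233) | (1 << 74) | 1  # assumed reduction polynomial

def gf_mul(a, b):
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MODULUS
    return result

def gf_inv(a):
    result, power = 1, a
    for _ in range(M - 1):
        power = gf_mul(power, power)
        result = gf_mul(result, power)
    return result

A = 1
PX, PY = 0b10, 0b11  # arbitrary sample point (assumption)
# Choose b so that (PX, PY) satisfies y^2 + xy = x^3 + a*x^2 + b.
B = gf_mul(PY, PY) ^ gf_mul(PX, PY) ^ gf_mul(gf_mul(PX, PX), PX ^ A)

def on_curve(point):
    x, y = point
    return gf_mul(y, y) ^ gf_mul(x, y) == gf_mul(gf_mul(x, x), x ^ A) ^ B

def double(point):
    """Affine doubling; None stands for the point at infinity."""
    if point is None or point[0] == 0:
        return None
    x, y = point
    lam = x ^ gf_mul(y, gf_inv(x))
    x3 = gf_mul(lam, lam) ^ lam ^ A
    y3 = gf_mul(x, x) ^ gf_mul(lam ^ 1, x3)
    return (x3, y3)

def add(p, q):
    """Affine addition of two points."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return double(p) if y1 == y2 else None  # q is either p or its negative
    lam = gf_mul(y1 ^ y2, gf_inv(x1 ^ x2))
    x3 = gf_mul(lam, lam) ^ lam ^ x1 ^ x2 ^ A
    y3 = gf_mul(lam, x1 ^ x3) ^ x3 ^ y1
    return (x3, y3)

P = (PX, PY)
P2 = double(P)
P3 = add(P, P2)
```

Each operation is a handful of field multiplications, squarings, and XORs, so the cost of the field primitives sets the cost of the curve arithmetic.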
• Tradeoffs: chip area, time complexity, power
• Even for basic multiplication (finite field or Z_n), one can have the hardware scale as n^(log 3 / log 2) and the time as log n, or have the hardware scale as n and the time as n^(log 3 / log 2)
• Circuits showing the tradeoff have been implemented in Verilog
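The exponent log 3 / log 2 (about 1.585) is the one produced by Karatsuba-style splitting, where a product of two n-bit operands is built from three half-size products instead of four. The sketch below is a software illustration of that recurrence for polynomials over GF(2), not the Verilog circuits mentioned above; the function names and the threshold parameter are assumptions.

```python
def clmul_schoolbook(a: int, b: int) -> int:
    """Carry-less (GF(2) polynomial) product by shift-and-XOR."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def clmul_karatsuba(a: int, b: int, threshold: int = 32) -> int:
    """Carry-less product using three recursive half-size products."""
    n = max(a.bit_length(), b.bit_length())
    if n <= threshold:
        return clmul_schoolbook(a, b)
    h = n // 2
    mask = (1 << h) - 1
    a0, a1 = a & mask, a >> h
    b0, b1 = b & mask, b >> h
    low = clmul_karatsuba(a0, b0, threshold)
    high = clmul_karatsuba(a1, b1, threshold)
    # (a0 + a1)(b0 + b1) - low - high; addition and subtraction are both XOR.
    mid = clmul_karatsuba(a0 ^ a1, b0 ^ b1, threshold) ^ low ^ high
    return (high << (2 * h)) ^ (mid << h) ^ low

x = (1 << 232) | 0x1234567
y = (1 << 200) | 0xABCDEF
print(clmul_karatsuba(x, y) == clmul_schoolbook(x, y))
```

The three sub-products are independent, so they can be computed by three parallel sub-circuits (more area, less time) or one after another on a single smaller circuit (less area, more time).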