Symbolic and Computational Analysis of Network Protocol Security John Mitchell Stanford University Asian 2006
Outline Protocols Symbolic analysis of protocol security Some examples, some intuition Symbolic analysis of protocol security Models, results, tools Computational analysis Communicating Turing machines, composability Combining symbolic, computational analysis Some alternate approaches Protocol Composition Logic (PCL) Symbolic and computational semantics
Many Protocols Authentication Key Exchange Kerberos Key Exchange SSL/TLS handshake, IKE, JFK, IKEv2, Wireless and mobile computing Mobile IP, WEP, 802.11i Electronic commerce Contract signing, SET, electronic cash, See http://www.lsv.ens-cachan.fr/spore/, http://www.avispa-project.org/library
Mobile IPv6 Architecture Mobile Node (MN) IPv6 Direct connection via binding update Corresponding Node (CN) Authentication is a requirement Early proposals weak Home Agent (HA)
802.11i Wireless Authentication Supplicant UnAuth/UnAssoc 802.1X Blocked No Key Supplicant Auth/Assoc 802.1X UnBlocked PTK/GTK 802.11 Association EAP/802.1X/RADIUS Authentication MSK RSNA involves three entities: the Supplicant, the Authenticator, and the Authentication Server. In an general infrastructure network, the supplicant is implemented in the mobile devices like a PDA, a cellphone or a laptop; the authenticator is implemented in the Access Point; the authentication server could be implemented either in the Access Point with the authenticator, or separately in a backend server, like a RADIUS server. When it is implemented in a separate server, the links between the authenticator and the authentication server are assumed to be physically secure. In order to adopt 802.1X authentication in wireless networks, we consider the association between the supplicant and the authenticator as a logical 802.1X port. At the beginning, the supplicant and the authenticator are unauthenticated and unassociated, 802.1X ports are blocked, and there are no shared secrets between them. In the first stage, the supplicant and the authenticator set up an 802.11 association. After this stage, the supplicant and the authenticator go to the authenticated and associated state. Here “authenticated” only means that 802.11 open system authentication is successful, this is not considered to be a real authentication. Now there is a radio connection between the supplicant and the authenticator, but they cannot transmit data packets to each other because the logical 802.1X port is still blocked, only protocol packets are accepted. In the second stage, the authenticator will act as a relay, the supplicant and the authentication server run some mutual authentication protocol and generate some common secret, called MSK (Master Session Key). Then the server moves the MSK to the authenticator, and the supplicant and authenticator will derive the same PMK (Pair-wise Master Key) from the MSK locally. Here “move” means the server will delete the key itself after sending to the authenticator, because the server is assumed not to disclose and possess the key afterwards. This stage could be absent when a Pre-Shared Key or a cached PMK is used. In the third stage, the authenticator initialize a 4-way handshake to confirm the PMK existence and derive a fresh PTK for the following data sessions. Upon this point, the 802.1X ports are unblocked to accept data packets. As the fourth stage, the supplicant and the authenticator might also perform a group key handshake to distribute a GTK in case of multicast applications, however, this is not necessary since the GTK (Group Transient Key) could also be transmitted in the 4-way handshake. With the shared PTK and GTK, the supplicant and authenticator could derive the encryption key and MAC key, thus perform secure data communications. We will focus on the 4-way handshake because it is an essential part in all operation modes. 4-Way Handshake Group Key Handshake Data Communication
IKE subprotocol from IPSEC A, (ga mod p) B, (gb mod p) , signB(m1,m2) signA(m1,m2) A B Result: A and B share secret gab mod p Analysis involves probability, modular exponentiation, complexity, digital signatures, communication networks
Correct if no security violation in any run Run of a protocol Initiate Respond B C A Attacker D Correct if no security violation in any run
Protocol analysis methods Cryptographic reductions Bellare-Rogaway, Shoup, many others UC [Canetti et al], Simulatability [BPW] Prob poly-time process calculus [LMRST…] Symbolic methods (see also http://www.avispa-project.org/) Model checking FDR [Lowe, Roscoe, …], Murphi [M, Shmatikov, …], Symbolic search NRL protocol analyzer [Meadows] Theorem proving Isabelle [Paulson …], Specialized logics [BAN, …]
“The” Symbolic Model Messages are algebraic expressions Adversary Nonce, Encrypt(K,M), Sign(K,M), … Adversary Nondeterministic Observe, store, direct all communication Break messages into parts Encrypt, decrypt, sign only if it has the key Example: K1, Encrypt(K1, “hi”) K1, Encrypt(K1, “hi”) “hi” Send messages derivable from stored parts
Many formulations Word problems [Dolev-Yao, Dolev-Even-Karp, …] Each protocol step is symbolic function from input message to output message; cancellation law dkekx = x Rewrite systems [CDLMS, …] Each protocol step is symbolic function from state and input message to state and output message Logic programming [Meadows NRL Analyzer] Each protocol step can be defined by logical clauses Resolution used to perform reachability search Constraint solving [Amadio-Lugiez, … ] Write set constraints defining messages known at step i Strand space model [MITRE] Partial order (Lamport causality), reasoning methods Process calculus [CSP, Spi-calculus, applied , …) Each protocol step is process that reads, writes on channel Spi-calculus: use for new values, private channels, simulate crypto
Complexity results (see [Cortier et al]) Bounded # of sessions Unbounded number of sessions Without nonces With nonces Co-NP complete General: undecidable Bounded msg length: DEXP-time complete Bounded msg length: undecidable Tagged: exptime Tagged: decidable One-copy: DEXP-time complete Ping-pong protocols: Ptime Additional results for variants of basic model (AC, xor, modular exp, …)
Many protocol case studies Murphi [Shmatikov, He, …] SSL, Contract signing, 802.11i, … Meadows NRL tool Participation in IETF, IEEE standards Many important examples Paulson inductive method; Scedrov et al Kerberos, SSL, SET, many more Protocol logic BAN logic and successors (GNY, SvO, …) DDMP … Automated tools based on the symbolic model detect important, nontrivial bugs in practical, deployed, and standardized protocols
[Bellare-Rogaway, Shoup, …] Computational model I “Alice” “Bob” oracle tape oracle tape Adversary input tape work tape [Bellare-Rogaway, Shoup, …]
Computational model II Turing machine Turing machine Adversary Turing machine Turing machine [Canetti, …]
Computational model III Program Program Adversary In(c, x).Send(…) | In(d,y).new z. Send(…y z ..) | In(c, encrypt(k,…)). … program Program [Micciancio-Warinschi, …]
Computational security: encryption Several standard conditions on encryption Passive adversary Semantic security Chosen ciphertext attacks (CCA1) Adversary can ask for decryption before receiving a challenge ciphertext Chosen ciphertext attacks (CCA2) Adversary can ask for decryption before and after receiving a challenge ciphertext Computational model offers more choices than the symbolic model
Passive Adversary Challenger Attacker m0, m1 E(mi) guess 0 or 1
Chosen ciphertext CCA1 c D(c) Challenger Attacker m0, m1 E(mi) guess 0 or 1
Chosen ciphertext CCA2 c D(c) Challenger Attacker m0, m1 E(mi) c E(mj) D(c) guess 0 or 1
Equivalence-based methods: UC, RSIM Z Z P2 P1 P1 P2 S A P4 P3 P4 P3 Slide: R Canetti Equivalence-based methods: UC, RSIM input Z output Z Protocol execution Ideal functionality P1 P3 P4 P2 F P1 P2 S simulator A attacker P4 P3 A bit more precisely, we have an “environment machine, Z, that interacts with one of the following two processes: On the right we have parties running protocol pi, in an interaction with some adversary A. (A has the usual capabilities of an adversary interacting with a protocol. That is, it can hear messages, perhaps modify them, it can corrupt parties, etc. All inputs are received from Z, and all outputs are reported to Z. In addition, Z can talk freely with A throughout the interaction. On the left we have the ideal process for F. here. As soon as a party gets input from Z, it hands this input over to F. As soon as a party gets an output from Z, it reports this output to Z. We also have an adversary, S, in the ideal process, but this adversary is very limited – it can only talk with the functionality (and of course corrupt parties and learn their inputs and outputs.)
Can we have best of both worlds? Symbolic model [NS78,DY84,…] Complexity-theoretic model [GM84,…] Attacker actions – Fixed set of actions, nondeterminism (ABSTRACTION) + Any probabilistic poly-time computation Security properties – Idealized, e.g., secret message = not possessing atomic term representing message + Fine-grained, e.g., secret message = no partial information about bitstring representation Analysis methods + Successful array of tools and techniques; compositionality – Hand-proofs are difficult, error-prone, unsystematic; no automation
Some relevant approaches Simulation framework Backes, Pfitzmann, Waidner Correspondence theorems Micciancio, Warinschi Kapron-Impagliazzo logics Abadi-Rogaway passive equivalence (K2,{01}K3) , {({101}K2,K5 )}K2, {{K6}K4}K5 (K2, ) , {({101}K2,K5 )}K2, { }K5 (K1, ) , {({101}K1,K5 )}K1, { }K5 (K1,{K1}K7) , {({101}K1,K5 )}K1, {{K6}K7}K5 Proposed as start of larger plan for computational soundness … [Abadi-Rogaway00, …, Adao-Bana-Scedrov05]
Symbolic methods comp’l results Pereira and Quisquater, CSFW 2001, 2004 Studied authenticated group Diffie-Hellman protocols Found symbolic attack in Cliques SA-GDH.2 protocol Proved no protocol of certain type is secure, for >3 participants Micciancio and Panjwani, EUROCRYPT 2004 Lower bound for class of group key establishment protocols using purely Dolev-Yao reasoning Model pseudo-random generators, encryption symbolically Lower bounds is tight; matches a known protocol
Rest of talk: Protocol composition logic Honest Principals, Attacker Private Data Send Receive Alice’s information Protocol Private data Sends and receives Here we see this pictorially. Alice is a participant in the protocol. She only knows a limited amount of information. She knows the rules of the protocol she is using. She knows private information, such as the nonces she has created, and her own private key. And she knows what messages she has sent and received from the network, which can contain other people, both honest principals, and evil attackers. Logic has symbolic and computational semantics
Example A B { A, Noncea } { Noncea, … } Kb A B Ka Alice assumes that only Bob has Kb-1 Alice generated Noncea and knows that some X decrypted first message Since only X knows Kb-1, Alice knows X=Bob
More subtle example: Bob’s view { A, Noncea } { Noncea, B, Nonceb } { Nonceb} Kb A B Ka Kb Bob assumes that Alice follows protocol Since Alice responds to second message, Alice must have sent the first message
Execution model Protocol Initial configuration Run “Program” for each protocol role Initial configuration Set of principals and key Assignment of 1 role to each principal Run Position in run new x send{x}B A recv{x}B decr recv{z}B B new z send{z}B C
Formulas true at a position in run Action formulas a ::= Send(P,m) | Receive (P,m) | New(P,t) | Decrypt (P,t) | Verify (P,t) Formulas ::= a | Has(P,t) | Fresh(P,t) | Honest(N) | Contains(t1, t2) | | 1 2 | x | | Example a < b = (b a) Notation in papers varies slightly …
Modal Formulas After actions, condition Before/after assertions [ actions ] P where P = princ, role id Before/after assertions [ actions ] P Composition rule [ S ] P [ T ] P [ ST ] P Logic formulated: [DMP,DDMP] Related to: BAN, Floyd-Hoare, CSP/CCS, temporal logic, NPATRL
Example: Bob’s view of NSL Bob knows he’s talking to Alice [ receive encrypt( Key(B), A,m ); new n; send encrypt( Key(A), m, B, n ); receive encrypt( Key(B), n ) ] B Honest(A) Csent(A, msg1) Csent(A, msg3) where Csent(A, …) Created(A, …) Sent(A, …) msg1 msg3 In the correct Needham-Schroeder-Lowe Protocol, we would like to be able to prove that after Bob has successfully completed his responder role, he knows he is talking to Alice. Shown in brackets here is the entire responder role, instantiated for principal’s B and A. At the end of performing his role, Bob would like to be able to conclude the following – that if Alice is honest – in other words if Alice is following the protocol, then Alice is the person who created and sent the two messages Bob just received, shown here as message 1 and message 3.
Proof System Sample Axioms: Soundness Theorem: Reasoning about possession: [receive m ]A Has(A,m) Has(A, {m,n}) Has(A, m) Has(A, n) Reasoning about crypto primitives: Honest(X) Decrypt(Y, enc(X, {m})) X=Y Honest(X) Verify(Y, sig(X, {m})) m’ (Send(X, m’) Contains(m’, sig(X, {m})) Soundness Theorem: Every provable formula is valid in symbolic model
Modal Formulas After actions, condition Before/after assertions [ actions ] P where P = princ, role id Before/after assertions [ actions ] P Composition rule [ S ] P [ T ] P [ ST ] P
Composition example: Part 1 Diffie Hellman A B: ga B A: gb Shared secret (with someone) A deduces: Knows(Y, gab) (Y = A) ۷ Knows(Y,b) Authenticated
Composition example: Part 2 Challenge-Response A B: m, A B A: n, sigB {m, n, A} A B: sigA {m, n, B} Shared secret Authenticated A deduces: Received (B, msg1) Λ Sent (B, msg2)
Composition: Part 3 m := ga n := gb ISO-9798-3 Shared secret: gab A B: ga, A B A: gb, sigB {ga, gb, A} A B: sigA {ga, gb, B} Shared secret: gab Authenticated Obtained by substituting Diffie-Hellman exponentials for the fresh terms m and n in the challenge-response protocol.
Additional issues Reasoning about honest principals Invariance rule, called “honesty rule” Preserve invariants under composition If we prove Honest(X) for protocol 1 and compose with protocol 2, is formula still true?
More about composing protocols ’ DH Honest(X) … CR Honest(X) … ’ |- Authentication |- Secrecy ’ |- Secrecy ’ |- Authentication ’ |- Secrecy Authentication [additive] DH CR ’ [nondestructive] = ISO Secrecy Authentication
PCL Computational PCL Syntax Proof System Computational PCL Syntax ± Proof System ± Symbolic model Semantics Complexity-theoretic model Semantics
Some general issues Computational PCL Soundness Theorem: Benefits Symbolic logic for proving security properties of network protocols using public-key encryption Soundness Theorem: If a property is provable in CPCL, then property holds in computational model with overwhelming asymptotic probability. Benefits Retain compositionality Symbolic proofs about computational model Computational reasoning in soundness proof (only!) Different axioms rely on different crypto assumptions symbolic computational generally uses strong crypto assumptions
PCL Computational PCL Syntax, proof rules mostly the same Retain compositional approach But some issues with propositional connectives… Significant differences Symbolic “knowledge” Has(X,t) : X can produce t from msgs that have been observed, by symbolic algorithm Computational “knowledge” Possess(X,t) : can produce t by ppt algorithm Indist(X,t) : cannot distinguish from rand value in ppt More subtle system Some axioms rely on CCA2, some info-theoretically sound, etc.
Computational Traces Computational trace contains Runs of the protocol Symbolic actions of honest parties Mapping of symbolic variables to bitstrings Send-receive actions (only) of the adversary Runs of the protocol Set of all possible traces Each tagged with random bits used to generate trace Tagging set of equi-probable traces
Complexity-theoretic semantics Given protocol Q, adversary A, security parameter n, define T=T(Q,A,n), set of all possible traces [[]](T) a subset of T that respects in a specific way Intuition: valid when [[]](T) is an asymptotically overwhelming subset of T
Semantics of trace properties Defined in a straight forward way [[Send(X, m)]](T) All traces t such that t contains a Send(msg) action by X the bistring value of msg is the bitstring value of m
Inductive Semantics [[1 2]] (T) = [[1]] (T) [[2]] (T) Implication uses a form of conditional probability [[1 2]] (T) = [[1]] (T) [[2]] (T’) where T’ = [[1]] (T)
Semantics of Indistinguishable Not a trace property Intuition: Indist(X, m) holds if no algorithm can distinguish m from a random value, given X’s view of the run Protocol Attacker C D m View(X) if b then m else rand b’ [[Indist(X, m)]] (T, D, e) = T if | #(t: b=b’)-|T|/2 | < e
|[[]](T,D,f(n))| / |T| > 1 – f(n) Validity of a formula Q |= if adversary A distinguisher D negligible function f n0 s.t. n > n0 |[[]](T,D,f(n))| / |T| > 1 – f(n) Fraction of traces where “ is true” Fix protocol Q, PPT adversary A Choose value of security parameter n Vary random bits used by all programs Obtain set T=T(Q,A,n) of equi-probable traces T(Q,A,n) [[]](T,D,f)
Advantages of Computational PCL High-level reasoning, sound for “real crypto” Prove properties of protocols without explicit reasoning about probability, asymptotic complexity Composability PCL is designed for protocol composition Composition of individual steps Not just coarser composition available with UC/RSIM Can identify crypto assumptions needed ISO-9798-3 [DDMW2006] Kerberos V5 [unpublished] Thesis: existing deployed protocols have weak security properties, assuming weak security properties of primitives they use; UC/RSIM may be too strong
CPCL analysis of Kerberos V5 Kerberos has a staged architecture First stage generates a nonce and sends it encrypted Second stage uses nonce as key to encrypt another nonce Third stage uses second-stage nonce to encrypt other msgs Secrecy Logic proves “GoodKey” property of both nonces Authentication Proved assuming encryption provides ciphertext integrity Modular proofs using composition theorems
Challenges for computational reasoning More complicated adversary Actions of computational adversary do not have a simple inductive characterization More complicated messages Computational messages are arbitrary sequences of bits, without an inductively defined syntactic structure Different scheduler Simpler “non-preemptive” scheduling is typically used in computational models (change symbolic model for equiv) Power of induction ? Indistinguishability, other non-trace-based properties appear unsuitable as inductive hypotheses Solution: prove trace property inductively and derive secrecy
Current and Future Work Investigate nature of propositional fragment Non-classical implication related to conditional probability complexity-theoretic reductions connections with probabilistic logics (e.g. Nilsson86) Generalize reasoning about secrecy Work in progress, thanks to Arnab Need to incorporate insight of “Rackoff’s attack” Extend logic More primitives: signature, hash functions,… Complete case studies Produce correctness proofs for all widely deployed standards Collaborate on Foundational work – please join us ! Implementation and case studies – please help us !
Conclusions Symbolic model supports useful analysis Tools, case studies, high-level proofs Computational model more “correct” Captures accepted notions in cryptography Greater expressiveness for security properties Two approaches can be combined Several current projects and approaches One example: computational semantics for symbolic protocol logic
Credits Collaborators More information Talk this afternoon Collaborators M. Backes, A. Datta, A. Derek, N. Durgin, C. He, R. Kuesters, D. Pavlovic, A. Ramanathan, A. Roy, A. Scedrov, V. Shmatikov, M. Sundararajan, V. Teague, M. Turuani, B. Warinschi, … More information Web page on Protocol Composition Logic http://www.stanford.edu/~danupam/logic-derivation.html Science is a social process
Needham-Schroeder Protocol { A, NonceA } { NonceA, NonceB } { NonceB} Kb A B Ka Kb Result: A and B share two private numbers not known to any observer without Ka-1, Kb-1
Anomaly in Needham-Schroeder [Lowe] { A, Na } Ke A E { Na, Nb } Ka { Nb } Ke { Na, Nb } { A, Na } Evil agent E tricks honest A into revealing private key Nb from B. Kb Ka B Evil E can then fool B.
Universal composability Slide: Y Lindell Universal composability also “reactive simulatability” [BPW], … see [DKMRS] For every real adversary A there exists an adversary S Protocol interaction Trusted party REAL IDEAL
Proof system Information-theoretic reasoning [new n]X (Y X) Indist(Y, n) Complexity-theoretic reductions Verify(X, m, Y) Honest(X, Y) Y’ Sign(Y’, m) Asymptotic calculations
Parts of proof are similar to [Micciancio, Warinschi] Example Axiom Source(Y,u,{m}X) Decrypts(X, {m}X) Honest(X,Y) (Z X,Y) Indistinguishable(Z, u) Proof idea: crypto-style reduction Assume axiom not valid: A D negligible f n0 n > n0 s.t. [[]](T,D,f)|/|T| < 1 –f(n) Construct attacker A’ that uses A, D to break IND-CCA2 secure encryption scheme Conditional implication essential Parts of proof are similar to [Micciancio, Warinschi]
Applications of PCL IKE, JFK family key exchange IKEv2 in progress 802.11i wireless networking SSL/TLS, 4way handshake, group handshake Kerberos v5 [Cervesato et al] GDOI [Meadows, Pavlovic] Current work Use CPCL to understand computational security of these protocols, reliance on specific crypto properties