Trustworthy Services from Untrustworthy Components: Overview
Fred B. Schneider
Department of Computer Science, Cornell University, Ithaca, New York, U.S.A.
Joint work with Lidong Zhou and Robbert van Renesse.
1 Fault-tolerance by Replication
[Figure: servers and client]
The “fine print” …
• Replica failures are independent.
• Replica coordination protocol exists.
• No secrets stored in server’s state.
2 Trustworthy Services
A trustworthy service…
– tolerates component failures
– tolerates attacks
– might involve confidential data
N.B. Cryptographic keys must be kept confidential and are useful for authentication, even when data is not secret.
6 Revisiting the “Fine Print”
• Replica failures are independent.
  – But attacks are not independent.
• Replica coordination protocol exists.
  – But such protocols involve assumptions, and assumptions are vulnerabilities.
    Timing assumptions versus denial of service.
• No secrets stored in server’s state.
  – But secrets cannot be avoided for authentication.
    Replicating a secret erodes confidentiality.
7 Compromised Components
Correct component satisfies specification. Compromised component does not.
– Adversary might control a compromised component.
– Component is compromised if adversary knows secrets being stored there.
A recovery protocol transforms component: compromised → correct
8 Component Correlation
Components are correlated to the extent that one attack suffices to compromise all.
Correlation arises from:
– Dependence on the environment
– Vulnerabilities in shared design / code
– Shared secrets
Goal: Eliminate sources of correlation.
9 Correlation: Environment Vulnerabilities
Vulnerabilities = Assumptions
– Weaker assumptions are better.
“Synchronous system” assumption:
– Bounded message delivery delay
– Bounds on process execution speed
  The assumption is violated by denial-of-service attacks.
  The assumption is needed for “agreement protocols” in deterministic systems [FLP].
10 Correlation > Towards Weaker Assumptions: Eschewing Synchronous Systems
Asynchronous system model is weaker but requires making “sacrifices”:
– Sacrifice determinacy:
  Use “randomized protocols” (requires randomness).
– Sacrifice liveness but preserve safety.
– Sacrifice state machine replication:
  Use quorums or other weaker mechanisms.
  Some service semantics cannot be implemented.
11 Component Correlation
Correlation arises from:
– Dependence on environment
– Vulnerabilities in shared design / code
– Shared secrets
12 Correlation: Eschewing Shared Design / Code
Solution: Diversity!
Diversity is expensive or impossible to obtain:
– Development costs
– Interoperability risks
Still, what diversity does exist should be leveraged.
13 Correlation > Leveraging Extant Diversity: Adversary Structures
t-resilience: Service is not compromised unless more than t components are.
– Known as a threshold structure.
FS-resilience: If FS = {F_1, F_2, …, F_r}, then service is not compromised provided the set C of compromised components satisfies C ⊆ F_i for some i.
– Select FS according to dimensions of diversity.
– Known as an adversary structure.
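The FS-resilience condition above can be sketched as a simple subset test. This is a minimal illustration, not part of the original slides: the server names, the grouping by operating system, and the function names are all hypothetical.

```python
from itertools import combinations

def is_tolerated(compromised, adversary_structure):
    """Return True if the compromised set C satisfies C ⊆ F_i for some F_i in FS."""
    c = frozenset(compromised)
    return any(c <= frozenset(f) for f in adversary_structure)

def threshold_structure(components, t):
    """Build the FS that contains every subset of exactly t components.

    Any set of at most t components is a subset of one of these,
    so this FS expresses ordinary t-resilience.
    """
    return [frozenset(s) for s in combinations(sorted(components), t)]

# Hypothetical deployment: four servers running two operating systems.
servers = {"linux-a", "linux-b", "bsd-a", "bsd-b"}

# Assumption for illustration: one attack can take out all servers that
# share an operating system, but not servers on different systems.
by_os = [frozenset({"linux-a", "linux-b"}), frozenset({"bsd-a", "bsd-b"})]

print(is_tolerated({"linux-a", "linux-b"}, by_os))  # both in one F_i
print(is_tolerated({"linux-a", "bsd-a"}, by_os))    # spans two F_i
```

Choosing FS along a dimension of diversity (here, operating system) lets the service tolerate two compromised servers of the same kind, which a threshold structure with t = 1 over the same four servers would not.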
14 Component Correlation
Correlation arises from:
– Dependence on environment
– Vulnerabilities in shared design / code
– Shared secrets
15 Correlation: Eliminating Shared Secrets
• (n,t) secret sharing [Shamir, Blakley]:
  – Secret s is divided into n shares.
  – Any t or more shares suffice for reconstructing s.
  – Fewer than t shares convey no information about s.
  – Can be adapted for arbitrary adversary structures.
• Threshold cryptography:
  – Perform cryptographic operations piecewise using shares of private key; result is as if private key were used.
    Example: Threshold digital signatures
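A minimal sketch of (n,t) secret sharing in the style of Shamir's polynomial scheme may help make the bullets above concrete. The field modulus, the function names, and the choice of share indices 1..n are assumptions made for this illustration; a production system would use a vetted library.

```python
import secrets

# Assumed field modulus for this sketch: the Mersenne prime 2^127 - 1.
PRIME = 2**127 - 1

def split(secret, n, t, prime=PRIME):
    """Divide `secret` into n shares so that any t of them reconstruct it."""
    if not (1 <= t <= n < prime and 0 <= secret < prime):
        raise ValueError("need 1 <= t <= n < prime and 0 <= secret < prime")
    # Random polynomial of degree t-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(t - 1)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod prime
            acc = (acc * x + c) % prime
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares, prime=PRIME):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret

shares = split(1234, n=5, t=3)
print(reconstruct(shares[:3]))                        # any 3 shares suffice
print(reconstruct([shares[0], shares[2], shares[4]]))
```

With fewer than t shares, every candidate secret remains equally consistent with the shares held, which is the sense in which they convey no information about s.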
16 Proactive Recovery
When is recovery protocol run?
– After an attack is detected.
  Not sufficient to reboot from good system image.
  Must get system state (or have stateless service).
  Must also “refresh” secrets.
– Periodically, even if an attack is not detected.
  Not all attacks are detected; proactive recovery defends against undetected attacks.
  Adversary strategy: Increase the window of vulnerability, that is, the interval between proactive recovery executions.
17 Proactive Recovery: Secret Refresh
• Refresh secret shares:
  PSS (proactive secret sharing) and APSS (asynchronous proactive secret sharing)
• Refresh symmetric keys:
  Revisit KDC.
  Force new password choices.
• Refresh public / private key pairs:
  Invent new server private key.
  Must disseminate new server public key.
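One common way to refresh secret shares is for every share holder to deal a fresh sharing of zero and for each holder to add the pieces it receives to its own share. The sketch below simulates that idea in a single process with honest participants; it is not the APSS protocol, which must also cope with asynchrony and compromised servers. The modulus and all function names are assumptions for illustration.

```python
import secrets

PRIME = 2**127 - 1  # assumed field modulus for this sketch

def eval_poly(coeffs, x, prime=PRIME):
    """Evaluate a polynomial (constant term first) at x, mod prime."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % prime
    return acc

def deal(secret, xs, t, prime=PRIME):
    """Share `secret` among holders xs; any t shares reconstruct it."""
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(t - 1)]
    return {x: eval_poly(coeffs, x, prime) for x in xs}

def refresh(shares, t, prime=PRIME):
    """Return new shares of the same secret.

    Each holder deals a sharing of 0; adding polynomials that all
    evaluate to 0 at x = 0 changes every share but not the secret.
    """
    xs = list(shares)
    new = dict(shares)
    for _dealer in xs:
        zero = deal(0, xs, t, prime)
        for x in xs:
            new[x] = (new[x] + zero[x]) % prime
    return new

def reconstruct(shares, prime=PRIME):
    """Lagrange interpolation at x = 0 over a dict {x: share}."""
    items = list(shares.items())
    total = 0
    for i, (xi, yi) in enumerate(items):
        num = den = 1
        for j, (xj, _) in enumerate(items):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        total = (total + yi * num * pow(den, -1, prime)) % prime
    return total

old = deal(42, range(1, 6), t=3)
new = refresh(old, t=3)
print(reconstruct({x: new[x] for x in (1, 3, 5)}))  # secret is unchanged
```

Because old and new shares lie on different polynomials, shares an adversary collected before a refresh cannot usefully be combined with shares collected after it.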
18 Proactive Recovery > Secret Refresh: Refresh Private / Public Keys I
Approach: Tamper-proof hardware.
– Key material stored in tamper-resistant hardware.
  Key cannot be read or modified.
  Attacker can still instigate crypto operations with key.
  Protocols must accommodate such possible rogue behavior.
19 Proactive Recovery > Secret Refresh: Refresh Private / Public Keys II
Approach: Use off-line private keys.
– New public keys are propagated through a secure out-of-band channel.
  Use off-line private keys to sign the new public keys.
  Components storing off-line keys can be connected to network using a one-way channel (e.g. “pump”).
20 Proactive Recovery > Secret Refresh: Refresh Private / Public Keys III
Approach: Assume “awareness” of attacks.
– System-wide private key shared among components.
  Component generates new private key.
  Component uses old private key to sign new public key.
  Component requests that system sign new public key.
  System refuses to sign new public key if old private key already subsumed.
  Out-of-band process is then invoked.
21 Proactive Recovery: Transparency and Change I
Scalability concerns dictate that clients be shielded from changes due to proactive recovery.
• Service public / private key:
  – Proactive secret sharing changes private key shares without changing private key (or public key).
• Server identities:
  – A single contacted server operates as a delegate.
  – Service key signs responses to client.
  – Self-verifying messages impede rogue delegates from spoofing as clients.
22 Proactive Recovery: Transparency and Change II
• Server public keys. If client must know…
  – Local certificate:
    Signed by server using server’s off-line key.
  – Global certificate:
    Local certificate signed by service private key.
    Service signs only if local signature on certificate is valid.
    Use t+1 threshold crypto for service signature.
    Stored at 2t+1 servers (out of 3t+1).
  – Client obtains current public key for server i:
    Retrieve global certificate for all servers from 2t+1 servers.
    Epoch numbers in t+1 sets will be the same; that epoch is current.
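The numbers on this slide fit together arithmetically: a certificate stored at 2t+1 of 3t+1 servers and a read from any 2t+1 servers overlap in at least (2t+1) + (2t+1) - (3t+1) = t+1 servers. The sketch below is one reading of the client-side rule, assuming each response has already been reduced to an (epoch, public key) pair for server i and that signature checks happen elsewhere. The function name and data layout are hypothetical.

```python
from collections import Counter

def current_certificate(responses, t):
    """Pick the (epoch, public_key) pair reported by at least t+1 servers.

    `responses` holds one (epoch, public_key) pair per queried server.
    Returns None when no pair reaches t+1 matching reports.
    """
    counts = Counter(responses)
    agreed = [cert for cert, votes in counts.items() if votes >= t + 1]
    if not agreed:
        return None
    # With exactly 2t+1 responses at most one pair can reach t+1 votes;
    # taking the highest epoch only matters if more responses are passed in.
    return max(agreed, key=lambda cert: cert[0])

# Hypothetical run with t = 1: three responses, two of which agree.
responses = [(7, "pk-new"), (7, "pk-new"), (6, "pk-old")]
print(current_certificate(responses, t=1))
```

Requiring t+1 matching reports means at least one of them comes from a server that is not compromised, under the assumption that at most t servers are.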
23 Research Programme Trajectory
• Cornell On-line Certification Authority (COCA)
• Asynchronous Proactive Secret Sharing (APSS)
• Distributed Blinding Protocol
• Codex Secret Store
Key ideas:
– Weak computational models (asynchronous)
– Thresholdization [sic] / “multi-party computation”
– Proactive protocols (vs Transparency)