Ragib Hasan University of Alabama at Birmingham CS 491/691/791 Fall 2011 Lecture 10 09/15/2011 Security and Privacy in Cloud Computing
Securing Data Integrity
Goal: Learn about PoR-based techniques for protecting data integrity in clouds
Review Assignment #4: Kevin D. Bowers, Ari Juels, and Alina Oprea. HAIL: A high-availability and integrity layer for cloud storage. In Proceedings of the 16th ACM Conference on Computer and Communications Security (CCS '09), 2009.
PoR: Proof of Retrievability
Definition:
  – A compact proof that the stored file is intact and can be retrieved
Difference from PDP (Provable Data Possession)?
  – PDP proves the file is present on the server
  – PDP doesn't prove the file is retrievable in its entirety
Overview of PoR
[Figure: client and server. On the client side, a key generator produces key k and a file encoder processes file F. The client sends challenge c and the server returns response r.]
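The challenge/response flow in the figure can be illustrated with a minimal sketch. This is a simplified MAC-based spot check, not the PoR construction from the literature: it omits the error-correcting code a real PoR relies on, and the function names (keygen, encode, challenge, respond, verify) and BLOCK_SIZE are hypothetical choices made for illustration.

```python
import hashlib
import hmac
import secrets

BLOCK_SIZE = 16  # hypothetical block size in bytes


def keygen() -> bytes:
    """Key generator: produce the client's secret key k."""
    return secrets.token_bytes(32)


def _tag(key: bytes, index: int, block: bytes) -> bytes:
    # Bind each tag to the block position so blocks cannot be swapped.
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


def encode(key: bytes, data: bytes) -> list[tuple[bytes, bytes]]:
    """File encoder: split F into blocks and attach a tag to each one."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    return [(block, _tag(key, i, block)) for i, block in enumerate(blocks)]


def challenge(num_blocks: int, sample_size: int) -> list[int]:
    """Challenge c: a random set of distinct block positions."""
    return secrets.SystemRandom().sample(range(num_blocks), sample_size)


def respond(stored: list[tuple[bytes, bytes]], c: list[int]) -> list[tuple[bytes, bytes]]:
    """Response r: the server returns the challenged blocks and their tags."""
    return [stored[i] for i in c]


def verify(key: bytes, c: list[int], r: list[tuple[bytes, bytes]]) -> bool:
    """The client recomputes each tag and compares in constant time."""
    return len(c) == len(r) and all(
        hmac.compare_digest(_tag(key, i, block), tag)
        for i, (block, tag) in zip(c, r)
    )
```

The client keeps only k. A check passes when every sampled block still matches its tag, so a damaged block is caught only if the challenge happens to include its position.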
HAIL: High Availability and Integrity Layer (RSA Labs)
RAID for clouds!
Uses PoR and distributed file storage to ensure retrievability, integrity, and availability
Allows recovering from malicious cloud providers
Why do we need HAIL?
PoR allows checking data retrievability, but if the data has been deleted by a malicious provider, nothing can be done.
Even single-bit errors can render a file useless
Idea:
  – Use error-correcting codes to tolerate small errors
  – Use PoR to detect larger errors
  – Use RAID-like redundancy across multiple cloud providers (to ensure reconstruction)
Advantages of HAIL
Strong file-intactness assurance
Low overhead
Strong adversarial model
Direct client-server communication
RAID (Redundant Array of Inexpensive Disks)
[Figure: file F split into file blocks F1, F2, F3, stored together with a parity block.]
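As a small illustration of the file block / parity block idea, the sketch below uses single XOR parity, as in simple parity-based RAID levels. The helper names (xor_blocks, make_parity, recover) are hypothetical, and equal-length blocks are assumed.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks: list[bytes]) -> bytes:
    """Parity block: the XOR of all file blocks."""
    return xor_blocks(data_blocks)


def recover(blocks: list, parity: bytes) -> bytes:
    """Rebuild the single missing block (marked None) from the rest plus parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one missing block")
    survivors = [block for block in blocks if block is not None]
    return xor_blocks(survivors + [parity])
```

For example, with F1, F2, F3 and their parity, losing F2 still leaves enough to rebuild it from F1, F3, and the parity block. Note that this only works when the failure is known: the code must be told which block is missing.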
The Cloud isn't necessarily so nice
[Figure: file F split into blocks F1, F2, F3 plus parity, spread across Provider A, Provider B, Provider C, and Provider D; several blocks are marked lost (X).]
What if service providers lose data but don't tell you until the file is lost?
Mobile adversary
A mobile adversary moves from device to device, corrupting as it goes, potentially silently
The mobile adversary models, e.g., system failures/corruptions over time and virus propagation
RAID isn't designed for this kind of adversary
  – It is designed for limited, readily detectable failures in devices you own (the benign case)
Mobile adversary
In cryptography, the usual approach to a mobile adversary is proactive
Another, cheaper possibility is reactive: we detect and remediate
  – Like whack-a-mole!
PoRs can provide detection here…
HAIL design principle
TAR: Test and Redistribute
  – Divide time into epochs
  – At each epoch, test for corrupted or missing blocks
  – Rebuild corrupted blocks using data from the other cloud providers, and redistribute them to the damaged copy
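One epoch of test-and-redistribute can be sketched as follows. This is a toy model under stated assumptions, not HAIL's mechanism: the client is assumed to keep a SHA-256 digest of every share (HAIL instead tests with PoR-style checks), the redundancy is a single XOR parity share, and the names run_epoch, digest, and xor_blocks are hypothetical.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def digest(share: bytes) -> bytes:
    return hashlib.sha256(share).digest()


def run_epoch(providers: list[bytes], digests: list[bytes]) -> list[int]:
    """Test every provider's share, then rebuild and redistribute a bad one.

    providers[i] is the share held by provider i; the last share is the XOR
    parity of the others. Returns the indices that were repaired.
    """
    # Test: find shares that no longer match the client's digests.
    bad = [i for i, share in enumerate(providers) if digest(share) != digests[i]]
    if len(bad) > 1:
        raise RuntimeError("single parity cannot repair more than one share")
    # Redistribute: rebuild the damaged share from all the other providers.
    for i in bad:
        others = [share for j, share in enumerate(providers) if j != i]
        providers[i] = xor_blocks(others)
    return bad
```

Running this once per epoch repairs damage before it accumulates. If an adversary corrupts more shares within one epoch than the code can correct, the rebuild step fails, which is why the amount of redundancy and the epoch length have to be chosen together.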
Multiple providers: Naïve approach
[Figure: the client stores a full copy of file F with each provider (Amazon S3, Google, EMC Atmos).]
Sample and check consistency across providers
Creeping attack
[Figure: the client and copies of file F at Amazon S3, Google, and EMC Atmos.]
The probability that the client samples the corrupted block is low
The file cannot be recovered after [n/b] epochs
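A rough calculation shows why sampling struggles against this attack. The sketch assumes a toy model that is not stated on the slide: exactly one block position is corrupted, each check samples positions uniformly without replacement, and checks in different epochs are independent. The function names are hypothetical.

```python
def detection_probability(total_blocks: int, sampled_blocks: int) -> float:
    """Chance that one check hits the single corrupted block position."""
    if not 0 <= sampled_blocks <= total_blocks:
        raise ValueError("sampled_blocks must be between 0 and total_blocks")
    return sampled_blocks / total_blocks


def miss_probability(total_blocks: int, sampled_blocks: int, checks: int) -> float:
    """Chance that `checks` independent checks all miss the corrupted position."""
    return (1.0 - detection_probability(total_blocks, sampled_blocks)) ** checks
```

Under these assumptions, sampling 10 of 1000 block positions per epoch catches the corruption with probability 0.01 per check, so over a handful of epochs the adversary is very likely to remain undetected while the corruption spreads to more providers.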
Local PoR checks are costly
[Figure: the client and copies of file F at Amazon S3, Google, and EMC Atmos, each protected with ECC and PoR.]
Cons: requires integrity checks for each replica
HAIL overview
[Figure not preserved.]
Reconstruction in HAIL
[Figure not preserved.]
Dispersal code
[Figure: the client encodes file F with an (n,m) dispersal code and spreads it across providers P1 to P5; some providers hold blocks of F and the others hold dispersal code parity blocks.]
Dispersal code
[Figure: client and providers P1 to P5 holding blocks of F, dispersal code parity blocks, and PoR encoding; one stripe across the providers is highlighted.]
Check that each stripe is a codeword in the dispersal code
PoR encoding to correct small corruption
How to increase file lifetime?
Increasing file lifetime with MACs
[Figure: client and providers P1 to P5, with a MAC attached.]
Can we reduce storage overhead?
Integrity-protected dispersal code
[Figure: client and providers P1 to P5 with a Reed-Solomon dispersal code. A message m is passed through a UHF (universal hash function), h_k1(m), and a PRF (pseudorandom function), h_k2(m); the two outputs are combined with +.]
Integrity-protected dispersal code
[Figure: client and providers P1 to P5; message m, a PRF, and +.]
MACs embedded into parity symbols
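The "UHF + PRF" combination on these slides can be illustrated with a small MAC sketch. It is an illustration of the general idea only, not HAIL's construction: the UHF here is a polynomial-evaluation hash over a prime field rather than a Reed-Solomon code, the PRF is HMAC-SHA256 applied to a block position (the PRF input is an assumption, since the figure does not survive extraction clearly), and the names P, uhf, prf, mac, and verify are hypothetical.

```python
import hashlib
import hmac

P = 2**61 - 1  # a Mersenne prime used as the field size in this sketch


def uhf(k1: int, symbols: list[int]) -> int:
    """Polynomial-evaluation hash: sum of symbols[i] * k1**(i+1) mod P."""
    acc = 0
    for s in reversed(symbols):  # Horner's rule, highest power first
        acc = (acc + s) * k1 % P
    return acc


def prf(k2: bytes, position: int) -> int:
    """PRF stand-in: HMAC-SHA256 of the position, reduced mod P."""
    d = hmac.new(k2, position.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(d, "big") % P


def mac(k1: int, k2: bytes, symbols: list[int], position: int) -> int:
    """Tag = UHF output + PRF output, added in the field."""
    return (uhf(k1, symbols) + prf(k2, position)) % P


def verify(k1: int, k2: bytes, symbols: list[int], position: int, tag: int) -> bool:
    return mac(k1, k2, symbols, position) == tag
```

The hash in this sketch is linear in the message symbols: hashing the symbol-wise sum of two messages gives the sum of their hashes. Linear codes share that property, which is one way to see how a keyed code symbol masked by a PRF value can act as a MAC at little extra storage cost.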
Things to consider
Practicality of the scheme (test and redistribute)
Attacker model
Other security issues