1 Generating Strong Keys from Biometric Data Yevgeniy Dodis, New York U Leonid Reyzin, Boston U Adam Smith, MIT
2 Keys & Passwords from Biometric Data Secure cryptographic keys: long, random strings Too hard to remember Popular solution: biometrics –Fingerprints, iris scans, voice prints, … –Same idea: personal information, e.g. SSN, Mom’s maiden name, street where you grew up, … Many nice ideas in literature, somewhat ad hoc
3 This Work Formal framework and general tools for handling personal / biometric key material Secure password storage Key encapsulation (for any application)
4 Issues with Personal Info / Biometrics Noise and human error Not Uniformly Random Should Not Be Stored in the Clear Fingerprint is Variable –Finger orientation, cuts, scrapes Personal info subject to memory failures / format –Name of your first girl/boy friend: “Um… Catherine? Katharine? Kate? Sue?” Measured data will be “close” to original in some metric (application-dependent)
5 Issues with Personal Info / Biometrics Noise and human error Not Uniformly Random Should Not Be Stored in the Clear (Crypto keys should be random) Fingerprints are represented as list of features –All fingers look similar Distribution is unknown –Ivanov is rare last name… unless first name is Sergei
6 Issues with Personal Info / Biometrics Noise and human error Not Uniformly Random Should Not Be Stored in the Clear Limited Changes Possible Theft is easy, makes info useless –Customer service representatives learn Mom’s maiden name, Social Security Number, … Keys cannot be changed many times –10 fingers, 2 eyes, 1 mother
7 Issues with Personal Info / Biometrics Noise and human error Not Uniformly Random Should Not Be Stored in the Clear How can we use this stuff as a password? Do we want to use this stuff as a password? (too late…)
8 Issues with Personal Info / Biometrics Noise and human error Not Uniformly Random Should Not Be Stored in the Clear How can we use this stuff as a password?
9 Example: Authentication
10 Authentication [EHMS,JW] Alice / Server: “How do I know you’re Alice?” Solution #1: Store a copy on server. Server checks: x ≈ x′ ? Problem: Password in the Clear
11 Authentication [EHMS,JW] Alice / Server: “How do I know you’re Alice?” Solution #2: Store a hash of password. Server checks: H(x) = H(x′) ? Problem: No Error Tolerance
12 This Talk Formal framework and general tools for handling personal / biometric key material
13 Related Work Basic set-up studied for quite a while: Davida, Frankel, Matt ’98; Ellison, Hall, Milbert, Schneier ’00 First abstractions: Juels, Wattenberg ’99; Frykholm, Juels ’01 –Handling noisy data in Hamming metric Juels, Sudan ’02 –Set difference metric Provable security: Linnartz, Tuyls ’03 –Provable security, specific distribution (multi-variate Gaussian) Acknowledgement: some slide ideas from Ari Juels
14 This Talk Formal framework and general tools for handling personal / biometric key material
15 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication Other metrics Stronger privacy? [DS]
16 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication Other metrics Stronger privacy? [DS]
17 Fuzzy Sketch 1. Error-correction: If x′ is “close” to x, then recover x 2. Secrecy: Given S(X), it’s hard to predict X Meaning of “close” depends on application How to measure hardness of prediction? [Diagram: S maps x to sketch S(x); Recover maps (x′, S(x)) back to x]
18 Measuring Security X a random variable on {0,1}^n Probability of predicting X = max_x Pr[X = x] There are various ways to measure entropy… Min-entropy: H∞(X) = −log(max_x Pr[X = x]) Uniform on {0,1}^n: H∞(U_n) = n “Password has min-entropy t” means that adversary’s probability of guessing the password is 2^−t Passwords had better have high entropy!
19 Measuring Security X a random variable on {0,1}^n Probability of predicting X = max_x Pr[X = x] There are various ways to measure entropy… Min-entropy: H∞(X) = −log(max_x Pr[X = x]) Conditional entropy: H∞(X | Y) = −log(prob. of predicting X given Y) = −log(Exp_y{ max_x Pr[X = x | Y = y] })
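Both entropy measures above are directly computable for small distributions. A minimal Python sketch (the dictionary representation and function names are illustrative choices, not from the talk); note that Exp_y[max_x Pr[X=x | Y=y]] simplifies to Σ_y max_x Pr[X=x, Y=y]:

```python
import math
from collections import defaultdict

def min_entropy(dist):
    """H_inf(X) = -log2(max_x Pr[X = x]) for a distribution {outcome: prob}."""
    return -math.log2(max(dist.values()))

def cond_min_entropy(joint):
    """H_inf(X | Y) = -log2( Exp_y[ max_x Pr[X = x | Y = y] ] )
    for a joint distribution {(x, y): prob}.
    Exp_y[max_x Pr[x|y]] = sum_y Pr[y] * max_x Pr[x,y]/Pr[y] = sum_y max_x Pr[x,y]."""
    best = defaultdict(float)  # best[y] = max_x Pr[X = x, Y = y]
    for (x, y), p in joint.items():
        best[y] = max(best[y], p)
    return -math.log2(sum(best.values()))

# Uniform on {0,1}^3 has min-entropy 3
uniform3 = {f"{i:03b}": 1 / 8 for i in range(8)}
print(min_entropy(uniform3))  # 3.0
```

For example, if X is uniform on two bits and Y reveals the first bit, then H∞(X | Y) = 1: one bit of entropy is lost.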
20 Fuzzy Sketch 1. Error-correction: If x′ is “close” to x, then recover x 2. Secrecy: Given S(X), it’s hard to predict X Goals: –Minimize entropy loss: H∞(X) − H∞(X | S(X)) –Maximize tolerance: how “far” x′ can be from x [Diagram: S maps x to sketch S(x); Recover maps (x′, S(x)) back to x]
21 Example: Code-Offset Construction [BBR88, Cré97, JW02]
22 Code-Offset Construction [BBR,Cré,JW] View password as n-bit string: x ∈ {0,1}^n Error model: small number of flipped bits Hamming distance: d_H(x, x′) = # of positions in which x, x′ differ Main idea: non-conventional use of standard error-correcting codes
23 Code-Offset Construction [BBR,Cré,JW] Error-correcting code ECC: k bits → n bits Any two codewords differ by at least d bits S(x) = x ⊕ ECC(R), where R is a random string Equiv: S(x) = syndrome(x) [Diagram: codeword ECC(R) offset by S(x) to x; x′ lies near x; codewords are d apart]
24 Recovery Error-correcting code ECC: k bits → n bits Any two codewords differ by at least d bits S(x) = x ⊕ ECC(R), where R is a random string Equiv: S(x) = syndrome(x) Given S(x) and x′ close to x: –Compute x′ ⊕ S(x) –Decode to get ECC(R) –Compute x = S(x) ⊕ ECC(R) Corrects d/2 errors How much entropy loss?
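The sketch-and-recover loop above can be illustrated end to end with a toy code. A minimal Python sketch, using a 3-repetition code (minimum distance 3, corrects one flipped bit per triple) in place of a real ECC; all names and the bit-list representation are illustrative:

```python
import secrets

K = 4        # message bits k
N = 3 * K    # codeword bits n (3x repetition code, d = 3)

def ecc_encode(r):
    """ECC: k bits -> n bits, repeating each bit three times."""
    return [b for b in r for _ in range(3)]

def ecc_decode_to_codeword(c):
    """Decode to the nearest codeword by majority vote within each triple."""
    out = []
    for i in range(0, len(c), 3):
        bit = 1 if sum(c[i:i + 3]) >= 2 else 0
        out += [bit, bit, bit]
    return out

def xor(a, b):
    return [u ^ v for u, v in zip(a, b)]

def sketch(x):
    """S(x) = x XOR ECC(R) for a fresh random R."""
    r = [secrets.randbelow(2) for _ in range(K)]
    return xor(x, ecc_encode(r))

def recover(x_prime, s):
    """x' XOR S(x) = (x XOR x') XOR ECC(R); decoding strips the error x XOR x'."""
    codeword = ecc_decode_to_codeword(xor(x_prime, s))
    return xor(s, codeword)
```

Flipping at most one bit per triple of x′ still recovers x exactly, and the entropy loss is n − k = 2k bits, matching the redundancy of the code.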
25 Entropy Loss Error-correcting code ECC: k bits → n bits Any two codewords differ by at least d bits S(x) = x ⊕ ECC(R), where R is a random string H∞(X | S(X)) = H∞(X, R | S(X)) ≥ H∞(X) + H∞(R) − |S(X)| = H∞(X) + k − n Revealing n bits costs ≤ n bits of entropy
26 Entropy Loss Error-correcting code ECC: k bits → n bits Any two codewords differ by at least d bits S(x) = x ⊕ ECC(R), where R is a random string H∞(X | S(X)) = H∞(X, R | S(X)) ≥ H∞(X) + H∞(R) − |S(X)| = H∞(X) + k − n Entropy loss = n − k = redundancy of code
27 Using Sketches for Authentication “How do I know you’re Alice?” Alice has x′ ≈ x; Server stores S(x), Hash(x) Server recovers x from (x′, S(x)) and checks Hash(x) Problem: Input to Hash should be uniformly random X is not uniform (especially given S(X))
28 Using Sketches for Authentication “How do I know you’re Alice?” Alice has x′ ≈ x; Server stores S(x), Hash(x)
29 Padding via Random Functions “How do I know you’re Alice?” Server stores S(x), F, Hash(F(x)) Solution: Scramble input with “random” function F. Store F. Thm: “2-universal” F is sufficient (e.g., random linear map) Verification same as before Is hash input random enough?
30 Padding via Random Functions “How do I know you’re Alice?” Server stores S(x), F, Hash(F(x)) Solution: Scramble input with “random” function F. Store F. Secure as long as: Entropy-Loss_S + |Hash| + 2 log(1/ε) ≤ H∞(X) Proof idea: S(x), F, Hash(F(x)) ≈ S(x), F, Hash(R) Similar to “left-over hash lemma / privacy amplification”
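A standard 2-universal family that fits this padding step is the set of random linear maps over GF(2): for x ≠ x′, Pr_A[Ax = Ax′] = Pr_A[A(x ⊕ x′) = 0] = 2^−m. A hypothetical sketch of the server-side flow (the function names and the use of SHA-256 as the hash are illustrative assumptions, not from the talk):

```python
import hashlib
import secrets

def random_linear_map(n, m):
    """A uniformly random m x n binary matrix A; x -> Ax mod 2 is 2-universal."""
    return [[secrets.randbelow(2) for _ in range(n)] for _ in range(m)]

def apply_map(A, x):
    """F(x) = Ax mod 2, returned as bytes of 0/1 values."""
    return bytes(sum(a * b for a, b in zip(row, x)) % 2 for row in A)

def enroll(x, m=16):
    """Server stores (F, Hash(F(x))); the sketch S(x) would be stored alongside."""
    F = random_linear_map(len(x), m)
    return F, hashlib.sha256(apply_map(F, x)).hexdigest()

def verify(x_recovered, F, stored_hash):
    """After recovering x from (x', S(x)), check the hash of the scrambled input."""
    return hashlib.sha256(apply_map(F, x_recovered)).hexdigest() == stored_hash
```

The leftover-hash-lemma argument says F(x) looks uniform given S(x) and F, so hashing F(x) instead of x sidesteps the non-uniformity of X.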
31 Sketches and Authentication “Fuzzy Sketch” + Hashing Solves Authentication Assumption: X has high entropy –Necessary –X could be several passwords taken together “Hamming” errors can be handled with standard ECC Similar techniques imply one can use X as key for any crypto application (e.g. encryption) –Covers several previously studied settings
32 Fuzzy Extractor [Diagram: Gen(x) → (P, K), where P is safely public (e.g., P = (S(x), F)) and K is random given P; Rec(x′, P) → K, usable by any application]
33 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication Other metrics Stronger privacy?
34 Other metrics Variants of code-offset work for many metrics –“transitivity” suffices [DRS] –Continuous normed spaces: Euclidean [LT,VTDL], L p [DRS] New constructions –Set difference metric (introduced by [JS]) –General technique via special embeddings: edit distance, permutation metric Real(er)-life metrics?
35 Why Set Difference? [EHMS,FJ,JS] Inputs: tiny subsets in a HUGE universe Some representations of personal / biometric data: –Fingerprints represented as feature list (minutiae/ridge meetings) –List of favorite books X ⊆ {1,…,N}, #X = s d_S(X,Y) = ½ #(X △ Y) = Hamming distance on vectors in {0,1}^N Code-offset not good: N-bit string is too long! Want: s log N bits
36 Recall: Fuzzy Sketch 1. Error-correction: If x′ is “close” to x, then recover x 2. Secrecy: Given S(X), it’s hard to predict X Goals: –Minimize entropy loss: H∞(X) − H∞(X | S(X)) –Maximize tolerance: how “far” x′ can be from x
37 New Constructions for Set Difference X ⊆ {1,…,N}, #X = s, d_S(X,Y) = ½ #(X △ Y) Two constructions: 1. Punctured Reed-Solomon code 2. Sublinear-time decoding of BCH codes from syndromes Both constructions: –As good as code-offset could be (≈ optimal) –Storage space ≤ (s + 1) log N –Entropy loss 2e log N to correct e errors –Improve previous best [JS02] (+ analysis)
38 Reed-Solomon-based Sketch X ⊆ {1,…,N}, #X = s, d_S(X,Y) = ½ #(X △ Y) Suppose N is prime, work in Z_N 1. k := s − 2e − 1 2. Pick random poly. P(·) of degree ≤ k 3. P′(·) := monic degree-s poly. s.t. P′(z) = P(z) ∀z ∈ X 4. Output S(X) = P′ [Diagram: graphs of P and P′ over {1,…,N}, agreeing on the points of X; s = 7, e = 2, k = 2]
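Step 3 has a simple realization: P′ = P + ∏_{z∈X}(t − z). The product is monic of degree s and vanishes on X, so P′ is monic of degree s and agrees with P on every point of X. A toy Python sketch over Z_p (the tiny prime and helper names are illustrative; the Reed-Solomon decoding needed for recovery is omitted):

```python
import secrets

P_MOD = 97  # a small prime standing in for the universe size N

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P_MOD
    return out

def poly_add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [(u + v) % P_MOD for u, v in zip(a, b)]

def poly_eval(a, t):
    return sum(c * pow(t, i, P_MOD) for i, c in enumerate(a)) % P_MOD

def vanishing_poly(X):
    """Monic degree-#X polynomial that is zero on every point of X."""
    v = [1]
    for z in X:
        v = poly_mul(v, [(-z) % P_MOD, 1])  # multiply by (t - z)
    return v

def rs_sketch(X, e):
    """S(X) = P' = P + prod_{z in X}(t - z), with P random of degree <= s - 2e - 1."""
    k = len(X) - 2 * e - 1
    P = [secrets.randbelow(P_MOD) for _ in range(k + 1)]
    return poly_add(P, vanishing_poly(X))
```

Publishing P′ reveals only s coefficients (it is monic), while P carries (k + 1) log N bits of fresh randomness, giving the 2e log N entropy loss computed on the next slide.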
39 Reed-Solomon-based Sketch X ⊆ {1,…,N}, #X = s, d_S(X,Y) = ½ #(X △ Y) Suppose N is prime, work in Z_N 1. k := s − 2e − 1 2. Pick random poly. P(·) of degree ≤ k 3. P′(·) := monic degree-s poly. s.t. P′(z) = P(z) ∀z ∈ X 4. Output S(X) = P′ Recovery: Given P′ and X′ close to X 1. Reed-Solomon decoding yields P 2. Intersections of P and P′ yield X [Diagram: graphs of P, P′ and the points of X′; s = 7, e = 2, k = 2]
40 Reed-Solomon-based Sketch X ⊆ {1,…,N}, #X = s, d_S(X,Y) = ½ #(X △ Y) Suppose N is prime, work in Z_N 1. k := s − 2e − 1 2. Pick random poly. P(·) of degree ≤ k 3. P′(·) := monic degree-s poly. s.t. P′(z) = P(z) ∀z ∈ X 4. Output S(X) = P′ Entropy loss: H∞(X | P′) = H∞(X, P | P′) ≥ H∞(X) + H∞(P) − |P′| = H∞(X) + (k + 1) log N − s log N = H∞(X) − 2e log N
41 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication “Set difference” distance: Reed-Solomon construction Other schemes via metric embeddings: edit distance Achieving stronger privacy
42 Other metrics? Real error models not as clean as Hamming & set diff. Algebraic techniques won’t apply directly. Possible Approaches: 1. Develop new scheme tailored to particular metric 2. Reduce to easier metric via embedding φ: M₁ → M₂ x, y close ⇒ φ(x), φ(y) close x, y far ⇒ φ(x), φ(y) far
43 Biometric embeddings Real error models not as clean as Hamming & set diff. Algebraic techniques won’t apply directly. Possible Approaches: 1. Develop new scheme tailored to particular metric 2. Reduce to easier metric via embedding φ: M₁ → M₂ x, y close ⇒ φ(x), φ(y) close x, y far ⇒ φ(x), φ(y) far A is a large set ⇒ φ(A) is large H∞(A) large ⇒ H∞(φ(A)) large
44 Edit Distance Strings of bits d(x,y) = number of insertions & deletions to go from x to y [ADGIR] No good standard embeddings into Hamming Shingling [Broder]: “biometric” embedding into Set Difference [Example bit-strings: 101010101010, 10101101010, 101011010101]
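Broder-style shingling maps a string to the set of its length-w contiguous substrings; a single insertion or deletion disturbs only the shingles overlapping the edit point, so small edit distance translates into small set difference. A minimal sketch (the parameter w = 4 is an arbitrary illustrative choice):

```python
def shingle(s, w=4):
    """The set of all length-w contiguous substrings (w-shingles) of s."""
    return {s[i:i + w] for i in range(len(s) - w + 1)}

def set_difference_distance(A, B):
    """d_S(A, B) = half the size of the symmetric difference."""
    return len(A ^ B) / 2
```

One deletion removes at most w shingles and introduces at most w new ones, so the symmetric difference of the two shingle sets has size at most 2w regardless of string length.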
45 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication Other metrics Stronger privacy?
46 Stronger Privacy? Previous notion: Unpredictability –Can’t guess X even after seeing sketch –Sufficient for using X as a crypto key What about the privacy of X itself? –Do not want particular info about X leaked (say, first 20 bits) Ideal notion, ignoring computational restrictions: X almost independent of S(X) Problem: Some info must be leaked by S(X) (provably) –Mutual information I(X; S(X)) is large We want to ensure that “useful” information is hidden
47 Attempt: Hiding Useful Info [CMR,RW,DS] Definition: S(X) hides all functions of X if for all functions g, for all adversaries A: Pr[A(S(X)) = g(X)] − Pr[A(S(X′)) = g(X)] < ε, where X, X′ are independent, identical random variables Intuition: “A cannot guess g(X) given S(X)” Implies unpredictability
48 Attempt: Hiding Useful Info [CMR,RW,DS] Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g, for all adversaries A: Pr[A(S(X)) = g(X)] − Pr[A(S(X′)) = g(X)] < ε, where X, X′ are independent, identical random variables Non-computational version of semantic security for high-entropy input distributions
49 Hiding Useful Info Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g, for all adversaries A: Pr[A(S(X)) = g(X)] − Pr[A(S(X′)) = g(X)] < ε, where X, X′ are independent, identical random variables Can be achieved in code-offset construction [DS03] –Use pseudo-randomly chosen code –Need to keep decodability –Exact parameters still unknown
50 Technique: Equivalence to Extraction Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g, for all adversaries A: Pr[A(S(X)) = g(X)] − Pr[A(S(X′)) = g(X)] < ε, where X, X′ are independent, identical random variables S() is an “extractor” if for all r.v.’s X₁, X₂ of min-entropy t: S(X₁) ≈ S(X₂) Thm: S(X) hides all functions of X whenever H∞(X) ≥ t ⟺ S() is an “extractor” for r.v.’s of min-entropy t − 1
51 The right notion? Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g, for all adversaries A: Pr[A(S(X)) = g(X)] − Pr[A(S(X′)) = g(X)] < ε, where X, X′ are independent, identical random variables Is this strong enough? –Not secure if repeated: S(X), S′(X), S′′(X) may reveal X! –What’s the “right” notion? –What computational notions might we use? (see [CMR])
52 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication Other metrics Stronger privacy?
53 Conclusions Generic framework for turning noisy, non-uniform data into secure cryptographic keys –Only assume high entropy (necessary) –Abstraction, simplicity allow comparing schemes –Constructions for Hamming, Set Difference, Edit Metrics Possible to have strong privacy of data Future work –Real-life metrics (under way) –Other settings ([DV] use sketches in bounded-storage model) –Better definitions
54 Other Topics I Work On Distributed Crypto Protocols –Multi-party Computation –Byzantine Agreement Modeling partial disclosure of secret data –Exposure-resilient crypto, database privacy Quantum Cryptography –Authentication, Encryption of Quantum States –Multi-party Quantum Computation –Quantum Codes Combinatorial Tools for Cryptography
55 Questions?
56 Other Slides
57 The Problem: We’re Human “Humans are incapable of securely storing high-quality cryptographic keys, and they have unacceptable speed when performing cryptographic operations. (They are also large, expensive to maintain, difficult to manage, and they pollute the environment. […] But they are sufficiently pervasive that we must design our protocols around their limitations.)” From Network Security by Kaufman, Perlman and Speciner.
58 Stuff I want to say Previous work cute, clever, not formal enough, no good analysis of how to use it generic framework general tools About authentication: interplay between computational assumptions and information-theoretic technique practical… may be implemented General context: provable security
59 Using Biometric Passwords Crypto moving beyond encryption / authentication Newer applications often poorly modeled –Ad hoc solutions –Get broken! Crypto Theory: models, proofs This talk:– formal framework – provably secure constructions for using biometric data for authentication, key recovery
60 The Problem: We’re Human Secure cryptographic keys: long, random strings Too hard to remember Keep copy under doormat? Use short, easy-to-remember password (PIN)? –easy to remember = easy to guess
61 Solution: Use Password I Never Forget Personal Info –Mom’s maiden name, Date of birth –Name of first pet, name of street where you grew up Biometrics –Fingerprints –Iris Scan –Voice print –Genetic Subsequence
62 Biometric embeddings φ: M₁ → M₂ We care about entropy: non-standard requirements x, y close ⇒ φ(x), φ(y) close: d₂(φ(x), φ(y)) ≤ d₁(x, y) A is a large set ⇒ φ(A) is large: #φ(A) ≥ #A, or equivalently H∞(φ(X)) ≥ H∞(X)
63 Statistical Distinguishability Statistical Difference (L₁): For distributions p₀(x), p₁(x): SD(p₀, p₁) = ½ Σ_x |p₀(x) − p₁(x)| SD measures distinguishability: If b ← {0,1}, x ← p_b, then max_A |Pr[A(x) = b] − ½| = ½ SD(p₀, p₁) (Notation: A ≈_ε B if SD(A, B) ≤ ε)
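Statistical difference is directly computable for small distributions. A minimal sketch (the dictionary representation is an illustrative choice):

```python
def statistical_distance(p0, p1):
    """SD(p0, p1) = 1/2 * sum_x |p0(x) - p1(x)|, distributions as {outcome: prob}."""
    support = set(p0) | set(p1)
    return 0.5 * sum(abs(p0.get(x, 0.0) - p1.get(x, 0.0)) for x in support)
```

For p₀ uniform on {0,1} and p₁ = {0: 0.75, 1: 0.25}, SD = 0.25, so the best distinguisher's advantage over ½ is ½ · SD = 0.125.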
64 Statistical Distinguishability Two probability distributions p₀(x), p₁(x) Sphinx: 1. Flips a fair coin 2. Heads: samples Z according to p₀; Tails: samples Z according to p₁ 3. Shows Z to Greek Hero Greek Hero: Guesses if coin was heads or tails. Hero can win with probability at least ½ Hero wins w. prob. ½ + ε ⇒ p₀, p₁ are ε-distinguishable
65 Fuzzy Extractor (generalizes extractors [NZ]) 1. Error-correction: If x′ is “close” to x, then Rec(x′, Pub) = K 2. Security: (Pub, K) ≈ (Pub, R) [Diagram: Generate(x) → (Pub, K) with Pub = (S(x), F), K = F(x); Recover(x′, Pub) → K]
66 Key Recovery [EHMS, FJ, JS] [Diagram: www.Fingers2Keys.com — Generate / Recover]