1 Generating Strong Keys from Biometric Data Yevgeniy Dodis, New York U Leonid Reyzin, Boston U Adam Smith, MIT
2 Keys & Passwords from Biometric Data Secure cryptographic keys: long, random strings Too hard to remember Popular solution: biometrics –Fingerprints, iris scans, voice prints, … –Same idea: personal information, e.g. SSN, Mom’s maiden name, street where you grew up, … Many nice ideas in literature, somewhat ad hoc
3 This Work Formal framework and general tools for handling personal / biometric key material Secure password storage Key encapsulation (for any application)
4 Issues with Personal Info / Biometrics Noise and human error Not Uniformly Random Should Not Be Stored in the Clear Fingerprint is Variable –Finger orientation, cuts, scrapes Personal info subject to memory failures / format –Name of your first girl/boy friend: “Um… Catherine? Katharine? Kate? Sue?” Measured data will be “close” to original in some metric (application-dependent)
5 Issues with Personal Info / Biometrics Noise and human error Not Uniformly Random Should Not Be Stored in the Clear (Crypto keys should be random) Fingerprints are represented as list of features –All fingers look similar Distribution is unknown –Ivanov is rare last name… unless first name is Sergei
6 Issues with Personal Info / Biometrics Noise and human error Not Uniformly Random Should Not Be Stored in the Clear Limited Changes Possible Theft is easy, makes info useless –Customer service representatives learn Mom’s maiden name, Social Security Number, … Keys cannot be changed many times –10 fingers, 2 eyes, 1 mother
7 Issues with Personal Info / Biometrics Noise and human error Not Uniformly Random Should Not Be Stored in the Clear How can we use this stuff as a password? Do we want to use this stuff as a password? (too late…)
8 Issues with Personal Info / Biometrics Noise and human error Not Uniformly Random Should Not Be Stored in the Clear How can we use this stuff as a password?
9 Example: Authentication
10 Authentication [EHMS,JW] Alice / Server: “How do I know you’re Alice?” Solution #1: Store a copy on server. Server checks: x ≈ x′ ? Problem: Password in the Clear
11 Authentication [EHMS,JW] Alice / Server: “How do I know you’re Alice?” Solution #2: Store a hash of password. Server checks: H(x) = H(x′) ? Problem: No Error Tolerance
12 This Talk Formal framework and general tools for handling personal / biometric key material
13 Related Work Basic set-up studied for quite a while: Davida, Frankel, Matt ’98; Ellison, Hall, Milbert, Schneier ’00 First abstractions: Juels, Wattenberg ’99; Frykholm, Juels ’01 –Handling noisy data in Hamming metric Juels, Sudan ’02 –Set difference metric Provable security: Linnartz, Tuyls ’03 –Provable security, specific distribution (multi-variate Gaussian) Acknowledgement: some slide ideas from Ari Juels
14 This Talk Formal framework and general tools for handling personal / biometric key material
15 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication Other metrics Stronger privacy? [DS]
16 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication Other metrics Stronger privacy? [DS]
17 Fuzzy Sketch 1. Error-correction: If x′ is “close” to x, then recover x 2. Secrecy: Given S(X), it’s hard to predict X Meaning of “close” depends on application How to measure hardness of prediction? [Diagram: S maps x to sketch S(x); Recover maps (x′, S(x)) back to x]
18 Measuring Security X a random variable on {0,1}^n Probability of predicting X = max_x Pr[X = x] There are various ways to measure entropy… Min-entropy: H∞(X) = −log(max_x Pr[X = x]) Uniform on {0,1}^n: H∞(U_n) = n “Password has min-entropy t” means that adversary’s probability of guessing the password is 2^−t Passwords had better have high entropy!
19 Measuring Security X a random variable on {0,1}^n Probability of predicting X = max_x Pr[X = x] There are various ways to measure entropy… Min-entropy: H∞(X) = −log(max_x Pr[X = x]) Conditional entropy: H∞(X | Y) = −log(prob. of predicting X given Y) = −log(Exp_y{ max_x Pr[X = x | Y = y] })
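Both entropy measures above are directly computable for small distributions. A minimal Python sketch (the dictionary representation and function names are illustrative choices, not from the talk); note that Exp_y[max_x Pr[X=x | Y=y]] simplifies to Σ_y max_x Pr[X=x, Y=y]:

```python
import math
from collections import defaultdict

def min_entropy(dist):
    """H_inf(X) = -log2(max_x Pr[X = x]) for a distribution {outcome: prob}."""
    return -math.log2(max(dist.values()))

def cond_min_entropy(joint):
    """H_inf(X | Y) = -log2( Exp_y[ max_x Pr[X = x | Y = y] ] )
    for a joint distribution {(x, y): prob}.
    Exp_y[max_x Pr[x|y]] = sum_y Pr[y] * max_x Pr[x,y]/Pr[y] = sum_y max_x Pr[x,y]."""
    best = defaultdict(float)  # best[y] = max_x Pr[X = x, Y = y]
    for (x, y), p in joint.items():
        best[y] = max(best[y], p)
    return -math.log2(sum(best.values()))

# Uniform on {0,1}^3 has min-entropy 3
uniform3 = {f"{i:03b}": 1 / 8 for i in range(8)}
print(min_entropy(uniform3))  # 3.0
```

For example, if X is uniform on two bits and Y reveals the first bit, then H∞(X | Y) = 1: one bit of entropy is lost.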
20 Fuzzy Sketch 1. Error-correction: If x′ is “close” to x, then recover x 2. Secrecy: Given S(X), it’s hard to predict X Goals: –Minimize entropy loss: H∞(X) − H∞(X | S(X)) –Maximize tolerance: how “far” x′ can be from x [Diagram: S maps x to sketch S(x); Recover maps (x′, S(x)) back to x]
21 Example: Code-Offset Construction [BBR88, Cré97, JW02]
22 Code-Offset Construction [BBR,Cré,JW] View password as n-bit string: x ∈ {0,1}^n Error model: small number of flipped bits Hamming distance: d_H(x, x′) = # of positions in which x, x′ differ Main idea: non-conventional use of standard error-correcting codes
23 Code-Offset Construction [BBR,Cré,JW] Error-correcting code ECC: k bits → n bits Any two codewords differ by at least d bits S(x) = x ⊕ ECC(R), where R is a random string Equiv: S(x) = syndrome(x) [Diagram: codeword ECC(R) offset by S(x) to x; x′ lies near x; codewords are d apart]
24 Recovery Error-correcting code ECC: k bits → n bits Any two codewords differ by at least d bits S(x) = x ⊕ ECC(R), where R is a random string Equiv: S(x) = syndrome(x) Given S(x) and x′ close to x: –Compute x′ ⊕ S(x) –Decode to get ECC(R) –Compute x = S(x) ⊕ ECC(R) Corrects d/2 errors How much entropy loss?
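The sketch-and-recover loop above can be illustrated end to end with a toy code. A minimal Python sketch, using a 3-repetition code (minimum distance 3, corrects one flipped bit per triple) in place of a real ECC; all names and the bit-list representation are illustrative:

```python
import secrets

K = 4        # message bits k
N = 3 * K    # codeword bits n (3x repetition code, d = 3)

def ecc_encode(r):
    """ECC: k bits -> n bits, repeating each bit three times."""
    return [b for b in r for _ in range(3)]

def ecc_decode_to_codeword(c):
    """Decode to the nearest codeword by majority vote within each triple."""
    out = []
    for i in range(0, len(c), 3):
        bit = 1 if sum(c[i:i + 3]) >= 2 else 0
        out += [bit, bit, bit]
    return out

def xor(a, b):
    return [u ^ v for u, v in zip(a, b)]

def sketch(x):
    """S(x) = x XOR ECC(R) for a fresh random R."""
    r = [secrets.randbelow(2) for _ in range(K)]
    return xor(x, ecc_encode(r))

def recover(x_prime, s):
    """x' XOR S(x) = (x XOR x') XOR ECC(R); decoding strips the error x XOR x'."""
    codeword = ecc_decode_to_codeword(xor(x_prime, s))
    return xor(s, codeword)
```

Flipping at most one bit per triple of x′ still recovers x exactly, and the entropy loss is n − k = 2k bits, matching the redundancy of the code.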
25 Entropy Loss Error-correcting code ECC: k bits → n bits Any two codewords differ by at least d bits S(x) = x ⊕ ECC(R), where R is a random string H∞(X | S(X)) = H∞(X, R | S(X)) ≥ H∞(X) + H∞(R) − |S(X)| = H∞(X) + k − n Revealing n bits costs ≤ n bits of entropy
26 Entropy Loss Error-correcting code ECC: k bits → n bits Any two codewords differ by at least d bits S(x) = x ⊕ ECC(R), where R is a random string H∞(X | S(X)) = H∞(X, R | S(X)) ≥ H∞(X) + H∞(R) − |S(X)| = H∞(X) + k − n Entropy loss = n − k = redundancy of code
27 Using Sketches for Authentication “How do I know you’re Alice?” Alice has x′ ≈ x; Server stores S(x), Hash(x) Server recovers x from (x′, S(x)) and checks Hash(x) Problem: Input to Hash should be uniformly random X is not uniform (especially given S(X))
28 Using Sketches for Authentication “How do I know you’re Alice?” Alice has x′ ≈ x; Server stores S(x), Hash(x)
29 Padding via Random Functions “How do I know you’re Alice?” Server stores S(x), F, Hash(F(x)) Solution: Scramble input with “random” function F. Store F. Thm: “2-universal” F is sufficient (e.g., random linear map) Verification same as before Is hash input random enough?
30 Padding via Random Functions “How do I know you’re Alice?” Server stores S(x), F, Hash(F(x)) Solution: Scramble input with “random” function F. Store F. Secure as long as: Entropy-Loss_S + |Hash| + 2 log(1/ε) ≤ H∞(X) Proof idea: S(x), F, Hash(F(x)) ≈ S(x), F, Hash(R) Similar to “left-over hash lemma / privacy amplification”
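A standard 2-universal family that fits this padding step is the set of random linear maps over GF(2): for x ≠ x′, Pr_A[Ax = Ax′] = Pr_A[A(x ⊕ x′) = 0] = 2^−m. A hypothetical sketch of the server-side flow (the function names and the use of SHA-256 as the hash are illustrative assumptions, not from the talk):

```python
import hashlib
import secrets

def random_linear_map(n, m):
    """A uniformly random m x n binary matrix A; x -> Ax mod 2 is 2-universal."""
    return [[secrets.randbelow(2) for _ in range(n)] for _ in range(m)]

def apply_map(A, x):
    """F(x) = Ax mod 2, returned as bytes of 0/1 values."""
    return bytes(sum(a * b for a, b in zip(row, x)) % 2 for row in A)

def enroll(x, m=16):
    """Server stores (F, Hash(F(x))); the sketch S(x) would be stored alongside."""
    F = random_linear_map(len(x), m)
    return F, hashlib.sha256(apply_map(F, x)).hexdigest()

def verify(x_recovered, F, stored_hash):
    """After recovering x from (x', S(x)), check the hash of the scrambled input."""
    return hashlib.sha256(apply_map(F, x_recovered)).hexdigest() == stored_hash
```

The leftover-hash-lemma argument says F(x) looks uniform given S(x) and F, so hashing F(x) instead of x sidesteps the non-uniformity of X.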
31 Sketches and Authentication “Fuzzy Sketch” + Hashing Solves Authentication Assumption: X has high entropy –Necessary –X could be several passwords taken together “Hamming” errors can be handled with standard ECC Similar techniques imply one can use X as key for any crypto application (e.g. encryption) –Covers several previously studied settings
32 Fuzzy Extractor [Diagram: Gen(x) → (P, K), where P is safely public (e.g., P = (S(x), F)) and K is random given P; Rec(x′, P) → K, usable by any application]
33 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication Other metrics Stronger privacy?
34 Other metrics Variants of code-offset work for many metrics –“transitivity” suffices [DRS] –Continuous normed spaces: Euclidean [LT,VTDL], L p [DRS] New constructions –Set difference metric (introduced by [JS]) –General technique via special embeddings: edit distance, permutation metric Real(er)-life metrics?
35 Why Set Difference? [EHMS,FJ,JS] Inputs: tiny subsets in a HUGE universe Some representations of personal / biometric data: –Fingerprints represented as feature list (minutiae/ridge meetings) –List of favorite books X ⊆ {1,…,N}, #X = s d_S(X,Y) = ½ #(X △ Y) = Hamming distance on vectors in {0,1}^N Code-offset not good: N-bit string is too long! Want: s log N bits
36 Recall: Fuzzy Sketch 1. Error-correction: If x′ is “close” to x, then recover x 2. Secrecy: Given S(X), it’s hard to predict X Goals: –Minimize entropy loss: H∞(X) − H∞(X | S(X)) –Maximize tolerance: how “far” x′ can be from x
37 New Constructions for Set Difference X ⊆ {1,…,N}, #X = s, d_S(X,Y) = ½ #(X △ Y) Two constructions: 1. Punctured Reed-Solomon code 2. Sublinear-time decoding of BCH codes from syndromes Both constructions: –As good as code-offset could be (≈ optimal) –Storage space ≤ (s + 1) log N –Entropy loss 2e log N to correct e errors –Improve previous best [JS02] (+ analysis)
38 Reed-Solomon-based Sketch X ⊆ {1,…,N}, #X = s, d_S(X,Y) = ½ #(X △ Y) Suppose N is prime, work in Z_N 1. k := s − 2e − 1 2. Pick random poly. P(·) of degree ≤ k 3. P′(·) := monic degree-s poly. s.t. P′(z) = P(z) ∀z ∈ X 4. Output S(X) = P′ [Diagram: graphs of P and P′ over {1,…,N}, agreeing on the points of X; s = 7, e = 2, k = 2]
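Step 3 has a simple realization: P′ = P + ∏_{z∈X}(t − z). The product is monic of degree s and vanishes on X, so P′ is monic of degree s and agrees with P on every point of X. A toy Python sketch over Z_p (the tiny prime and helper names are illustrative; the Reed-Solomon decoding needed for recovery is omitted):

```python
import secrets

P_MOD = 97  # a small prime standing in for the universe size N

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P_MOD
    return out

def poly_add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [(u + v) % P_MOD for u, v in zip(a, b)]

def poly_eval(a, t):
    return sum(c * pow(t, i, P_MOD) for i, c in enumerate(a)) % P_MOD

def vanishing_poly(X):
    """Monic degree-#X polynomial that is zero on every point of X."""
    v = [1]
    for z in X:
        v = poly_mul(v, [(-z) % P_MOD, 1])  # multiply by (t - z)
    return v

def rs_sketch(X, e):
    """S(X) = P' = P + prod_{z in X}(t - z), with P random of degree <= s - 2e - 1."""
    k = len(X) - 2 * e - 1
    P = [secrets.randbelow(P_MOD) for _ in range(k + 1)]
    return poly_add(P, vanishing_poly(X))
```

Publishing P′ reveals only s coefficients (it is monic), while P carries (k + 1) log N bits of fresh randomness, giving the 2e log N entropy loss computed on the next slide.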
39 Reed-Solomon-based Sketch X ⊆ {1,…,N}, #X = s, d_S(X,Y) = ½ #(X △ Y) Suppose N is prime, work in Z_N 1. k := s − 2e − 1 2. Pick random poly. P(·) of degree ≤ k 3. P′(·) := monic degree-s poly. s.t. P′(z) = P(z) ∀z ∈ X 4. Output S(X) = P′ Recovery: Given P′ and X′ close to X 1. Reed-Solomon decoding yields P 2. Intersections of P and P′ yield X [Diagram: graphs of P, P′ and the points of X′; s = 7, e = 2, k = 2]
40 Reed-Solomon-based Sketch X ⊆ {1,…,N}, #X = s, d_S(X,Y) = ½ #(X △ Y) Suppose N is prime, work in Z_N 1. k := s − 2e − 1 2. Pick random poly. P(·) of degree ≤ k 3. P′(·) := monic degree-s poly. s.t. P′(z) = P(z) ∀z ∈ X 4. Output S(X) = P′ Entropy loss: H∞(X | P′) = H∞(X, P | P′) ≥ H∞(X) + H∞(P) − |P′| = H∞(X) + (k + 1) log N − s log N = H∞(X) − 2e log N
41 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication “Set difference” distance: Reed-Solomon construction Other schemes via metric embeddings: edit distance Achieving stronger privacy
42 Other metrics? Real error models not as clean as Hamming & set diff. Algebraic techniques won’t apply directly. Possible Approaches: 1. Develop new scheme tailored to particular metric 2. Reduce to easier metric via embedding φ: M₁ → M₂ x, y close ⇒ φ(x), φ(y) close x, y far ⇒ φ(x), φ(y) far
43 Biometric embeddings Real error models not as clean as Hamming & set diff. Algebraic techniques won’t apply directly. Possible Approaches: 1. Develop new scheme tailored to particular metric 2. Reduce to easier metric via embedding φ: M₁ → M₂ x, y close ⇒ φ(x), φ(y) close x, y far ⇒ φ(x), φ(y) far A is a large set ⇒ φ(A) is large H∞(A) large ⇒ H∞(φ(A)) large
44 Edit Distance Strings of bits d(x,y) = number of insertions & deletions to go from x to y [ADGIR] No good standard embeddings into Hamming Shingling [Broder]: “biometric” embedding into Set Difference [Example bit-strings: 101010101010, 10101101010, 101011010101]
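Broder-style shingling maps a string to the set of its length-w contiguous substrings; a single insertion or deletion disturbs only the shingles overlapping the edit point, so small edit distance translates into small set difference. A minimal sketch (the parameter w = 4 is an arbitrary illustrative choice):

```python
def shingle(s, w=4):
    """The set of all length-w contiguous substrings (w-shingles) of s."""
    return {s[i:i + w] for i in range(len(s) - w + 1)}

def set_difference_distance(A, B):
    """d_S(A, B) = half the size of the symmetric difference."""
    return len(A ^ B) / 2
```

One deletion removes at most w shingles and introduces at most w new ones, so the symmetric difference of the two shingle sets has size at most 2w regardless of string length.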
45 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication Other metrics Stronger privacy?
46 Stronger Privacy? Previous notion: Unpredictability –Can’t guess X even after seeing sketch –Sufficient for using X as a crypto key What about the privacy of X itself? –Do not want particular info about X leaked (say, first 20 bits) Ideal notion, ignoring computational restrictions: X almost independent of S(X) Problem: Some info must be leaked by S(X) (provably) –Mutual information I(X; S(X)) is large We want to ensure that “useful” information is hidden
47 Attempt: Hiding Useful Info [CMR,RW,DS] Definition: S(X) hides all functions of X if for all functions g, for all adversaries A: Pr[A(S(X)) = g(X)] − Pr[A(S(X′)) = g(X)] < ε, where X, X′ are independent, identical random variables Intuition: “A cannot guess g(X) given S(X)” Implies unpredictability
48 Attempt: Hiding Useful Info [CMR,RW,DS] Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g, for all adversaries A: Pr[A(S(X)) = g(X)] − Pr[A(S(X′)) = g(X)] < ε, where X, X′ are independent, identical random variables Non-computational version of semantic security for high-entropy input distributions
49 Hiding Useful Info Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g, for all adversaries A: Pr[A(S(X)) = g(X)] − Pr[A(S(X′)) = g(X)] < ε, where X, X′ are independent, identical random variables Can be achieved in code-offset construction [DS03] –Use pseudo-randomly chosen code –Need to keep decodability –Exact parameters still unknown
50 Technique: Equivalence to Extraction Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g, for all adversaries A: Pr[A(S(X)) = g(X)] − Pr[A(S(X′)) = g(X)] < ε, where X, X′ are independent, identical random variables S() is an “extractor” if for all r.v.’s X₁, X₂ of min-entropy t: S(X₁) ≈ S(X₂) Thm: S(X) hides all functions of X whenever H∞(X) ≥ t ⟺ S() is an “extractor” for r.v.’s of min-entropy t − 1
51 The right notion? Definition: S(X) hides all functions of X when H∞(X) ≥ t if for all functions g, for all adversaries A: Pr[A(S(X)) = g(X)] − Pr[A(S(X′)) = g(X)] < ε, where X, X′ are independent, identical random variables Is this strong enough? –Not secure if repeated: S(X), S′(X), S′′(X) may reveal X! –What’s the “right” notion? –What computational notions might we use? (see [CMR])
52 Outline Basic Setting: Password Authentication Simple abstraction: Fuzzy Sketch Example: Hamming distance Fuzzy Sketch ⇒ Authentication Other metrics Stronger privacy?
53 Conclusions Generic framework for turning noisy, non-uniform data into secure cryptographic keys –Only assume high entropy (necessary) –Abstraction, simplicity allow comparing schemes –Constructions for Hamming, Set Difference, Edit Metrics Possible to have strong privacy of data Future work –Real-life metrics (under way) –Other settings ([DV] use sketches in bounded-storage model) –Better definitions
54 Other Topics I Work On Distributed Crypto Protocols –Multi-party Computation –Byzantine Agreement Modeling partial disclosure of secret data –Exposure-resilient crypto, database privacy Quantum Cryptography –Authentication, Encryption of Quantum States –Multi-party Quantum Computation –Quantum Codes Combinatorial Tools for Cryptography
55 Questions?
56 Other Slides
57 The Problem: We’re Human “Humans are incapable of securely storing high-quality cryptographic keys, and they have unacceptable speed when performing cryptographic operations. (They are also large, expensive to maintain, difficult to manage, and they pollute the environment. […] But they are sufficiently pervasive that we must design our protocols around their limitations.)” From Network Security by Kaufman, Perlman and Speciner.
58 Stuff I want to say Previous work cute, clever, not formal enough, no good analysis of how to use it generic framework general tools About authentication: interplay between computational assumptions and information-theoretic technique practical… may be implemented General context: provable security
59 Using Biometric Passwords Crypto moving beyond encryption / authentication Newer applications often poorly modeled –Ad hoc solutions –Get broken! Crypto Theory: models, proofs This talk:– formal framework – provably secure constructions for using biometric data for authentication, key recovery
60 The Problem: We’re Human Secure cryptographic keys: long, random strings Too hard to remember Keep copy under doormat? Use short, easy-to-remember password (PIN)? –easy to remember = easy to guess
61 Solution: Use Password I Never Forget Personal Info –Mom’s maiden name, Date of birth –Name of first pet, name of street where you grew up Biometrics –Fingerprints –Iris Scan –Voice print –Genetic Subsequence
62 Biometric embeddings φ: M₁ → M₂ We care about entropy: non-standard requirements x, y close ⇒ φ(x), φ(y) close: d₂(φ(x), φ(y)) ≤ d₁(x, y) A is a large set ⇒ φ(A) is large: #φ(A) ≥ #A, or equivalently H∞(φ(X)) ≥ H∞(X)
63 Statistical Distinguishability Statistical Difference (L₁): For distributions p₀(x), p₁(x): SD(p₀, p₁) = ½ Σ_x |p₀(x) − p₁(x)| SD measures distinguishability: If b ← {0,1}, x ← p_b, then max_A |Pr[A(x) = b] − ½| = ½ SD(p₀, p₁) (Notation: A ≈_ε B if SD(A, B) ≤ ε)
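Statistical difference is directly computable for small distributions. A minimal sketch (the dictionary representation is an illustrative choice):

```python
def statistical_distance(p0, p1):
    """SD(p0, p1) = 1/2 * sum_x |p0(x) - p1(x)|, distributions as {outcome: prob}."""
    support = set(p0) | set(p1)
    return 0.5 * sum(abs(p0.get(x, 0.0) - p1.get(x, 0.0)) for x in support)
```

For p₀ uniform on {0,1} and p₁ = {0: 0.75, 1: 0.25}, SD = 0.25, so the best distinguisher's advantage over ½ is ½ · SD = 0.125.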
64 Statistical Distinguishability Two probability distributions p₀(x), p₁(x) Sphinx: 1. Flips a fair coin 2. Heads: samples Z according to p₀; Tails: samples Z according to p₁ 3. Shows Z to Greek Hero Greek Hero: Guesses if coin was heads or tails. Hero can win with probability at least ½ Hero wins w. prob. ½ + ε ⇒ p₀, p₁ are ε-distinguishable
65 Fuzzy Extractor (generalizes extractors [NZ]) 1. Error-correction: If x′ is “close” to x, then Rec(x′, Pub) = K 2. Security: (Pub, K) ≈ (Pub, R) [Diagram: Generate(x) → (Pub, K) with Pub = (S(x), F), K = F(x); Recover(x′, Pub) → K]
66 Key Recovery [EHMS, FJ, JS] [Diagram: www.Fingers2Keys.com — Generate / Recover]