B504/I538: Introduction to Cryptography Spring 2017 • Lecture 2 (2017-01-12)
Assignment 0 is due on Tuesday! (2017-01-17)
Discrete probability 101
Q: Why discrete probability?
A: I'm the prof, and I said so!
A: Security definitions rely heavily on probability theory
Probability distributions
Defⁿ: A (discrete) probability distribution over a finite set S is a function Pr: S→[0,1] such that ∑_{x∈S} Pr(x) = 1.
The set S is called the sample space of Pr.
The elements of S are called outcomes or "elementary events".
Probability distributions
Q: What kind of nonsensical "definition" was that?
A: Actually, it was perfectly sensible. Let's unpack it…
Probability distributions, unpacked
S lists every conceivable outcome of a random process.
The function Pr associates a likelihood to each outcome; that is, Pr(x) describes "how likely" x is to happen.
The range [0,1] ensures each outcome is associated with a probability that "makes sense":
Pr(x)=0 means "x NEVER happens"
Pr(x)=1 means "x ALWAYS happens"
0<Pr(x)<1 means "x SOMETIMES, but NOT ALWAYS, happens"
∑_{x∈S} Pr(x) = 1 ensures exactly one outcome happens (in any given "trial" of the random process).
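To make this concrete, here is a minimal Python sketch (an illustration, not from the lecture) that encodes a distribution as a dict from outcomes to probabilities and checks both conditions from the definition:

```python
from fractions import Fraction

# A distribution over S = {heads, tails} for a biased coin; probabilities
# are stored exactly as fractions. (Illustrative values, not from the slides.)
Pr = {"heads": Fraction(2, 3), "tails": Fraction(1, 3)}

# The two conditions from the definition:
assert all(0 <= p <= 1 for p in Pr.values())  # Pr: S -> [0,1]
assert sum(Pr.values()) == 1                  # probabilities sum to 1
```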
Q: When should I "unpack" mathematical definitions like you just did?
A: EACH AND EVERY TIME YOU ENCOUNTER ONE!!
Probability distributions
Eg.: S = {0,1}² = {00,01,10,11}
Pr: 00 ↦ 1/4, 01 ↦ 1/8, 10 ↦ 1/2, 11 ↦ 1/8   (1/4 + 1/8 + 1/2 + 1/8 = 1)
Probability distributions
Eg.: S = {0,1}² = {00,01,10,11}
Pr: 00 ↦ 1/4, 01 ↦ 1/4, 10 ↦ 1/4, 11 ↦ 1/4   (1/4 + 1/4 + 1/4 + 1/4 = 1)
Common/important distributions:
Uniform distribution: ∀x∈S, Pr(x) = 1/|S|
Probability distributions
Eg.: S = {0,1}² = {00,01,10,11}
Pr: 00 ↦ 0, 01 ↦ 1, 10 ↦ 0, 11 ↦ 0   (0 + 1 + 0 + 0 = 1)
Common/important distributions:
Uniform distribution: ∀x∈S, Pr(x) = 1/|S|
Point distribution at x₀: Pr(x₀) = 1 and ∀x≠x₀, Pr(x) = 0
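Both of these common distributions are easy to build programmatically; a small sketch (the helper names are illustrative, not from the lecture):

```python
from fractions import Fraction

def uniform(S):
    """Uniform distribution: Pr(x) = 1/|S| for every x in S."""
    return {x: Fraction(1, len(S)) for x in S}

def point(S, x0):
    """Point distribution at x0: Pr(x0) = 1 and Pr(x) = 0 otherwise."""
    return {x: Fraction(1 if x == x0 else 0) for x in S}

S = ["00", "01", "10", "11"]
assert all(p == Fraction(1, 4) for p in uniform(S).values())
assert point(S, "01") == {"00": 0, "01": 1, "10": 0, "11": 0}
```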
Events
Defⁿ: If Pr: S→[0,1] is a probability distribution, then any subset E⊆S is called an event. The probability of E is Pr[E] ≔ ∑_{x∈E} Pr(x).
Convention: Square brackets ⇒ probability of an event; parentheses ⇒ probability of an outcome.
Events
Q: How many events are there in S?
A: 2^|S|, including the two "trivial" events:
The universal event, E = S (Pr[S] = 1)
The empty event (or null event), E = ∅ (Pr[∅] = 0)
Also included are the "elementary events", i.e., the singleton set {x} for each x∈S.
Note: Each outcome is part of many events!
Events
Eg.: S = {0,1}⁸ and E = {x∈S | lsb₂(x) = 11}, the strings whose two least significant bits are 11.
Q: What is Pr[E]?
A: Pr(00000011) + Pr(00000111) + ⋯ + Pr(11111111). To be more precise, we need to fix a distribution Pr!
Q: Suppose Pr: S→[0,1] is the uniform distribution. Now compute Pr[E].
A: Pr[E] = 1/4 (how did we compute this?)
Counting theorem
Thm (Counting theorem): If Pr: S→[0,1] is a uniform distribution and E⊆S is an event, then Pr[E] = |E|/|S|.
Proof: Pr[E] = ∑_{x∈E} Pr(x) (defⁿ of Pr[E])
= ∑_{x∈E} 1/|S| (defⁿ of uniform distribution)
= 1/|S| + ⋯ + 1/|S| (|E| times)
= |E|/|S| ☐
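A quick brute-force check of the lsb₂ example and the counting theorem (a sketch; the eight-bit sample space is small enough to enumerate):

```python
from fractions import Fraction
from itertools import product

# S = {0,1}^8 under the uniform distribution; E = { x : lsb2(x) = 11 }
S = ["".join(bits) for bits in product("01", repeat=8)]
Pr = {x: Fraction(1, len(S)) for x in S}
E = [x for x in S if x.endswith("11")]

# Summing outcome probabilities over E...
assert sum(Pr[x] for x in E) == Fraction(1, 4)
# ...agrees with the counting theorem: Pr[E] = |E|/|S|
assert Fraction(len(E), len(S)) == Fraction(1, 4)
```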
Complementary events
That's a mighty fine probability you've got there!
Complementary events
Defⁿ: If Pr: S→[0,1] is a probability distribution and E⊆S is an event, then the complement of E is S∖E ≔ {x∈S | x∉E}.
Notation: The complement of E is often denoted Ē (note the "bar" over E), which is read "E bar".
Intuitively, Ē is the event "E does not occur".
Complement rule
Thm (Complement rule): If Pr: S→[0,1] is a probability distribution and E⊆S is an event, then Pr[Ē] = 1 − Pr[E].
Proof: Pr[Ē] = ∑_{x∈Ē} Pr(x) (defⁿ of Pr[Ē])
= ∑_{x∈S∖E} Pr(x) (defⁿ of Ē)
= ∑_{x∈S} Pr(x) − ∑_{x∈E} Pr(x) (rearranging)
= 1 − ∑_{x∈E} Pr(x) (defⁿ of Pr)
= 1 − Pr[E] (defⁿ of Pr[E]) ☐
Union bound
Thm (Union bound): If Pr: S→[0,1] is a probability distribution and E,F⊆S are events, then Pr[E∪F] ≤ Pr[E] + Pr[F].
Proof: Pr[E∪F] = ∑_{x∈E∪F} Pr(x) (defⁿ of Pr[E∪F])
= ∑_{x∈E} Pr(x) + ∑_{x∈F} Pr(x) − ∑_{x∈E∩F} Pr(x) (inclusion-exclusion)
= Pr[E] + Pr[F] − Pr[E∩F] (defⁿ of Pr[E] & Pr[F])
≤ Pr[E] + Pr[F] (defⁿ of Pr) ☐
Corollary: If E∩F = ∅, then Pr[E∪F] = Pr[E] + Pr[F].
If E∩F = ∅, we call E and F mutually exclusive events.
Q: Is the converse of the above corollary true?
A: No! We might have E∩F ≠ ∅ but Pr[E∩F] = 0!
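Both inclusion-exclusion and the union bound can be verified by direct enumeration; a sketch over S = {0,1}⁴ with two illustrative events:

```python
from fractions import Fraction
from itertools import product

S = ["".join(bits) for bits in product("01", repeat=4)]
Pr = {x: Fraction(1, len(S)) for x in S}

def prob(E):
    return sum((Pr[x] for x in E), Fraction(0))

E = {x for x in S if x[0] == "1"}    # "msb is 1"
F = {x for x in S if x[-1] == "1"}   # "lsb is 1"

# Inclusion-exclusion holds with equality:
assert prob(E | F) == prob(E) + prob(F) - prob(E & F)
# Dropping the Pr[E ∩ F] term gives the union bound:
assert prob(E | F) <= prob(E) + prob(F)
```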
Random variables
Defⁿ: Let Pr: S→[0,1] be a probability distribution and let V be an arbitrary finite set. A random variable is a function X: S→V.
Sample space S is a "list" of all possible outcomes.
Distribution Pr says "how likely" each outcome is.
Random variable X assigns a "meaning" or "interpretation" to each outcome.
Random variables
Eg.: Let S = {0,1}ⁿ and let X: S→{0,1} such that X(y) ≔ lsb(y). If Pr is the uniform distribution, then Pr[X=0] = Pr[X=1] = 1/2.
A random variable X "induces" a probability distribution on its range V via Pr[X=v] ≔ Pr[X⁻¹(v)]; note that the preimage X⁻¹(v) ⊆ S is an event.
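The induced distribution is just the probability of a preimage event, which this sketch computes directly (illustrative code, not from the lecture):

```python
from fractions import Fraction
from itertools import product

n = 4
S = ["".join(bits) for bits in product("01", repeat=n)]
Pr = {y: Fraction(1, len(S)) for y in S}

def X(y):           # X(y) := lsb(y)
    return y[-1]

def pr_X(v):        # induced distribution: Pr[X = v] := Pr[X^{-1}(v)]
    preimage = [y for y in S if X(y) == v]   # the event X^{-1}(v)
    return sum(Pr[y] for y in preimage)

assert pr_X("0") == pr_X("1") == Fraction(1, 2)
```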
The uniform random variable
Defⁿ: Let Pr: S→[0,1] be the uniform distribution. The identity function X: S→S is called the uniform random variable on S.
Notation: We write x ∊ S to denote that x is output by the uniform random variable on S.
Other common notations: x ← S, x ←_R S, or x ←_$ S.
Independent events
Defⁿ: If Pr: S→[0,1] is a probability distribution, then two events E,F⊆S are independent if Pr[E∩F] = Pr[E]·Pr[F].
Intuitively, E and F are independent events if E occurs with the same probability whether or not F occurs, and vice versa.
Independent random variables
Defⁿ: Two random variables X: S→V and Y: S→V are independent if ∀a,b∈V, Pr[X=a ∧ Y=b] = Pr[X=a]·Pr[Y=b].
Intuitively, X and Y are independent random variables if the event X=a occurs with the same probability whether or not Y=b occurs, and vice versa, for every possible choice of a and b.
Independent random variables
Eg.: Pr: {0,1}ⁿ→[0,1] is the uniform distribution, and X: {0,1}ⁿ→{0,1} and Y: {0,1}ⁿ→{0,1} are random variables such that X(r) ≔ lsb(r) and Y(r) ≔ msb(r).
Then, for any b₀,b₁∈{0,1}, Pr[X=b₀ ∧ Y=b₁] = Pr[msb(r)=b₁ ∧ lsb(r)=b₀] = 1/4, and Pr[X=b₀]·Pr[Y=b₁] = (1/2)·(1/2) = 1/4.
Hence, X and Y are independent random variables.
Hold on! What if n=1!? Then X(r) = Y(r)!
Good observation!
Lesson: always check for implicit/hidden assumptions!
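The lsb/msb example, including its failure at n=1, can be checked exhaustively; a sketch:

```python
from fractions import Fraction
from itertools import product

def lsb_msb_independent(n):
    S = ["".join(bits) for bits in product("01", repeat=n)]
    Pr = {r: Fraction(1, len(S)) for r in S}
    X = lambda r: r[-1]   # lsb
    Y = lambda r: r[0]    # msb
    for b0, b1 in product("01", repeat=2):
        joint = sum(Pr[r] for r in S if X(r) == b0 and Y(r) == b1)
        marg = (sum(Pr[r] for r in S if X(r) == b0) *
                sum(Pr[r] for r in S if Y(r) == b1))
        if joint != marg:
            return False
    return True

assert lsb_msb_independent(8)        # independent whenever n >= 2...
assert not lsb_msb_independent(1)    # ...but NOT when n = 1: X(r) = Y(r)
```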
Exclusive OR
The exclusive-OR (XOR) of two strings is their bitwise addition modulo 2:
x  y  x⊕y
0  0  0
0  1  1
1  0  1
1  1  0
Eg.: 10011 ⊕ 01010 = 11001
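As a sketch, bitwise XOR of bit strings in a few lines of Python (the example strings are illustrative):

```python
def xor(x, y):
    """Bitwise XOR of two equal-length bit strings."""
    assert len(x) == len(y)
    return "".join(str(int(a) ^ int(b)) for a, b in zip(x, y))

assert xor("10011", "01010") == "11001"
assert xor("10011", "10011") == "00000"   # x XOR x is always all-zero
```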
An important property of XOR
Thm (XOR preserves uniformity): If X: {0,1}ⁿ→{0,1}ⁿ is a uniform random variable and Y: {0,1}ⁿ→{0,1}ⁿ is an arbitrary random variable that is independent of X, then Z ≔ X⊕Y is a uniform random variable.
Proof: Left as an exercise (see Assignment 1). ☐
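The proof is left as an exercise, but one can at least sanity-check the theorem by exhaustive computation for a small n; in this sketch, the distribution chosen for Y is an arbitrary made-up one:

```python
from fractions import Fraction
from itertools import product

n = 3
S = ["".join(bits) for bits in product("01", repeat=n)]
xor = lambda x, y: "".join(str(int(a) ^ int(b)) for a, b in zip(x, y))

Pr_X = {x: Fraction(1, len(S)) for x in S}    # X is uniform
# Y: an arbitrary, deliberately lopsided distribution (made-up values)
Pr_Y = {y: Fraction(1, 2) if y == "000" else Fraction(1, 2 * (len(S) - 1))
        for y in S}
assert sum(Pr_Y.values()) == 1

# By independence, Pr[Z = z] = sum over y of Pr[X = z XOR y] * Pr[Y = y]
for z in S:
    pz = sum(Pr_X[xor(z, y)] * Pr_Y[y] for y in S)
    assert pz == Fraction(1, len(S))          # Z is uniform, as claimed
```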
Conditional probability
Defⁿ: If Pr: S→[0,1] is a probability distribution and E,F⊆S with Pr[F]≠0, then the conditional probability of E given F is Pr[E|F] ≔ Pr[E∩F] ⁄ Pr[F].
Pr[E|F] is the probability that E occurs given that F also occurs.
Alt. defⁿ: Pr[E∩F] ≔ Pr[E|F]·Pr[F]
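A small worked example (the events are chosen for illustration): over uniform S = {0,1}³, condition on the msb being 1:

```python
from fractions import Fraction
from itertools import product

S = ["".join(bits) for bits in product("01", repeat=3)]
Pr = {x: Fraction(1, len(S)) for x in S}
prob = lambda A: sum(Pr[x] for x in A)

E = {x for x in S if x.count("1") >= 2}   # "at least two 1 bits"
F = {x for x in S if x[0] == "1"}         # "msb is 1"

pr_E_given_F = prob(E & F) / prob(F)      # Pr[E|F] = Pr[E ∩ F] / Pr[F]
assert pr_E_given_F == Fraction(3, 4)
# Alt. defn: Pr[E ∩ F] = Pr[E|F] * Pr[F]
assert prob(E & F) == pr_E_given_F * prob(F)
```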
Law of Total Probability
Thm (Law of Total Probability): If Pr: S→[0,1] is a probability distribution and E,F⊆S with Pr[F]≠0 and Pr[F̄]≠0, then Pr[E] = Pr[E|F]·Pr[F] + Pr[E|F̄]·Pr[F̄].
Proof: Pr[E] = ∑_{x∈E} Pr(x)
= ∑_{x∈E∩F} Pr(x) + ∑_{x∈E∩F̄} Pr(x)
= Pr[E∩F] + Pr[E∩F̄]
= Pr[E|F]·Pr[F] + Pr[E|F̄]·Pr[F̄] ☐
Bayes' Theorem
Thm (Bayes' Theorem): If Pr: S→[0,1] is a probability distribution and E,F⊆S with Pr[E]≠0 and Pr[F]≠0, then Pr[E|F] = Pr[F|E]·Pr[E] ⁄ Pr[F].
Proof: Pr[E|F] = Pr[E∩F] ⁄ Pr[F] (defⁿ of Pr[E|F])
= Pr[F∩E] ⁄ Pr[F] (commutativity of ∩)
= Pr[F|E]·Pr[E] ⁄ Pr[F] (alt. defⁿ of cond. prob.) ☐
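Bayes' Theorem can likewise be confirmed on any concrete distribution; a sketch reusing the same illustrative events:

```python
from fractions import Fraction
from itertools import product

S = ["".join(bits) for bits in product("01", repeat=3)]
Pr = {x: Fraction(1, len(S)) for x in S}
prob = lambda A: sum(Pr[x] for x in A)
cond = lambda A, B: prob(A & B) / prob(B)   # Pr[A|B]

E = {x for x in S if x.count("1") >= 2}
F = {x for x in S if x[0] == "1"}

# Bayes: Pr[E|F] = Pr[F|E] * Pr[E] / Pr[F]
assert cond(E, F) == cond(F, E) * prob(E) / prob(F)
```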
Expectation
Defⁿ: Suppose V⊆ℝ. If X: S→V is a random variable, then the expected value of X is Exp[X] ≔ ∑_{v∈V} Pr[X=v]·v.
Fact: Expectation is linear; that is, Exp[X+Y] = Exp[X] + Exp[Y].
Fact: If X and Y are independent random variables, then Exp[X·Y] = Exp[X]·Exp[Y].
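Both facts are easy to confirm by direct computation; in this sketch, the msb and lsb of a uniform string serve as the independent pair:

```python
from fractions import Fraction
from itertools import product

S = ["".join(bits) for bits in product("01", repeat=4)]
Pr = {s: Fraction(1, len(S)) for s in S}
expect = lambda X: sum(Pr[s] * X(s) for s in S)

X = lambda s: int(s[0])      # value of the msb
Y = lambda s: s.count("1")   # Hamming weight
Z = lambda s: int(s[-1])     # value of the lsb (independent of X here)

# Linearity holds for ANY X and Y, independent or not:
assert expect(lambda s: X(s) + Y(s)) == expect(X) + expect(Y)
# The product rule needs independence (X = msb and Z = lsb qualify):
assert expect(lambda s: X(s) * Z(s)) == expect(X) * expect(Z)
```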
Markov's Inequality
Thm (Markov's Inequality): Suppose V⊆ℝ. If X: S→V is a non-negative random variable and v>0, then Pr[X≥v] ≤ Exp[X]/v.
Proof: Exp[X] = ∑_{x∈V} Pr[X=x]·x (defⁿ of Exp[X])
= ∑_{x<v} Pr[X=x]·x + ∑_{x≥v} Pr[X=x]·x (regrouping)
≥ ∑_{x<v} Pr[X=x]·0 + ∑_{x≥v} Pr[X=x]·v (non-negativity of X)
= Pr[X≥v]·v
Dividing both sides by v gives Pr[X≥v] ≤ Exp[X]/v. ☐
Variance
Defⁿ: Suppose V⊆ℝ. If X: S→V is a random variable, then the variance of X is Var[X] ≔ Exp[(X−Exp[X])²].
Intuitively, Var[X] indicates how far we expect X to deviate from Exp[X].
Fact: Var[X] = Exp[X²] − (Exp[X])²
Fact: Var[aX+b] = a²·Var[X]
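Both variance facts, checked by enumeration (the Hamming-weight random variable and the constants a, b are illustrative):

```python
from fractions import Fraction
from itertools import product

S = ["".join(bits) for bits in product("01", repeat=4)]
Pr = {s: Fraction(1, len(S)) for s in S}
expect = lambda X: sum(Pr[s] * X(s) for s in S)

X = lambda s: s.count("1")   # Hamming weight of a uniform 4-bit string
mu = expect(X)
var = expect(lambda s: (X(s) - mu) ** 2)

assert var == expect(lambda s: X(s) ** 2) - mu ** 2   # Var = E[X^2]-E[X]^2
a, b = 3, 7                                           # arbitrary constants
mu_ab = expect(lambda s: a * X(s) + b)
assert expect(lambda s: (a * X(s) + b - mu_ab) ** 2) == a ** 2 * var
```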
Chebyshev's Inequality
Thm (Chebyshev's Inequality): Suppose V⊆ℝ. If X: S→V is a random variable and δ>0, then Pr[|X−Exp[X]| ≥ δ] ≤ Var[X]⁄δ².
Proof: Pr[|X−Exp[X]| ≥ δ] = Pr[|X−Exp[X]|² ≥ δ²]
≤ Exp[(X−Exp[X])²]⁄δ² (Markov's)
= Var[X]⁄δ² (defⁿ of Var[X]) ☐
Chernoff's bound
Thm (Chernoff's bound): Fix ε>0 and b∈{0,1}, and let X₁,…,X_N be independent random variables on {0,1} such that Pr[Xᵢ=b] = 1/2+ε for each i=1,…,N. Then the probability that the "majority value" of the Xᵢ is not b is at most e^(−ε²N⁄2).
Markov vs. Chebyshev vs. Chernoff
Suppose we have a biased coin that comes up heads with probability 0.9 and tails with probability 0.1.
Consider the random variable X that counts the number of tails after N=100 tosses. What is the probability that X is greater than or equal to 50?
Markov: Pr[X≥50] ≤ Exp[X]⁄50 = 10⁄50 = 0.2
Chebyshev: Pr[|X−Exp[X]| ≥ 40] ≤ Var[X]⁄40² = 9⁄1600 = 0.005625
Chernoff: Pr[X≥50] ≤ e^(−0.4²·100⁄2) = e^(−8) ≈ 0.000335
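For comparison, here is a sketch that computes all three bounds alongside the exact binomial tail (math.comb requires Python 3.8+):

```python
from math import comb, exp

# Biased coin: heads w.p. 0.9, tails w.p. 0.1; X counts tails in N tosses.
N, p = 100, 0.1
ex, var = N * p, N * p * (1 - p)        # Exp[X] = 10, Var[X] = 9

markov = ex / 50                        # Pr[X >= 50] <= Exp[X]/50
chebyshev = var / 40 ** 2               # Pr[|X - 10| >= 40] <= Var[X]/40^2
chernoff = exp(-(0.4 ** 2) * N / 2)     # e^{-eps^2 * N / 2}, eps = 0.4

# Exact tail probability Pr[X >= 50] for the binomial, for comparison:
exact = sum(comb(N, k) * p ** k * (1 - p) ** (N - k) for k in range(50, N + 1))

print(markov, chebyshev, chernoff, exact)
# 0.2   0.005625   0.000335...   (exact is many orders of magnitude smaller)
```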
That’s all for today, folks!