Chernoff bounds
The Chernoff bound for a random variable $X$ is obtained as follows: for any $t > 0$,
$$\Pr[X \ge a] = \Pr[e^{tX} \ge e^{ta}] \le \frac{E[e^{tX}]}{e^{ta}}.$$
Similarly, for any $t < 0$,
$$\Pr[X \le a] = \Pr[e^{tX} \ge e^{ta}] \le \frac{E[e^{tX}]}{e^{ta}}.$$
The value of $t$ that minimizes $E[e^{tX}]/e^{ta}$ gives the best possible bound.
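The optimization over $t$ can be illustrated numerically. The sketch below (Python; the helper name `chernoff_upper_bound` and the grid-search approach are my own, not from the slides) minimizes $E[e^{tX}]/e^{ta}$ over a grid of $t$ values for a binomial random variable, whose MGF has the closed form $(1 - p + pe^t)^n$:

```python
import math

def chernoff_upper_bound(mgf, a, t_max, steps=10000):
    """Grid-search minimum over t in (0, t_max] of E[e^{tX}]/e^{ta},
    where mgf(t) = E[e^{tX}]."""
    best = float("inf")
    for i in range(1, steps + 1):
        t = t_max * i / steps
        best = min(best, mgf(t) / math.exp(t * a))
    return best

# Example: X ~ Binomial(n, p), whose MGF is (1 - p + p*e^t)^n.
n, p, a = 100, 0.5, 75
bound = chernoff_upper_bound(lambda t: (1 - p + p * math.exp(t)) ** n, a, t_max=3.0)

# Exact tail probability Pr[X >= 75] for comparison.
exact = sum(math.comb(n, k) for k in range(a, n + 1)) / 2 ** n
```

For this instance the optimal $t$ is $\ln 3 \approx 1.1$, so a grid over $(0, 3]$ suffices; the resulting bound (roughly $2 \times 10^{-6}$) indeed dominates the exact tail probability.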
Moment generating functions
Def: The moment generating function of a random variable $X$ is $M_X(t) = E[e^{tX}]$.
$E[X^n] = M_X^{(n)}(0)$, the $n$th derivative of $M_X(t)$ evaluated at $t = 0$.
Fact: If $M_X(t) = M_Y(t)$ for all $t \in (-c, c)$ for some $c > 0$, then $X$ and $Y$ have the same distribution.
If $X$ and $Y$ are independent random variables, then $M_{X+Y}(t) = M_X(t)\,M_Y(t)$.
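The fact $E[X^n] = M_X^{(n)}(0)$ can be checked numerically. Below is a small Python sketch (function names are mine) that estimates derivatives of the Bernoulli MGF $M_X(t) = 1 - p + pe^t$ at $0$ with central finite differences; every moment of a Bernoulli($p$) variable equals $p$, since $X^n = X$:

```python
import math

def mgf_bernoulli(p, t):
    # MGF of a Bernoulli(p) random variable: E[e^{tX}] = (1-p) + p e^t
    return 1 - p + p * math.exp(t)

def nth_derivative_at_zero(f, n, h=1e-3):
    """Central finite-difference estimate of f^(n)(0):
    sum_k (-1)^k C(n,k) f((n/2 - k) h) / h^n."""
    total = 0.0
    for k in range(n + 1):
        total += (-1) ** k * math.comb(n, k) * f((n / 2 - k) * h)
    return total / h ** n

p = 0.3
m1 = nth_derivative_at_zero(lambda t: mgf_bernoulli(p, t), 1)  # estimates E[X] = p
m2 = nth_derivative_at_zero(lambda t: mgf_bernoulli(p, t), 2)  # estimates E[X^2] = p
```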
Chernoff bounds for the sum of Poisson trials
Poisson trials: a sum of independent 0-1 random variables, which need not be identically distributed. Bernoulli trials: the same, except that all the random variables are identically distributed.
Let $X_1, \dots, X_n$ be mutually independent 0-1 random variables with $\Pr[X_i = 1] = p_i$. Let $X = X_1 + \cdots + X_n$ and $\mu = E[X] = p_1 + \cdots + p_n$. Then
$$M_{X_i}(t) = E[e^{tX_i}] = p_i e^t + (1 - p_i) = 1 + p_i(e^t - 1) \le e^{p_i(e^t - 1)},$$
using the inequality $1 + x \le e^x$.
Chernoff bound for a sum of Poisson trials
$$M_X(t) = \prod_{i=1}^n M_{X_i}(t) \le \prod_{i=1}^n e^{p_i(e^t - 1)} = \exp\Big(\sum_{i=1}^n p_i(e^t - 1)\Big) = e^{(e^t - 1)\mu}.$$
Theorem: Let $X_1, \dots, X_n$ be independent Poisson trials such that $\Pr[X_i = 1] = p_i$. Let $X = \sum_{i=1}^n X_i$ and $\mu = E[X]$. Then,
1. for any $\delta > 0$, $\Pr[X \ge (1+\delta)\mu] < \left( \dfrac{e^\delta}{(1+\delta)^{1+\delta}} \right)^\mu$;
2. for $0 < \delta \le 1$, $\Pr[X \ge (1+\delta)\mu] \le e^{-\mu\delta^2/3}$;
3. for $R \ge 6\mu$, $\Pr[X \ge R] \le 2^{-R}$.
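To see how these bounds compare in practice, the following Python sketch (helper name is mine) evaluates bounds (1) and (2) for identical trials with $p_i = 0.1$ and $n = 100$ (so $\mu = 10$) at $\delta = 1$, together with the exact binomial tail:

```python
import math

def exact_upper_tail(n, p, k):
    """Pr[X >= k] for X ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

n, p = 100, 0.1
mu = n * p          # 10
delta = 1.0
bound1 = (math.e**delta / (1 + delta)**(1 + delta)) ** mu   # bound (1)
bound2 = math.exp(-mu * delta**2 / 3)                       # bound (2)
exact = exact_upper_tail(n, p, math.ceil((1 + delta) * mu))
# bound (1) is tighter than bound (2), and both dominate the exact tail
```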
Proof: By Markov's inequality, for any $t > 0$ we have
$$\Pr[X \ge (1+\delta)\mu] = \Pr[e^{tX} \ge e^{t(1+\delta)\mu}] \le \frac{E[e^{tX}]}{e^{t(1+\delta)\mu}} \le \frac{e^{(e^t - 1)\mu}}{e^{t(1+\delta)\mu}}.$$
For any $\delta > 0$, set $t = \ln(1+\delta) > 0$ to get (1).
To obtain (2), it suffices to show that for $0 < \delta \le 1$, $\frac{e^\delta}{(1+\delta)^{1+\delta}} \le e^{-\delta^2/3}$. Taking the logarithm of both sides, this becomes $\delta - (1+\delta)\ln(1+\delta) + \delta^2/3 \le 0$, which can be proved with calculus.
To prove (3), let $R = (1+\delta)\mu$. Then for $R \ge 6\mu$, $\delta = R/\mu - 1 \ge 5$. Hence, using (1),
$$\Pr[X \ge (1+\delta)\mu] \le \left( \frac{e^\delta}{(1+\delta)^{1+\delta}} \right)^\mu \le \left( \frac{e}{1+\delta} \right)^{(1+\delta)\mu} \le \left( \frac{e}{6} \right)^R \le 2^{-R}.$$
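The calculus step in (2), that $f(\delta) = \delta - (1+\delta)\ln(1+\delta) + \delta^2/3 \le 0$ on $(0, 1]$, can be sanity-checked numerically, e.g. with this Python fragment:

```python
import math

def f(d):
    # f(d) = d - (1+d) ln(1+d) + d^2/3; the claim is f(d) <= 0 on (0, 1]
    return d - (1 + d) * math.log(1 + d) + d * d / 3

# maximum of f over a fine grid of (0, 1]
worst = max(f(i / 1000) for i in range(1, 1001))
```

A grid check is not a proof, but it agrees with the calculus argument: the maximum over the grid is strictly negative.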
Similarly, we have
Theorem: Let $X_1, \dots, X_n$ be independent Poisson trials such that $\Pr[X_i = 1] = p_i$. Let $X = \sum_{i=1}^n X_i$ and $\mu = E[X]$. Then, for $0 < \delta < 1$:
1. $\Pr[X \le (1-\delta)\mu] \le \left( \dfrac{e^{-\delta}}{(1-\delta)^{1-\delta}} \right)^\mu$;
2. $\Pr[X \le (1-\delta)\mu] \le e^{-\mu\delta^2/2}$.
Example: Let X be the number of heads in n independent fair coin flips
Combining bound (2) of the two theorems above gives a two-sided corollary: for $0 < \delta < 1$, $\Pr[|X - \mu| \ge \delta\mu] \le 2e^{-\mu\delta^2/3}$. Here $\mu = n/2$. Applying this corollary, we have:
$$\Pr\left[\left|X - \frac{n}{2}\right| \ge \frac{1}{2}\sqrt{6n\ln n}\right] \le 2\exp\left(-\frac{1}{3}\cdot\frac{n}{2}\cdot\frac{6\ln n}{n}\right) = \frac{2}{n},$$
$$\Pr\left[\left|X - \frac{n}{2}\right| \ge \frac{n}{4}\right] \le 2\exp\left(-\frac{1}{3}\cdot\frac{n}{2}\cdot\frac{1}{4}\right) = 2e^{-n/24}.$$
By Chebyshev's inequality, i.e. $\Pr[|X - E[X]| \ge a] \le \mathrm{Var}[X]/a^2$, we only get $\Pr[|X - n/2| \ge n/4] \le 4/n$.
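A quick Monte Carlo check of the second inequality (Python; the parameters and seed are arbitrary choices of mine): for small $n$ the Chernoff bound $2e^{-n/24}$ is actually weaker than Chebyshev's $4/n$, but the empirical frequency is far below both.

```python
import math
import random

random.seed(0)
n, trials = 20, 20000
hits = 0
for _ in range(trials):
    x = sum(random.getrandbits(1) for _ in range(n))  # heads in n fair flips
    if abs(x - n / 2) >= n / 4:
        hits += 1
empirical = hits / trials
chernoff = 2 * math.exp(-n / 24)   # from Pr[|X - n/2| >= n/4] <= 2 e^{-n/24}
chebyshev = 4 / n                  # from Var[X] / (n/4)^2 = 4/n
```

The exponential Chernoff bound overtakes Chebyshev's polynomial bound once $n$ grows: at $n = 200$, $2e^{-200/24} \approx 5 \times 10^{-4}$ versus $4/200 = 0.02$.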
Application: Estimating a parameter
Given a DNA sample, a lab test can determine whether it carries a particular mutation. The test is expensive, so we would like to obtain a reliable estimate from a small number of samples. Let $p$ be the unknown parameter we wish to estimate. Assume we have $n$ samples, and $X = \tilde{p}n$ of them carry the mutation. For a sufficiently large number of samples, we expect $p$ to be close to $\tilde{p}$.
Def: A $1 - \gamma$ confidence interval for a parameter $p$ is an interval $[\tilde{p} - \delta, \tilde{p} + \delta]$ such that $\Pr[p \in [\tilde{p} - \delta, \tilde{p} + \delta]] \ge 1 - \gamma$.
Among the n samples, we find $X = \tilde{p}n$ mutation samples
Among the $n$ samples, we find $X = \tilde{p}n$ mutation samples. We need to find $\delta$ and $\gamma$ for which
$$\Pr[p \in [\tilde{p} - \delta, \tilde{p} + \delta]] = \Pr[np \in [n(\tilde{p} - \delta), n(\tilde{p} + \delta)]] \ge 1 - \gamma.$$
$X = n\tilde{p}$ has a binomial distribution with parameters $n$ and $p$, so $E[X] = np$. If $p \notin [\tilde{p} - \delta, \tilde{p} + \delta]$, then one of the following events holds:
(1) if $p < \tilde{p} - \delta$, then $X = n\tilde{p} > n(p + \delta) = E[X](1 + \delta/p)$;
(2) if $p > \tilde{p} + \delta$, then $X = n\tilde{p} < n(p - \delta) = E[X](1 - \delta/p)$.
Thus
$$\Pr[p \notin [\tilde{p} - \delta, \tilde{p} + \delta]] = \Pr[X < np(1 - \tfrac{\delta}{p})] + \Pr[X > np(1 + \tfrac{\delta}{p})] \le e^{-np(\delta/p)^2/2} + e^{-np(\delta/p)^2/3} = e^{-n\delta^2/(2p)} + e^{-n\delta^2/(3p)}.$$
Setting $\gamma = e^{-n\delta^2/(2p)} + e^{-n\delta^2/(3p)}$, we have a tradeoff between $\delta$, $n$, and $\gamma$.
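For instance, since both exponential terms grow with $p$, taking the worst case $p \le 1$ gives a bound valid for any unknown $p$, and one can solve numerically for the smallest $n$ achieving a target $\gamma$ at a given half-width $\delta$ (Python sketch; the function name and the worst-case choice $p = 1$ are my own):

```python
import math

def samples_needed(delta, gamma, p_max=1.0):
    """Smallest n with e^(-n d^2/(2p)) + e^(-n d^2/(3p)) <= gamma.
    Both terms increase with p, so p_max = 1 is the worst case
    when p is unknown."""
    n = 1
    while (math.exp(-n * delta ** 2 / (2 * p_max)) +
           math.exp(-n * delta ** 2 / (3 * p_max))) > gamma:
        n += 1
    return n

# e.g. a 95% confidence interval of half-width 0.05
n_needed = samples_needed(delta=0.05, gamma=0.05)
```

Under these (arbitrary) settings the search gives roughly 3800 samples; a larger assumed $p_{\max}$ or smaller $\delta$ increases the requirement quadratically in $1/\delta$.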
Better bound for special cases
Theorem: Let $X = \sum_{i=1}^n X_i$, where $X_1, \dots, X_n$ are $n$ independent random variables with $\Pr[X_i = 1] = \Pr[X_i = -1] = 1/2$. Then for any $a > 0$, $\Pr[X \ge a] \le e^{-a^2/(2n)}$.
Pf: For any $t > 0$,
$$E[e^{tX_i}] = \frac{e^t + e^{-t}}{2}.$$
Using the Taylor series $e^t = 1 + t + \frac{t^2}{2!} + \cdots + \frac{t^i}{i!} + \cdots$ and $e^{-t} = 1 - t + \frac{(-t)^2}{2!} + \cdots + \frac{(-t)^i}{i!} + \cdots$, we get
$$E[e^{tX_i}] = \sum_{i \ge 0} \frac{t^{2i}}{(2i)!} \le \sum_{i \ge 0} \frac{(t^2/2)^i}{i!} = e^{t^2/2}.$$
Thus $E[e^{tX}] = \prod_{i=1}^n E[e^{tX_i}] \le e^{t^2 n/2}$, and
$$\Pr[X \ge a] = \Pr[e^{tX} \ge e^{ta}] \le E[e^{tX}]/e^{ta} \le e^{t^2 n/2}/e^{ta}.$$
Setting $t = a/n$, we have $\Pr[X \ge a] \le e^{-a^2/(2n)}$. By symmetry, $\Pr[X \le -a] \le e^{-a^2/(2n)}$.
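The bound $e^{-a^2/(2n)}$ is easy to compare with the exact tail: writing $X = 2H - n$, where $H \sim \mathrm{Binomial}(n, 1/2)$ counts the $+1$s, we have $\Pr[X \ge a] = \Pr[H \ge (n+a)/2]$. A Python sketch (names mine):

```python
import math

def tail_pm1(n, a):
    """Exact Pr[X >= a], where X is a sum of n independent fair +/-1 signs.
    X = 2H - n >= a  iff  H >= (n + a)/2, with H ~ Binomial(n, 1/2)."""
    k_min = math.ceil((n + a) / 2)
    return sum(math.comb(n, k) for k in range(k_min, n + 1)) / 2 ** n

n, a = 100, 30
exact = tail_pm1(n, a)
bound = math.exp(-a * a / (2 * n))  # e^{-a^2/(2n)}
```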
Better bound for special cases
Corollary: Let $X = \sum_{i=1}^n X_i$, where $X_1, \dots, X_n$ are $n$ independent random variables with $\Pr[X_i = 1] = \Pr[X_i = -1] = 1/2$. Then for $a > 0$, $\Pr[|X| \ge a] \le 2e^{-a^2/(2n)}$.
Letting $Y_i = (X_i + 1)/2$, we have:
Corollary: Let $Y = \sum_{i=1}^n Y_i$, where $Y_1, \dots, Y_n$ are $n$ independent random variables with $\Pr[Y_i = 1] = \Pr[Y_i = 0] = 1/2$, and let $\mu = E[Y] = n/2$. Then:
1. For any $a > 0$, $\Pr[Y \ge \mu + a] \le e^{-2a^2/n}$.
2. For any $\delta > 0$, $\Pr[Y \ge (1+\delta)\mu] \le e^{-\delta^2\mu}$.
Pf: (1) $Y = \sum_{i=1}^n Y_i = \frac{1}{2}\sum_{i=1}^n X_i + \frac{n}{2} = \frac{X}{2} + \mu$. Thus $\Pr[Y \ge \mu + a] = \Pr[X \ge 2a] \le e^{-(2a)^2/(2n)} = e^{-2a^2/n}$.
(2) Set $a = \delta\mu = \delta n/2$. Thus $\Pr[Y \ge (1+\delta)\mu] = \Pr[X \ge 2\delta\mu] \le e^{-2(\delta\mu)^2/n} = e^{-\delta^2\mu}$.
Better bound for special cases
Corollary: Let $Y = \sum_{i=1}^n Y_i$, where $Y_1, \dots, Y_n$ are $n$ independent random variables with $\Pr[Y_i = 1] = \Pr[Y_i = 0] = 1/2$, and let $\mu = E[Y] = n/2$. Then:
1. For any $0 < a < \mu$, $\Pr[Y \le \mu - a] \le e^{-2a^2/n}$.
2. For any $0 < \delta < 1$, $\Pr[Y \le (1-\delta)\mu] \le e^{-\delta^2\mu}$.
Application (Set balancing): Given an $n \times m$ matrix $A$ with entries in $\{0, 1\}$, let $v$ be an $m$-dimensional vector with entries in $\{-1, 1\}$, and let $c$ be the $n$-dimensional vector such that $Av = c$.
Theorem: For a random vector $v$ whose entries are chosen independently and with equal probability from $\{-1, 1\}$,
$$\Pr\left[\max_i |c_i| \ge \sqrt{4m\ln n}\right] \le \frac{2}{n}.$$
Proof of set balancing
Pf: Let the $i$-th row of $A$ be $a_i = (a_{i,1}, \dots, a_{i,m})$, and suppose there are $k$ 1s in $a_i$. If $k \le \sqrt{4m\ln n}$, then clearly $|a_i \cdot v| \le \sqrt{4m\ln n}$. Suppose $k > \sqrt{4m\ln n}$; then $Z_i = \sum_{j=1}^m a_{i,j}v_j$ is a sum of $k$ independent $\pm 1$ random variables. By the Chernoff bound and the fact that $m \ge k$, we have
$$\Pr\left[|Z_i| > \sqrt{4m\ln n}\right] \le 2e^{-\frac{4m\ln n}{2k}} \le 2e^{-2\ln n} = \frac{2}{n^2}.$$
By the union bound, the probability that the bound fails for any row is at most $n \cdot \frac{2}{n^2} = \frac{2}{n}$.
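The theorem is a statement about a random $v$ for an arbitrary fixed $A$; the following Python simulation (all parameter choices and the seed are mine) estimates how often $\max_i |c_i|$ exceeds $\sqrt{4m \ln n}$ for one random 0-1 matrix:

```python
import math
import random

random.seed(1)
n = m = 50
threshold = math.sqrt(4 * m * math.log(n))

# One fixed random 0-1 matrix A; the randomness in the theorem is over v.
A = [[random.getrandbits(1) for _ in range(m)] for _ in range(n)]

trials, failures = 2000, 0
for _ in range(trials):
    v = [random.choice((-1, 1)) for _ in range(m)]
    c = [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]
    if max(abs(ci) for ci in c) >= threshold:
        failures += 1
rate = failures / trials  # empirical Pr[max_i |c_i| >= sqrt(4 m ln n)]
```

In practice the observed failure rate is far below the $2/n$ guarantee, since the threshold is several standard deviations out for every row.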