1
Probability for Machine Learning
Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya
2
Probabilistic Machine Learning
Not all machine learning models are probabilistic …
… but most of them have probabilistic interpretations
Predictions need to have associated confidence
  Confidence = probability
Arguments for probabilistic approach
  Complete framework for Machine Learning
  Makes assumptions explicit
  Recovers most non-probabilistic models as special cases
  Modular: easily extensible
3
References
“Introduction to Probability Models”, Sheldon Ross
“Introduction to Probability and Statistics for Engineers and Scientists”, Sheldon Ross
“Introduction to Probability”, Dimitri P. Bertsekas, John N. Tsitsiklis
4
Basics
Random experiment $E$, outcome $\omega \in \Omega$, events $F$, sample space $(\Omega, F)$
Probability measure $P: F \to \mathbb{R}$
Axioms of probability, basic laws of probability
Discrete sample space, discrete probability measure
Continuous sample space, continuous probability measure
Conditional probability, multiplicative rule, theorem of total probability, Bayes theorem
Independence: pairwise, mutual, conditional independence
5
Random Variables
$X: \Omega \to \mathbb{R}$
Example:
  Experiment: tossing of two coins
  Random variable: sum of two outcomes
  $\{X = 2\} \equiv \{\omega : \text{sum of scores} = 2\} = \{(1, 1)\}$
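The two-coin example can be enumerated directly. The sketch below assumes fair coins and codes heads as 1 and tails as 0 (the slide does not state the coding, but it is the one under which the event reduces to the single outcome (1, 1)).

```python
from fractions import Fraction
from itertools import product

# Sample space for two coin tosses; heads = 1, tails = 0 (assumed coding).
omega = list(product([0, 1], repeat=2))

def X(outcome):
    """Random variable: the sum of the two scores."""
    return sum(outcome)

def prob(event):
    """Probability of an event (a list of outcomes), assuming fair coins."""
    return Fraction(len(event), len(omega))

# The event {X = 2} is the set of outcomes that X maps to 2.
event_x_is_2 = [w for w in omega if X(w) == 2]
p_x_is_2 = prob(event_x_is_2)
```

Here `event_x_is_2` holds only the outcome `(1, 1)`, so its probability under the uniform measure is 1/4.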
6
Discrete Random Variables
Probability mass function
7
Example distributions: Discrete
Bernoulli: $x \sim \mathrm{Ber}(p),\ x \in \{0, 1\} \equiv p(x) = p^{x} (1-p)^{1-x}$
Binomial: $x \sim \mathrm{Bin}(n, p),\ x \in \{0, \dots, n\} \equiv p(x) = \binom{n}{x} p^{x} (1-p)^{n-x}$
Poisson: $x \sim \mathrm{Poisson}(\lambda),\ x \in \{0, 1, \dots\} \equiv p(x) = e^{-\lambda} \lambda^{x} / x!$
Geometric: $x \sim \mathrm{Geo}(p),\ x \in \{1, 2, \dots\} \equiv p(x) = (1-p)^{x-1} p$
Empirical distribution: Given $D = \{x_1, \dots, x_N\}$, $p_{emp}(A) = \frac{1}{N} \sum_i \delta_{x_i}(A)$, where $\delta_{x_i}(A)$ is the Dirac measure, equal to 1 if $x_i \in A$ and 0 otherwise
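A quick way to sanity-check these mass functions is to confirm that each sums to one over its support. This is a minimal sketch using only the standard library; the function names and parameter values are illustrative choices, and the infinite supports are truncated.

```python
import math

def bernoulli_pmf(x, p):
    return p**x * (1 - p)**(1 - x)

def binomial_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)

def geometric_pmf(x, p):
    # Support starts at 1: x is the trial on which the first success occurs.
    return (1 - p)**(x - 1) * p

def empirical_prob(data, A):
    """p_emp(A): fraction of data points that fall in the set A."""
    return sum(1 for x in data if x in A) / len(data)

# Totals over the support (truncated where the support is infinite).
binomial_total = sum(binomial_pmf(x, 10, 0.3) for x in range(11))
poisson_total = sum(poisson_pmf(x, 2.0) for x in range(100))
geometric_total = sum(geometric_pmf(x, 0.3) for x in range(1, 200))
```

With `n = 1` the binomial mass function reduces to the Bernoulli one, which is a useful consistency check on the exponents.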
8
Continuous Random Variables
Probability density function
9
Example density functions
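Uniform: $x \sim U(a, b) \equiv f(x) = \frac{1}{b-a}$ for $a \le x \le b$
Exponential: $x \sim \mathrm{Exp}(\lambda) \equiv f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$
Standard Normal: $x \sim N(0, 1) \equiv f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^{2}/2}$
Gaussian: $x \sim N(\mu, \sigma^{2}) \equiv f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^{2}/2\sigma^{2}}$
Laplace: $x \sim \mathrm{Lap}(\mu, b) \equiv f(x) = \frac{1}{2b} e^{-|x-\mu|/b}$
Gamma: $x \sim \mathrm{Gam}(\alpha, \beta) \equiv f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}$ for $x > 0$
Beta: $x \sim \mathrm{Beta}(\alpha, \beta) \equiv f(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1} (1-x)^{\beta-1}$ for $0 \le x \le 1$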
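A density must integrate to one, which gives a simple numerical check on the normalizing constants. The sketch below uses a plain midpoint rule; the helper `integrate`, the parameter values, and the truncation of the Gaussian's range to ten standard deviations are all assumptions made for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)

def beta_pdf(x, a, b):
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x**(a - 1) * (1 - x)**(b - 1)

def integrate(f, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

# Gaussian with mu = 1, sigma = 2, integrated over mu +/- 10 sigma.
gaussian_area = integrate(lambda x: gaussian_pdf(x, 1.0, 2.0), -19.0, 21.0)
# Beta(2, 3) integrated over its full support [0, 1].
beta_area = integrate(lambda x: beta_pdf(x, 2.0, 3.0), 0.0, 1.0)
```

Both areas should come out at 1 up to numerical error.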
10
Random Variables
Cumulative distribution function
11
Moments
Mean
  $E[X] = \int x f(x)\, dx$
Variance
  $Var[X] = E[(X - E[X])^{2}]$
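As an illustration of the two definitions, the mean and variance of an exponential density can be approximated by numerical integration. This is a sketch under stated assumptions: the rate `lam = 2.0`, the midpoint-rule helper, and the truncation point `upper` are all illustrative choices.

```python
import math

def integrate(f, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

lam = 2.0  # assumed rate parameter

def pdf(x):
    return lam * math.exp(-lam * x)

upper = 30.0  # truncation point; the tail mass beyond it is exp(-60)

# E[X] = integral of x f(x) dx
mean = integrate(lambda x: x * pdf(x), 0.0, upper)
# Var[X] = E[(X - E[X])^2]
variance = integrate(lambda x: (x - mean) ** 2 * pdf(x), 0.0, upper)
```

For an exponential with rate λ the closed forms are 1/λ and 1/λ², so the two results should be close to 0.5 and 0.25.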
12
Random Vectors and Joint Distributions
Discrete Random Vector
  Joint pmf
Continuous Random Vector
  Joint pdf
13
Example multi-variate distributions
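Multivariate Gaussian
  $x \sim N(\mu, \Sigma) \equiv f(x) = (2\pi)^{-k/2}\, |\Sigma|^{-1/2} \exp\left(-\tfrac{1}{2} (x-\mu)^{T} \Sigma^{-1} (x-\mu)\right)$
Multinomial
  $x \sim \mathrm{Mult}(p_1, \dots, p_k) \equiv f(x_1, \dots, x_k) = \frac{n!}{x_1! \dots x_k!}\, p_1^{x_1} \dots p_k^{x_k}$, where $n = \sum_i x_i$
Dirichlet
  $x \sim \mathrm{Dir}(\alpha_1, \dots, \alpha_k) \equiv f(x_1, \dots, x_k) = \frac{\Gamma(\sum_i \alpha_i)}{\prod_i \Gamma(\alpha_i)} \prod_i x_i^{\alpha_i - 1}$, where $x_i \ge 0$ and $\sum_i x_i = 1$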
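Two of these formulas are easy to check numerically. The sketch below evaluates the multinomial mass function and a two-dimensional Gaussian density; it is restricted to $k = 2$ for the Gaussian so the inverse and determinant can be written in closed form, and all names and parameter values are illustrative.

```python
import math
from itertools import product

def multinomial_pmf(counts, probs):
    """n! / (x_1! ... x_k!) * p_1^x_1 ... p_k^x_k with n = sum of counts."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)
    return coef * math.prod(p**c for p, c in zip(probs, counts))

def gaussian_2d_pdf(x, mu, cov):
    """Bivariate Gaussian density for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form (x - mu)^T cov^{-1} (x - mu), using the 2x2 inverse.
    quad = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

# The multinomial pmf should sum to one over all count vectors with n = 4.
probs = (0.2, 0.3, 0.5)
multinomial_total = sum(
    multinomial_pmf(c, probs)
    for c in product(range(5), repeat=3)
    if sum(c) == 4
)
```

With a diagonal covariance matrix the bivariate density factorizes into a product of two univariate Gaussians, which is a convenient check on the determinant and exponent terms.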
14
Random Vectors and Joint Distributions
Given $f(x_1, \dots, x_k)$,
Marginal distributions
  $f_{X_1}(x_1) = \int_{x_2} \int_{x_3} \dots f(x_1, \dots, x_k)\, dx_2\, dx_3 \dots$
Expectation
  $E[X] = \int_{x_1} \int_{x_2} \dots (x_1, \dots, x_k)\, f(x_1, \dots, x_k)\, dx_1\, dx_2 \dots$
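For a discrete random vector the integrals become sums, which makes marginalization easy to illustrate. The joint pmf below is a hypothetical table chosen only for the example.

```python
from fractions import Fraction

# Hypothetical joint pmf over (x1, x2); the values are illustrative only.
joint = {
    (0, 0): Fraction(1, 8),
    (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8),
    (1, 1): Fraction(2, 8),
}

def marginal(joint, axis):
    """Sum the joint pmf over every coordinate except `axis`."""
    out = {}
    for xs, p in joint.items():
        out[xs[axis]] = out.get(xs[axis], Fraction(0)) + p
    return out

def expectation(joint):
    """Componentwise expectation of the random vector."""
    k = len(next(iter(joint)))
    return tuple(sum(xs[i] * p for xs, p in joint.items()) for i in range(k))

p_x1 = marginal(joint, 0)
p_x2 = marginal(joint, 1)
mean_vector = expectation(joint)
```

Each component of `mean_vector` equals the expectation computed from the corresponding marginal alone.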
15
Conditional Probability
Conditional pmf
Conditional pdf
  Given $f_{X_1 X_2}(x_1, x_2)$,
  $f_{X_1 | X_2}(x_1 | x_2) = f_{X_1 X_2}(x_1, x_2) / f_{X_2}(x_2)$
Multiplication Rule
Bayes rule
  $f_{X_1 | X_2}(x_1 | x_2) = \frac{f_{X_2 | X_1}(x_2 | x_1)\, f_{X_1}(x_1)}{\int_{x_1} f_{X_2 | X_1}(x_2 | x_1)\, f_{X_1}(x_1)\, dx_1}$
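Bayes rule is easiest to see in the discrete case, where the integral in the denominator is a sum. The sketch below uses a hypothetical binary setup: $X_1$ is a hidden state with a 1% prior, and $X_2$ is a noisy observation that is correct 95% of the time. All numbers are made up for illustration.

```python
from fractions import Fraction

# Hypothetical prior f_{X1}(x1).
prior = {0: Fraction(99, 100), 1: Fraction(1, 100)}

# Hypothetical likelihood f_{X2|X1}(x2 | x1), indexed as likelihood[x1][x2].
likelihood = {
    0: {0: Fraction(19, 20), 1: Fraction(1, 20)},
    1: {0: Fraction(1, 20), 1: Fraction(19, 20)},
}

def posterior(x2):
    """f_{X1|X2}(. | x2) via Bayes rule."""
    # Numerator: likelihood times prior, for each value of x1.
    joint = {x1: likelihood[x1][x2] * prior[x1] for x1 in prior}
    # Denominator: total probability of the observation x2.
    evidence = sum(joint.values())
    return {x1: p / evidence for x1, p in joint.items()}

post = posterior(1)
```

Despite the accurate observation, the posterior probability of state 1 stays well below one half because the prior is so small.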
16
Conditional Probability
Given $f_{X_1 X_2}(x_1, x_2)$,
Conditional Expectation
  $E[X_1 | x_2] = \int x_1\, f_{X_1 | X_2}(x_1 | x_2)\, dx_1$
Law of Total Expectation
  $E[X_1] = \int E[X_1 | x_2]\, f_{X_2}(x_2)\, dx_2$
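The law of total expectation can be verified on a small discrete joint distribution, with sums in place of integrals. The joint pmf below is hypothetical: $X_2$ selects between two uniform distributions for $X_1$.

```python
from fractions import Fraction

# Hypothetical joint pmf over (x1, x2):
# P(X2 = 0) = 1/4 with X1 uniform on {1, 2};
# P(X2 = 1) = 3/4 with X1 uniform on {1, 2, 3, 4}.
joint = {
    (1, 0): Fraction(1, 8), (2, 0): Fraction(1, 8),
    (1, 1): Fraction(3, 16), (2, 1): Fraction(3, 16),
    (3, 1): Fraction(3, 16), (4, 1): Fraction(3, 16),
}

def p_x2(x2):
    """Marginal f_{X2}(x2)."""
    return sum(p for (_, b), p in joint.items() if b == x2)

def cond_expectation(x2):
    """E[X1 | x2]: expectation of X1 under the conditional pmf."""
    return sum(a * p for (a, b), p in joint.items() if b == x2) / p_x2(x2)

# Direct expectation of X1 from the joint.
direct = sum(a * p for (a, _), p in joint.items())
# Expectation via conditioning on X2 and averaging over its marginal.
via_total = sum(cond_expectation(x2) * p_x2(x2) for x2 in (0, 1))
```

Both routes give the same value, as the law requires.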
17
Independence and Conditional Independence
Independence
  $f_{X_1 X_2}(x_1, x_2) = f_{X_1}(x_1)\, f_{X_2}(x_2)$
Conditional Independence
  $f_{X_1 X_2 | X_3}(x_1, x_2 | x_3) = f_{X_1 | X_3}(x_1 | x_3)\, f_{X_2 | X_3}(x_2 | x_3)$
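The two notions are distinct: variables can be conditionally independent without being independent. The sketch below builds a hypothetical example in which $X_1$ and $X_2$ are two noisy copies of a fair bit $X_3$, then tests both factorizations on the resulting joint pmf.

```python
from fractions import Fraction
from itertools import product

def flip(x, x3):
    """P(X_i = x | X_3 = x3): each copy agrees with x3 with probability 3/4."""
    return Fraction(3, 4) if x == x3 else Fraction(1, 4)

# Joint pmf p(x1, x2, x3) = p(x3) p(x1 | x3) p(x2 | x3), with X3 a fair bit.
joint = {
    (x1, x2, x3): Fraction(1, 2) * flip(x1, x3) * flip(x2, x3)
    for x1, x2, x3 in product([0, 1], repeat=3)
}

def marginal(keep):
    """Sum the joint over all coordinates not listed in `keep`."""
    out = {}
    for xs, p in joint.items():
        key = tuple(xs[i] for i in keep)
        out[key] = out.get(key, 0) + p
    return out

p12, p1, p2 = marginal([0, 1]), marginal([0]), marginal([1])
p13, p23, p3 = marginal([0, 2]), marginal([1, 2]), marginal([2])

# Independence: p(x1, x2) = p(x1) p(x2) for every pair of values.
independent = all(
    p12[(a, b)] == p1[(a,)] * p2[(b,)]
    for a, b in product([0, 1], repeat=2)
)

# Conditional independence: p(x1, x2 | x3) = p(x1 | x3) p(x2 | x3).
cond_independent = all(
    joint[(a, b, c)] / p3[(c,)]
    == (p13[(a, c)] / p3[(c,)]) * (p23[(b, c)] / p3[(c,)])
    for a, b, c in product([0, 1], repeat=3)
)
```

In this construction `cond_independent` holds while `independent` does not, because both copies carry information about the shared bit.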
18
Covariance
Covariance
  $Cov[X, Y] = E[(X - E[X])(Y - E[Y])]$
Correlation coefficient
  $\rho_{X,Y} = Cov[X, Y] / \sqrt{Var(X)}\sqrt{Var(Y)}$
Covariance matrix for a random vector $X$
  $Cov[X] = E[(X - E[X])(X - E[X])^{T}]$
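These quantities can be computed for a small data set by taking expectations under the empirical distribution, which means dividing by $N$ rather than $N - 1$. The data values below are made up for illustration.

```python
import math

def mean_vector(rows):
    n, k = len(rows), len(rows[0])
    return [sum(r[i] for r in rows) / n for i in range(k)]

def covariance_matrix(rows):
    """E[(X - E[X])(X - E[X])^T] under the empirical distribution of rows."""
    n, k = len(rows), len(rows[0])
    mu = mean_vector(rows)
    return [
        [sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / n for j in range(k)]
        for i in range(k)
    ]

def correlation(cov, i, j):
    """Correlation coefficient from entries of the covariance matrix."""
    return cov[i][j] / (math.sqrt(cov[i][i]) * math.sqrt(cov[j][j]))

# Hypothetical observations of a 2-dimensional random vector (x, y).
data = [(1.0, 1.0), (2.0, 3.0), (3.0, 2.0), (4.0, 5.0)]
cov = covariance_matrix(data)
rho = correlation(cov, 0, 1)
```

The diagonal of `cov` holds the variances, the off-diagonal entries hold the covariance, and `rho` always lies in $[-1, 1]$.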
19
Central Limit Theorem
$N$ i.i.d. random variables $X_i$ with mean $\mu$, variance $\sigma^{2}$
$S_N = \sum_i X_i$
$Z_N = \frac{S_N - N\mu}{\sigma \sqrt{N}}$
As $N$ increases, the distribution of $Z_N$ approaches the standard normal distribution
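A small simulation illustrates the statement. The sketch below assumes $X_i \sim U(0, 1)$, so $\mu = 1/2$ and $\sigma^{2} = 1/12$; the choice of distribution, $N = 30$, the number of trials, and the seed are all arbitrary.

```python
import math
import random

def standardized_sum(rng, n):
    """Draw one Z_N from N = n i.i.d. Uniform(0, 1) variables."""
    mu, sigma = 0.5, math.sqrt(1.0 / 12.0)
    s_n = sum(rng.random() for _ in range(n))
    return (s_n - n * mu) / (sigma * math.sqrt(n))

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [standardized_sum(rng, 30) for _ in range(5000)]

# Summary statistics of the simulated Z_N values.
z_mean = sum(samples) / len(samples)
z_var = sum((z - z_mean) ** 2 for z in samples) / len(samples)
# Fraction of draws within +/- 1.96; about 0.95 for a standard normal.
frac_within = sum(1 for z in samples if abs(z) <= 1.96) / len(samples)
```

If the approximation is good, `z_mean` is near 0, `z_var` is near 1, and `frac_within` is near 0.95.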
20
Notions from Information Theory
Entropy
  $H(X) = -\sum_k P(X = k) \log_2 P(X = k)$
KL divergence
  $KL(p \| q) = \sum_k p_k \log \frac{p_k}{q_k}$
Mutual Information
  $I(X, Y) = KL(p(X, Y) \| p(X)\,p(Y)) = \sum_{x,y} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)}$
Point-wise Mutual Information
  $PMI(x, y) = \log \frac{p(x, y)}{p(x)\,p(y)}$
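The four quantities translate directly into code for discrete distributions stored as dictionaries. This sketch uses base-2 logarithms throughout (an assumption; the slide fixes the base only for entropy) and assumes $q_k > 0$ wherever $p_k > 0$.

```python
import math

def entropy(p):
    """H(X) in bits; terms with zero probability contribute nothing."""
    return -sum(pk * math.log2(pk) for pk in p.values() if pk > 0)

def kl_divergence(p, q):
    """KL(p || q) in bits; assumes q[k] > 0 wherever p[k] > 0."""
    return sum(pk * math.log2(pk / q[k]) for k, pk in p.items() if pk > 0)

def marginals(joint):
    """Marginal pmfs of X and Y from a joint pmf keyed by (x, y)."""
    px, py = {}, {}
    for (x, y), pxy in joint.items():
        px[x] = px.get(x, 0.0) + pxy
        py[y] = py.get(y, 0.0) + pxy
    return px, py

def mutual_information(joint):
    """I(X, Y) as the KL divergence between the joint and the product of marginals."""
    px, py = marginals(joint)
    product = {(x, y): px[x] * py[y] for x in px for y in py}
    return kl_divergence(joint, product)

def pmi(joint, x, y):
    """Point-wise mutual information of one pair (x, y) with p(x, y) > 0."""
    px, py = marginals(joint)
    return math.log2(joint[(x, y)] / (px[x] * py[y]))

# Two hypothetical joints over a pair of bits.
copies = {(0, 0): 0.5, (1, 1): 0.5}                      # Y is a copy of X
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

h_fair_coin = entropy({0: 0.5, 1: 0.5})
mi_copies = mutual_information(copies)
mi_independent = mutual_information(independent)
```

A fair bit has one bit of entropy; copying it shares that whole bit, while independent bits share none.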
21
Jensen’s Inequality
For a convex function $f()$ and a random variable $X$
  $f(E[X]) \le E[f(X)]$
Equality holds if $f(x)$ is linear
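The inequality can be checked exactly on a small discrete distribution. The sketch below uses the convex function $f(x) = x^{2}$ and, for the equality case, a linear function; the distribution and both functions are illustrative choices.

```python
from fractions import Fraction

# Hypothetical discrete distribution: value -> probability.
pmf = {1: Fraction(1, 2), 2: Fraction(1, 3), 6: Fraction(1, 6)}

def expect(g):
    """E[g(X)] under pmf."""
    return sum(g(x) * p for x, p in pmf.items())

def square(x):
    return x * x            # convex

def linear(x):
    return 3 * x + 1        # linear, so Jensen holds with equality

mean = expect(lambda x: x)

# Left and right sides of f(E[X]) <= E[f(X)] for the convex f.
lhs_convex, rhs_convex = square(mean), expect(square)
# The same two sides for the linear f.
lhs_linear, rhs_linear = linear(mean), expect(linear)
```

For $f(x) = x^{2}$ the gap between the two sides is exactly the variance of $X$.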