Maximum Likelihood Estimate


Maximum Likelihood Estimate Jyh-Shing Roger Jang (張智星) CSIE Dept, National Taiwan University

Intro. to Maximum Likelihood Estimate
- MLE: maximum likelihood estimate
- Goal: Given a dataset with no labels, how can we find the best statistical model, with the optimum parameters, to describe the data?
- Applications: prediction, analysis

What Are Statistical Models?
- Statistical models describe the probabilities of random variables
  - Discrete variables → probability functions
  - Continuous variables → probability density functions (PDF)
- Examples
  - Discrete variables: the outcome of tossing a coin or a die
  - Continuous variables:
    - The distance to the bull's eye when throwing a dart
    - The time needed to run a 100-m dash
    - The heights of second-grade students
- Personalized PDF!

More about Models
- Discrete variables
  - Outcome of tossing a coin → Pr{head}=1/2, Pr{tail}=1/2
- Continuous variables
  - Distance to the bull's eye when throwing a dart → a PDF of a Gaussian (normal) distribution
- Quiz! Probability of x in [4, 6]
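For the quiz on this slide, the probability that a Gaussian variable falls in [4, 6] is the difference of the cumulative distribution function at the two endpoints. The slide does not give the mean or standard deviation, so the values below (mean 5, standard deviation 1) are assumptions for illustration only, and the function names are hypothetical.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def prob_in_interval(a, b, mu, sigma):
    """Probability that x lies in [a, b] under the Gaussian PDF."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)


# Assumed parameters; the slide does not specify them.
mu, sigma = 5.0, 1.0
p = prob_in_interval(4.0, 6.0, mu, sigma)
print(round(p, 4))  # about 0.6827 (one standard deviation on each side)
```

With other parameters the same two-line computation applies; only `mu` and `sigma` change.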

Basic Steps in MLE
- Steps
  1. Perform a certain experiment to collect the data.
  2. Choose a parametric model of the data, with certain modifiable parameters.
  3. Formulate the likelihood as an objective function to be maximized.
  4. Maximize the objective function and derive the parameters of the model.
- Examples
  - Flip a coin → to find the probabilities of head and tail
  - Throw a dart → to find your PDF of distance to the bull's eye

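Probability Functions for Discrete Variables
- Flip an unfair coin 5 times to get 3 heads and 2 tails
- By intuition: Pr{head}=3/5, Pr{tail}=2/5
- By MLE: assume these 5 tosses are independent events, so the overall probability is the product of the individual probabilities, $p^3(1-p)^2$ with $p$ = Pr{head}, which we maximize over $p$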
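A minimal sketch of the coin example, assuming independent tosses: the likelihood of 3 heads and 2 tails is p^3 (1-p)^2 with p = Pr{head}. The code compares a brute-force grid search with the closed-form answer heads / (heads + tails). The function name and the grid resolution are illustrative choices, not part of the slides.

```python
def coin_likelihood(p, heads, tails):
    """Likelihood of the observed tosses, assuming independence."""
    return p ** heads * (1.0 - p) ** tails


heads, tails = 3, 2

# Brute-force search over a grid of candidate values for Pr{head}.
grid = [i / 1000 for i in range(1001)]
p_grid = max(grid, key=lambda p: coin_likelihood(p, heads, tails))

# Closed-form maximizer, matching the intuition on the slide.
p_closed = heads / (heads + tails)

print(p_grid, p_closed)
```

Both approaches agree on Pr{head} = 3/5, which is the intuitive answer.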

Inequality of Arithmetic and Geometric Means
- AM-GM inequality
- Proof of this inequality: Wikipedia
- How to use the inequality to solve the MLE problem? Quiz!
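One way to answer the quiz, sketched for the coin example (3 heads, 2 tails, with p = Pr{head} and likelihood p^3 (1-p)^2). The splitting of the product into five terms is a choice made here so that the terms sum to a constant; the slides may use a different route.

```latex
% AM-GM: for nonnegative x_1, ..., x_n,
\frac{x_1 + x_2 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 x_2 \cdots x_n},
\quad \text{with equality iff } x_1 = x_2 = \cdots = x_n .

% Apply it to the five terms p/3, p/3, p/3, (1-p)/2, (1-p)/2, whose sum is 1:
\left(\frac{p}{3}\right)^{3} \left(\frac{1-p}{2}\right)^{2}
\le \left(\frac{1}{5}\right)^{5}
\quad\Longrightarrow\quad
p^{3}(1-p)^{2} \le \frac{3^{3}\, 2^{2}}{5^{5}} = \frac{108}{3125} .

% Equality holds when all five terms are equal:
\frac{p}{3} = \frac{1-p}{2}
\quad\Longrightarrow\quad
p = \frac{3}{5} .
```

The maximizer p = 3/5 matches the intuitive estimate of Pr{head}.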

How to Prove AM-GM Inequality?
- Basic inequality
- Proof by induction
- Jensen's inequality

Proof by Induction

Probability Functions for Discrete Variables
- Toss a 3-sided die many times and obtain n1 of side 1, n2 of side 2, and n3 of side 3. What are the most likely probabilities for sides 1, 2, and 3, respectively?
- Our intuition…
- By MLE…
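A small sketch of the die example, assuming independent tosses. Under that assumption the likelihood is p1^n1 * p2^n2 * p3^n3 with p1 + p2 + p3 = 1, and its maximizer is the relative frequency p_i = n_i / (n1 + n2 + n3), which is also the intuitive answer. The counts below are made up for illustration, and the function names are hypothetical.

```python
import math


def die_mle(counts):
    """MLE of the side probabilities: the relative frequencies."""
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one toss")
    return [c / total for c in counts]


def log_likelihood(probs, counts):
    """Log of p1^n1 * p2^n2 * ... (sides with zero count contribute 0)."""
    return sum(c * math.log(p) for p, c in zip(probs, counts) if c > 0)


# Hypothetical counts n1, n2, n3; not taken from the slides.
counts = [5, 3, 2]
p_hat = die_mle(counts)
uniform = [1 / 3, 1 / 3, 1 / 3]

print(p_hat)
print(log_likelihood(p_hat, counts) > log_likelihood(uniform, counts))
```

The comparison against the uniform guess is only a sanity check that the relative frequencies give a higher likelihood for these counts.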

MLE for PDF of Continuous Variables of 1D
- Detailed coverage
- PDF
- Overall PDF, or likelihood
- Log likelihood
- MLE
- Quiz!
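The slide's formulas did not survive in this transcript, so the following is a sketch under the assumption that the model is a 1D Gaussian, as in the dart example. Maximizing the log likelihood of independent samples then gives the sample mean and the variance normalized by n (not n-1). The data values and function names are hypothetical.

```python
import math


def gaussian_mle_1d(samples):
    """MLE of (mean, variance) for a 1D Gaussian; variance divides by n."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var


def gaussian_log_likelihood(samples, mu, var):
    """Sum of log PDF values of independent samples."""
    n = len(samples)
    sq = sum((x - mu) ** 2 for x in samples)
    return -0.5 * n * math.log(2.0 * math.pi * var) - sq / (2.0 * var)


# Hypothetical dart distances; not taken from the slides.
data = [4.8, 5.1, 5.3, 4.9, 5.4]
mu_hat, var_hat = gaussian_mle_1d(data)
print(mu_hat, var_hat)
```

Evaluating `gaussian_log_likelihood` at nearby parameter values is a quick way to check that the estimates sit at the maximum.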

MLE for PDF of Continuous Variables of ND
- Detailed coverage
- PDF
- Overall PDF, or likelihood
- Log likelihood
- MLE
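As with the 1D case, the formulas for this slide are missing from the transcript. Assuming an N-dimensional Gaussian model, the maximum likelihood estimates are the sample mean vector and the covariance matrix normalized by n. The sketch below uses plain Python lists; the 2D data points and the function name are hypothetical.

```python
def gaussian_mle_nd(samples):
    """MLE of (mean vector, covariance matrix); covariance divides by n."""
    n = len(samples)
    d = len(samples[0])
    mean = [sum(x[j] for x in samples) / n for j in range(d)]
    cov = [
        [
            sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in samples) / n
            for j in range(d)
        ]
        for i in range(d)
    ]
    return mean, cov


# Hypothetical 2D data points; not taken from the slides.
points = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
mean_hat, cov_hat = gaussian_mle_nd(points)
print(mean_hat)
print(cov_hat)
```

For larger datasets a numerical library would be the usual choice; lists are used here only to keep the sketch dependency-free.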

Q & A
- Can we choose other probability density functions instead of Gaussian/normal distributions? → Yes!
- What are the other available PDFs?