Maximum Likelihood Estimates and the EM Algorithms I
Henry Horng-Shing Lu, Institute of Statistics, National Chiao Tung University


Part 1: Computation Tools

Computation Tools
R (http://www.r-project.org/): good for statistical computing.
C/C++: good for fast computation and large data sets.
More: e/teachers/hslu/course/statcomp/links.htm

The R Project
R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.
It is similar to the commercial software S-Plus.
C/C++, Fortran and other code can be linked and called at run time.
More: http://www.r-project.org/

Download R from http://www.r-project.org/

Choose a Mirror Site of R

Choose the Operating System

Select the Base of R

Download the Setup Program

Install R: double-click the R icon to install R.

Execute R: an interactive command window appears.

Download Add-on Packages

Choose a Mirror Site: choose a mirror site close to you.

Select One Package to Download: choose one package to download, such as "rgl" or "adimpro".

Load Packages
There are two methods to load packages:
Method 1: Click from the menu bar.
Method 2: Type library(rgl) in the command window.

Help in R (1)
What does the loaded library provide? Use help(rgl).

Help in R (2)
How to search for functions by keyword? help.search("keywords") shows all functions matching the keywords, e.g., help.search("3D plot").

Help in R (3)
How to see the documentation of a function? ?functionname shows the usage, arguments, author, references, related functions and examples, e.g., ?plot3d.

R Operators (1)
Arithmetic operators: +, -, *, /, ^
Modulo: %%
Mathematical functions: sqrt, exp, log, log10, sin, cos, tan, ...

R Operators (2)
Other operators:
:             sequence operator
%*%           matrix multiplication
<, <=, >, >=  inequality comparisons
==, !=        equality comparisons
&, &&, |, ||  and, or
~             formulas
<-, =         assignment

Algebra, Operators and Functions
> 1+2
[1] 3
> 1>2
[1] FALSE
> 1>2 | 2>1
[1] TRUE
> A = 1:3
> A
[1] 1 2 3
> A*6
[1]  6 12 18
> A/10
[1] 0.1 0.2 0.3
> A%%2
[1] 1 0 1
> B = 4:6
> A*B
[1]  4 10 18
> t(A)%*%B
     [,1]
[1,]   32
> A%*%t(B)
     [,1] [,2] [,3]
[1,]    4    5    6
[2,]    8   10   12
[3,]   12   15   18
> sqrt(A)
[1] 1.000000 1.414214 1.732051
> log(A)
[1] 0.0000000 0.6931472 1.0986123
> round(sqrt(A), 2)
[1] 1.00 1.41 1.73
> ceiling(sqrt(A))
[1] 1 2 2
> floor(sqrt(A))
[1] 1 1 1
> eigen(A%*%t(B))
(The eigenvalues are 3.2e+01 and two values that are zero up to floating-point error, about 1e-16; the 3x3 matrix of eigenvectors shown on the slide was lost in this transcript.)

Variable Types
Item            Description
Vectors         X = c(10.4, 5.6, 3.1, 6.4) or Z = array(data_vector, dim_vector)
Matrices        X = matrix(1:8, 2, 4) or Z = matrix(rnorm(30), 5, 6)
Factors         statef = factor(state)
Lists           pts = list(x = cars[,1], y = cars[,2])
Data frames     data.frame(cbind(x = 1, y = 1:10), fac = sample(LETTERS[1:3], 10, repl = TRUE))
Functions       name = function(arg_1, arg_2, ...) expression
Missing values  NA or NaN

Define Your Own Function (1)
Use fix(myfunction)   # an editor window will show up
function(parameter){
  statements;
  return(object);   # if you want to return some values
}
Save the document.
Use myfunction(parameter) in R.

Define Your Own Function (2)
Example: find all the factors of an integer.
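The slide showed this example's code as a screenshot that did not survive in the transcript. A minimal sketch of one possible implementation (the function name find.factors is an assumption, not from the slides):

# Hedged sketch: return all positive factors of a positive integer n
find.factors <- function(n) {
  candidates <- 1:n
  candidates[n %% candidates == 0]   # keep the values that divide n exactly
}
find.factors(12)   # [1]  1  2  3  4  6 12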

Define Your Own Function (3)
When you leave the program, remember to save the workspace for the next session, or the functions you defined will disappear after you close R.
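As a side note (not from the slides), the workspace can also be saved and restored explicitly with the standard R functions save.image() and load(); the file name below is an assumed example:

save.image("mywork.RData")   # save all objects in the current workspace
# ... in a later R session ...
load("mywork.RData")         # restore the saved objects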

Read and Write Files
Write data to a TXT file
Write data to a CSV file
Read TXT and CSV files
Demo

Write Data to a TXT File
Usage: write(x, file, ...)
> X = matrix(1:6, 2, 3)
> X
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> write(t(X), file = "d:/out1.txt", ncolumns = 3)
> write(X, file = "d:/out2.txt", ncolumns = 3)
Resulting files:
d:/out1.txt
1 3 5
2 4 6
d:/out2.txt
1 2 3
4 5 6

Write Data to a CSV File
Usage: write.table(x, file = "foo.csv", ...)
> X = matrix(1:6, 2, 3)
> X
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> write.table(t(X), file = "d:/out1.csv", sep = ",", col.names = FALSE, row.names = FALSE)
> write.table(X, file = "d:/out2.csv", sep = ",", col.names = FALSE, row.names = FALSE)
Resulting files:
d:/out1.csv
1,2
3,4
5,6
d:/out2.csv
1,3,5
2,4,6

Read TXT and CSV Files
Usage: read.table(file, ...)
> X = read.table(file = "d:/out1.txt")
> X
  V1 V2 V3
1  1  3  5
2  2  4  6
> Y = read.table(file = "d:/out1.csv", sep = ",", header = FALSE)
> Y
  V1 V2
1  1  2
2  3  4
3  5  6

Demo (1)
Practice reading a file and doing basic analysis.
> Data = read.table(file = "d:/01.csv", header = TRUE, sep = ",")
> Data
(The data frame has columns Y, X1 and X2 with 7 rows; the numeric values shown on the slide were lost in this transcript.)

Demo (2)
Practice reading a file and doing basic analysis.
> mean(Data$Y)
> boxplot(Data$Y)
> boxplot(Data)
(The printed mean and the boxplots shown on the slide are not reproduced here.)

Part 2: Motivation Examples

Example 1 in Genetics (1)
Two linked loci with alleles A and a, and B and b.
A, B: dominant; a, b: recessive.
A double heterozygote AaBb will produce gametes of four types: AB, Ab, aB, ab.
(The slide's diagram of the two linkage phases of the double heterozygote, each arising with probability 1/2, is not reproduced here.)

Example 1 in Genetics (2)
Probabilities for genotypes in gametes:

          No recombination   Recombination
Male      1-r                r
Female    1-r'               r'

          AB         ab         aB      Ab
Male      (1-r)/2    (1-r)/2    r/2     r/2
Female    (1-r')/2   (1-r')/2   r'/2    r'/2

Example 1 in Genetics (3)
Fisher, R. A. and Balmukand, B. (1928). The estimation of linkage from the offspring of selfed heterozygotes. Journal of Genetics, 20, 79-92.
More: yes/bank/handout12.pdf

Example 1 in Genetics (4)
Offspring genotypes and probabilities (male gametes in columns, female gametes in rows):

FEMALE \ MALE   AB (1-r)/2          ab (1-r)/2          aB r/2           Ab r/2
AB (1-r')/2     AABB (1-r)(1-r')/4  aABb (1-r)(1-r')/4  aABB r(1-r')/4   AABb r(1-r')/4
ab (1-r')/2     AaBb (1-r)(1-r')/4  aabb (1-r)(1-r')/4  aaBb r(1-r')/4   Aabb r(1-r')/4
aB r'/2         AaBB (1-r)r'/4      aabB (1-r)r'/4      aaBB rr'/4       AabB rr'/4
Ab r'/2         AABb (1-r)r'/4      aAbb (1-r)r'/4      aABb rr'/4       AAbb rr'/4

Example 1 in Genetics (5)
Four distinct phenotypes: A*B*, A*b*, a*B* and a*b*.
A*: the dominant phenotype from (Aa, AA, aA); a*: the recessive phenotype from aa.
B*: the dominant phenotype from (Bb, BB, bB); b*: the recessive phenotype from bb.
A*B*: 9 gametic combinations. A*b*: 3 gametic combinations. a*B*: 3 gametic combinations. a*b*: 1 gametic combination. Total: 16 combinations.

Example 1 in Genetics (6)
Let φ = (1-r)(1-r'). Then, summing the table cells by phenotype,
P(A*B*) = (2+φ)/4, P(A*b*) = (1-φ)/4, P(a*B*) = (1-φ)/4, P(a*b*) = φ/4.
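A quick numerical check of these formulas in R (not from the original slides; the recombination fractions r = 0.3 and r' = 0.2 are arbitrary assumed values):

# Verify P(A*B*) = (2+phi)/4, P(A*b*) = P(a*B*) = (1-phi)/4, P(a*b*) = phi/4
r <- 0.3; rp <- 0.2                                   # assumed recombination fractions
male   <- c(AB = (1-r)/2,  ab = (1-r)/2,  aB = r/2,  Ab = r/2)
female <- c(AB = (1-rp)/2, ab = (1-rp)/2, aB = rp/2, Ab = rp/2)
tab <- outer(female, male)                            # 4 x 4 table of gamete-pair probabilities
dimnames(tab) <- list(names(female), names(male))
phi <- (1-r)*(1-rp)
p.ab <- tab["ab","ab"]                                # a*b*: ab gamete from both parents
p.aB <- tab["ab","aB"] + tab["aB","ab"] + tab["aB","aB"]   # aa at locus A, at least one B
p.Ab <- tab["ab","Ab"] + tab["Ab","ab"] + tab["Ab","Ab"]   # bb at locus B, at least one A
p.AB <- 1 - p.aB - p.Ab - p.ab
c(p.AB - (2+phi)/4, p.Ab - (1-phi)/4, p.aB - (1-phi)/4, p.ab - phi/4)   # all (essentially) zero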

Example 1 in Genetics (7)
Hence a random sample of size n from the offspring of selfed heterozygotes follows a multinomial distribution:
(y1, y2, y3, y4) ~ Multinomial(n; (2+φ)/4, (1-φ)/4, (1-φ)/4, φ/4),
where y1, y2, y3, y4 count the phenotypes A*B*, A*b*, a*B*, a*b*.
We know that 0 ≤ r ≤ 1/2 and 0 ≤ r' ≤ 1/2, so 1/2 ≤ 1-r ≤ 1 and 1/2 ≤ 1-r' ≤ 1, and therefore 1/4 ≤ φ = (1-r)(1-r') ≤ 1.

Example 1 in Genetics (8)
Suppose that we observe the data (y1, y2, y3, y4) = (125, 18, 20, 24), a random sample from this multinomial distribution with n = 187. Then the probability mass function is
P(y1, y2, y3, y4 | φ) = [n! / (y1! y2! y3! y4!)] ((2+φ)/4)^y1 ((1-φ)/4)^y2 ((1-φ)/4)^y3 (φ/4)^y4.
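This pmf can be evaluated with R's built-in dmultinom function; a small check (not from the slides) at an arbitrary assumed value φ = 0.5:

# Evaluate the multinomial pmf of the observed data at phi = 0.5 (assumed value)
y   <- c(125, 18, 20, 24)
phi <- 0.5
prob <- c((2+phi)/4, (1-phi)/4, (1-phi)/4, phi/4)
dmultinom(y, size = sum(y), prob = prob)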

Estimation Methods
Frequentist approaches: Method of Moments Estimate (MME); Maximum Likelihood Estimate (MLE).
Bayesian approaches.

Method of Moments Estimate (MME)
Solve the equations obtained by setting the population moments equal to the sample moments:
E(X^k) = (1/n) Σ_{i=1}^{n} X_i^k, for k = 1, 2, ..., t,
where t is the number of parameters to be estimated.
The MME is simple to compute.
Under regularity conditions, the MME is consistent.
More: ments_%28statistics%29
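As an illustration of this general recipe (not from the slides), a minimal R sketch of the MME for Gamma(shape α, rate β) data, matching the first two moments E(X) = α/β and Var(X) = α/β²; the simulated data and the parameter values 2 and 3 are assumptions for the demo:

# MME sketch for Gamma(alpha, beta): match sample mean and variance to
# alpha/beta and alpha/beta^2, then solve for the two parameters.
set.seed(1)
x  <- rgamma(1000, shape = 2, rate = 3)   # assumed example data
m1 <- mean(x)
v  <- mean(x^2) - m1^2                    # second central sample moment
alpha.hat <- m1^2 / v
beta.hat  <- m1 / v
c(alpha.hat, beta.hat)                    # should be close to (2, 3)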

MME for Example 1
Equate each expected phenotype proportion to its observed proportion:
(2+φ)/4 = y1/n, (1-φ)/4 = y2/n, (1-φ)/4 = y3/n, φ/4 = y4/n.
Solving each equation for φ gives 4(y1/n) - 2, 1 - 4(y2/n), 1 - 4(y3/n) and 4(y4/n); average the four values to obtain the MME of φ.
Note: the MME cannot guarantee that the estimate falls in the valid range 1/4 ≤ φ ≤ 1.

MME by R
> MME <- function(y1, y2, y3, y4){
    n = y1+y2+y3+y4;
    phi1 = 4.0*(y1/n-0.5);
    phi2 = 1-4*y2/n;
    phi3 = 1-4*y3/n;
    phi4 = 4.0*y4/n;
    phi = (phi1+phi2+phi3+phi4)/4.0;
    print("By MME method");
    return(phi);   # print(phi);
  }
> MME(125, 18, 20, 24)
[1] "By MME method"
[1] 0.5935829

MME by C/C++
(The slide showed the equivalent C/C++ code as a screenshot; it is not reproduced in this transcript.)

Maximum Likelihood Estimate (MLE)
Likelihood: L(θ) = f(x1, ..., xn; θ), the joint density or probability of the observed data viewed as a function of the parameter θ.
Maximize the likelihood: solve the score equations, obtained by setting the first derivatives of the (log-)likelihood to zero.
Under regularity conditions, the MLE is consistent, asymptotically efficient and asymptotically normal.
More: ihood
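A minimal generic sketch of this recipe in R (not from the slides), for the rate λ of exponential data: the score equation n/λ - Σ x_i = 0 has the closed-form solution λ = 1/mean(x), and a numerical maximization of the log-likelihood should agree; the simulated data are an assumption for the demo.

# MLE sketch for Exponential(rate = lambda): compare the closed-form
# solution of the score equation with a numerical maximization.
set.seed(1)
x <- rexp(500, rate = 2)                       # assumed example data
loglik <- function(lambda) sum(dexp(x, rate = lambda, log = TRUE))
numeric.mle <- optimize(loglik, interval = c(0.01, 10), maximum = TRUE)$maximum
closed.form <- 1/mean(x)                       # root of the score equation
c(numeric.mle, closed.form)                    # nearly identical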

Example 2 (1)
We toss an unfair coin 3 times; the random variable Xi equals 1 if the i-th toss is a head and 0 otherwise. If p is the probability of tossing a head, then each Xi ~ Bernoulli(p), i.e. P(Xi = 1) = p and P(Xi = 0) = 1-p.

Example 2 (2)
The distribution of the number of heads (X1, X2, X3):

# of heads   outcomes (X1, X2, X3)           probability
0            (0,0,0)                         (1-p)^3
1            (1,0,0), (0,1,0), (0,0,1)       3p(1-p)^2
2            (0,1,1), (1,0,1), (1,1,0)       3p^2(1-p)
3            (1,1,1)                         p^3

Example 2 (3)
Suppose we observe 1 head and 2 tails. The likelihood function becomes
L(p) = 3p(1-p)^2.
One way to maximize this likelihood function is to solve the score equation, which sets the first derivative to zero:
dL(p)/dp = 3[(1-p)^2 - 2p(1-p)] = 3(1-p)(1-3p) = 0.

Example 2 (4)
The solutions of the score equation are p = 1/3 and p = 1.
One can check that p = 1/3 is the maximum point. (How?)
Hence, the MLE of p is 1/3 for this example.
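One quick way to answer the "(How?)" above, besides checking that the second derivative is negative at p = 1/3, is to compare the likelihood values numerically; a short R check (not from the slides):

# The likelihood for 1 head in 3 tosses; its maximum over [0, 1] is at p = 1/3
lik <- function(p) 3 * p * (1 - p)^2
lik(c(1/3, 1))                                              # L(1/3) = 4/9 > L(1) = 0
optimize(lik, interval = c(0, 1), maximum = TRUE)$maximum   # about 0.3333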

MLE for Example 1 (1)
Likelihood:
L(φ) ∝ ((2+φ)/4)^y1 ((1-φ)/4)^(y2+y3) (φ/4)^y4.
Log-likelihood (up to a constant):
log L(φ) = y1 log(2+φ) + (y2+y3) log(1-φ) + y4 log φ.
MLE: solve the score equation
d log L(φ)/dφ = y1/(2+φ) - (y2+y3)/(1-φ) + y4/φ = 0.

MLE for Example 1 (2)
Multiplying the score equation by φ(2+φ)(1-φ) and collecting terms gives a quadratic equation in φ,
Aφ^2 + Bφ + C = 0, with A = n = y1+y2+y3+y4, B = 2(y2+y3) + y4 - y1, C = -2y4.
The MLE of φ is the root lying in [1/4, 1], namely (-B + sqrt(B^2 - 4AC)) / (2A).
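A short R check (not from the slides) that evaluates this closed-form root for the observed data; it should agree with the numerical result from optimize() shown later (about 0.578):

# Closed-form MLE of phi from the quadratic A*phi^2 + B*phi + C = 0
y <- c(125, 18, 20, 24)
n <- sum(y)
A <- n
B <- 2*(y[2] + y[3]) + y[4] - y[1]
C <- -2*y[4]
phi.hat <- (-B + sqrt(B^2 - 4*A*C)) / (2*A)
phi.hat                                      # approximately 0.5779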

MLE for Example 1 (3)
Checking:
1. Does the root lie in the valid range 1/4 ≤ φ ≤ 1?
2. Is it a maximum, i.e., is the second derivative of the log-likelihood negative there?
3. Compare the MLE with the MME: are the two estimates close?
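Under the assumption that these are the intended checks (the details on the original slide were lost), a small R sketch that carries them out for the observed data:

# Check the MLE of Example 1: range, second derivative, and comparison with the MME
y <- c(125, 18, 20, 24); n <- sum(y)
phi.mle <- 0.5779                        # from the closed form / optimize()
phi.mme <- 0.5936                        # from MME(125, 18, 20, 24)
phi.mle >= 1/4 && phi.mle <= 1           # TRUE: inside the parameter space
d2 <- -y[1]/(2+phi.mle)^2 - (y[2]+y[3])/(1-phi.mle)^2 - y[4]/phi.mle^2
d2 < 0                                   # TRUE: the log-likelihood is concave here, so a maximum
abs(phi.mle - phi.mme)                   # the two estimates differ by about 0.016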

Use R to find MLE (1)
> # MLE
> y1 = 125; y2 = 18; y3 = 20; y4 = 24
> f <- function(phi){
+   ((2.0+phi)/4.0)^y1 * ((1.0-phi)/4.0)^(y2+y3) * (phi/4.0)^y4
+ }
> plot(f, 1/4, 1, xlab = expression(varphi), ylab = "likelihood function multiplied by a constant")
> optimize(f, interval = c(1/4, 1), maximum = T)
(The printed $maximum is approximately 0.578 and the $objective is on the order of 7e-82; the exact digits were lost in this transcript.)

Use R to find MLE (2)
(The slide showed the resulting plot of the likelihood function of φ over [1/4, 1], produced by the plot(f, ...) command above.)

Use C/C++ to find MLE (1)
(The slide showed the C/C++ code as a screenshot; it is not reproduced in this transcript.)

Use C/C++ to find MLE (2)
(Continuation of the C/C++ code screenshot; not reproduced in this transcript.)

Exercises
Write your own programs for the examples presented in this talk.
Write programs for the examples mentioned at the following web page: ihood
Write programs for other examples that you know.

More Exercises (1)
Example 3 in genetics: the observed data are the four phenotype counts (nA, nB, nO, nAB) of the ABO blood-type locus, a multinomial sample with cell probabilities p^2 + 2pr, q^2 + 2qr, r^2 and 2pq, where the allele frequencies p, q and r fall in [0, 1] such that p + q + r = 1. Find the likelihood function and score equations for p, q and r.

More Exercises (2)
Example 4 in positron emission tomography (PET): the observed data are the detector counts y1, ..., yd, which are independent Poisson random variables with means E(yi) = Σ_j p_ij λ_j. The values of p_ij (the probability that an emission from region j is recorded by detector i) are known, and the unknown parameters are the emission intensities λ1, ..., λB. Find the likelihood function and score equations for λ = (λ1, ..., λB).

More Exercises (3)
Example 5, the normal mixture: the observed data x1, ..., xn are a random sample from the following probability density function:
f(x) = Σ_{k=1}^{K} π_k (2π σ_k^2)^(-1/2) exp(-(x - μ_k)^2 / (2σ_k^2)), with π_k ≥ 0 and Σ_{k=1}^{K} π_k = 1.
Find the likelihood function and score equations for the parameters π_k, μ_k and σ_k^2, k = 1, ..., K.