Maximum Likelihood Estimates and the EM Algorithms I Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University



Part 1: Computation Tools

Computation Tools
• R: good for statistical computing
• C/C++: good for fast computation and large data sets
• More: /teachers/hslu/course/statcomp/links.htm

The R Project
• R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows, and macOS.
• It is similar to the commercial software S-PLUS.
• C/C++, Fortran, and other code can be linked and called at run time.

Download R

Choose One Mirror Site of R

Choose the Operating System

Select the Base of R

Download the Setup Program

Install R: double-click the R icon to install R

Execute R: interactive command window

Download Add-on Packages

Choose a Mirror Site: choose a mirror site close to you

Select One Package to Download: choose one package to download, such as rgl

Load Packages
• There are two methods to load packages:
  Method 1: Click from the menu bar.
  Method 2: Type library(rgl) in the command window.

Help in R (1)
• What is the loaded library?
  help(rgl)

Help in R (2)
• How to search for functions by keyword?
  help.search("key words")
  It shows all functions whose documentation contains the keywords, listing each function name (with the package it belongs to) and a description.
  help.search("3D plot")

Help in R (3)
• How to find the documentation of a function?
  ?function_name
  It shows the usage, arguments, author, references, related functions, and examples.
  ?plot3d

R Operators (1)
• Mathematical operators: +, -, *, /, ^
  Modulo: %%
  Functions: sqrt, exp, log, log10, sin, cos, tan, …

R Operators (2)
• Other operators:
  :                sequence operator
  %*%              matrix algebra
  <, >, <=, >=     inequality
  ==, !=           comparison
  &, &&, |, ||     and, or
  ~                formulas
  <-, =            assignment

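Algebra, Operators and Functions
> 1+2
[1] 3
> 1>2
[1] FALSE
> 1>2 | 2>1
[1] TRUE
> A=1:3
> A
[1] 1 2 3
> A*6
[1]  6 12 18
> A/10
[1] 0.1 0.2 0.3
> A%%2
[1] 1 0 1
> B=4:6
> A*B
[1]  4 10 18
> t(A)%*%B
     [,1]
[1,]   32
> A%*%t(B)
     [,1] [,2] [,3]
[1,]    4    5    6
[2,]    8   10   12
[3,]   12   15   18
> sqrt(A)
[1] 1.000000 1.414214 1.732051
> log(A)
[1] 0.0000000 0.6931472 1.0986123
> round(sqrt(A),2)
[1] 1.00 1.41 1.73
> ceiling(sqrt(A))
[1] 1 2 2
> floor(sqrt(A))
[1] 1 1 1
> eigen(A%*%t(B))
$values: one eigenvalue equal to 32 (printed as 3.20e+01) and two that are zero up to rounding error (on the order of 1e-16)
$vectors: the corresponding 3 x 3 matrix of eigenvectors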

Variable Types
Item             Description
Vectors          X=c(10.4,5.6,3.1,6.4) or Z=array(data_vector, dim_vector)
Matrices         X=matrix(1:8,2,4) or Z=matrix(rnorm(30),5,6)
Factors          Statef=factor(state)
Lists            pts=list(x=cars[,1], y=cars[,2])
Data Frames      data.frame(cbind(x=1, y=1:10), fac=sample(LETTERS[1:3], 10, repl=TRUE))
Functions        name=function(arg_1,arg_2,…) expression
Missing Values   NA or NaN

Define Your Own Function (1)
• Use fix(myfunction); an editor window will open.
• Enter the function definition:
  function (parameter){
    statements;
    return (object);  # if you want to return some values
  }
• Save the document.
• Call myfunction(parameter) in R.

Define Your Own Function (2)
• Example: Find all the factors of an integer.
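
The code for this example did not survive in the transcript. A minimal sketch of one way to write it follows; the function name find_factors and the brute-force approach are assumptions, not the original slide's code.

```r
# Hypothetical helper: return all positive divisors of a positive integer n.
find_factors <- function(n) {
  # Guard against non-integer or non-positive input.
  stopifnot(is.numeric(n), length(n) == 1, n >= 1, n == round(n))
  candidates <- seq_len(n)
  # Keep the candidates that divide n with zero remainder.
  candidates[n %% candidates == 0]
}

find_factors(12)
```

Testing every candidate from 1 to n is slow for large n but keeps the logic easy to read; stopping at sqrt(n) and pairing divisors would be a natural refinement.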

Define Your Own Function (3)
• When you quit R, remember to save the workspace for the next session; otherwise the functions you defined will be lost when R closes.

Read and Write Files
• Write Data to a CSV File
• Write Data to a TXT File
• Read TXT and CSV Files
• Demo

Write Data to a TXT File
• Usage: write(x,file,…)
> X=matrix(1:6,2,3)
> X
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> write(t(X),file="d:/out2.txt",ncolumns=3)
> write(X,file="d:/out3.txt",ncolumns=3)

d:/out2.txt
1 3 5
2 4 6

d:/out3.txt
1 2 3
4 5 6

Write Data to a CSV File
• Usage: write.table(x,file="foo.csv",sep=",",…)
> X=matrix(1:6,2,3)
> X
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> write.table(t(X),file="d:/out4.txt",sep=",",col.names=FALSE,row.names=FALSE)
> write.table(X,file="d:/out5.txt",sep=",",col.names=FALSE,row.names=FALSE)

d:/out4.txt
1,2
3,4
5,6

d:/out5.txt
1,3,5
2,4,6

Read TXT and CSV Files
• Usage: read.table(file,...)
> X=read.table(file="d:/out2.txt")
> X
  V1 V2 V3
1  1  3  5
2  2  4  6
> Y=read.table(file="d:/out5.txt",sep=",",header=FALSE)
> Y
  V1 V2 V3
1  1  3  5
2  2  4  6
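
The write and read steps can be combined into a round trip. The sketch below assumes a temporary file in place of the d:/ paths used on the slides, so that it runs on any operating system; the variable names are illustrative.

```r
# Sketch: write a matrix to a temporary CSV file and read it back.
X <- matrix(1:6, 2, 3)
csv_path <- tempfile(fileext = ".csv")   # hypothetical temporary path

# Write without row or column names, comma separated.
write.table(X, file = csv_path, sep = ",",
            col.names = FALSE, row.names = FALSE)

# Read it back; read.table names the columns V1, V2, V3.
Y <- read.table(file = csv_path, sep = ",", header = FALSE)

# TRUE when every value survives the round trip.
roundtrip_ok <- all(as.matrix(Y) == X)
unlink(csv_path)                          # remove the temporary file
roundtrip_ok
```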

Demo (data file: 01.csv, with columns including Y and X1)
> Data=read.table(file="d:/01.csv",header=TRUE,sep=",")
> Data
> mean(Data$Y)
> boxplot(Data$Y)

Part 2: Motivation Examples

Example 1 in Genetics (1)
• Two linked loci with alleles A and a, and B and b.
  A, B: dominant
  a, b: recessive
• A double heterozygote AaBb will produce gametes of four types: AB, Ab, aB, ab.
• F (Female): non-recombinant fraction 1-r', recombinant fraction r' (female recombination fraction).
• M (Male): non-recombinant fraction 1-r, recombinant fraction r (male recombination fraction).

Example 1 in Genetics (2)
• r and r' are the recombination rates for male and female.
• Suppose the parental origin of these heterozygotes is from the mating of AABB × aabb. The problem is to estimate r and r' from the offspring of selfed heterozygotes.
• Fisher, R. A. and Balmukand, B. (1928). The estimation of linkage from the offspring of selfed heterozygotes. Journal of Genetics, 20, 79-92.
• nk/handout12.pdf

Example 1 in Genetics (3)
[Figure: gametes produced by the double heterozygote, with their probabilities]
Gamete    AB          ab          aB      Ab
Male      (1-r)/2     (1-r)/2     r/2     r/2
Female    (1-r')/2    (1-r')/2    r'/2    r'/2

Example 1 in Genetics (4)
                  MALE
FEMALE            AB (1-r)/2            ab (1-r)/2            aB r/2            Ab r/2
AB (1-r')/2       AABB (1-r)(1-r')/4    aABb (1-r)(1-r')/4    aABB r(1-r')/4    AABb r(1-r')/4
ab (1-r')/2       AaBb (1-r)(1-r')/4    aabb (1-r)(1-r')/4    aaBb r(1-r')/4    Aabb r(1-r')/4
aB r'/2           AaBB (1-r)r'/4        aabB (1-r)r'/4        aaBB rr'/4        AabB rr'/4
Ab r'/2           AABb (1-r)r'/4        aAbb (1-r)r'/4        aABb rr'/4        AAbb rr'/4

Example 1 in Genetics (5)
• Four distinct phenotypes: A*B*, A*b*, a*B* and a*b*.
• A*: the dominant phenotype from (Aa, AA, aA).
• a*: the recessive phenotype from aa.
• B*: the dominant phenotype from (Bb, BB, bB).
• b*: the recessive phenotype from bb.
• A*B*: 9 gametic combinations.
• A*b*: 3 gametic combinations.
• a*B*: 3 gametic combinations.
• a*b*: 1 gametic combination.
• Total: 16 combinations.

Example 1 in Genetics (6)
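
The formulas on this slide did not survive in the transcript. The phenotype probabilities can be recovered by summing the matching cells of the 4 x 4 genotype table; the sketch below writes φ = (1-r)(1-r') as shorthand, and that symbol is an assumption rather than the slide's confirmed notation.

```latex
\begin{aligned}
P(a^*b^*) &= P(aabb) = \frac{(1-r)(1-r')}{4} = \frac{\phi}{4},\\
P(a^*B^*) &= \frac{r(1-r') + (1-r)r' + rr'}{4} = \frac{1-(1-r)(1-r')}{4} = \frac{1-\phi}{4},\\
P(A^*b^*) &= \frac{r(1-r') + (1-r)r' + rr'}{4} = \frac{1-\phi}{4},\\
P(A^*B^*) &= 1 - \frac{\phi}{4} - 2\cdot\frac{1-\phi}{4} = \frac{2+\phi}{4}.
\end{aligned}
```

Only the product (1-r)(1-r') enters these probabilities, so phenotype counts alone can identify φ but not r and r' separately.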

Example 1 in Genetics (7)
Hence, a random sample of size n from the offspring of selfed heterozygotes follows a multinomial distribution:

Example 1 in Genetics (8)
Suppose that we observe the data y = (y1, y2, y3, y4) = (125, 18, 20, 24), which is a random sample from this multinomial distribution. Then the probability mass function is
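
The distribution and probability mass function were shown as images and are missing here. Assuming the cell probabilities ((2+φ)/4, (1-φ)/4, (1-φ)/4, φ/4) with φ = (1-r)(1-r'), and assuming the counts are ordered (A*B*, A*b*, a*B*, a*b*), the probability of the observed data can be sketched in R as a function of φ. The names cell_probs and likelihood are illustrative.

```r
# Observed phenotype counts from the slide, assumed to be ordered
# (A*B*, A*b*, a*B*, a*b*).
y <- c(125, 18, 20, 24)

# Cell probabilities as a function of phi = (1 - r) * (1 - r');
# 'phi' is an assumed name for this shorthand.
cell_probs <- function(phi) c(2 + phi, 1 - phi, 1 - phi, phi) / 4

# Multinomial probability mass of the observed counts at a given phi.
likelihood <- function(phi) dmultinom(y, prob = cell_probs(phi))

likelihood(0.5)
```

Evaluating likelihood() over a grid of φ values in (0, 1) gives a quick picture of where the data are most probable.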

Estimation Methods
• Frequentist approaches:
  Method of Moments Estimate (MME)
  Maximum Likelihood Estimate (MLE)
• Bayesian approaches

Method of Moments Estimate (MME)
• Solve the equations obtained by setting the population moments equal to the sample moments: $E(X^k) = \frac{1}{n}\sum_{i=1}^{n} X_i^k$ for k = 1, 2, …, t, where t is the number of parameters to be estimated.
• The MME is simple to compute.
• Under regularity conditions, the MME is consistent.

MME for Example 1
Note: the MME cannot guarantee that the estimate lies in the parameter space [0, 1].

MME by R
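
The R code on this slide was not preserved. A possible sketch, assuming the cell probabilities ((2+φ)/4, (1-φ)/4, (1-φ)/4, φ/4) with φ = (1-r)(1-r'), equates expected and observed counts. Each cell gives its own moment equation, so the variable names and the choice of equations below are assumptions.

```r
y <- c(125, 18, 20, 24)   # observed counts from Example 1
n <- sum(y)

# Each line equates one expected count n * p_k(phi) to the observed count.
phi_from_y1  <- 4 * y[1] / n - 2            # from E[Y1] = n (2 + phi) / 4
phi_from_y23 <- 1 - 2 * (y[2] + y[3]) / n   # from E[Y2 + Y3] = n (1 - phi) / 2
phi_from_y4  <- 4 * y[4] / n                # from E[Y4] = n phi / 4

mme <- c(y1 = phi_from_y1, y23 = phi_from_y23, y4 = phi_from_y4)
print(round(mme, 4))
```

The three equations give different estimates, and nothing constrains them to [0, 1]: for instance, 4 * y[1] / n - 2 is negative whenever y[1] is less than half the sample.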

MME by C/C++

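Maximum Likelihood Estimate (MLE)
• Likelihood: the joint probability (or density) of the observed data, viewed as a function of the parameters.
• Maximize the likelihood: solve the score equations, which set the first derivatives of the likelihood (in practice, usually of the log-likelihood) to zero.
• Under regularity conditions, the MLE is consistent, asymptotically efficient, and asymptotically normal.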

Example 2 (1)
We toss an unfair coin 3 times, and the random variable is the number of heads. If p is the probability of tossing a head, then

# of heads   Outcomes                    Probability
0            (0,0,0)                     (1-p)^3
1            (1,0,0) (0,1,0) (0,0,1)     p(1-p)^2
2            (0,1,1) (1,0,1) (1,1,0)     p^2(1-p)
3            (1,1,1)                     p^3

Example 2 (2)
Suppose we observe 1 head and 2 tails. The likelihood function becomes, up to a constant factor, $L(p) = p(1-p)^2$. One way to maximize this likelihood function is to solve the score equation, which sets the first derivative to zero: $\frac{dL}{dp} = (1-p)^2 - 2p(1-p) = (1-p)(1-3p) = 0$.

Example 2 (3)
• The solutions of the score equation are p = 1/3 and p = 1.
• One can check that p = 1/3 is the maximum point. (How?)
• Hence, the MLE of p is 1/3 for this example.
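
One way to answer the "How?" is to compare the two roots numerically and through the second derivative. The sketch below assumes the likelihood p(1-p)^2 for one head and two tails; the function names are illustrative.

```r
# Likelihood for 1 head and 2 tails, up to a constant factor.
lik <- function(p) p * (1 - p)^2

# Numerical maximization over [0, 1].
fit <- optimize(lik, interval = c(0, 1), maximum = TRUE)
p_hat <- fit$maximum

# p(1-p)^2 = p - 2p^2 + p^3, so the second derivative is 6p - 4.
# A negative value at a root of the score equation indicates a local maximum.
second_deriv <- function(p) 6 * p - 4

c(p_hat = p_hat,
  curvature_at_one_third = second_deriv(1/3),
  curvature_at_one = second_deriv(1),
  lik_at_one = lik(1))
```

The curvature is negative at p = 1/3 and positive at p = 1, where the likelihood is zero, so p = 1 is a minimum and p = 1/3 is the maximum.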

MLE for Example 1 (1)
• Likelihood
• MLE

MLE for Example 1 (2)
• Checking
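
The equations on these two slides are missing from the transcript. Under the assumption that the cell probabilities are ((2+φ)/4, (1-φ)/4, (1-φ)/4, φ/4) with φ = (1-r)(1-r'), the log-likelihood, score equation, and its solution can be sketched as follows; the symbol φ and the layout of the derivation are assumptions.

```latex
\begin{aligned}
\ell(\phi) &= \text{const} + y_1\log(2+\phi) + (y_2+y_3)\log(1-\phi) + y_4\log\phi,\\
\frac{d\ell}{d\phi} &= \frac{y_1}{2+\phi} - \frac{y_2+y_3}{1-\phi} + \frac{y_4}{\phi} = 0\\
&\Longleftrightarrow\; n\phi^2 - (y_1 - 2y_2 - 2y_3 - y_4)\,\phi - 2y_4 = 0,
\qquad n = y_1+y_2+y_3+y_4,\\
\hat\phi &= \frac{(y_1 - 2y_2 - 2y_3 - y_4) + \sqrt{(y_1 - 2y_2 - 2y_3 - y_4)^2 + 8ny_4}}{2n},\\
\frac{d^2\ell}{d\phi^2} &= -\frac{y_1}{(2+\phi)^2} - \frac{y_2+y_3}{(1-\phi)^2} - \frac{y_4}{\phi^2} < 0 .
\end{aligned}
```

The second derivative is negative throughout (0, 1), so the log-likelihood is concave there and the positive root of the quadratic is the maximizer. For y = (125, 18, 20, 24) the quadratic is 187φ² - 25φ - 48 = 0.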

Use R to find MLE (1)

Use R to find MLE (2)
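
The R code on these slides was not preserved. A minimal sketch, assuming the same parameterization φ = (1-r)(1-r') with cell probabilities ((2+φ)/4, (1-φ)/4, (1-φ)/4, φ/4), maximizes the log-likelihood numerically and compares the result with the positive root of the score quadratic. All names are illustrative.

```r
y <- c(125, 18, 20, 24)   # observed counts from Example 1
n <- sum(y)

# Log-likelihood up to an additive constant; phi = (1 - r) * (1 - r').
loglik <- function(phi) {
  y[1] * log(2 + phi) + (y[2] + y[3]) * log(1 - phi) + y[4] * log(phi)
}

# Numerical maximization on the open interval (0, 1).
fit <- optimize(loglik, interval = c(1e-6, 1 - 1e-6), maximum = TRUE)
phi_numeric <- fit$maximum

# Closed form: positive root of n * phi^2 - d * phi - 2 * y[4] = 0.
d <- y[1] - 2 * y[2] - 2 * y[3] - y[4]
phi_closed <- (d + sqrt(d^2 + 8 * n * y[4])) / (2 * n)

c(numeric = phi_numeric, closed_form = phi_closed)
```

Agreement between the two values is a useful sanity check before moving to models, such as those in the exercises, where no closed form is available.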

Use C/C++ to find MLE (1)

Use C/C++ to find MLE (2)

Exercises
• Write your own programs for the examples presented in this talk.
• Write programs for the examples mentioned at the following web page: kelihood
• Write programs for the other examples that you know.

More Exercises (1)
• Example 3 in genetics: The observed data are (nO, nA, nB, nAB) = (176, 182, 60, 17) ~ Multinomial(r^2, p^2+2pr, q^2+2qr, 2pq), where p, q, and r lie in [0,1] with p+q+r = 1. Find the likelihood function and score equations for p, q, and r.
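
A numerical maximizer can serve as a check on hand-derived score equations. The sketch below is one possible setup, not a worked solution from the slides: it substitutes r = 1 - p - q and uses R's general-purpose optim() with an arbitrary starting point; the function names are assumptions.

```r
counts <- c(O = 176, A = 182, B = 60, AB = 17)

# Cell probabilities from the slide, with r = 1 - p - q.
abo_probs <- function(p, q) {
  r <- 1 - p - q
  c(O = r^2, A = p^2 + 2 * p * r, B = q^2 + 2 * q * r, AB = 2 * p * q)
}

# Negative log-likelihood (constant term dropped); Inf outside the simplex.
negloglik <- function(par) {
  p <- par[1]; q <- par[2]
  if (p <= 0 || q <= 0 || p + q >= 1) return(Inf)
  -sum(counts * log(abo_probs(p, q)))
}

# Nelder-Mead search; the starting point (0.3, 0.3) is arbitrary.
fit <- optim(c(0.3, 0.3), negloglik)
est <- c(p = fit$par[1], q = fit$par[2], r = 1 - sum(fit$par))
print(round(est, 3))
```

Plugging the resulting estimates into the score equations derived by hand should give values close to zero.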

More Exercises (2)
• Example 4 in positron emission tomography (PET): The observed data are n*(d) ~ Poisson(λ*(d)), d = 1, 2, …, D, and $\lambda^*(d) = \sum_{b=1}^{B} p(b,d)\,\lambda(b)$.
• The values of p(b,d) are known and the unknown parameters are λ(b), b = 1, 2, …, B.
• Find the likelihood function and score equations for λ(b), b = 1, 2, …, B.

More Exercises (3)
• Example 5 in the normal mixture: The observed data x_i, i = 1, 2, …, n, are random samples from the following probability density function:
• Find the likelihood function and score equations for the following parameters: