Lecture 1.31 Criteria for optimal reception of radio signals.

Lecture 1.31 Criteria for optimal reception of radio signals. Topics: maximum-likelihood detection and the probability of error.

Probability Decision Theory
Bayesian decision theory is a fundamental statistical approach to the problem of pattern classification. It quantifies the tradeoffs between various classification decisions using probabilities and the costs that accompany those decisions. Assume all relevant probability distributions are known (later we will learn how to estimate them from data). Can we exploit prior knowledge in our signal classification problem?
- Is the sequence of signals predictable? (statistics)
- Is each class equally probable? (uniform priors)
- What is the cost of an error? (risk, optimization)

Prior Probabilities
The state of nature is prior information. Model it as a random variable ω:
- ω = ω1: the event that the next signal is a «zero»
- category ω1: «zero»; category ω2: «one»
- P(ω1) = probability of category 1; P(ω2) = probability of category 2; P(ω1) + P(ω2) = 1
- Exclusivity: ω1 and ω2 share no basic events
- Exhaustivity: the union of all outcomes is the sample space (either ω1 or ω2 must occur)
If all incorrect classifications have an equal cost: decide ω1 if P(ω1) > P(ω2); otherwise, decide ω2.

Probability Functions
A probability density function is written in lowercase and is a function of a continuous variable. px(x|ω), often abbreviated as p(x), denotes a probability density function for the random variable X; note that px(x|ω) and py(y|ω) can be two different functions. P(x|ω) denotes a probability mass function and must obey the usual normalization constraints. Probability mass functions are used for discrete random variables, while densities describe continuous random variables (the latter must be integrated).

Bayes Formula
Suppose we know both the priors P(ωj) and the class-conditional densities p(x|ωj), and we can measure x. How does this influence our decision? The joint probability of finding a pattern that is in category ωj and has feature value x is
p(ωj, x) = P(ωj|x) p(x) = p(x|ωj) P(ωj).
Rearranging terms, we arrive at Bayes formula:
P(ωj|x) = p(x|ωj) P(ωj) / p(x),
where, in the case of two categories, the evidence is p(x) = p(x|ω1)P(ω1) + p(x|ω2)P(ω2).

Posterior Probabilities
Bayes formula can be expressed in words as: posterior = (likelihood × prior) / evidence. By measuring x, we convert the prior probability P(ωj) into a posterior probability P(ωj|x). The evidence p(x) can be viewed as a scale factor and is often ignored in optimization applications.
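
As a concrete illustration of the formula above, the following sketch computes posterior probabilities for a two-class problem with Gaussian class-conditional densities; the priors, means and noise level are illustrative assumptions, not values from the lecture.

```python
# Sketch: posterior probabilities via Bayes formula for a two-class problem.
# Priors, means and noise level are illustrative, not values from the lecture.
from scipy.stats import norm

priors = {"s0": 0.5, "s1": 0.5}      # P(w1), P(w2)
means = {"s0": -1.0, "s1": +1.0}     # class-conditional means
sigma = 0.8                          # noise standard deviation

def posteriors(x):
    """Return P(class | x) for both classes using Bayes formula."""
    joint = {c: norm.pdf(x, loc=means[c], scale=sigma) * priors[c] for c in priors}
    evidence = sum(joint.values())   # p(x) = sum_j p(x|w_j) P(w_j)
    return {c: joint[c] / evidence for c in priors}

print(posteriors(0.3))               # the two posteriors sum to 1
```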

Bayes Decision Rule
Decision rule: for an observation x, decide ω1 if P(ω1|x) > P(ω2|x); otherwise, decide ω2.
Probability of error: P(error|x) = min[P(ω1|x), P(ω2|x)], and the average probability of error is P(error) = ∫ P(error|x) p(x) dx. If for every x we ensure that P(error|x) is as small as possible, then the integral is as small as possible. Thus, the Bayes decision rule minimizes P(error).
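
To see the averaging step numerically, the sketch below evaluates P(error) = ∫ min[p(x|ω1)P(ω1), p(x|ω2)P(ω2)] dx for the same illustrative two-Gaussian model used above.

```python
# Sketch: average error of the Bayes rule for an illustrative two-Gaussian model.
import numpy as np
from scipy.stats import norm

xs = np.linspace(-6.0, 6.0, 4001)
joint_s0 = norm.pdf(xs, -1.0, 0.8) * 0.5     # p(x|w1) P(w1)
joint_s1 = norm.pdf(xs, +1.0, 0.8) * 0.5     # p(x|w2) P(w2)
# At each x the Bayes rule errs with probability min(P(w1|x), P(w2|x)),
# so P(error) is the integral of the pointwise minimum of the joint densities.
p_error = np.trapz(np.minimum(joint_s0, joint_s1), xs)
print(p_error)
```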

Detection
The matched filter reduces the received signal to a single variable z(T), after which detection of the symbol is carried out. The concept of the maximum-likelihood detector is based on statistical decision theory. It allows us to formulate a decision rule that operates on the data and optimizes the detection criterion.
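
A minimal sketch of such a receiver, assuming antipodal rectangular pulses in additive white Gaussian noise (the waveforms, sample rate and noise level are illustrative, not taken from the lecture):

```python
# Sketch of a correlator receiver: correlate the received waveform with the
# difference of the two candidate signals and sample at t = T, reducing detection
# to a single statistic z(T).
import numpy as np

rng = np.random.default_rng(1)
T, fs = 1.0, 100                                  # symbol duration and sample rate
t = np.arange(0, T, 1 / fs)
s0, s1 = -np.ones_like(t), +np.ones_like(t)       # antipodal rectangular pulses

sent = s1
received = sent + 0.8 * rng.standard_normal(t.size)   # AWGN channel

z_T = np.sum(received * (s1 - s0)) / fs           # correlate and sample at t = T
decision = "s1" if z_T > 0 else "s0"              # threshold 0 for antipodal, equally likely signals
print(z_T, decision)
```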

Likelihood
Suppose the data have density p(x|θ). The likelihood L(θ) = p(x|θ) is the probability of the observed data, viewed as a function of the parameters θ.

Probabilities Review
- P[s0], P[s1]: a priori probabilities, known before transmission
- P[z]: probability of the received sample
- p(z|s0), p(z|s1): conditional pdfs of the received signal z, conditioned on the class si
- P[s0|z], P[s1|z]: a posteriori probabilities; after examining the sample, we refine our previous knowledge
- P[s1|s0], P[s0|s1]: probabilities of a wrong decision (error)
- P[s1|s1], P[s0|s0]: probabilities of a correct decision

How to Choose the Threshold?
Maximum likelihood ratio test and maximum a posteriori (MAP) criterion: decide s1 if P(s1|z) > P(s0|z); otherwise decide s0. The problem is that the a posteriori probabilities are not known. Solution: use Bayes' theorem, which turns the rule into a comparison of p(z|s1)P(s1) with p(z|s0)P(s0). For the antipodal, equally likely signals considered below, this means that if the received sample is positive, s1(t) was sent; otherwise s0(t) was sent.
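
A small sketch of the MAP rule written as a likelihood ratio test; a0, a1, the noise level and the priors are illustrative assumptions.

```python
# Sketch of the MAP rule as a likelihood ratio test: decide s1 when
# p(z|s1) P(s1) > p(z|s0) P(s0), i.e. when the likelihood ratio exceeds P(s0)/P(s1).
from scipy.stats import norm

a0, a1, sigma = -1.0, +1.0, 0.8
P_s0, P_s1 = 0.5, 0.5

def decide_map(z):
    likelihood_ratio = norm.pdf(z, a1, sigma) / norm.pdf(z, a0, sigma)
    return "s1" if likelihood_ratio > P_s0 / P_s1 else "s0"

print(decide_map(0.2), decide_map(-0.2))   # -> s1 s0
```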

Likelihood of s0 and s1 (the original slide shows the two conditional pdfs, p(z|s0) and p(z|s1), and the decision threshold)

Likelihood ratio test
MAP criterion: when the two signals s0(t) and s1(t) are equally likely, i.e. P(s0) = P(s1) = 0.5, the decision rule becomes a comparison of the likelihoods p(z|s1) and p(z|s0) alone. This is known as the maximum likelihood ratio test, because we select the hypothesis that corresponds to the signal with the maximum likelihood. In terms of the Bayes criterion, it implies that the costs of both types of error are the same.

Likelihood ratio test (cont'd)
Substituting the Gaussian pdfs of the matched-filter output, p(z|si) = (1/(σ0√(2π))) exp(−(z − ai)²/(2σ0²)), the likelihood ratio becomes
Λ(z) = p(z|s1)/p(z|s0) = exp[ z(a1 − a0)/σ0² − (a1² − a0²)/(2σ0²) ].

Likelihood ratio test (cont'd)
Hence the test is Λ(z) ≷ P(s0)/P(s1). Taking the log of both sides gives (for a1 > a0)
z ≷ (a1 + a0)/2 + (σ0²/(a1 − a0)) ln[P(s0)/P(s1)].
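
The sketch below checks, for illustrative Gaussian parameters, that this threshold test agrees with the full likelihood ratio test.

```python
# Sketch: for Gaussian pdfs the log-likelihood-ratio test reduces to a threshold test
# z > gamma0, with gamma0 = (a0 + a1)/2 + sigma0**2/(a1 - a0) * ln(P(s0)/P(s1)),
# assuming a1 > a0. With equal priors this is simply the midpoint (a0 + a1)/2.
import numpy as np
from scipy.stats import norm

a0, a1, sigma0 = -1.0, +1.0, 0.8
P_s0, P_s1 = 0.5, 0.5

gamma0 = (a0 + a1) / 2 + sigma0**2 / (a1 - a0) * np.log(P_s0 / P_s1)

# The threshold test agrees with the full likelihood-ratio test:
for z in np.linspace(-3, 3, 13):
    lr = norm.pdf(z, a1, sigma0) / norm.pdf(z, a0, sigma0)
    assert (lr > P_s0 / P_s1) == (z > gamma0)
print("gamma0 =", gamma0)
```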

Types of Error
Type I error: rejecting H0 when H0 is true. Type II error: accepting H0 when H0 is false.

Probability of Error
An error occurs if s1 is sent but s0 is decided, or s0 is sent but s1 is decided. The total probability of error is the sum of these two error events:
P(e) = P(s0|s1)P(s1) + P(s1|s0)P(s0).

Likelihood ratio test (cont'd)
Hence the decision z ≷ γ0 is the minimum-error criterion, where γ0 = (a1 + a0)/2 is the optimum threshold (for equally likely signals). For an antipodal signal, s1(t) = −s0(t), so a1 = −a0 and γ0 = 0.

Probability of Error (cont'd)
If the signals are equally probable, P(s0) = P(s1) = 1/2, then
PB = ½ P(z > γ0 | s0) + ½ P(z < γ0 | s1).
Hence, the probability of bit error PB is the probability that an incorrect hypothesis is made. Numerically, PB is the area under the tail of either of the conditional distributions, p(z|s1) or p(z|s0), on the wrong side of the threshold γ0.

Probability of Error (cont'd)
The tail integral above cannot be evaluated in closed form; it is written in terms of the Q-function. Hence
PB = Q((a1 − a0) / (2σ0)).
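
A short sketch that evaluates PB = Q((a1 − a0)/(2σ0)) and compares it with a Monte Carlo estimate; the numerical values are illustrative assumptions.

```python
# Sketch: bit-error probability PB = Q((a1 - a0) / (2*sigma0)) for equally likely,
# Gaussian-corrupted levels a0 and a1, checked against a quick Monte Carlo run.
import numpy as np
from scipy.stats import norm

a0, a1, sigma0 = -1.0, +1.0, 0.8
gamma0 = (a0 + a1) / 2                    # optimum threshold for equal priors

P_B = norm.sf((a1 - a0) / (2 * sigma0))   # Q(x) is the Gaussian tail probability

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, 200_000)
z = np.where(bits == 1, a1, a0) + sigma0 * rng.standard_normal(bits.size)
P_B_mc = np.mean((z > gamma0) != (bits == 1))
print(P_B, P_B_mc)
```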

Co-error function
Q(x) is called the complementary error function or co-error function; it is the commonly used symbol for the tail probability of the unit-variance Gaussian:
Q(x) = (1/√(2π)) ∫ from x to ∞ of exp(−u²/2) du.
Q(x) cannot be evaluated in closed form and is presented in tabular form. Another approximation for Q(x), valid for x > 3, is
Q(x) ≈ exp(−x²/2) / (x √(2π)).

Co-error Table
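
The tabulated values on the original slide did not survive the transcript; the sketch below regenerates a short Q(x) table and compares the exact tail probability with the x > 3 approximation quoted above.

```python
# Sketch: regenerate a short Q(x) table and compare the exact tail probability with
# the x > 3 approximation Q(x) ~ exp(-x**2/2) / (x*sqrt(2*pi)).
import numpy as np
from scipy.stats import norm

def Q(x):
    return norm.sf(x)                         # exact Gaussian tail probability

def Q_approx(x):
    return np.exp(-x**2 / 2) / (x * np.sqrt(2.0 * np.pi))

print("  x     Q(x)")
for x in np.arange(0.0, 4.01, 0.5):
    print(f"{x:5.2f}  {Q(x):.6f}")

for x in (3.0, 3.5, 4.0, 5.0):
    print(x, Q(x), Q_approx(x))               # the approximation tracks Q(x) for x > 3
```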

Important Observation
To minimize PB, we need to maximize (a1 − a0)/σ0, or equivalently (a1 − a0)²/σ0², where (a1 − a0) is the difference of the desired signal components at the filter output at t = T. The square of this difference is the instantaneous power of the difference signal, so (a1 − a0)²/σ0² is a signal-to-noise ratio.
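
A quick numerical sketch of this observation: PB falls as (a1 − a0)²/σ0² grows (the signal spacings below are illustrative).

```python
# Sketch: PB decreases as the ratio (a1 - a0)**2 / sigma0**2 grows; sweep the spacing
# of illustrative signal levels to see the trend.
from scipy.stats import norm

sigma0 = 1.0
for spacing in (0.5, 1.0, 2.0, 4.0):          # spacing = a1 - a0
    snr = spacing**2 / sigma0**2
    P_B = norm.sf(spacing / (2 * sigma0))     # PB = Q((a1 - a0) / (2*sigma0))
    print(spacing, snr, P_B)
```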

Neyman-Pearson Criterion
Consider a two-class problem. Four probabilities can be computed: the probability of detection (hit), the probability of false alarm, the probability of a miss, and the probability of correct rejection. We do not know the prior probabilities, so Bayes-optimal classification is not possible. However, we do know that the probability of false alarm must stay below a chosen level α. Based on this constraint (the Neyman-Pearson criterion) we can design a classifier. Observation: maximizing the probability of detection and minimizing the probability of false alarm are, in general, conflicting goals.
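
A minimal sketch of a Neyman-Pearson style threshold for a Gaussian model: fix the false-alarm probability at a chosen α and read off the detection probability (the model parameters and α are illustrative assumptions).

```python
# Sketch: choose the threshold so that P_FA = alpha, then evaluate P_D.
from scipy.stats import norm

a0, a1, sigma0 = 0.0, 1.5, 1.0
alpha = 0.05                              # allowed probability of false alarm

# P_FA = P(z > gamma | s0) = alpha  ->  gamma = a0 + sigma0 * Q^{-1}(alpha)
gamma = a0 + sigma0 * norm.isf(alpha)
P_D = norm.sf((gamma - a1) / sigma0)      # probability of detection (hit)
print(gamma, P_D)
```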

Receiver Operating Characteristics
The ROC is a plot of the probability of detection versus the probability of false alarm (the original slide shows curves for two classifiers). The area under the ROC curve is a measure of performance. The curve is also used to find an operating point for the classifier.
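
A sketch that traces the ROC of the threshold detector by sweeping the threshold and records the area under the curve, using the same illustrative Gaussian model as above.

```python
# Sketch: trace the ROC by sweeping the threshold gamma and recording (P_FA, P_D)
# pairs; the area under the curve is a single performance number.
import numpy as np
from scipy.stats import norm

a0, a1, sigma0 = 0.0, 1.5, 1.0
gammas = np.linspace(-5, 8, 400)
P_FA = norm.sf((gammas - a0) / sigma0)    # false-alarm probability vs. threshold
P_D = norm.sf((gammas - a1) / sigma0)     # detection probability vs. threshold
auc = -np.trapz(P_D, P_FA)                # P_FA decreases as gamma grows, hence the sign
print("area under ROC curve:", auc)
```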