DATASET INTRODUCTION. Dataset: Urine, from Cleveland Clinic, 1981-1984.


Outcome Variable: Categorical Variable
- Calcium Oxalate Crystal Presence
- In this analysis, this variable will be our outcome variable, also called the response variable or dependent variable
- Note: the dataset is coded directly as Yes/No (not 0/1 coding)

Other Variables (Covariates): Quantitative Variables
- Specific Gravity
- pH
- Osmolarity
- Conductivity
- Urea Concentration (millimoles/liter)
- Calcium Concentration (millimoles/liter)
- Cholesterol: serum cholesterol levels

Discussion/Review
- Purpose of dataset: determine which of the covariates are related to the outcome
- Covariates can also be called independent variables, predictors, or explanatory variables
- Outcomes and covariates can be categorical or quantitative
- A given study can have more than one outcome and many covariates, with any mixture of variable types

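[Table: summary statistics by Calcium Oxalate Crystal Presence (rows: No, Yes; columns: N, Mean, Std Dev, Min, Q1, Med, Q3, Max). The numeric values were not preserved in this transcript.]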

Discussion
- Clearly, those with calcium oxalate crystals present tend to have higher calcium concentrations
- Later we will learn to conduct hypothesis tests in such situations
- For now, we use these data to illustrate concepts of probability

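Comments
- To facilitate our discussion of probability and classification tests, we will categorize the quantitative variable Calcium Concentration into four groups
- The groups are coded starting from 1, and the highest group is 8 or More (the ranges of the other groups were not preserved in this transcript)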

BASIC PROBABILITY Part 1 (Unconditional Probability using Logic)

Back to the Urine Dataset
- Suppose one individual is selected from our sample, and consider the following questions:
  - What is the probability that the individual has calcium oxalate crystals present?
  - What is the probability that the individual has a calcium concentration of 5 or more?
  - What is the probability the individual has calcium oxalate crystals present AND has a calcium concentration of 5 or more?
  - What is the probability the individual has calcium oxalate crystals present OR has a calcium concentration of 5 or more?

Comments
- All four of these probability questions relate to the ENTIRE SAMPLE
- We begin by answering the questions logically from the table we created using software

Let’s Practice! Basic Probability of an Event
What is the probability that the individual has calcium oxalate crystals present? We will denote this event by A.
P(A) = PREVALENCE of calcium oxalate crystals in our sample
[Table of group by r: group (Calcium Concentration Group) in rows, r (Calcium Oxalate Crystal Presence) in columns No, Yes, Total. Only the "8 or More" row survives in this transcript: No 1, Yes 7, Total 8.]

Let’s Practice! Basic Probability of an Event
What is the probability that the individual has a calcium concentration of 5 or more? We will denote this event by B.
[Same table of group by r.]

Let’s Practice! Basic Probability of an Event: Intersections
What is the probability the individual has calcium oxalate crystals present AND has a calcium concentration of 5 or more?
[Same table of group by r.]

Let’s Practice! Basic Probability of an Event: Unions
What is the probability the individual has calcium oxalate crystals present OR has a calcium concentration of 5 or more?
[Same table of group by r, shown twice on the slide.]
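The four questions above can also be answered by counting cells in code. The sketch below is only illustrative: the counts are hypothetical (they are not the Urine data, whose frequencies are not reproduced here), the table is collapsed to "5 or more" versus "below 5", and the names `counts` and `prob` are made up for this example.

```python
from fractions import Fraction

# Hypothetical counts (NOT the Urine data).
# Key: (crystal presence, calcium concentration level), where "high" means 5 or more.
counts = {
    ("yes", "high"): 20,
    ("yes", "low"): 10,
    ("no", "high"): 5,
    ("no", "low"): 65,
}
total = sum(counts.values())


def prob(predicate):
    """Probability that one randomly selected individual satisfies predicate."""
    favorable = sum(n for cell, n in counts.items() if predicate(cell))
    return Fraction(favorable, total)


p_a = prob(lambda c: c[0] == "yes")                            # A: crystals present
p_b = prob(lambda c: c[1] == "high")                           # B: concentration 5 or more
p_a_and_b = prob(lambda c: c[0] == "yes" and c[1] == "high")   # intersection
p_a_or_b = prob(lambda c: c[0] == "yes" or c[1] == "high")     # union

print(p_a, p_b, p_a_and_b, p_a_or_b)
```

Every probability here divides by the same total, which is what "relating to the entire sample" means in practice.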

USING PROBABILITY RULES Part 1

Probability Rules
- Rules are created and used for many reasons
- The rules and properties stated previously are important and useful in probability and sometimes in statistics
- They are not always needed
  - If you can determine the answer through logic alone, you may not need a rule!
  - If you are provided only pieces of the puzzle, sometimes a rule is faster than logic!

Continuing
- We now illustrate a few formulas using the questions we have already answered using logic

Let’s Practice Again! Complement Rule
What is the probability that the individual DOES NOT have calcium oxalate crystals present?
We could use logic and count the No’s instead of the Yes’s; however, knowing P(Yes) = P(A), we can apply the complement rule: P(not A) = 1 - P(A).
[Table of group by r: Calcium Concentration Group by Calcium Oxalate Crystal Presence; frequencies not preserved in this transcript.]

Let’s Practice Again! Addition Rule (Unions)
What is the probability the individual has calcium oxalate crystals present OR has a calcium concentration of 5 or more?
[Table of group by r, shown twice on the slide; frequencies not preserved in this transcript.]
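As a rough illustration of the complement and addition rules, the sketch below applies them to hypothetical probabilities (not values from the Urine data). The helper names `complement` and `union` are invented for this example.

```python
from fractions import Fraction

# Hypothetical probabilities (NOT taken from the Urine data).
p_a = Fraction(3, 10)        # P(A): crystals present
p_b = Fraction(1, 4)         # P(B): calcium concentration 5 or more
p_a_and_b = Fraction(1, 5)   # P(A and B)


def complement(p):
    """Complement rule: P(not A) = 1 - P(A)."""
    return 1 - p


def union(p_x, p_y, p_x_and_y):
    """Addition rule: P(X or Y) = P(X) + P(Y) - P(X and Y)."""
    return p_x + p_y - p_x_and_y


p_not_a = complement(p_a)
p_a_or_b = union(p_a, p_b, p_a_and_b)
print(p_not_a, p_a_or_b)
```

Subtracting P(A and B) avoids counting the individuals who are in both events twice.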

INDEPENDENCE Part 1

Independent Events
- Two events are independent if knowing that one event occurs does not change the probability of the other
- This is not the same as “disjoint” events, which are separate in that they cannot occur together
- These are two entirely different concepts
- Independence is a statement about the equality of the probability of one event whether or not the other event occurs (or is occurring, or has occurred)

Let’s Practice! Investigating Independence Part 1
We know the following from our sample: [values not preserved in this transcript]

Let’s Practice! Investigating Independence Part 1
- From our sample we have: [calculation not preserved in this transcript]
- This is clearly not equal to 0.247!!
- In our sample the events are dependent (we can test this hypothesis about the population later)
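The calculation on this slide did not survive, so the following is an assumption about the kind of check it performed: for independent events, P(A and B) equals P(A) times P(B). The sketch uses hypothetical probabilities and an invented helper name, `is_independent`.

```python
from fractions import Fraction


def is_independent(p_a, p_b, p_a_and_b):
    """A and B are independent exactly when P(A and B) = P(A) * P(B)."""
    return p_a_and_b == p_a * p_b


# Hypothetical probabilities (NOT the Urine data).
# Here P(A) * P(B) = 3/40, which differs from P(A and B) = 1/5.
dependent_case = is_independent(Fraction(3, 10), Fraction(1, 4), Fraction(1, 5))

# Here P(A) * P(B) = 1/4, which matches P(A and B).
independent_case = is_independent(Fraction(1, 2), Fraction(1, 2), Fraction(1, 4))

print(dependent_case, independent_case)
```

In sample data the two sides will rarely match exactly, which is why the question about the population is left to a later hypothesis test.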

BASIC PROBABILITY Part 2: Conditional Probability (Logic & Formula)

Conditional Probability
- So far, we have divided by the TOTAL
- Sometimes, however, we have additional CONDITIONS that cause us to alter the denominator (bottom) of our probability calculation
- Suppose, when choosing one person from the Urine data, we ask: given the individual has Calcium Oxalate Crystals present, what is the probability the individual’s calcium concentration is 5 or above?
- “Conditional” refers to the fact that we have these additional conditions, restrictions, or other information

Let’s Practice! CONDITIONAL Probability of an Event
Given the individual has Calcium Oxalate Crystals present, what is the probability the individual’s calcium concentration is 5 or above?
[Table of group by r: Calcium Concentration Group by Calcium Oxalate Crystal Presence; frequencies not preserved in this transcript.]

Let’s Practice! CONDITIONAL Probability FORMULA
Given the individual has Calcium Oxalate Crystals present, what is the probability the individual’s calcium concentration is 5 or above?

Let’s Practice! CONDITIONAL Probability of an Event
Given the individual DOES NOT HAVE Calcium Oxalate Crystals present, what is the probability the individual’s calcium concentration is 5 or above?
[Table of group by r: Calcium Concentration Group by Calcium Oxalate Crystal Presence; frequencies not preserved in this transcript.]
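The "logic" and "formula" routes to a conditional probability can be compared side by side. This is a minimal sketch with hypothetical counts (not the Urine data); the variable names are chosen only for this example.

```python
from fractions import Fraction

# Hypothetical counts (NOT the Urine data); "high" means concentration 5 or more.
yes_high, yes_low = 20, 10   # crystals present
no_high, no_low = 5, 65      # crystals absent
total = yes_high + yes_low + no_high + no_low

# Logic: restrict the denominator to the individuals who meet the condition.
p_b_given_a_logic = Fraction(yes_high, yes_high + yes_low)

# Formula: P(B | A) = P(A and B) / P(A), with both pieces taken over the whole sample.
p_a = Fraction(yes_high + yes_low, total)
p_a_and_b = Fraction(yes_high, total)
p_b_given_a_formula = p_a_and_b / p_a

# Conditioning on the absence of crystals changes the denominator again.
p_b_given_not_a = Fraction(no_high, no_high + no_low)

print(p_b_given_a_logic, p_b_given_a_formula, p_b_given_not_a)
```

Both routes give the same answer because the total cancels in the ratio; exact fractions are used here so no rounding error appears.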

MORE PRACTICE Conditional Probability

Let’s Verify! CONDITIONAL Probability of an Event
Given the individual has a calcium concentration of 5 or above, what is the probability the individual has calcium oxalate crystals?
We have a small amount of rounding error this time.
[Table of group by r: Calcium Concentration Group by Calcium Oxalate Crystal Presence; frequencies not preserved in this transcript.]

INDEPENDENCE Part 2

Let’s Practice! Investigating Independence Part 2
We know the following from our sample: [values not preserved in this transcript]

Comments: Investigating Independence Part 2
- These probabilities are clearly unequal in our sample; our eventual question might be whether this is also true for our population
- In this sample, these events are dependent
- From our analysis so far, it seems likely they are also dependent in our population (we can test later)
- Knowing whether or not the person has calcium oxalate crystals present CHANGES the probability of having a calcium concentration of 5 or above!!

GENERAL MULTIPLICATION RULE

General Multiplication Rule
- This formula comes from rearranging the definition of conditional probability
- To obtain the second formulation (on the right of the slide), start from the formula for P(A|B) instead and note that the numerator is unchanged
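The slide's formulas were not preserved in this transcript. The standard statement of the rule, written with an intersection symbol that is assumed here rather than taken from the slide, follows from rearranging each conditional probability definition:

```latex
\[
P(B \mid A) = \frac{P(A \cap B)}{P(A)}
\quad\Longrightarrow\quad
P(A \cap B) = P(A)\,P(B \mid A)
\]
\[
P(A \mid B) = \frac{P(A \cap B)}{P(B)}
\quad\Longrightarrow\quad
P(A \cap B) = P(B)\,P(A \mid B)
\]
```

Both definitions share the numerator P(A ∩ B), which is why the two products are equal.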

REPEATED SAMPLING

Repeated Sampling
- Often we consider problems in which we draw multiple individuals from a set of individuals
  - Drawing parts from a box where some are defective
  - Choosing multiple people from a certain population
- The formulas we have investigated can be used to calculate probabilities in these situations

Let’s Practice!
- If we select two subjects at random from our sample, what is the probability that both have a calcium concentration of 8 or more?
[Table of group by r: Calcium Concentration Group by Calcium Oxalate Crystal Presence. Only the "8 or More" row survives in this transcript: No 1, Yes 7, Total 8.]
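Drawing two subjects without replacement is an application of the general multiplication rule: the second draw is conditional on the first. In the sketch below, the group size of 8 is read from the surviving "8 or More" row, while the sample size of 100 is purely hypothetical because the table total was not preserved. The function name `prob_both_in_group` is invented for this example.

```python
from fractions import Fraction


def prob_both_in_group(group_size, sample_size):
    """P(both of two subjects drawn without replacement belong to the group)."""
    first = Fraction(group_size, sample_size)
    # After one group member is drawn, one fewer remains in the group and in the sample.
    second_given_first = Fraction(group_size - 1, sample_size - 1)
    return first * second_given_first


# group_size=8 follows the "8 or More" row; sample_size=100 is a HYPOTHETICAL total.
p_both = prob_both_in_group(group_size=8, sample_size=100)
print(p_both)
```

Replace `sample_size` with the actual table total to answer the question for the real sample.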

WANT TO LEARN MORE? READ THE FOLLOWING OPTIONAL MATERIAL
The remaining slides are optional. They illustrate some more difficult probability rules, along with additional examples of probability related to the health sciences.

Optional Content: Read About
- Relative Risk
- Total Probability Rule
- Bayes Rule
- Screening Tests
  - Sensitivity/Specificity
  - PV+/PV-
  - False Positive and False Negative Rates
- ROC Curves

Relative Risk
- Relative risk is the risk of an “event” relative to an “exposure”: the ratio of the probability of the event occurring among the “exposed” to the probability among the “non-exposed”
- If A and B are independent, the relative risk is 1
- In our rule, B is the EVENT and A is the EXPOSURE

Let’s Practice!
- Find the Relative Risk of High Calcium Concentration Given Calcium Oxalate Crystal Presence
- Note: this is the reverse of what we probably want in this case; consider that for more practice!
- INTERPRET RR: Having a calcium concentration of 5 or more is around 4 times as likely among those with calcium oxalate crystals as among those without.
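With B as the event and A as the exposure, relative risk is P(B | A) divided by P(B | not A). The sketch below computes it from counts; the counts are hypothetical (not the Urine data, where the slide reports a value around 4), and `relative_risk` is a name made up for this example.

```python
from fractions import Fraction


def relative_risk(exposed_event, exposed_total, unexposed_event, unexposed_total):
    """RR = P(event | exposed) / P(event | not exposed)."""
    risk_exposed = Fraction(exposed_event, exposed_total)
    risk_unexposed = Fraction(unexposed_event, unexposed_total)
    return risk_exposed / risk_unexposed


# Hypothetical counts (NOT the Urine data):
# 20 of 30 exposed have the event, 5 of 70 unexposed have the event.
rr = relative_risk(20, 30, 5, 70)

# When the risk is the same in both groups, the relative risk is 1.
rr_equal_risk = relative_risk(10, 40, 15, 60)

print(rr, rr_equal_risk)
```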

Total Probability Rule

Bayes’ Rule
- We want to find P(A|B), so we will need to “rearrange” the formula, swapping A’s and B’s
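The formulas from these slides were not preserved in this transcript. The standard two-event forms are given below as a reference sketch; the complement notation A^c is an assumption about notation, not something taken from the slides.

```latex
\[
P(B) = P(B \mid A)\,P(A) + P(B \mid A^{c})\,P(A^{c})
\]
\[
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A^{c})\,P(A^{c})}
\]
```

The denominator of Bayes' rule is the total probability rule applied to P(B), so the second formula is the definition of P(A | B) with both its numerator and denominator rewritten in terms of probabilities conditional on A.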

Let’s Verify! CONDITIONAL Probability of an Event
Given the individual has a calcium concentration of 5 or above, what is the probability the individual has calcium oxalate crystals?
We have a small amount of rounding error this time.
[Table of group by r: Calcium Concentration Group by Calcium Oxalate Crystal Presence; frequencies not preserved in this transcript.]

SCREENING TESTS and ROC Curves

Screening Tests

Sensitivity & Specificity “Epi” Style

                Has Condition                  Does not have Condition           Total
Test Positive   A (TP)                         B (FP)                            Total Positive Test (A+B)
Test Negative   C (FN)                         D (TN)                            Total Negative Test (C+D)
Total           Number with Condition (A+C)    Number without Condition (B+D)

Sensitivity & Specificity
[Table: Calcium Concentration Group by Calcium Oxalate Crystal Presence, with columns Yes (Has Condition), No (Does not have Condition), and Total. Groups below the cut-off are labeled NEGATIVE and groups at or above it POSITIVE. Only the "8 or More" row survives in this transcript: Yes 7, No 1, Total 8. The slide appears three times in the deck.]
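The screening measures listed in the optional content can all be read off the four cells of the “Epi” style table. The sketch below uses hypothetical counts (not the Urine data) and an invented function name, `screening_metrics`. It assumes the false positive rate is defined as 1 - specificity and the false negative rate as 1 - sensitivity; some texts define these rates differently.

```python
from fractions import Fraction


def screening_metrics(tp, fp, fn, tn):
    """Screening-test measures from the cells A=TP, B=FP, C=FN, D=TN."""
    return {
        "sensitivity": Fraction(tp, tp + fn),          # P(Test+ | Disease)
        "specificity": Fraction(tn, tn + fp),          # P(Test- | No Disease)
        "pv_plus": Fraction(tp, tp + fp),              # P(Disease | Test+)
        "pv_minus": Fraction(tn, tn + fn),             # P(No Disease | Test-)
        "false_positive_rate": Fraction(fp, fp + tn),  # assumed: 1 - specificity
        "false_negative_rate": Fraction(fn, fn + tp),  # assumed: 1 - sensitivity
    }


# Hypothetical counts (NOT the Urine data).
metrics = screening_metrics(tp=20, fp=5, fn=10, tn=65)
for name, value in metrics.items():
    print(f"{name}: {value} = {float(value):.3f}")
```

Sensitivity and specificity condition on the true condition status (the columns), while PV+ and PV- condition on the test result (the rows).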

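Bayes’ Rule
[Table: test result (Negative, Positive) by condition status (Has Condition, Does not have Condition); cut-off and counts not preserved in this transcript.]
Here we define: A = Disease, B = Test Positive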

Choosing Different Cut-Off
[Table: Cut-point, Sensitivity, Specificity for three cut-points, the lowest being "2 or more". The other cut-point labels and all values were not preserved in this transcript.]
- At the lowest cut-point (2 or more): High Sensitivity but Low Specificity
- At the next cut-point: Specificity increased, but you reduce sensitivity
- At the highest cut-point: Very High Specificity, Very Low Sensitivity (High False Negative Rate)

What happens when
- We assign all individuals a positive test result?
  - Sensitivity = P(Test+|Disease) = 1
  - Specificity = P(Test-|No Disease) = 0
  - 1 – Specificity = 1
- We assign all individuals a negative test result?
  - Sensitivity = P(Test+|Disease) = 0
  - Specificity = P(Test-|No Disease) = 1
  - 1 – Specificity = 0

Receiver Operating Characteristic curve (ROC curve)
[Figure: ROC curve built from the table of Cut-point, Sensitivity, and Specificity; values not preserved in this transcript.]

ROC Curves
- Area under the curve = probability that, for a randomly selected pair of normal and abnormal subjects, the test will correctly identify the normal subject given the “measurement”
- Area = 0.89 for the example on the left

Trapezoidal Rule (FYI)
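The slide's formula was not preserved, so the following is a generic sketch of the trapezoidal rule applied to an ROC curve rather than the deck's own calculation. The ROC points are hypothetical (they will not reproduce the 0.89 reported for the Urine example), and `roc_auc_trapezoid` is a name invented here.

```python
def roc_auc_trapezoid(points):
    """Area under an ROC curve from (1 - specificity, sensitivity) points."""
    pts = sorted(points)
    area = 0.0
    # Sum the trapezoid between each pair of neighboring points.
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical ROC points. (0, 0) is the "everyone tests negative" rule and
# (1, 1) is the "everyone tests positive" rule.
roc_points = [(0.0, 0.0), (0.1, 0.5), (0.4, 0.9), (1.0, 1.0)]
auc = roc_auc_trapezoid(roc_points)

# A test no better than chance follows the diagonal.
chance_auc = roc_auc_trapezoid([(0.0, 0.0), (1.0, 1.0)])

print(round(auc, 3), chance_auc)
```

Each cut-point contributes one point, so with only a few cut-points the trapezoids give a coarse approximation of the area.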