Probability. Questions what is a good general size for artifact samples? what proportion of populations of interest should we be attempting to sample?

Presentation transcript:

Probability

Questions
what is a good general size for artifact samples?
what proportion of populations of interest should we be attempting to sample?
how do we evaluate the absence of an artifact type in our collections?

“frequentist” approach
probability should be assessed in purely objective terms
no room for subjectivity on the part of individual researchers
knowledge about probabilities comes from the relative frequency of outcomes over a large number of trials
– this is a good model for coin tossing
– not so useful for archaeology, where many of the events that interest us are unique…

Bayesian approach
Bayes Theorem
– Thomas Bayes
– 18th-century English clergyman
concerned with integrating “prior knowledge” into calculations of probability
problematic for frequentists
– prior knowledge = bias, subjectivity…

basic concepts
probability of event = p
0 <= p <= 1
0 = certain non-occurrence
1 = certain occurrence
.5 = even odds
.1 = 1 chance out of 10

basic concepts (cont.)
if A and B are mutually exclusive events: P(A or B) = P(A) + P(B)
ex., die roll: P(1 or 6) = 1/6 + 1/6 = .33
possibility set: the set of all possible outcomes; their probabilities sum to 1
~A = anything other than A
P(A or ~A) = P(A) + P(~A) = 1

basic concepts (cont.)
discrete vs. continuous probabilities
discrete
– finite number of outcomes
continuous
– outcomes vary along continuous scale

discrete probabilities
[figure: probability p plotted for the outcomes HH, HT, TT]

continuous probabilities
[figure: probability curve; vertical axis p, with ticks at 0, .1, .2]
total area under curve = 1
but the probability of any single value = 0
→ interested in the probability associated with intervals

independent events
one event has no influence on the outcome of another event
if events A & B are independent, then P(A&B) = P(A)*P(B)
if P(A&B) = P(A)*P(B), then events A & B are independent
coin flipping: if P(H) = P(T) = .5, then P(HTHTH) = P(HHHHH) = .5*.5*.5*.5*.5 = .5^5 = .03

if you are flipping a coin and it has already come up heads 6 times in a row, what are the odds of a 7th head?
.5
note that P(10H) < P(4H,6T)
– lots of ways to achieve the 2nd result (therefore much more probable)
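The comparison between ten heads and four heads plus six tails can be checked by counting sequences. This is a minimal sketch assuming a fair coin; the variable names are illustrative and not from the slides.

```python
from math import comb

p_head = 0.5  # fair coin, as assumed in the slide

# Every specific sequence of 10 flips is equally likely.
p_one_sequence = p_head ** 10

# Count the distinct sequences that produce each result.
ways_10_heads = comb(10, 10)  # only one sequence: all heads
ways_4_heads = comb(10, 4)    # choose which 4 of the 10 flips are heads

p_10_heads = ways_10_heads * p_one_sequence
p_4_heads_6_tails = ways_4_heads * p_one_sequence

print(ways_10_heads, ways_4_heads)
print(p_10_heads, p_4_heads_6_tails)
```

Each individual sequence has the same probability; the second result is more probable only because many more sequences lead to it.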

mutually exclusive events are not independent
rather, they are the most dependent kinds of events
– if not heads, then tails
– joint probability of 2 mutually exclusive events is 0: P(A&B) = 0

conditional probability
concerns the odds of one event occurring, given that another event has occurred
P(A|B) = probability of A, given B

e.g., consider a temporally ambiguous, but generally late, pottery type
the probability that an actual example is “late” increases if found with other types of pottery that are unambiguously late…
P = probability that the specimen is late:
isolated: P(Ta) = .7
w/ late pottery (Tb): P(Ta|Tb) = .9
w/ early pottery (Tc): P(Ta|Tc) = .3

conditional probability (cont.)
P(B|A) = P(A&B)/P(A)
if A and B are independent, then
P(B|A) = P(A)*P(B)/P(A)
P(B|A) = P(B)

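Bayes Theorem
can be derived from the basic equation for conditional probabilities
since P(A&B) = P(A|B)*P(B), and P(A) = P(A|B)*P(B) + P(A|~B)*P(~B):
P(B|A) = P(A|B)*P(B) / (P(A|B)*P(B) + P(A|~B)*P(~B))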

application
archaeological data about ceramic design
– bowls and jars, decorated and undecorated
previous excavations show:
– 75% of assemblage are bowls, 25% jars
– of the bowls, about 50% are decorated
– of the jars, only about 20% are decorated
we have a decorated sherd fragment, but it’s too small to determine its form…
what is the probability that it comes from a bowl?

can solve for P(B|A)
events: B = “bowlness”; A = “decoratedness”
P(B) = .75; P(A|B) = .50
P(~B) = .25; P(A|~B) = .20
P(B|A) = .75*.50 / ((.75*.50)+(.25*.20))
P(B|A) = .88

                    bowl            jar
dec.                50% of bowls    20% of jars
undec.              50% of bowls    80% of jars
share of assemblage 75%             25%
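The bowl calculation can be reproduced in a few lines of Python. This is a sketch only; the function name `posterior` and its parameter names are illustrative, and it assumes the two-way split (bowl vs. jar) used in the slide.

```python
def posterior(prior_b, p_a_given_b, p_a_given_not_b):
    """P(B|A) from Bayes' theorem when the alternatives are B and not-B."""
    numerator = prior_b * p_a_given_b
    # Total probability of A: decorated bowls plus decorated jars.
    evidence = numerator + (1.0 - prior_b) * p_a_given_not_b
    return numerator / evidence


# Slide values: P(B) = .75, P(A|B) = .50, P(A|~B) = .20
p_bowl_given_decorated = posterior(0.75, 0.50, 0.20)
print(round(p_bowl_given_decorated, 2))  # 0.88
```

Changing the prior (for example, an assemblage with fewer bowls) shows how strongly the answer depends on what previous excavations suggest.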

Binomial theorem
P(n,k,p)
– probability of k successes in n trials where the probability of success on any one trial is p
– “success” = some specific event or outcome
– k specified outcomes
– n trials
– p probability of the specified outcome in 1 trial

P(n,k,p) = [n! / (k!*(n-k)!)] * p^k * (1-p)^(n-k)
where n! = n*(n-1)*(n-2)…*1 (where n is a positive integer)
0! = 1

misc. useful derivations from BT
if repeated trials are carried out:
– mean successes (k) = n*p
– sd of successes (k) = sqrt(n*p*q) (note: q = 1-p)
(really only approximated when trials are repeated many times…)
k = 0: P(n,0,p) = (1-p)^n
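A minimal Python sketch of P(n,k,p) and the derived quantities, assuming the standard binomial formula C(n,k) * p^k * (1-p)^(n-k). The function names are hypothetical, chosen only for this illustration.

```python
from math import comb, sqrt


def binomial_probability(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def mean_successes(n, p):
    """Expected number of successes: n*p."""
    return n * p


def sd_successes(n, p):
    """Standard deviation of the number of successes: sqrt(n*p*q)."""
    return sqrt(n * p * (1 - p))


# The k = 0 case reduces to (1-p)^n.
print(binomial_probability(15, 0, 0.05), (1 - 0.05) ** 15)
print(mean_successes(50, 0.2), sd_successes(50, 0.2))
```

The probabilities for k = 0 through n always sum to 1, which is a quick sanity check on any implementation.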

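binomial distribution
binomial theorem describes a theoretical distribution that can be plotted in two different ways:
– probability density function (PDF)
– cumulative density function (CDF)
(for a discrete distribution such as the binomial, these are more strictly called a probability mass function and a cumulative distribution function)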

probability density function (PDF)
summarizes how odds/probabilities are distributed among the events that can arise from a series of trials

ex: coin toss
we toss a coin three times, defining the outcome head as a “success”…
what are the possible outcomes?
how do we calculate their probabilities?

coin toss (cont.)
how do we assign values to P(n,k,p)?
– 3 trials; n = 3
– even odds of success; p = .5
– P(3,k,.5): “probability of k successes in n trials where the probability of success on any one trial is p”
there are 4 possible values for ‘k’, and we want to calculate P for each of them

k    outcomes
0    TTT
1    HTT (THT, TTH)
2    HHT (HTH, THH)
3    HHH
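One way to get P(3,k,.5) without any formula is to list all eight equally likely sequences and count heads. The sketch below does this by brute force; it assumes a fair coin, and the variable names are illustrative.

```python
from collections import Counter
from itertools import product

# All sequences of three tosses: HHH, HHT, ..., TTT (8 in total).
outcomes = ["".join(seq) for seq in product("HT", repeat=3)]

# How many sequences contain k heads, for each k.
heads_counts = Counter(seq.count("H") for seq in outcomes)

# With a fair coin every sequence is equally likely, so P = count / total.
probabilities = {k: heads_counts[k] / len(outcomes) for k in sorted(heads_counts)}

for k, p in probabilities.items():
    print(k, heads_counts[k], p)
```

The counts per k (1, 3, 3, 1) match the number of sequences listed for each row of the table.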

practical applications
how do we interpret the absence of key types in artifact samples??
does sample size matter??
does anything else matter??

example
1. we are interested in ceramic production in southern Utah
2. we have surface collections from a number of sites
→ are any of them ceramic workshops??
3. evidence: ceramic “wasters”
→ ethnoarchaeological data suggest that wasters tend to make up about 5% of samples at ceramic workshops

one of our sites → 15 sherds, none identified as wasters…
so, our evidence seems to suggest that this site is not a workshop
how strong is our conclusion??

reverse the logic: assume that it is a ceramic workshop
new question:
– how likely is it to have missed collecting wasters in a sample of 15 sherds from a real ceramic workshop??
P(n,k,p) [n trials, k successes, p prob. of success on 1 trial]
P(15,0,.05) [we may want to look at other values of k…]

[table: P(15,k,.05) for successive values of k …]
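The values of P(15,k,.05) can be tabulated directly. This is a sketch assuming the standard binomial formula; `binomial_probability` and `waster_table` are hypothetical names, and the range of k shown (0 to 3) is an arbitrary choice for illustration.

```python
from math import comb


def binomial_probability(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# 15 sherds, 5% expected waster rate at a real workshop.
waster_table = {k: binomial_probability(15, k, 0.05) for k in range(4)}

print(" k  P(15,k,.05)")
for k, prob in waster_table.items():
    print(f"{k:>2}  {prob:.4f}")
```

The k = 0 row is the quantity of interest: the chance of collecting no wasters at all from a genuine workshop with a sample this small.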

how large a sample do you need before you can place some reasonable confidence in the idea that no wasters = no workshop?
how could we find out??
we could plot P(n,0,.05) against different values of n…

n = 50: less than 1 chance in 10 of collecting no wasters (P ≈ .08)…
n = 100: less than 1 chance in 100 (P ≈ .006)…
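The relationship between sample size and the chance of missing wasters entirely follows from P(n,0,p) = (1-p)^n. The sketch below evaluates it for a few sample sizes; the function name and the particular values of n are illustrative choices.

```python
def p_no_wasters(n, p=0.05):
    """Chance that a sample of n sherds contains no wasters at all,
    if each sherd independently has probability p of being a waster."""
    return (1 - p) ** n


for n in (15, 25, 50, 75, 100):
    print(f"n = {n:>3}  P(n,0,.05) = {p_no_wasters(n):.4f}")
```

The probability falls off geometrically, so each additional sherd multiplies the chance of a complete miss by .95.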

What if wasters existed at a higher proportion than 5%??
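To explore the question for other waster rates, one can search for the smallest sample size at which the chance of finding no wasters drops below a chosen threshold. This is a sketch only; `min_sample_size`, the threshold of 1 in 10, and the alternative rates are assumptions made for illustration.

```python
def min_sample_size(p, threshold):
    """Smallest n for which (1-p)^n <= threshold,
    i.e. the chance of finding no wasters is at most `threshold`."""
    n = 0
    while (1 - p) ** n > threshold:
        n += 1
    return n


# Hypothetical waster rates, compared at a 1-in-10 threshold.
for rate in (0.05, 0.10, 0.20):
    print(f"p = {rate:.2f}  n needed = {min_sample_size(rate, 0.10)}")
```

Higher proportions of wasters need much smaller samples before their absence becomes informative.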

so, how big should samples be?
depends on your research goals & interests
need big samples to study rare items…
“rules of thumb” are usually misguided (ex. “200 pollen grains is a valid sample”)
in general, sheer sample size is more important than the proportion of the population sampled
large samples that constitute a very small proportion of a population may be highly useful for inferential purposes

the plots we have been using are probability density functions (PDF)
cumulative density functions (CDF) have a special purpose
example based on mortuary data…

Pre-Dynastic cemeteries in Upper Egypt
Site 1
– ca. 800 graves
– 160 exhibit body position and grave goods that mark members of a distinct ethnicity (group A)
– relative frequency of 0.2
Site 2
– badly damaged; only 50 graves excavated
– 6 exhibit “group A” characteristics
– relative frequency of 0.12

expressed as a proportion, Site 1 has roughly 1.7 times as many burials of individuals from “group A” as Site 2 (0.2 vs. 0.12)
how seriously should we take this observation as evidence about social differences between underlying populations?

assume for the moment that there is no difference between these societies: they represent samples from the same underlying population
how likely would it be to collect our Site 2 sample from this underlying population?
we could use data merged from both sites as a basis for characterizing this population
but since the sample from Site 1 is so large, let’s just use it…

Site 1 suggests that about 20% of our society belong to this distinct social class…
if so, we might have expected that 10 of the 50 graves excavated at Site 2 would belong to this class
but we found only 6…

how likely is it that this difference (10 vs. 6) could arise just from random chance??
to answer this question, we have to be interested in more than just the probability associated with the single observed outcome “6”
we are also interested in the total probability associated with outcomes that are more extreme than “6”…

imagine a simulation of the discovery/excavation process of graves at Site 2:
repeated drawing of 50 balls from a jar:
– ca. 800 balls
– 80% black, 20% white
on average, samples will contain 10 white balls, but individual samples will vary

we keep score of how many times we draw a sample that is as divergent as, or more divergent than, our real-world sample (relative to the mean sample)…
this means we have to tally all samples that produce 6, 5, 4…0 white balls…
a tally of just those samples with 6 white balls eliminates crucial evidence…
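The ball-drawing experiment can be simulated directly. The sketch below assumes a jar of exactly 800 balls (160 white, 640 black), drawn without replacement in samples of 50; the function name, the number of repetitions, and the fixed random seed are arbitrary choices for illustration.

```python
import random


def simulate_tail(n_draws=50, n_white=160, n_black=640,
                  observed=6, n_samples=20000, seed=1):
    """Estimate how often a random sample of n_draws balls contains
    `observed` or fewer white balls (the tally of 6, 5, 4...0)."""
    rng = random.Random(seed)
    jar = [1] * n_white + [0] * n_black  # 1 = white ball, 0 = black ball
    as_extreme = 0
    for _ in range(n_samples):
        whites = sum(rng.sample(jar, n_draws))  # draw without replacement
        if whites <= observed:
            as_extreme += 1
    return as_extreme / n_samples


estimate = simulate_tail()
print(estimate)
```

Because the balls are drawn without replacement from a finite jar, the simulated proportion need not match a binomial calculation exactly, though with 800 balls and samples of 50 the two should be close.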

we can use the binomial theorem instead of the drawing experiment, but the same logic applies
a cumulative density function (CDF) displays probabilities associated with a range of outcomes (such as 6 to 0 graves with evidence for elite status)

[table columns: n, k, p, P(n,k,p), cumP]
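The individual and cumulative probabilities for the Site 2 sample can be computed with the binomial formula. This is a sketch assuming n = 50 and p = .2 as in the preceding slides; the function and variable names are hypothetical.

```python
from math import comb


def binomial_probability(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 50, 0.2
cumulative = {}  # cumulative[k] = P(k or fewer successes)
running_total = 0.0

print(" n   k    p   P(n,k,p)    cumP")
for k in range(7):  # k = 0 through the observed value of 6
    prob = binomial_probability(n, k, p)
    running_total += prob
    cumulative[k] = running_total
    print(f"{n:>2}  {k:>2}  {p:.1f}   {prob:.5f}  {running_total:.5f}")
```

The cumP value in the last row is the total probability of finding 6 or fewer such graves among 50 if the true proportion is 20%.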

so, the odds are about 1 in 10 that the differences we see could be attributed to random effects rather than social differences
you have to decide what this observation really means, and other kinds of evidence will probably play a role in your decision…