Bayesian Epistemology


1 Bayesian Epistemology
Phil 218/338

2 Welcome and thank you!

3 Outline
Part I: What is Bayesian epistemology?
Probabilities as credences
The axioms of probability
Conditionalisation
Part II: Applications and problems: Theism
Bear with me! Ideally we would discuss these topics over several lectures.

4 What is Bayesian Epistemology?
Bayesianism is our “leading theory of uncertainty” (Alan Hájek and Stephan Hartmann)
It concerns credences, or degrees of belief, which are often uncertain, e.g. “I’m not going to be attacked by a duck tomorrow”
Bayesianism ≈ a theory about when our credences are rational or justified (one which may complement other theories of justification)
There are many varieties of Bayesianism (Irving Good calculated that there are at least 46,656!)
Bayesian epistemology is the “application of Bayesian methods to epistemological problems.”

5 First component of Bayesianism: Probabilities as credences

6 Credences
Traditional epistemology deals primarily with qualitative concepts:
Belief/disbelief
Knowledge/ignorance
In Bayesian epistemology, these binary concepts are arguably less central and therefore receive less attention
Bayesian epistemology deals largely with a quantitative concept of credences
Credences ≈ degrees of belief or disbelief

7 First component of Bayesianism: Probabilities as credences
In the 17th century, mathematicians Blaise Pascal and Pierre de Fermat pioneered a representation of uncertainty as probabilities
Subjective interpretation of probability: ‘Probability is degree of belief’
But whose degree of belief? Some actual person, or some ideal person?
This is called the subjective or personal interpretation of probability because these probabilities concern the psychological state of a subject or person

8 Terminology
h = hypothesis/proposition
~h = negation of the hypothesis
P(h) = probability of the hypothesis
Example:
h = It will rain tomorrow
P(h) = Probability that it will rain tomorrow
These terms are on your handout

13 Quantitative nature of credences
Credences (or subjective probabilities) are taken to be associated with a numerical value or an interval
P(h) - decimal    P(h) in %     P(h) in normal language    P(~h) in normal language
P(h)=1            P(h)=100%     h is certainly true        ~h is certainly false
P(h)=0            P(h)=0%       h is certainly false       ~h is certainly true
P(h)=.8           P(h)=80%      h is probably true         ~h is probably not true
P(h)=.2           P(h)=20%      h is probably not true     ~h is probably true

14 Measuring credences
Consider your credence that h, the sun will rise tomorrow
Consider your credence that you will (by random selection) draw a red marble from an urn containing 5 red marbles and 5 black marbles
Are you more confident that the sun will rise tomorrow?
If yes, then P(h)>.5

15 Measuring credences
Consider your credence that h, the sun will rise tomorrow
Consider your credence that you will (by random selection) draw a red marble from an urn containing 90 red marbles and 10 black marbles
Are you more confident that the sun will rise tomorrow?
If yes, then P(h)>.9

16 Measuring credences
Consider your credence that h, the sun will rise tomorrow
Consider your credence that you will (by random selection) draw a red marble from an urn containing 9,999 red marbles and 1 black marble
Are you more confident that the sun will rise tomorrow?
If yes, then P(h)>.9999
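
The urn comparisons above amount to a procedure for bounding a credence from below. A minimal Python sketch of that procedure follows; the function name `credence_lower_bound` and the three "yes" answers are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction

def credence_lower_bound(comparisons):
    """Return the largest urn proportion the agent's credence in h exceeds.

    comparisons: list of (red, black, more_confident_in_h) tuples, where the
    boolean records a (hypothetical) answer to "are you more confident in h
    than in drawing a red marble from this urn?".
    """
    bound = Fraction(0)
    for red, black, more_confident_in_h in comparisons:
        if more_confident_in_h:
            # A "yes" means P(h) exceeds the chance of drawing red.
            bound = max(bound, Fraction(red, red + black))
    return bound

# Assumed answers: "yes" to all three urns from the slides.
answers = [(5, 5, True), (90, 10, True), (9999, 1, True)]
bound = credence_lower_bound(answers)
print(f"P(h) > {float(bound)}")
```

A "no" answer would instead give an upper bound, which is one way an interval-valued credence could arise.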

17 Measuring credences
What about your credence that:
It will rain tomorrow
You will be attacked by a duck tomorrow
Maybe an interval might represent your credences better
If h = It will rain tomorrow
Then P(h) = [.6, .7]
What do you think? Can all of our credences be represented with numerical values?

18 Objections to the subjective interpretation
The probability of h given some evidence e does not mean someone’s actual credence, since there may be no actual credence that is relevant
It’s not clear that the probability of h given some evidence e is the credence of some epistemically rational agent
When is an agent’s credence epistemically rational?
When their credence for h given e equals the (inductive) probability of h given e? This is uninformative! (Patrick Maher)
When their belief is not blameworthy from an epistemic point of view? But someone might blamelessly mistake the probability of h given e to be low when it is in fact high (Patrick Maher)
Isn’t this just like saying “A proposition is true if and only if an omniscient God would believe it”? It’s uninformative

19 Alternatives
Inductive probabilities are conceptual primitives: they can be understood, but not expressed in terms of other simpler concepts (Patrick Maher)
Probabilities are relative frequencies, which we might loosely understand as the proportion of the time that something is true (the frequentist interpretation of probability)
80% of the time when a student sits this course, it is true that they pass
60% of the time when a patient undergoes chemotherapy, it is true that they will recover

20 Second component of Bayesianism: Credences should conform to the axioms (or rules) of probability

21 Second component of Bayesianism: Credences should conform to the axioms (or rules) of probability
(A1) All probabilities are between 0 and 1, i.e. 0 ≤ P(h) ≤ 1 for any h.
(A2) Logical truths have a probability of 1, i.e. P(T) = 1 for any tautology T.
(A3) Where h1 and h2 are two mutually exclusive hypotheses, the probability of h1 or h2 (h1 ∨ h2) is the sum of their respective probabilities, i.e. P(h1 ∨ h2) = P(h1) + P(h2).
These are on your handout

22 The axioms in action
Suppose you draw a marble from an urn
You set:
r = the marble you have drawn is red
~r = the marble you have drawn is not red
Suppose the urn contains 3 red marbles and 7 black marbles
You set:
P(r) = .3 (30%)
P(~r) = .7 (70%)
These assignments conform to axiom 1
By axioms 2 and 3, P(r ∨ ~r) = 1 (100%)
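
The check described on this slide can be sketched in Python. This is only an illustration for the two-cell case {r, ~r}; the function name `conforms_to_axioms` is an assumption, and the second assignment (.7 and .5) is included purely as a contrasting case that sums to 1.2.

```python
import math

def conforms_to_axioms(p_r: float, p_not_r: float) -> bool:
    """Check a credence assignment over {r, ~r} against axioms A1 to A3."""
    # A1: every credence lies between 0 and 1.
    a1 = 0.0 <= p_r <= 1.0 and 0.0 <= p_not_r <= 1.0
    # A2 and A3 together: r and ~r are mutually exclusive and (r or ~r) is a
    # tautology, so P(r) + P(~r) must equal P(r or ~r) = 1.
    a2_a3 = math.isclose(p_r + p_not_r, 1.0)
    return a1 and a2_a3

urn_assignment = conforms_to_axioms(0.3, 0.7)       # 3 red, 7 black marbles
contrast_assignment = conforms_to_axioms(0.7, 0.5)  # credences summing to 1.2
print(urn_assignment, contrast_assignment)
```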

23 Arguments for conformity to the axioms
Argument from cases
Lindley draws out rules of probability from the urn example
We can prove other theorems using the axioms and see that they make sense using the example
E.g. P(~r) = 1 − P(r)
Dutch book arguments
Dutch book = a combination of bets which an individual might accept individually, but which collectively entail that they will lose money

24 A Dutch book
If one violates the probability axioms, then they are vulnerable to having a Dutch book made against them
E.g. suppose you violate A2 or A3 by setting
(1) P(r) = .7
(2) P(~r) = .5
If you conform to axiom 2, then you do not conform to axiom 3
By axiom 2, P(r ∨ ~r) = 1
But by assignments 1 and 2, P(r) + P(~r) = .7 + .5 = 1.2
So, contrary to axiom 3, P(r ∨ ~r) ≠ P(r) + P(~r) because 1 ≠ 1.2

25 A Dutch book
If one violates the probability axioms, then they are vulnerable to having a Dutch book made against them
E.g. suppose you violate A2 or A3 by setting
(1) P(r) = .7
(2) P(~r) = .5
But if you conform to axiom 3, then you do not conform to axiom 2
By axiom 3, P(r ∨ ~r) = P(r) + P(~r)
So by assignments 1 and 2, P(r ∨ ~r) = P(r) + P(~r) = .7 + .5 = 1.2
So, contrary to axiom 2, P(r ∨ ~r) ≠ 1 because 1 ≠ 1.2

26 A Dutch book
If one violates the probability axioms, then they are vulnerable to having a Dutch book made against them
E.g. suppose you violate A2 or A3 by setting
(1) P(r) = .7
(2) P(~r) = .5
If you conform to axiom 2, then you do not conform to axiom 3
But if you conform to axiom 3, then you do not conform to axiom 2
So you cannot conform to the axioms

27 A Dutch book
Suppose you violate A2 or A3 by setting
(1) P(r) = .7
(2) P(~r) = .5
                          r       ~r
Bet 1 for assignment 1    +$3     -$7
Bet 2 for assignment 2    -$5     +$5
If r occurs, then you win $3 according to the first bet and lose $5 according to the second, so you lose $2
If r does not occur, then you lose $7 according to the first bet and gain $5 according to the second, so you lose $2
Either way, you lose $2.
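
The payoffs in this Dutch book can be recomputed with a short Python sketch. It assumes a standard betting setup that the slide does not spell out: for each proposition you pay credence × $10 for a ticket that returns $10 if the proposition is true and nothing otherwise. The function name `dutch_book` and the $10 stake are illustrative.

```python
from fractions import Fraction

def dutch_book(credence_r, credence_not_r, stake=10):
    """Total net payoff in each outcome from betting at credence-based prices."""
    price_r = credence_r * stake          # cost of the ticket on r
    price_not_r = credence_not_r * stake  # cost of the ticket on ~r
    # Net payoff of each bet in each outcome (winnings minus price paid).
    bet_on_r = {"r": stake - price_r, "~r": -price_r}
    bet_on_not_r = {"r": -price_not_r, "~r": stake - price_not_r}
    return {o: bet_on_r[o] + bet_on_not_r[o] for o in ("r", "~r")}

# Incoherent credences: P(r) = .7 and P(~r) = .5.
incoherent = dutch_book(Fraction(7, 10), Fraction(1, 2))
# Coherent credences from the urn example: P(r) = .3 and P(~r) = .7.
coherent = dutch_book(Fraction(3, 10), Fraction(7, 10))
print(incoherent, coherent)
```

Under these assumptions the incoherent credences lose $2 in both outcomes, while the coherent ones break even in both.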

28 Dutch book argument
If someone violates the probability axioms, then they are vulnerable to having a Dutch book made against them
One should avoid being vulnerable to having a Dutch book made against oneself (because this is a rational defect)
Therefore, one should avoid violating the axioms of probability

29 An objection to the second component
Conformity to the axioms requires logical omniscience, but no one is omniscient
“You’re right, but the component only sets an ideal standard, irrespective of whether anyone can meet it”

30 Questions? Do you think that one’s credences should conform to the axioms of probability?

31 Third component of Bayesianism: Credences should be updated via conditionalisation

32 Terminology
Before examining this component, we need to introduce some terms
Conditional probability = P(p|q) = the probability of p on the condition that q obtains = the probability of p given q
RATIO formula as an analysis of conditional probability:
P(p|q) = P(p & q) / P(q), where P(q) > 0

33 Example of a conditional probability
m = Taylor is a mother
f = Taylor is female
P(m|f) = the probability that Taylor is a mother given that Taylor is female
P(f) = .5
P(m & f) = .2
So: P(m|f) = P(m & f) / P(f) = .2 / .5 = .4
Note the big difference between P(m|f) and P(f|m):
P(m|f) = .4
P(f|m) = 1
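
The ratio formula applied to this example can be sketched in Python. The function name `conditional` is illustrative. The value P(m) = .2 is not stated on the slide; it is inferred from P(f|m) = 1, which requires P(m & f) = P(m).

```python
import math

def conditional(p_joint: float, p_condition: float) -> float:
    """Ratio formula: P(p|q) = P(p & q) / P(q), defined only when P(q) > 0."""
    if p_condition <= 0:
        raise ValueError("the ratio formula requires P(q) > 0")
    return p_joint / p_condition

p_f = 0.5        # P(f), from the slide
p_m_and_f = 0.2  # P(m & f), from the slide
p_m = 0.2        # P(m): inferred assumption, since P(f|m) = 1

p_m_given_f = conditional(p_m_and_f, p_f)  # probability of mother given female
p_f_given_m = conditional(p_m_and_f, p_m)  # probability of female given mother
print(p_m_given_f, p_f_given_m)
```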

34 Likelihoods
A likelihood = P(e|h), where e represents some evidence and h a hypothesis.
P(e|h) is called the likelihood of h on e.

35 Prior probabilities
P_i(h) = Your prior probability = “your subjective probability for the hypothesis immediately before the evidence comes in” (Michael Strevens, emphasis added)
Terms:
e = A person, such as Taylor, smiles at you
h = A person, such as Taylor, likes you
~h = A person, such as Taylor, does not like you
P_i(h) = prior probability of a person, such as Taylor, liking you
P(h|e) = probability of a person, such as Taylor, liking you given that s/he smiles at you
What is the probability that Taylor likes you given that he or she smiled at you? P(h|e)

36 What is the prior probability that Taylor likes you?
Suppose you survey 100 people and find the following:

37 What is the probability that Taylor likes you given the evidence?
P(h|e) = ? P(h|e) = 9/(9+36) = 9/45 = 1/5 = 20% = .2

38 Posterior probabilities
What is the probability that Taylor likes you given the evidence?
P_i(h) = Your prior probability = “your subjective probability for the hypothesis immediately before the evidence comes in” (Michael Strevens, emphasis added)
P_f(h) = Your posterior probability = “your subjective probability immediately after the evidence (and nothing else) comes in” (emphasis added)
Conditionalisation: One should adjust their probability for h from their prior probability P_i(h) to a posterior probability P_f(h) which equals P(h|e) when having acquired some evidence e (which has a non-zero initial probability).
This is called conditionalising h on e.
Conditionalisation should occur through Bayes’s theorem (where applicable).

39 Conditionalisation via Bayes’s theorem
P(h|e) = P(e|h) × P_i(h) / P_i(e)
Where P_i(e) = P(e|h) × P_i(h) + P(e|~h) × P_i(~h)
Application to the case:
.2 = (.9 × .1) / .45
Where .45 = .9 × .1 + .4 × .9
Bayes’s theorem was expressed in a paper by Rev. Thomas Bayes that was published posthumously.
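
A minimal Python sketch of conditionalisation via Bayes’s theorem for the Taylor case follows. The transcript preserves P(e|h) = .9, P_i(e) = .45 and the result .2; the prior P_i(h) = .1 and the likelihood P(e|~h) = .4 used below are inferred from those figures and should be read as assumptions. The function name `bayes_posterior` is illustrative.

```python
import math

def bayes_posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(h|e), computing P(e) by the law of total probability."""
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    if p_e <= 0:
        raise ValueError("conditionalisation requires P(e) > 0")
    return p_e_given_h * prior_h / p_e

# Route 1: the survey counts (9 smilers who like you, 36 who do not).
from_counts = 9 / (9 + 36)

# Route 2: Bayes's theorem with the inferred (assumed) prior and likelihood.
from_bayes = bayes_posterior(prior_h=0.1, p_e_given_h=0.9, p_e_given_not_h=0.4)

print(round(from_counts, 3), round(from_bayes, 3))
```

Both routes should agree, since the counts and the probabilities describe the same survey.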

40 Arguments for the conditionalisation norm
Case-by-case evidence Bayes’s theorem is used widely in statistics Dutch-book arguments

41 Part II: Applications and problems

42 Does God exist?
P_i(h) = ? (where h = theism)
(One version of) The principle of indifference: In the absence of evidence favouring one possibility over another, assign each possibility an equal probability
The principle of indifference seems intuitively plausible in many cases
E.g. all you know is that a prize is behind one of three doors
Presumably the probability that it is behind a given door is 1/3 or approximately .33
Application to theism: Either h or ~h, so P_i(h) = .5
Sounds reasonable, right? WRONG!

43 Multiple partitions problem
Suppose you’re cooking dinner for Jed, but you don’t know whether he eats meat
One partition of possibilities: Either 1) Jed is a meat eater (h) or 2) he is not a meat eater (~h), so P_i(h) = .5
Another partition of possibilities: 1) Jed is a meat eater (h), 2) Jed is a vegetarian (v1) or 3) Jed is a vegan (v2), so P_i(h) = 1/3
The problem is that the space of possibilities can be partitioned differently, so that it is unclear how or whether to apply the principle of indifference
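
The partition dependence can be made concrete with a small Python sketch. It simply applies the principle of indifference to whichever partition it is handed; the function name `indifference` and the cell labels are illustrative.

```python
from fractions import Fraction

def indifference(partition):
    """Assign equal probability to every cell of a partition."""
    n = len(partition)
    return {cell: Fraction(1, n) for cell in partition}

# Two ways of carving up the same space of possibilities about Jed.
coarse = indifference(["meat eater", "not a meat eater"])
fine = indifference(["meat eater", "vegetarian", "vegan"])

print(coarse["meat eater"], fine["meat eater"])
```

The same hypothesis receives a different prior depending only on how the alternatives are described.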

44 Application to theism
Either h or ~h, so P_i(h) = .5
But what about another partition? Either:
There is no ultimate cause of the universe
Or there is an ultimate cause of the universe, but this cause is not a person (or conscious being)
Or there is a personal and ultimate cause of the universe, but this cause is not omnibenevolent
Or there is a personal, omnibenevolent and ultimate cause of the universe, but this cause is not omnipotent
Or theism is true
So already P_i(h) < 1/5 according to the principle of indifference!

45 The problem of the priors: Subjective and objective Bayesianism
We can partition the logical possibilities differently so as to yield conflicting results when the principle of indifference applies
So which partition do we go with?
Some think that there is no uniquely correct partition
So how do we determine P_i(h)?
Subjectivists: Well, just pick any value you like; no value is incorrect, except perhaps 1 or 0
Objectivists: There is a uniquely correct value for P_i(h), and it is…
Let’s move on and assume that P_i(h) = .5, just for illustration

46 What evidence is there that God exists?
Theistic evidence:
Fine-tuning of laws and constants
A universe
Moral truths
Miracle reports
Abiogenesis (origins of life)
Consciousness
Atheistic evidence:
Human suffering
Animal suffering
Non-resistant non-belief in God
Scale of the universe
Contradictory theistic theories
Theism is less simple (Occam’s razor)

47 The fine-tuning argument
e1 = the laws of the universe are finely tuned to permit meaningful life: According to philosopher Robin Collins, if the strength of the gravitational force were to change by one part in 10^36, then any land-based or aquatic organisms the size of humans would be crushed.
Likelihoods:
P(e1|h) = .5
P(e1|~h) = 1/10^36
Note that I will assume that ~h is equivalent to Western philosophical atheism (rather than also including polytheism, pantheism, etc.)
What is the posterior probability of theism? P(h|e1) ≈ 1

48 The fine-tuning argument – Just kidding!
e1 = the laws of the universe are finely tuned to permit meaningful life: According to philosopher Robin Collins, if the strength of the gravitational force were to change by one part in 10^36, then any land-based or aquatic organisms the size of humans would be crushed.
Likelihoods:
P(e1|h) = .5
P(e1|~h) = .01
Note that I will assume that ~h is equivalent to Western philosophical atheism (rather than also including polytheism, pantheism, etc.)
What is the posterior probability of theism? P(h|e1) ≈ .98

49 The multiverse objection
If there were (infinitely) many universes with the values of their laws randomly generated by chance, then we wouldn’t be surprised to see that one of them happens to have life-permitting values
In Bayesian terms: Perhaps a is true, where a = there is an (infinitely) large number of other universes with values randomly generated by chance, and P(e|~h & a) = 1 (or some relatively high figure)

50 The argument from suffering
e2 = humans suffer and this is a bad thing
Genocide
Oppression
Missing buses
Now our prior probability relative to e2 is our posterior probability relative to e1, so P_i(h) ≈ .98
What are the likelihoods?
Logical argument from evil (J.L. Mackie):
P(e2|h) = 0
P(e2|~h) = .5
So, P(h|e2) = 0

51 The argument from suffering
e2 = humans suffer and this is a bad thing
Genocide
Oppression
Missing buses
Now our prior probability relative to e2 is our posterior probability relative to e1, so P_i(h) ≈ .98
What are the likelihoods?
Evidential argument from evil (William Rowe):
P(e2|h) = .01
P(e2|~h) = .5
So, P(h|e2) ≈ .5
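
The two-step update across the fine-tuning and suffering slides can be sketched in Python, using the illustrative prior P_i(h) = .5 and the likelihoods given on the slides. The function name `conditionalise` is an assumption. The sketch treats each posterior as the prior for the next piece of evidence, as the slide describes.

```python
def conditionalise(prior_h, p_e_given_h, p_e_given_not_h):
    """Return the posterior P(h|e) given a prior and the two likelihoods."""
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    if p_e <= 0:
        raise ValueError("conditionalisation requires P(e) > 0")
    return p_e_given_h * prior_h / p_e

prior = 0.5  # illustrative prior assumed earlier in the lecture

# Step 1: fine-tuning evidence e1 with P(e1|h) = .5 and P(e1|~h) = .01.
after_e1 = conditionalise(prior, 0.5, 0.01)

# Step 2a: evidential argument from evil, P(e2|h) = .01 and P(e2|~h) = .5.
after_e2_evidential = conditionalise(after_e1, 0.01, 0.5)

# Step 2b: logical argument from evil, P(e2|h) = 0 and P(e2|~h) = .5.
after_e2_logical = conditionalise(after_e1, 0.0, 0.5)

print(round(after_e1, 2), round(after_e2_evidential, 2), after_e2_logical)
```

With these particular numbers the two likelihood ratios (50 to 1, then 1 to 50) cancel, which is why the evidential version lands back at the starting prior.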

52 Sceptical theism
“God knows a lot more than us and would have reasons to justify his actions which we do not know of”
“So if God existed, there was suffering and we did not see any reason that would justify God’s permission of the suffering, then we would not be surprised”
More sophisticated defences of versions of sceptical theism are given by Stephen Wykstra and Daniel Howard-Snyder

53 The problem of the priors
There is sometimes a lot of debate about the likelihoods, or at least about what the relevant likelihoods are
Suppose we agree that:
P(e|h) = .9
P(e|~h) = .1
So if we assume that P_i(h) = .5, then P(h|e) = .9
But if we assume that P_i(h) = .1, then P(h|e) = .5
And if we assume that P_i(h) = .00001, then P(h|e) ≈ .00009
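
The sensitivity to the prior can be checked with a short Python sketch that uses the odds form of Bayes’s theorem (posterior odds = likelihood ratio × prior odds). The odds form is a swapped-in but equivalent formulation, not the one shown in the lecture, and the function name `posterior_from_odds` is illustrative.

```python
def posterior_from_odds(prior_h, p_e_given_h, p_e_given_not_h):
    """Posterior P(h|e) computed via odds; requires 0 <= prior_h < 1."""
    likelihood_ratio = p_e_given_h / p_e_given_not_h
    prior_odds = prior_h / (1 - prior_h)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1 + posterior_odds)

# Agreed likelihoods from the slide: P(e|h) = .9 and P(e|~h) = .1.
posteriors = {
    prior: posterior_from_odds(prior, 0.9, 0.1)
    for prior in (0.5, 0.1, 0.00001)
}

for prior, post in posteriors.items():
    print(f"P_i(h) = {prior}: P(h|e) = {post:.6f}")
```

With a likelihood ratio of 9, the tiny prior of .00001 yields a posterior of roughly .00009, so the evidence multiplies the credence about ninefold without making it large.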

54 The problem of the priors

55 The problem of the priors
The posterior probability is sensitive to the value of the prior probability
Subjective Bayesians often think that the subjectivity of the prior is not a major problem since the subjectivity will be “washed out” as evidence accumulates
So two people starting off with different priors will converge on the probable truth given their conditioning on a growing body of evidence
However, as Alan Hájek notes: “Indeed, for any range of evidence, we can find in principle an agent whose prior is so pathological that conditionalizing on that evidence will not get him or her anywhere near the truth, or the rest of us.”
And there are other worries
So does the problem of the priors render Bayesianism practically useless?
Does it eliminate scepticism about the reliability of inductive inference?
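
The “washing out” claim and Hájek’s caveat can both be illustrated with a toy Python sketch. It rests on strong assumptions that are not in the lecture: ten pieces of evidence arrive, each independent of the others given h and given ~h, and each has the likelihoods P(e|h) = .9 and P(e|~h) = .1 borrowed from the previous example. The function names are illustrative.

```python
def update(prior_h, p_e_given_h=0.9, p_e_given_not_h=0.1):
    """One step of conditionalisation on a single piece of evidence."""
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    return p_e_given_h * prior_h / p_e

def trajectory(prior_h, n_updates):
    """Credence in h after each of n_updates successive updates."""
    credences = [prior_h]
    for _ in range(n_updates):
        credences.append(update(credences[-1]))
    return credences

moderate = trajectory(0.5, 10)      # agent starting at .5
sceptic = trajectory(0.00001, 10)   # agent starting at .00001
dogmatist = trajectory(0.0, 10)     # "pathological" prior of exactly 0

print(round(moderate[-1], 5), round(sceptic[-1], 5), dogmatist[-1])
```

Under these assumptions the first two agents end up within a hair of each other, while the agent whose prior is 0 never moves, which is one simple instance of the kind of prior Hájek has in mind.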

56 Questions?

57 Thank you!

