Bayesian Epistemology

Presentation transcript:

Bayesian Epistemology Phil 218/338

Welcome and thank you!

Outline
Part I: What is Bayesian epistemology?
1) Probabilities as credences
2) The axioms of probability
3) Conditionalisation
Part II: Applications and problems: theism
Bear with me! Ideally we would discuss these topics over several lectures.

What is Bayesian Epistemology? Bayesianism is our "leading theory of uncertainty" (Alan Hájek and Stephan Hartmann). It concerns credences, or degrees of belief, which typically fall short of certainty, e.g. my belief that I'm not going to be attacked by a duck tomorrow. Bayesianism ≈ a theory about when our credences are rational or justified (one which may complement other theories of justification). There are many varieties of Bayesianism (Irving Good calculated that there are at least 46,656!). Bayesian epistemology is the "application of Bayesian methods to epistemological problems."

First component of Bayesianism: Probabilities as credences

Credences Traditional epistemology deals primarily with qualitative concepts: belief/disbelief, knowledge/ignorance. In Bayesian epistemology, these binary concepts are arguably less central and therefore receive less attention. Bayesian epistemology deals largely with a quantitative concept: credences ≈ degrees of belief or disbelief.

First component of Bayesianism: Probabilities as credences
In the 17th century, the mathematicians Blaise Pascal and Pierre de Fermat pioneered a representation of uncertainty as probabilities. Subjective interpretation: 'Probability is degree of belief.' But whose degree of belief? Some actual person, or some ideal person. This is called the subjective or personal interpretation of probability because these probabilities concern the psychological state of a subject or person.

Terminology
h = hypothesis/proposition
~h = negation of the hypothesis
P(h) = probability of the hypothesis
Example: h = It will rain tomorrow; P(h) = probability that it will rain tomorrow
These terms are on your handout.

Quantitative nature of credences
Credences (or subjective probabilities) are taken to be associated with a numerical value or an interval:

P(h) - decimal | P(h) in % | P(h) in normal language | P(~h) in normal language
P(h) = 1       | 100%      | h is certainly true     | ~h is certainly false
P(h) = 0       | 0%        | h is certainly false    | ~h is certainly true
P(h) = .8      | 80%       | h is probably true      | ~h is probably not true
P(h) = .2      | 20%       | h is probably not true  | ~h is probably true

Measuring credences Consider your credence that h, the sun will rise tomorrow. Now consider your credence that you will (after random selection) draw a red marble from an urn containing 5 red marbles and 5 black marbles. Are you more confident that the sun will rise tomorrow? If yes, then P(h) > .5

Measuring credences Consider your credence that h, the sun will rise tomorrow. Now consider your credence that you will (after random selection) draw a red marble from an urn containing 90 red marbles and 10 black marbles. Are you more confident that the sun will rise tomorrow? If yes, then P(h) > .9

Measuring credences Consider your credence that h, the sun will rise tomorrow. Now consider your credence that you will (via random selection) draw a red marble from an urn containing 9,999 red marbles and 1 black marble. Are you more confident that the sun will rise tomorrow? If yes, then P(h) > .9999
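
The three urn comparisons above amount to a binary search over urn proportions. A minimal sketch of that elicitation procedure (all names hypothetical, assuming an agent who can answer every such comparison):

```python
# Hypothetical sketch: `prefers_h_over_urn(p)` stands in for the agent's answer
# to "are you more confident in h than in drawing red from an urn whose
# proportion of red marbles is p?"

def estimate_credence(prefers_h_over_urn, tolerance=1e-4):
    """Bracket the agent's credence in h between urn proportions."""
    lo, hi = 0.0, 1.0
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        if prefers_h_over_urn(mid):
            lo = mid  # more confident in h than in this urn draw
        else:
            hi = mid
    return (lo + hi) / 2

# An agent whose credence in sunrise is, say, .9999:
agent = lambda p: 0.9999 > p
print(estimate_credence(agent))  # close to .9999
```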

Measuring credences
What about your credence that:
1) It will rain tomorrow
2) You will be attacked by a duck tomorrow
Maybe an interval might represent your credences better: if h = It will rain tomorrow, then P(h) = [.6, .7]. What do you think? Can all of our credences be represented with numerical values?

Objections to the subjective interpretation
The probability of h given some evidence e does not mean someone's actual credence, since there may be no actual credence that is relevant. Nor is it clear that the probability of h given some evidence e is the credence of some epistemically rational agent. When is an agent's credence epistemically rational?
1) When their credence for h given e equals the (inductive) probability of h given e? This is uninformative! (Patrick Maher)
2) When their belief is not blameworthy from an epistemic point of view? But someone might accidentally mistake the probability of h given e to be low and not be blameworthy, even though the probability of h given e is in fact high (Patrick Maher). Isn't this just like saying "A proposition is true if and only if an omniscient God were to believe it"? It's uninformative.

Alternatives
1) Inductive probabilities are conceptual primitives: they can be understood, but not expressed in terms of other simpler concepts (Patrick Maher)
2) Probabilities are relative frequencies, which we might loosely understand as the proportion of the time that something is true (the frequentist interpretation of probability)
80% of the time when a student sits this course, it is true that they pass
60% of the time when a patient undergoes chemotherapy, it is true that they will recover

Second component of Bayesianism: Credences should conform to the axioms (or rules) of probability

Second component of Bayesianism: Credences should conform to the axioms (or rules) of probability
(A1) All probabilities are between 0 and 1, i.e. 0 ≤ P(h) ≤ 1 for any h.
(A2) Logical truths have a probability of 1, i.e. P(T) = 1 for any tautology T.
(A3) Where h1 and h2 are two mutually exclusive hypotheses, the probability of h1 or h2 (h1 ∨ h2) is the sum of their respective probabilities, i.e. P(h1 ∨ h2) = P(h1) + P(h2).
These are on your handout.

The axioms in action
Suppose you draw a marble from an urn. You set:
r = the marble you have drawn is red
~r = the marble you have drawn is not red
Suppose the urn is comprised of 3 red marbles and 7 black marbles. You set:
P(r) = .3 (30%)
P(~r) = .7 (70%)
These assignments conform to axiom 1. By axioms 2 and 3, P(r ∨ ~r) = 1 (100%).
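
The urn assignments can be checked against the axioms mechanically. A minimal sketch (not from the slides):

```python
# Check the urn credences against the three axioms for the partition {r, ~r}.
credences = {"r": 0.3, "not_r": 0.7}

# A1: every probability lies in [0, 1]
assert all(0 <= p <= 1 for p in credences.values())

# A3: r and ~r are mutually exclusive, so P(r or ~r) = P(r) + P(~r)
p_r_or_not_r = credences["r"] + credences["not_r"]

# A2: "r or ~r" is a tautology, so its probability must be 1
assert abs(p_r_or_not_r - 1) < 1e-9
print(round(p_r_or_not_r, 10))  # 1.0
```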

Arguments for conformity to the axioms
1) Argument from cases: Lindley draws out the rules of probability from the urn example. We can prove other theorems using the axioms and see that they make sense using the example, e.g. P(~r) = 1 - P(r).
2) Dutch book arguments. A Dutch book = a combination of bets which an individual might accept individually, but which collectively entail that they will lose money.

A Dutch book
If one violates the probability axioms, then they are vulnerable to having a Dutch book made against them. E.g. suppose you violate A2 or A3 by setting:
1) P(r) = .7
2) P(~r) = .5
If you conform to axiom 2, then you do not conform to axiom 3: by axiom 2, P(r ∨ ~r) = 1, but by the above assignments 1 and 2, P(r) + P(~r) = .7 + .5 = 1.2. So, contrary to axiom 3, P(r ∨ ~r) ≠ P(r) + P(~r), because 1 ≠ 1.2.

A Dutch book
If one violates the probability axioms, then they are vulnerable to having a Dutch book made against them. E.g. suppose you violate A2 or A3 by setting:
1) P(r) = .7
2) P(~r) = .5
But if you conform to axiom 3, then you do not conform to axiom 2: by axiom 3, P(r ∨ ~r) = P(r) + P(~r), so by assignments 1 and 2, P(r ∨ ~r) = .7 + .5 = 1.2. So, contrary to axiom 2, P(r ∨ ~r) ≠ 1, because 1 ≠ 1.2.

A Dutch book
If one violates the probability axioms, then they are vulnerable to having a Dutch book made against them. E.g. suppose you violate A2 or A3 by setting:
1) P(r) = .7
2) P(~r) = .5
If you conform to axiom 2, then you do not conform to axiom 3. But if you conform to axiom 3, then you do not conform to axiom 2. So you cannot conform to the axioms.

A Dutch book
Suppose you violate A2 or A3 by setting:
1) P(r) = .7
2) P(~r) = .5

                          r     ~r
Bet 1 (for assignment 1)  +$3   -$7
Bet 2 (for assignment 2)  -$5   +$5

If r occurs, then they win $3 according to the first bet and lose $5 according to the second, so they lose $2. If r does not occur, then they lose $7 according to the first bet and gain $5 according to the second, so they lose $2. Either way, they lose $2.
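
The guaranteed loss can be verified by enumerating the two possible outcomes. A small sketch (bet sizes as in the table above, everything else hypothetical):

```python
# An agent with P(r) = .7 accepts a bet on r that wins $3 if r and loses $7
# otherwise; with P(~r) = .5 they accept a bet on ~r that wins or loses $5.

def total_payoff(r_occurs: bool) -> int:
    bet1 = 3 if r_occurs else -7   # bet on r, priced by P(r) = .7
    bet2 = -5 if r_occurs else 5   # bet on ~r, priced by P(~r) = .5
    return bet1 + bet2

for outcome in (True, False):
    print(outcome, total_payoff(outcome))  # -2 either way
```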

Dutch book argument
1) If someone violates the probability axioms, then they are vulnerable to having a Dutch book made against them.
2) One should avoid being vulnerable to having a Dutch book made against oneself (because this is a rational defect).
3) Therefore, one should avoid violating the axioms of probability.

An objection to the second component
Conformity to the axioms requires logical omniscience, but no one is logically omniscient. Reply: "You're right, but the component only sets an ideal standard, irrespective of whether anyone can meet it."

Questions? Do you think that one’s credences should conform to the axioms of probability?

Third component of Bayesianism: Credences should be updated via conditionalisation

Terminology
Before examining this component, we need to introduce some terms.
Conditional probability: P(p|q) = the probability of p on the condition that q obtains = the probability of p given q.
RATIO formula as an analysis of conditional probability: P(p|q) = P(p & q) / P(q), where P(q) > 0.

Example of a conditional probability
m = Taylor is a mother
f = Taylor is a female
P(m|f) = the probability that Taylor is a mother given that Taylor is a female
Suppose P(f) = .5 and P(m & f) = .2. So: P(m|f) = P(m & f) / P(f) = .2 / .5 = .4
Note the big difference between P(m|f) and P(f|m): P(m|f) = .4, whereas P(f|m) = 1 (every mother is a female).
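
The RATIO formula can be put directly into code. A minimal sketch (function and variable names hypothetical), using the Taylor numbers:

```python
def conditional(p_p_and_q: float, p_q: float) -> float:
    """P(p|q) = P(p & q) / P(q), defined only when P(q) > 0."""
    if p_q <= 0:
        raise ValueError("P(q) must be positive")
    return p_p_and_q / p_q

p_f = 0.5        # P(f): Taylor is a female
p_m_and_f = 0.2  # P(m & f): Taylor is a mother and a female
print(conditional(p_m_and_f, p_f))  # 0.4
```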

Likelihoods
A likelihood = P(e|h), where e represents some evidence and h a hypothesis. P(e|h) is called the likelihood of h on e.

Prior probabilities
P_i(h) = your prior probability = "your subjective probability for the hypothesis immediately before the evidence comes in" (Strevens, emphasis added)
Terms:
e = A person, such as Taylor, smiles at you
h = A person, such as Taylor, likes you
~h = A person, such as Taylor, does not like you
P_i(h) = prior probability of a person, such as Taylor, liking you
P(h|e) = probability of a person, such as Taylor, liking you given that s/he smiles at you
What is the probability that Taylor likes you given that he or she smiled at you? P(h|e)

What is the prior probability that Taylor likes you? Suppose you surveyed 100 people and found the following:

                    Smiled at you   Did not smile   Total
Likes you                 9               1           10
Does not like you        36              54           90

So P_i(h) = 10/100 = .1

What is the probability that Taylor likes you given the evidence? P(h|e) = ? P(h|e) = 9/(9+36) = 9/45 = 1/5 = 20% = .2

Posterior probabilities
What is the probability that Taylor likes you given the evidence?
P_i(h) = your prior probability = "your subjective probability for the hypothesis immediately before the evidence comes in" (Michael Strevens, emphasis added)
P_f(h) = your posterior probability = "your subjective probability immediately after the evidence (and nothing else) comes in" (emphasis added)
Conditionalisation: Upon acquiring some evidence e (which has a non-zero initial probability), one should adjust their probability for h from their prior probability P_i(h) to a posterior probability P_f(h) which equals P(h|e). This is called conditionalising h on e. Conditionalisation should occur through Bayes's theorem (where applicable).

Conditionalisation via Bayes's theorem
Bayes's theorem: P(h|e) = P(e|h) × P_i(h) / P_i(e), where P_i(e) = P(e|h) × P_i(h) + P(e|~h) × P_i(~h)
Application to the case: .2 = (.9 × .1) / .45, where .45 = .9 × .1 + .4 × .9
Bayes's theorem was expressed in a paper by Rev. Thomas Bayes that was published posthumously.
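
The computation on this slide can be reproduced directly. A sketch (function name hypothetical) using the smile example's numbers:

```python
def bayes(prior: float, like_h: float, like_not_h: float) -> float:
    """Posterior P(h|e), with the law of total probability in the denominator."""
    evidence = like_h * prior + like_not_h * (1 - prior)  # P_i(e)
    return like_h * prior / evidence

# P_i(h) = .1, P(e|h) = .9, P(e|~h) = .4, as on the slide
posterior = bayes(prior=0.1, like_h=0.9, like_not_h=0.4)
print(round(posterior, 2))  # 0.2
```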

Arguments for the conditionalization norm
1) Case-by-case evidence
2) Bayes's theorem is used widely in statistics
3) Dutch-book arguments

Part II: Applications and problems

Does God exist? P_i(h) = ? (where h = theism)
(One version of) the principle of indifference: In the absence of evidence favouring one possibility over another, assign each possibility an equal probability.
The principle of indifference seems intuitively plausible in many cases. E.g. all you know is that a prize is behind one of three doors; presumably the probability that it is behind a given door is 1/3, or approximately .33.
Application to theism: Either h or ~h, so P_i(h) = .5. Sounds reasonable, right? WRONG!

Multiple partitions problem
Suppose you're cooking dinner for Jed, but you don't know whether he eats meat.
One partition of possibilities: either 1) Jed is a meat eater (h) or 2) he is not a meat eater (~h), so P_i(h) = .5
Another partition of possibilities: 1) Jed is a meat eater (h), 2) Jed is a vegetarian (v1), or 3) Jed is a vegan (v2), so P_i(h) = 1/3
The problem is that the space of possibilities can be partitioned differently, so that it is unclear how or whether to apply the principle of indifference.

Application to theism
Either h or ~h, so P_i(h) = .5. But what about another partition? Either:
1) There is no ultimate cause of the universe
2) There is an ultimate cause of the universe, but this cause is not a person (or conscious being)
3) There is a personal and ultimate cause of the universe, but this cause is not omnibenevolent
4) There is a personal, omnibenevolent and ultimate cause of the universe, but this cause is not omnipotent
…
Or theism is true
So already P_i(h) < 1/5 according to the principle of indifference!

The problem of the priors: Subjective and objective Bayesianism
We can partition the logical possibilities differently so as to yield conflicting results when the principle of indifference applies. So which partition do we go with? Some think that there is no uniquely correct partition. So how do we determine P_i(h)?
Subjectivists: Just pick any value you like; no value is incorrect, except for perhaps 1 or 0.
Objectivists: There is a uniquely correct value for P_i(h), and it is…
Let's move on and assume that P_i(h) = .5, just for illustration.

What evidence is there that God exists?
Theistic evidence: fine-tuning of laws and constants; a universe; moral truths; miracle reports; abiogenesis (origins of life); consciousness.
Atheistic evidence: human suffering; animal suffering; non-resistant non-belief in God; the scale of the universe; contradictory theistic theories; theism is less simple (Occam's razor).

The fine-tuning argument
e1 = the laws of the universe are finely tuned to permit meaningful life. According to philosopher Robin Collins, if the strength of the gravitational force were to change by one part in 10^36, then any land-based or aquatic organisms the size of humans would be crushed.
Likelihoods:
P(e1|h) = .5
P(e1|~h) = 1/10^36
Note that I will assume that ~h is equivalent to Western philosophical atheism (rather than also including polytheism, pantheism, etc.)
What is the posterior probability of theism? P(h|e1) ≈ 1

The fine-tuning argument – Just kidding!
e1 = the laws of the universe are finely tuned to permit meaningful life. According to philosopher Robin Collins, if the strength of the gravitational force were to change by one part in 10^36, then any land-based or aquatic organisms the size of humans would be crushed.
Likelihoods:
P(e1|h) = .5
P(e1|~h) = .01
Note that I will assume that ~h is equivalent to Western philosophical atheism (rather than also including polytheism, pantheism, etc.)
What is the posterior probability of theism? P(h|e1) ≈ .98
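
The revised posterior can be checked with the same Bayesian update. A sketch (not from the slides) using the illustrative numbers P_i(h) = .5, P(e1|h) = .5, P(e1|~h) = .01:

```python
def posterior(prior: float, like_h: float, like_not_h: float) -> float:
    """Posterior by Bayes's theorem over the two-cell partition {h, ~h}."""
    return like_h * prior / (like_h * prior + like_not_h * (1 - prior))

print(round(posterior(0.5, 0.5, 0.01), 2))  # 0.98
```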

The multiverse objection
If there were (infinitely) many universes with the values of their laws randomly generated by chance, then we wouldn't be surprised to see that one of them happens to have life-permitting values.
In Bayesian terms: perhaps it is true that a, where a = there is an (infinitely) large number of other universes with values randomly generated by chance, and P(e|~h & a) = 1 (or some relatively high figure).

The argument from suffering
e2 = humans suffer and this is a bad thing (genocide, oppression, missing buses)
Now our prior probability relative to e2 is our posterior probability relative to e1, so P_i(h) ≈ .98
What are the likelihoods? Logical argument from evil (J.L. Mackie):
P(e2|h) = 0
P(e2|~h) = .5
So, P(h|e2) = 0

The argument from suffering
e2 = humans suffer and this is a bad thing (genocide, oppression, missing buses)
Now our prior probability relative to e2 is our posterior probability relative to e1, so P_i(h) ≈ .98
What are the likelihoods? Evidential argument from evil (William Rowe):
P(e2|h) = .01
P(e2|~h) = .5
So, P(h|e2) = .5
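
Conditionalisation chains: the posterior after e1 becomes the prior for e2. A sketch (not from the slides) running both updates with the illustrative likelihoods:

```python
def posterior(prior: float, like_h: float, like_not_h: float) -> float:
    return like_h * prior / (like_h * prior + like_not_h * (1 - prior))

p = 0.5                       # illustrative prior for theism
p = posterior(p, 0.5, 0.01)   # update on e1 (fine-tuning): ≈ .98
p = posterior(p, 0.01, 0.5)   # update on e2 (Rowe's likelihoods): back to .5
print(round(p, 2))  # 0.5
```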

Sceptical theism
"God knows a lot more than us and would have reasons to justify his actions which we do not know of."
"So if God existed, there was suffering, and we did not see any reason that would justify God's permission of the suffering, then we would not be surprised."
More sophisticated defences of versions of sceptical theism are given by Stephen Wykstra and Daniel Howard-Snyder.

The problem of the priors
There is sometimes a lot of debate about the likelihoods, or at least about what the relevant likelihoods are. Suppose we agree that:
P(e|h) = .9
P(e|~h) = .1
If we assume that P_i(h) = .5, then P(h|e) = .9
But if we assume that P_i(h) = .1, then P(h|e) = .5
And if we assume that P_i(h) = .00001, then P(h|e) ≈ .00009
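
The prior-sensitivity figures above can be reproduced in a few lines. A sketch (not from the slides), holding the likelihoods fixed at P(e|h) = .9 and P(e|~h) = .1:

```python
def posterior(prior: float, like_h: float = 0.9, like_not_h: float = 0.1) -> float:
    return like_h * prior / (like_h * prior + like_not_h * (1 - prior))

for prior in (0.5, 0.1, 0.00001):
    print(prior, round(posterior(prior), 5))
# 0.5 0.9
# 0.1 0.5
# 1e-05 9e-05
```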

The problem of the priors
The posterior probability is sensitive to the value of the prior probability. Subjective Bayesians often think that the subjectivity of the prior is not a major problem, since the subjectivity will be "washed out" as evidence accumulates: two people starting off with different priors will converge on the probable truth as they conditionalise on a growing body of evidence. However, as Alan Hájek notes: "Indeed, for any range of evidence, we can find in principle an agent whose prior is so pathological that conditionalizing on that evidence will not get him or her anywhere near the truth, or the rest of us." And there are other worries. So does the problem of the priors render Bayesianism practically useless? Does it eliminate scepticism about the reliability of inductive inference?
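
The "washing out" claim can be illustrated with a toy simulation (entirely my own construction, not from the slides): two agents with very different priors about a biased coin converge once enough flips come in.

```python
import random

# h = "the coin lands heads with probability .8"; ~h = "the coin is fair".
def update(prior: float, heads: bool) -> float:
    like_h, like_not_h = (0.8, 0.5) if heads else (0.2, 0.5)
    return like_h * prior / (like_h * prior + like_not_h * (1 - prior))

random.seed(0)
optimist, sceptic = 0.99, 0.01  # wildly different priors for h
for _ in range(500):
    heads = random.random() < 0.8  # the coin really is biased
    optimist = update(optimist, heads)
    sceptic = update(sceptic, heads)

print(round(optimist, 3), round(sceptic, 3))  # both ≈ 1.0 after 500 flips
```

A prior of exactly 0 or 1, by contrast, never moves under this update, which is one way of putting Hájek's point about pathological priors.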

Questions?

Thank you!