Inference in action: the force law in the Solar System

1 Inference in action: the force law in the Solar System
Jo Bovy, Center for Cosmology and Particle Physics, New York University
IMPRS Summer School: Statistical Inferences from Astrophysical Data, 2009/08/14

2 Introduction
>>"Bayesians" vs. "Frequentists": plenty of theory and idealized examples, but no "real" example this week
>>This talk:
 - An actual, astrophysically (somewhat) relevant application
 - Frequentists vs. Bayesians in practice: strengths and weaknesses of both
 - Bayes is a framework for inference, not the nec plus ultra: inference is not just blindly applying Bayes's theorem; exploration and forward modeling often require careful thought

3 The general problem
>>Infer the dynamics of the Milky Way (the gravitational potential Φ) given a set of positions x_i and velocities v_i (think Gaia)
>>Why is this hard?
 - Often large uncertainties (e.g., distances)
 - Often missing data (e.g., line-of-sight velocities in astrometric missions)
 - Large numbers of stars (~10^9 for Gaia): a computational problem
>>No real problem if we had a forward model: e.g., convolving the model with the data uncertainties deals with large uncertainties and missing data

7 The forward/generative model
>>Do we have a forward model?
>>Newton's second law (per unit mass): x'' = F(x) = -∇Φ(x) --> Φ(x) can be used to forward model x'' = a(x)
>>But we do not have a; we have x and v, which are valid initial conditions for any Φ(x)
>>Bayes's theorem: p(Φ | {x,v}) = p({x,v} | Φ) p(Φ) / p({x,v})
---> Positions and velocities tell you nothing!?

8 The distribution function
>>The way out: the distribution function f(x,v):
 f(x,v) dx dv = number of stars in the volume dx dv of 6D phase space ∝ probability to find a star in dx dv
>>p(x,v | Φ) ∝ f(x,v)
>>By making assumptions about the distribution function (DF) we can learn something about the potential

12 Possible assumptions
>>What kind of assumptions does one make?
>>All tracers are bound: Φ cannot be too weak (~ lower bound on the strength of the potential)
 - E.g., MW satellites are bound to the MW -> lower bound on the MW mass
>>Tracers are in a steady state (more on this later):
 - The basic dynamical assumption used to infer (parts of) the potential of the MW and external galaxies
>>Other informative features in the DF:
 - E.g., halo streams trace out orbits -> a (e.g., Koposov, Rix, & Hogg 2009)

14 Toy problem: the Solar System
JB, Hogg, & Murray (2009)
>>We asked the question: Can we infer the force law in the Solar System from a snapshot of its kinematics?
>>Given: x, v of the 8 planets on 2009-Apr-01
>>Find: a(r) = -A (r / 1 AU)^(-α)
>>This focuses on the forward-modeling issue, not on the large-uncertainties, missing-data, and computational-complexity problems

15 The data
>>We did not measure the positions and velocities, but took them from a Solar System integration service (JPL ephemeris)
>>Accurate to a fractional uncertainty of ~10^-6 to 10^-8

18 Where to start? The virial theorem
>>The virial theorem for general power-law potentials says
>>For the 8 planets:
>>Lines cross near a single point --> System is virialized --> System is in a steady state
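The equation shown on this slide is not preserved in the transcript. As a hedged reconstruction (standard scalar virial theorem per unit mass, not necessarily the slide's notation), for the force law a(r) = -A (r / 1 AU)^(-α) used in this talk, with r_0 = 1 AU assumed as the reference radius, the relation and its sample estimate over the 8 planets would read:

```latex
% Scalar virial theorem for a(r) = -A (r / r_0)^{-\alpha}, with r_0 = 1 AU
\begin{align}
2\langle K \rangle &= \langle \mathbf{x}\cdot\nabla\Phi \rangle
  \quad\Longrightarrow\quad
\langle v^2 \rangle = A \left\langle r \left(\frac{r}{r_0}\right)^{-\alpha} \right\rangle , \\
\sum_{i=1}^{8} v_i^2 &\approx A \sum_{i=1}^{8} r_i \left(\frac{r_i}{r_0}\right)^{-\alpha} .
\end{align}
```

For α = 2 this reduces to the familiar Keplerian statement that twice the kinetic energy balances the potential energy.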

21 The virial estimate?
>>Is there anything we can do with the virial relation?
>>Only one equation for two parameters (here): divide the planets into subsamples (inner/outer planets?) and bootstrap the result somehow to get uncertainty estimates on A, α?
>>Obviously there is some constraint coming from the virial relation, but this is hard to quantify
>>The virial relation is good for exploration: from now on we will assume that the planets are in a steady state
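A minimal sketch of why the virial relation alone cannot fix both parameters. It assumes the scalar virial theorem in the form Σ v_i² = A Σ r_i (r_i / 1 AU)^(-α) and uses a made-up system of circular orbits in place of the JPL data; the function name `virial_amplitude` and the unit choice are illustrative assumptions, not part of the talk.

```python
import math

def virial_amplitude(radii, speeds, alpha, r0=1.0):
    """Return the A that satisfies sum(v^2) = A * sum(r * (r/r0)**(-alpha))."""
    kinetic = sum(v * v for v in speeds)
    virial = sum(r * (r / r0) ** (-alpha) for r in radii)
    return kinetic / virial

# Hypothetical test system (NOT the ephemeris data): circular orbits in a true
# inverse-square law with A_TRUE = 1, in units where v_circ(1 AU) = 1.
A_TRUE, ALPHA_TRUE = 1.0, 2.0
radii = [0.39, 0.72, 1.0, 1.52, 5.2, 9.5, 19.2, 30.1]  # rough semi-major axes, AU
speeds = [math.sqrt(A_TRUE * r * r ** (-ALPHA_TRUE)) for r in radii]

# Every trial alpha returns some A: one equation, two unknowns.
curve = {alpha: virial_amplitude(radii, speeds, alpha) for alpha in (1.5, 2.0, 2.5)}
for alpha, amplitude in curve.items():
    print(f"alpha = {alpha:.1f} -> A = {amplitude:.3f}")
```

The relation traces a curve A(α) rather than a point, which is why the slide suggests subsamples or some other extra ingredient to break the degeneracy.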

23 The steady-state assumption
>>Any collisionless system satisfies the collisionless Boltzmann equation:
>>In a steady state the first term is zero:
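The two equations on this slide are not preserved in the transcript. In its standard textbook form (a reconstruction; the slide's notation may differ), the collisionless Boltzmann equation and its steady-state limit are:

```latex
\begin{align}
\frac{\partial f}{\partial t}
  + \mathbf{v}\cdot\nabla_{\mathbf{x}} f
  - \nabla\Phi\cdot\nabla_{\mathbf{v}} f &= 0 , \\
\text{steady state } (\partial f/\partial t = 0):\quad
\mathbf{v}\cdot\nabla_{\mathbf{x}} f
  - \nabla\Phi\cdot\nabla_{\mathbf{v}} f &= 0 .
\end{align}
```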

25 Jeans's theorem
>>Jeans's theorem states that in a steady state the DF is a function of the integrals of the motion alone: f ≡ f(I)
>>This way the gravitational potential can sneak in, e.g., E = K + Φ
>>Very useful, since we have assumed spherical symmetry:
 - 4 integrals of the motion: E, L (5 for α = 2)
 - f ≡ f(E, L, [e])
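As an illustration of how the potential enters through the integrals of the motion, here is a small sketch that evaluates the specific energy E = K + Φ and the angular-momentum magnitude L for one phase-space point. It assumes the power-law force a(r) = -A (r / r0)^(-α) from the toy problem; the function names and the unit system (r0 = 1) are hypothetical choices for this example.

```python
import math

def potential(r, A, alpha, r0=1.0):
    """Potential whose radial gradient gives a(r) = -A (r/r0)**(-alpha)."""
    if math.isclose(alpha, 1.0):
        return A * r0 * math.log(r / r0)  # logarithmic case
    return -A * r0 / (alpha - 1.0) * (r / r0) ** (1.0 - alpha)

def integrals_of_motion(x, v, A, alpha, r0=1.0):
    """Return (E, L): specific energy and specific angular-momentum magnitude."""
    r = math.sqrt(sum(c * c for c in x))
    kinetic = 0.5 * sum(c * c for c in v)
    lx = x[1] * v[2] - x[2] * v[1]
    ly = x[2] * v[0] - x[0] * v[2]
    lz = x[0] * v[1] - x[1] * v[0]
    ang_mom = math.sqrt(lx * lx + ly * ly + lz * lz)
    return kinetic + potential(r, A, alpha, r0), ang_mom

# The same (x, v) yields a different E for each trial force law, while L does
# not depend on the potential at all.
x, v = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
E_true, L_true = integrals_of_motion(x, v, A=1.0, alpha=2.0)
E_wrong, L_wrong = integrals_of_motion(x, v, A=2.0, alpha=2.0)
print(E_true, E_wrong, L_true)
```

Because E changes with the trial (A, α) while the data stay fixed, a DF written as f(E, L) makes the likelihood of the observed snapshot depend on the force law.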

27 What frequentists do with this
>>(Some) frequentists imagine a world of many realizations of the Solar System, test whether the observed Solar System has the expected properties, and rule out (with a certain confidence) models for which it does not
>>Expected properties:
 - the distribution of the radial angle θ (orbital phase) is uniform
 - no correlation between E and θ
 - ...
>>Orbital roulette: test for a given Φ whether these properties are satisfied and rule out models on this basis (Beloborodov & Levin 2004)

29 Angles

31 Why orbital roulette works
>>Uniform θ distribution: for potentials that are too strong, all planets appear to be at aphelion; for potentials that are too weak, all planets appear to be at perihelion
>>Absence of (E,θ) correlations: e.g., if a(r) is too strong and too steep, the inner planets are at aphelion and strongly bound (large E) while the outer planets are at perihelion and weakly bound (small E)
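The aphelion/perihelion argument can be checked with a short sketch for the Keplerian case (α = 2). It takes a body on a circular orbit under a "true" force strength and recomputes its apsides under a wrong trial strength; `kepler_apsides` and the parameter `mu` (standing for A·(1 AU)²) are names assumed for this example only.

```python
import math

def kepler_apsides(r, v_tan, mu):
    """Pericenter and apocenter of a bound Kepler orbit (alpha = 2) for a body
    at radius r moving purely tangentially with speed v_tan, given an assumed
    force-law strength mu."""
    energy = 0.5 * v_tan ** 2 - mu / r
    if energy >= 0.0:
        raise ValueError("orbit is unbound for this trial potential")
    ang_mom = r * v_tan
    a = -mu / (2.0 * energy)                      # semi-major axis
    e = math.sqrt(max(0.0, 1.0 + 2.0 * energy * ang_mom ** 2 / mu ** 2))
    return a * (1.0 - e), a * (1.0 + e)

# A circular orbit at r = 1 under the true strength mu = 1 has v = 1.
r_obs, v_obs = 1.0, 1.0

peri_strong, apo_strong = kepler_apsides(r_obs, v_obs, mu=1.5)  # trial too strong
peri_weak, apo_weak = kepler_apsides(r_obs, v_obs, mu=0.8)      # trial too weak

print("too strong: observed r equals apocenter ->", apo_strong)
print("too weak:   observed r equals pericenter ->", peri_weak)
```

Under the too-strong trial the body is moving too slowly for its radius and so sits at apocenter; under the too-weak trial it is moving too fast and sits at pericenter, which is the signal the roulette tests look for.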

32 Different frequentist tests
>>Frequentist procedure:
 - Choose a statistic to test whether the θ distribution is flat (e.g., the mean angle, KS, Kuiper)
 - Choose a statistic to test whether E and θ are correlated (e.g., a rank test)
 - Combine these somehow, e.g., with a Bonferroni correction (conservative)
 - Rule out models with a certain confidence
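The procedure above can be sketched as follows, using only the flatness test. This is an illustrative stand-in, not the tests used in the talk: it computes a one-sample KS statistic against the uniform distribution, calibrates it by Monte Carlo, and combines p-values with a Bonferroni correction. All function names and the two example angle sets are assumptions made for the sketch.

```python
import random

def ks_uniform_statistic(u):
    """KS distance between a sample u (values in [0, 1)) and the uniform CDF."""
    s = sorted(u)
    n = len(s)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(s))

def monte_carlo_p_value(d_obs, n, trials=5000, seed=0):
    """Fraction of uniform samples of size n with a KS distance >= d_obs."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        if ks_uniform_statistic(sample) >= d_obs:
            hits += 1
    return hits / trials

def bonferroni(p_values):
    """Conservative combined p-value for several tests of the same model."""
    return min(1.0, len(p_values) * min(p_values))

# Orbital phases theta / (2 pi) for 8 hypothetical planets under two trial models.
phases_good = [(i + 0.5) / 8 for i in range(8)]   # evenly spread phases
phases_bad = [0.01 * i for i in range(8)]         # all bunched near one apsis

d_good = ks_uniform_statistic(phases_good)
d_bad = ks_uniform_statistic(phases_bad)
p_good = monte_carlo_p_value(d_good, n=8)
p_bad = monte_carlo_p_value(d_bad, n=8)
print(d_good, p_good, d_bad, p_bad)
```

A model whose implied phases bunch together gets a tiny p-value and is ruled out; note that the output is a verdict per model, not a parameter estimate with an uncertainty.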

36 So what's wrong with orbital roulette?
>>Good:
 - Conceptually simple, intuitive
 - Better than virial
 - Given good data (small uncertainties, no missing data) it does a good job of setting bounds on the potential (e.g., it was used successfully to constrain the black hole mass at the Galactic Center, Beloborodov et al. 2006)
>>Bad:
 - Test what? Which test? How to combine different tests?
 - No way to deal with missing data or large uncertainties
 - It only rules out models; it does not give parameter estimates with uncertainties
 - It uses a very crude model of the data

37 ''Bayesian'' orbital roulette
>>Starting point is the same: f ≡ f(I)
>>Assuming spherical symmetry: f ≡ f(E,e)
>>Bayes's theorem: p(A,α | {x,v}) ∝ p(A,α) Π_i J(E,e; θ; x,v) f(E_i,e_i | A,α)

39 But what is the distribution function?
>>Not a big problem: we can infer it as well and marginalize over it
>>Given only 8 data points we can only hope to infer some crude features of the distribution function, roughly the bounds
>>Therefore, we model the distribution function f(ln E, e) as the product of a tophat function in ln E and a tophat in e: our forward/generative model
>>This adds four parameters to the model (the two edges of each tophat)
>>p(A,α | {x,v}) --> p(A,α, ln E_min, ln E_max, e_min, e_max | {x,v})
>>We can marginalize these DF parameters out, even analytically
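To see why the tophat edges can be integrated out in closed form, consider one tophat in a generic variable y (for instance ln E) with edges y_min and y_max. The worked step below assumes flat, unbounded priors on both edges, which is an assumption made here for illustration and is not stated on the slide; for e, whose edges are confined to [0, 1], the limits would have to be finite and the result would differ.

```latex
% y_i: value of the variable for planet i under a trial (A, \alpha); N = 8 planets
\begin{align}
p(\{y_i\} \mid y_{\min}, y_{\max}) &= \prod_{i=1}^{N} \frac{1}{y_{\max}-y_{\min}}
  \quad \text{if } y_{\min} \le y_i \le y_{\max} \text{ for all } i,
  \text{ and } 0 \text{ otherwise}, \\
\int_{-\infty}^{\min_i y_i} \! dy_{\min} \int_{\max_i y_i}^{\infty} \! dy_{\max}\,
  (y_{\max}-y_{\min})^{-N}
  &= \frac{\left(\max_i y_i - \min_i y_i\right)^{-(N-2)}}{(N-1)(N-2)} .
\end{align}
```

Under these assumptions the marginalized tophat factor depends on the data only through the range spanned by the y_i; the Jacobian terms in the full posterior carry additional dependence on (A, α).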

41 Jacobians α=2:

43 Results Marginalized over the DF + marginalized over A

44 Discussion
>>Given few data points we get reasonably accurate bounds on α (best possible bounds? e ~ 0.01)
>>But we started from accurate data and get 10^-2 results ---> inferring the DF has cost us a lot of information
>>This method can be generalized to missing-data problems

46 Frequentist vs. Bayesian method
Frequentist result: ~10 percent measurement
Bayesian result: ~2 percent measurement

51 Why does the Bayesian method perform better?
>>The virial relation holds up to about e^2
>>For the frequentist method, the median or mean eccentricity is what matters --> median e => ~10 percent measurement possible
>>But we can do better: the (2) planets with the lowest eccentricities have the most constraining power --> smallest e's => ~2 percent measurement possible
>>The Bayesian method comes close to this bound

53 Conclusions
>>The science is in the forward modeling and in making necessary and motivated assumptions; probability theory provides a framework for inference
>>Both Bayesian and frequentist methods have interesting things to say in this case
>>Bayes outperforms the frequentist methods and is the appropriate framework to use here (Galactic dynamics is a "mature" field)
>>And the force law in the Solar System is ∝ r^-2

