Overview of astrostatistics Eric Feigelson (Astro & Astrophys) & Jogesh Babu (Stat) Penn State University.

What is astronomy?
Astronomy (astron = star, nomos = law in Greek) is the observational study of matter beyond Earth: planets in the Solar System, stars in the Milky Way Galaxy, galaxies in the Universe, and diffuse matter between these concentrations. The perspective is rooted in our viewpoint on or near Earth, using telescopes or robotic probes.
Astrophysics (astron = star, physis = nature) is the study of the intrinsic nature of astronomical bodies and the processes by which they interact and evolve. This is an indirect, inferential intellectual effort based on the assumption that gravity, electromagnetism, quantum mechanics, plasma physics, chemistry, and so forth apply universally to distant cosmic phenomena.

Overview of modern astronomy & astrophysics
[Timeline figure; labels: Big Bang; Cosmic Microwave Background; Inflation; H → He; First stars, galaxies and black holes; Gravity; Continuing star & planet formation in galaxies; Earth science; Biosphere; Today; Eternal expansion]

Lifecycle of the stars
[Cycle diagram; labels: Interstellar gas & dust; Star & planet formation; Main sequence stars; Red giant phase; Winds & supernova explosions; Habitability & life; He → CNO → Fe; Fe → U; Compact stars (white dwarfs, neutron stars, black holes); H → He]

What is astrostatistics?
What is astronomy? The properties of planets, stars, galaxies and the Universe, and the processes that govern them.
What is statistics?
– “The first task of a statistician is cross-examination of data” (R. A. Fisher)
– “[Statistics is] the study of algorithms for data analysis” (R. Beran)
– “A statistical inference carries us from observations to conclusions about the populations sampled” (D. R. Cox)
– “Some statistical models are helpful in a given context, and some are not” (T. Speed, addressing astronomers)
– “There is no need for these hypotheses to be true, or even to be at all like the truth; rather … they should yield calculations which agree with observations” (Osiander’s Preface to Copernicus’ De Revolutionibus, quoted by C. R. Rao)

“The goal of science is to unlock nature’s secrets. … Our understanding comes through the development of theoretical models which are capable of explaining the existing observations as well as making testable predictions. … Fortunately, a variety of sophisticated mathematical and computational approaches have been developed to help us through this interface, these go under the general heading of statistical inference.” (P. C. Gregory, Bayesian Logical Data Analysis for the Physical Sciences, 2005)
My conclusion: The application of statistics to high-energy astronomical data is not a straightforward, mechanical enterprise. It requires careful statement of the problem, model formulation, choice of statistical method(s), and judicious evaluation of the result.

Astronomy & statistics: A glorious history
Hipparchus (2nd c. BC): Average via midrange of observations
Galileo (1572): Average via mean of observations
Halley (1693): Foundations of actuarial science
Legendre (1805): Cometary orbits via least squares regression
Gauss (1809): Normal distribution of errors in planetary orbits
Quetelet (1835): Statistics applied to human affairs
But the fields diverged in the late 19th and 20th centuries:
– astronomy → astrophysics (EM, QM)
– statistics → social sciences & industries

Do we need statistics in astronomy today?
Are these stars/galaxies/sources an unbiased sample of the vast underlying population?
– Sampling
When should these objects be divided into 2/3/… classes?
– Multivariate classification
What is the intrinsic relationship between two properties of a class (especially with confounding variables)?
– Multivariate regression
Can we answer such questions in the presence of observations with measurement errors & flux limits?
– Censoring, truncation & measurement errors
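The flux-limit question above can be made concrete with a small simulation. The following is a minimal sketch, not taken from the slides: the luminosity distribution, survey radius, flux limit and the function name simulate_flux_limited_sample are all assumptions chosen for illustration. It draws a population with a known luminosity distribution, keeps only the objects brighter than a flux limit, and compares the mean luminosity of the detected sample with that of the full population.

```python
import math
import random
import statistics


def simulate_flux_limited_sample(n=20000, flux_limit=1.0e-3, seed=42):
    """Return (log L of all objects, log L of objects above the flux limit).

    Assumptions (illustrative only): log10 luminosity is Gaussian with
    mean 0 and sigma 0.5 in arbitrary units, and objects are spread with
    uniform space density inside a sphere of radius 10.
    """
    rng = random.Random(seed)
    all_logl, detected_logl = [], []
    for _ in range(n):
        log_l = rng.gauss(0.0, 0.5)
        # Uniform density in a sphere: P(< d) is proportional to d^3.
        # 1 - random() lies in (0, 1], so the distance is never zero.
        d = 10.0 * (1.0 - rng.random()) ** (1.0 / 3.0)
        flux = 10.0 ** log_l / (4.0 * math.pi * d * d)
        all_logl.append(log_l)
        if flux >= flux_limit:  # truncation: fainter objects never enter the catalog
            detected_logl.append(log_l)
    return all_logl, detected_logl


all_logl, detected_logl = simulate_flux_limited_sample()
print(f"detected {len(detected_logl)} of {len(all_logl)} objects")
print(f"mean log L, full population: {statistics.fmean(all_logl):.3f}")
print(f"mean log L, detected sample: {statistics.fmean(detected_logl):.3f}")
```

Under these assumptions the detected sample is systematically more luminous than the parent population, which is why a flux-limited catalog cannot be treated as an unbiased sample without a truncation-aware method.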

When is a blip in a spectrum, image or datastream a real signal?
– Statistical inference
How do we model the vast range of variable objects (extrasolar planets, BH accretion, GRBs, …)?
– Time series analysis
How do we model the 2-6-dimensional points representing galaxies in the Universe or photons in a detector?
– Spatial point processes & image processing
How do we model continuous structures (CMB fluctuations, interstellar/intergalactic media)?
– Density estimation, regression
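For photon-counting data, the first question above (is a blip real?) is often phrased as a Poisson tail probability. The sketch below is an illustration under stated assumptions, not a method given in the slides: the expected background is taken as known exactly, counts are Poisson, and only a single location is tested (no correction for searching many pixels or channels). The function name poisson_tail and the numbers used are hypothetical.

```python
import math


def poisson_tail(n_obs, background):
    """P(N >= n_obs) for N ~ Poisson(background).

    Computed as 1 - P(N <= n_obs - 1); for extremely small tail
    probabilities this subtraction loses precision.
    """
    if n_obs <= 0:
        return 1.0
    term = math.exp(-background)  # P(N = 0)
    cdf = term
    for i in range(1, n_obs):
        term *= background / i    # P(N = i) from P(N = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


# Hypothetical example: 10 counts observed where 3.0 are expected from background.
p_value = poisson_tail(10, 3.0)
print(f"chance probability of >= 10 counts from background alone: {p_value:.2e}")
```

A small tail probability says only that background alone rarely produces such a blip at this one location; deciding that the signal is real still requires accounting for how many locations were examined and for uncertainty in the background.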

How often do astronomers need statistics? (a bibliometric measure)
Of ~15,000 refereed papers annually:
– 1% have ‘statistics’ in title or keywords
– 5% have ‘statistics’ in abstract
– 10% treat variable objects
– 5-10% (est) analyze data tables
– 5-10% (est) fit parametric models

The state of astrostatistics today
The typical astronomical study uses:
– Fourier transform for temporal analysis (Fourier 1807)
– Least squares regression (Legendre 1805, Pearson 1901)
– Kolmogorov-Smirnov goodness-of-fit test (Kolmogorov 1933)
– Principal components analysis for tables (Hotelling 1936)
Even traditional methods are often misused:
– Six unweighted bivariate least squares fits are used interchangeably in H₀ studies with wrong confidence intervals (Feigelson & Babu, ApJ 1992)
– Likelihood ratio test (F test) usage typically inconsistent with asymptotic statistical theory (Protassov et al., ApJ 2002)
– K-S g.o.f. probabilities are inapplicable when the model is derived from the data (Babu & Feigelson, ADASS 2006)
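The last point above can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the procedure of the cited paper: it fits a Gaussian to simulated data, computes the K-S statistic against that fitted Gaussian, and compares the standard asymptotic Kolmogorov probability (which assumes the model was fixed in advance) with a parametric bootstrap that refits the model to every simulated dataset. All function names and numerical settings are hypothetical.

```python
import math
import random
import statistics


def ks_statistic(sample, cdf):
    """Two-sided K-S distance between the empirical CDF and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def normal_cdf(mu, sigma):
    return lambda x: 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fitted_normal_ks(sample):
    """K-S distance to a Gaussian whose parameters come from the same sample."""
    mu = statistics.fmean(sample)
    sigma = statistics.stdev(sample)
    return ks_statistic(sample, normal_cdf(mu, sigma)), mu, sigma


def kolmogorov_pvalue(d, n, terms=100):
    """Asymptotic K-S probability, valid only for a fully specified model."""
    x = math.sqrt(n) * d
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def bootstrap_pvalue(sample, n_boot=500, seed=1):
    """Parametric bootstrap: simulate from the fit, refit, recompute the distance."""
    rng = random.Random(seed)
    d_obs, mu, sigma = fitted_normal_ks(sample)
    n = len(sample)
    exceed = 0
    for _ in range(n_boot):
        fake = [rng.gauss(mu, sigma) for _ in range(n)]
        d_star = fitted_normal_ks(fake)[0]
        if d_star >= d_obs:
            exceed += 1
    return exceed / n_boot


data_rng = random.Random(0)
data = [data_rng.gauss(10.0, 2.0) for _ in range(100)]  # simulated measurements
d_obs = fitted_normal_ks(data)[0]
p_naive = kolmogorov_pvalue(d_obs, len(data))
p_boot = bootstrap_pvalue(data)
print(f"D = {d_obs:.4f}, tabulated p = {p_naive:.3f}, bootstrap p = {p_boot:.3f}")
```

Because the fitted model is pulled toward the data, the distance D tends to be smaller than the tabulated distribution expects, so the tabulated probability is generally too large; the bootstrap recalibrates it by repeating the fit on each simulated sample.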

A new imperative: Virtual Observatory
Huge, uniform, multivariate databases are emerging from specialized survey projects & telescopes:
– object catalogs from USNO, 2MASS & SDSS opt/IR surveys
– galaxy redshift catalogs from 2dF & SDSS
– radio/infrared/X-ray source catalogs
– samples of well-characterized stars & galaxies with dozens of measured properties
Many on-line collections of images & spectra
The planned Large-aperture Synoptic Survey Telescope will generate ~10 Pby
The Virtual Observatory is an international effort underway to federate these distributed on-line astronomical databases.
Powerful statistical tools are needed to derive scientific insights from extracted VO datasets (NSF FRG involving PSU/CMU/Caltech)

But astrostatistics is an emerging discipline
We organize cross-disciplinary conferences at Penn State: Statistical Challenges in Modern Astronomy (1991, 1996, 2001, 2006)
Fionn Murtagh & Jean-Luc Starck run methodological meetings & write monographs
We organize Summer Schools at Penn State and astrostatistics workshops at SAMSI
Powerful astro-stat collaborations appearing in the 1990s:
– Penn State CASt (Jogesh Babu, Eric Feigelson)
– Harvard/Smithsonian (David van Dyk, Chandra scientists, students)
– CMU/Pitt = PICA (Larry Wasserman, Chris Genovese, … )
– NASA-ARC/Stanford (Jeffrey Scargle, David Donoho)
– Efron/Petrosian, Berger/Jeffreys/Loredo/Connors, Stark/GONG, …

Some methodological challenges for astrostatistics in the 2000s
Simultaneous treatment of measurement errors and censoring (esp. multivariate)
Statistical inference and visualization with very-large-N datasets too large for computer memories
A user-friendly cookbook for construction of likelihoods & Bayesian computation for astronomical problems
Links between astrophysical theory and wavelet coefficients (spatial & temporal)
Rich families of time series models to treat accretion and explosive phenomena

Structural challenges for astrostatistics
Cross-training of astronomers & statisticians
New curriculum, summer workshops
Effective statistical consulting
Enthusiasm for astro-stat collaborative research
Recognition within communities & agencies
More funding (astrostat gets <0.1% of astro+stat)
Implementation software
– StatCodes Web metasite
– Standardized in R, MATLAB or VOStat?
Inreach & outreach
A Center for Astrostatistics to help attain these goals