Codes for astrostatistics: StatCodes & VOStat Eric Feigelson Penn State.



Vast range of statistical problems in modern astronomy
Poisson processes: point processes, time series analysis
Image analysis: MLE deconvolution, adaptive smoothing, wavelet analyses
Multivariate analysis & classification (w/ meas errors)
Survival analysis (censoring & truncation w/ meas errors)
Parametric models: model selection, non-linear regression
Non-parametric methods
Confidence limits: bootstrap resampling
Prior knowledge: Bayesian inference (see talk at PhysStat 2003 conference)
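As an illustration of the "Confidence limits: bootstrap resampling" item, here is a minimal sketch of a percentile bootstrap interval for a sample median. It is not taken from the talk: the function name `bootstrap_ci`, its parameters, and the flux values are hypothetical, chosen only to show the idea of resampling the data with replacement and reading limits off the resulting distribution.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.median, n_resamples=2000,
                 level=0.90, seed=0):
    """Percentile bootstrap confidence interval for stat(data).

    Hypothetical helper for illustration; a fixed seed makes it repeatable.
    """
    rng = random.Random(seed)
    n = len(data)
    # Recompute the statistic on n_resamples samples drawn with replacement.
    estimates = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    # Read the interval off the sorted bootstrap estimates.
    tail = (1.0 - level) / 2.0
    lo_idx = int(tail * n_resamples)
    hi_idx = int((1.0 - tail) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Made-up flux measurements (arbitrary units), for demonstration only.
fluxes = [3.1, 2.7, 4.4, 3.9, 2.2, 5.0, 3.3, 4.1, 2.9, 3.6, 4.8, 3.0, 2.5]
lower, upper = bootstrap_ci(fluxes)
print(f"median = {statistics.median(fluxes)}, 90% interval = [{lower}, {upper}]")
```

The percentile method is only one of several bootstrap interval constructions; it is used here because it is the shortest to write down.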

The problem
Astronomers are insufficiently trained in modern applied statistics ... but even if they knew what to do, they have inadequate access to computer codes.

Astronomers never use large commercial statistical packages like SAS, SPSS, Statistica.
Some astronomers sometimes use UNIX-based command-line systems like MatLab or S-Plus.
Astronomers like mini-codes in Numerical Recipes & often write their own codes. Many like IDL, which has simple statistics.
NASA/NSF observatories produce huge data analysis codes (IRAF, AIPS, CIAO, …) which by policy avoid proprietary codes.
A few specialized stand-alone astrostat codes were written under NASA funding: ROSTAT, ASURV, SLOPES, StatPy.
Altogether this is a very bad situation: vast statistical needs with very inadequate codes.

The rise of the Virtual Observatory
Vast collections of calibrated data (images, spectra, time series), extracted catalogs (rows = sources, columns = properties), and source bibliographies emerged during the 1990s.
NASA Science Archive Centers (MAST, HEASARC, IRSA, LAMBDA), bibliographic databases (ADS, SIMBAD, NED), & more are being transformed into a federated (though still distributed & heterogeneous) system.
XML metadata (VOTable), SOAP protocols, … for data mining & extraction.
But originally there was no plan for visualization & statistical analysis of extracted datasets.

StatCodes: A partial solution
In the late 1990s, the Penn State group created a Web metasite with annotated links to ~200 open source packages & codes of utility to astronomers.
Quite successful: hits/day for 7 years. Multivariate & time series methods most popular.
But the collection of on-line codes was very inhomogeneous and incomplete.

R
Finally a broad public-domain statistical software system emerges.
Based on the successful commercial UNIX-based S/S-Plus, R has an interactive command-line feel (like IDL), flexible data I/O, acceptable graphics, integration with C/Fortran/Python/…, and quite a lot of sophisticated statistical methods.
Core R: 2000-page manual with ~200 functionalities, some very complex & advanced.
CRAN: 300 add-on packages, dozens useful to astronomers. Some are themselves full systems.

VOStat: A Web service
1. Web form interface providing simple statistical R functions with VOTable inputs.
2. Same R functions provided through a more sophisticated Java-based grid-computing mode.
[Diagram labels: User; Dispersed VO data bases; VOStat server; Heavy statistical computation; Requests; Answers; Heavy data]
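To make "simple statistical functions with VOTable inputs" concrete, the sketch below reads one numeric column from a small VOTable document and returns summary statistics. It is an assumption-laden illustration, not VOStat's implementation: VOStat runs R functions on its server, whereas this uses only the Python standard library, and the names `read_column`, `summarize`, and `SAMPLE_VOTABLE`, along with the table contents, are invented for the example. Only the VOTABLE/RESOURCE/TABLE/FIELD/DATA/TABLEDATA/TR/TD layout comes from the VOTable format itself.

```python
import statistics
import xml.etree.ElementTree as ET

# Tiny made-up VOTable: two fields, four rows, TABLEDATA serialization.
SAMPLE_VOTABLE = """<VOTABLE version="1.3" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <RESOURCE>
    <TABLE name="sources">
      <FIELD name="id" datatype="char" arraysize="*"/>
      <FIELD name="mag" datatype="double"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>a</TD><TD>14.2</TD></TR>
          <TR><TD>b</TD><TD>15.1</TD></TR>
          <TR><TD>c</TD><TD>13.8</TD></TR>
          <TR><TD>d</TD><TD>16.0</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def local(tag):
    """Strip an XML namespace prefix such as '{http://...}TR' down to 'TR'."""
    return tag.rsplit("}", 1)[-1]


def read_column(votable_xml, column):
    """Return the named column as floats (hypothetical helper)."""
    root = ET.fromstring(votable_xml)
    # FIELD order defines the TD order within each row.
    fields = [el.get("name") for el in root.iter() if local(el.tag) == "FIELD"]
    idx = fields.index(column)
    values = []
    for row in root.iter():
        if local(row.tag) != "TR":
            continue
        cells = [cell.text for cell in row if local(cell.tag) == "TD"]
        values.append(float(cells[idx]))
    return values


def summarize(values):
    """Basic summary statistics, standing in for a server-side routine."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


mags = read_column(SAMPLE_VOTABLE, "mag")
print(summarize(mags))
```

The sketch assumes a single table serialized as TABLEDATA with no missing cells; binary serializations and multi-table documents would need a full VOTable library.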

VOStat may be a big improvement but …
Generic Web-based services are inherently inflexible & limited. VOStat may serve to entice the astronomer to download R & perform the real analysis at home.
Astronomers need training in advanced methods before using them with R. Penn State has just created a Center for Astrostatistics to develop curriculum, conduct tutorials, provide template R code, etc.
R/CRAN does not serve huge VO datasets or some special astrostat needs. New methodological/code development underway (CMU, Cornell, PSU, UCIrv, …).