NSF DMS-0101360 VOStat - HEAD 2004 Ashish Mahabal VOStat Arming Astronomers with Advanced Statistics Caltech: A. Mahabal, M. Graham,

Presentation transcript:

NSF DMS VOStat - HEAD 2004 Ashish Mahabal
VOStat: Arming Astronomers with Advanced Statistics
Caltech: A. Mahabal, M. Graham, S. G. Djorgovski, R. Williams
Penn State: J. Babu (PI), E. Feigelson
CMU: R. Nichol, D. Van DenBerk, L. Wasserman

Use of statistics: of the astronomical studies per year, 5% have “statistics” in their abstract and 20% treat variable objects or multivariate datasets.

Traditional methods: Fourier transform (Fourier 1807); least squares and chi-square (Legendre 1805, Pearson 1901); Kolmogorov-Smirnov test (Kolmogorov 1933); Principal Component Analysis (Hotelling 1936).

VOStat: web-based service; simple and sophisticated statistical routines; large datasets; public domain (R) / specially written routines; general purpose and Virtual Observatory.

VOStat: ASCII / VOTable as input (can be used as an intermediate block in a VO-based pipeline); CGI routines as prototypes (a few thousand lines); web services (Java GUI) - hundreds of thousands of lines (limited by R’s capabilities) - distributed, multi-OS, multi-language.

Examples of available functions: descriptive statistics (e.g. boxplot); two- and k-sample tests (e.g. Wilcoxon rank-sum test); density estimation (e.g. kernel smoothing); correlation and regression (e.g. PCA); censored data (e.g. Survival); multivariate classification (e.g. hierarchical clustering); external functions (e.g. K-density).
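As an illustration of the two-sample tests listed above, here is a minimal sketch of the Wilcoxon rank-sum test in plain Python. It is not VOStat's code (VOStat relies on R for this); the function names and the sample values are hypothetical, and the p-value uses the large-sample normal approximation without a tie correction, so it is only rough for small samples.

```python
import math

def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Rank sum W of sample x, its z-score, and a two-sided p-value
    (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p

# Hypothetical magnitudes for two samples, for illustration only.
sample_a = [14.2, 15.1, 15.8, 16.0, 16.4]
sample_b = [16.9, 17.3, 17.5, 18.1, 18.6]
w, z, p = rank_sum_test(sample_a, sample_b)
```

A small p suggests the two samples are shifted relative to each other; no assumption of normality is made about the data themselves.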

User-friendly GUI: columns are autoselected (and can be deselected); parameter choices for functions are conveniently placed; can be used from your own webpages on tables residing elsewhere.

Toy demos: rediscovering the HR diagram; rediscovering the fundamental plane (FP) of globular clusters; looking for outliers in color-color space.

Rediscovering the HR diagram: Hyades stars (Hipparcos main catalog); mean/median/boxplot; density estimation (histogram); kernel smoothing; correlation matrix; X-Y plot; multivariate clustering.
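The kernel smoothing step can be sketched as a one-dimensional Gaussian kernel density estimate. This is an illustrative sketch, not VOStat's implementation: the function names, the normal-reference bandwidth rule, and the B-V values are assumptions made for the example.

```python
import math
import statistics

def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth: 1.06 * sd * n^(-1/5)."""
    return 1.06 * statistics.stdev(sample) * len(sample) ** (-1 / 5)

def gaussian_kde(sample, bandwidth=None):
    """Return a function that evaluates the smoothed density at a point."""
    h = bandwidth if bandwidth is not None else rule_of_thumb_bandwidth(sample)
    norm = 1.0 / (len(sample) * h * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of Gaussian bumps of width h centred on each data point.
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)

    return density

# Hypothetical B-V colours, for illustration only (not Hipparcos data).
b_v = [0.15, 0.32, 0.41, 0.45, 0.52, 0.58, 0.63, 0.71, 0.98, 1.02]
density = gaussian_kde(b_v)
grid = [i * 0.05 for i in range(-10, 41)]       # B-V from -0.5 to 2.0
curve = [(x, density(x)) for x in grid]          # points of the smoothed curve
```

Unlike a histogram, the estimate does not depend on bin edges; the bandwidth plays the role of the bin width and controls how smooth the curve is.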

The X-Y plot of Vmag against B-V reveals the famous structure in the dataset: the color-magnitude diagram of bright stars, showing the main sequence, the giant branch (with red clump stars), and a few Hyades white dwarfs.

Fundamental plane (FP) of globular clusters: matrix of pairwise correlation coefficients; pairwise plots; Principal Component Analysis.
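The first and last steps of this demo can be sketched together: build the matrix of pairwise Pearson correlation coefficients, then take its leading eigenvector as the first principal component. This is a rough sketch under stated assumptions, not the routine VOStat calls: the function names are hypothetical, the three columns are made-up numbers rather than globular cluster parameters, and power iteration is used only because it is short (it needs a clear gap between the two largest eigenvalues to converge).

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Square matrix (list of lists) of pairwise correlations."""
    return [[pearson(a, b) for b in columns] for a in columns]

def leading_component(matrix, iterations=500):
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix,
    found by power iteration."""
    n = len(matrix)
    v = [1.0 / (i + 1) for i in range(n)]  # arbitrary asymmetric start vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * mv[i] for i in range(n))  # Rayleigh quotient
    return eigenvalue, v

# Made-up columns, for illustration only.
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 8.1, 9.8],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
corr = correlation_matrix(columns)
eigenvalue, component = leading_component(corr)
explained = eigenvalue / len(columns)  # share of total variance in the first PC
```

Because the correlation matrix has ones on its diagonal, its eigenvalues sum to the number of variables, so `explained` is the fraction of the standardized variance carried by the first component.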

Core parameters as a group tend to be highly correlated, unlike the half-light parameters. This is indicative of dynamical evolution driven by core collapse.

Exploring outliers: Palomar-QUEST synoptic sky survey; 9 mix-and-match colors from 8 filters; aim: finding outliers in color-color space for spectroscopic follow-up; 1000 random objects.

Boxplot: reveals relationships between colors (mean, median, overlap, outliers).
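The numbers behind a boxplot can be sketched as follows. This is an illustrative sketch rather than VOStat's routine: the function name and the color values are hypothetical, and the outlier rule (points more than 1.5 interquartile ranges beyond the quartiles) is the common boxplot convention, assumed here.

```python
import statistics

def boxplot_summary(values, whisker=1.5):
    """Mean, quartiles, and points beyond the whisker fences."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return {
        "mean": statistics.fmean(values),
        "median": median,
        "q1": q1,
        "q3": q3,
        "outliers": [v for v in values if v < low or v > high],
    }

# Hypothetical values of one color, for illustration only.
color = [0.1, 0.2, 0.3, 0.4, 0.5, 3.0]
summary = boxplot_summary(color)
```

Computing one such summary per color and comparing the boxes side by side shows which colors overlap and which objects fall outside the whiskers.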

Clustering: K-means provides the cluster centers, along with the within-cluster sums of squares (withinss) and a list of possible outliers.
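A minimal K-means sketch that returns the same kinds of output (centers, within-cluster sums of squares, candidate outliers) is shown below. It is not VOStat's code: the function names and the data points are hypothetical, and flagging the points farthest from their assigned center is only one plausible way to build an outlier list, assumed here for illustration.

```python
import math
import random

def assign(points, centers):
    """Group points by their nearest center (Euclidean distance)."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm; points is a list of equal-length tuples."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        clusters = assign(points, centers)
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centers[c]          # keep the center of an empty cluster
            for c, members in enumerate(clusters)
        ]
        if updated == centers:                   # converged
            break
        centers = updated
    clusters = assign(points, centers)
    withinss = [
        sum(math.dist(p, centers[c]) ** 2 for p in members)
        for c, members in enumerate(clusters)
    ]
    return centers, clusters, withinss

def farthest_points(centers, clusters, count):
    """Candidate outliers: the points farthest from their own center."""
    scored = [
        (math.dist(p, centers[c]), p)
        for c, members in enumerate(clusters)
        for p in members
    ]
    return [p for _, p in sorted(scored, reverse=True)[:count]]

# Hypothetical points in a two-color plane, for illustration only.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (4, 4),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centers, clusters, withinss = kmeans(points, k=2)
candidates = farthest_points(centers, clusters, count=1)
```

The result depends on the starting centers, so in practice the algorithm is usually run from several random starts and the solution with the smallest total withinss is kept.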

K-density: probability - density association for outliers.

Visual confirmation (found from 1000 random objects).

Summary: web-based; VO compatible; public domain and specialized routines.