Automated Classification of X-ray Sources for Very Large Datasets Susan Hojnacki, Joel Kastner, Steven LaLonde Rochester Institute of Technology Giusi.


Automated Classification of X-ray Sources for Very Large Datasets Susan Hojnacki, Joel Kastner, Steven LaLonde Rochester Institute of Technology Giusi Micela INAF, Osservatorio Astronomico di Palermo April 2005

2 Background Pervasive Problem: Like many current space science missions, the Chandra X-ray Observatory is generating data at a rate much faster than can be analyzed with current tools Enormous amount of data from Chandra: 6 CCD arrays x 1024 x 1024 = 6,291,456 pixels; ~3.2 s exposure time → 9000 frames generated in 8 hours of observing!
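The data-volume figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative, and the final product (pixel readouts per observation) is a derived quantity that the slide itself does not state.

```python
# Values quoted on the slide.
N_CCDS = 6
CCD_SIDE_PIXELS = 1024
FRAME_TIME_S = 3.2        # approximate exposure time per frame
OBSERVING_HOURS = 8

# Pixels read out in a single frame across all six CCDs.
pixels_per_frame = N_CCDS * CCD_SIDE_PIXELS * CCD_SIDE_PIXELS

# Number of frames accumulated over the observing window.
frames = round(OBSERVING_HOURS * 3600 / FRAME_TIME_S)

# Derived quantity (not on the slide): total pixel readouts.
pixel_readouts = pixels_per_frame * frames

print(pixels_per_frame)   # 6291456
print(frames)             # 9000
```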

3 Background X-ray images help astronomers study new star formation and galactic evolution X-ray sources are classified by visual inspection of individual spectra and a model-fitting approach; typically one source at a time Good for studying physics of bright, individual sources, but time consuming for analysis of rich stellar clusters Model-fitting approach difficult to use with faint sources or “new” sources that don’t fit any existing models

4 Background Our existing semi-automated technique groups X-ray sources based on X-ray spectral attributes Uses a combination of techniques from the fields of multivariate statistics, remote sensing, and pattern recognition Objective, model-independent approach that requires no a priori assumption as to nature of X-ray source Allows for automated exploration to find “interesting” objects, or clusters of objects, for further study

5 Algorithm Input Data High energy X-ray spectrum divided into 42 spectral bands Photon counts within the 42 spectral bands are used as the multivariate input variables Input data is from the Chandra X-ray Observatory, but can be from any X-ray observatory
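As a rough illustration of how a source's photon list could be turned into the 42-band input vector, the sketch below counts photons in equal-width energy bands. The slide does not give the band edges or the energy range, so the 0.5 to 8.0 keV range, the equal-width binning, and the function name `band_counts` are all assumptions made for this example.

```python
def band_counts(energies_kev, n_bands=42, e_min=0.5, e_max=8.0):
    """Count photons in n_bands equal-width bands between e_min and e_max.

    The energy range and equal-width bands are assumptions for illustration;
    the slide only states that 42 spectral bands are used.
    """
    width = (e_max - e_min) / n_bands
    counts = [0] * n_bands
    for energy in energies_kev:
        if e_min <= energy < e_max:
            # Guard against floating-point round-up at the top edge.
            index = min(int((energy - e_min) / width), n_bands - 1)
            counts[index] += 1
    return counts


# Hypothetical photon energies (keV) for one source; 9.5 keV is out of range.
photon_energies = [0.6, 0.62, 1.1, 1.15, 2.3, 6.4, 6.41, 7.9, 9.5]
features = band_counts(photon_energies)
print(len(features), sum(features))
```

Each source then contributes one 42-element vector, and stacking these vectors gives the multivariate input matrix.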

6 Chandra Deep Field Image Orion Nebula Cluster Energy Spectrum X-ray Light curve (Image: Garmire et al. 2000)

7 Algorithm Details Multivariate techniques used: Principal Component Analysis; Agglomerative Hierarchical Clustering; K-means Clustering. All are “unsupervised” methods. None require multivariate normal data. Choice of number of resulting classes is heuristic.

8 Example Application Is X-ray emission from young stars derived from coronal activity, accretion, outflow activity, or some combination of these mechanisms? The answer to this question will have an impact on studies of a wide variety of astrophysical phenomena that produce X-ray emission.

9 Chandra Orion Ultradeep Project (COUP) COUP dataset compiled from ~850 ks Chandra observation of the Orion Nebula Cluster Represents most sensitive and comprehensive description of X-ray emission from a young star cluster (Getman et al. 2005) ~1616 X-ray sources detected, some of which cannot be “seen” at visible or near-infrared wavelengths Spectral classification technique applied to sample of 444 sources selected from COUP image

10 Chandra X-ray image of the Orion Nebula Cluster (10-day integration; Feigelson et al. 2004)

11 Principal Component Plot Plot of the first 2 principal components showing the source classes (4 components were retained) Progression of classes moving clockwise around the arch forms a sequence of decreasing spectral hardness Average spectra for some of the classes are shown Class numbers increase clockwise around the curve

12

13 Example Classes Class 2 Class 14

14 Analysis of Results Class Sequences vs. Standard Measures of X-ray / Visual / Near-IR Spectral Properties

15 Analysis of Results For this sample of low-mass young stars: Classes form sequences in hydrogen column density, visual absorption, and near-IR K-band excess, demonstrating that the algorithm efficiently sorts young stars into physically meaningful groups Lack of correlation with effective temperature shows that stellar X-ray spectral properties are not well correlated with stellar photospheric properties

16 Knowledge Discovery Preliminary classification results reveal that our spectral clustering technique can be used to efficiently identify very young X-ray sources that: – lack optical and near-infrared counterparts – display strong Fe Kα line emission – display large-amplitude, impulsive flares Within the COUP dataset, such sources likely represent the youngest protostars in the ONC

17 Work To Do Phase I Classify X-ray sources in other star formation regions based on previous source groupings Compare expected vs actual results for known sources Extend algorithm for use with ‘unknown’ X-ray source datasets

18 Work To Do Phase II Study X-ray source variability and add temporal inputs to the algorithm Apply new algorithm to datasets from Chandra Develop into a software tool for use by the X-ray astronomy community

19 Space Science Goals Algorithm results aid in the study of physical conditions of X-ray plasmas surrounding young stars: Determine if young stars in other regions of the sky fit into previously established statistical groupings, helping to ascertain their evolutionary status Determine mechanisms underlying the bright X-ray emission that is a distinguishing feature among young stars Improve understanding of nature and timescale of accretion onto young, solar-mass stars from protoplanetary disks

20 Background Material Algorithm Details

21 Principal Component Analysis Goal is to identify a new, smaller set of uncorrelated variables, called the principal components, which explains all (or nearly all) of the total variance in the data set Each principal component is described by: eigenvector (linear combination of original variables); eigenvalue (variance accounted for by that PC) Number of principal components to retain is based on analysis of several stopping rules
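The eigenvector and eigenvalue description above can be sketched with NumPy. This is a minimal illustration, not the authors' implementation: it assumes PCA on the covariance matrix (the slides do not say whether covariance or correlation was used), it uses simulated Poisson counts in place of real spectra, and the function name `principal_components` is hypothetical. Retaining 4 components follows the Principal Component Plot slide.

```python
import numpy as np


def principal_components(X, n_keep):
    """Return PC scores, eigenvalues, and explained-variance fractions.

    Assumes PCA on the covariance matrix of the mean-centered data.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(centered, rowvar=False)

    # eigh is used because the covariance matrix is symmetric.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # eigh returns ascending order; sort descending by variance.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Each score column is a linear combination of the original variables.
    scores = centered @ eigenvectors[:, :n_keep]
    explained = eigenvalues / eigenvalues.sum()
    return scores, eigenvalues[:n_keep], explained[:n_keep]


# Simulated stand-in data: 100 sources x 42 spectral bands.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=20.0, size=(100, 42))
scores, eigenvalues, explained = principal_components(counts, n_keep=4)
print(scores.shape)
```

The covariance matrix of the returned scores is diagonal, with the eigenvalues on the diagonal, which is what "uncorrelated variables" means here.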

22 Agglomerative Hierarchical Clustering Attempts to find natural groupings of the detected X-ray sources Partitions set of X-ray sources into relatively homogeneous subsets based on inter-source distances Starts with 1 source in each cluster and successively merges them based on statistical distance measure Examines distance level at each merger step to determine the final number of clusters
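The merge procedure described on this slide can be sketched in plain Python. Several choices below are assumptions made for illustration, since the slides do not specify them: Euclidean distance, average linkage, and picking the final partition at the largest jump in merge distance. The toy 2-D points stand in for principal-component scores.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(points, c1, c2):
    """Mean pairwise distance between two clusters (assumed linkage rule)."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(points):
    """Start with one point per cluster and merge the closest pair each step.

    Returns a list of (merge_distance, clusters_after_merge) tuples.
    """
    clusters = [[i] for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(points, clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        history.append((d, [sorted(c) for c in clusters]))
    return history


def cut_at_largest_jump(history):
    """Return the partition just before the largest jump in merge distance."""
    distances = [d for d, _ in history]
    jumps = [distances[k + 1] - distances[k] for k in range(len(distances) - 1)]
    k = max(range(len(jumps)), key=jumps.__getitem__)
    return history[k][1]


# Toy data with three well-separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0),
          (5.0, 5.0), (5.1, 4.9),
          (10.0, 0.0), (10.2, 0.1)]
clusters = cut_at_largest_jump(agglomerate(points))
print(sorted(clusters))   # [[0, 1, 2], [3, 4], [5, 6]]
```

This brute-force version is only practical for small samples; a library routine would be the usual choice for a dataset the size of COUP.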

23 K-means Clustering Hierarchical clustering cannot transfer a source from one cluster to another if initially grouped incorrectly: K-means used for “fine-tuning” K-means goal: arrive at clusters with small within-cluster variation and large between-cluster variation Start with cluster assignments from hierarchical clustering for initial partition of X-ray sources Iterative process → change X-ray source’s cluster membership if there is a cluster with a closer centroid
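The "fine-tuning" step can be sketched as a K-means loop seeded with an existing partition instead of random centroids. This is an illustrative sketch under assumptions: squared Euclidean distance is used, the function name `refine_with_kmeans` is hypothetical, and the initial labels below are made up to show one misassigned point being moved.

```python
def centroid(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def refine_with_kmeans(points, labels, max_iter=100):
    """Reassign each point to its nearest centroid until labels stop changing.

    `labels` is the initial partition, e.g. from hierarchical clustering.
    A cluster that loses all its members simply disappears.
    """
    labels = list(labels)
    for _ in range(max_iter):
        # Recompute one centroid per label currently in use.
        centroids = {}
        for k in set(labels):
            members = [p for p, lab in zip(points, labels) if lab == k]
            centroids[k] = centroid(members)

        # Move each point to the cluster with the closest centroid.
        new_labels = [
            min(centroids, key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (4.0, 4.0), (4.2, 3.9), (3.9, 4.1)]
initial = [0, 0, 1, 1, 1, 1]   # point 2 starts in the wrong cluster
refined = refine_with_kmeans(points, initial)
print(refined)   # [0, 0, 0, 1, 1, 1]
```

Seeding from the hierarchical result keeps the number of clusters chosen in the previous step while allowing individual sources to change membership.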