An Automated Classification Algorithm for Multi-wavelength Data. Yanxia Zhang, Ali Luo, Yongheng Zhao. National Astronomical Observatories, China. 2005.8.16

Presentation transcript:

An Automated Classification Algorithm for Multi-wavelength Data
Yanxia Zhang, Ali Luo, Yongheng Zhao
National Astronomical Observatories, China, Lijiang

Astronomy facing a “data avalanche”
Sky survey panels: IRAS 25 μm, 2MASS 2 μm, DSS optical, IRAS 100 μm, WENSS 92 cm, NVSS 20 cm, GB 6 cm, ROSAT ~keV

Necessity Is the Mother of Invention
Data avalanche: Virtual Observatories, DM & KDD

Data Mining & KDD
DM is the core of KDD.
Diagram elements: data cleaning, data federation, database, data warehouse, task selection, DM, pattern evaluation

One Task of DM: Classification
The scheme of classification of multiwavelength data
Diagram elements: cross identification, all features, feature selection, selected features, training set, test set, validated set, classification method, classifier, new data, predict

Data sample
Near infrared: 2MASS (J, H, K)
Optical: USNO A2.0 (B, R)
X-ray: ROSAT (CR, HR1, HR2, ext, extl)
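Sources from the three catalogues have to be cross-identified before their measurements can be combined. The slides do not say how the match was done, so the following is only a minimal sketch of a nearest-neighbour positional match. The function names `angular_sep_deg` and `cross_match` and the matching radius are hypothetical, chosen for illustration.

```python
import math

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cross_match(primary, secondary, radius_arcsec):
    """For each (ra, dec) in primary, return the index of the nearest
    secondary source within radius_arcsec, or None if there is none.
    The radius is an assumption; it is not given in the slides."""
    matches = []
    for ra, dec in primary:
        best, best_sep = None, radius_arcsec / 3600.0
        for k, (ra2, dec2) in enumerate(secondary):
            sep = angular_sep_deg(ra, dec, ra2, dec2)
            if sep <= best_sep:
                best, best_sep = k, sep
        matches.append(best)
    return matches

# Toy positions in degrees (made up for illustration).
xray = [(10.0, 20.0), (50.0, -5.0)]
optical = [(10.001, 20.0), (120.0, 30.0)]
matched = cross_match(xray, optical, radius_arcsec=10.0)
```

This brute-force loop is quadratic in catalogue size; for survey-scale catalogues a spatial index would be needed.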

Known sample
Stars: SIMBAD
Normal galaxies: RC3
AGNs: Veron (2000)

Feature Selection
Parameters: B+2.5lgCR, J+2.5lgCR, B-R, J-H, H-K, lgCR, HR1, HR2, ext, extl
Method: ReliefF
Result of feature selection (in the order shown on the slide): B+2.5lgCR, J+2.5lgCR, B-R, HR2, H-K, ext, J-H, lgCR, HR1, extl
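As a rough illustration of this step, the sketch below derives the ten parameters from one cross-matched source and scores features with a simplified ReliefF that uses a single nearest neighbour per class. It assumes that lg means log10; the settings actually used for ReliefF are not given in the slides, and the names `build_features` and `relieff` are hypothetical.

```python
import math

def build_features(src):
    """Derive the ten slide parameters from one source given as a dict.
    Assumes lgCR = log10(CR), with CR the ROSAT count rate."""
    lg_cr = math.log10(src["CR"])
    return {
        "B+2.5lgCR": src["B"] + 2.5 * lg_cr,
        "J+2.5lgCR": src["J"] + 2.5 * lg_cr,
        "B-R": src["B"] - src["R"],
        "J-H": src["J"] - src["H"],
        "H-K": src["H"] - src["K"],
        "lgCR": lg_cr,
        "HR1": src["HR1"], "HR2": src["HR2"],
        "ext": src["ext"], "extl": src["extl"],
    }

def relieff(X, y):
    """Simplified ReliefF: one nearest hit and one nearest miss per other
    class, visiting every instance once. Returns one weight per feature."""
    n, d = len(X), len(X[0])
    # Feature ranges scale each difference into [0, 1].
    span = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]
    prior = {c: y.count(c) / n for c in set(y)}

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def nearest(i, cls):
        cand = [k for k in range(n) if k != i and y[k] == cls]
        if not cand:
            return None
        return min(cand, key=lambda k: sum(diff(X[i], X[k], j) for j in range(d)))

    w = [0.0] * d
    for i in range(n):
        hit = nearest(i, y[i])
        misses = {c: nearest(i, c) for c in prior if c != y[i]}
        for j in range(d):
            if hit is not None:
                w[j] -= diff(X[i], X[hit], j) / n
            for c, miss in misses.items():
                scale = prior[c] / (1.0 - prior[y[i]])
                w[j] += scale * diff(X[i], X[miss], j) / n
    return w

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [1.0, 0.8], [0.9, 0.2], [0.8, 0.5]]
y = ["star", "star", "star", "AGN", "AGN", "AGN"]
weights = relieff(X, y)
```

A higher weight means a feature differs more between classes than within a class, so sorting the weights gives a ranking from which a subset can be chosen.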

Classification
Method: Naïve Bayes classifier
Classification results for three situations:
With the full set of features: 97.0%
With weighted features: 97.6%
With the subset of features: 97.9%
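A minimal sketch of a Naïve Bayes classifier for continuous features follows. It assumes a Gaussian likelihood for each feature, which the slides do not state. The optional `weights` argument scales each feature's log-likelihood; this is one possible way to use feature weights, not necessarily the scheme behind the weighted result above. The names `fit_gaussian_nb` and `predict` are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate the prior and per-feature mean and variance of each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        # A small floor on the variance avoids division by zero.
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-9)
                     for col, m in zip(cols, means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict(model, row, weights=None):
    """Return the class with the highest (optionally weighted) log posterior.
    A weight of 0 drops a feature; all weights equal to 1 is plain Naive Bayes."""
    weights = weights or [1.0] * len(row)
    best, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v, w in zip(row, means, variances, weights):
            score += w * (-0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v))
        if score > best_score:
            best, best_score = label, score
    return best

# Toy training data (made up for illustration).
X = [[0.0, 1.0], [0.2, 1.1], [0.1, 0.9], [2.0, 3.0], [2.1, 3.2], [1.9, 2.9]]
y = ["star", "star", "star", "AGN", "AGN", "AGN"]
model = fit_gaussian_nb(X, y)
label = predict(model, [0.1, 1.0])
```

Passing a weight vector of ones and zeros restricts the classifier to a feature subset, which mirrors the third situation in the results.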

Summary
1. Feature selection lets us handle high-dimensional data and pick out the important attributes, which improves both the efficiency and the effectiveness of classification.
2. The Naïve Bayes algorithm is a robust method that classifies multiwavelength data with high accuracy. It can be applied not only to multiwavelength data but also to other data, such as photometric data, spectral data, image data, or combinations of these types.
3. The classifier helps to preselect source candidates for large surveys and to classify new data.
4. The methods will be part of the VO toolkits.

Thanks a lot!