Pattern Discovery Tools for Large Astronomical Surveys

Tin Kam Ho
Bell Labs, Lucent Technologies
tkh@research.bell-labs.com

in collaboration with
David Wittman, J. Anthony Tyson (University of California, Davis)
Samuel Carliles, Wil O'Mullane, Alex Szalay (Johns Hopkins University)

Mirage web site: http://www.cs.bell-labs.com/who/tkh/mirage
VO interface: http://skyservice.pha.jhu.edu/develop/vo/mirage

Mirage (in public release since 2002) is a prototype of an analysis tool that supports pattern discovery across multi-typed data. Mirage is a Java-based tool organized around a command interpreter that receives action commands from textual input or a graphical user interface. The action commands are for loading data, incremental import of new entries and new attributes, simple attribute manipulation, and activating several embedded classification routines. The most important functionalities are built on simultaneous visualization of raw image data, extracted feature vectors, and classification results.

The graphical display presents a stack of canvas pages. Each page can be subdivided arbitrarily, via horizontal or vertical splits, into rectangular cells. Each cell can be loaded with any particular data view module via simple drag-and-drop operations. Each module provides its own control commands to manipulate the specific method of data presentation.

In addition, all view modules implement the same Java interface, "ActivePanel", which contains the following commands that, when coupled with view-specific operations, support very powerful exploration operations:

getSelected()
clearSelected()
highlightDataEntry()
colorDataEntry()
clearHighlights()
clearColors()
changeToMonochrome()
changeToColor()

Early results from various uses of Mirage have been very encouraging.
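The way a shared interface lets a selection in one view echo in the others can be sketched as follows. This is a minimal illustration and not Mirage's source code: only the eight method names come from the list above. The parameter and return types, the RecordingPanel toy view, and the SelectionBroadcaster helper are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Method names are from the poster; every signature here is an assumption.
interface ActivePanel {
    Set<Integer> getSelected();
    void clearSelected();
    void highlightDataEntry(int entryIndex);
    void colorDataEntry(int entryIndex, String colorName);
    void clearHighlights();
    void clearColors();
    void changeToMonochrome();
    void changeToColor();
}

// Hypothetical toy view: it records state instead of drawing anything.
class RecordingPanel implements ActivePanel {
    private final Set<Integer> selected = new LinkedHashSet<>();
    private final Set<Integer> highlighted = new LinkedHashSet<>();
    private final Map<Integer, String> colors = new HashMap<>();
    private boolean monochrome = false;

    // Stand-in for a view-specific operation, e.g. dragging over points.
    void select(int entryIndex) { selected.add(entryIndex); }
    Set<Integer> getHighlighted() { return highlighted; }

    @Override public Set<Integer> getSelected() { return selected; }
    @Override public void clearSelected() { selected.clear(); }
    @Override public void highlightDataEntry(int entryIndex) { highlighted.add(entryIndex); }
    @Override public void colorDataEntry(int entryIndex, String colorName) { colors.put(entryIndex, colorName); }
    @Override public void clearHighlights() { highlighted.clear(); }
    @Override public void clearColors() { colors.clear(); }
    @Override public void changeToMonochrome() { monochrome = true; }
    @Override public void changeToColor() { monochrome = false; }
}

// Hypothetical helper that pushes one panel's selection to all other panels.
class SelectionBroadcaster {
    private final List<ActivePanel> panels = new ArrayList<>();

    void register(ActivePanel panel) { panels.add(panel); }

    void broadcastFrom(ActivePanel source) {
        // Copy first so later changes to the source selection cannot interfere.
        Set<Integer> picked = new LinkedHashSet<>(source.getSelected());
        for (ActivePanel panel : panels) {
            if (panel == source) continue;   // the source already shows its selection
            panel.clearHighlights();
            for (int entryIndex : picked) panel.highlightDataEntry(entryIndex);
        }
    }
}

class LinkedViewsDemo {
    public static void main(String[] args) {
        RecordingPanel scatterPlot = new RecordingPanel();
        RecordingPanel table = new RecordingPanel();
        SelectionBroadcaster hub = new SelectionBroadcaster();
        hub.register(scatterPlot);
        hub.register(table);

        scatterPlot.select(3);
        scatterPlot.select(7);
        hub.broadcastFrom(scatterPlot);
        System.out.println(table.getHighlighted()); // prints [3, 7]
    }
}
```

Because every view exposes the same small set of operations, the broadcaster needs no knowledge of whether a panel is a histogram, a table, or an image display.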
We plan to refine and generalize the ideas explored in the software, working toward a more versatile tool that supports more advanced analysis of the large-scale imaging databases produced by next-generation astronomical surveys.

Many large-scale sky surveys are generating data at a rate far beyond the reach of traditional manual analysis. This trend is accelerating: in the near future, the Large Synoptic Survey Telescope (LSST) (http://www.lsst.org/lsst_home.shtml) will repeatedly image the entire sky visible from its site, at multiple wavelengths, producing a time-tagged imaging database of 20 petabytes and a corresponding event catalog of 150 TB, with parameters of position, time, intensity, colors, and motion. Besides the much increased data volume, databases are no longer collected for a single well-defined purpose, with filters and detectors optimized for known features. Paradigm-shifting discoveries of unexpected events or correlations often result from open-ended explorations. This requires a tool that enables not only detection of the unexpected, but also rapid exploration and visualization of the new phenomenon to determine whether it is scientifically valuable or a previously unidentified systematic error.

Challenges for the Analysis Tool

Versatile visualization utilities allowing many perspectives

Visualization can help verify the correctness of preprocessing steps, clean up undesirable artifacts, choose relevant samples, spot explicit patterns, select useful features, and suggest algorithms and models. To support all these needs, flexibility in the choice of perspectives is critical. Moreover, a connecting architecture is needed so that data relationships can be easily tracked between different views of the data.

Support for exploratory discovery across diverse data types

Astronomical surveys contain multiple data types and incomparable groups of variables.
Examples are images, spectra, light curves, and various scalar or vector parameters derived from the raw data. Relationships uncovered in each data type need to be correlated with those from others. This requires tools for modeling, building index structures, and navigating data distributions in each data type, and methods for tracking correlations between different navigation paths.

Integration of manual and automatic pattern recognition methods

Human judgement needs to be part of the analysis loop to apply proper domain expertise. Automatic pattern recognition algorithms can process large data volumes efficiently, objectively, and consistently. They can also compensate for deficiencies in manual exploration caused by unreliable human intuition or the inability to comprehend high-dimensional vectors. But "stand-alone" algorithms are not enough. A convenient bridge is needed to connect manual and automatic exploration tools. This includes support for rapid examination of different sampling options and feature choices, algorithmic alternatives and parameters, and facilities for checking the results for validity and interpretation, in contexts at different levels of abstraction from the raw data.

And a good tool should:
- leverage existing visualization and analysis methods,
- enable continued growth by addition of new visualization or analysis tools,
- support interfaces with existing database access tools,
- be scalable in data volume and processing speed.
Mirage features:
- Data Visualization in Multiple, Linked Views: Show patterns in histograms, scatter plots, parallel coordinates, tables, images
- Selection and Tracking: Select points in any view, broadcast to all others with highlights or colors
- Systematic Traversal of Data Structures: Walk in histograms, cluster graphs or trees, echo in all other views
- Flexible Graphics Utilities: Open multiple-page plots easily with arbitrary configuration
- Command Scripts: Run prepared groups of operations as animations
- Remote Database Access: Retrieve data for analysis over WWW; VO data access via IVOA client package

Work in progress:
- Images: FITS image panel with World Coordinates support using JSky package; Array of image panels with synchronized zooming and panning; Panel for overlay of multiple images and object markers
- Analysis: Connection to external libraries for automatic pattern recognition; Data structures for high-dimensional spaces
- Database: Join among different datasets on arbitrary common keys (e.g. RA, DEC); Coupling with VO access methods
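
The planned join among datasets on common keys could look roughly like the hash join below. This is only a sketch under stated assumptions: the KeyJoin class, the row-as-map representation, and the "mag" and "ellipticity" columns are hypothetical, and keys are matched by exact equality. A real positional match on RA and DEC would normally need a tolerance, which is not attempted here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Mirage: inner join of two row lists on named key columns.
class KeyJoin {
    // Convenience builder: row("RA", 10.5, "DEC", -3.2) gives {RA=10.5, DEC=-3.2}.
    static Map<String, Object> row(Object... namesAndValues) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            result.put((String) namesAndValues[i], namesAndValues[i + 1]);
        }
        return result;
    }

    // Composite key: the values of the key columns, in the order given.
    static List<Object> keyOf(Map<String, Object> row, List<String> keys) {
        List<Object> key = new ArrayList<>();
        for (String name : keys) key.add(row.get(name));
        return key;
    }

    static List<Map<String, Object>> innerJoin(List<Map<String, Object>> left,
                                               List<Map<String, Object>> right,
                                               List<String> keys) {
        // Index the right-hand rows by composite key (exact equality only).
        Map<List<Object>, List<Map<String, Object>>> index = new HashMap<>();
        for (Map<String, Object> r : right) {
            index.computeIfAbsent(keyOf(r, keys), k -> new ArrayList<>()).add(r);
        }
        List<Map<String, Object>> joined = new ArrayList<>();
        for (Map<String, Object> l : left) {
            List<Map<String, Object>> matches =
                index.getOrDefault(keyOf(l, keys), Collections.emptyList());
            for (Map<String, Object> r : matches) {
                // Left columns first; same-named right columns overwrite them.
                Map<String, Object> merged = new LinkedHashMap<>(l);
                merged.putAll(r);
                joined.add(merged);
            }
        }
        return joined;
    }
}

class KeyJoinDemo {
    public static void main(String[] args) {
        List<Map<String, Object>> photometry = Arrays.asList(
            KeyJoin.row("RA", 10.5, "DEC", -3.2, "mag", 21.4),
            KeyJoin.row("RA", 11.0, "DEC", 4.0, "mag", 19.9));
        List<Map<String, Object>> shapes = Arrays.asList(
            KeyJoin.row("RA", 10.5, "DEC", -3.2, "ellipticity", 0.12));
        // Only the first photometry row has a partner in the shapes table.
        System.out.println(KeyJoin.innerJoin(photometry, shapes, Arrays.asList("RA", "DEC")));
    }
}
```

Passing the key columns as a list keeps the join independent of any particular catalog schema, in line with the "arbitrary common keys" wording above.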