Published by Hilary McGee. Modified over 9 years ago.
1
The VAO is operated by the VAO, LLC.
Ashish Mahabal (aam@astro.caltech.edu), Ciro Donalek, Matthew Graham, Ray Plante, George Djorgovski
Data 2 Knowledge study project
VAO-LSST Meeting, NOAO, 24 March 2011
2
March 23, 2011, Ashish Mahabal
Goals
Feasibility study
What is out there
What is needed
Milestones
What can be done
3
Exploration of observable parameter spaces and searches for rare or new types of objects
Djorgovski
4
Overview – many connections
Astroinformatics (next meeting in Sep. 2011)
VOStat and other R/Statistics tools
Data challenges
Various sky surveys
Related issues
  Semantics
  Classification/characterization
  Distributed data
  GPUs
Focus on time domain
5
Focus on time-domain
We have the expertise, and the time domain encompasses all aspects of data mining (save one)
Plus, real-time forces us to be fast.
Portfolio building – growing columns of tables
Bayesian networks utilizing auxiliary information
Lightcurve techniques for characterizing objects
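As a rough illustration of what "lightcurve techniques for characterizing objects" can mean in practice, the sketch below reduces one light curve to a few summary numbers. The function name `lightcurve_features`, the choice of features, and the sample magnitudes are all assumptions made for illustration; the slides do not say which features the project uses.

```python
import statistics


def lightcurve_features(mags, errs):
    """Summarize one light curve (magnitudes plus 1-sigma errors).

    Hypothetical helper: the feature set is illustrative only.
    """
    n = len(mags)
    if n < 2 or n != len(errs):
        raise ValueError("need at least two points and one error per point")

    # Inverse-variance weighted mean magnitude.
    weights = [1.0 / (e * e) for e in errs]
    wmean = sum(w * m for w, m in zip(weights, mags)) / sum(weights)

    # Reduced chi-square against a constant-brightness model;
    # values well above 1 suggest real variability.
    chi2_red = sum(((m - wmean) / e) ** 2 for m, e in zip(mags, errs)) / (n - 1)

    return {
        "median": statistics.median(mags),
        "amplitude": max(mags) - min(mags),
        "stdev": statistics.stdev(mags),
        "chi2_red": chi2_red,
    }


# Made-up magnitudes, not survey data.
features = lightcurve_features([15.0, 15.5, 15.0, 14.5, 15.0], [0.1] * 5)
```

Each new feature becomes one more column in an object's "portfolio", which a downstream classifier can then consume.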
6
Missing stat and CS tools
7
Missing stat and CS tools
Bootstrap aggregating
Mixture of experts
Boosting
Simulated annealing
Semi-supervised learning
….
From IVOA KDD User guide for Data Mining (Nick Ball)
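To make the first item on this list concrete, here is a minimal sketch of bootstrap aggregating (bagging): train many copies of a simple learner on resampled data and let them vote. The nearest-centroid base learner, the function names, and the toy one-feature data set are assumptions chosen to keep the example short; they are not taken from the IVOA guide.

```python
import random


def fit_centroids(sample):
    """Base learner: mean feature value per class."""
    sums, counts = {}, {}
    for x, label in sample:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_centroids(model, x):
    """Assign x to the class with the nearest centroid."""
    return min(model, key=lambda label: abs(x - model[label]))


def fit_bagged(data, n_models=25, seed=0):
    """Train n_models base learners, each on a bootstrap resample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Draw len(data) points with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(fit_centroids(sample))
    return models


def predict_bagged(models, x):
    """Majority vote over the ensemble."""
    votes = {}
    for model in models:
        label = predict_centroids(model, x)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


# Hypothetical toy data: (feature, class label).
toy = [(0.1, "A"), (0.4, "A"), (0.2, "A"), (0.5, "A"), (0.3, "A"),
       (5.1, "B"), (5.4, "B"), (5.2, "B"), (5.5, "B"), (5.3, "B")]
ensemble = fit_bagged(toy)
```

Calling `predict_bagged(ensemble, 0.3)` then returns the majority label for a new point. Boosting differs in that later learners are trained with more weight on the examples earlier ones got wrong.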
8
Science goal: to close the growing gap between the huge generation of data and our understanding of it
Data Gathering (e.g., new generation instruments …)
Data Farming:
  Storage/Archiving
  Indexing, Searchability
  Data Fusion, Interoperability, ontologies, etc.
Data Mining (or Knowledge Discovery in Databases):
  Pattern or correlation search
  Clustering analysis, automated classification
  Outlier / anomaly searches
  Hyperdimensional visualization
Data visualization and understanding
  Computer aided understanding
  KDD
  Etc.
New Knowledge
Data storage, Pbytes
Data access >10^3 access
Scalability: Petaflops, Exaflops
Computing power (multicore)
Algorithm: parallelism
Visualization: N-dimensional
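One simple form an "outlier / anomaly search" can take is a robust score on a single measured parameter. The sketch below uses the modified z-score based on the median absolute deviation (MAD); this technique, the function name `mad_outliers`, and the 3.5 cutoff are assumptions for illustration, not a method named in the slides.

```python
import statistics


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds threshold.

    Hypothetical helper. The 0.6745 factor rescales the MAD so the
    score is comparable to a standard z-score for Gaussian data.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # No spread to measure against; flag nothing.
        return []
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


# Made-up measurements with one obvious anomaly at the end.
flagged = mad_outliers([10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 25.0])
```

Because it relies on medians, the score is not dragged around by the very outliers it is trying to find, which matters when rare objects are the target.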
9
Currently on the plate
DAME
KNIME (Konstanz Information Miner)
Orange (Visual/Python)
Weka (ML/Java)
RapidMiner (standalone)
10
Comparison matrix for DM/Viz tools
Accuracy
Scalability
Interpretability
Usability
Robustness
Versatility
Speed
Popularity
11
Related activities
Skyalert integration (Graham) – adding data and methods
Solicitation of examples from community
  WD, Blazars’ example
Making R more astronomy friendly
Various datasets
  Differing number of rows, columns
  For supervised/unsupervised classification
TA on GPUs – incorporate in pipeline
12
Slide from Budavari
CUDA zone, PyCUDA, …
13
VAO People working on this
Ashish Mahabal, Ciro Donalek, Matthew Graham, George Djorgovski (Caltech)
Ray Plante (NCSA)
We are also in touch with many others in astro/CS/stats and rely on many groups, including the LSST transients and informatics working groups.