The VAO is operated by the VAO, LLC. Ashish Mahabal, Ciro Donalek, Matthew Graham, Ray Plante, George Djorgovski.


1 The VAO is operated by the VAO, LLC. Ashish Mahabal (aam@astro.caltech.edu), Ciro Donalek, Matthew Graham, Ray Plante, George Djorgovski. Data 2 Knowledge study project. VAO-LSST Meeting, NOAO, 24 March 2011

2 March 23, 2011, Ashish Mahabal. Goals: feasibility study; what is out there; what is needed; milestones; what can be done

3 Exploration of observable parameter spaces and searches for rare or new types of objects (Djorgovski)

4 Overview – many connections: Astroinformatics (next meeting in Sep. 2011); VOStat and other R/statistics tools; data challenges; various sky surveys. Related issues: semantics; classification/characterization; distributed data; GPUs. Focus on time domain

5 Focus on time-domain: Expertise, and it encompasses all aspects of data mining (save one). Plus, real-time forces us to be fast. Portfolio building – growing columns of tables; Bayesian networks utilizing auxiliary information; lightcurve techniques for characterizing objects
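The slide above mentions lightcurve techniques for characterizing objects and a portfolio built as growing columns of tables. As a rough illustration only, the sketch below computes a few generic summary statistics from a light curve, each of which could become one such column. The function name `lightcurve_features` and the choice of features are assumptions made for this example; the slides do not say which features the project uses.

```python
import statistics


def lightcurve_features(times, mags):
    """Summarize a light curve as a small dict of features (hypothetical feature set).

    times: observation epochs (any consistent unit), all distinct
    mags:  magnitudes measured at those epochs
    """
    if len(times) != len(mags) or len(mags) < 2:
        raise ValueError("need at least two (time, magnitude) pairs")

    # Sort by time so consecutive pairs are adjacent epochs.
    pairs = sorted(zip(times, mags))
    sorted_mags = [m for _, m in pairs]

    # Largest absolute rate of change between consecutive epochs.
    max_slope = max(
        abs((m2 - m1) / (t2 - t1))
        for (t1, m1), (t2, m2) in zip(pairs, pairs[1:])
    )

    return {
        "median_mag": statistics.median(sorted_mags),
        "amplitude": max(sorted_mags) - min(sorted_mags),
        "std_mag": statistics.stdev(sorted_mags),
        "max_slope": max_slope,
    }


# Toy light curve with made-up values, for illustration only.
features = lightcurve_features([0.0, 1.0, 2.0, 4.0], [18.0, 18.5, 17.0, 18.0])
print(features)
```

Each new feature adds a key to the returned dict, which maps naturally onto adding a column to a table of objects.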

6 Missing stat and CS tools

7 Missing stat and CS tools: bootstrap aggregating; mixture of experts; boosting; simulated annealing; semi-supervised learning; …. From IVOA KDD User guide for Data Mining (Nick Ball)
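For readers unfamiliar with the first item in the list above, here is a minimal sketch of bootstrap aggregating (bagging): train several copies of a simple classifier on resamples of the training set drawn with replacement, then take a majority vote. The one-feature threshold classifier, the function names, and the toy data are all assumptions chosen to keep the example short; they are not taken from the slides or from the IVOA guide.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a one-feature threshold classifier with the fewest training errors."""
    best = None
    for threshold, _ in points:
        for low_label, high_label in ((0, 1), (1, 0)):
            errors = sum(
                (low_label if x < threshold else high_label) != y
                for x, y in points
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, low_label, high_label)
    _, threshold, low_label, high_label = best
    return lambda x: low_label if x < threshold else high_label


def bagging_fit(points, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) samples with replacement.
        sample = [rng.choice(points) for _ in points]
        models.append(fit_stump(sample))
    return models


def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy (feature, label) pairs with made-up values, for illustration only.
train = [(0.1, 0), (0.4, 0), (0.5, 0), (0.9, 0),
         (1.1, 1), (1.6, 1), (2.0, 1), (2.3, 1)]
models = bagging_fit(train)
print(bagging_predict(models, 0.0), bagging_predict(models, 3.0))
```

An odd number of models avoids tied votes in the two-class case. Any base learner could replace the stump without changing the resample-and-vote structure.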

8 Science goal: to close the growing gap between the huge generation of data and our understanding of it
- Data Gathering (e.g., new generation instruments …)
- Data Farming: storage/archiving; indexing, searchability; data fusion, interoperability, ontologies, etc.
- Data Mining (or Knowledge Discovery in Databases): pattern or correlation search; clustering analysis, automated classification; outlier / anomaly searches; hyperdimensional visualization
- Data visualization and understanding: computer aided understanding, KDD, etc.
- New Knowledge
- Data storage: Pbytes
- Data access: >10^3 access
- Scalability: Petaflops, Exaflops
- Computing power (multicore)
- Algorithm: parallelism
- Visualization: N-dimensional

9 Currently on the plate: DAME; Knime (Konstanz Information Miner); Orange (Visual/python); Weka (ML/Java); Rapidminer (standalone)

10 Comparison matrix for DM/Viz tools: accuracy; scalability; interpretability; usability; robustness; versatility; speed; popularity

11 Related activities
- Skyalert integration (Graham) – adding data and methods
- Solicitation of examples from community
- WD, Blazars' example
- Making R more astronomy friendly
- Various datasets
- Differing number of rows, columns
- For supervised/unsupervised classification
- TA on GPUs – incorporate in pipeline

12 Slide from Budavari: CUDA zone, PyCUDA, …

13 VAO people working on this: Ashish Mahabal, Ciro Donalek, Matthew Graham, George Djorgovski (Caltech); Ray Plante (NCSA). But we are in touch with many others in astro/CS/stats and are relying on many groups, including the LSST transients and informatics working groups

