ASCR Scientific Data Management Analysis & Visualization PI Meeting Exploration of Exascale In Situ Visualization and Analysis Approaches LANL: James Ahrens,

Presentation transcript:

ASCR Scientific Data Management Analysis & Visualization PI Meeting

Exploration of Exascale In Situ Visualization and Analysis Approaches
LANL: James Ahrens, Jon Woodring, Joanne Wendelberger, Francesca Samsel

We explore two in situ approaches at the extreme ends of a spectrum between flexibility and accuracy. We will strive to understand the advantages and disadvantages of both approaches and evaluate their effectiveness. Using the results of this evaluation, we will merge the best of both approaches to produce an optimized exascale visualization and analysis approach.

Statistics and Sampling of Simulation Data with Bitmaps

Challenges
- Locating the data that a scientist needs is daunting due to the scale of the data and the lack of information about it

Solution: Sample Bitmaps
- Bitmap indices provide summary information for a large-scale data set
- They also provide distributional data that can be used for sampling
- Statistics can be extracted from this summary to drill down and extract information of interest
- Bitmaps accelerate statistics and sampling for faster turn-around in exploration with lower sample error

Papers
- Y. Su, G. Agrawal, J. Woodring, K. Myers, J. Wendelberger and J. Ahrens, “Effective and Efficient Data Sampling Using Bitmap Indices”, Cluster Computing, March.
- Y. Su, G. Agrawal, J. Woodring, A. Biswas and H.-W. Shen, “Supporting Correlation Analysis on Scientific Datasets in Parallel and Distributed Settings”, in Proceedings of the International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC'14), June 2014, Vancouver, Canada.
- Y. Su, G. Agrawal, J. Woodring, K. Myers, J. Wendelberger and J. Ahrens, “Taming Massive Distributed Datasets: Data Sampling Using Bitmap Indices”, in Proceedings of the International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC'13), New York, NY, USA, June.
- Y. Su, G. Agrawal, and J. Woodring, “Indexing and Parallel Query Processing Support for Visualizing Climate Datasets”, Proceedings of the 41st International Conference on Parallel Processing, Pittsburgh, PA, Sept.

Figure: Bitmaps are used for sampling and statistics for large-scale data analysis

Contact: James Ahrens

Figure: Adaptive refinement based on an analysis metric, highlighting areas of interest

Reduced Simulation Data Approach
Significantly reducing simulation data by storing sampled and compressed data representations

Adaptive Sampling of Simulation Data

Challenges
- Simulations and experiments generate more data than can be feasibly stored by the scientist

Solution: Adaptive Sample Data based on Analysis Metrics
- Treat the exascale data deluge as a sampling problem
- Use a variety of metrics to automatically select and triage the important data
- Analysis Driven Refinement is a framework that prioritizes and samples using these metrics

Papers
- B. Nouanesengsy, J. Woodring, K. Myers, J. Patchett, and J. Ahrens, “ADR Visualization: A Generalized Framework for Ranking Large-Scale Scientific Data using Analysis-Driven Refinement”, LDAV 2014, November 2014, Paris, France.
- K. Myers, E. Lawrence, M. Fugate, J. Woodring, J. Wendelberger, and J. Ahrens, “An In Situ Approach for Approximating Complex Computer Simulations and Identifying Important Time Steps”, in submission, arXiv:
- A. Biswas, S. Dutta, H.-W. Shen, J. Woodring, “An Information-Aware Framework for Exploring Multivariate Data Sets”, IEEE Visualization 2013, Atlanta, GA, November.

Image Database Approach
Significantly reducing simulation data by storing rendered visualization and analysis images in an image database

Sampling in “Visualization and Analysis” Space

Challenges
- Simulations and experiments generate massive datasets that are difficult to store and analyze in a post-processing manner

Solution: Generate In Situ Image Database
- Enables many different interaction modes including: 1) animation and selection, 2) camera and 3) time
- Creates a responsive, interactive visualization solution, rivaling modern post-processing approaches, based on constant-time retrieval and assembly
- Encourages the use of both computationally intensive analysis and temporal exploration typically avoided in post-processing approaches

Supports Metadata Searching
- By leveraging an image database, our approach allows the analyst to execute metadata queries or browse analysis results to produce a prioritized sequence of matching results

Creation of New Visualizations and Content Querying
- Supports composing of individually imaged operators
- Provides access to the underlying data to enable advanced rendering during post-processing (e.g. new lookup tables, lighting, ...)
- Makes it possible to perform queries that search on the content of the images in the database. Using image-based visual queries, the analyst can ask simple scientific questions and get the expected results

Papers
- J. Ahrens, S. Jourdain, P. O'Leary, J. Patchett, D. H. Rogers, M. Petersen, “An Image-based Approach to Extreme Scale In Situ Visualization and Analysis”, Supercomputing 2014, New Orleans.
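
The metadata search and constant-time retrieval described for the image database can be illustrated with a minimal Python sketch. It is not the authors' implementation: the key fields (time step, camera angles, operator name), the "coverage" metadata field, and all class and function names are hypothetical, chosen only to show how precomputed images might be looked up directly by key and ranked by a metadata score.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ImageKey:
    """Hypothetical key identifying one precomputed image."""
    time_step: int
    phi: int        # camera azimuth in degrees (assumed parameter)
    theta: int      # camera elevation in degrees (assumed parameter)
    operator: str   # e.g. "isosurface" or "slice" (assumed labels)


@dataclass
class ImageRecord:
    """Where the image lives plus free-form metadata written in situ."""
    path: str
    metadata: dict = field(default_factory=dict)


class ImageDatabase:
    def __init__(self):
        self._records = {}

    def add(self, key, record):
        self._records[key] = record

    def get(self, key):
        # Hash lookup: retrieval cost does not grow with simulation size.
        return self._records[key]

    def query(self, predicate, score):
        # Metadata query: filter records, then return them best-first.
        matches = [(k, r) for k, r in self._records.items() if predicate(k, r)]
        return sorted(matches, key=lambda kr: score(*kr), reverse=True)


def build_example_db():
    """Populate a tiny database with made-up entries for illustration."""
    db = ImageDatabase()
    for t in range(3):
        for phi in (0, 90):
            key = ImageKey(t, phi, 45, "isosurface")
            # "coverage" is an invented metadata value, not real data.
            record = ImageRecord(
                path=f"t{t}_phi{phi}.png",
                metadata={"coverage": 0.1 * (t + 1) + phi / 1000},
            )
            db.add(key, record)
    return db


db = build_example_db()
# Direct retrieval for one (time, camera, operator) combination.
image = db.get(ImageKey(1, 90, 45, "isosurface"))
# Prioritized sequence: all phi=90 views, highest coverage first.
ranked = db.query(
    predicate=lambda k, r: k.phi == 90,
    score=lambda k, r: r.metadata["coverage"],
)
```

Interaction modes such as stepping through time or rotating the camera then reduce to changing one field of the key and performing another lookup, which is why no rendering is needed at exploration time.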
Figure: Interactive visualization and compositing using images from the image database

Figure: Using lighting and color mapping, render passes and compositing enable more capable visualization pipelines, such as changing color scale mapping for objects

Figure: Queries based on the image content can be used to search for qualitative results like “best view”

LA-UR
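
One way to picture how compositing and late color mapping could work together is the Python sketch below. It assumes, purely for illustration, that each stored render pass keeps a per-pixel scalar value and a per-pixel depth rather than final colors; the function names and the two color maps are hypothetical and are not taken from the poster or the cited paper.

```python
def composite(layers):
    """Depth-composite render passes.

    Each layer is a (values, depths) pair of equally sized 2D lists.
    A depth of None marks an empty pixel. The nearest sample wins.
    Returns a 2D list of scalar values, with None for background.
    """
    rows = len(layers[0][0])
    cols = len(layers[0][0][0])
    out = [[None] * cols for _ in range(rows)]
    nearest = [[float("inf")] * cols for _ in range(rows)]
    for values, depths in layers:
        for r in range(rows):
            for c in range(cols):
                d = depths[r][c]
                if d is not None and d < nearest[r][c]:
                    nearest[r][c] = d
                    out[r][c] = values[r][c]
    return out


def apply_colormap(values, colormap, background=(0, 0, 0)):
    """Turn composited scalars into RGB tuples using any lookup function."""
    return [
        [background if v is None else colormap(v) for v in row]
        for row in values
    ]


def grayscale(v):
    """Map a scalar in [0, 1] to a gray RGB tuple."""
    g = int(255 * v)
    return (g, g, g)


def blue_to_red(v):
    """Map a scalar in [0, 1] from blue (0) to red (1)."""
    return (int(255 * v), 0, int(255 * (1 - v)))


# Two hypothetical 1x2 render passes: (scalar values, depths).
layer_a = ([[0.0, 1.0]], [[1.0, 5.0]])
layer_b = ([[0.5, 0.5]], [[2.0, 3.0]])

merged = composite([layer_a, layer_b])
gray_image = apply_colormap(merged, grayscale)
recolored = apply_colormap(merged, blue_to_red)
```

Because the color map is applied only after compositing, the same stored passes can be recolored with a different lookup function without rerunning the simulation or the renderer.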