Uncertainty in Analysis and Visualization Topology and Statistics V. Pascucci Scientific Computing and Imaging Institute, University of Utah in collaboration.

Presentation transcript:

Uncertainty in Analysis and Visualization: Topology and Statistics
V. Pascucci, Scientific Computing and Imaging Institute, University of Utah
In collaboration with D. C. Thompson, J. A. Levine, J. C. Bennett, P.-T. Bremer, A. Gyulassy, P. E. Pébay

Massive Simulation and Sensing Devices Generate Great Challenges and Opportunities
Figure labels: Cameras, BlueGene/L, Earth Images, Porous Materials, Carbon Seq. (Subsurface), Hydrodynamic Inst., Turbulent Combustion, Satellite, EM, Retinal Connectome, Climate, Jaguar, Photography, Molecular Dynamics

Data from Many Sources
Where is “uncertain” data coming from?
– High spatial resolution (massive grids)
– Varying parameters, time (ensembles)
– Many fields (pressure, temperature, …)
– Stochastic simulations (a distribution per point)

Many Techniques Apply
Honest assessment of data-driven information
Figure labels: Streamline A, Streamline B, Unstable Manifolds, Analysis Parameter Space, Simulation Parameter Space

Hixels: A Possible Unified Representation
Hixels:
– A pixel/voxel that stores a histogram of values
Can be used to:
– Compress blocks of large data
– Encode the ensembles of runs per location
– Discretize the distribution
Trade data size/complexity for uncertainty
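To make the "compress blocks of large data" use concrete, here is a minimal sketch that downsamples a 1D scalar field into hixels, one histogram per block. The function name `build_hixels` and its parameters (`block_size`, `num_bins`, `lo`, `hi`) are illustrative assumptions, not taken from the slides, and the real representation works on 2D/3D blocks.

```python
def build_hixels(values, block_size, num_bins, lo, hi):
    """Downsample a 1D scalar field into hixels (one histogram per block).

    Hypothetical helper for illustration: each block of `block_size`
    consecutive samples is replaced by a list of `num_bins` counts over
    the value range [lo, hi].
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    width = (hi - lo) / num_bins
    hixels = []
    for start in range(0, len(values), block_size):
        counts = [0] * num_bins
        for v in values[start:start + block_size]:
            # Clamp so out-of-range values and v == hi land in an end bin.
            b = min(max(int((v - lo) / width), 0), num_bins - 1)
            counts[b] += 1
        hixels.append(counts)
    return hixels


field = [0.1, 0.2, 0.9, 0.8, 0.5, 0.5, 0.4, 1.0]
hixels = build_hixels(field, block_size=4, num_bins=2, lo=0.0, hi=1.0)
# hixels == [[2, 2], [1, 3]]
```

Eight samples become two histograms: the spatial detail inside each block is lost, but the spread of values is kept, which is the size-for-uncertainty trade the slide describes.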

Hixels: 1d Example

Hixels: 2d Example (Ensemble)
Figure labels: Normal x9600, Poisson x3200, Mean, Hixels (histogram drawn vertically)

Some Basic Manipulations of the Hixels Representation
– Re-Sampling
– Bucketing Hixels
– Analyzing Statistical Dependence of Hixels
– Fuzzy Isosurfacing
– … suggestions for more?

Re-Sampling from Hixels
One way to represent a scalar field is by sampling the hixels per location. Generate (many) instances V_i of downsampled data by selecting a value for each hixel at random using its distribution. Treat each instance as a scalar field and apply various techniques to analyze it. Initial basic assumption (probably not true): hixel data is independent.
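The re-sampling step can be sketched as follows, under the slide's own independence assumption. The names `sample_realization` and `sample_many`, and the choice to return bin centers as the sampled values, are assumptions made for this illustration.

```python
import random


def sample_realization(hixels, bin_centers, rng):
    """Draw one value per hixel, treating hixels as independent.

    `hixels` is a list of histograms (lists of counts); each drawn value
    is the center of a bin chosen with probability proportional to its
    count.
    """
    return [rng.choices(bin_centers, weights=counts, k=1)[0]
            for counts in hixels]


def sample_many(hixels, bin_centers, n, seed=0):
    """Generate n downsampled instances V_1..V_n from the same hixels."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [sample_realization(hixels, bin_centers, rng) for _ in range(n)]


hixels = [[4, 0], [0, 4], [2, 2]]
bin_centers = [0.25, 0.75]
realizations = sample_many(hixels, bin_centers, n=50, seed=1)
```

Each realization is an ordinary scalar field, so any existing analysis (for example a topological one) can be run on it and the results aggregated over many realizations.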

Convergence of Sampled Topology: 8x8 Hixels
Figure labels: Persistence; Distribution of Topological Boundaries Over Many Realizations

Convergence for 8x8 hixels

Convergence for 16x16 hixels

Convergence of Sampled Topology
Figure labels: Persistence; Distribution of Converged Topological Boundaries Over Data Simplification

Statistically Associated Sheets
Not all values in hixels are independent.
Consider data as an ensemble:
– Each “sample” is per run, not per location
– There may be certain (statistical) associations between neighboring values
Use contingency tables to represent which values in adjacent hixels are related.
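A contingency table for two adjacent hixels can be sketched as a count of co-occurring bin indices across ensemble runs. The function name `contingency_table` and the input layout (one bin index per run at each location) are assumptions for this example.

```python
from collections import Counter


def contingency_table(bins_a, bins_b):
    """Count how often bin i at location A co-occurs with bin j at B.

    bins_a[r] and bins_b[r] are the histogram bin indices observed at
    two adjacent locations in ensemble run r (hypothetical layout).
    """
    if len(bins_a) != len(bins_b):
        raise ValueError("both locations need one entry per run")
    return Counter(zip(bins_a, bins_b))


# Four runs: bin indices seen at two neighboring locations.
table = contingency_table([0, 0, 1, 1], [0, 0, 1, 0])
# table[(0, 0)] == 2, table[(1, 1)] == 1, table[(1, 0)] == 1
```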

Bucketing Hixels

Topological Bucketing of Hixels
Order pairs of critical points by the area between them (hypervolume persistence). This eliminates peaks according to the probability associated with them.

Bucketing Hixels
Hixels of size 16x16x bins/histogram
Hypervolume persistence varied from 16 to 512 by powers of 2
Figure labels: hp=16, hp=32, hp=64, hp=128, hp=256, hp=512

Contingency Tables: Pointwise Mutual Information
pmi(x, y) = 0 => x and y are independent
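The slide gives only the independence condition. The standard definition is pmi(x, y) = log( p(x, y) / (p(x) p(y)) ), which is zero exactly when p(x, y) = p(x) p(y). A minimal sketch that computes it from a table of co-occurrence counts follows; the function name `pmi_table` and the dictionary layout are assumptions for this example.

```python
import math
from collections import Counter


def pmi_table(table):
    """Pointwise mutual information for each observed pair in a table.

    `table` maps (x, y) pairs to co-occurrence counts. Probabilities are
    estimated as relative frequencies; pairs with zero count are skipped
    because their PMI is undefined (log of zero).
    """
    total = sum(table.values())
    count_x, count_y = Counter(), Counter()
    for (x, y), c in table.items():
        count_x[x] += c
        count_y[y] += c
    # p(x,y) / (p(x) p(y)) simplifies to c * total / (count_x * count_y).
    return {(x, y): math.log(c * total / (count_x[x] * count_y[y]))
            for (x, y), c in table.items() if c > 0}


independent = pmi_table({(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 1})
associated = pmi_table({(0, 0): 2, (1, 1): 2})
# independent: every pair has pmi 0; associated: every pair has pmi log(2)
```

Positive values mark pairs of values that occur together more often than independence would predict, which is the kind of association the sheets are meant to capture.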

Sheets
Figure labels: Basins of Minima, Basins of Maxima

More Sheets
Red = 27 buckets/hixel, Blue = 1 bucket/hixel
Figure labels: 64^3 Hixels, 16^3 Hixels, 32^3 Hixels

Fuzzy Isosurfaces Slicing histograms:
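One plausible reading of "slicing histograms" is to look up, in every hixel, the histogram bin that contains the isovalue and use its relative frequency as a per-location likelihood. The sketch below follows that assumption; the function name `isovalue_likelihood` and the uniform binning over [lo, hi] are illustrative choices, not details confirmed by the slides.

```python
def isovalue_likelihood(hixels, isovalue, lo, hi):
    """Slice every hixel's histogram at the bin containing `isovalue`.

    Returns, per hixel, the fraction of its samples that fall in that
    bin. Assumes all hixels share the same uniform bins over [lo, hi].
    """
    if not hixels:
        return []
    num_bins = len(hixels[0])
    width = (hi - lo) / num_bins
    b = min(max(int((isovalue - lo) / width), 0), num_bins - 1)
    likelihood = []
    for counts in hixels:
        total = sum(counts)
        likelihood.append(counts[b] / total if total else 0.0)
    return likelihood


fuzzy = isovalue_likelihood([[2, 2], [1, 3], [0, 4]],
                            isovalue=0.25, lo=0.0, hi=1.0)
# fuzzy == [0.5, 0.25, 0.0]
```

The result is itself a scalar field, so it can be rendered with transparency or thresholded to show where the isosurface is likely to pass.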

Fuzzy Isosurfaces

Comparison
Stag Beetle with 16^3 hixels vs. mean and lower-left isosurfaces
Isovalue 580
Original size: 832x832x494

Questions?
Jet with 2^3, 8^3, and 32^3 hixels
Isovalue
Original size: 768x896x512