Science-Driven Visualization Research Challenges 9 Nov 2004 SC2004 – Pittsburgh PA Wes Bethel with help from Friends at Lawrence Berkeley National Laboratory.

Slides:



Advertisements
Similar presentations
Fighting Malaria With The Grid. Computing on The Grid The Internet allows users to share information across vast geographical distances. Using similar.
Advertisements

Life Science Services and Solutions
NG-CHC Northern Gulf Coastal Hazards Collaboratory Simulation Experiment Integration Sandra Harper 1, Manil Maskey 1, Sara Graves 1, Sabin Basyal 1, Jian.
A new Network Concept for transporting and storing digital video…………
U.S. Department of Energy’s Office of Science Basic Energy Sciences Advisory Committee Dr. Daniel A. Hitchcock October 21, 2003
Presentation at WebEx Meeting June 15,  Context  Challenge  Anticipated Outcomes  Framework  Timeline & Guidance  Comment and Questions.
GENI: Global Environment for Networking Innovations Larry Landweber Senior Advisor NSF:CISE Joint Techs Madison, WI July 17, 2006.
Copyright 2002: LIIF Technology Architecture Review Database Application Architecture Database Application Architecture Collaborative Workgroup Architecture.
LBNL Visualization Group Research Snapshot Wes Bethel Lawrence Berkeley National Laboratory Berkeley, CA 24 Feb 2004.
Automated Analysis and Code Generation for Domain-Specific Models George Edwards Center for Systems and Software Engineering University of Southern California.
Office of Science U.S. Department of Energy Grids and Portals at NERSC Presented by Steve Chan.
1 Dr. Frederica Darema Senior Science and Technology Advisor NSF Future Parallel Computing Systems – what to remember from the past RAMP Workshop FCRC.
Accelerate Business Success With CRM CRM Interoperability.
SING* and ToNC * Scientific Foundations for Internet’s Next Generation Sirin Tekinay Program Director Theoretical Foundations Communication Research National.
Fundamental System Concepts Asper School of Business University of Manitoba Systems Analysis & Design Instructor: Bob Travica Updated: September 2014.
Integration of Applications MIS3502: Application Integration and Evaluation Paul Weinberg Adapted from material by Arnold Kurtz, David.
Chapter 9: Moving to Design
Distributed Systems: Client/Server Computing
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Robots at Work Dr Gerard McKee Active Robotics Laboratory School of Systems Engineering The University of Reading, UK
Distribution Statement A. Approved for public release; distribution is unlimited. Test and Evaluation/Science and Technology Program Rapid Data Analyzer.
Internet GIS. A vast network connecting computers throughout the world Computers on the Internet are physically connected Computers on the Internet use.
1 Building National Cyberinfrastructure Alan Blatecky Office of Cyberinfrastructure EPSCoR Meeting May 21,
Computing in Atmospheric Sciences Workshop: 2003 Challenges of Cyberinfrastructure Alan Blatecky Executive Director San Diego Supercomputer Center.
A Research Agenda for Accelerating Adoption of Emerging Technologies in Complex Edge-to-Enterprise Systems Jay Ramanathan Rajiv Ramnath Co-Directors,
Advances in Technology and CRIS Nikos Houssos National Documentation Centre / National Hellenic Research Foundation, Greece euroCRIS Task Group Leader.
Role of Deputy Director for Code Architecture and Strategy for Integration of Advanced Computing R&D Andrew Siegel FSP Deputy Director for Code Architecture.
Adapting Legacy Computational Software for XMSF 1 © 2003 White & Pullen, GMU03F-SIW-112 Adapting Legacy Computational Software for XMSF Elizabeth L. White.
Cloud Computing. What is Cloud Computing? Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable.
1 CS 456 Software Engineering. 2 Contents 3 Chapter 1: Introduction.
DOE BER Climate Modeling PI Meeting, Potomac, Maryland, May 12-14, 2014 Funding for this study was provided by the US Department of Energy, BER Program.
material assembled from the web pages at
What are the main differences and commonalities between the IS and DA systems? How information is transferred between tasks: (i) IS it may be often achieved.
Chapter 4 Realtime Widely Distributed Instrumention System.
Model-Driven Analysis Frameworks for Embedded Systems George Edwards USC Center for Systems and Software Engineering
National Energy Research Scientific Computing Center (NERSC) Visualization Tools and Techniques on Seaborg and Escher Wes Bethel & Cristina Siegerist NERSC.
N ATIONAL E NERGY R ESEARCH S CIENTIFIC C OMPUTING C ENTER 1 NERSC Visualization Greenbook Workshop Report June 2002 Wes Bethel LBNL.
Opportunities in Parallel I/O for Scientific Data Management Rajeev Thakur and Rob Ross Mathematics and Computer Science Division Argonne National Laboratory.
Visualization Workshop David Bock Visualization Research Programmer National Center for Supercomputing Applications - NCSA University of Illinois at Urbana-Champaign.
CLUSTER COMPUTING TECHNOLOGY BY-1.SACHIN YADAV 2.MADHAV SHINDE SECTION-3.
1 Computing Challenges for the Square Kilometre Array Mathai Joseph & Harrick Vin Tata Research Development & Design Centre Pune, India CHEP Mumbai 16.
Futures Lab: Biology Greenhouse gasses. Carbon-neutral fuels. Cleaning Waste Sites. All of these problems have possible solutions originating in the biology.
9 Systems Analysis and Design in a Changing World, Fourth Edition.
Commodity Grid Kits Gregor von Laszewski (ANL), Keith Jackson (LBL) Many state-of-the-art scientific applications, such as climate modeling, astrophysics,
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
9 Systems Analysis and Design in a Changing World, Fourth Edition.
1 Yield Analysis and Increasing Engineering Efficiency Spotfire Users Conference 10/15/2003 William Pressnall, Scott Lacey.
ESFRI & e-Infrastructure Collaborations, EGEE’09 Krzysztof Wrona September 21 st, 2009 European XFEL.
A Software Framework for Distributed Services Michael M. McKerns and Michael A.G. Aivazis California Institute of Technology, Pasadena, CA Introduction.
May 6, 2002Earth System Grid - Williams The Earth System Grid Presented by Dean N. Williams PI’s: Ian Foster (ANL); Don Middleton (NCAR); and Dean Williams.
VAPoR: A Discovery Environment for Terascale Scientific Data Sets Alan Norton & John Clyne National Center for Atmospheric Research Scientific Computing.
1 COMPUTER SCIENCE DEPARTMENT COLORADO STATE UNIVERSITY 1/9/2008 SAXS Software.
6/23/2005 R. GARDNER OSG Baseline Services 1 OSG Baseline Services In my talk I’d like to discuss two questions:  What capabilities are we aiming for.
1 1 Office of Science Jean-Luc Vay Accelerator Technology & Applied Physics Division Lawrence Berkeley National Laboratory HEP Software Foundation Workshop,
1 OFFICE OF ADVANCED SCIENTIFIC COMPUTING RESEARCH The NERSC Center --From A DOE Program Manager’s Perspective-- A Presentation to the NERSC Users Group.
Development of e-Science Application Portal on GAP WeiLong Ueng Academia Sinica Grid Computing
Comprehensive Scientific Support Of Large Scale Parallel Computation David Skinner, NERSC.
Fire Emissions Network Sept. 4, 2002 A white paper for the development of a NSF Digital Government Program proposal Stefan Falke Washington University.
Super Computing 2000 DOE SCIENCE ON THE GRID Storage Resource Management For the Earth Science Grid Scientific Data Management Research Group NERSC, LBNL.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI EGI strategy and Grand Vision Ludek Matyska EGI Council Chair EGI InSPIRE.
Cyberinfrastructure Overview of Demos Townsville, AU 28 – 31 March 2006 CREON/GLEON.
Wednesday NI Vision Sessions
ILink Systems, Inc Feb, 2014 Government IT Solutions.
System Software Laboratory Databases and the Grid by Paul Watson University of Newcastle Grid Computing: Making the Global Infrastructure a Reality June.
VisIt Project Overview
Joseph JaJa, Mike Smorul, and Sangchul Song
Introduction to Cloud Computing
Model-Driven Analysis Frameworks for Embedded Systems
What's New in eCognition 9
What's New in eCognition 9
Presentation transcript:

Science-Driven Visualization Research Challenges 9 Nov 2004 SC2004 – Pittsburgh PA Wes Bethel with help from Friends at Lawrence Berkeley National Laboratory vis.lbl.gov

Outline Science-driven Visualization Challenges. Science-driven Visualization Challenges. LBNL Visualization Research LBNL Visualization Research Remote, Distributed and High Performance Visualization. Domain-specific solutions for scientific research. Computer Science research. Conclusion and Future Directions Conclusion and Future Directions

Outline Science-driven Visualization Challenges. Science-driven Visualization Challenges. LBNL Visualization Research LBNL Visualization Research Remote, Distributed and High Performance Visualization – Introduction and Approach. Domain-specific solutions for scientific research. Computer Science research. Conclusion and Future Directions Conclusion and Future Directions

Science-Drive Visualization Challenges – Outline Role of visualization in science, and what users really want? Role of visualization in science, and what users really want? Challenges of user needs. Challenges of user needs. What efforts targeted at meeting those needs? What efforts targeted at meeting those needs? Is the current approach meeting user needs? Is the current approach meeting user needs?

Role of Visualization in Science An instrument to “see data” that is otherwise unseeable. A vehicle to communicate findings and results. Plays an integral part of the scientific process and scientific workflows. Something doesn’t “look right” in this picture – what happened?

Introduction – The Scientific Process and Workflows Hypothesize – experiment/test – refine. Workflows are the sequence of tasks in the scientific process. Visualization serves as the “instrument” to aid in “seeing results” at each stage in the workflow.

What Do (Science) Users Need? Easy to use software. Easy to use software. That is free (and works). That is free (and works). That is supported. That is supported. Help learning/using/applying the software to their problem. Help learning/using/applying the software to their problem. New visualization capabilities for their problem. New visualization capabilities for their problem. Support for remote and distributed operations, capacity to analyze large and complex data. Support for remote and distributed operations, capacity to analyze large and complex data.

Challenges of User Needs For many modern computational science projects, there exists no “canned” visualization solution. Tools and technology must be created. For many modern computational science projects, there exists no “canned” visualization solution. Tools and technology must be created. Such efforts require expertise in a wide range of specialties: computer science, software engineering, cognitive science, people skills, etc. Such efforts require expertise in a wide range of specialties: computer science, software engineering, cognitive science, people skills, etc. Creating such tools requires close and ongoing effort between researchers of many disciplines. Creating such tools requires close and ongoing effort between researchers of many disciplines. Few, if any, “standards” to help provide a stable environment for visualization. Few, if any, “standards” to help provide a stable environment for visualization.

Science-Drive Visualization Research Problem Statement Trend is towards remote and distributed data analysis and visualization. Trend is towards remote and distributed data analysis and visualization. Domain-specific solutions required. Domain-specific solutions required. Such solutions are inherently multidisciplinary and extremely complex. Such solutions are inherently multidisciplinary and extremely complex.

Efforts Targeted at Meeting Science Needs Individual P.I. Funded to perform some visualization research. Individual P.I. Funded to perform some visualization research. A fraction of a P.I. and a graduate student. Publish a research paper, might release a research prototype of their software (or might not). Their reward is the technical publication. Institutional visualization support. Institutional visualization support. NERSC, ASCI/Views, etc. Missing: large, program-wide coordination of activities. Missing: large, program-wide coordination of activities.

Does the Current Approach Work?

Sloan Digital Sky Survey Portal Interface and operations tailored to astronomy community. Interface and operations tailored to astronomy community.

Does the Current Approach Work? Generally, no: Generally, no: Duplication of effort across disparate programs. Little impetus to share work, to leverage others work. What’s Missing? What’s Missing? Critical visualization infrastructure: community-centric data models, fungible visualization technology that can be shared and reused across program areas. Program-wide emphasis upon coordinated visualization activities. Requires conscious engineering – coordinated activities will not “emerge” from many small visualization projects.

Outline Science-driven Visualization Challenges. Science-driven Visualization Challenges. LBNL Visualization Research LBNL Visualization Research Remote, Distributed and High Performance Visualization – Introduction and Approach. Domain-specific solutions for scientific research. Computer Science research. Conclusion and Future Directions Conclusion and Future Directions

LBNL Visualization Research – Outline The LBNL Visualization Research Vision. The LBNL Visualization Research Vision. The Research Strategy and Tactics. The Research Strategy and Tactics. Near-term and long-term goals. Near-term and long-term goals. Results: Results: Domain-specific solutions. Remote and Distributed visualization research results. Computer Science Research.

LBNL Visualization Research Vision Shaky CityData CachesHPC Resources Sensor Nets Simulations STM Handheld Devices Dr Jane Collaborators

Problem Statement – Repeated Trend is towards remote and distributed data analysis and visualization. Trend is towards remote and distributed data analysis and visualization. Domain-specific solutions required. Domain-specific solutions required. Such solutions are inherently multidisciplinary and extremely complex. Such solutions are inherently multidisciplinary and extremely complex.

Research Challenges for Remote and Distributed Visualization Community-centric data models, component interfaces, execution frameworks. Community-centric data models, component interfaces, execution frameworks. Visualization algorithms, delivery mechanisms. Visualization algorithms, delivery mechanisms. Effective and simplified use of parallel and distributed resources. Effective and simplified use of parallel and distributed resources.

LBNL Visualization Research Strategy Map the canonical visualization pipeline into remote & distributed use model. Map the canonical visualization pipeline into remote & distributed use model. Shaky CityData CachesHPC Resources Sensor Nets Simulations STM Handheld Devices Dr Jane Collaborators

LBNL Visualization Research Tactics Close relationships with DOE science projects to deliver domain-specific (useful) technologies. Close relationships with DOE science projects to deliver domain-specific (useful) technologies. Research advances on the visualization pipeline to realize the dream of “vis anywhere, anytime, by anybody.” Research advances on the visualization pipeline to realize the dream of “vis anywhere, anytime, by anybody.” Fundamental CS research to complement visualization research. Fundamental CS research to complement visualization research.

LBNL Visualization Research Tactics Components encapsulate algorithms, frameworks marshal data and mediate execution (see HECRTF). Components encapsulate algorithms, frameworks marshal data and mediate execution (see HECRTF). Bottom-up: focus on specific application- driven projects. E.g., Accelerator SciDAC. Bottom-up: focus on specific application- driven projects. E.g., Accelerator SciDAC.

LBNL Visualization Research Tactics Distributed and parallel architectures offer new algorithmic opportunities (Visapult). Distributed and parallel architectures offer new algorithmic opportunities (Visapult). Interaction methodology important for large data exploration, cuts across data management, visualization, applications. Interaction methodology important for large data exploration, cuts across data management, visualization, applications. Delivery mechanisms are “the handles” provided to the user to guide data exploration and analysis. Delivery mechanisms are “the handles” provided to the user to guide data exploration and analysis.

Outline Science-driven Visualization Challenges. Science-driven Visualization Challenges. LBNL Visualization Research LBNL Visualization Research Remote, Distributed and High Performance Visualization – Introduction and Approach. Domain-specific solutions for scientific research. Computer Science research. Conclusion and Future Directions Conclusion and Future Directions

Domain-Specific Solutions 21 st Century Accelerator Modeling (SciDAC) 21 st Century Accelerator Modeling (SciDAC) Center for Extended MHD (SciDAC) Center for Extended MHD (SciDAC) Protein Structure Prediction Protein Structure Prediction

Accelerator Simulation Visualization Data: time-varying, 6D, multi-species. Data: time-varying, 6D, multi-species. Typical visualization: scatter plots of one dimension against another. E.g., x- position vs. x-phase. Typical visualization: scatter plots of one dimension against another. E.g., x- position vs. x-phase. Need: ability to explore, to subset, to visually comprehend science. Need: ability to explore, to subset, to visually comprehend science.

Accelerator Simulation Visualization, ctd. Interactive data subsetting and selection. Interactive data subsetting and selection. Paint metaphor Using domain knowledge. Novel visualization technique well- suited for 6D data (next slide). Novel visualization technique well- suited for 6D data (next slide).

Accelerator Simulation Visualization, ctd.

Proton beam (particles) passing through a cloud of electrons (volume rendering).

Accelerator Simulation Visualization, ctd. Electron trajectories

Accelerator Modeling: Remote and High Performance Visual Analysis User-requested domain-specific tool for browsing data. User-requested domain-specific tool for browsing data. Distributed, pipelined architecture to scale with increasing data sizes. Distributed, pipelined architecture to scale with increasing data sizes. workstation Remote data storage

Accelerator Modeling: Remote and High Performance Visual Analysis Our group engineered a HDF5 file format for the computational scientists. Our group engineered a HDF5 file format for the computational scientists. They were using ASCII files. Our group also engineered parallel I/O capabilities using HDF5. Our group also engineered parallel I/O capabilities using HDF5. A common data model/format is the basis for a family of high performance analysis software technology. A common data model/format is the basis for a family of high performance analysis software technology.

Accelerator Modeling Visualization: Conclusion Close interaction with scientists resulted in domain-specific technologies as well as new visualization research. Close interaction with scientists resulted in domain-specific technologies as well as new visualization research. The “unglamorous work” of data models/formats and I/O is the underpinning for the much of the project. The “unglamorous work” of data models/formats and I/O is the underpinning for the much of the project. We are in a good position to move forward with additional tools based upon a community-centric data model. We are in a good position to move forward with additional tools based upon a community-centric data model.

Remote Visualization of Fusion Simulation Results Problems: Simulations run at centralized supercomputing facilities generate large, complex data. Analysis to be performed by remotely located scientists. Science teams are themselves geographically distributed, and have requested some form of collaborative investigation/visualization.

Remote Visualization of Fusion Simulation Results Approach: Use high performance, parallel resources located “close to” the data. Where plausible, retain the high performance rendering capabilities of desktop workstations. Partition the visualization pipeline (more later) across sites in multiple ways. Which works best?

Fusion Visualization: Pipelined, Distributed and Parallel Architecture Mass Storage Parallel Visualization (Compute, I/O) POW (Plain old workstation) Princeton Berkeley Data Vis Render 10GB/timestep 10sMB

Fusion Visualization: Pipelined, Distributed and Parallel Architecture High capacity I/O and compute located “near” large data source. High capacity I/O and compute located “near” large data source.

Collaborative Visualization Rapid inspection of data too large to move: Rapid inspection of data too large to move: Saves having to transfer 100s of GB across country. Saves having to transfer 100s of GB across country. Multiple simultaneous participants (roundtable model). Multiple simultaneous participants (roundtable model). Escher Ensight Server-of-Servers Ensight Server-of-Servers PPPL Ensight Client PPPL Ensight Client LBL Ensight Client Data Vis Render 10GB/timestep 10sMB

Remote Fusion Simulation Visualization – Sending Images ~50fps 800x600, 24-bpp ~50fps 800x600, 24-bpp Over 100BaseT, low latency connection (LAN) Freely running image generator – only framebuffer contents sent; no mouse events, etc. Frame rate relatively insensitive to compression algorithm, as long as some compression is used. 4-5fps “full screen interactive application” 4-5fps “full screen interactive application” 100BaseT Ethernet, 50ms latency (WAN between LBNL – PPPL) Interactive application. Data Vis Render 10GB/timestep 10sMB

4-5fps “not unexpected” 50 ms one-way latency is 100ms RTT 50 ms one-way latency is 100ms RTT Maximum possible frame rate: 10fps Maximum possible frame rate: 10fps Add in more latency due to fb reads, detect and package mouse events, etc. Add in more latency due to fb reads, detect and package mouse events, etc. Conclusion: latency is a killer. Conclusion: latency is a killer.

Frame Rate Limit Due to Latency: 1000/2*latencyMS. Time 50ms AB C/A B A – user drags the mouse, mouse event sent to server. B – “instantaneous” frame render, grab, compress, send and receipt by client. C – client decompresses, displays image, grabs next mouse event, etc.

Fusion Visualization: Conclusions Using high capacity visualization resources located “close to” the source data for remote use appears promising. Using high capacity visualization resources located “close to” the source data for remote use appears promising. Different approaches, each with advantages and disadvantages. Different approaches, each with advantages and disadvantages. Functional results: good. Functional results: good. Performance results: mixed. Performance results: mixed.

Protein Structure Prediction Outline Problem Description. Problem Description. Approaches to help solve an NP-hard problem: Approaches to help solve an NP-hard problem: Better initial configurations. Visualization and intervention to guide optimizations.

Protein Structure Prediction Challenges Protein structure prediction is difficult (NP-hard) – it is one of the grand challenges in computational biology. Visualization and interactive techniques can accelerate the process. No “off-the-shelf” technologies exist – they must be created.

Protein Structure Prediction, ctd. Given: an amino acid sequence, Find: an optimal protein conformation.

Protein Structure Prediction, ctd. Problem: what is the minimal-energy structure of a sequence of amino acids? Solution: Nature knows, but computing an answer is NP-hard (not solvable). Approach: Human-guided setup, computer-aided energy optimization and minimization. Conf: Pred: HHHHHHHCCCEEEEEEECCCEEEEEEEECCCCCCC AA: FKQYANDNGVDGVWTYDDATKTFTVTEMVTEVPVA

Protein Structure Prediction, ctd.

Optimization and computational steering Initial configurations used as “seed points” for optimization. Intermediate results – the “search tree” – is displayed for inspection. A human may intervene in the optimization.

Protein Structure Prediction – Energy Visualization Energy gradient Energy gradient (Movie) (Movie) (Movie)

Protein Structure Prediction – Energy Visualization Movie

Protein Structure Prediction – Conclusion Increased scientific capacity and capability. Increased scientific capacity and capability. CASP – days; CASP – hours. New scientific opportunities: New scientific opportunities: Multiple molecule interactions – drug design. Visualization impact: Visualization impact: Best Application Paper award, IEEE Visualization 2003.

Outline Science-driven Visualization Challenges. Science-driven Visualization Challenges. LBNL Visualization Research LBNL Visualization Research Remote, Distributed and High Performance Visualization – Introduction and Approach. Domain-specific solutions for scientific research. Computer Science research. Conclusion and Future Directions Conclusion and Future Directions

Computer Science/Visualization Research - Outline Research Challenges. Research Challenges. Query-based visualization. Query-based visualization. Desktop delivery R&D. Desktop delivery R&D. Remote and distributed visualization pipeline optimization. Remote and distributed visualization pipeline optimization.

Fundamental Remote and Distributed Visualization Research Challenges Fungible technologies for creating visualization applications. Fungible technologies for creating visualization applications. Components, data/application adapters, vis-centric network transport, resource discovery/allocation, dynamic application construction, decoupling UI from vis/analysis “engine,” decoupling execution control from component architecture. Community-centric data models. Community-centric data models. Multi-resolution and progressive analysis/vis. Multi-resolution and progressive analysis/vis.

Fundamental Remote and Distributed Visualization Research Challenges, ctd. More interactions with other communities: science applications, data management and data analysis. More interactions with other communities: science applications, data management and data analysis. Long-term deployment and maintenance strategy. Long-term deployment and maintenance strategy. Community and programmatic focus on technology interoperability. Community and programmatic focus on technology interoperability.

Query-Driven Visualization (Dex) Combine Visualization with SDM technology to accelerate visualization and analysis. Combine Visualization with SDM technology to accelerate visualization and analysis. Select data based upon boolean queries. Select data based upon boolean queries. Only visualize/analyze data that meets query criteria. Only visualize/analyze data that meets query criteria.

Remote Desktop Delivery – Thin Client QuickTime VR QuickTime VR Panorama Movies Object Movies Two axis, time-varying. Two axistime-varying. QTVR: QTVR: Industry standard Freely available players (except Linux!). LBNL Contribution LBNL Contribution Object-movie encoder. Current research – multi-resolution-capable.

Visualization Pipeline Optimization Context: many heterogeneous, distributed resources. Context: many heterogeneous, distributed resources. Goal: user wants to take advantage of distributed resources to solve a problem. Goal: user wants to take advantage of distributed resources to solve a problem. Problem(s): need to select a set of resources to meet the task at hand. Problem(s): need to select a set of resources to meet the task at hand.

Visualization Pipeline Optimization Problem: component placement on distributed resources changes as a function of both performance target and specific data. Problem: component placement on distributed resources changes as a function of both performance target and specific data. Problem: distributed applications launched “by hand,” resource placement performed “by hand.” Problem: distributed applications launched “by hand,” resource placement performed “by hand.”

Performance Modeling and Pipeline Optimization Approach: model performance of individual components, optimize placement as a function of performance target. Approach: model performance of individual components, optimize placement as a function of performance target. Goal: automate the process of placing components on distribute resources. Goal: automate the process of placing components on distribute resources. Results: quadratic order algorithm, high degree of accuracy. Results: quadratic order algorithm, high degree of accuracy.

Performance Modeling and Pipeline Optimization Render Remote Render Remote Move images: setenv DISPLAY SGI’s Vizserver Data too big to move. Render Local Render Local Move data ftp, scp Logistical networking Hybrid approaches Hybrid approaches Move “vis results” for local rendering CEI’s Ensight, Visapult Data Visualization Display Render Data Visualization Display Render Data Visualization Display Render Your Workstation NERSC

Pipeline Optimization – User View Goal: simplify use of distributed visualization resources. Goal: simplify use of distributed visualization resources. Shaky CityData CachesHPC Resources Sensor Nets Simulations STM Handheld Devices Dr Jane Collaborators

Visualization Pipeline Optimization – Overview Obtain/derive performance measurements for pipeline components. Obtain/derive performance measurements for pipeline components. Automatically select placement of tasks on distributed resources to meet performance objectives. Automatically select placement of tasks on distributed resources to meet performance objectives.

Performance Modeling and Pipeline Optimization Single workflow: Single workflow: Reader -> Isosurface -> Render Reader performance: Reader performance: Function of: Data Size Machine constant T reader (n v ) = n v * C reader

Performance Modeling and Pipeline Optimization Render Performance: Render Performance: Function of: Number of triangles, Machine constant. T render = n t * C render + T readback

Performance Modeling and Pipeline Optimization Isosurface Performance: Isosurface Performance: Function of: Data set size, Number of triangles generated (determined by combination of dataset and isocontour level). Dominated number of triangles generated! T iso (n t,n v ) = n v * C base + n t * C iso

Performance Modeling and Pipeline Optimization Precompute histogram of data values. Precompute histogram of data values. Use histogram to estimate number of triangles as a function of iso level. Use histogram to estimate number of triangles as a function of iso level.

Performance Modeling and Pipeline Optimization Performance targets: Performance targets: Optimize for interactive transformation. Optimize for changing isocontour level. Optimize for data throughput.

Performance Modeling and Pipeline Optimization Pipeline Configurations: Pipeline Configurations: Render local – send data to workstation. Render remote – send images to workstation. Hybrid – send triangles to workstation.

Performance Modeling and Pipeline Optimization Optimize placement using Djikstra’s shortest path algorithm. Optimize placement using Djikstra’s shortest path algorithm. Edge weights assigned based upon performance target. Edge weights assigned based upon performance target. Low-cost algorithm: Low-cost algorithm: O(E + VlogV)

Performance Modeling and Pipeline Optimization - Conclusions “Microbenchmarks” to estimate individual component performance. “Microbenchmarks” to estimate individual component performance. Per-dataset statistics can be precomputed and saved with the dataset. Quadratic-order workflow-to-resource placement algorithm. Quadratic-order workflow-to-resource placement algorithm. Optimizes pipeline performance for an specific interaction target – relieves users from task of manual resource selection. Optimizes pipeline performance for an specific interaction target – relieves users from task of manual resource selection.

Outline Science-driven Visualization Challenges. Science-driven Visualization Challenges. LBNL Visualization Research LBNL Visualization Research Remote, Distributed and High Performance Visualization – Introduction and Approach. Domain-specific solutions for scientific research. Computer Science research. Conclusion and Future Directions Conclusion and Future Directions

Conclusions Close collaboration with applications produces usable, focused visualization technologies. Close collaboration with applications produces usable, focused visualization technologies. Such collaborations are long-term relationships. Such collaborations are long-term relationships. How to formalize and sustain such relationships?

Conclusions Component-based development holds much promise (see HECRTF). Component-based development holds much promise (see HECRTF). Underpinnings: Underpinnings: Community-centric data models. Interactive, parallel, distributed execution framework.

Conclusions Opportunity to move towards technology sharing and reuse, especially for visualization community. Opportunity to move towards technology sharing and reuse, especially for visualization community. Produce usable, long-lived visualization technology for applications. Produce usable, long-lived visualization technology for applications. Need for cross-program bridges – one form is stable infrastructure underpinnings based upon common component interfaces and community centric data models. Need for cross-program bridges – one form is stable infrastructure underpinnings based upon common component interfaces and community centric data models.

Summary LBNL has a world-class Visualization R&D program that has a balanced and effective having an emphasis upon remote, distributed and high performance visualization, and meeting the needs of science. Visit us on the web at This work was supported by the Director, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC03-76SF00098.

The End