Advanced Scientific Visualization Paul Navrátil 28 May 2009.

Slides:



Advertisements
Similar presentations
LIBRA: Lightweight Data Skew Mitigation in MapReduce
Advertisements

1 A GPU Accelerated Storage System NetSysLab The University of British Columbia Abdullah Gharaibeh with: Samer Al-Kiswany Sathish Gopalakrishnan Matei.
GPU Virtualization Support in Cloud System Ching-Chi Lin Institute of Information Science, Academia Sinica Department of Computer Science and Information.
Advanced Scientific Visualization Laboratory Paul Navrátil 28 May 2009.
GPGPU Introduction Alan Gray EPCC The University of Edinburgh.
Parallel Visualization At TACC Greg Abram. Visualization Problems Small problems: Data are small and easily moved Office machines and laptops are adequate.
HPCC Mid-Morning Break High Performance Computing on a GPU cluster Dirk Colbry, Ph.D. Research Specialist Institute for Cyber Enabled Discovery.
Xingfu Wu Xingfu Wu and Valerie Taylor Department of Computer Science Texas A&M University iGrid 2005, Calit2, UCSD, Sep. 29,
GPUs on Clouds Andrew J. Younge Indiana University (USC / Information Sciences Institute) UNCLASSIFIED: 08/03/2012.
UNCLASSIFIED: LA-UR Data Infrastructure for Massive Scientific Visualization and Analysis James Ahrens & Christopher Mitchell Los Alamos National.
Office of Science U.S. Department of Energy Grids and Portals at NERSC Presented by Steve Chan.
Active Messages: a Mechanism for Integrated Communication and Computation von Eicken et. al. Brian Kazian CS258 Spring 2008.
Parallel Rendering Ed Angel
HIPerSpace The Highly Interactive Parallelized Display Space.
Cambodia-India Entrepreneurship Development Centre - : :.... :-:-
CMSC 611: Advanced Computer Architecture Parallel Computation Most slides adapted from David Patterson. Some from Mohomed Younis.
HPCC Mid-Morning Break Dirk Colbry, Ph.D. Research Specialist Institute for Cyber Enabled Discovery Introduction to the new GPU (GFX) cluster.
What is Concurrent Programming? Maram Bani Younes.
Project Overview:. Longhorn Project Overview Project Program: –NSF XD Vis Purpose: –Provide remote interactive visualization and data analysis services.
Motivation “Every three minutes a woman is diagnosed with Breast cancer” (American Cancer Society, “Detailed Guide: Breast Cancer,” 2006) Explore the use.
Parallelization with the Matlab® Distributed Computing Server CBI cluster December 3, Matlab Parallelization with the Matlab Distributed.
1 Integrating GPUs into Condor Timothy Blattner Marquette University Milwaukee, WI April 22, 2009.
MACHINE VISION GROUP Graphics hardware accelerated panorama builder for mobile phones Miguel Bordallo López*, Jari Hannuksela*, Olli Silvén* and Markku.
Paraview Workshop: Distributed Visualization Samar Aseeri, KSL Madhu Srinivasan, KVL.
Chep06 1 High End Visualization with Scalable Display System By Dinesh M. Sarode, S.K.Bose, P.S.Dhekne, Venkata P.P.K Computer Division, BARC, Mumbai.
Visualization as a Science Discovery Tool Issues and Concerns Kelly Gaither Director of Visualization/ Sr. Research Scientist Texas Advanced Computing.
Chapter 6 Operating System Support. This chapter describes how middleware is supported by the operating system facilities at the nodes of a distributed.
M i SMob i S Mob i Store - Mobile i nternet File Storage Platform Chetna Kaur.
DISTRIBUTED COMPUTING
CS525: Special Topics in DBs Large-Scale Data Management Hadoop/MapReduce Computing Paradigm Spring 2013 WPI, Mohamed Eltabakh 1.
MIDeA :A Multi-Parallel Instrusion Detection Architecture Author: Giorgos Vasiliadis, Michalis Polychronakis,Sotiris Ioannidis Publisher: CCS’11, October.
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Collaborating with iPlant.
Introduction to Apache OODT Yang Li Mar 9, What is OODT Object Oriented Data Technology Science data management Archiving Systems that span scientific.
Taking the Complexity out of Cluster Computing Vendor Update HPC User Forum Arend Dittmer Director Product Management HPC April,
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Collaborating with iPlant.
INVITATION TO COMPUTER SCIENCE, JAVA VERSION, THIRD EDITION Chapter 6: An Introduction to System Software and Virtual Machines.
The Future of the iPlant Cyberinfrastructure: Coming Attractions.
Kelly Gaither Visualization Area Report. Efforts in 2008 Focused on providing production visualization capabilities (software and hardware) Focused on.
Parallel Rendering. 2 Introduction In many situations, a standard rendering pipeline might not be sufficient ­Need higher resolution display ­More primitives.
Parallel Programming on the SGI Origin2000 With thanks to Igor Zacharov / Benoit Marchand, SGI Taub Computer Center Technion Moshe Goldberg,
A Framework for Visualizing Science at the Petascale and Beyond Kelly Gaither Research Scientist Associate Director, Data and Information Analysis Texas.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
Modeling Big Data Execution speed limited by: –Model complexity –Software Efficiency –Spatial and temporal extent and resolution –Data size & access speed.
By Jeff Dean & Sanjay Ghemawat Google Inc. OSDI 2004 Presented by : Mohit Deopujari.
CS525: Big Data Analytics MapReduce Computing Paradigm & Apache Hadoop Open Source Fall 2013 Elke A. Rundensteiner 1.
PDAC-10 Middleware Solutions for Data- Intensive (Scientific) Computing on Clouds Gagan Agrawal Ohio State University (Joint Work with Tekin Bicer, David.
Comprehensive Scientific Support Of Large Scale Parallel Computation David Skinner, NERSC.
National Center for Supercomputing Applications University of Illinois at Urbana–Champaign Visualization Support for XSEDE and Blue Waters DOE Graphics.
Hadoop/MapReduce Computing Paradigm 1 CS525: Special Topics in DBs Large-Scale Data Management Presented By Kelly Technologies
Remote & Collaborative Visualization. TACC Remote Visualization Systems Longhorn – Dell XD Visualization Cluster –256 nodes, each with 48 GB (or 144 GB)
GPU Computing for GIS James Mower Department of Geography and Planning University at Albany.
Introduction to Data Analysis with R on HPC Texas Advanced Computing Center Feb
INTRODUCTION TO XSEDE. INTRODUCTION  Extreme Science and Engineering Discovery Environment (XSEDE)  “most advanced, powerful, and robust collection.
Compute and Storage For the Farm at Jlab
VisIt Project Overview
A Brief Introduction to NERSC Resources and Allocations
Modeling Big Data Execution speed limited by: Model complexity
What is HPC? High Performance Computing (HPC)
Tools and Services Workshop
Joslynn Lee – Data Science Educator
CMSC 611: Advanced Computer Architecture
Working With Azure Batch AI
Spark Presentation.
VirtualGL.
Spatial Analysis With Big Data
TYPES OFF OPERATING SYSTEM
Chapter 16: Distributed System Structures
USF Health Informatics Institute (HII)
Introduction to Operating Systems
Introduction to Operating Systems
Presentation transcript:

Advanced Scientific Visualization Paul Navrátil 28 May 2009

Scientific Visualization “The purpose of computing is insight not numbers.” -- R. W. Hamming (1961) “The purpose of computing is insight not numbers.” -- R. W. Hamming (1961)

Visualization Allows Us to “See” the Science Application Render Geometric Primitives Pixels Raw Data

But what about large, distributed data?

Or distributed rendering?

Or distributed displays?

Or all three?

Schedule Advanced Scientific Visualization Topics –Large Data Visualization Challenges –Potential Solutions –Demonstrations Lab –Remote Visualization on Spur and Ranger –Scripting Visualization using Python in VisIt or Paraview

Visualization must scale with HPC Large data produced by large simulations require large visualization machines and produce large visualization results Terabytes of Data AT LEAST Terabytes of Vis Gigapixel Images Resampling, Application, … Resolution to Capture Feature Detail

Visualization Scaling Challenges Moving data to the visualization machine Most applications built for shared memory machines, not distributed clusters Image resolution limits in some software cannot capture feature details Displays cannot show entire high-resolution images at their native resolution

Moving Data How much time do you have? File Size10 Gbps54 Mbps 1 GB1 sec2.5 min 1 TB~17 min~43 hours 1 PB~12 days~5 years

Analyzing Data Visualization programs only beginning to efficiently handle ultrascale data –650 GB dataset -> 3 TB memory footprint –Allocate HPC nodes for RAM not cores –N-1 idle processors per node! Stability across many distributed nodes –Rendering clusters typically number N <= 64 –Data must be dividable onto N cores Remember this when resampling!

Imaging Data 4096 x 2160 PNG ~ 10 MB x 360 degrees ~ 3.6 GB x 30 days ~ 108 GB x 12 months ~ fps fps 36 min Image: NASA Blue Marble Project

Displaying Data Dell 30” flat-panel LCD 4 Megapixel display 2560 x 1600 resolution

Displaying Data Stallion – currently world’s highest-resolution tiled display 307 Megapixels x 8000 pixel resolution Dell 30” LCD

Displaying Data Dell 30” LCD – 4 Mpixel (2560 x 1600) Stallion – 307 Mpixel (38400 x 8000) NASA Blue Marble 0.5 km 2 per pixel 3732 Mpixel (86400 x 43200)

What’s the solution?

Solution by Partial Sums Moving data – integrate vis machine into simulation machine. Move the machine to data! –Ranger + Spur: shared file system and interconnect Analyzing data – create larger vis machines and develop more efficient vis apps –Smaller memory footprint –More stable across many distributed nodes Until then, the simulation machine is the vis machine!

Solution by Partial Sums Imaging data – focus vis effort on interesting features parallelize image creation –Feature detection to determine visualization targets but can miss “unknown unknowns” –Distribute image rendering across cluster Displaying data – high resolution displays multi-resolution image navigation –Large displays need large spaces –Physical navigation of display provides better insights

Old Model (No Remote Capability) Local Visualization Resource Local Visualization Resource HPC System HPC System Data Archive Data Archive Pixels Mouse Display Remote Site Wide-Area Network Wide-Area Network Local Site

New Model Remote Capability Large-Scale Visualization Resource Large-Scale Visualization Resource HPC System HPC System Data Archive Data Archive Display Remote Site Wide-Area Network Wide-Area Network Local Site Pixels Mouse

Spur / Ranger topology spur login3.ranger Login Nodes login4.ranger Compute Nodes Vis nodes ivis[1-7|big] HPC nodes ixxx-xxx vis queue normal development queues File System $HOME $WORK $SCRATCH

Parallel Visualization –Task parallelism – passing results to 1 process for rendering Read file 1Isosurface 1Cut Plane 1 2Read file 2Streamlines 2Render 3Read file 3Triangulate 3Decimate 3Glyph 3 Timesteps Processes

Parallel Visualization Pipeline parallelism Useful when processes have access to separate resources or when an operation requires many steps Read file 1Read file 2Read File 3 2Isosurface 1Isosurface 2Isosurface 3 3Render 1Render 2Render 3 Timesteps Processes

Parallel Visualization Data parallelism Data set is partitioned among the processes and all processes execute same operations on the data. Scales well as long as the data and operations can be decomposed Read partition 1 Isosurface partition 1 Render partition 1 2Read partition 2 Isosurface partition 2 Render partition 2 3Read partition 3 Isosurface partition 2 Render partition 3 Timesteps Processes

Parallel Visualization Libraries Chromium – –Sits between application and native OpenGL –Intercepts OpenGL calls, distribute across cluster –Can do either sort-first or sort-last (sort-first is simpler, sort-last can be better for large data) –Last update 31 Aug 2006, no new GL goodies IceT – SAGE – CGLX – –specifically for large tiled displays –Must use IceT / SAGE / CGLX API in code Mesa – –Software rendering library –Enables OpenGL rendering on machines without GPUs

Open-Source Parallel Vis Apps VisIt – –Good scaling to hundreds of cores –Integrated job launching mechanism for rendering engines –Good documentation and user community –GUI not as polished ParaView – –Polished GUI, easier to navigate –Less stable across hundreds of cores –Official documentation must be purchased, though rich knowledge base on web (via Google)

CUDA – coding for GPUs C / C++ interface plus GPU-based extensions Can use both for accelerating visualization operations and for general-purpose computing (GPGPU) Special GPU libraries for math, FFT, BLAS Image: Tom Halfhill, Microprocessor Report

GPU layout Image: Tom Halfhill, Microprocessor Report

GPU Considerations Parallelism – kernel should be highly SIMD –Switching kernels is expensive! Job size – high workload per thread –amortize thread initialization and memory transfer costs Memory footprint – task must decompose well –local store per GPU core is low (16 KB on G80) –card-local RAM is limited (~1GB on G8x) –access to system RAM is slow (treat like disk access) GPU N-body study (in GPU Gems 3):

Summary Challenges at every stage of visualization when operating on large data Partial solutions exist, though not integrated Problem sizes continue to grow at every stage Vis software community must keep pace with hardware innovations

Obrigado!