Workshop on Parallel Visualization and Graphics
Chromium
Mike Houston, Stanford University
and The Chromium Community

How Chromium works
  Replaces system’s OpenGL driver
    Industry standard API
    Support existing unmodified applications
  Manipulates streams of API commands
    Alter/inject/discard commands and parameters
    Route commands over a network
    Render commands using graphics hardware
    State tracking
  Allows parallel applications to issue OpenGL
    Constrain ordering between multiple streams

Graphics Stream Processing
  Treat OpenGL calls as a stream of commands
  Form a DAG of stream transformation nodes
    Nodes are computers in a cluster
    Edges are OpenGL API communication
  Each node has a serialization stage and a transformation stage

Stream Serialization
  Convert multiple streams into a single stream
  Efficiently context-switch between streams
  Constrain ordering using Parallel OpenGL extensions [Igehy98]
  Two kinds of serializers:
    Network server
    Application
      Unmodified serial application
      Custom parallel application
  [Figure labels: S, A, OpenGL]
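
The serialization idea above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not Chromium's serializer: commands are plain integers, `CMD_BARRIER` is a made-up marker standing in for a Parallel OpenGL barrier, and every stream is assumed to issue the same number of barriers. The serializer drains one stream until it blocks, context-switches to the next, and releases the barrier once every unfinished stream has reached it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding: any other value is an ordinary command,
   CMD_BARRIER is an ordering point that all streams must reach. */
enum { CMD_BARRIER = -1 };

typedef struct {
    const int *cmds;   /* commands issued by one client */
    size_t     count;  /* number of commands in cmds */
    size_t     next;   /* index of the next unread command */
} Stream;

/* Merge n streams into out[] and return the number of commands written.
   Barriers are consumed by the serializer and never appear in out[]. */
size_t serialize(Stream *streams, size_t n, int *out, size_t out_cap)
{
    size_t written = 0;
    for (;;) {
        size_t blocked = 0;
        for (size_t i = 0; i < n; i++) {
            Stream *s = &streams[i];
            /* Run this stream until it ends or blocks at a barrier,
               then context-switch to the next stream. */
            while (s->next < s->count && s->cmds[s->next] != CMD_BARRIER) {
                assert(written < out_cap);
                out[written++] = s->cmds[s->next++];
            }
            if (s->next < s->count)
                blocked++;
        }
        if (blocked == 0)
            return written;            /* every stream is exhausted */
        /* All unfinished streams now wait at a barrier: release them. */
        for (size_t i = 0; i < n; i++)
            if (streams[i].next < streams[i].count)
                streams[i].next++;
    }
}
```

With two streams `{1, 2, BARRIER, 3}` and `{10, BARRIER, 11}`, everything issued before the barrier by either stream is emitted before anything issued after it.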

Stream Transformation
  Serialized stream is dispatched to “Stream Processing Units” (SPUs)
  Each SPU is a shared library
    Exports a (partial) OpenGL interface
  Each node loads a chain of SPUs at run time
  SPUs are generic and interchangeable

SPU Chains
  SPUs are loaded as parts of linear chains
  Common usage: intercept a few OpenGL calls, pass all others to downstream SPU
  Useful for simple state changes, such as “wireframe” drawing
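
A minimal sketch of the "intercept a few calls, pass the rest downstream" pattern, using a made-up two-entry dispatch table. The `SPU` struct, the `MODE_*` constants, and the `make_*` helpers are all hypothetical names for illustration; Chromium's real dispatch table covers the whole OpenGL API and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, tiny stand-in for an SPU: two entry points only. */
typedef struct SPU SPU;
struct SPU {
    void (*PolygonMode)(SPU *self, int face, int mode);
    void (*Vertex3f)(SPU *self, float x, float y, float z);
    SPU  *child;         /* downstream SPU, NULL at the end of the chain */
    int   last_mode;     /* recorded by the sink SPU */
    int   vertex_count;  /* recorded by the sink SPU */
};

enum { MODE_FILL = 0, MODE_LINE = 1 };  /* placeholders, not GLenum values */

/* Pass-through: forward the call unchanged to the downstream SPU. */
static void pass_Vertex3f(SPU *self, float x, float y, float z)
{
    self->child->Vertex3f(self->child, x, y, z);
}

/* Intercepted call: whatever mode the application asked for,
   the downstream SPU is told to draw lines. */
static void wire_PolygonMode(SPU *self, int face, int mode)
{
    (void)mode;
    self->child->PolygonMode(self->child, face, MODE_LINE);
}

/* Sink SPU: records what reaches the end of the chain. */
static void sink_PolygonMode(SPU *self, int face, int mode)
{
    (void)face;
    self->last_mode = mode;
}

static void sink_Vertex3f(SPU *self, float x, float y, float z)
{
    (void)x; (void)y; (void)z;
    self->vertex_count++;
}

SPU make_sink(void)
{
    SPU s = { sink_PolygonMode, sink_Vertex3f, NULL, MODE_FILL, 0 };
    return s;
}

SPU make_wireframe(SPU *child)
{
    assert(child != NULL);
    SPU s = { wire_PolygonMode, pass_Vertex3f, child, MODE_FILL, 0 };
    return s;
}
```

Calling `PolygonMode` with `MODE_FILL` on the wireframe SPU leaves `MODE_LINE` recorded in the sink, while `Vertex3f` calls arrive unchanged.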

Output Scalability (Sort-First)
  Larger displays with unmodified applications
  Other possibilities: broadcast, ring network
  [Figure labels: App, Server, Display]

Example: Sort-First
  [Figure labels: App, Tilesort, Server, Render]
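
The core of a sort-first step is deciding which tiles a piece of geometry touches. The sketch below assumes a uniform grid of equally sized tiles, a screen-space bounding box given as a half-open rectangle, and at most 32 tiles so the result fits in a bitmask. `Rect` and `tiles_for_box` are hypothetical names, and this is not the Tilesort SPU's actual bucketing code.

```c
#include <assert.h>

typedef struct { int x0, y0, x1, y1; } Rect;  /* half-open: [x0,x1) x [y0,y1) */

/* Return a bitmask with bit (row * cols + col) set for every tile
   that the screen-space bounding box overlaps. */
unsigned tiles_for_box(Rect box, int tile_w, int tile_h, int cols, int rows)
{
    assert(tile_w > 0 && tile_h > 0 && cols > 0 && rows > 0);
    assert(cols * rows <= 32);

    /* Clamp the box to the extent of the whole display. */
    int x0 = box.x0 < 0 ? 0 : box.x0;
    int y0 = box.y0 < 0 ? 0 : box.y0;
    int x1 = box.x1 > cols * tile_w ? cols * tile_w : box.x1;
    int y1 = box.y1 > rows * tile_h ? rows * tile_h : box.y1;

    unsigned mask = 0;
    if (x0 >= x1 || y0 >= y1)
        return mask;                 /* empty or entirely off-screen */

    for (int r = y0 / tile_h; r <= (y1 - 1) / tile_h; r++)
        for (int c = x0 / tile_w; c <= (x1 - 1) / tile_w; c++)
            mask |= 1u << (r * cols + c);
    return mask;
}
```

Geometry whose box straddles a tile boundary is sent to every server whose bit is set, which is why output scales with tile count while boundary geometry is duplicated.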

Input Scalability (Sort-Last)
  Parallel geometry extraction
  Parallel data submission
  [Figure labels: App, Display, Server]

Example: Sort-Last
  Application runs directly on graphics hardware
  Same application can use sort-last or sort-first
  [Figure labels: Application, Readback, Send, Server, Render]

SPU Inheritance
  The Readback and Render SPUs are related
    Readback renders everything except SwapBuffers
  Readback inherits from the Render SPU
    Override parent’s implementation of SwapBuffers
    All OpenGL calls considered “virtual”
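
One way to picture "all calls are virtual" in C is a table of function pointers that a derived SPU copies from its parent and then patches. The three-entry `DispatchTable` and the `*_IMPL` return codes below are invented for this sketch so the caller can see which implementation ran; they are assumptions, not Chromium's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dispatch table with three entries. */
typedef struct {
    int (*Clear)(void);
    int (*DrawArrays)(void);
    int (*SwapBuffers)(void);
} DispatchTable;

/* Return codes identify which SPU's implementation handled the call. */
enum { RENDER_IMPL = 1, READBACK_IMPL = 2 };

static int render_Clear(void)         { return RENDER_IMPL; }
static int render_DrawArrays(void)    { return RENDER_IMPL; }
static int render_SwapBuffers(void)   { return RENDER_IMPL; }
static int readback_SwapBuffers(void) { return READBACK_IMPL; }

/* Parent: the Render SPU implements every entry. */
DispatchTable render_table(void)
{
    DispatchTable t = { render_Clear, render_DrawArrays, render_SwapBuffers };
    return t;
}

/* Child: start from the parent's table, then override one entry. */
DispatchTable readback_table(void)
{
    DispatchTable t = render_table();       /* inherit everything */
    assert(t.SwapBuffers != NULL);          /* entry being overridden exists */
    t.SwapBuffers = readback_SwapBuffers;   /* override a single call */
    return t;
}
```

Every entry other than `SwapBuffers` still resolves to the parent's function, so the derived SPU only contains the code that differs.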

Readback’s SwapBuffers
  Easily extended to include depth composite
  All other functions inherited from Render SPU

    void RB_SwapBuffers(void)
    {
        /* Read back the locally rendered image. */
        self.ReadPixels( 0, 0, w, h,... );

        /* Hand the pixels to the downstream SPU. The P/V semaphore pair
           keeps the RasterPos/DrawPixels sequence from interleaving
           with another stream's. */
        child.Clear( GL_COLOR_BUFFER_BIT );
        child.SemaphorePCR( READBACK_SEMAPHORE );
        child.RasterPos2i( tileX, tileY );
        child.DrawPixels( w, h,... );
        child.SemaphoreVCR( READBACK_SEMAPHORE );
        child.SwapBuffers( );
    }
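
The depth-composite extension mentioned above comes down to a per-pixel "keep the nearer fragment" merge of two color+depth images. The function below is a CPU-side sketch of that rule only, assuming packed 32-bit colors, float depths, and the convention that a smaller depth value is nearer; `depth_composite` is a hypothetical name, and a real SPU would more likely do this on the graphics hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Merge src into dst pixel by pixel, keeping whichever fragment is
   nearer to the viewer. Smaller depth means nearer (assumption). */
void depth_composite(unsigned *dst_color, float *dst_depth,
                     const unsigned *src_color, const float *src_depth,
                     size_t n)
{
    assert(dst_color != NULL && dst_depth != NULL);
    assert(src_color != NULL && src_depth != NULL);

    for (size_t i = 0; i < n; i++) {
        if (src_depth[i] < dst_depth[i]) {
            dst_color[i] = src_color[i];
            dst_depth[i] = src_depth[i];
        }
    }
}
```

Because the rule is commutative for distinct depths, images from different nodes can be merged in any order, which is what makes sort-last compositing order-independent for opaque geometry.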

More Complicated Example: Hybrid
  [Figure labels: App, Tilesort, Server, Readback, Send, Server, Render]

Networks Supported
  TCP/UDP
  Myrinet
  Quadrics
  InfiniBand (coming soon)

New Things to Chromium
  Extensions
  DMX Support
  Display list management (DLM)
  VNC Support
  CRUT
    Dale’s talk

Extensions
  GL_ARB_fragment_program
  GL_ARB_vertex_program
  GL_NV_fragment_program
  GL_NV_vertex_program
  GL_NV_texture_rectangle
  GL_EXT_shadow_funcs
  GL_EXT_texture_rectangle
  GL_IBM_raster_pos_clip

DMX Support
  DMX
    Distributed Multihead X
    Single X session across multiple displays
    OpenGL through Chromium
  Chromium is “DMX aware”
    Moving/resizing = retiling
    M to N rendering

DMX In Action

Display List Management
  Display List Manager (DLM)
    State tracking is really tricky
    Replay state calls on client
    Call list on servers
    Bounding box tracking of display list
  Future optimizations
    Avoid broadcasting data in display list
    Send calls once per server as needed

VNC
  X forwarding
    Forwards GLX calls to client
  DRI bypasses X
    Can’t get pixel data
  OpenGL apps load Chromium
    Render on local host
    Read back pixel data
    Send to user’s display

What are people doing with Chromium?

Dynamic Screen Calibration

Quake3 Arena
  Niederauer et al.

Viewed in a new way
  Niederauer et al.

Architectural Analysis
  Intercept geometry
  Determine floor positions
  Change to orthographic view
  Insert clip planes at the ceilings
  Split floors apart
  Multi-pass rendering

  “Non-Invasive Interactive Visualization of Dynamic Architectural Environments”
  Christopher Niederauer, Mike Houston, Maneesh Agrawala, Greg Humphreys
  ACM SIGGRAPH 2003 Symposium on Interactive 3D Graphics

Batch Scheduler Integration
  Offline rendering to a webpage
  Use massive compute resources
  Rendering with Vis cluster
  Integrate support with RMS
  Pittsburgh Supercomputing Center

Summary
  750 compute nodes
  3000 EV68 processors
  6 Tf (peak, est. >4 Tf on LSMS)
  3 TB memory
  27 TB local disk
  Multi-rail fat-tree network
  Redundant monitor/ctrl
  WAN/LAN accessible
  Parallel visualization
  File servers: 30 TB, ~32 GB/s
  Mass store, ~1 TB/hr
  [Figure labels: WAN/LAN, Switched ethernet, Quadrics, Control, Compute Nodes, File Servers, Viz, Mass Store buffer, Archive, Interactive, /home]
  Terascale Computing System, Pittsburgh Supercomputing Center

Example
  qsub -l rmsnodes=3:12,other=visnodes=5
    Job waits until 3 nodes (12 CPUs) become available AND 5 vis nodes are available
    When resources are available, job runs
    Visit vis web page for rendering
  Pittsburgh Supercomputing Center

What’s coming in the next year?

General Improvements
  Continue to track OpenGL changes
    Add extensions
  Optimizations
    Display list management
    Tilesort
    Software compositors

PICA Support
  Parallel Image Compositing API (PICA)
    API for hardware and software compositing
    Will be supported by most hardware compositors
  Chromium support
    Hooks almost complete
    Need software compositors
      Readback (N to 1)
      Binary-swap
      SLIC
    Need info from hardware folks
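
For readers unfamiliar with binary-swap, the pairing schedule is the part that is easy to show in code: with a power-of-two node count, each node exchanges half of its current image region with the node whose rank differs in one bit, one bit per round. The helper names below are hypothetical, which half a node keeps is a convention chosen for this sketch, and nothing here reflects PICA's or Chromium's interfaces.

```c
#include <assert.h>

/* Number of exchange rounds for n nodes; n must be a power of two. */
int binary_swap_rounds(int n)
{
    assert(n > 0 && (n & (n - 1)) == 0);
    int rounds = 0;
    while ((1 << rounds) < n)
        rounds++;
    return rounds;
}

/* Partner of `rank` in round `round`: the rank with that bit flipped. */
int binary_swap_partner(int rank, int round)
{
    assert(rank >= 0 && round >= 0 && round < 30);
    return rank ^ (1 << round);
}

/* Convention for this sketch: the node whose bit is 0 keeps the lower
   half of the region it currently owns, its partner keeps the upper. */
int keeps_lower_half(int rank, int round)
{
    assert(rank >= 0 && round >= 0 && round < 30);
    return ((rank >> round) & 1) == 0;
}
```

After the last round each node owns a fully composited 1/N slice of the image, which then has to be gathered for display.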

“Vis as a service”
  Better integration with schedulers
  Reservation systems
    Compute/Render/Display
  Distributed event model (CRUT)
  Compression
    Geometry data
    Pixel data
  Encryption

Look at how much was done last year!
  4 releases
  Constant bug fixes
  Constant improvements
  Constant optimizations
  Chromium is used in the real world!
  Chromium is supported by a large community!

Go get it!

Acknowledgements
  The Chromium community
  Greg Humphreys
  Brian Paul
  Joel Welling
  Alan Hourihane
  DOE!!!