Assets and Dynamics Computation for Virtual Worlds.

Scalable City Architecture
The Scalable City project requires larger, more interactive environments than were previously practical. Its development has produced many novel techniques.

Overview of Dynamics
The Scalable City software is built to handle large, temporally coherent environments efficiently.
– Incremental processing methods
– Minimize required work
Example: physics performance
– Resting objects do not affect run time
– Overhead is proportional to the level of activity rather than to environment scale
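The idea that resting objects cost nothing at run time can be sketched with an explicit set of active bodies. This is a minimal illustration, not the Scalable City implementation: the `Body` and `World` classes, the one-dimensional motion, and the `damping` and `sleep_speed` parameters are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Body:
    position: float
    velocity: float = 0.0
    resting: bool = True


class World:
    """Hypothetical world that only ever iterates over active bodies."""

    def __init__(self, bodies):
        self.bodies = list(bodies)
        # Indices of bodies that currently need integration.
        self.active = {i for i, b in enumerate(self.bodies) if not b.resting}

    def wake(self, index, velocity):
        """Activate a resting body, e.g. after a collision."""
        body = self.bodies[index]
        body.velocity = velocity
        body.resting = False
        self.active.add(index)

    def step(self, dt, damping=0.5, sleep_speed=0.01):
        """Advance only the active bodies and return how many were touched."""
        touched = 0
        for index in list(self.active):
            body = self.bodies[index]
            body.position += body.velocity * dt
            body.velocity *= damping
            touched += 1
            # Put the body back to rest once it has almost stopped.
            if abs(body.velocity) < sleep_speed:
                body.velocity = 0.0
                body.resting = True
                self.active.discard(index)
        return touched
```

In this sketch a world of a thousand resting bodies does no per-body work in `step`; the cost of a step follows the size of `active`, not the size of `bodies`.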

Spatial Proximity Determination
Sweep and Prune: O(u·n^(2/3) + (i + r)·n + e + o)

Spatial Proximity Determination
Hybrid S&P: O(u + (i + r) + e)
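For readers unfamiliar with sweep and prune, the textbook single-axis form is sketched below. It rebuilds the overlap list from scratch each call, so it illustrates only the basic technique; the incremental and hybrid variants whose complexities are quoted on these slides are not reproduced here, and the function name and interval representation are assumptions.

```python
def sweep_and_prune(intervals):
    """Find overlapping extents along one axis.

    intervals: list of (lo, hi) extents, one per object.
    Returns a sorted list of (i, j) index pairs, i < j, whose extents overlap.
    """
    # Visit objects in order of their lower bound.
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    active = []  # objects whose extent has not yet ended
    pairs = []
    for i in order:
        lo, _hi = intervals[i]
        # Prune objects that end before the current one begins.
        active = [j for j in active if intervals[j][1] >= lo]
        # Everything still active overlaps the current object on this axis.
        pairs.extend((min(i, j), max(i, j)) for j in active)
        active.append(i)
    return sorted(pairs)
```

Pairs reported on one axis are only candidates; a full broad phase would repeat the test on the remaining axes or hand the candidates to a narrower check.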

Incremental Physics Pipeline
A novel method for processing resting bodies and the potential activations caused by collisions. As with spatial proximity determination, this gives asymptotically superior performance.

Incremental Physics Pipeline
[Figure: increasing number of bodies at rest (not being interacted with) in the simulation]

Parallelizing Physics
– Multi-threaded physics engine
– Parallelized traditional physics engine components without altering algorithms

Client/Server Data Synchronization
Because of the high interactive object count, determining the minimum data set needed to synchronize a client is challenging.
What we could not do due to performance constraints:
– Clients could not retain knowledge of all objects
– Server could not traverse the object set for the sake of client updates
– Server could not efficiently track client knowledge

Client/Server Data Synchronization
Our solution utilized:
– A spatial proximity detector producing a mapping of players and objects to zones
– Each state change associated with a rule that determines the set of clients to update, without explicit object-level knowledge of client state
– All events placed in an order-dependent queue with a common format
Result: no measurable overhead for synchronization.
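One way to picture the three ingredients above is the sketch below. It is an assumption-laden illustration rather than the project's design: the `ZoneSync` class, integer zone ids, the two example rules, and the dictionary event format are all hypothetical.

```python
from collections import defaultdict, deque


class ZoneSync:
    """Hypothetical server-side router: state change -> rule -> zones -> clients."""

    def __init__(self):
        self.zone_clients = defaultdict(set)  # zone id -> client ids
        self.object_zone = {}                 # object id -> zone id
        self.queue = deque()                  # ordered (client, event) pairs

    def place_client(self, client, zone):
        # A client belongs to exactly one zone at a time.
        for members in self.zone_clients.values():
            members.discard(client)
        self.zone_clients[zone].add(client)

    def place_object(self, obj, zone):
        self.object_zone[obj] = zone

    def publish(self, obj, change, rule):
        """Queue one event per recipient; the rule maps a zone to target zones."""
        event = {"object": obj, "change": change}  # common event format
        for zone in rule(self.object_zone[obj]):
            for client in sorted(self.zone_clients.get(zone, ())):
                self.queue.append((client, event))


def same_zone(zone):
    """Example rule: only clients in the object's own zone."""
    return [zone]


def with_neighbours(zone):
    """Example rule: the object's zone plus its two neighbours on a line."""
    return [zone - 1, zone, zone + 1]
```

The server never asks what an individual client already knows about an object; recipients follow from the zone mapping and the rule attached to the state change, and the single queue preserves event order.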

Rendering Optimizations
Roads and fences consist of many simple objects that require animation.
– Drawing many simple objects is sub-optimal.
– Animation updates constrain solutions.
The new system treats all roads in a city as a single, large mesh with a subset of vertices animating.
– Single draw call, geometry retained on GPU

Example: Rendering Optimizations
We mirror vertices in CPU memory and transfer changes to the GPU each cycle.
Transferring updates to the GPU has trade-offs:
– Each operation incurs a cost of a constant plus the number of bytes transferred.
– Optimal for a full update is a single, contiguous transfer.
– Optimal when no updates are necessary is “do nothing”!
– Our case is an unknown subset of updates requiring transfer.
[Figure: CPU buffer and GPU buffer]
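The "constant plus bytes" cost can be made concrete with a small model. The function below is only a sketch of that model; the name `transfer_cost`, the inclusive range representation, and every numeric value used with it are assumptions chosen for illustration, not measurements.

```python
def transfer_cost(ranges, overhead, per_byte, stride):
    """Total cost of issuing one transfer per range.

    ranges:   list of (first, last) inclusive vertex index ranges
    overhead: fixed cost of a single transfer operation (hypothetical units)
    per_byte: cost of each byte moved (hypothetical units)
    stride:   bytes per vertex
    """
    return sum(
        overhead + per_byte * stride * (last - first + 1)
        for first, last in ranges
    )


# One contiguous transfer of a 1000-vertex buffer, 12 bytes per vertex.
full_update = transfer_cost([(0, 999)], overhead=500, per_byte=1, stride=12)

# Ten isolated single-vertex transfers versus a hundred of them.
ten_single = transfer_cost([(i * 10, i * 10) for i in range(10)], 500, 1, 12)
hundred_single = transfer_cost([(i * 10, i * 10) for i in range(100)], 500, 1, 12)
```

With these made-up constants, ten isolated transfers cost 5120 units against 12500 for the full buffer, while a hundred isolated transfers cost 51200, which is why neither "always send everything" nor "send each change separately" is right for an unknown subset of updates.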

Example: Rendering Optimizations
Goals:
– Ideal performance on full updates
– Ideal performance on few updates
– “Graceful degradation” in between
The set of updates will have clusters.
– Many associated vertices animate together
We want to minimize the cost of updating the GPU.
– Divide-and-conquer algorithm

Example: Rendering Optimizations
Accumulate update positions into ranges:
– If the number of updates is a significant percentage of the range, transfer the whole range.
– Otherwise, split the range at the largest gap and recurse on both sides.
The algorithm performs well when there is a trade-off between transferring contiguous bytes and limiting the total data transferred.
[Figure: CPU buffer and GPU buffer]
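The recursive rule above can be sketched as follows. The slide does not say what counts as a "significant percentage", so the `threshold` parameter and its default of 0.5 are assumptions, as are the function name and the use of a sorted list of dirty vertex indices as input.

```python
def plan_transfers(dirty, threshold=0.5):
    """Choose contiguous ranges to transfer for a set of dirty indices.

    dirty:     sorted list of unique indices that changed this cycle
    threshold: assumed fraction of a range that must be dirty before the
               whole range is sent in one transfer
    Returns a list of (first, last) inclusive ranges in ascending order.
    """
    if not dirty:
        return []  # nothing changed: do nothing
    first, last = dirty[0], dirty[-1]
    span = last - first + 1
    # Dense enough (or a single element): send the whole range at once.
    if len(dirty) == 1 or len(dirty) >= threshold * span:
        return [(first, last)]
    # Otherwise split at the largest gap between neighbouring dirty indices.
    split = max(range(1, len(dirty)), key=lambda k: dirty[k] - dirty[k - 1])
    return (plan_transfers(dirty[:split], threshold)
            + plan_transfers(dirty[split:], threshold))
```

A fully dirty buffer collapses to one range and an untouched buffer to none, which matches the two ideal cases in the goals; clustered updates fall in between, yielding one range per cluster. The recursion depth grows with the number of splits, so a production version would likely be iterative.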