Presentation is loading. Please wait.

Presentation is loading. Please wait.

Dealing with Computational Load in Multi-user Scalable City with OpenCL Assets and Dynamics Computation for Virtual Worlds.

Similar presentations


Presentation on theme: "Dealing with Computational Load in Multi-user Scalable City with OpenCL Assets and Dynamics Computation for Virtual Worlds."— Presentation transcript:

1 Dealing with Computational Load in Multi-user Scalable City with OpenCL Assets and Dynamics Computation for Virtual Worlds

2 Focus on OpenCL Leveraging OpenCL allows targeting and testing various parallel compute resources for each workload: Core i7, Cell, Tesla, GPU Using a combination of compute accelerators allows us to alter configuration and mappings between software and hardware more easily. Vendors continue to optimize OpenCL drivers while we optimize our use of OpenCL: free development!

3 Multi-User Load Challenges Communications Graphics Rendering – Geometry Processing – Shaders – Rendering Techniques Dynamics Computation – Physics – AI or other application specific behaviors – Animation

4 Particle Systems Each player is represented by a cloud of particles in the shape of a cyclone. Particle systems are client-side only: no effect upon environment, no need for synchronizing. With many players per city, each player processes many times more particles. Causing scalability and performance problems on the client with multiple players visible.

5 Particle Systems: Now CPU computes new positions & texture coordinates New info must be sent to the card each frame. More CPU & bandwidth used as players increase.

6 Particle Systems: Solution Utilize OpenCL to compute particle updates on the GPU. Particle system is an ideal GPU workload. Use OpenGL/DirectX interoperability to keep all data on the card: primary reason for GPU supremacy for this task! Expect an order of magnitude or more performance improvement on this system.

7 Server Physics

8 Effect of Multi-User On Physics Multi-user Scalable City will: – Scale total interactivity occurring at once – Shift focus more towards parallel computation – Impose greater demands on the state of the art in parallel computation – Consequently expand upon the state of the art in parallel computation – Utilize both incremental and parallel methods Reduce work as much as possible Parallelize all work that must be done

9 Utilizing Parallel Hardware Next step: offloading physics from z10 – Xeon blades running Scalable Engine – Cell BE blades running Bullet for Cell – Distribute heavy computational stages Collision Detection on broad phase pair output Constraint solving/Integration on contact groups Then: OpenCL plan – Develop physics system using algorithms well- suited to OpenCL parallelization

10 Server Physics: Parallelization Physics is difficult to parallelize well: – Each stage has vastly different properties If stages map to different devices, large data buffer transmissions must synchronize compute accelerators. – Computationally heavy stages Collision detection is coarse-grained Independent contact groups can be large, unbalanced Constraint solving difficult to break down – Traditional systems that solve all constraints simultaneously (e.g. using Gauss-Seidel) parallelize in limited ways.

11 Server Physics: Goals Produce a new physics processing pipeline based as much as possible on OpenCL Minimize buffer transmission to/from OpenCL devices by keeping most all stages in OpenCL Specialize physics algorithms to highly parallel hardware Scale to as much activity as possible in real time as we scale hardware resources

12 Server Physics: Approach Present efforts for OpenCL physics focus upon traditional algorithms and pipeline – These methods have “won out” in single-core era – Physics programmers most familiar with these methods and their trade-offs – Ex: AMD-funded OpenCL port of Bullet We choose techniques better suited to massively parallel computation! – Greater potential, more exploration to perform

13 “Advanced Character Physics” – Jakobsen GDC 2001 History – Developed for speed and simplicity – Not yet implemented in a parallel system Features – All physics operates on solely on particles – There is no large, global set of constraints to solve Ever

14 Jakobsen: Key Features Objects represented as a set of particles and stick constraints rather than geometric shapes All constraints solved individually, w/o reference to other constraints Collision response by simple projection Velocity-less Verlet integration method (often used in molecular dynamics)

15 Jakobsen: Key Features Objects represented as a set of particles and stick constraints rather than geometric shapes All constraints solved individually, w/o reference to other constraints Collision response by simple projection Velocity-less Verlet integration method (often used in molecular dynamics)

16 Rigid Bodies from Particles Cube as set of particles and stick constraints – Corners are particles – Stick constraints placed for edges – Stick constraints placed to prevent collapse A stick constraint requires that the distance between two points be a constant value

17 Jakobsen: Key Features Objects represented as a set of particles and stick constraints rather than geometric shapes All constraints solved individually, w/o reference to other constraints Collision response by simple projection Velocity-less Verlet integration method (often used in molecular dynamics)

18 Constraint Solving Normally, all constraints upon a body are solved simultaneously – Solution to large set of equations and unknowns – Produces a transform that does not violate any constraints Jakobsen method solves each constraint individually – Solving one constraint violates another – Iteratively solving constraints approaches solution – Fewer iterations can be used when warranted

19 Jakobsen: Key Features Objects represented as a set of particles and stick constraints rather than geometric shapes All constraints solved individually, w/o reference to other constraints Collision response by simple projection Velocity-less Verlet integration method (often used in molecular dynamics)

20 Jakobsen: Key Features Verlet Integration – Replace use of ‘velocity’ with ‘previous position’ Projection – Simply move particle out of collision with face! Combination of features results in very stable simulation – Velocity & acceleration never get out of control Retaining OpenCL buffers on device – Last cycle’s result array binds to ‘previous position’

21 Jakobsen: The Good Physics processing tends toward large number of simpler, evenly divided computations! – Collision detection and constraints operate on particles – All constraints are solved independently Can reduce buffer transfers to half! – No contact graph generation stage – Allows parallel computation across collision detection, constraint generation, constraint solving, integration Contact GraphColl. Det.IntegrationColl. Det.Integration

22 Jakobsen: The Bad Constraints can be violated temporarily – Solving one constraint can re-violate another – ~10 constraint solving passes produces rigid body-like behavior: fewer passes produce cloth->plant behavior Body transforms must be computed based upon particle configuration in post-processing for display purposes Additional behaviors require re-engineering (friction model, bouncing, inverse kinematics, etc) in light of Verlet & particle scheme (see paper)

23 Progress Work is just beginning OpenCL particle system with Verlet integration Complex forces, texture animation for cyclones Particle/height-map collision detection Collision response by projection Rigid bodies as set of particles + stick constraints Multi-pass relaxation solver

24 Progress

25 Further Out Distribute responsibility for dynamics among qualified clients – Increases world asynchronicity – Develop operational semantics to indicate synchronous status – Server gathers, correlates, and distributes updates to world

26 Conclusion OpenCL at front and center of efforts to alleviate performance problems on both server and client in our multi-user systems. Focus is on algorithms to – Reduce buffer communication to/from OpenCL – Increase parallelism by breaking up problems into smaller pieces


Download ppt "Dealing with Computational Load in Multi-user Scalable City with OpenCL Assets and Dynamics Computation for Virtual Worlds."

Similar presentations


Ads by Google