Slide 1: Scalable and Secure Architectures for Online Multiplayer Games
Thesis Proposal
Ashwin Bharambe
May 15, 2006
Slide 2: Online Games are Huge!
[Chart: number of subscribers, 1997-2005, for World of Warcraft, Final Fantasy XI, Everquest, and Ultima Online; source: http://www.mmogchart.com/]
Some more facts:
1. These MMORPGs have client-server architectures
2. They accommodate ~0.5 million players at a time
Slide 3: Why MMORPGs Scale
Role-playing games have been slow-paced
Players interact with the server relatively infrequently
Maintain multiple independent game-worlds, each hosted on a different server
Not true with other game genres
FPS or First-Person Shooters (e.g., Quake) demand high interactivity and need a single game-world
Slide 4: FPS Games Don't Scale
[Chart: bandwidth (kbps) of a Quake II server]
Both bandwidth and computation become bottlenecks
Slide 5: Goal: Cooperative Server Architecture
Focus on fast-paced FPS games
Slide 6: Distributing Games: Challenges
Tight latency constraints: as players or missiles move, updates must be disseminated very quickly (< 150 ms for FPS games)
High write-sharing in the workload
Cheating: execution and state maintenance are spread over untrustworthy nodes
Slide 7: Talk Outline
Problem
Background
Game Model
Related Work
Colyseus Architecture
Expected Contributions
Slide 8: Game Model
[Screenshot of Serious Sam]
Immutable state: interactive 3-D environment (maps, models, textures)
Mutable state: players, game status, monsters, ammo
Slide 9: Game Execution in Client-Server Model

void RunGameFrame()  // runs every 50-100 ms
{
    // every object in the world thinks once every game frame
    foreach (obj in mutable_objs) {
        if (obj->think)
            obj->think();
    }
    send_world_update_to_clients();
}
Slide 10: Object Partitioning
[Diagram: partitioning of player and monster objects]
Slide 11: Distributed Game Execution

class CruzMissile {
    // every object in the world thinks once every game frame
    void think() {
        update_pos();
        if (dist_to_ground() < EPSILON)
            explode();
    }
    void explode() {
        foreach (p in get_nearby_objects()) {
            if (p.type == "player")
                p.health -= 50;
        }
    }
};

[Diagram: object discovery and replica synchronization for missile, monster, and item objects]
Slide 12: Talk Outline
Problem
Background
Game Model
Related Work
Colyseus Architecture
Expected Contributions
Slide 13: Related Work
Distributed designs:
Distributed Interactive Simulation (DIS), e.g., HLA, DIVE, MASSIVE, etc.: use region-based partitioning and IP multicast
Butterfly, Second Life, SimMUD [INFOCOM 04]: use region-based partitioning and DHT multicast
Cheat-proofing:
Lock-step synchronization with commitment
Slide 14: Related Work: Techniques
Region-based Partitioning
Parallel Simulation
Area-of-Interest Management with Multicast
Slide 15: Related Work: Techniques
Region-based Partitioning:
Divide the game-world into a fixed number of regions
Assign objects in one region to one server
+ Simple to place and discover objects
– High migration rates, especially for FPS games
– Regions exhibit very high skews in popularity, which can result in severe load imbalance
Parallel Simulation
Area-of-Interest Management with Multicast
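To make the region-based scheme concrete, here is a minimal sketch (not from the proposal): positions map onto a fixed uniform grid of regions and regions map statically onto servers. The world size, grid size, and region-to-server map are illustrative assumptions; the point is how often a fast-moving object changes its owning server.

```cpp
#include <cstdio>

// Illustrative region-based partitioning: a uniform 2-D grid of fixed regions,
// each statically assigned to one server. All constants are made up.
const float WORLD_SIZE = 1024.0f;
const int   REGIONS_PER_AXIS = 8;          // 8x8 = 64 fixed regions

struct Position { float x, y; };

int region_of(Position p) {
    float cell = WORLD_SIZE / REGIONS_PER_AXIS;
    int rx = (int)(p.x / cell), ry = (int)(p.y / cell);
    return ry * REGIONS_PER_AXIS + rx;     // region id
}

int server_of(int region, int num_servers) {
    return region % num_servers;           // static region -> server map
}

int main() {
    // A fast-moving object (e.g., a missile) crosses region boundaries often,
    // so its owning server changes often: the migration cost the slide notes.
    Position missile = {10.0f, 10.0f};
    for (int tick = 0; tick < 5; ++tick) {
        printf("tick %d: region %d on server %d\n",
               tick, region_of(missile), server_of(region_of(missile), 4));
        missile.x += 100.0f;               // moves roughly one region per tick
    }
    return 0;
}
```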
Slide 16: Related Work: Techniques
Region-based Partitioning
Parallel Simulation:
Peer-to-peer: each peer maintains the full state
Writes to objects are sent to all peers
+ Point-to-point links, so updates go fastest
– Needs lock-step + bucket synchronization
– No conflict resolution, so inconsistency never heals
Area-of-Interest Management with Multicast
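For context, the lock-step bucket synchronization that parallel simulation relies on can be sketched as follows (illustrative only; names and structure are assumptions): a peer may advance to the next frame only after inputs for the current frame have arrived from every peer, which is where the added lag comes from.

```cpp
#include <cstdio>
#include <set>

// Illustrative bucket synchronization: the local simulation may only run
// frame f once inputs for f have arrived from all peers, so one slow or
// lossy peer stalls everyone.
struct BucketSync {
    int           num_peers;
    int           frame;
    std::set<int> arrived;                   // peers whose input for `frame` we have

    void on_input(int peer, int for_frame) {
        if (for_frame == frame) arrived.insert(peer);
    }

    // Returns true if the local simulation is allowed to run this frame.
    bool try_advance() {
        if ((int)arrived.size() < num_peers) return false;  // still waiting
        arrived.clear();
        frame++;
        return true;
    }
};

int main() {
    BucketSync sync{3, 0, {}};               // three peers, including ourselves
    sync.on_input(0, 0);
    sync.on_input(1, 0);
    printf("advance with 2/3 inputs: %d\n", sync.try_advance());  // blocked
    sync.on_input(2, 0);
    printf("advance with 3/3 inputs: %d\n", sync.try_advance());  // proceeds
    return 0;
}
```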
Slide 17: Related Work: Techniques
Region-based Partitioning
Parallel Simulation
Area-of-Interest Management with Multicast:
Players only need updates from the nearby region
1 region == 1 multicast group; use one shared multicast tree per group
– Bandwidth load imbalance due to skews in region popularity
– Updates need multiple hops, which is bad for FPS games
Slide 18: Talk Outline
Problem
Background
Colyseus Architecture
Scalability [NSDI 2006]
Evaluation
Security
Expected Contributions
Slide 19: Colyseus Components
[Diagram: servers S1, S2, and S3 with primaries P1-P4 and replicas R3, R4; components labeled Object Discovery, Replica Management, and Object Placement; a get_nearby_objects() call]
Slide 20: Object Placement
Flexible and dynamic object placement:
Permits the use of clustering algorithms
Not tied to "regions"
Previous systems use region-based placement:
Frequent, disruptive migration for fast games
Regions in a game have very skewed popularity
[Chart: region popularity vs. region rank]
Slide 21: Replication Model
Primary-backup replication: a single primary with read-only replicas
Writes are serialized at the primary
The primary is responsible for executing think code
Replicas trail the primary by one hop
Weakly consistent; low latency is critical
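A minimal sketch of this replication model, with illustrative names (not Colyseus code): the primary serializes writes and pushes the updated state one hop to read-only replicas, without waiting for acknowledgements.

```cpp
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Illustrative primary-backup model: all writes go to the single primary,
// which serializes them and pushes read-only copies one hop to each replica.
struct ObjectState {
    int version = 0;
    std::map<std::string, double> fields;    // e.g., "health", "x", "y"
};

struct ReplicaSite {
    ObjectState copy;                        // read-only, trails by one hop
    void apply(const ObjectState& s) { copy = s; }
};

struct PrimarySite {
    ObjectState state;
    std::vector<ReplicaSite*> replicas;

    // Writes are serialized here: one authoritative order, one version number.
    void write(const std::string& field, double value) {
        state.fields[field] = value;
        state.version++;
        for (ReplicaSite* r : replicas)      // single-hop, point-to-point push
            r->apply(state);                 // weakly consistent: no ack needed
    }
};

int main() {
    PrimarySite player;
    ReplicaSite on_server_s2;
    player.replicas.push_back(&on_server_s2);

    player.write("health", 100);
    player.write("health", 50);              // serialized after the first write
    printf("replica sees health=%.0f at version %d\n",
           on_server_s2.copy.fields["health"], on_server_s2.copy.version);
    return 0;
}
```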
Slide 22: Object Discovery
Most objects only need other "nearby" objects for executing their think functions: get_nearby_objects()
Slide 23: Distributed Object Discovery
Publication: "My position is x = x1, y = y1, z = z1; located on 128.2.255.255"
Subscription: "Find all objects with obj.x ∈ [x1, x2], obj.y ∈ [y1, y2], obj.z ∈ [z1, z2]"
Use a structured overlay to achieve this
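A minimal sketch of the publication/subscription matching implied here, with illustrative types; in the real system the matching is routed through the Mercury overlay rather than done against a local list.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Illustrative publication ("my position is (x,y,z), I live on this node")
// and range subscription ("find all objects inside this box").
struct Publication {
    float x, y, z;
    std::string node;                // e.g., "128.2.255.255"
};

struct Range {
    float lo, hi;
    bool contains(float v) const { return lo <= v && v <= hi; }
};

struct Subscription {
    Range x, y, z;
    bool matches(const Publication& p) const {
        return x.contains(p.x) && y.contains(p.y) && z.contains(p.z);
    }
};

int main() {
    std::vector<Publication> pubs = {
        {10, 20, 5, "128.2.255.255"},
        {500, 20, 5, "128.2.9.100"},
    };
    Subscription near_me{{0, 100}, {0, 100}, {0, 100}};   // my area of interest

    for (const auto& p : pubs)
        if (near_me.matches(p))
            printf("discovered object at (%.0f,%.0f,%.0f) on %s\n",
                   p.x, p.y, p.z, p.node.c_str());
    return 0;
}
```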
Slide 24: Mercury: Range-Queriable DHT [SIGCOMM 2004]
Supports range queries, not just exact matches: no need for partitioning into "regions"
Places data contiguously: can utilize spatial locality in games
Dynamically balances load: control traffic does not cause hotspots
Provides O(log n)-hop lookup: about 200 ms for 225 nodes in our setup
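The following sketch illustrates only the contiguous-placement property: when each node owns a contiguous slice of an attribute's value space, a range query is answered by a consecutive set of nodes. Routing, dynamic load balancing, and the O(log n) lookup are not modeled; all names and numbers are assumptions made for illustration.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Illustrative contiguous data placement: each node owns one slice of the
// x-coordinate space, so a range query hits a consecutive run of nodes.
struct Node {
    std::string name;
    float lo, hi;                 // contiguous slice of the attribute space it owns
};

// Return the nodes whose slices overlap the query range [qlo, qhi].
std::vector<std::string> nodes_for_range(const std::vector<Node>& ring,
                                         float qlo, float qhi) {
    std::vector<std::string> hit;
    for (const Node& n : ring)
        if (n.lo <= qhi && qlo <= n.hi)
            hit.push_back(n.name);
    return hit;
}

int main() {
    // Four nodes splitting the x-coordinate space [0, 1024) contiguously.
    std::vector<Node> ring = {
        {"A", 0, 255}, {"B", 256, 511}, {"C", 512, 767}, {"D", 768, 1023}
    };
    for (const auto& name : nodes_for_range(ring, 200, 600))
        printf("query [200, 600] visits node %s\n", name.c_str());
    return 0;
}
```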
Slide 25: Object Discovery Optimizations
Pre-fetch soon-to-be-required objects: use game physics for prediction
Pro-active replication: piggyback object creation on update messages
Soft-state subscriptions and publications: add object-specific TTLs to pubs and subs
Slide 26: Colyseus Design: Recap
[Diagram: a node at 128.2.9.100 asks Mercury to "find me nearby objects", learns of a monster on 128.2.9.200, and synchronizes a replica over a direct point-to-point connection]
Slide 27: Putting It All Together
Slide 28: Talk Outline
Problem
Background
Colyseus Architecture
Scalability
Evaluation [NSDI 2006]
Security
Expected Contributions
Slide 29: Evaluation Goals
Bandwidth scalability: per-node bandwidth usage should scale with the number of nodes
View inconsistency due to object discovery latency should be small
Discovery latency and pre-fetching overhead: in [NSDI 2006]
Slide 30: Experimental Setup
Emulab-based evaluation
Synthetic game with a workload based on Quake III traces
P2P scenario: 1 player per server
Unlimited bandwidth; modeled end-to-end latencies
More results, including a Quake II evaluation, in [NSDI 2006]
Slide 31: Per-node Bandwidth Scaling
[Chart: mean outgoing bandwidth (kbps) vs. number of nodes]
Slide 32: View Inconsistency
[Chart: average fraction of mobile objects missing vs. number of nodes, for no delay, 100 ms delay, and 400 ms delay]
Slide 33: Planned Work
Consistency models:
Game operations demand differing levels of consistency and latency response
Causal ordering of events
Atomicity
Deployment:
Performance metrics depend crucially on the workload
A real game workload would be useful for future research
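Causal ordering is listed here as planned work; as background, a standard vector-clock sketch (a textbook technique, not the proposal's design) shows how causal precedence between two events can be checked.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Standard vector clocks: each node keeps one counter per node, and an event
// e1 causally precedes e2 iff e1's clock is <= e2's clock component-wise and
// strictly less in at least one component.
using VClock = std::vector<int>;

void local_event(VClock& c, int me) { c[me]++; }

VClock on_receive(VClock mine, const VClock& msg, int me) {
    for (size_t i = 0; i < mine.size(); ++i)
        mine[i] = std::max(mine[i], msg[i]);
    mine[me]++;                              // the receive is itself an event
    return mine;
}

bool happens_before(const VClock& a, const VClock& b) {
    bool strictly_less = false;
    for (size_t i = 0; i < a.size(); ++i) {
        if (a[i] > b[i]) return false;
        if (a[i] < b[i]) strictly_less = true;
    }
    return strictly_less;
}

int main() {
    VClock a = {0, 0}, b = {0, 0};           // two nodes: 0 and 1
    local_event(a, 0);                       // "you shoot" on node 0
    b = on_receive(b, a, 1);                 // node 1 hears about it
    local_event(b, 1);                       // "I die" on node 1
    printf("shoot happens-before die: %d\n", happens_before(a, b));
    return 0;
}
```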
Slide 34: Talk Outline
Problem
Background
Colyseus Architecture
Scalability
Evaluation
Security [Planned Work]
Expected Contributions
Slide 35: Cheating in Online Games
Why do cheats arise?
Distributed system (client-server or P2P)
Bugs in the game implementation
Possible cheats in Colyseus:
Object Discovery: map-hack, subscription-hijack
Replication: god-mode, event-ordering, etc.
Object Placement: god-mode
Slide 36: Object Discovery Cheats
Map-hack cheat [information overexposure]:
Subscribe to arbitrary areas in the game
Discover all objects, which may be against game rules
Subscription-hijack cheat:
Incorrectly route your enemy's subscriptions
The enemy cannot discover (see) players
Other players can see her and can shoot her
Slide 37: Replication Cheats
God-mode cheat: the primary node has arbitrary control over writes to the object
Timestamp cheat: the primary node decides the serialized write order
[Diagram: Node A and Node B arguing "You die!" / "No, I don't!"]
Slide 38: Replication Cheats
Suppress-update cheat: the primary does not send updates to the replicas
Inconsistency cheat: the primary sends incorrect or conflicting updates to the replicas
[Diagram: conflicting updates among Players A-D: "Hide from this guy", "I am dead", "I moved to another room"]
Slide 39: Related Work
NEO protocol [GauthierDickey 04]: lock-step synchronization with commitment
Send an encrypted update in round 1
Send the decryption key in round 2, only after you receive updates from everybody
+ Addresses the suppress-update and timestamp cheats
– Lock-step synchronization increases "lag"
– Does not address the god-mode cheat, among others
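The two-round commit/reveal idea behind this kind of lock-step play can be sketched as follows. The real protocol sends an encrypted update and then releases the key; here a plain std::hash stands in as the commitment purely for illustration (it is not cryptographically secure, and all names are assumptions).

```cpp
#include <cstddef>
#include <cstdio>
#include <functional>
#include <string>

// Illustrative commit/reveal: round 1 exchanges commitments, round 2 reveals
// the actual updates, which are checked against the commitments.
using Commitment = std::size_t;

Commitment commit(const std::string& update) {
    return std::hash<std::string>{}(update);       // round 1: send only this
}

bool reveal_ok(const std::string& update, Commitment c) {
    return std::hash<std::string>{}(update) == c;  // round 2: check the reveal
}

int main() {
    // Round 1: both players exchange commitments before seeing each other's move.
    std::string move_a = "fire at (10,20)";
    std::string move_b = "dodge left";
    Commitment ca = commit(move_a), cb = commit(move_b);

    // Round 2: only after both commitments arrive are the moves revealed.
    // A player who changes its move after seeing the other's reveal fails the
    // check; one who withholds the reveal simply forfeits the round. This is
    // what blocks the suppress-update and timestamp cheats.
    printf("A's reveal valid: %d\n", reveal_ok(move_a, ca));
    printf("B's reveal valid: %d\n", reveal_ok(move_b, cb));
    printf("tampered reveal valid: %d\n", reveal_ok("fire at (0,0)", ca));
    return 0;
}
```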
Slide 40: Solution Approach
Philosophy: detection rather than prevention
Preventing cheating ≈ Byzantine fault tolerance
Known protocols emphasize strict consistency and assume weak synchrony
Multiple rounds are unsuitable for game-play
High-level decisions:
1. Make players leave an audit trail
2. Make peers police each other
3. Keep detection out of the critical path, always
Slide 41: Distributed Audit Log
[Diagram: randomly chosen witness; centralized auditor]
Slide 42: Logging Using Witnesses
[Diagram: player node (runs the think code, keeps the player log) and witness node (keeps the witness log); an optimistic update path and serialized updates]
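A minimal sketch of the witness-logging idea in the diagram, with illustrative types; the actual protocol, message formats, and audit procedure are part of the planned work.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Illustrative witness logging: the player applies its update optimistically
// and logs it; the witness logs the same update and assigns the serial order
// used later for audits.
struct LogEntry {
    int         serial;        // order assigned by the witness
    std::string update;        // e.g., "health -= 50"
};

struct Witness {
    std::vector<LogEntry> log;
    int next_serial = 0;
    int record(const std::string& update) {
        log.push_back({next_serial, update});
        return next_serial++;              // serialized update order
    }
};

struct Player {
    std::vector<LogEntry> log;
    Witness* witness;
    void apply(const std::string& update) {
        // Optimistic path: apply and ship the update immediately (not shown),
        // then report it to the witness off the critical path.
        int serial = witness->record(update);
        log.push_back({serial, update});
    }
};

// An auditor detects tampering by checking that the two logs agree.
bool audit(const Player& p, const Witness& w) {
    if (p.log.size() != w.log.size()) return false;
    for (size_t i = 0; i < p.log.size(); ++i)
        if (p.log[i].serial != w.log[i].serial || p.log[i].update != w.log[i].update)
            return false;
    return true;
}

int main() {
    Witness w;
    Player  p{{}, &w};
    p.apply("pos = (10, 20)");
    p.apply("health -= 50");
    printf("audit passed: %d\n", audit(p, w));
    return 0;
}
```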
Slide 43: Using Witnesses: Good and Bad
+ Player and witness logs can be used for audits: potentially addresses the timestamp, god-mode, and inconsistency cheats
+ The witness can generate pubs and subs: addresses the map-hack cheat
– Bandwidth overhead
– Does not handle the suppress-update cheat and the subscription-hijack cheat
Slide 44: Using Witnesses: Alternate Design
Move the primary directly to the witness node
Code execution and writes are applied directly at the witness
– Primary-to-replica updates go through the witness
– The witness gets arbitrary power: the player cannot complain to anybody
[Diagram: the witness node holds the primary copy of the player]
Slide 45: Challenges
Balance power between the player and the witness: use cryptographic techniques
How do players detect that somebody is cheating? Extraction of rules from the game code
Securing the object discovery layer: leverage DHT security research
Keep the bandwidth overhead minimal
Slide 46: Talk Outline
Problem
Background
Colyseus Architecture
Scalability
Evaluation
Security
Expected Contributions
Slide 47: Expected Contributions
Mercury range-queriable DHT
Design and evaluation of Colyseus
Real-world measurement of game workloads
Anti-cheating protocols
Slide 48: Expected Contributions
Mercury range-queriable DHT:
First structured overlay to support range queries and dynamic load balancing
Implementation used in other systems
Design and evaluation of Colyseus
Real-world measurement of game workloads
Anti-cheating protocols
Slide 49: Expected Contributions
Mercury range-queriable DHT
Design and evaluation of Colyseus:
First distributed design to be successfully applied to scaling FPS games
Demonstrated that low-latency game-play is feasible
Flexible architecture for adapting to various types of games
Real-world measurement of game workloads
Anti-cheating protocols
Slide 50: Expected Contributions
Mercury range-queriable DHT
Design and evaluation of Colyseus
Real-world measurement of game workloads:
Deployment of Quake III
Anti-cheating protocols
Slide 51: Expected Contributions
Mercury range-queriable DHT
Design and evaluation of Colyseus
Real-world measurement of game workloads
Anti-cheating protocols:
Encourage real-world deployments
Lead towards lighter-weight fault-tolerance protocols
Slide 52: Summary of Thesis Statement
Design of scalable, secure architectures for games, utilizing key properties:
Game workload is predictable
Players tolerate loose, eventual consistency
Slide 53: Differences from Related Work
Avoid region-based object placement:
Frequent migration when objects move
Load imbalance due to skewed region popularity
1-hop unicast update path between primaries and replicas (previous systems used overlay multicast)
Replication model with eventual consistency; avoid parallel execution
Slide 54: Timeline
Development of newer consistency and anti-cheat protocols: May 06 - Jul 06
Integration of Colyseus with Quake III: May 06 - Jul 06
Implementation of consistency and anti-cheat protocols: Jul 06 - Sep 06
Deployment and evaluation: Jul 06 - Dec 06
Thesis writing: Dec 06 - Mar 07
Slide 55: Thanks
Slide 56: Object Discovery Latency
[Chart: mean object discovery latency (ms) vs. number of nodes]
Slide 57: Object Discovery Latency
Observations:
1. Routing delay scales similarly for both types of DHTs: both exploit caching effectively. Median hop count = 3.
2. The DHT gains a small advantage because it does not have to "spread" subscriptions
Slide 58: Bandwidth Breakdown
[Chart: mean outgoing bandwidth (kbps) vs. number of nodes]
Slide 59: Bandwidth Breakdown
Observations:
1. Object discovery forms a significant part of the total bandwidth consumed
2. A range-queriable DHT scales better than a normal DHT (with linearized maps)
Slide 60: Goals and Challenges
1. Relieve the computational bottleneck. Challenge: partition code execution effectively
2. Relieve the bandwidth bottleneck. Challenge: minimize the bandwidth overhead due to object replication
3. Enable low-latency game-play. Challenge: replicas should be updated as quickly as possible
Slide 61: Key Design Elements
Primary-backup replication model with read-only replicas
Flexible object placement: allow objects to be placed on any node
Scalable object lookup: use structured overlays for discovering objects
Slide 62: View Consistency
Object discovery should succeed as quickly as possible: missing objects lead to an incorrectly rendered view
Challenges:
O(log n) hops for the structured overlay is not enough for fast games
Objects like missiles travel fast and are short-lived
Slide 63: Distributed Architectures: Motivation
Server farms? $$$: a significant barrier to entry
Motivating factors:
Most game publishers are small
Games grow old very quickly
What if you are ~1000 university students wanting to host and play a large game?
Slide 64: Colyseus Components
[Diagram: Object Store, Object Placement, Object Location, Replica Management, and Mercury across servers s1 and s2]
1. Specify predicted interests: (5 < x < 60 & 10 < y < 200), TTL 30 sec
2. Locate remote objects: P3 on s2, P4 on s2
3. Register replicas: R3 (to s2), R4 (to s2)
4. Synch replicas: R3, R4
5. Optimize placement: migrate P1 to server s2
Slide 65: Object Pre-fetching
On-demand object discovery can cause stalls or render an incorrect view
Use game physics for prediction:
Predict which areas objects will move to
Subscribe to object publications in those areas
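A minimal sketch of physics-based pre-fetching, with made-up lookahead and radius parameters: predict the player's position a short time ahead from its velocity and subscribe to a box around that predicted position, so the objects there are already replicated when the player arrives.

```cpp
#include <cstdio>

// Illustrative dead-reckoning prefetch: extrapolate the current velocity a
// fixed lookahead into the future and build a subscription box around the
// predicted position. The lookahead and box radius are assumptions.
struct Vec2 { float x, y; };

struct SubscriptionBox { float xlo, xhi, ylo, yhi; };

Vec2 predict(Vec2 pos, Vec2 vel, float lookahead_sec) {
    return {pos.x + vel.x * lookahead_sec, pos.y + vel.y * lookahead_sec};
}

SubscriptionBox prefetch_box(Vec2 pos, Vec2 vel) {
    const float lookahead = 0.5f;          // subscribe half a second ahead
    const float radius    = 50.0f;         // area-of-interest radius
    Vec2 p = predict(pos, vel, lookahead);
    return {p.x - radius, p.x + radius, p.y - radius, p.y + radius};
}

int main() {
    Vec2 player_pos = {100, 100};
    Vec2 player_vel = {200, 0};            // running along +x
    SubscriptionBox b = prefetch_box(player_pos, player_vel);
    // This box would be registered as a soft-state subscription for discovery.
    printf("prefetch subscription: x in [%.0f, %.0f], y in [%.0f, %.0f]\n",
           b.xlo, b.xhi, b.ylo, b.yhi);
    return 0;
}
```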
Slide 66: Pro-active Replication
Normal object discovery and replica instantiation are slow for short-lived objects
Piggyback object-creation messages on updates of other objects
Replicate a missile pro-actively wherever its creator is replicated
Slide 67: Soft-state Storage
Objects need to tailor their publication rate to their speed (ammo or health-packs don't move much)
Add TTLs to subscriptions and publications
Stored pubs act like triggers for incoming subs
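A minimal sketch of a soft-state publication store, with illustrative fields and units: each stored publication carries a TTL and is ignored once expired, and stored pubs are matched against subscriptions that arrive later.

```cpp
#include <cstdio>
#include <vector>

// Illustrative soft-state store: slow-moving objects publish rarely with long
// TTLs, fast ones publish often with short TTLs; expired pubs simply stop
// matching. Times are in seconds.
struct StoredPub {
    float x, y;
    int   object_id;
    float expires_at;          // publish time + TTL
};

struct PubStore {
    std::vector<StoredPub> pubs;

    void publish(float x, float y, int id, float ttl, float now) {
        pubs.push_back({x, y, id, now + ttl});
    }

    // A subscription over a box matches only publications that are still live;
    // stored pubs act like triggers for subscriptions that arrive later.
    void match(float xlo, float xhi, float ylo, float yhi, float now) const {
        for (const StoredPub& p : pubs)
            if (p.expires_at > now &&
                xlo <= p.x && p.x <= xhi && ylo <= p.y && p.y <= yhi)
                printf("subscription matched object %d\n", p.object_id);
    }
};

int main() {
    PubStore store;
    store.publish(10, 10, /*id=*/1, /*ttl=*/30.0f, /*now=*/0.0f);  // health pack
    store.publish(50, 50, /*id=*/2, /*ttl=*/1.0f,  /*now=*/0.0f);  // missile
    store.match(0, 100, 0, 100, /*now=*/5.0f);  // only the health pack is live
    return 0;
}
```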
Slide 68: Per-node Bandwidth Scaling
Observations:
1. Colyseus bandwidth costs scale well with the number of nodes
2. Feasible for P2P deployment (compare a single server or broadcast)
3. In aggregate, Colyseus bandwidth costs are 4-5 times higher, so there is overhead
Slide 69: View Inconsistency
Observations:
1. View inconsistency is small and gets repaired quickly
2. Missing objects are on the periphery
[Chart legend: no delay, 100 ms delay, 400 ms delay]
Slide 70: Cheating in Games
Examples of some cheats:
Information overexposure (maphack)
Get arbitrary health and weapons (god-mode)
Precise and automatic weapons (aimbot)
Event ordering: did I shoot you first or did you move first?
Exploiting bugs inside the game (duping)
Slide 71: Distributed Design Components
[Diagram: objects and replicas connected by object discovery and replica instantiation]