770 likes | 1.06k Views
Scalable and transparent parallelization of multiplayer games. Bogdan Simion MASc thesis Department of Electrical and Computer Engineering. Multiplayer games. Captivating, highly popular Dynamic artifacts. Multiplayer games. - More than 100k concurrent players. - Long playing times:.
E N D
Scalable and transparent parallelization of multiplayer games Bogdan Simion MASc thesis Department of Electrical and Computer Engineering
Multiplayer games • Captivating, highly popular • Dynamic artifacts
Multiplayer games - More than 100k concurrent players - Long playing times:
Multiplayer games - More than 100k concurrent players - Long playing times: 1.World of Warcraft: “I've been playing this Mage for 3 and a half years now, and I've invested too much time, blood, sweat and tears to quit now.”
Multiplayer games - More than 100k concurrent players - Long playing times: 1.World of Warcraft: “I've been playing this Mage for 3 and a half years now, and I've invested too much time, blood, sweat and tears to quit now.” 2.Halo2: “My longest playing streak was last summer, about 19 hours playing Halo2 on my XBox.”
Multiplayer games - More than 100k concurrent players - Long playing times: 1.World of Warcraft: “I've been playing this Mage for 3 and a half years now, and I've invested too much time, blood, sweat and tears to quit now.” 2.Halo2: “My longest playing streak was last summer, about 19 hours playing Halo2 on my XBox.” - Game server is the bottleneck
Server scaling • Game code parallelization is hard • Complex and highly dynamic code • Concurrency issues (data races) require conservative synchronization • Deadlocks
State-of-the-art • Parallel programming paradigms: • Lock-based (pthreads) • Transactional memory • Previous parallelizations of Quake • Lock-based [Abdelkhalek et. al ‘04] shows that false sharing is a challenge
Transactional Memory vs. Locks • Advantages • Simpler programming task • Transparently ensures correct execution • Shared data access tracking • Detects conflicts and aborts conflicting transactions • Disadvantages • Software (STM) access tracking overheads
Transactional Memory vs. Locks • Advantages • Simpler programming task • Transparently ensures correct execution • Shared data access tracking • Detects conflicts and aborts conflicting transactions • Disadvantages • Software (STM) access tracking overheads Never shown to be practical for real applications
Contributions • Case study of parallelization for games • synthetic version of Quake (SynQuake) • We compare 2 approaches: • lock-based and STM parallelizations • We showcase the first realistic application where STM outperforms locks
Outline • Application environment: SynQuake game • Data structures, server architecture • Parallelization issues • False sharing • Load balancing • Experimental results • Conclusions
Environment: SynQuake game • Simplified version of Quake • Entities: • players • resources (apples) • walls • Emulated quests
SynQuake Players can move and interact (eat, attack, flee, go to quest) Walls Immutable, limit movement Apples Food objects, increase life Contains all the features found in Quake
Game map representation • Fast retrieval of game objects • Spatial data structure: areanode tree
Areanode tree Root node Game map Areanode tree
Areanode tree A B Game map Areanode tree
A B Areanode tree A B Game map Areanode tree
A1 B1 A2 B2 A1 A2 B1 B2 Areanode tree A B Game map Areanode tree
Areanode tree A1 B1 A B A2 B2 A1 A2 B1 B2 Game map Areanode tree
Server frame Barrier 1 Barrier 2 Server frame Barrier 3 Barrier
Server frame Receive & Process Requests Client requests 1 2 Server frame 3
Server frame Receive & Process Requests Client requests 1 Admin (single thread) 2 Server frame 3
Server frame Receive & Process Requests Client requests 1 Admin (single thread) 2 Server frame Form & Send Replies Client updates 3
Parallelization in games • Quake - Locks-based synchronization [Abdelkhalek et al. 2004]
Parallelization: request processing Parallelization in this stage Receive & Process Requests Client requests 1 Admin (single thread) 2 Server frame Form & Send Replies Client updates 3
Outline • Application environment: SynQuake game • Parallelization issues • False sharing • Load balancing • Experimental results • Conclusions
Parallelization overview • Synchronization problems • Synchronization algorithms • Load balancing issues • Load balancing policies
Collision detection • Player actions: move, shoot etc. • Calculate action bounding box
Action bounding box Short- Range P1
Action bounding box Short- Range P1 P2 Long- Range
Action bounding box P3 P1 P2
Action bounding box P3 Overlap P1 P1 Overlap P2 P2
Player assignment Players handled by threads If players P1,P2,P3 are assigned to distinct threads → Synchronization required Long-range actions have a higher probability to cause conflicts P3 T3 T1 P1 T2 P2
False sharing Move range Shoot range
False sharing Move range Shoot range Action bounding box with locks
False sharing Move range Action bounding box with TM Shoot range Action bounding box with locks
Parallelization overview • Synchronization problems • Synchronization algorithms • Load balancing issues • Load balancing policies
Synchronization algorithm: Locks • Hold locks on parents as little as possible • Deadlock-free algorithm
Synchronization algorithm: Locks Root … P A B P3 P5 P3 A B P2 1 P1 P6 P1 2 B1 B2 A1 A2 P4 P5, P6 P2 P4
Synchronization algorithm: Locks Root … Area of interest P A B P3 P5 P3 A B P2 1 P1 P6 P1 2 B1 B2 A1 A2 P4 P5, P6 P2 P4
Synchronization algorithm: Locks Root … Area of interest P A B P3 P5 P3 A B P2 1 P1 P6 P1 2 B1 B2 A1 A2 P4 Leaves overlapped P5, P6 P2 P4
Synchronization algorithm: Locks Root Lock parents temporarily … Area of interest P A B P3 P5 P3 A B P2 1 P1 P6 P1 2 B1 B2 A1 A2 P4 Leaves overlapped P5, P6 P2 P4
Synchronization: Locks vs. STM Locks: 1. Determine overlapping leaves (L) 2. LOCK (L) 3. Process L 4. For each node P in overlapping parents LOCK(P) Process P UNLOCK(P) 5. UNLOCK (L) STM: 1. BEGIN_TRANSACTION 2. Determine overlapping leaves (L) 3. Process L 4. For each node in P in overlapping parents Process P 5. COMMIT_TRANSACTION
Synchronization: Locks vs. STM Locks: 1. Determine overlapping leaves (L) 2. LOCK (L) 3. Process L 4. For each node P in overlapping parents LOCK(P) Process P UNLOCK(P) 5. UNLOCK (L) STM: 1. BEGIN_TRANSACTION 2. Determine overlapping leaves (L) 3. Process L 4. For each node in P in overlapping parents Process P 5. COMMIT_TRANSACTION STM acquires ownership gradually, reduced false sharing Consistency ensured transparently by the STM
Parallelization overview • Synchronization problems • Synchronization algorithms • Load balancing issues • Load balancing policies
Load balancing issues Assign tasks to threads Balance workload P1 T1 T2 P2 P3 T3 T4 P4
Load balancing issues Move action Assign tasks to threads Cross-border conflicts → Synchronization P1 T1 T2 P2 Shoot action P3 T3 T4 P4
Load balancing goals • Tradeoff: • Balance workload among threads • Reduce synchronization/true sharing