A Parallel Architecture for the Generalized Traveling Salesman Problem

A Parallel Architecture for the Generalized Traveling Salesman Problem Max Scharrenbroich AMSC 664 Final Presentation 05/12/2009 Advisor: Dr. Bruce L. Golden R. H. Smith School of Business

Presentation Overview • Background and Review • GTSP Review • Review of Project Objectives • Review of mrOX GA • Parallelism in the mrOX GA • Parallel Architecture • Final Results • Status Summary

Review of the GTSP • The Generalized Traveling Salesman Problem (GTSP): • Variation of the well-known traveling salesman problem. • A set of nodes is partitioned into a number of clusters. • Objective: Find a minimum-cost tour visiting exactly one node in each cluster. • Example on the following slides…

GTSP Example • Given a set of locations (nodes).

GTSP Example (continued) 2 • Partition the nodes into clusters. 3 6 1 4 5

GTSP Example (continued) 2 • Find the minimum tour visiting each cluster. 3 6 1 4 5 Algorithms >

Algorithms for the GTSP • Like the TSP, the GTSP is NP-hard. • There exist exact algorithms that rely on smart enumeration techniques: • Brand-and-cut (B&C) algorithm (M. Fischetti, 1997) • Provided exact solutions to reasonably sized GTSP problems (48 ≤ n ≤ 442 and 10 ≤ m ≤ 89 ). • For problems with larger than 90 clusters the run time of the B&C algorithm began approaching one day. Heuristic Algorithms >

Algorithms for the GTSP (continued) • Heuristic approaches to the GTSP: • Generalized Nearest Neighbor Heuristic (C.E. Noon, 1988) • Random-key Genetic Algorithm (L. Snyder and M. Daskin, 2006) • mrOX Genetic Algorithm (J. Silberholz and B.L. Golden, 2007)* Project Objectives >

Review of Project Objectives • Develop a generic software architecture and framework for parallelizing serial heuristics for combinatorial optimization (in a minimally invasive way). • Extend the framework to host the serial mrOX GA and the GTSP problem class. • Investigate the performance of the parallel implementation of the mrOX GA on large instances of the GTSP (> 90 clusters). Overview >

Presentation Overview • Background and Review • Review of mrOX GA • Parallelism in the mrOX GA • Parallel Architecture • Final Results • Status Summary

Review of mrOX GA Start Initialization Phase (Light-Weight): • Sequentially generates and evolves 7 isolated populations of 50 individuals each. • Only use the rOX for crossover. • Apply one cycle of 2-opt followed by 1-swap improvement heuristic only to new best solution. • 5% chance of mutation. • Terminate each population after no new best solution is found for 10 consecutive generations. Pop 1 Pop 2 … Pop 7 • Select the best 50 individuals from the union of the isolated populations to continue on. Merge Post-Merge Phase: • Use the fullmrOX for crossover. • Apply full cycles of 2-opt followed by 1-swap improvements until no improvements are found to: • Child solutions that are better than both parents. • Random 5% of population (preserve diversity). • 5% chance of mutation. • Terminate after no new best solution is found for 150 consecutive generations. Post-Merge End Overview >

Presentation Overview • Background and Review • Review of mrOX GA • Parallelism in the mrOX GA • Concurrent Exploration • Cellular GA Inspired Parallel Cooperation • Parallel Architecture • Final Results • Status Summary

Type 3 Parallelism in the mrOX GA start … • Genetic algorithms are amenable to parallelism via concurrent exploration. • Cooperation between processes (migration) can be implemented to ensure diversity while maintaining intensification. P1 P2 Pn start … P1 P2 Pn P1 P2 Pn Migration Period P1 P2 Pn end P1 P2 Pn end Parallel Cooperation w/ Mesh >

Parallel Cooperation with Mesh Topology • Inspired by cellular genetic algorithms (cGAs), where individuals in a population only interact with nearest neighbors. • Processes cooperate over a toroidal mesh topology. • Ensures diversity while maintaining intensification. Each process has four neighbors. Processes periodically exchange the best solutions with neighbors. High-quality solutions diffuse through the population. Overview >

Presentation Overview • Background and Review • Review of mrOX GA • Parallelism in the mrOX GA • Parallel Architecture • Overview of Parallel Architecture • Parallel mrOX GA • Final Results • Status Summary

Overview of Parallel Architecture • Processes are arranged as nodes in a grid topology. • Each node runs iterations of a serial stochastic search algorithm (e.g. mrOX GA). • A node updates its state after each iteration. • The parallel architecture uses the MPI Cartesian coordinate communication topology to send state updates from a node to its nearest neighbors. • All communications are handled off of the main processing thread to prevent blocking. • All state updates are handled periodically (e.g. 100 updates/sec). • Updates are only sent to neighbors if the node’s state has changed (limits unnecessary I/O). Parallel mrOX GA. >

Parallel mrOX GA • An MxN grid of processes is started with mpirun. • Random seeds are broadcast to each process. Start Pop 1 … • The initialization phase is the same as in the serial case. • There is no cooperation among processes in this phase. Pop 7 Merge • After each iteration of the mrOX GA, a node updates its state with the best solution found thus far. • Migrate: After K generations with no improvement a node incorporates its neighbor’s solutions if they are better than the worst in the population. Post-Merge • The parallel mrOX GA terminates when all processes have had no improvements after 150 generations. End Overview >

Presentation Overview • Background and Review • Review of mrOX GA • Parallelism in the mrOX GA • Parallel Architecture • Final Results • Performance of parallel mrOX GA • Status Summary

Migration Parameter in Parallel mrOX GA • Vary the migration parameter for three problem instances. Results averaged over 20 runs. Quad Core AMD Opteron (8360 SE), 2511 MHz, 512K Cache. Results averaged over 20 runs. Quad Core AMD Opteron (8360 SE), 2511 MHz, 512K Cache. Objective Value Performance >

Performance of Parallel mrOX GA • Objective value performance of cooperative vs. non-cooperative parallel mrOX GA on five problem instances. Speedup Performance >

Performance of Parallel mrOX GA • Speedup performance of cooperative vs. non-cooperative parallel mrOX GA on five problem. Vary Grid Size >

Performance of Parallel mrOX GA • Performance of parallel mrOX GA using different grid configurations. Speedup vs. Solution Quality >

Performance of Parallel mrOX GA • Tradeoff between speedup and solution quality of parallel mrOX GA by varying grid shape. Vary # Processors >

Performance of Parallel mrOX GA • Percentage of runs where optimum was found. Vary # Processors >

Performance of Parallel mrOX GA • Number of processors vs. solution quality of parallel mrOX GA (all grids are square except for 32 processor case). • Single processor case is equivalent to the serial version. Conclusions >

Conclusions • Cooperation among processes improves both the solution quality and convergence time of the parallel implementation. • Changing the grid shape affects the diffusion of solutions through the process population: • Thinner grids lead to slower convergence and improved solution quality. • Square grids give the fastest convergence but at the cost of solution quality. Status Summary >

Status Summary • October 16-30: Start design of the parallel architecture. • November: Finish design and start coding and testing of the parallel architecture. • December and January: Continue coding parallel architecture and extend the framework for the mrOX GA algorithm and the GTSP problem class. • February 1-15: Begin test and validation on multi-core PC. • February 16-March 1:Move testing to Deepthought cluster*. • March: Perform final testing on full data sets and collect results. • April-May:Generate parallel architecture API documentation**, write final report. * Was not pursued because of success on 32-core genome5. ** Future work. References >

References • Crainic, T.G. and Toulouse, M. Parallel Strategies for Meta-Heuristics. Fleet Management and Logistics, 205-251. • L. Davis. Applying Adaptive Algorithms to Epistatic Domains. Proceeding of the International Joint Conference on Artificial Intelligence, 162-164, 1985. • M. Fischetti, J.J. Salazar-Gonzalez, P. Toth. A branch-and-cut algorithm for the symmetric generalized traveling salesman problem. Operations Research 45 (3): 378–394, 1997. • G. Laporte, A. Asef-Vaziri, C. Sriskandarajah. Some Applications of the Generalized Traveling Salesman Problem. Journal of the Operational Research Society 47: 1461-1467, 1996. • C.E. Noon. The generalized traveling salesman problem. Ph. D. Dissertation, University of Michigan, 1988. • C.E. Noon. A Lagrangian based approach for the asymmetric generalized traveling salesman problem. Operations Research 39 (4): 623-632, 1990. • J.P. Saksena. Mathematical model of scheduling clients through welfare agencies. CORS Journal 8: 185-200, 1970. • J. Silberholz and B.L. Golden. The Generalized Traveling Salesman Problem: A New Genetic Algorithm Approach. Operations Research/Computer Science Interfaces Series 37: 165-181, 2007. • L. Snyder and M. Daskin. A random-key genetic algorithm for the generalized traveling salesman problem. European Journal of Operational Research 17 (1): 38-53, 2006.

Acknowledgements Chris Groer, University of Maryland William Mennell, University of Maryland John Silberholz, University of Maryland Aleksey Zimin, University of Maryland

A Parallel Architecture for the Generalized Traveling Salesman Problem