# Integrating Advanced Algorithms into Undergraduate Computer Science Curriculum

Integrating Advanced Algorithms into Undergraduate Computer Science Curriculum. Yana Kortsarts, Widener University, Computer Science Department. Plan: Randomized Algorithms; Teaching the Power of Randomization Using a Simple Game.


Yana Kortsarts

Widener University

Computer Science Department

Plan
• Randomized Algorithms
• Teaching the Power of Randomization Using a Simple Game
Algorithm
• An algorithm is a sequence of instructions for solving a problem
• A deterministic algorithm runs the same way on the same input every time, so its behavior is predictable
• A randomized algorithm makes random choices during execution
Deterministic Algorithm

The same input → the same behavior → output

Randomized Algorithm

The same input → different behavior → output

Why Should We Teach Randomized Algorithms?
• Randomization is a general tool that applies in various computer science areas and not just a subject by itself.
• Significance: many of the breakthroughs in various algorithmic areas have used randomization.
• Example: Prime Number Test
• Simple polynomial one-sided error Monte Carlo algorithm – Rabin Algorithm (1980)
• A deterministic polynomial time algorithm was given by Agrawal, Kayal, and Saxena (2002).
• Performance: for many problems, randomized algorithms run faster than the best known deterministic algorithm
• Simplicity: many randomized algorithms are simpler to describe and implement than deterministic algorithms of comparable performance.
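To make the prime number example concrete, here is a minimal Python sketch of a one-sided error Monte Carlo primality test in the style usually called Miller-Rabin. It is an illustration under assumptions, not necessarily the exact formulation used in the talk: the function name `is_probable_prime`, the `rounds` parameter, and the small trial-division step are all choices made for this sketch.

```python
import random


def is_probable_prime(n, rounds=20, rng=random):
    """Monte Carlo primality test with one-sided error.

    False means n is certainly composite (or < 2).
    True means n is prime, or a composite that fooled every random base,
    which happens with probability at most 4**(-rounds).
    """
    if n < 2:
        return False
    # Quick exact answers for small primes and their multiples.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # random base: the only random choice
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue  # this base does not expose n as composite
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # the base a is a witness: n is composite
    return True
```

The error is one-sided because a prime is never rejected; only a composite can be misclassified, and repeating with fresh random bases shrinks that probability quickly.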
Challenges and Solutions
• The concept of a randomized algorithm can be difficult to understand.
• Usually, there is no separate course on Randomized Algorithms in undergraduate CS curriculum
• The idea of a randomized algorithm is clearer for students when presented as a game.
• Topic could be integrated into introductory courses
Algorithm as Part of the Game

Design of an Algorithm for a Combinatorial Problem as a Game
• Algorithm Player: designs the algorithm. Goal: minimize the running time of the algorithm.
• Input Player: selects the test input for the selected algorithm. Goal: maximize the running time of the algorithm.
Deterministic Algorithms
• Algorithm Player: deterministic strategy (deterministic algorithm). Reveals the entire strategy (algorithm) first.
• Input Player: the best strategy is finding the worst input for the algorithm produced by the Algorithm Player. The Input Player can pick the worst example for the suggested algorithm.
Deterministic Algorithms

The problem facing the algorithm player is that a deterministic strategy in a sense “moves first”, so the second (input) player can indeed pick the worst example for the suggested algorithm.

Randomized Algorithms
• Algorithm Player: randomized strategy (randomized algorithm).
• A randomized algorithm can be seen as a distribution over all possible deterministic algorithms.
• The Algorithm Player doesn’t reveal its cards fully in advance: it tells the second player the probability with which it selects each of the possible deterministic algorithms.
• The coins have not fallen yet, and the game only begins after the Input Player chooses.
• Input Player: the worst input for a randomized algorithm has to be an input that is bad for several algorithms simultaneously.
Game Description
• Player 1: decides on an integer $x > 0$
• Player 2: has to find a number $y_n$ such that $y_n \ge x$
• Rules:
• Player 2 guesses an increasing sequence: $y_1 < y_2 < \dots < y_n$
• $y_1 < y_2 < \dots < y_{n-1} < x$ and $y_n \ge x$
• On a guess $y_j$, Player 1 either says:
• smaller than $x$, please provide a next guess
• larger than or equal to $x$, and reveals $x$, stopping the game
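Optimization Criteria
• Let the guesses be $y_1, y_2, \dots, y_n$, so that $y_n \ge x$ and $y_j < x$ for all $j \le n - 1$
• The optimization criterion is the performance ratio, the sum of all guesses divided by $x$: $(y_1 + y_2 + \dots + y_n) / x$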
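The rules and the performance ratio can be sketched in a few lines of Python. This is an illustrative helper, not code from the talk: the name `performance_ratio` and the choice to return an exact `Fraction` are assumptions made here, and the ratio is taken to be the sum of all guesses divided by x, which is what the worked examples on the slides compute.

```python
from fractions import Fraction


def performance_ratio(x, guesses):
    """Play the guessing game against the hidden integer x.

    `guesses` is Player 2's strictly increasing sequence. The game stops at
    the first guess >= x. Returns (guesses actually used, sum of them / x).
    """
    if x <= 0:
        raise ValueError("x must be a positive integer")
    used = []
    for y in guesses:
        if used and y <= used[-1]:
            raise ValueError("guesses must be strictly increasing")
        used.append(y)
        if y >= x:  # Player 1 says STOP and reveals x
            return used, Fraction(sum(used), x)
    raise ValueError("the guesses ended before reaching x")
```

For instance, `performance_ratio(37, [1, 3, 10, 28, 76])` returns the five guesses together with the ratio 118/37.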

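EXAMPLE
• Player 1: x is chosen, please provide a guess
• Player 2 guesses 1. Player 1: smaller than x, next guess
• Player 2 guesses 3. Player 1: smaller than x, next guess
• Player 2 guesses 10. Player 1: smaller than x, next guess
• Player 2 guesses 28. Player 1: smaller than x, next guess
• Player 2 guesses 76. Player 1: STOP! x = 37
• Performance ratio: (1 + 3 + 10 + 28 + 76) / 37 = 118 / 37 = 3.189189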

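EXAMPLE
• Player 1: x is chosen, please provide a guess
• Player 2 guesses 1. Player 1: smaller than x, next guess
• Player 2 guesses 2. Player 1: smaller than x, next guess
• Player 2 guesses 3. Player 1: smaller than x, next guess
• Player 2 guesses 4. Player 1: smaller than x, next guess
• Player 2 guesses 23. Player 1: STOP! x = 5
• Performance ratio: (1 + 2 + 3 + 4 + 23) / 5 = 6.6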

Teaching the Game
• Discussion of the Optimization Criteria selection:
• Why not choose $y_i / x$, where $y_i \ge x$?
• Answer: under that criterion the simple strategy 1, 2, 3, … is optimal, because it always stops exactly at $x$
• Discussion of possible strategies:
• Why do we not benefit from increasing the next guess only a little compared to the previous guess?
• What is the disadvantage of making the next guess, say, 100 times larger than previous guess?
The Powers of 2 Strategy for the Second Player
• It turns out that the simple strategy that selects powers of 2, $y_0 = 1, y_1 = 2, y_2 = 4, \dots, y_i = 2^i, \dots$, is an optimal deterministic strategy for this game
• The worst case for the strategy is when the number selected by the first player is $x = 2^j + 1$
• In this case the game is played until the second player suggests $2^{j+1}$
EXAMPLE
• x = 13
• guesses: 1, 2, 4, 8, 16
• sum: 1+2+4+8+16 = 31
• performance ratio is 2.384615
• x = 65
• guesses: 1, 2, 4, 8, 16, 32, 64, 128
• sum: 1+2+4+8+16+32+64+128 = 255
• performance ratio is 3.923
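The two examples above can be reproduced with a short Python sketch. The function name `powers_of_two_game` is hypothetical, and the ratio is assumed to be the sum of all guesses divided by x, as in the examples.

```python
from fractions import Fraction
from itertools import count


def powers_of_two_game(x):
    """Guess 1, 2, 4, 8, ... until a guess is >= x.

    Returns the list of guesses and the exact performance ratio sum / x.
    """
    guesses = []
    for i in count():
        guesses.append(2 ** i)
        if guesses[-1] >= x:
            break
    return guesses, Fraction(sum(guesses), x)


for x in (13, 65):
    guesses, ratio = powers_of_two_game(x)
    print(x, guesses, float(ratio))
```

Running it for x = 13 and x = 65 gives the sums 31 and 255 shown in the example.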
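The Powers of 2 Strategy Analysis
• For $2^{i-1} < x \le 2^i$, the strategy $y_0 = 1, y_1 = 2, y_2 = 4, \dots, y_i = 2^i$ gives the following performance ratio: $(1 + 2 + \dots + 2^i) / x = (2^{i+1} - 1) / x$
• Worst case: $x = 2^j + 1$, the performance ratio is $(2^{j+2} - 1) / (2^j + 1)$, which is below 4 and approaches 4 as $j$ grows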
Teaching the Game
• Encourage students to find by themselves the worst case for the powers of 2 strategy.
• This example serves well in illustrating the strict notion of the worst case input.
• The bad instance for the powers of 2 strategy is a very specific and rare number: one more than a power of 2, such as 17 in the sequence 1, 2, 4, 8, 16, 17, 32
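Students can also search for the worst case by brute force before proving anything. The sketch below is one possible classroom aid, with hypothetical helper names; it assumes the performance ratio is the sum of the guesses divided by x and uses exact fractions so that ties are not blurred by rounding.

```python
from fractions import Fraction


def powers_of_two_ratio(x):
    """Exact performance ratio of the powers of 2 strategy on input x."""
    y, total = 1, 1
    while y < x:  # keep doubling until the guess reaches x
        y *= 2
        total += y
    return Fraction(total, x)


def worst_input(limit):
    """Input in 1..limit with the largest ratio (smallest such x on ties)."""
    return max(range(1, limit + 1), key=powers_of_two_ratio)


for limit in (16, 40, 100, 1000):
    x = worst_input(limit)
    print(limit, x, float(powers_of_two_ratio(x)))
```

Printing the worst input for a few limits lets students notice the pattern in the results for themselves.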
Teaching the Game
• If x is some random number, the powers of 2 strategy performs much better.
• A good place to discuss the difference between random strategy and random inputs.
• The input is sometimes not within our control, while the randomized algorithm is within our control as the designers of the algorithms.
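The gap between random inputs and worst-case inputs can be shown with a small experiment. This is a sketch under assumptions: x is drawn uniformly from 1..limit, which is only one possible meaning of "random input", and the function names and the fixed seed are choices made here for repeatability.

```python
import random


def powers_of_two_ratio(x):
    """Performance ratio (sum of guesses / x) of the powers of 2 strategy."""
    y, total = 1, 1
    while y < x:
        y *= 2
        total += y
    return total / x


def average_ratio(limit, samples, seed=0):
    """Average ratio when x is drawn uniformly at random from 1..limit."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    draws = (powers_of_two_ratio(rng.randint(1, limit)) for _ in range(samples))
    return sum(draws) / samples


limit = 2 ** 12
print("average over random inputs:", average_ratio(limit, 20000))
print("adversarial input 2**11 + 1:", powers_of_two_ratio(2 ** 11 + 1))
```

Here the randomness is in the input, not in the algorithm: the strategy itself is still fully deterministic, so an adversary can still force the bad case.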
A Deterministic Worst Case Lower Bound
• Let $\varepsilon > 0$ be a constant as small as desired.
• We show that any deterministic strategy has examples with performance ratio at least $4 - \varepsilon$
• Hence the powers of 2 strategy is an optimal deterministic strategy.
A Randomized Strategy

The following simple randomized strategy gives an improved expected value

• Let $\alpha \in_R [0, 1)$ be chosen uniformly at random from the interval $[0, 1)$
• Define $y_j = \exp(j + \alpha)$
• Let $i$ be such that $\exp(i - 1 + \alpha) < x \le \exp(i + \alpha)$
• The expected performance ratio is less than $e \approx 2.718$
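One way to see where the expected ratio comes from is the short calculation below. It is a sketch, not necessarily the analysis given in the cited paper, and it rests on two assumptions: the random offset is written $\alpha$, and the guesses are the real numbers $\exp(j + \alpha)$ for $j = 0, 1, \dots, i$ with no rounding. The auxiliary variable $t$ is introduced here only for the derivation.

```latex
% Position of x inside its interval: t = i + \alpha - \ln x, so 0 \le t < 1.
% Because \alpha is uniform on [0,1), t = (\alpha - \ln x) \bmod 1 is uniform on [0,1).
\[
  x = e^{\,i + \alpha - t}, \qquad
  \sum_{j=0}^{i} e^{\,j+\alpha}
    = e^{\alpha}\,\frac{e^{\,i+1} - 1}{e - 1}
    < \frac{e}{e-1}\, e^{\,i+\alpha}
\]
\[
  \frac{1}{x}\sum_{j=0}^{i} e^{\,j+\alpha} < \frac{e}{e-1}\, e^{\,t},
  \qquad
  \mathbb{E}\!\left[\frac{1}{x}\sum_{j=0}^{i} e^{\,j+\alpha}\right]
    < \frac{e}{e-1}\int_{0}^{1} e^{\,t}\, dt
    = \frac{e}{e-1}\,(e - 1) = e \approx 2.718
\]
```

Under these assumptions the bound holds for every x, so the input player has no single input that pushes the expected ratio toward 4.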
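EXAMPLE (guesses rounded down to integers)
• Player 1: x is chosen, please provide a guess
• Player 2 draws $\alpha = 0.419$
• Player 2 guesses $\lfloor \exp(\alpha) \rfloor = 1$. Player 1: smaller than x, next guess
• Player 2 guesses $\lfloor \exp(\alpha + 1) \rfloor = 4$. Player 1: smaller than x, next guess
• Player 2 guesses $\lfloor \exp(\alpha + 2) \rfloor = 11$. Player 1: smaller than x, next guess
• Player 2 guesses $\lfloor \exp(\alpha + 3) \rfloor = 30$. Player 1: smaller than x, next guess
• Player 2 guesses $\lfloor \exp(\alpha + 4) \rfloor = 83$. Player 1: STOP! x = 48
• Performance ratio: (1 + 4 + 11 + 30 + 83) / 48 = 129 / 48 = 2.6875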

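EXAMPLE (guesses rounded down to integers)
• Player 1: x is chosen, please provide a guess
• Player 2 draws $\alpha = 0.866$
• Player 2 guesses $\lfloor \exp(\alpha) \rfloor = 2$. Player 1: smaller than x, next guess
• Player 2 guesses $\lfloor \exp(\alpha + 1) \rfloor = 6$. Player 1: smaller than x, next guess
• Player 2 guesses $\lfloor \exp(\alpha + 2) \rfloor = 17$. Player 1: smaller than x, next guess
• Player 2 guesses $\lfloor \exp(\alpha + 3) \rfloor = 47$. Player 1: smaller than x, next guess
• Player 2 guesses $\lfloor \exp(\alpha + 4) \rfloor = 129$. Player 1: STOP! x = 63
• Performance ratio: (2 + 6 + 17 + 47 + 129) / 63 = 201 / 63 = 3.190476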

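EXAMPLE (guesses rounded down to integers)
• Player 1: x is chosen, please provide a guess
• Player 2 draws $\alpha = 0.195$
• Player 2 guesses $\lfloor \exp(\alpha) \rfloor = 1$. Player 1: smaller than x, next guess
• Player 2 guesses $\lfloor \exp(\alpha + 1) \rfloor = 3$. Player 1: smaller than x, next guess
• Player 2 guesses $\lfloor \exp(\alpha + 2) \rfloor = 8$. Player 1: STOP! x = 6
• Performance ratio: (1 + 3 + 8) / 6 = 12 / 6 = 2.0000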
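The randomized strategy and its examples can be replayed in Python. This sketch makes a few assumptions that are not stated on the slides: the random offset is called `alpha`, each guess is rounded down to an integer (which matches the numbers in the examples), and the function names and the seeded Monte Carlo estimate are illustrative choices.

```python
import math
import random


def randomized_game(x, alpha):
    """Guess floor(exp(j + alpha)) for j = 0, 1, 2, ... until a guess is >= x.

    Returns the list of guesses and the performance ratio sum(guesses) / x.
    """
    guesses = []
    j = 0
    while True:
        y = math.floor(math.exp(j + alpha))
        guesses.append(y)
        if y >= x:
            return guesses, sum(guesses) / x
        j += 1


def estimate_expected_ratio(x, samples=20000, seed=0):
    """Monte Carlo estimate of the expected ratio for one fixed input x."""
    rng = random.Random(seed)  # seeded so the estimate is repeatable
    total = sum(randomized_game(x, rng.random())[1] for _ in range(samples))
    return total / samples


print(randomized_game(48, 0.419))
print(randomized_game(63, 0.866))
print(randomized_game(6, 0.195))
print(estimate_expected_ratio(1000))
```

Trying `estimate_expected_ratio` on inputs such as 2**j + 1, the bad cases for the powers of 2 strategy, shows that no single input is consistently bad once the offset is random.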

A Deterministic Worst Case Lower Bound: Intuitive Explanation
• The second player cannot choose $y_{i+1}$ to be “too large” compared to $y_i$: for $x = y_i + 1 \ll y_{i+1}$, the ratio $(y_{i+1} + y_i) / x$ is already a large number.
• Suppose instead that the second player always selects $y_{i+1}$ to be not much larger than $y_i$. If the second player makes such choices many times, then for some $y_j$ we get $y_1 + y_2 + \dots + y_{j-1} \gg y_j$.
• The choice $x = y_j$ is then bad for the second player, since $(y_1 + y_2 + \dots + y_j) / y_j$ is large.
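The tradeoff in this intuition can be made numerical for one family of strategies. The sketch below only looks at geometric strategies with guesses r**0, r**1, r**2, ... for a growth factor r > 1, treats the guesses as real numbers, and takes x just above r**j so that the ratio approaches (1 + r + ... + r**(j+1)) / r**j. It illustrates the two failure modes and is not a proof of the lower bound for arbitrary strategies; the function name and the default j are assumptions.

```python
def limiting_worst_ratio(r, j=60):
    """Worst-case ratio of the geometric strategy r**0, r**1, ... in the limit.

    The adversary picks x just above r**j, so the game runs through the guess
    r**(j + 1). For large j this tends to the closed form r**2 / (r - 1).
    """
    total = sum(r ** k for k in range(j + 2))  # guesses r**0 .. r**(j+1)
    return total / r ** j


for r in (1.1, 1.5, 2, 3, 10, 100):
    print(r, round(limiting_worst_ratio(r), 3), round(r * r / (r - 1), 3))
```

Small growth factors lose because the accumulated earlier guesses dominate, large ones lose because the final overshoot dominates, and among the factors tried here the ratio is smallest at r = 2.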

Summary
• Simple game illustrating the power of randomization.
• The full analysis of the game is presented in “Teaching the Power of Randomization Using Simple Game”, SIGCSE 2006 (Y. Kortsarts, J. Rufinus)
• Teaching the game:
• Introduction to Computer Science I and II
• Design and Analysis of Algorithms
Summary
• The game is well-motivated from the point of view of modern scheduling research
• Even though this specific game seems not to have been studied before, the techniques illustrated here have been used in a series of papers on approximating scheduling problems [11, 7, 8, 4]. These papers study the fast scheduling of conflicting jobs with the goal of minimizing the sum of finish times of these jobs. Hence, the suggested game is at the heart of modern research.
Advanced Algorithms in Introductory CS Curriculum
• Las Vegas - always gives the correct solution.
• Monte Carlo - may sometimes produce an incorrect solution
• How (and why) to Introduce Monte Carlo Randomized Algorithms Into a Basic Algorithms Course?, Y. Kortsarts, J. Rufinus, Journal of Computing Sciences in Colleges, December 2005
• Integrating a real-world scheduling problem into the basic algorithms course, Yana Kortsarts, Journal of Computing Sciences in Colleges, June 2007
Advanced Algorithms in Introductory CS Curriculum
• Merkle-Hellman Knapsack Cryptosystem [31]
• Elegant and beautiful underlying mathematics
• Due to its simple structure, the knapsack cryptosystem is an ideal model for introducing algorithmic techniques and the concept of a public key cryptosystem to computer science students
• Sequence Alignment [32, 33]
• Needleman and Wunsch Algorithm (Global Alignment)
• Smith-Waterman Algorithm (Local Alignment)

Knapsack Cryptosystem in Computer Science Curriculum

Courses: Cryptology; Design and Analysis of Algorithms; Introduction to Computer Science

• Concept of Public Key Cryptosystem
• Knapsack Problem
• Subset-Sum Problem
• Algorithmic Techniques
• Concept of Public Key Cryptosystem
• Computational Problems:
• Prime Numbers
• GCD, Euclidean Algorithm
• Modular Exponentiation
• Primitive Roots for Primes
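The topics listed above meet in a toy version of the knapsack cryptosystem. The sketch below is a classroom-sized illustration with hypothetical function names and tiny parameters; it is not taken from the talk and is not a secure implementation (the basic Merkle-Hellman scheme is known to be broken). It assumes a superincreasing private sequence `w`, a modulus `q` larger than the sum of `w`, and a multiplier `r` coprime to `q`.

```python
from math import gcd


def make_public_key(w, q, r):
    """Derive the public knapsack from the private key (w, q, r)."""
    total = 0
    for wi in w:
        if wi <= total:
            raise ValueError("w must be superincreasing")
        total += wi
    if q <= total or gcd(r, q) != 1:
        raise ValueError("need q > sum(w) and gcd(r, q) == 1")
    return [(r * wi) % q for wi in w]


def encrypt(bits, public):
    """Ciphertext is the subset sum of public weights selected by the bits."""
    return sum(b * p for b, p in zip(bits, public))


def decrypt(c, w, q, r):
    """Undo the multiplier, then solve the easy superincreasing subset sum."""
    target = (c * pow(r, -1, q)) % q  # modular inverse; needs Python 3.8+
    bits = []
    for wi in reversed(w):  # greedy from the largest weight down
        if wi <= target:
            bits.append(1)
            target -= wi
        else:
            bits.append(0)
    return bits[::-1]


w, q, r = [2, 7, 11, 21, 42, 89, 180, 354], 881, 588
public = make_public_key(w, q, r)
message = [0, 1, 1, 0, 0, 0, 0, 1]
cipher = encrypt(message, public)
print(public, cipher, decrypt(cipher, w, q, r))
```

The example touches the subset-sum problem (hard in general, easy for superincreasing weights), the GCD check, and modular inverses in a few lines.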

Sequence Alignment
• Global Alignment: compare two sequences in their entirety; the gap penalty is assessed regardless of whether gaps are located internally within a sequence, or at the end of one or both sequences.
• The Needleman and Wunsch Algorithm.
• Local Alignment: find best matching subsequences within the two search sequences.
• The Smith-Waterman Algorithm.
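The difference between global and local alignment shows up clearly in the dynamic programming recurrences. The following sketch computes only the alignment scores, not the alignments themselves, and uses an assumed simple scoring scheme (match +1, mismatch -1, linear gap penalty -1) that is chosen here for illustration; the function names are hypothetical.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch style score: both sequences aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]  # leading gaps are penalized
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman style score: best matching pair of substrings."""
    best = 0
    prev = [0] * (len(b) + 1)  # an alignment may start anywhere at no cost
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


print(global_score("GATTACA", "GATACA"))
print(local_score("XXGATTACAYY", "ZZGATTACAWW"))
```

The only changes from global to local are the zero floor in the recurrence, the zero-initialized borders, and taking the maximum over the whole table instead of the final cell.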
REFERENCES

[1] S. Arora, C. Lund, R. Motwani, M. Sudan and M. Szegedy. Proof verification and the hardness of approximation problems. Journal of the ACM, 45(3):501-555, 1998.

[2] G. J. Brebner and L. G. Valiant, Universal schemes for parallel communication. Proceedings of the thirteenth annual ACM symposium on Theory of computing, Pages: 263 - 277, 1981

[3] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to algorithms. The MIT Press, 2nd edition, 2001.

[4] S. Chakrabarti, C. A. Phillips, A. S. Schulz, D. B. Shmoys, C. Stein and J. Wein. Improved scheduling algorithms for minsum criteria. ICALP '96, 875-886.

[5] A. Fiat, R. M. Karp, M. Luby, L. A. McGeoch, D. D. Sleator, and N. E. Young. Competitive paging algorithms. Journal of Algorithms, 12(4):685-699, 1991.

[6] O. Goldreich, S. Micali, and A. Wigderson. Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proof systems. Journal of the ACM, 38(3):690 - 728, 1991

[7] L. A. Hall, D. B. Shmoys, and J. Wein. Scheduling to minimize average completion time: Off-line and on-line algorithms. SODA '96, 142-151, Jan 1996.

[8] L. A. Hall, A. Schulz, D. B. Shmoys, and J. Wein. Scheduling to minimize average completion time: Off-line and on-line approximation algorithms. Math. Operations Research 22:513-544, 1997.

[9] G. Kalai, A subexponential randomized simplex algorithm, Proceedings of the twenty-fourth annual ACM symposium on Theory of computing, 475 - 482, 1992

[10] R. M. Karp, E. Upfal and A. Wigderson. Constructing a perfect matching is in random NC. Combinatorica, 6(1):35-48, 1986.

[11] M. Queyranne, M. Sviridenko. Approximation algorithms for shop scheduling problems with minsum objective. J. Scheduling 5:287-305, 2002.

[12] R. L. Rivest, A. Shamir, L. M. Adleman, A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Commun. ACM 21(2):120-126, 1978

[13] N. Alon, R. Yuster and U. Zwick. Color-coding. Journal of the ACM, 42(4):844-856, 1995.

[14] A. Bjorklund, T. Husfeldt and S. Khanna. Approximating Longest Directed Path. Symposium on Automata, Languages and Programming (ICALP) 2004, to appear.

[15] D. Dor, U. Zwick, Selecting the Median, SIAM J. Comput, 28(5): 1722-1758, 1999.

[16] D. Dor and U. Zwick, Median Selection Requires (2+epsilon)n Comparisons, SIAM Journal on Discrete Mathematics, 14(3):312-325

[17] R. W. Floyd and R. L. Rivest. Expected time bounds for selection. Communications of the ACM, 18(3):165-172, 1975.

[18] T. Feder, R. Motwani, C. Subi. Finding long paths and cycles in sparse Hamiltonian graphs. Proceedings of the ACM symposium on Theory of computing, pages 524-529, 1999.

[19] H. Gabow. Finding paths and cycles of superpolylogarithmic size. Proceedings of the ACM symposium on Theory of computing, pages 407-416, 2004.

[20] M. T. Goodrich and R. Tamassia. Using randomization in the teaching of data structures and algorithms, The proceedings of the thirtieth SIGCSE technical symposium on Computer science education, 53 - 57, 1999

[21] D. Karger, R. Motwani, and G.D.S. Ramkumar. On Approximating the Longest Path in a Graph. Algorithmica 18 (1997): 82-98.

[22] R. M. Karp. Reducibility among combinatorial problems, R. E. Miller and J. W. Thatcher, eds., Complexity of Computer Computations, Plenum Press, New York, 1972, pp. 85-103.

[23] M. O. Rabin. Probabilistic algorithm for testing primality. J. Number Theory, 12, 128-138, 1980.

[24] N. Robertson and P. Seymour. Graph minors. II. Algorithmic aspects of tree-width. J. Algorithms 7, 1986.

[25] R. Motwani and P. Raghavan, Randomized Algorithms, Cambridge University Press, 1995

[26] R.M. Karp, An Introduction to randomized algorithms, Discrete Applied Mathematics, 34: 165-201, 1991

[27] D.R.Karger, Global min-cuts in RNC, and other ramifications of a simple min-cut algorithm, In Proceedings of the 4th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 21- 30, 1993.

[28] M. J. Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill, 2004

[29] Y. Kortsarts, J. Rufinus, Teaching the Power of Randomization Using Simple Game, SIGCSE 2006

[30] Y. Kortsarts, J. Rufinus, How (and why) to Introduce Monte Carlo Randomized Algorithms Into a Basic Algorithms Course?, Journal of Computing Sciences in Colleges, 2005

[31] R. C. Merkle, M. E. Hellman, Hiding Information and Signatures in Trapdoor Knapsacks, IEEE Transactions on Information Theory, vol. IT-24, 1978, pp. 525-530.

[32] N. C. Jones and P. A. Pevzner, An Introduction to Bioinformatics Algorithms, The MIT Press, 2004

[33] D. E. Krane and M. L. Raymer, Fundamental Concepts of Bioinformatics, Benjamin Cummings, 2002