Computational Number Theory

October 31, 2009 San Diego Math Circle Jack Brennen Computational Number Theory

Computational? What's that? • Computational Number Theory deals primarily with algorithms • Combines elements of traditional number theory (mathematics) with the study and development of algorithms (computer science)‏ • Usually working with “big” numbers • Active area for research, implementation, and unsolved problems

Overview • Computer software (PARI/GP)‏ • Basic algorithms (greatest common divisor, modular inverse, exponentiation, modular square roots, etc.)‏ • Distinguishing primes from composites (sometimes easy, sometimes hard)‏ • Factoring a number into prime factors (again, sometimes easy, sometimes hard)‏ • Discrete logarithm • Elliptic curves

PARI/GP • A powerful free software package for computational number theory • Originally developed under Dr. Henri Cohen at the University of Bordeaux • Downloadable from: http://pari.math.u-bordeaux.fr/

Features of PARI/GP • Deals with arbitrarily large numbers • Built-in data types for rational numbers, complex numbers, vectors, matrices, polynomials, etc. • Built-in functions for number theory and other branches of mathematics • Can be used as a library from C/C++, or comes with the GP front end – a powerful interactive calculator with hundreds of functions available

Euclid's Algorithm • Oldest result in computational number theory, maybe? • Euclid described how to determine the greatest common divisor of two natural numbers • Does not require factoring • Number of steps required is bounded and depends only on the size of the numbers

Euclid's Algorithm in GP • Assume both arguments are natural numbers and that a >= b • mygcd(a,b) ={ local(r); r=(a%b); return(if(r==0,b,mygcd(b,r)));}

GCD of “big” integers • Euclid's algorithm slows down when the numbers get really big. Why? • Computer age brings on new optimization • Binary GCD algorithm • Similar to Euclid's algorithm, but only requires comparisons, subtractions, and multiplying and dividing by powers of 2

Modular Inverse with Euclid • Euclid's algorithm extends easily for modular inverse (given A and M, find B such that A*B == 1 modulo M)‏ • Find inverse of 73 modulo 100 • (0,100) -> (1,73) -> (-1,27) -> (3,19) -> (-4,8) -> (11,3) -> (-26,2) -> (37,1)‏ • Tells us that 37*73 == 1 modulo 100 • 37 is the inverse of 73; 37*73 == 2701

Exponentiation • How would you compute x600? • Start with x, multiply by x 599 times? Works, but painfully slow. • Binary method: Write exponent in binary, use the binary representation to determine the operations • 600 in binary: 1001011000 • x600 = x512 * x64 * x16 * x8 • All of the smaller terms are computed on the way to getting the biggest term. • Requires 9 multiplications (squarings) to get to x512, then 3 more multiplications to get the final answer, for a total of 12 multiplications

Exponentiation • Can we compute x600 in fewer than 12 multiplications? • Note that 600 == 2 * 2 * 2 * 3 * 5 * 5 • We can square with one multiplication, cube with two multiplications, and take a fifth power with three multiplications • 1 + 1 + 1 + 2 + 3 + 3 is 11. So we can actually compute x600 in 11 multiplications. • This is called the factor method

Exponentiation • These methods of powering are a special case of the general problem called “addition chains.” An addition chain begins with the number 1; each subsequent term is a sum of two numbers already existing in the chain. • Addition chains for 600 that we've discussed: • 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 576, 592, 600 • 1, 2, 4, 8, 16, 24, 48, 96, 120, 240, 480, 600 • The problem of finding an optimal addition chain is hard in general • Binary method is never terrible, but can very often be improved upon • Factor method is sometimes an improvement, but not always

Addition chains • First target where binary addition chain is not optimal is 15. Factor method beats binary method. • First target where an “ad hoc” method beats binary or factor methods is 23. Binary method uses 7 steps: 1, 2, 4, 8, 16, 20, 22, 23. • Ad hoc method, 6 steps: 1, 2, 3, 5, 10, 13, 23. • Finding these optimal ad hoc methods generally requires brute force search, and is only worth it if you are going to raise numbers to a constant power on a regular basis, or for small exponents where you can keep a table of optimal addition chains

Prime or composite? • The problem of determining whether a large integer is prime or composite is one of the oldest and most studied subjects in computational number theory • Fermat's Little Theorem is fundamental to the question; it can be used to easily prove compositeness

Fermat's Little Theorem • Simply stated, if p is prime, then (ap-a) is divisible by p for any integer a. • Can be used to prove compositeness; if(ap-a) is not divisible by p, then p is composite. • Has been proven in dozens of ways; like Pythagorean Theorem, mathematicians seem to like finding new proofs.

Proof of Fermat's Little Theorem • Induction proof, assume that p is given • Note that (ap-a) is divisible by p when a = 1. • Show that if (ap-a) is divisible by p, then so is ((a+1)p-(a+1))‏ • Note that the binomial expansion of (a+1)p contains only two terms which don't contain a factor of p in the coefficient – the first and the last • Write (a+1)p as (ap+1+p*C); then((a+1)p-(a+1)) becomes (ap-a+p*C)‏ • Since (ap-a) is divisible by p, so is (ap-a+p*C), thus ((a+1)p-(a+1)) is divisible by p

Compositeness Test • Example of usage. Prove that 9 is composite without showing its factors... • Compute (29-2) == 512-2 == 510. 510 is not divisible by 9, so 9 is not prime. • Note that proving compositeness using Fermat's Little Theorem neither requires factors, nor yields factors.

Fermat's Little Theorem can't prove primality • Some composite numbers “fool” Fermat's Little Theorem. • Such numbers are called pseudo-primes • For a=2, the smallest such pseudo-prime is the number 341. • 341 is composite, but (2341-2) is also divisible by 341. • Notice why efficient exponentiation is so important in computational number theory?

Exponentiation simplified • Note that 2341 is a number with 103 decimal digits. But do we need to compute a 103 digit number? • No, because we only care about the remainder of 2341 when divided by 341; thus, we can do the exponentiation modulo 341. • Actually well within the realm of being done by pencil and paper

Proof that 341 is pseudo-prime • 22 == 4 (mod 341)‏ • 23 == 8 (mod 341)‏ • 25 == 32 (mod 341)‏ • 27 == 128 (mod 341)‏ • 214 == 128*128 == 16 (mod 341)‏ • 221 == 128*16 == 2 (mod 341)‏ • 242 == 2*2 == 4 (mod 341)‏ • 284 == 4*4 == 16 (mod 341)‏ • 2168 == 16*16 == 256 (mod 341)‏ • 2173 == 256*32 == 8 (mod 341)‏ • 2341 == 256*8 == 2 (mod 341)‏ • So we see that (2341-2) is divisible by 341

How to prove a pseudo-prime composite? • So 341 is a pseudo-prime (technically, an Euler pseudo-prime to base 2)‏ • How to prove it composite, short of factoring it? • Use a different base. In this case, we can choose the base 3, and note that (3341-3) is not divisible by 341. So we were just “unlucky” to choose base 2 as our first base. • Most compositeness tests based on Fermat's Little Theorem are probabilistic tests. Given a composite number, they “usually” prove its compositeness, but not always.

How to improve the test? • Most common compositeness test in widespread usage is the Rabin-Miller test. • Very similar to the test we just did, with one major improvement. • First, note that if (ap-a) is divisible by p, and if a is coprime to p, then we can describe the test as: a(p-1) == 1 (modulo p). • Second, assuming that we are only testing odd values for compositeness (testing even numbers is trivial), then the exponent (p-1) will always have one or more factors of 2, so we strip those factors out for our initial test. • Third, note that if we ever find that x2 == 1 (modulo M), and that x is neither equal to 1 or -1 (modulo M), that constitutes a proof that M is composite.

Rabin-Miller Test on 341 • To run the Rabin-Miller Test on the number 341 with base 2, we first note that if 341 is prime, then 2340 == 1 (modulo 341). • Next, note that we can compute 285 (modulo 341), then square it twice to get 2340. • So we compute 285 (modulo 341) == 32. So far so good, 341 could be prime. • Next, square it: 2170 (modulo 341) == (32*32) modulo 341 == 1. Thus, 341 is not prime, because we found a non-trivial solution to x2 == 1 (modulo 341), specifically x == 32.

Properties of Rabin-Miller Test • When testing a number N, and where the base is chosen at random over the range (2...N-2), it has been shown that a composite number will be proven composite at least ¾ of the time. • For the vast majority of candidates N, Rabin-Miller is much better than that. Many composite N exist which can be proven composite using any base in the valid range. • Only a very small subset of composites fall into the “difficult” category, where nearly ¼ of the tests fail to prove compositeness. • For large randomly-chosen numbers such as the ones used in cryptography (512 bits, 1024 bits, etc.), a transient arithmetic error in your processor is actually more likely than even a single failure of Rabin-Miller.

Rabin-Miller specifics • Smallest number which fails Rabin-Miller for base 2 is 2047. (It is an “Euler strong pseudo-prime to base 2”...)‏ • A commonly known test which correctly identifies every composite number up to 1012 is to run four iterations of Rabin-Miller, using the bases 2, 13, 23, and 1662803.

Proving primality • Sometimes, a more rigorous proof of primality is desired, other than just saying that a number passed some number of probabilistic Rabin-Miller tests. • The easiest such primality proofs for N are done when we are able to completely factor N-1. • If we can find a value of a such that aN-1 == 1 (modulo N), but ax != 1 (modulo N) for every other x which divides N-1, that constitutes a rigorous proof of the primality of N. • As an example, let's prove the primality of 3*266+1.

3*266+1 is prime • Here, N-1 is easily factored. • In fact, all we need to do is find a value of a such that the following three statements are simultaneously true:a266 != 1 (mod N)a3*265 != 1 (mod N)a3*266 == 1 (mod N)‏ • An exhaustive search shows that a == 10 is the first such value of a that works.

Factoring by Fermat's Method • Works for moderately sized numbers which are the product of two cofactors of roughly equal size • Makes use of the idea that a*b can be written as u2-v2 where u == (a+b)/2 and v == (a-b)/2. • Look for squares which are a “little” larger than the target N. If you find such a square which differs from N by a perfect square, you can proceed directly to splitting N into factors.

Example of Fermat's Method • Factor N = 5671 into primes • Look at primes “a little” larger than 5671:762 == 5776 == N + 105772 == 5929 == N + 258782 == 6084 == N + 413792 == 6241 == N + 570802 == 6400 == N + 729 • Aha, 729 is 272, so 5671 can be written as 6400-729 = 802-272 = (80+27)(80-27) =(107)(53).

Factoring by Pollard Rho • Pollard Rho is a probabilistic method of factoring which makes use of the idea that iterating a particular polynomial operation modulo any prime tends to fall into a short cycle before too long. • For instance, if we continually iterate the polynomial x2+1 (modulo 11), beginning with 1, we get: 1 -> 2 -> 5 -> 4 -> 6 -> 4.

How Pollard Rho works • When a polynomial is iterated modulo M, where M is a composite number, the result will be a combination of the cycles modulo each of the prime power divisors of M. Most notably, those cycles will probably be entered at different times, and likely be of different lengths. • How do we detect when we've fallen into a cycle modulo one divisor of M, but not another divisor? Use GCD.

Pollard Rho in action • Let's use Pollard Rho to factor 5671. • Choose the polynomial x2+3. • Iterate beginning with 1, modulo 5671... • 1 -> 4 -> 19 -> 364 -> 2066 -> 3767. • Take the GCD of (3767-4) and 5671. • It's 53. This is because the polynomial x2+3 has already fallen into a cycle modulo 53, but not yet fallen into a cycle modulo 107. So this can be used to split the factors apart.

Loop detection • How do we do loop detection, as required by Pollard Rho? Do we test every prior element of the sequence against the current element? • Two common simple ways which can be done with only limited storage. • First way is to test the 2nd element of the sequence against the 1st, the 4th element against the 2nd, the 6th against the 3rd, etc. • Second way is to test each element against the last element whose sequence number was a power of 2. For instance, the 5th, 6th, 7th, and 8th elements are tested against the 4th, etc. • Either method is guaranteed to eventually find any loop of any period.

Computational Number Theory

Computational Number Theory

Presentation Transcript

Number Theory

Tutorial: Computational Voting Theory

Number Theory

Number Theory

Computational Game Theory

Number Theory

NUMBER THEORY

Computational Game Theory

Number Theory

Number Theory

Number Theory

Computational Learning Theory

Number Theory

Computational Learning Theory

Number Theory

Number Theory

Computational Number Theory - traditional number theory

CS311: Computational Theory

NUMBER THEORY

Tutorial: Computational Voting Theory

7. Computational Learning Theory