
Multiprecision



  1. Multiprecision

  2. Interest • For public-key algorithms (RSA, ECC, LUC, El-Gamal …), prime numbers with several hundred digits are required • Checking primality requires additions, multiplications and divisions on numbers of arbitrary length • Processors only possess arithmetic units of fixed size

  3. How to avoid this course? • Schoolbook algorithms are sufficient for numbers with several tens of digits and only become inadequate for numbers with more than 100 digits • Ready-made multiprecision libraries already exist in most programming languages

  4. Complexity notions • We suppose that we have a computer able to perform elementary operations. • These are fixed-time operations (example: addition or subtraction on 32- or 64-bit words). • The computer is also supposed to have a memory where the data of the problem are stored. • The size of the problem, denoted T, is the number of bits required to store the problem data. • The complexity of the problem, denoted C, is a function such that a problem of size T requires at most C(T) elementary operations to be solved.

  5. Complexity notions • Rather than computing the function C exactly, most of the time only an equivalent is determined. • Examples: schoolbook addition has linear complexity, sorting n integers has complexity equivalent to n·log(n), the Meissel-Lehmer algorithm has complexity equivalent to n^(2/3)·log(n), Karatsuba's algorithm has complexity equivalent to n^1.58, and Chernikova's algorithm has exponential complexity. • Complexity comparison:

       n                 1       5       10       50
       n^(2/3)·log(n)    0       4.66    10.53    51.72
       n·log(n)          0       8.05    23.03    195.60
       n^1.58            1       12.72   38.02    483.48
       2^n               2       32      1024     1.12e15

  6. Schoolbook addition • A=(a_{n-1}a_{n-2} … a_1a_0) represents A = Σ a_i·D^i with D=2^32 or 2^64 (depending on the size of the available operators) • B=(b_{n-1}b_{n-2} … b_1b_0). Denote S=(s_n s_{n-1} … s_1s_0) with S=A+B • The computation is recursive: c_{-1}=0, s_k=a_k+b_k+c_{k-1} (mod D) and c_k=1 if b_k+c_{k-1} > D-1-a_k, and 0 otherwise. • The modulo operation is computed automatically and the carry test only involves numbers less than D. • The complexity is linear (the number of operations is less than 4n)
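The digit-by-digit scheme above can be sketched in a few lines. The names (mp_add, to_int) and the little-endian digit-list representation are choices made for this illustration, not part of any particular library:

```python
# Schoolbook addition in base D = 2**32, as described above.
# Numbers are little-endian lists of base-D digits (a0 first).
D = 2**32

def mp_add(a, b):
    """Add two multiprecision numbers given as little-endian digit lists."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))           # pad to equal length
    b = b + [0] * (n - len(b))
    s, carry = [], 0
    for k in range(n):
        # carry-out test without computing a double-width sum:
        # carry = 1 iff b[k] + carry > D - 1 - a[k]
        c = 1 if b[k] + carry > D - 1 - a[k] else 0
        s.append((a[k] + b[k] + carry) % D)
        carry = c
    s.append(carry)                      # the sum has at most n+1 digits
    return s

def to_int(digits):
    """Convert a little-endian digit list back to a Python int (for checking)."""
    return sum(d * D**i for i, d in enumerate(digits))
```

The carry test compares quantities that all fit in a single machine word, which is exactly why the scheme works on fixed-size arithmetic units.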

  7. Addition: complexity • Schoolbook addition is linear • When adding A=(a_{n-1}a_{n-2} … a_1a_0) and B=(b_{n-1}b_{n-2} … b_1b_0), the slightest modification of an a_k or a b_k modifies the final result • An addition algorithm must therefore « know » all the a_k and b_k • No addition algorithm can be less than linear.

  8. Schoolbook multiplication • We wish to multiply A=(a_{n-1}a_{n-2} … a_1a_0) and B=(b_{n-1}b_{n-2} … b_1b_0) • We form the partial products A·b_k and we add them with shifts (the products a_l·b_k have to be computed in double precision → 4 simple multiplications and 3 simple additions) • The cost of computing one A·b_k is linear. • We have to compute n of them and we must add them. • The global cost is proportional to n^2.

  9. Karatsuba’s method (1962) • We wish to multiply A=(a_{2n-1}a_{2n-2} … a_1a_0) and B=(b_{2n-1}b_{2n-2} … b_1b_0) • Let A=A1·D^n+A0 and B=B1·D^n+B0 with A1=(a_{2n-1} … a_{n+1}a_n), A0=(a_{n-1} … a_1a_0), B1=(b_{2n-1} … b_{n+1}b_n), B0=(b_{n-1} … b_1b_0). • M=A·B=M2·D^(2n)+M1·D^n+M0 with M0=A0·B0, M1=A0·B1+A1·B0, M2=A1·B1. • We have M1=(A0+A1)·(B0+B1)-M0-M2 • Only 3 multiplications are required instead of 4 for the schoolbook multiplication. • We continue recursively (if the number of words is odd, we split into two parts as evenly as possible). • The complexity of multiplying two n-word numbers is proportional to n^(log₂3) ≈ n^1.58
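The recursion above can be sketched on Python integers. Base 10 is used here for readability (D = 2^32 in practice), and `karatsuba` is an illustrative name; this sketch is not optimized:

```python
# Recursive Karatsuba multiplication: three half-size products instead of four.
def karatsuba(a, b, base=10):
    if a < base and b < base:
        return a * b                     # "simple" one-word product
    # split both operands roughly in half (counted in base-`base` digits)
    n = max(len(str(a)), len(str(b))) // 2
    d = base ** n
    a1, a0 = divmod(a, d)
    b1, b0 = divmod(b, d)
    m0 = karatsuba(a0, b0, base)
    m2 = karatsuba(a1, b1, base)
    # the middle coefficient from ONE extra product: (a0+a1)(b0+b1) - m0 - m2
    m1 = karatsuba(a0 + a1, b0 + b1, base) - m0 - m2
    return m2 * d * d + m1 * d + m0
```

Each level trades one multiplication for a few additions, which is where the n^(log₂3) complexity comes from.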

  10. Karatsuba: Example • A=171 659, B=891 512 • Let D=1000, A1=171, A0=659, B1=891, B0=512 • M2=A1·B1=171×891=152 361, M0=A0·B0=659×512=337 408 • M1=(A0+A1)·(B0+B1)-M0-M2=(830×1 403)-M0-M2=1 164 490-M0-M2=674 721 • R=D^2·M2+D·M1+M0=153 036 058 408

  11. Toom-Cook method • We split into three parts • A=A2·D^(2n)+A1·D^n+A0, B=B2·D^(2n)+B1·D^n+B0 • M=A·B=M4·D^(4n)+M3·D^(3n)+M2·D^(2n)+M1·D^n+M0 • Let P1=(A2+A1+A0)·(B2+B1+B0), P-1=(A2-A1+A0)·(B2-B1+B0), P2=(4A2+2A1+A0)·(4B2+2B1+B0) • Then M0=A0·B0, M4=A2·B2, M1=2A2B2-(1/2)A0B0-(1/3)P-1+P1-(1/6)P2, M2=-A2B2-A0B0+(1/2)P-1+(1/2)P1 and M3=-2A2B2+(1/2)A0B0-(1/6)P-1-(1/2)P1+(1/6)P2 • Only 5 multiplications are required. • The complexity is proportional to n^(log₃5) ≈ n^1.46

  12. Toom-Cook: Example • A=171 659, B=891 512 • D=100, A2=17, A1=16, A0=59, B2=89, B1=15, B0=12 • P1=(A2+A1+A0)·(B2+B1+B0)=10 672 • P-1=(A2-A1+A0)·(B2-B1+B0)=5 160 • P2=(4A2+2A1+A0)·(4B2+2B1+B0)=63 282 • M0=A0·B0=708, M4=A2·B2=1 513, M1=2M4-(1/2)M0-(1/3)P-1+P1-(1/6)P2=1 077, M2=-M4-M0+(1/2)P-1+(1/2)P1=5 695, M3=-2M4+(1/2)M0-(1/6)P-1-(1/2)P1+(1/6)P2=1 679 • R=D^4·M4+D^3·M3+D^2·M2+D·M1+M0=153 036 058 408
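One level of the Toom-Cook 3-way splitting can be sketched directly from the interpolation formulas above (evaluation at 0, 1, -1, 2 and infinity). A real implementation would recurse on the five small products; here Python's built-in multiplication stands in for them, and `toom3` is a name chosen for this sketch. The divisions by 2 and 6 are done on scaled integer combinations so they stay exact:

```python
# One level of Toom-Cook 3-way multiplication; requires A, B < D**3.
def toom3(A, B, D=100):
    A2, r = divmod(A, D * D)
    A1, A0 = divmod(r, D)
    B2, r = divmod(B, D * D)
    B1, B0 = divmod(r, D)
    M0 = A0 * B0                           # value of the product at 0
    M4 = A2 * B2                           # leading coefficient ("infinity")
    P1 = (A2 + A1 + A0) * (B2 + B1 + B0)   # value at 1
    Pm1 = (A2 - A1 + A0) * (B2 - B1 + B0)  # value at -1
    P2 = (4 * A2 + 2 * A1 + A0) * (4 * B2 + 2 * B1 + B0)  # value at 2
    # exact interpolation: scale by 6 (resp. 2) so integer division is exact
    M1 = (12 * M4 - 3 * M0 - 2 * Pm1 + 6 * P1 - P2) // 6
    M2 = (-2 * M4 - 2 * M0 + Pm1 + P1) // 2
    M3 = (-12 * M4 + 3 * M0 - Pm1 - 3 * P1 + P2) // 6
    # recombine: M4*D^4 + M3*D^3 + M2*D^2 + M1*D + M0
    return (((M4 * D + M3) * D + M2) * D + M1) * D + M0
```

With D = 100 this reproduces the worked example of the slide.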

  13. Generalisation • We can generalize and split into r=4, 5, 6, … parts • The complexity is proportional to n^(log(2r-1)/log(r)) • Theoretically we can thus obtain multiplication algorithms whose complexity is proportional to n^(1+ε) with ε>0 as small as desired • But the intermediate computations become heavy • In practice even these Toom-Cook algorithms are rarely implemented: if necessary, algorithms based on Fourier transforms are preferred.

  14. Fourier transform • Multiplying numbers or polynomials is essentially the same thing. • A=(a_{n-1}a_{n-2} … a_1a_0) and B=(b_{n-1}b_{n-2} … b_1b_0) • A(X)=a_{n-1}X^(n-1)+a_{n-2}X^(n-2)+ … +a_1X+a_0, B(X)=b_{n-1}X^(n-1)+b_{n-2}X^(n-2)+ … +b_1X+b_0 • If P(X)=A(X)·B(X), then A·B=P(D) • The degree of P is 2n-2; if we know the values of P at 2n distinct points, we can determine P (for instance by Lagrange interpolation) • The Fourier transform uses the points w_k=exp(2iπk/2n), 0 ≤ k < 2n • Let A_j=A(w_j), B_j=B(w_j) and P_j=P(w_j) • Then P_j=A_j·B_j

  15. Fourier transform • We compute the A_j and B_j by Fourier transform • We compute P_j=A_j·B_j • We perform an inverse Fourier transform to recover the coefficients of P • The computation of the Fourier transform and of the inverse Fourier transform has a complexity proportional to n·log(n) • The computation of the P_j is linear • The global cost is then of the same order as n·log(n)
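The evaluate / multiply pointwise / interpolate pipeline above can be sketched with a textbook floating-point radix-2 FFT (not Schönhage's exact integer variant, which avoids rounding). The names `fft` and `poly_mul` are chosen for this sketch, and the final rounding is only safe while the coefficients stay small:

```python
# Polynomial multiplication through the FFT: evaluate both digit
# polynomials at the 2n-th roots of unity, multiply pointwise, interpolate.
import cmath

def fft(coeffs, invert=False):
    n = len(coeffs)                      # n must be a power of two
    if n == 1:
        return coeffs[:]
    sign = 1 if invert else -1
    even = fft(coeffs[0::2], invert)
    odd = fft(coeffs[1::2], invert)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)   # root of unity w_k
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def poly_mul(a, b):
    n = 1
    while n < len(a) + len(b):           # enough points to determine P
        n *= 2
    fa = fft(a + [0] * (n - len(a)))
    fb = fft(b + [0] * (n - len(b)))
    prod = fft([x * y for x, y in zip(fa, fb)], invert=True)
    return [round(c.real / n) for c in prod]
```

To multiply integers, feed in the digit lists (low digit first) and propagate carries over the resulting coefficients.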

  16. Complexity: multiplication • The schoolbook algorithm has a complexity proportional to n^2 • Karatsuba’s algorithm is proportional to n^1.58 • Toom-Cook’s algorithm is proportional to n^1.46 • Schönhage’s Fourier-transform algorithm is proportional to n·log(n) • But in fact, Winograd has shown that the complexity of multiplication is just linear (in a suitable model of computation)

  17. Equation f(x)=0 • Let f be a real function, continuously differentiable over an interval ]a,b[, which has a zero in this interval • We recursively define x0 ∈ ]a,b[ and x_{n+1}=x_n-f(x_n)/f'(x_n) for n ≥ 0 • Depending on f and x0, the sequence (x_n) converges to a limit l such that f(l)=0 • This is Newton’s method • The convergence is generally quadratic (the number of exact decimals doubles at each iteration)

  18. Inverse computation • Let a be a real positive value; we wish to compute 1/a • Let f(x)=1/x-a • The iteration is x_{n+1}=x_n(2-a·x_n) • If x_n-1/a=ε then x_{n+1}-1/a=-a·ε^2 • If |ε|<1/a, then a·ε^2<|ε| • Thus if x0 ∈ ]0, 2/a[, the sequence converges to 1/a • We need a rough estimate of 1/a to initialize the algorithm • The convergence is quadratic
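The division-free iteration above is a one-liner per step; `newton_inverse` is a name chosen for this illustration:

```python
# Newton iteration for 1/a: x <- x(2 - a x).  No division is performed.
def newton_inverse(a, x0, iterations=6):
    x = x0                        # x0 must lie in ]0, 2/a[
    for _ in range(iterations):
        x = x * (2 - a * x)       # the error is squared (times a) each step
    return x
```

With a = 0.52369817 and x0 = 1.5 this reproduces the sequence of the next slide.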

  19. Inverse: Example • a=0.52369817 • By looking at the first bits of a, we see that 0.5 < a < 1, thus 1 < 1/a < 2; we can initialize with x0=1.5 • x1=x0(2-a·x0)=1.82167911 • x2=x1(2-a·x1)=1.90545810 • x3=x2(2-a·x2)=1.90948830 • x4=x3(2-a·x3)=1.90949683

  20. Integer division • Let A and B be two integers; we want to compute A/B • We compute q=1/B with a precision better than 1/A, by excess (resp. by default) • We compute A·q and take the greatest integer below it (resp. the least integer above it)

  21. Integer division (Example) • A=309 641 782 117, B=143 557 • Considering the first bits of A, we have A < 5·10^11, so we have to compute q with a precision of 2·10^-12 • We have q=0.000 006 965 874 ± 2·10^-12 • This value of q is by excess (the next iteration gives a slightly smaller value) • A·q=2 156 925.639 with an error smaller than 1 • We take the greatest integer below: A/B=2 156 925
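The two steps above can be sketched with a fixed-point reciprocal. In a real implementation q would come from the Newton iteration of the previous slides; here Python's built-in integer division stands in for it, and `int_divide` and the decimal scale 10^p are illustrative choices:

```python
# Integer division via an approximate reciprocal q ~ 1/B, scaled by 10**p,
# with p chosen so that the error in A*q stays below one unit.
def int_divide(A, B):
    p = len(str(A)) + 2                  # precision: a bit better than 1/A
    q = 10**p // B                       # reciprocal by default, scaled
    quotient = A * q // 10**p            # may be one unit too small
    if (quotient + 1) * B <= A:          # correct the possible off-by-one
        quotient += 1
    return quotient
```

Since q underestimates 1/B by less than 10^-p, the raw result is either the true quotient or one below it, so a single correction step suffices.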

  22. Modulo computation • Let A and B be two integers; we wish to compute r=A mod B • We compute the integer quotient q=A/B by the previous algorithm • We then compute r=A-q·B • If B is small with respect to A, there are other methods

  23. Square root • Let a be a real positive number; we wish to compute √a • Let f(x)=x^2-a and use Newton’s method • x_{n+1}=(x_n+a/x_n)/2 • This method has been known since antiquity as Heron’s method • Problem: a division is necessary at each step

  24. Square root: 2nd method • f(x)=1/x^2-a • The iteration is x_{n+1}=(3x_n-a·x_n^3)/2 • The sequence converges to 1/√a; we just have to multiply by a to find √a. • No more divisions • If x0 is close enough to 1/√a (for instance x0 ∈ ]0, √(3/a)[), the sequence converges quadratically to 1/√a
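The division-free square-root computation above fits in a few lines; `sqrt_newton` is an illustrative name:

```python
# Iterate x <- (3x - a x^3)/2 to approach 1/sqrt(a), then multiply by a.
def sqrt_newton(a, x0, iterations=8):
    x = x0                            # x0 must be close enough to 1/sqrt(a)
    for _ in range(iterations):
        x = (3 * x - a * x**3) / 2    # quadratic convergence to 1/sqrt(a)
    return a * x                      # a * (1/sqrt(a)) = sqrt(a)
```

With a = 5 and x0 = 0.375 this reproduces the worked example of the next slide.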

  25. Square root: Example • a=5; since 4 < a < 16 we have 2 < √a < 4, hence 1/4 < 1/√a < 1/2; take x0=0.375 • x1=(3x0-a·x0^3)/2=0.43066406 • x2=(3x1-a·x1^3)/2=0.44630628 • x3=(3x2-a·x2^3)/2=0.44721083 • x4=(3x3-a·x3^3)/2=0.44721359 • x5=(3x4-a·x4^3)/2=0.44721360 • x6=(3x5-a·x5^3)/2=0.44721360 (x6=x5, we stop) • √a ≈ a·x6 = 5×0.44721360 = 2.23606800; the value is exact up to an error of at most 5·10^-8 • In fact √a=2.23606797…

  26. Higher order iterations • We wish to solve f(X)=0 • We suppose that x is near a zero of the equation • f(x+h)=f(x)+h·f'(x)+(h^2/2)·f''(x)+ … (Taylor) • We solve for h the equation f(x)+h·f'(x)+(h^2/2)·f''(x)=0 • We get h=-(f'(x)/f''(x))·(1-(1-g(x))^(1/2)) with g(x)=2f(x)f''(x)/f'(x)^2 • f(x) is supposed to be small, and if f is a regular enough function, g(x) is small as well

  27. Higher order iterations (cont.) • If ε is small then 1-(1-ε)^(1/2) ≈ ε/2+ε^2/8 (Taylor) • Replacing ε with g(x), we get (1-(1-g(x))^(1/2)) ≈ (g(x)/2)·(1+g(x)/4) • So h ≈ -(f(x)/f'(x))·(1+f(x)f''(x)/(2f'(x)^2)) • Let x_{n+1}=x_n-(f(x_n)/f'(x_n))·(1+f(x_n)f''(x_n)/(2f'(x_n)^2)) • The sequence converges cubically to the zero of the equation f(X)=0 • This is Householder’s iteration (1970)

  28. Higher order iterations (end) • We can continue and define quartic, quintic, … iterations. • But not all functions are suitable • The successive derivatives have to be easy to compute • The extra computational cost has to be compensated by the increase in convergence speed

  29. Example: Inverse • Computation of the inverse of a • f(x)=1/x-a • We choose an initial value x0 near 1/a • Let h_n=1-a·x_n • Quadratic iteration: x_{n+1}=x_n+x_n·h_n (Newton) • Cubic iteration: x_{n+1}=x_n+x_n·(h_n+h_n^2) (Householder) • Quartic iteration: x_{n+1}=x_n+x_n·(h_n+h_n^2+h_n^3) • Quintic iteration: x_{n+1}=x_n+x_n·(h_n+h_n^2+h_n^3+h_n^4) • Etc.
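The quadratic, cubic and quartic inverse iterations above share one generic step: order p applies the correction x·(h + h^2 + … + h^(p-1)). The names below are chosen for this sketch:

```python
# Generic higher-order inverse iteration step: h = 1 - a*x is the residual.
def inverse_step(x, a, order):
    h = 1 - a * x
    return x + x * sum(h**k for k in range(1, order))

a = 0.52369817
results = {}
for order in (2, 3, 4):               # quadratic, cubic, quartic
    x = 1.5
    for _ in range(4):                # four iterations, as in the table below
        x = inverse_step(x, a, order)
    results[order] = x
```

Higher orders reuse the same residual h, so each extra order costs only one more multiplication and addition per step.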

  30. Example: Inverse • a=0.52369817, x0=1.5

          quadratic      cubic          quartic
   n=1    1.821679118    1.890664087    1.905458103
   n=2    1.905458103    1.909495007    1.909496839
   n=3    1.909488296    1.909496839    1.909496839
   n=4    1.909496839    1.909496839    1.909496839

  31. Computation of elementary functions • Computation of functions like sin, cos, log, … • We use the properties of the function to reduce the argument to a given interval • For a limited precision (about ten decimal digits), the CORDIC algorithm can be used. It requires only additions (it is the algorithm of pocket calculators) • For a greater precision, we can use Taylor expansions centered at the middle of the interval (to speed up convergence) • For a maximal precision, we use techniques like binary splitting or the AGM

  32. Example: sine • We have sin(x+2π)=sin(x), sin(x+π)=-sin(x), sin(-x)=-sin(x), sin(x)=cos(π/2-x) • We can thus limit the computations to the interval [-π/4, π/4] • sin(x)=x-x^3/6+x^5/120-…=Σ(-1)^n·x^(2n+1)/(2n+1)! • cos(x)=1-x^2/2+x^4/24-…=Σ(-1)^n·x^(2n)/(2n)! • Truncated expansions are computed with Horner’s scheme in order to limit the number of multiplications • Example: 1-x^2/2+x^4/24-x^6/720=1-(x^2/(1·2))·(1-(x^2/(3·4))·(1-x^2/(5·6)))

  33. Numerical example • Computation of sin(8) with 5 exact decimals • sin(8)=sin(3π-8)=cos(8-5π/2) • We have (8-5π/2)^6 < 10^-5, so we can limit the expansion of cos(x) to its first three terms • We then use cos(x) ≈ 1-(x^2/(1·2))·(1-(x^2/(3·4))) • Computations by Horner’s scheme are stable (the rounding errors do not propagate) • Numerically we get sin(8) ≈ 0.98935826008034 • In fact sin(8)=0.98935824662338

  34. Example: computation of e with 10 decimal digits • We have e=1+1/1!+1/2!+1/3!+ … • Since 15! > 10^10, we can stop at the term 1/15! • Let x15=1/15 and x_n=(1+x_{n+1})/n; then e ≈ 1+x1 (Horner’s scheme applied to the series) • The successive values are: x15=0.066666666667, x14=0.076190476191, x13=0.082783882784, x12=0.090231990232, x11=0.099111999112, x10=0.10991119991, x9=0.12332346666, x8=0.14041543333, x7=0.16291649048, x6=0.19381941508, x5=0.23876388301, x4=0.30969097075, x3=0.43656365692, x2=0.71828182846, x1=1.7182818285 • In fact e=2.718281828459… • The computed value 1+x1=2.7182818285 is exact to 10 decimal digits

  35. Further reading • Web site: http://numbers.computation.free.fr • Arithmétique en précision arbitraire, Paul Zimmermann, INRIA research report no. 4272, September 2001 • AGM: Fast multiple-precision evaluation of elementary functions, Richard P. Brent, Journal of the ACM, vol. 23, no. 2, pages 242-251 (1976) • Computation of π: Pi and the AGM, J.M. and P.B. Borwein, on the web site http://www.cecm.sfu.ca
