1 / 27

rand() Considered Harmful

rand() Considered Harmful. Stephan T. Lavavej (" Steh -fin Lah - wah -wade") Senior Developer - Visual C++ Libraries stl@microsoft.com. What's Wrong With This Code?. # include < stdio.h > #include < stdlib.h > #include < time.h > int main() { srand (time(NULL));

owena
Download Presentation

rand() Considered Harmful

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Version 1.1 - September 5, 2013 rand() Considered Harmful Stephan T. Lavavej ("Steh-fin Lah-wah-wade") Senior Developer - Visual C++ Libraries stl@microsoft.com

  2. What's Wrong With This Code? #include <stdio.h> #include <stdlib.h> #include <time.h> int main() { srand(time(NULL)); for (inti = 0; i < 16; ++i) { printf("%d ", rand() % 100); } printf("\n"); }

  3. What's Right With This Code? All required headers are included! #include <stdio.h> #include <stdlib.h> #include <time.h> int main() { srand(time(NULL)); for (inti = 0; i < 16; ++i) { printf("%d ", rand() % 100); } printf("\n"); } All included headers are required! Headers are sorted! One True Brace Style! %d is correct for int! Unnecessary argc, argv, return 0; omitted!

  4. What's Wrong With This Code? #include <stdio.h> #include <stdlib.h> #include <time.h> int main() { srand(time(NULL)); for (inti = 0; i < 16; ++i) { printf("%d ", rand() % 100); } printf("\n"); }

  5. What's Wrong With This Code? ABOMINATION! #include <stdio.h> #include <stdlib.h> #include <time.h> int main() { srand(time(NULL)); for (inti = 0; i < 16; ++i) { printf("%d ", rand() % 100); } printf("\n"); }

  6. What's Wrong With This Code? ABOMINATION! #include <stdio.h> #include <stdlib.h> #include <time.h> int main() { srand(time(NULL)); for (inti = 0; i < 16; ++i) { printf("%d ", rand() % 100); } printf("\n"); } Frequency: 1 Hz!

  7. What's Wrong With This Code? ABOMINATION! warning C4244: 'argument' : conversion from 'time_t' to 'unsigned int', possible loss of data #include <stdio.h> #include <stdlib.h> #include <time.h> int main() { srand(time(NULL)); for (inti = 0; i < 16; ++i) { printf("%d ", rand() % 100); } printf("\n"); } Frequency: 1 Hz! 32-bit seed!

  8. What's Wrong With This Code? ABOMINATION! warning C4244: 'argument' : conversion from 'time_t' to 'unsigned int', possible loss of data #include <stdio.h> #include <stdlib.h> #include <time.h> int main() { srand(time(NULL)); for (inti = 0; i < 16; ++i) { printf("%d ", rand() % 100); } printf("\n"); } Frequency: 1 Hz! 32-bit seed! Range: [0, 32767] Linear congruential low quality!

  9. What's Wrong With This Code? ABOMINATION! warning C4244: 'argument' : conversion from 'time_t' to 'unsigned int', possible loss of data #include <stdio.h> #include <stdlib.h> #include <time.h> int main() { srand(time(NULL)); for (inti = 0; i < 16; ++i) { printf("%d ", rand()% 100); } printf("\n"); } Frequency: 1 Hz! 32-bit seed! Non-uniform distribution! Range: [0, 32767] Linear congruential low quality!

  10. Modulo  Non-Uniform Distribution intsrc = rand(); // Assume uniform [0, 32767] intdst = src % 100; // Non-uniform [0, 99] // [0, 99] src [0, 99] dst // [100, 199] src [0, 99] dst // ... // [32700, 32767] src [0, 67] dst • This is modulo's fault, not rand()'s • Trigger: input range isn't exact multiple of output range

  11. Floating-Point Treachery intsrc = rand(); // Assume uniform [0, 32767] intdst = static_cast<int>( // As seen on (src * 1.0 / RAND_MAX) * 99 // StackOverflow ); // Hilariously non-uniform [0, 99] • Only one input produces the output 99: static_cast<int>((32765* 1.0 / 32767) * 99) == 98 static_cast<int>((32766 * 1.0 / 32767) * 99) == 98 static_cast<int>((32767 * 1.0 / 32767) * 99) == 99

  12. Floating-Point Double Treachery intsrc = rand(); // Assume uniform [0, 32767] intdst = static_cast<int>( (src * 1.0 / (RAND_MAX + 1)) * 100 ); // Subtly non-uniform [0, 99] • Less likely outputs (327/32768 vs. 328/32768): 3, 6, 9, 12, 15, 18, 21, 24, 28, 31, 34, 37, 40, 43, 46, 49, 53, 56, 59, 62, 65, 68, 71, 74, 78, 81, 84, 87, 90, 93, 96, 99 • Same problem as src % 100 • Nothingcan uniformly map 32768 inputs to 100 outputs

  13. Floating-Point Triple Treachery • What if the input is [0, 232) or [0, 264)? • Non-uniformity is reduced, but not eliminated, when the input is much larger than the output • What if IEEE runs out of bits? • Example: [0, 264) input  [0, 1018 ≈ 259.8) output • double has only 53 bits of significand precision • Say you have a problem, so you use floating-point • Now you have 2.000001 problems • DO NOT MESS WITH FLOATING-POINT

  14. <random> URNGs(Uniform Random Number Generators) • Engine templates: • linear_congruential_engine • mersenne_twister_engine • subtract_with_carry_engine • Engine adaptor templates: • discard_block_engine • independent_bits_engine • shuffle_order_engine • Non-deterministic: • random_device • Engine (adaptor)typedefs: • minstd_rand0 • minstd_rand • mt19937 • mt19937_64 • ranlux24_base • ranlux48_base • ranlux24 • ranlux48 • knuth_b • default_random_engine

  15. <random> Distributions • Uniform distributions • uniform_int_distribution • uniform_real_distribution • Poisson distributions • poisson_distribution • exponential_distribution • gamma_distribution • weibull_distribution • extreme_value_distribution • Sampling distributions • discrete_distribution • piecewise_constant_distribution • piecewise_linear_distribution • Bernoulli distributions • bernoulli_distribution • binomial_distribution • geometric_distribution • negative_binomial_distribution • Normal distributions • normal_distribution • lognormal_distribution • chi_squared_distribution • cauchy_distribution • fisher_f_distribution • student_t_distribution

  16. Hello, "Random" World! #include <iostream> #include <random> int main() { std::mt19937 mt(1729); std::uniform_int_distribution<int> dist(0, 99); for (inti = 0; i < 16; ++i) { std::cout << dist(mt) << " "; } std::cout << std::endl; }

  17. Hello, "Random" World! #include <iostream> #include <random> int main() { std::mt19937 mt(1729); std::uniform_int_distribution<int> dist(0, 99); for (inti = 0; i < 16; ++i) { std::cout << dist(mt) << " "; } std::cout << std::endl; } Deterministic 32-bit seed

  18. Hello, "Random" World! #include <iostream> #include <random> int main() { std::mt19937mt(1729); std::uniform_int_distribution<int> dist(0, 99); for (inti = 0; i < 16; ++i) { std::cout << dist(mt) << " "; } std::cout << std::endl; } Engine: [0, 232) Deterministic 32-bit seed

  19. Hello, "Random" World! #include <iostream> #include <random> int main() { std::mt19937mt(1729); std::uniform_int_distribution<int>dist(0, 99); for (inti = 0; i < 16; ++i) { std::cout << dist(mt) << " "; } std::cout << std::endl; } Engine: [0, 232) Deterministic 32-bit seed Distribution: [0, 99]

  20. Hello, "Random" World! #include <iostream> #include <random> int main() { std::mt19937mt(1729); std::uniform_int_distribution<int>dist(0, 99); for (inti = 0; i < 16; ++i) { std::cout << dist(mt) << " "; } std::cout << std::endl; } Engine: [0, 232) Deterministic 32-bit seed Distribution: [0, 99] Note: [inclusive, inclusive]

  21. Hello, "Random" World! #include <iostream> #include <random> int main() { std::mt19937mt(1729); std::uniform_int_distribution<int>dist(0, 99); for (inti = 0; i < 16; ++i) { std::cout << dist(mt) << " "; } std::cout << std::endl; } Engine: [0, 232) Deterministic 32-bit seed Distribution: [0, 99] Run engine, viewed through distribution Note: [inclusive, inclusive]

  22. Hello, Random World! #include <iostream> #include <random> int main() { std::random_devicerd; std::mt19937 mt(rd()); std::uniform_int_distribution<int> dist(0, 99); for (inti = 0; i < 16; ++i) { std::cout << dist(mt) << " "; } std::cout << std::endl; } Non-deterministic 32-bit seed

  23. mt19937 vs. random_device • mt19937 is: • Fast (499 MB/s = 6.5 cycles/byte for me) • Extremely high quality, but not cryptographically secure • Seedable (with more than 32 bits if you want) • Reproducible (Standard-mandated algorithm) • random_device is: • Possibly slow (1.93 MB/s = 1683 cycles/byte for me) • Strongly platform-dependent (GCC 4.8 can use IVB RDRAND) • Possibly crypto-secure (check documentation, true for VC) • Non-seedable, non-reproducible

  24. uniform_int_distribution • Takes any Uniform Random Number Generator • Usually [0, 232) or [0, 264) but [1701, 1729] works • If your URNG does that, you are bad and you should feel bad • Emits any desired range of integers [low, high] • signed/unsignedshort/int/long/long long • Why not char/signed char/unsigned char? Standard Says SoTM • Preserves perfect uniformity • Requires obsessive implementers • Uses bitwise/etc. magic, invokes URNG repeatedly (rare) • Runs fairly quickly (34% raw speed for me) • Deterministic, but not invariant • Will vary across platforms, may vary across versions

  25. random_shuffle() Considered Harmful template <typenameRanIt> void random_shuffle(RanIt f, RanIt l); • May call rand() • C++ Standard Library, I trusted you! template <typenameRanIt, typename RNG> void random_shuffle(RanIt f, RanIt l, RNG&& r); • Not evil, but highly inconvenient • Knuth shuffle needs r(n) to return [0, n)

  26. shuffle() Considered Awesome template <typenameRanIt, typename URNG> void shuffle(RanIt f, RanIt l, URNG&& g); • Takes URNGs directly (e.g. mt19937) • Shuffles perfectly • All permutations are equally likely • Invokes the URNG in-place (can't copy) • Other algorithms can copy functors, like generate() • Special exception: for_each() moves functors

  27. Random <random> Notes • Running mt19937 is fast, constructing/copying isn't • Constructing/copying engines often is already undesirable • URNG/distribution function call ops are non-const • Multiple threads cannot simultaneously call a single object • When is it safe to skip uniform_int_distribution? • mt19937's [0, 232) or mt19937_64's [0, 264)[0, 2N) • In this case, masking is safe, simple, and efficient • In all other cases, use uniform_int_distribution

More Related