proj-oot-old-151220-ootLibrariesNotes2

---

blog post and HN discussion on random number generators and (in the HN discussion) hashing fns in std libs:

https://medium.com/@betable/tifu-by-using-math-random-f1c308c4fd9d https://news.ycombinator.com/item?id=10598065

notes:

"The period or cycle length of a PRNG is the length of the sequence of numbers that the PRNG generates before repeating.". This appears to be the primary quality metric for PRNGs.

performance metrics for PRNGs:

"If you need to use n random values you need a PRNG with a cycle length of at least n^2"

note: often you end up using only the most significant n bits; some PRNGs have a shorter cycle length in this case (eg MWC1616, whose cycle length drops from 2^60 to 2^30 if you use only the most significant 16 bits)

CSPRNGs: " Cryptographically secure PRNGs are a better option if you don’t have time to do proper diligence on non-cryptographic alternatives. The safest option (and the right solution for crypto) is to use urandom. In browser you can use crypto.getRandomValues(). "

but:

"

" If you’re unsure about the quality of your non-cryptographic alternatives, and unless you need deterministic seeding or require rigorous proofs of quality measures, using a CSPRNG is your best option. If you don’t trust your standard library’s CSPRNG (and you shouldn’t for cryptographic purposes) the right solution is to use urandom, which is managed by the kernel (Linux uses a scheme similar to OpenSSL’s?, OS X uses Bruce Schneier’s Yarrow generator).

I can’t tell you the exact cycle length of crypto.randomBytes() because as far as I know there’s no closed form solution to that problem (i.e., no one knows). All I can say is that with a large state space and a continuous stream of new entropy coming in, it should be safe. If you trust OpenSSL? to generate your public/private key pairs then it doesn’t make much sense not to trust it here. Empirically, once we swapped our call to Math.random() with a call to crypto.randomBytes() our collision problem went away.

In fact, Chrome could just have Math.random() call the same CSPRNG they’re using for crypto.randomBytes(), which appears to be what Webkit is doing. "

"Cryptographic applications require the output not to be predictable from earlier outputs, and more elaborate algorithms, which do not inherit the linearity of simpler PRNGs, are needed." -- https://en.wikipedia.org/wiki/Pseudorandom_number_generator

"A PRNG suitable for cryptographic applications is called a cryptographically secure PRNG (CSPRNG). A requirement for a CSPRNG is that an adversary not knowing the seed has only negligible advantage in distinguishing the generator's output sequence from a random sequence. In other words, while a PRNG is only required to pass certain statistical tests, a CSPRNG must pass all statistical tests that are restricted to polynomial time in the size of the seed. Though such property cannot be proven, strong evidence may be provided by reducing the CSPRNG to a problem that is assumed to be hard, such as integer factorization.[12] In general, years of review may be required before an algorithm can be certified as a CSPRNG." -- https://en.wikipedia.org/wiki/Pseudorandom_number_generator

CSPRNGs should also make it impractical for an attacker to calculate PREVIOUS outputs of the generator given its current internal state (backtracking resistance).

" [–]mjmalone 7 points 1 day ago

They're not lesser alternatives, they're different. Some classes of non-CS PRNGs like MT19937, WELL, xorshift*, and PCG have provable statistical characteristics and are likely better at producing sequences of numbers that appear to be random. They're also way faster than most (all?) CSPRNGs. If you're trying to shuffle a really big array using Fisher-Yates, for instance, and you want to be able to guarantee you can produce every possible shuffle, you can't get that with most CSPRNGs and if you could it'd be wasteful in terms of CPU cycles. You need either true randomness (probably can't get enough fast enough) or something with a provable long cycle and equidistribution. The complex non-linear transformations used by most CSPRNGs make these characteristics impossible to prove (well, you might be able to demonstrate something empirically, but maybe that's not good enough because blah blah blah). " -- https://www.reddit.com/r/netsec/comments/3tl4xd/tifu_by_using_mathrandom/cx7egl1

on specification:

" The ECMAScript standard says the following about Math.random() —

    Returns a Number value with positive sign, greater than or equal to 0 but less than 1, chosen randomly or pseudo randomly with approximately uniform distribution over that range, using an implementation-dependent algorithm or strategy.

The specification leaves a lot to be desired. First, it doesn’t mention anything about precision. Since ECMAScript Numbers are IEEE 754 binary64 double-precision floats, we might expect 53-bit precision (i.e., random values taking the form x/2⁵³ for all x in 0..2⁵³-1). Mozilla’s SpiderMonkey engine seems to agree but, as we’ll soon find out, V8’s Math.random() only has 32-bit precision (i.e., values taking the form x/2³² for all x in 0..2³²-1). "
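
(aside: the usual way to turn a 64-bit random word into a 53-bit-precision double in [0, 1), i.e. a value of the form x/2^53, is to keep the top 53 bits and scale; rng_next64() below is just a placeholder for whatever 64-bit generator is in use:)

    #include <stdint.h>

    uint64_t rng_next64(void);  /* placeholder: any uniform 64-bit PRNG */

    /* take the top 53 bits and divide by 2^53 */
    double rng_double53(void) {
        return (rng_next64() >> 11) * (1.0 / 9007199254740992.0);  /* 2^53 */
    }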

to scale random numbers (e.g. to get a uniform random integer from 0 to 9 from a uniform random number generator over [0, 1)): Math.floor(Math.random() * 10); in general, Math.floor(Math.random() * scale) gives an integer in 0..scale-1 (add an offset, e.g. +1, if you want a range like 1 to 10)

"This is the method that MDN recommends for scaling a random number, and it’s in widespread use in the wild. It’s called the multiply-and-floor method, because that’s what it does. It’s also called the take-from-top method, because the lower bits of the random number are truncated, leaving the left-most or top bits as our scaled integer result. (Quick note: it’s subtle, but in general this method is slightly biased if your scaled range doesn’t evenly divide your PRNG’s output range. A general solution should use rejection sampling like this, which is part of the standard library in other languages.)

some random number test suites:

recommended:

ANTI-recommended:

PRNG recommendation by the blog post author: Mersenne Twister, which is used by Python and Ruby. Wikipedia says, "The first PRNG to avoid major problems and still run fairly quickly was the Mersenne Twister (discussed below), which was published in 1998. Other high-quality PRNGs have since been developed." -- https://en.wikipedia.org/wiki/Pseudorandom_number_generator#Potential_problems_with_deterministic_generators

Wikipedia says:

" In practice, the output from many common PRNGs exhibit artifacts that cause them to fail statistical pattern-detection tests. These include:

    Shorter than expected periods for some seed states (such seed states may be called 'weak' in this context);
    Lack of uniformity of distribution for large numbers of generated numbers;
    Correlation of successive values;
    Poor dimensional distribution of the output sequence;
    The distances between where certain values occur are distributed differently from those in a random sequence distribution.

Defects exhibited by flawed PRNGs range from unnoticeable (and unknown) to very obvious. An example was the RANDU random number algorithm used for decades on mainframe computers. It was seriously flawed, but its inadequacy went undetected for a very long time. "-- https://en.wikipedia.org/wiki/Pseudorandom_number_generator#Potential_problems_with_deterministic_generators

"

Kendall and Smith's original four tests were hypothesis tests, which took as their null hypothesis the idea that each number in a given random sequence had an equal chance of occurring, and that various other patterns in the data should be also distributed equiprobably.

    The frequency test, was very basic: checking to make sure that there were roughly the same number of 0s, 1s, 2s, 3s, etc.
    The serial test, did the same thing but for sequences of two digits at a time (00, 01, 02, etc.), comparing their observed frequencies with their hypothetical predictions were they equally distributed.
    The poker test, tested for certain sequences of five numbers at a time (aaaaa, aaaab, aaabb, etc.) based on hands in the game poker.
    The gap test, looked at the distances between zeroes (00 would be a distance of 0, 030 would be a distance of 1, 02250 would be a distance of 3, etc.).

If a given sequence was able to pass all of these tests within a given degree of significance (generally 5%), then it was judged to be, in their words "locally random". "
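
to make the frequency test concrete, a chi-squared version over decimal digits might look like this in C (the 16.919 cutoff is the 5% critical value for 9 degrees of freedom; taking rng_next32() % 10 as the "digit" is a crude reduction with a negligible modulo bias, and rng_next32() is a placeholder for the generator under test):

    #include <stdio.h>
    #include <stdint.h>

    uint32_t rng_next32(void);  /* placeholder: PRNG under test */

    /* frequency test: are the digits 0..9 roughly equally common?
       returns 1 if the sample passes at the 5% significance level */
    int frequency_test(long n) {
        long counts[10] = {0};
        for (long i = 0; i < n; i++)
            counts[rng_next32() % 10]++;
        double expected = n / 10.0, chi2 = 0.0;
        for (int d = 0; d < 10; d++) {
            double diff = counts[d] - expected;
            chi2 += diff * diff / expected;
        }
        printf("chi^2 = %f\n", chi2);
        return chi2 < 16.919;   /* chi-squared critical value, df = 9, alpha = 0.05 */
    }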

" Other tests:

    The Monobit test treats each output bit of the random number generator as a coin flip test, and determines if the observed number of heads and tails are close to the expected 50% frequency. The number of heads in a coin flip trial forms a binomial distribution.
    The Wald–Wolfowitz runs test tests for the number of bit transitions between 0 bits, and 1 bits, comparing the observed frequencies with expected frequency of a random bit sequence.
    Information entropy
    Autocorrelation test
    Kolmogorov–Smirnov test
    The Spectral Test[3]
    Maurer's Universal Statistical Test

"

" An initial battery of statistical tests for uniform RNGs was offered by the 1969 first edition of The Art of Computer Programming, by Donald Knuth. In testing of RNGs, Knuth's tests were supplanted by George Marsaglia's (1996) Diehard tests, which consisted of fifteen different tests. The inability to modify the test parameters or add new tests led to the development of the TestU?01 library. "

" The 1997 invention of the Mersenne Twister,[8] in particular, avoided many of the problems with earlier generators. The Mersenne Twister has a period of 219937−1 iterations (≈4.3×106001), is proven to be equidistributed in (up to) 623 dimensions (for 32-bit values), and at the time of its introduction was running faster than other statistically reasonable generators.

Subsequently, the WELL family of generators was developed.[9] The WELL generators in some ways improves on the quality of the Mersenne Twister—which has a too-large state space and a very slow recovery from state spaces with a large number of zeros.

In 2003, George Marsaglia introduced the family of xorshift generators,[10] again based on a linear recurrence. Such generators are extremely fast and, combined with a nonlinear operation, they pass strong statistical tests.[11] " -- https://en.wikipedia.org/wiki/Pseudorandom_number_generator#Generators_based_on_linear_recurrences

"Mersenne twister - the most widely used PRNG [1]" --- https://en.wikipedia.org/wiki/List_of_random_number_generators , citing E.g. Marsland S. (2011) Machine Learning (CRC Press), §14.1.1. Also see the section "Adoption in software systems".

" Traditionally, C standard libraries used a linear congruential generator similar to java.util.Random. Apparently, recent versions of glibc use a trinomial-based linear-feedback shift register or even the SIMD version of MT19937. " -- http://xorshift.di.unimi.it/

" How can a xorshift64* generator be slower than a xorshift1024* generator?

Dependencies. The three xor/shifts of a xorshift64* generator must be executed sequentially, as each one is dependent on the result of the previous one. In a xorshift1024* generator two of the xor/shifts are completely independent and can be parallelized internally by the CPU. I also suspect that the larger state space makes it possible for the CPU to perform more aggressively speculative execution (indeed, a xorshift128* generator is slower than a xorshift1024* generator). " -- http://xorshift.di.unimi.it/
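
the dependency chain is easy to see in the xorshift64* reference code (constants as published on the xorshift page/paper; the state must be seeded to a nonzero value):

    #include <stdint.h>

    static uint64_t x;   /* state; seed to any nonzero value */

    /* three xor/shift steps, each depending on the previous one's result,
       followed by a multiplication that scrambles the output */
    uint64_t xorshift64star(void) {
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        return x * UINT64_C(0x2545F4914F6CDD1D);
    }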

for xorshift* and xorshift+: " Why just sizes 64, 128, 1024 and 4096?

First of all, using powers of two in high dimension is advantageous because the modulo operation used to update the cyclic counter that simulates the block shift of the whole state space becomes a logical and.

64 is the smallest possible size if you use 64-bit shifts, and it makes it very easy to have a generator embedded in a Java object without creating an additional object. A generator with 64 bits of state can also be used to generate the seed for a generator with a larger state. We suggest in this case to use a SplitMix64 generator.

128 works very well because you can just swap the two halves of the state space instead of keeping a circular counter. This makes for the incredible speed of a xorshift128+ generator, which is a very reasonable choice for a general-purpose generator if no parallelism is involved. Also in this case it is very easy to embed the generator without creating an additional Java object, as the state space can be represented by two long variables.

1024 seems the most reasonable choice for a general-purpose generator for massively parallel applications: it is large enough, but not uselessly large. ... " -- http://xorshift.di.unimi.it/
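
the "swap the two halves" trick is visible in the xorshift128+ reference code below; note the shift triple (23, 18, 5) used here is one published set of constants, so check the site/paper for the currently recommended values:

    #include <stdint.h>

    static uint64_t s[2];   /* state; must not be all zero */

    /* the old s[1] simply becomes the new s[0] (the "swap");
       only one half of the state is transformed per step */
    uint64_t xorshift128plus(void) {
        uint64_t s1 = s[0];
        const uint64_t s0 = s[1];
        const uint64_t result = s0 + s1;
        s[0] = s0;
        s1 ^= s1 << 23;                           /* a */
        s[1] = s1 ^ s0 ^ (s1 >> 18) ^ (s0 >> 5);  /* b, c */
        return result;
    }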

http://xorshift.di.unimi.it/ compares various xorshift, xorshift*, and xorshift+ generators, as well as MT19937-64 (Mersenne Twister) and WELL1024a

http://www.pcg-random.org/other-rngs.html

the testu01 paper http://www.iro.umontreal.ca/~lecuyer/myftp/papers/testu01.pdf has a big table of various algorithms and how they did on the tests. The ones which passed all the tests and had no 'suspicious' outcomes are the ones with blank lines (NOT dashes) in the rightmost 3 columns. These are (with parens separating families): (ran2, CLCG4), (MRGk5-93, DX-47-3, DX-1597-2-7, Marsa-LFIB4, CombMRG96, MRG32k3a, MRG63k3a), (Knuth-ranf-array2, SWB(2^24, 10, 24)[24, 97], SWB(2^24, 10, 24)[24, 389]), (SuperDuper64, KISS99, Brent-xor4096s), LFib(2^64, 17, 5, *), LFib(2^64, 55, 24, *), LFib(2^64, 607, 273, *), LFib(2^64, 1279, 861, *), (ISAAC, AES (OFB), AES (KTR), SHA-1 (OFB), SHA-1 (CTR)). The paper says, "The default generators of many popular software programs (e.g., Excel, MATLAB, Mathematica, Java standard library, R, etc.) fail several tests miserably. The survivors are the long-period MRGs with good structure, the multiplicative lagged-Fibonacci generators, some nonlinear generators designed for cryptology, and some combined generators with components from different families. SWB and additive lagged-Fibonacci generators also pass when using appropriate decimation, but the decimation slows them down significantly. We believe that combined generators with components from different families should be given better attention because theoretical guarantees about their uniformity can be proved [L’Ecuyer and Granger-Piché 2003], their period can easily be made very long, splitting their period into long disjoint substreams is easy to do if we can do it for the components and it is not hard to select the components with this in mind. We emphasize that the RNG implementations in TestU01 are only for testing purposes. For simulation, we recommend RNGs with multiple streams and substreams such as those described in L’Ecuyer et al. [2002] and L’Ecuyer and Buist [2005], for example."

summary: use one of:

(do xorshift* and xorshift+ support 'multiple streams'?)

and for testing, rely on TestU01.

---

https://news.ycombinator.com/item?id=10605562 pcwalton notes:

" My experience with compiler development is that the incentives all align toward making the "default RNG that people go to" as fast as possible to the exclusion of all else, including the quality of the generated numbers. That's because people frequently write benchmarks gated on the speed of random number generation, and those benchmarks usually don't care about the quality of the random numbers. Sometimes those benchmarks get popular, which encourages a race to the bottom in RNG quality."

im2w1l 13 hours ago

Fast hash functions can be really important for performance of hash sets and maps. In my experience high speed is more important than a low collision rate.

That was for primitive types, it may well be different for types where comparison is more costly. And of course if you need protection from algorithmic complexity attacks, then you will need to take that into account.

Gankro 8 hours ago

As long as your usecase isn't worried about hash-flooding [0]. Unfortunately too few people know this is even an attack, so it may be reasonable for languages to default to a stronger hash algorithm for safety, but this isn't the common case... language design is hard.

[0]: https://www.youtube.com/watch?v=wGYj8fhhUVA

KMag 6 hours ago

There isn't much complexity and performance cost of optimistically using a faster hash, and dynamically falling back to siphash (or another secure keyed hash function).

Using associative arrays implemented via chaining, one could have a bit in the associative array header that indicates if the associative array uses the fast keyed hash or siphash (or another secure hash). When inserting an item, if one finds that the chain length exceeds a certain limit without the load exceeding the resize threshold (indicating poor hash distribution), one could rehash all of the keys using siphash.

An associative array using open addressing could do something similar, switching hash functions if the probe sequence got too long (analogous to a chain being too long).

Alternatively, if using open addressing, one could use something like cuckoo hashing. A very fast keyed hash could be computed, and upon a collision, siphash could be used, with a probe sequence similar to used by Python dicts. Though, if your use case has lots of missed lookups, then performance will be better just using siphash. Of course, one could use counters to detect this condition, set a bit in the associative array header to indicate all future lookups should just use siphash, and rehash all of the existing keys.
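
(aside: a condensed C sketch of the fallback scheme described above — every name here is a placeholder of mine, and fast_hash/siphash24 are assumed to be provided elsewhere:)

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* placeholders for a fast keyed hash and SipHash-2-4 */
    uint64_t fast_hash(const void *key, size_t len, const uint64_t seed[2]);
    uint64_t siphash24(const void *key, size_t len, const uint64_t seed[2]);

    typedef struct {
        /* buckets, chains, ... */
        size_t nbuckets;
        size_t nitems;
        bool   use_siphash;   /* the one bit in the header */
        uint64_t seed[2];
    } hashmap;

    uint64_t map_hash(const hashmap *m, const void *key, size_t len) {
        return m->use_siphash ? siphash24(key, len, m->seed)
                              : fast_hash(key, len, m->seed);
    }

    /* called after an insert lands in a chain of length chain_len:
       a suspiciously long chain while the load factor is still modest
       suggests the fast hash is being flooded, so switch to SipHash */
    void maybe_fall_back(hashmap *m, size_t chain_len) {
        const size_t MAX_CHAIN = 8;   /* illustrative limit */
        if (!m->use_siphash && chain_len > MAX_CHAIN &&
            m->nitems < m->nbuckets /* load below resize threshold */) {
            m->use_siphash = true;
            /* rehash_all(m);  -- recompute map_hash() for every stored key */
        }
    }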

Gankro 5 hours ago

Rust's solution is to provide the hash function as a generic parameter so you can just Fnv/Xx/Sip based on your workload, with Sip as the default if you don't pick. Unfortunately this is mostly incompatible with the adaptive scheme you suggest, because you definitely don't want the adaptive logic if you've already picked Xx/Fnv. Making it possible to disable all that if you pick something other than Sip would make things quite complicated.

But yeah, data structuring is ultimately really work-load specific. Any solution a language provides will always be suboptimal for tons of use cases. The solution you describe sounds good for a "never think about it" solution that works for 85% of cases. Rust's solution kinda necessitates more thinking more often, but I think it makes it more applicable for more usecases if you're willing to do that little bit of thinking.

jbapple 6 hours ago

Fortunately, there are very fast universal hash functions like Vhash that are resistant to hash-flooding.

Gankro 5 hours ago

I've literally never heard of Vhash, and I see no references to it outside of the 2007 paper. As far as I know, the state of the art for "fast and secure" is still SipHash 2-4 or 2-3.

While SipHash perf is generally quite good, it's still not as fast as Fnv on small inputs or XxHash on large inputs. As in, 4x slower [0]. FarmHash has great perf if you can invoke it in the way it wants, but doesn't do well in a streaming context.

[0]: http://cglab.ca/~abeinges/blah/hash-rs/

jbapple 5 hours ago

> I've literally never heard of Vhash, and I see no references to it outside of the 2007 paper.

It is intimately tied up with VMAC. The title of the 2006 paper that introduced them is "Message authentication on 64-bit architectures". Google scholar says it has been cited 32 times, while "SipHash: a fast short-input PRF" has been cited 43 times.

> As far as I know, the state of the art for "fast and secure" is still SipHash 2-4 or 2-3.

SipHash seems to me like a good choice.

SipHash has some security properties that Vhash does not, although Vhash-based-VMAC has similar security properties to SipHash, I believe. However, for hash flooding attacks, if the hash values themselves are not directly exposed to the attackers and there is enough noise in the response latency that timing attacks are difficult, I do not know what benefits SipHash has.

"Faster 64-bit universal hashing using carry-less multiplications" has some benchmarks that show Vhash as faster than SipHash, substantially so on long input.

Speed-wise, an iterated string hash using Dietzfelbinger-style multiply-shift hashing is also substantially faster (3-10x) than SipHash on both large and small inputs in my testing, and it needs about 20 lines of code to implement. I haven't written any Rust in a while, but I'll try and send you a patch to the repo you linked.

The benchmarks from the "Faster 64-bit ..." paper are available at https://github.com/lemire/StronglyUniversalStringHashing. The iterated string hash I referenced is https://github.com/lemire/StronglyUniversalStringHashing/blo....

Gankro 4 hours ago

Awesome, cool!

The repo's a bit of a commented-out mess (I was trying to get a bunch of broken impls working before I had to get back to real work), so let me know if you have any trouble. Always happy to help people learn the language. :)

I'm Gankro on github and the #rust IRC channel.

nilknarf 10 hours ago

I have previously written about how to predict the next Math.random in Java (and hence Firefox also): http://franklinta.com/2014/08/31/predicting-the-next-math-ra... The TL;DR; is that it is easy since you only have to brute force around 2^22 possibilities (which runs in < 1 second).

Someone then asked whether it is also possible in Chrome and Node: https://github.com/fta2012/ReplicatedRandom/issues/2. After examining the MWC1616 code I saw the same “two concatenated sub-generators” problem explained in this post. The implication of it is that you can brute force the top and bottom 16 bits independently. So it is also easy since that’s just doing 2^16 possibilities twice! Code: https://gist.github.com/fta2012/57f2c48702ac1e6fe99b

 bhickey 9 hours ago

I haven't looked closely enough at their choice to evaluate it, but there's a good reason not to use Mersenne Twister. MT is a bad generator.

If you need a CSPRNG use ChaCha20 or AES. For simulation use PCG.

SeanLuke 5 hours ago

I wouldn't touch PCG yet. It's based on a single paper submitted to a journal which hasn't even been reviewed yet. Has there been any independent testing of the algorithm? Not that I know of. And the entire PCG website appears to have been built by the paper author as a promotional tool, which is a bit spooky given that the paper hasn't even been published.

MT is far from a "bad generator". It doesn't pass a few stringent TestU01 tests, which is the case for a number of very well regarded generators.

adrae5df 5 hours ago

I was just researching PCG, and I didn't find anything relevant apart from the author's paper and website. I was trying to find out if it has been peer reviewed since its release last year. I'm really curious, how did you check if this paper was peer reviewed?

acqq 7 hours ago

The best advice of all I've seen in the comments here. The PCG page:

http://www.pcg-random.org/

contains a nice comparison table.

adrae5df 6 hours ago

Can you name those rudimentary statistical tests that Mersenne Twister fails. I would like to test them out, since I regularly use MT. Thank you.

acqq 5 hours ago

Apparently here: https://en.m.wikipedia.org/wiki/TestU01

Wikipedia entry for MT specifies some of the problems.

mmalone 4 hours ago

MT only fails LinearComp systematically. It is not a bad generator, it is a very good generator. The biggest problem is the "escape from zero-land" stuff where it's biased towards zero for a while if its seed is mostly zeros, but other generators have the same problem (though not to the same extent) and it's still statistically very unlikely to be a problem and it self corrects as the sequence runs.

Comparing to ChaCha20, which is a CSPRNG, is a little apples/oranges.

acqq 2 hours ago

The author (or authors?) of MT have some more recent generators that are faster, what's the reason for staying with the original MT?

mmalone 40 minutes ago

Just familiarity. Mostly I just can't recommend something I don't know much about. The SIMD variant of MT is architecture specific, I thought? Other newer variants of MT/WELL generators and PCG are probably all good options. Also xorshift*.

But I know that MT is a very good algorithm and using it has a certain pragmatic appeal - you make a lot of people's lives easier because when they Google to learn more about your generator they'll find good information quickly.

panic 13 hours ago

Statistical simulations require good randomness (or your simulation might give a wrong result) and high speed but not cryptographic security. See, for example, xorshift* (http://xorshift.di.unimi.it), which is efficient and high-quality but not cryptographically secure.


since Oot prioritizes simplicity over performance, we should provide the safest option for default random number generation, with an additional 'fast_' option if speed at the expense of safety is desired. But we need seeding.

i don't yet understand why some PRNGs are good for simulation and some aren't, and i don't understand the multiple streams stuff, which seems to be related. Are xorshift* and xorshift+ good for simulation?

So, probably use a CSPRNG by default (ChaCha20? AES?), and then offer one or more of xorshift*, xorshift+, PCG as a rand_fast for simulation stuff (which of these has the smallest state, for the embedded profile? xorshift64*? in the embedded profile, the client must manually init the generator b/c we don't want to waste memory otherwise).

One issue is that we want a high-level program to be able to set this default and then call lower-level constructs that use it. It seems to me that this is a good use case for semiglobals (semiglobals are also good for storing the random seed and the random state).

---