Evolving the Go Standard Library with math/rand/v2

https://go.dev/blog/randv2

The Go Blog

Since Go 1 was released in March 2012, changes to the standard library have been constrained by Go’s compatibility promise. Overall, compatibility has been a boon for Go users, providing a stable base for production systems, documentation, tutorials, books, and more. Over time, however, we’ve realized mistakes in the original APIs that cannot be fixed compatibly; in other cases, best practices and convention have changed. We need a plan for making important, breaking changes too.

This blog post is about Go 1.22’s new math/rand/v2 package, the first “v2” in the standard library. It brings needed improvements to the math/rand API, but more importantly it sets an example for how we can revise other standard library packages as the need arises.

(In Go, math/rand and math/rand/v2 are two different packages with different import paths. Go 1 and every release after it have included math/rand; Go 1.22 added math/rand/v2. A Go program can import either package, or both.)

This post discusses the specific rationale for the changes in math/rand/v2 and then reflects on the general principles that will guide new versions of other packages.

Pseudorandom Number Generators

Before we look at math/rand, which is an API for a pseudorandom number generator, let’s take a moment to understand what that means.

A pseudorandom number generator is a deterministic program that generates a long sequence of seemingly random numbers from a small seed input, although the numbers are not in fact random at all. In the case of math/rand, the seed is a single int64, and the algorithm produces a sequence of int64s using a variant of a linear-feedback shift register (LFSR). The algorithm is based on an idea by George Marsaglia, tweaked by Don Mitchell and Jim Reeds, and further customized by Ken Thompson for Plan 9 and then Go. It has no official name, so this post calls it the Go 1 generator.

The goal is for these generators to be fast, repeatable, and random enough to support simulations, shuffling, and other non-cryptographic use cases. Repeatability is particularly important for uses like numerical simulations or randomized testing. For example, a randomized tester might pick a seed (perhaps based on the current time), generate a large random test input, and repeat. When the tester finds a failure, it only needs to print the seed to allow repeating the test with that specific large input.

Repeatability also matters over time: given a particular seed, a new version of Go needs to generate the same sequence of values that an older version did. We didn’t realize this when we released Go 1; instead, we discovered it the hard way, when we tried to make a change in Go 1.2 and got reports that we had broken certain tests and other use cases. At that point, we decided Go 1 compatibility included the specific random outputs for a given seed and added a test.

It is not a goal for these kinds of generators to produce random numbers suitable for deriving cryptographic keys or other important secrets. Because the seed is only 63 bits, any output drawn from the generator, no matter how long, will also only contain 63 bits of entropy. For example, using math/rand to generate a 128-bit or 256-bit AES key would be a serious mistake, since the key would be easier to brute force. For that kind of use, you need a cryptographically strong random number generator, as provided by crypto/rand.

That’s enough background that we can move on to what needed fixing in the math/rand package.

Problems with math/rand

Over time, we noticed more and more problems with math/rand. The most serious were the following.

Generator Algorithm

The generator itself needed replacement.

The initial implementation of Go, while production ready, was in many ways a “pencil sketch” of the entire system, working well enough to serve as a base for future development: the compiler and runtime were written in C; the garbage collector was a conservative, single-threaded, stop-the-world collector; and the libraries used basic implementations throughout. From Go 1 through around Go 1.5, we went back and drew the “fully inked” version of each of these: we converted the compiler and runtime to Go; we wrote a new, precise, parallel, concurrent garbage collection with microsecond pause times; and we replaced standard library implementations with more sophisticated, optimized algorithms as needed.

Unfortunately, the repeatability requirement in math/rand meant that we couldn’t replace the generator there without breaking compatibility. We were stuck with the Go 1 generator, which is reasonably fast (about 1.8ns per number on my M3 Mac) but maintains an internal state of almost 5 kilobytes. In contrast, Melissa O’Neill’s PCG family of generators generates better random numbers in about 2.1ns per number with only 16 bytes of internal state. We also wanted to explore using Daniel J. Bernstein’s ChaCha stream cipher as a generator. A follow-up post discusses that generator specifically.

Source Interface

The rand.Source interface was wrong. That interface defines the concept of a low-level random number generator that generates non-negative int64 values:

% go doc -src math/rand.Source
package rand // import "math/rand"
// A Source represents a source of uniformly-distributed
// pseudo-random int64 values in the range [0, 1<<63).
//
// A Source is not safe for concurrent use by multiple goroutines.
type Source interface {
    Int63() int64
    Seed(seed int64)
}
func NewSource(seed int64) Source
%

(In the doc comment, “[0, N)” denotes a half-open interval, meaning the range includes 0 but ends just before 2⁶³.)

The rand.Rand type wraps a Source to implement a richer set of operations, such as generating an integer between 0 and N, generating floating-point numbers, and so on.

We defined the Source interface to return a shortened 63-bit value instead of a uint64 because that’s what the Go 1 generator and other widely-used generators produce, and it matches the convention set by the C standard library. But this was a mistake: more modern generators produce full-width uint64s, which is a more convenient interface.

Another problem is the Seed method hard-coding an int64 seed: some generators are seeded by larger values, and the interface provides no way to handle that.

Seeding Responsibility

A bigger problem with Seed is that responsibility for seeding the global generator was unclear. Most users don’t use Source and Rand directly. Instead, the math/rand package provides a global generator accessed by top-level functions like Intn. Following the C standard library, the global generator defaults to behaving as if Seed(1) is called at startup. This is good for repeatability but bad for programs that want their random outputs to be different from one run to the next. The package documentation suggests using rand.Seed(time.Now().UnixNano()) in that case, to make the generator’s output time-dependent, but what code should do this?

Probably the main package should be in charge of how math/rand is seeded: it would be unfortunate for imported libraries to configure global state themselves, since their choices might conflict with other libraries or the main package. But what happens if a library needs some random data and wants to use math/rand? What if the main package doesn’t even know math/rand is being used? We found that in practice many libraries add init functions that seed the global generator with the current time, “just to be sure”.

Library packages seeding the global generator themselves causes a new problem. Suppose package main imports two packages that both use math/rand: package A assumes the global generator will be seeded by package main, but package B seeds it in an init func. And suppose that package main doesn’t seed the generator itself. Now package A’s correct operation depends on the coincidence that package B is also imported in the program. If package main stops importing package B, package A will stop getting random values. We observed this happening in practice in large codebases.

In retrospect, it was clearly a mistake to follow the C standard library here: seeding the global generator automatically would remove the confusion about who seeds it, and users would stop being surprised by repeatable output when they didn’t want that.

Scalability

The global generator also did not scale well. Because top-level functions like rand.Intn can be called simultaneously from multiple goroutines, the implementation needed a lock protecting the shared generator state. In parallel usage, acquiring and releasing this lock was more expensive than the actual generation. It would make sense instead to have a per-thread generator state, but doing so would break repeatability in programs without concurrent use of math/rand.

The Rand implementation was missing important optimizations

The rand.Rand type wraps a Source to implement a richer set of operations. For example, here is the Go 1 implementation of Int63n, which returns a random integer in the range [0, n).

func (r *Rand) Int63n(n int64) int64 {
    if n <= 0 {
        panic("invalid argument to Int63n")
    }
    max := int64((1<<63 - 1)  - (1<<63)%uint64(n))
    v := r.src.Int63()
    for v > max {
        v = r.Int63()
    }
    return v % n
}

The actual conversion is easy: v % n. However, no algorithm can convert 2⁶³ equally likely values into n equally likely values unless 2⁶³ is a multiple of n: otherwise some outputs will necessarily happen more often than others. (As a simpler example, try converting 4 equally likely values into 3.) The code computes max such that max+1 is the largest multiple of n less than or equal to 2⁶³, and then the loop rejects random values greater than or equal to max+1. Rejecting these too-large values ensures that all n outputs are equally likely. For small n, needing to reject any value at all is rare; rejection becomes more common and more important for larger values. Even without the rejection loop, the two (slow) modulus operations can make the conversion more expensive than generating the random value v in the first place.

In 2018, Daniel Lemire found an algorithm that avoids the divisions nearly all the time (see also his 2019 blog post). In math/rand, adopting Lemire’s algorithm would make Intn(1000) 20-30% faster, but we can’t: the faster algorithm generates different values than the standard conversion, breaking repeatability.

Other methods are also slower than they could be, constrained by repeatability. For example, the Float64 method could easily be sped up by about 10% if we could change the generated value stream. (This was the change we tried to make in Go 1.2 and rolled back, mentioned earlier.)

The Read Mistake

As mentioned earlier, math/rand is not intended for and not suitable for generating cryptographic secrets. The crypto/rand package does that, and its fundamental primitive is its Read function and Reader variable.

In 2015, we accepted a proposal to make rand.Rand implement io.Reader as well, along with adding a top-level Read function. This seemed reasonable at the time, but in retrospect we did not pay enough attention to the software engineering aspects of this change. Now, if you want to read random data, you have two choices: math/rand.Read and crypto/rand.Read. If the data is going to be used for key material, it is very important to use crypto/rand, but now it is possible to use math/rand instead, potentially with disastrous consequences.

Tools like goimports and gopls have a special case to make sure they prefer to use rand.Read from crypto/rand instead of math/rand, but that’s not a complete fix. It would be better to remove Read entirely.

Fixing math/rand directly

Making a new, incompatible major version of a package is never our first choice: that new version only benefits programs that switch to it, leaving all existing usage of the old major version behind. In contrast, fixing a problem in the existing package has much more impact, since it fixes all the existing usage. We should never create a v2 without doing as much as possible to fix v1. In the case of math/rand, we were able to partly address a few of the problems described above:

  • Go 1.8 introduced an optional Source64 interface with a Uint64 method. If a Source also implements Source64, then Rand uses that method when appropriate. This “extension interface” pattern provides a compatible (if slightly awkward) way to revise an interface after the fact.

  • Go 1.20 automatically seeded the top-level generator and deprecated rand.Seed. Although this may seem like an incompatible change given our focus on repeatability of the output stream, we reasoned that any imported package that called rand.Int at init time or inside any computation would also visibly change the output stream, and surely adding or removing such a call cannot be considered a breaking change. And if that’s true, then auto-seeding is no worse, and it would eliminate this source of fragility for future programs. We also added a GODEBUG setting to opt back into the old behavior. Then we marked the top-level rand.Seed as deprecated. (Programs that need seeded repeatability can still use rand.New(rand.NewSource(seed)) to obtain a local generator instead of using the global one.)

  • Having eliminated repeatability of the global output stream, Go 1.20 was also able to make the global generator scale better in programs that don’t call rand.Seed, replacing the Go 1 generator with a very cheap per-thread wyrand generator already used inside the Go runtime. This removed the global mutex and made the top-level functions scale much better. Programs that do call rand.Seed fall back to the mutex-protected Go 1 generator.

  • We were able to adopt Lemire’s optimization in the Go runtime, and we also used it inside rand.Shuffle, which was implemented after Lemire’s paper was published.

  • Although we couldn’t remove rand.Read entirely, Go 1.20 marked it deprecated in favor of crypto/rand. We have since heard from people who discovered that they were accidentally using math/rand.Read in cryptographic contexts when their editors flagged the use of the deprecated function.

These fixes are imperfect and incomplete but also real improvements that helped all users of the existing math/rand package. For more complete fixes, we needed to turn our attention to math/rand/v2.

Fixing the rest in math/rand/v2

Defining math/rand/v2 took significant planning, then a GitHub Discussion and then a proposal discussion. It is the same as math/rand with the following breaking changes addressing the problems outlined above:

  • We removed the Go 1 generator entirely, replacing it with two new generators, PCG and ChaCha8. The new types are named for their algorithms (avoiding the generic name NewSource) so that if another important algorithm needs to be added, it will fit well into the naming scheme.

    Adopting a suggestion from the proposal discussion, the new types implement the encoding.BinaryMarshaler and encoding.BinaryUnmarshaler interfaces.

  • We changed the Source interface, replacing the Int63 method with a Uint64 method and deleting the Seed method. Implementations that support seeding can provide their own concrete methods, like PCG.Seed and ChaCha8.Seed. Note that the two take different seed types, and neither is a single int64.

  • We removed the top-level Seed function: the global functions like Int can only be used in auto-seeded form now.

  • Removing the top-level Seed also let us hard-code the use of scalable, per-thread generators by the top-level methods, avoiding a GODEBUG check at each use.

  • We implemented Lemire’s optimization for Intn and related functions. The concrete rand.Rand API is now locked in to that value stream, so we will not be able to take advantage of any optimizations yet to be discovered, but at least we are up to date once again. We also implemented the Float32 and Float64 optimizations we wanted to use back in Go 1.2.

  • During the proposal discussion, a contributor pointed out detectable bias in the implementations of ExpFloat64 and NormFloat64. We fixed that bias and locked in the new value streams.

  • Perm and Shuffle used different shuffling algorithms and produced different value streams, because Shuffle happened second and used a faster algorithm. Deleting Perm entirely would have made migration harder for users. Instead we implemented Perm in terms of Shuffle, which still lets us delete an implementation.

  • We renamed Int31, Int63, Intn, Int31n, and Int63n to Int32, Int64, IntN, Int32N, and Int64N. The 31 and 63 in the names were unnecessarily pedantic and confusing, and the capitalized N is more idiomatic for a second “word” in the name in Go.

  • We added Uint, Uint32, Uint64, UintN, Uint32N, and Uint64N top-level functions and methods. We needed to add Uint64 to provide direct access to the core Source functionality, and it seemed inconsistent not to add the others.

  • Adopting another suggestion from the proposal discussion, we added a new top-level, generic function N that is like Int64N or Uint64N but works for any integer type. In the old API, to create a random duration of up to 5 seconds, it was necessary to write:

    d := time.Duration(rand.Int63n(int64(5*time.Second)))
    

    Using N, the equivalent code is:

    d := rand.N(5 * time.Second)
    

    N is only a top-level function; there is no N method on rand.Rand because there are no generic methods in Go. (Generic methods are not likely in the future, either; they conflict badly with interfaces, and a complete implementation would require either run-time code generation or slow execution.)

  • To ameliorate misuse of math/rand in cryptographic contexts, we made ChaCha8 the default generator used in global functions, and we also changed the Go runtime to use it (replacing wyrand). Programs are still strongly encouraged to use crypto/rand to generate cryptographic secrets, but accidentally using math/rand/v2 is not as catastrophic as using math/rand would be. Even in math/rand, the global functions now use the ChaCha8 generator when not explicitly seeded.

Principles for evolving the Go standard library

As mentioned at the start this post, one of the goals for this work was to establish principles and a pattern for how we approach all v2 packages in the standard library. There will not be a glut of v2 packages in the next few Go releases. Instead, we will handle one package at a time, making sure we set a quality bar that will last for another decade. Many packages will not need a v2 at all. But for those that do, our approach boils down to three principles.

First, a new, incompatible version of a package will use that/package/v2 as its import path, following semantic import versioning just like a v2 module outside the standard library would. This allows uses of the original package and the v2 package to coexist in a single program, which is critical for a gradual conversion to the new API.

Second, all changes must be rooted in respect for existing usage and users: we must not introduce needless churn, whether in the form of unnecessary changes to an existing package or an entirely new package that must be learned instead. In practice, that means we take the existing package as the starting point and only make changes that are well motivated and provide a value that justifies the cost to users of updating.

Third, the v2 package must not leave v1 users behind. Ideally, the v2 package should be able to do everything the v1 package could do, and when v2 is released, the v1 package should be rewritten to be a thin wrapper around v2. This would ensure that existing uses of v1 continue to benefit from bug fixes and performance optimizations in v2. Of course, given that v2 is introducing breaking changes, this is not always possible, but it is always something to consider carefully. For math/rand/v2, we arranged for the auto-seeded v1 functions to call the v2 generator, but we were unable to share other code due to the repeatability violations. Ultimately math/rand is not a lot of code and does not require regular maintenance, so the duplication is manageable. In other contexts, more work to avoid duplication could be worthwhile. For example, in the encoding/json/v2 design (still in progress), although the default semantics and the API are changed, the package provides configuration knobs that make it possible to implement the v1 API. When we eventually ship encoding/json/v2, encoding/json (v1) will become a thin wrapper around it, ensuring that users who don’t migrate from v1 still benefit from optimizations and security fixes in v2.

A follow-up blog post presents the ChaCha8 generator in more detail.

{
"by": "spacey",
"descendants": 44,
"id": 40224864,
"kids": [
40231203,
40233545,
40231842,
40229297,
40230698,
40232858,
40253474
],
"score": 156,
"time": 1714578132,
"title": "Evolving the Go Standard Library with math/rand/v2",
"type": "story",
"url": "https://go.dev/blog/randv2"
}
{
"author": "Russ Cox 1 May 2024",
"date": null,
"description": "Go 1.22 adds math/rand/v2 and charts a course for the evolution of the Go standard library.",
"image": "https://go.dev/doc/gopher/runningsquare.jpg",
"logo": "https://logo.clearbit.com/go.dev",
"publisher": "Go",
"title": "Evolving the Go Standard Library with math/rand/v2 - The Go Programming Language",
"url": "https://go.dev/blog/randv2"
}
{
"url": "https://go.dev/blog/randv2",
"title": "Evolving the Go Standard Library with math/rand/v2 - The Go Programming Language",
"description": "The Go Blog Since Go 1 was released in March 2012, changes to the standard library have been constrained by Go’s compatibility promise. Overall, compatibility has been a boon for Go users, providing a...",
"links": [
"https://go.dev/blog/randv2"
],
"image": "https://go.dev/doc/gopher/runningsquare.jpg",
"content": "<div>\n <h2><a target=\"_blank\" href=\"https://go.dev/blog/\">The Go Blog</a></h2>\n <p>Since Go 1 was <a target=\"_blank\" href=\"https://go.dev/blog/go1\">released in March 2012</a>,\nchanges to the standard library have been\nconstrained by Go’s <a target=\"_blank\" href=\"https://go.dev/doc/go1compat\">compatibility promise</a>.\nOverall, compatibility has been a boon for Go users,\nproviding a stable base for production systems,\ndocumentation, tutorials, books, and more.\nOver time, however, we’ve realized mistakes in the original APIs\nthat cannot be fixed compatibly; in other cases,\nbest practices and convention have changed.\nWe need a plan for making important, breaking changes too.</p>\n<p>This blog post is about Go 1.22’s new <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/v2/\"><code>math/rand/v2</code></a> package,\nthe first “v2” in the standard library.\nIt brings needed improvements to the <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/\"><code>math/rand</code></a> API,\nbut more importantly it sets an example for how we can\nrevise other standard library packages as the need arises.</p>\n<p>(In Go, <code>math/rand</code> and <code>math/rand/v2</code> are two different packages\nwith different import paths.\nGo 1 and every release after it have included <code>math/rand</code>; Go 1.22 added <code>math/rand/v2</code>.\nA Go program can import either package, or both.)</p>\n<p>This post discusses the specific rationale for the changes in <code>math/rand/v2</code>\nand then <a target=\"_blank\" href=\"https://go.dev/blog/randv2#principles\">reflects on the general principles</a> that will guide\nnew versions of other packages.</p>\n<h2 id=\"pseudo\">Pseudorandom Number Generators</h2>\n<p>Before we look at <code>math/rand</code>, which is an API for a pseudorandom number generator,\nlet’s take a moment to understand what that means.</p>\n<p>A pseudorandom number generator is a deterministic program\nthat generates a long sequence of\nseemingly random numbers from a small seed input,\nalthough the numbers are not in fact random at all.\nIn the case of <code>math/rand</code>, the seed is a single int64,\nand the algorithm produces a sequence of int64s\nusing a variant of a\n<a href=\"https://en.wikipedia.org/wiki/Linear-feedback_shift_register\" target=\"_blank\">linear-feedback shift register (LFSR)</a>.\nThe algorithm is based on an idea by George Marsaglia,\ntweaked by Don Mitchell and Jim Reeds,\nand further customized by Ken Thompson for Plan 9 and then Go.\nIt has no official name, so this post calls it the Go 1 generator.</p>\n<p>The goal is for these generators to be fast,\nrepeatable, and random enough to support simulations,\nshuffling, and other non-cryptographic use cases.\nRepeatability is particularly important for uses like\nnumerical simulations or randomized testing.\nFor example, a randomized tester might pick a seed\n(perhaps based on the current time), generate\na large random test input, and repeat.\nWhen the tester finds a failure, it only needs to print the seed\nto allow repeating the test with that specific large input.</p>\n<p>Repeatability also matters over time: given a particular\nseed, a new version of Go needs to generate the same\nsequence of values that an older version did.\nWe didn’t realize this when we released Go 1;\ninstead, we discovered it the hard way,\nwhen we tried to make a change in Go 1.2\nand got reports that we had broken certain tests\nand other use cases.\nAt that point, we decided Go 1 compatibility included\nthe specific random outputs for a given seed\nand <a target=\"_blank\" href=\"https://go.dev/change/5aca0514941ce7dd0f3cea8d8ffe627dbcd542ca\">added a test</a>.</p>\n<p>It is not a goal for these kinds of generators to produce\nrandom numbers suitable for deriving cryptographic keys\nor other important secrets.\nBecause the seed is only 63 bits,\nany output drawn from the generator, no matter how long,\nwill also only contain 63 bits of entropy.\nFor example, using <code>math/rand</code> to generate a 128-bit or 256-bit AES key\nwould be a serious mistake,\nsince the key would be easier to brute force.\nFor that kind of use, you need a cryptographically strong\nrandom number generator, as provided by <a target=\"_blank\" href=\"https://go.dev/pkg/crypto/rand/\"><code>crypto/rand</code></a>.</p>\n<p>That’s enough background that we can move on to what needed\nfixing in the <code>math/rand</code> package.</p>\n<h2 id=\"problems\">Problems with <code>math/rand</code></h2>\n<p>Over time, we noticed more and more problems with <code>math/rand</code>.\nThe most serious were the following.</p>\n<h3 id=\"problem.generator\">Generator Algorithm</h3>\n<p>The generator itself needed replacement.</p>\n<p>The initial implementation of Go, while production ready, was in many ways a “pencil sketch”\nof the entire system, working well enough to serve as a base for future development:\nthe compiler and runtime were written in C; the garbage collector was a conservative, single-threaded,\nstop-the-world collector; and the libraries used basic implementations throughout.\nFrom Go 1 through around Go 1.5, we went back and drew the “fully inked”\nversion of each of these: we converted the compiler and runtime to Go; we wrote a new, precise, parallel,\nconcurrent garbage collection with microsecond pause times; and we replaced\nstandard library implementations with more sophisticated, optimized algorithms\nas needed.</p>\n<p>Unfortunately, the repeatability requirement in <code>math/rand</code>\nmeant that we couldn’t replace the generator there without\nbreaking compatibility.\nWe were stuck with the Go 1 generator,\nwhich is reasonably fast (about 1.8ns per number on my M3 Mac)\nbut maintains an internal state of almost 5 kilobytes.\nIn contrast, Melissa O’Neill’s <a href=\"https://www.pcg-random.org/\" target=\"_blank\">PCG family of generators</a>\ngenerates better random numbers in about 2.1ns per number\nwith only 16 bytes of internal state.\nWe also wanted to explore using\nDaniel J. Bernstein’s <a href=\"https://cr.yp.to/chacha.html\" target=\"_blank\">ChaCha stream cipher</a>\nas a generator.\nA <a target=\"_blank\" href=\"https://go.dev/blog/chacha8rand\">follow-up post</a> discusses that generator specifically.</p>\n<h3 id=\"problem.source\">Source Interface</h3>\n<p>The <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Source\"><code>rand.Source</code> interface</a> was wrong.\nThat interface defines the\nconcept of a low-level random number generator that generates\nnon-negative <code>int64</code> values:</p>\n<pre><code>% go doc -src math/rand.Source\npackage rand // import \"math/rand\"\n// A Source represents a source of uniformly-distributed\n// pseudo-random int64 values in the range [0, 1&lt;&lt;63).\n//\n// A Source is not safe for concurrent use by multiple goroutines.\ntype Source interface {\n Int63() int64\n Seed(seed int64)\n}\nfunc NewSource(seed int64) Source\n%\n</code></pre>\n<p>(In the doc comment, “[0, N)” denotes a\n<a href=\"https://en.wikipedia.org/wiki/Interval_(mathematics)#Definitions_and_terminology\" target=\"_blank\">half-open interval</a>,\nmeaning the range includes 0 but ends just before 2⁶³.)</p>\n<p>The <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Rand\"><code>rand.Rand</code> type</a> wraps a <code>Source</code>\nto implement a richer set of operations, such as\ngenerating <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Rand.Intn\">an integer between 0 and N</a>,\ngenerating <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Rand.Float64\">floating-point numbers</a>, and so on.</p>\n<p>We defined the <code>Source</code> interface to return a shortened 63-bit value\ninstead of a uint64 because that’s what the Go 1 generator and\nother widely-used generators produce,\nand it matches the convention set by the C standard library.\nBut this was a mistake: more modern generators produce full-width uint64s,\nwhich is a more convenient interface.</p>\n<p>Another problem is the <code>Seed</code> method hard-coding an <code>int64</code> seed:\nsome generators are seeded by larger values,\nand the interface provides no way to handle that.</p>\n<h3 id=\"problem.seed\">Seeding Responsibility</h3>\n<p>A bigger problem with <code>Seed</code> is that responsibility for seeding the global generator was unclear.\nMost users don’t use <code>Source</code> and <code>Rand</code> directly.\nInstead, the <code>math/rand</code> package provides a global generator\naccessed by top-level functions like <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Intn\"><code>Intn</code></a>.\nFollowing the C standard library, the global generator defaults to\nbehaving as if <code>Seed(1)</code> is called at startup.\nThis is good for repeatability but bad for programs that want\ntheir random outputs to be different from one run to the next.\nThe package documentation suggests using <code>rand.Seed(time.Now().UnixNano())</code> in that case,\nto make the generator’s output time-dependent,\nbut what code should do this?</p>\n<p>Probably the main package should be in charge of how <code>math/rand</code> is seeded:\nit would be unfortunate for imported libraries to configure global state themselves,\nsince their choices might conflict with other libraries or the main package.\nBut what happens if a library needs some random data and wants to use <code>math/rand</code>?\nWhat if the main package doesn’t even know <code>math/rand</code> is being used?\nWe found that in practice many libraries add init functions\nthat seed the global generator with the current time, “just to be sure”.</p>\n<p>Library packages seeding the global generator themselves causes a new problem.\nSuppose package main imports two packages that both use <code>math/rand</code>:\npackage A assumes the global generator will be seeded by package main,\nbut package B seeds it in an <code>init</code> func.\nAnd suppose that package main doesn’t seed the generator itself.\nNow package A’s correct operation depends on the coincidence that package B is also\nimported in the program.\nIf package main stops importing package B, package A will stop getting random values.\nWe observed this happening in practice in large codebases.</p>\n<p>In retrospect, it was clearly a mistake to follow the C standard library here:\nseeding the global generator automatically would remove the confusion\nabout who seeds it, and users would stop being surprised by repeatable\noutput when they didn’t want that.</p>\n<h3 id=\"problem.scale\">Scalability</h3>\n<p>The global generator also did not scale well.\nBecause top-level functions like <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Intn\"><code>rand.Intn</code></a>\ncan be called simultaneously from multiple goroutines,\nthe implementation needed a lock protecting the shared generator state.\nIn parallel usage, acquiring and releasing this lock was more expensive\nthan the actual generation.\nIt would make sense instead to have a per-thread generator state,\nbut doing so would break repeatability\nin programs without concurrent use of <code>math/rand</code>.</p>\n<h3 id=\"problem.rand\">The <code>Rand</code> implementation was missing important optimizations</h3>\n<p>The <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Rand\"><code>rand.Rand</code> type</a> wraps a <code>Source</code>\nto implement a richer set of operations.\nFor example, here is the Go 1 implementation of <code>Int63n</code>, which returns\na random integer in the range [0, <code>n</code>).</p>\n<pre><code>func (r *Rand) Int63n(n int64) int64 {\n if n &lt;= 0 {\n panic(\"invalid argument to Int63n\")\n }\n max := int64((1&lt;&lt;63 - 1) - (1&lt;&lt;63)%uint64(n))\n v := r.src.Int63()\n for v &gt; max {\n v = r.Int63()\n }\n return v % n\n}\n</code></pre>\n<p>The actual conversion is easy: <code>v % n</code>.\nHowever, no algorithm can convert 2⁶³ equally likely values\ninto <code>n</code> equally likely values unless 2⁶³ is a multiple of <code>n</code>:\notherwise some outputs will necessarily happen more often\nthan others. (As a simpler example, try converting 4 equally likely values into 3.)\nThe code computes <code>max</code> such that\n<code>max+1</code> is the largest multiple of <code>n</code> less than or equal to 2⁶³,\nand then the loop rejects random values greater than or equal to <code>max+1</code>.\nRejecting these too-large values ensures that all <code>n</code> outputs are equally likely.\nFor small <code>n</code>, needing to reject any value at all is rare;\nrejection becomes more common and more important for larger values.\nEven without the rejection loop, the two (slow) modulus operations\ncan make the conversion more expensive than generating the random value <code>v</code>\nin the first place.</p>\n<p>In 2018, <a href=\"https://arxiv.org/abs/1805.10941\" target=\"_blank\">Daniel Lemire found an algorithm</a>\nthat avoids the divisions nearly all the time\n(see also his <a href=\"https://lemire.me/blog/2019/06/06/nearly-divisionless-random-integer-generation-on-various-systems/\" target=\"_blank\">2019 blog post</a>).\nIn <code>math/rand</code>, adopting Lemire’s algorithm would make <code>Intn(1000)</code> 20-30% faster,\nbut we can’t: the faster algorithm generates different values than the standard conversion,\nbreaking repeatability.</p>\n<p>Other methods are also slower than they could be, constrained by repeatability.\nFor example, the <code>Float64</code> method could easily be sped up by about 10%\nif we could change the generated value stream.\n(This was the change we tried to make in Go 1.2 and rolled back, mentioned earlier.)</p>\n<h3 id=\"problem.read\">The <code>Read</code> Mistake</h3>\n<p>As mentioned earlier, <code>math/rand</code> is not intended for\nand not suitable for generating cryptographic secrets.\nThe <code>crypto/rand</code> package does that, and its fundamental\nprimitive is its <a target=\"_blank\" href=\"https://go.dev/pkg/crypto/rand/#Read\"><code>Read</code> function</a>\nand <a target=\"_blank\" href=\"https://go.dev/pkg/crypto/rand/#Reader\"><code>Reader</code></a> variable.</p>\n<p>In 2015, we accepted a proposal to make\n<code>rand.Rand</code> implement <code>io.Reader</code> as well,\nalong with <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Read\">adding a top-level <code>Read</code> function</a>.\nThis seemed reasonable at the time,\nbut in retrospect we did not pay enough attention to the\nsoftware engineering aspects of this change.\nNow, if you want to read random data, you have\ntwo choices: <code>math/rand.Read</code> and <code>crypto/rand.Read</code>.\nIf the data is going to be used for key material,\nit is very important to use <code>crypto/rand</code>,\nbut now it is possible to use <code>math/rand</code> instead,\npotentially with disastrous consequences.</p>\n<p>Tools like <code>goimports</code> and <code>gopls</code> have a special case\nto make sure they prefer to use <code>rand.Read</code> from\n<code>crypto/rand</code> instead of <code>math/rand</code>, but that’s not a complete fix.\nIt would be better to remove <code>Read</code> entirely.</p>\n<h2 id=\"fix.v1\">Fixing <code>math/rand</code> directly</h2>\n<p>Making a new, incompatible major version of a package is never our first choice:\nthat new version only benefits programs that switch to it,\nleaving all existing usage of the old major version behind.\nIn contrast, fixing a problem in the existing package has much more impact,\nsince it fixes all the existing usage.\nWe should never create a <code>v2</code> without doing as much as possible to fix <code>v1</code>.\nIn the case of <code>math/rand</code>, we were able to partly address\na few of the problems described above:</p>\n<ul>\n<li>\n<p>Go 1.8 introduced an optional <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Uint64\"><code>Source64</code> interface</a> with a <code>Uint64</code> method.\nIf a <code>Source</code> also implements <code>Source64</code>, then <code>Rand</code> uses that method\nwhen appropriate.\nThis “extension interface” pattern provides a compatible (if slightly awkward)\nway to revise an interface after the fact.</p>\n</li>\n<li>\n<p>Go 1.20 automatically seeded the top-level generator and\ndeprecated <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Seed\"><code>rand.Seed</code></a>.\nAlthough this may seem like an incompatible change\ngiven our focus on repeatability of the output stream,\n<a target=\"_blank\" href=\"https://go.dev/issue/56319\">we reasoned</a> that any imported package that called <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Int\"><code>rand.Int</code></a>\nat init time or inside any computation would also\nvisibly change the output stream, and surely adding or removing\nsuch a call cannot be considered a breaking change.\nAnd if that’s true, then auto-seeding is no worse,\nand it would eliminate this source of fragility for future programs.\nWe also added a <a target=\"_blank\" href=\"https://go.dev/doc/godebug\">GODEBUG setting</a> to opt\nback into the old behavior.\nThen we marked the top-level <code>rand.Seed</code> as <a target=\"_blank\" href=\"https://go.dev/wiki/Deprecated\">deprecated</a>.\n(Programs that need seeded repeatability can still use\n<code>rand.New(rand.NewSource(seed))</code> to obtain a local generator\ninstead of using the global one.)</p>\n</li>\n<li>\n<p>Having eliminated repeatability of the global output stream,\nGo 1.20 was also able to make the global generator scale better\nin programs that don’t call <code>rand.Seed</code>,\nreplacing the Go 1 generator with a very cheap per-thread\n<a href=\"https://github.com/wangyi-fudan/wyhash\" target=\"_blank\">wyrand generator</a>\nalready used inside the Go runtime. This removed the global mutex\nand made the top-level functions scale much better.\nPrograms that do call <code>rand.Seed</code> fall back to the\nmutex-protected Go 1 generator.</p>\n</li>\n<li>\n<p>We were able to adopt Lemire’s optimization in the Go runtime,\nand we also used it inside <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Shuffle\"><code>rand.Shuffle</code></a>,\nwhich was implemented after Lemire’s paper was published.</p>\n</li>\n<li>\n<p>Although we couldn’t remove <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Read\"><code>rand.Read</code></a> entirely,\nGo 1.20 marked it <a target=\"_blank\" href=\"https://go.dev/wiki/Deprecated\">deprecated</a> in favor of\n<code>crypto/rand</code>.\nWe have since heard from people who discovered that they were accidentally\nusing <code>math/rand.Read</code> in cryptographic contexts when their editors\nflagged the use of the deprecated function.</p>\n</li>\n</ul>\n<p>These fixes are imperfect and incomplete but also real improvements\nthat helped all users of the existing <code>math/rand</code> package.\nFor more complete fixes, we needed to turn our attention to <code>math/rand/v2</code>.</p>\n<h2 id=\"fix.v2\">Fixing the rest in <code>math/rand/v2</code></h2>\n<p>Defining <code>math/rand/v2</code> took\nsignificant planning,\nthen a <a target=\"_blank\" href=\"https://go.dev/issue/60751\">GitHub Discussion</a>\nand then a <a target=\"_blank\" href=\"https://go.dev/issue/61716\">proposal discussion</a>.\nIt is the same\nas <code>math/rand</code> with the following breaking changes\naddressing the problems outlined above:</p>\n<ul>\n<li>\n<p>We removed the Go 1 generator entirely, replacing it with two new generators,\n<a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/v2/#PCG\">PCG</a> and <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/v2/#ChaCha8\">ChaCha8</a>.\nThe new types are named for their algorithms (avoiding the generic name <code>NewSource</code>)\nso that if another important algorithm needs to be added, it will fit well into the\nnaming scheme.</p>\n<p>Adopting a suggestion from the proposal discussion, the new types implement the\n<a target=\"_blank\" href=\"https://go.dev/pkg/encoding/#BinaryMarshaler\"><code>encoding.BinaryMarshaler</code></a>\nand\n<a target=\"_blank\" href=\"https://go.dev/pkg/encoding/#BinaryUnmarshaler\"><code>encoding.BinaryUnmarshaler</code></a>\ninterfaces.</p>\n</li>\n<li>\n<p>We changed the <code>Source</code> interface, replacing the <code>Int63</code> method with a <code>Uint64</code> method\nand deleting the <code>Seed</code> method. Implementations that support seeding can provide\ntheir own concrete methods, like <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/v2/#PCG.Seed\"><code>PCG.Seed</code></a> and\n<a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/v2/#ChaCha8.Seed\"><code>ChaCha8.Seed</code></a>.\nNote that the two take different seed types, and neither is a single <code>int64</code>.</p>\n</li>\n<li>\n<p>We removed the top-level <code>Seed</code> function: the global functions like <code>Int</code> can only be used\nin auto-seeded form now.</p>\n</li>\n<li>\n<p>Removing the top-level <code>Seed</code> also let us hard-code the use of scalable,\nper-thread generators by the top-level methods,\navoiding a GODEBUG check at each use.</p>\n</li>\n<li>\n<p>We implemented Lemire’s optimization for <code>Intn</code> and related functions.\nThe concrete <code>rand.Rand</code> API is now locked in to that value stream,\nso we will not be able to take advantage of any optimizations yet to be discovered,\nbut at least we are up to date once again.\nWe also implemented the <code>Float32</code> and <code>Float64</code> optimizations we wanted to use back in Go 1.2.</p>\n</li>\n<li>\n<p>During the proposal discussion, a contributor pointed out detectable bias in the\nimplementations of <code>ExpFloat64</code> and <code>NormFloat64</code>.\nWe fixed that bias and locked in the new value streams.</p>\n</li>\n<li>\n<p><code>Perm</code> and <code>Shuffle</code> used different shuffling algorithms and produced different value streams,\nbecause <code>Shuffle</code> happened second and used a faster algorithm.\nDeleting <code>Perm</code> entirely would have made migration harder for users.\nInstead we implemented <code>Perm</code> in terms of <code>Shuffle</code>, which still lets us\ndelete an implementation.</p>\n</li>\n<li>\n<p>We renamed <code>Int31</code>, <code>Int63</code>, <code>Intn</code>, <code>Int31n</code>, and <code>Int63n</code> to\n<code>Int32</code>, <code>Int64</code>, <code>IntN</code>, <code>Int32N</code>, and <code>Int64N</code>.\nThe 31 and 63 in the names were unnecessarily pedantic\nand confusing, and the capitalized N is more idiomatic for a second\n“word” in the name in Go.</p>\n</li>\n<li>\n<p>We added <code>Uint</code>, <code>Uint32</code>, <code>Uint64</code>, <code>UintN</code>, <code>Uint32N</code>, and <code>Uint64N</code>\ntop-level functions and methods.\nWe needed to add <code>Uint64</code> to provide direct access to the core <code>Source</code>\nfunctionality, and it seemed inconsistent not to add the others.</p>\n</li>\n<li>\n<p>Adopting another suggestion from the proposal discussion,\nwe added a new top-level, generic function <code>N</code> that is like\n<code>Int64N</code> or <code>Uint64N</code> but works for any integer type.\nIn the old API, to create a random duration of up to 5 seconds,\nit was necessary to write:</p>\n<pre><code>d := time.Duration(rand.Int63n(int64(5*time.Second)))\n</code></pre>\n<p>Using <code>N</code>, the equivalent code is:</p>\n<pre><code>d := rand.N(5 * time.Second)\n</code></pre>\n<p><code>N</code> is only a top-level function; there is no <code>N</code> method on <code>rand.Rand</code>\nbecause there are no generic methods in Go.\n(Generic methods are not likely in the future, either;\nthey conflict badly with interfaces, and a complete implementation\nwould require either run-time code generation or slow execution.)</p>\n</li>\n<li>\n<p>To ameliorate misuse of <code>math/rand</code> in cryptographic contexts,\nwe made <code>ChaCha8</code> the default generator used in global functions,\nand we also changed the Go runtime to use it (replacing wyrand).\nPrograms are still strongly encouraged to use <code>crypto/rand</code>\nto generate cryptographic secrets,\nbut accidentally using <code>math/rand/v2</code> is not as catastrophic\nas using <code>math/rand</code> would be.\nEven in <code>math/rand</code>, the global functions now use the <code>ChaCha8</code> generator when not explicitly seeded.</p>\n</li>\n</ul>\n<h2 id=\"principles\">Principles for evolving the Go standard library</h2>\n<p>As mentioned at the start this post, one of the goals for this work\nwas to establish principles and a pattern for how we approach all\nv2 packages in the standard library.\nThere will not be a glut of v2 packages\nin the next few Go releases.\nInstead, we will handle one package\nat a time, making sure we set a quality bar that will last for another decade.\nMany packages will not need a v2 at all.\nBut for those that do, our approach boils down to three principles.</p>\n<p>First, a new, incompatible version of a package will use\n<code>that/package/v2</code> as its import path,\nfollowing\n<a href=\"https://research.swtch.com/vgo-import\" target=\"_blank\">semantic import versioning</a>\njust like a v2 module outside the standard library would.\nThis allows uses of the original package and the v2 package\nto coexist in a single program,\nwhich is critical for a\n<a target=\"_blank\" href=\"https://go.dev/talks/2016/refactor.article\">gradual conversion</a> to the new API.</p>\n<p>Second, all changes must be rooted in\nrespect for existing usage and users:\nwe must not introduce needless churn,\nwhether in the form of unnecessary changes to an existing package or\nan entirely new package that must be learned instead.\nIn practice, that means we take the existing package\nas the starting point\nand only make changes that are well motivated\nand provide a value that justifies the cost to users of updating.</p>\n<p>Third, the v2 package must not leave v1 users behind.\nIdeally, the v2 package should be able to do everything the v1 package\ncould do,\nand when v2 is released, the v1 package should be rewritten\nto be a thin wrapper around v2.\nThis would ensure that existing uses of v1 continue to benefit\nfrom bug fixes and performance optimizations in v2.\nOf course, given that v2 is introducing breaking changes,\nthis is not always possible, but it is always something to consider carefully.\nFor <code>math/rand/v2</code>, we arranged for the auto-seeded v1 functions to\ncall the v2 generator, but we were unable to share other code\ndue to the repeatability violations.\nUltimately <code>math/rand</code> is not a lot of code and does not require\nregular maintenance, so the duplication is manageable.\nIn other contexts, more work to avoid duplication could be worthwhile.\nFor example, in the\n<a target=\"_blank\" href=\"https://go.dev/issue/63397\">encoding/json/v2 design (still in progress)</a>,\nalthough the default semantics and the API are changed,\nthe package provides configuration knobs that\nmake it possible to implement the v1 API.\nWhen we eventually ship <code>encoding/json/v2</code>,\n<code>encoding/json</code> (v1) will become a thin wrapper around it,\nensuring that users who don’t migrate from v1 still\nbenefit from optimizations and security fixes in v2.</p>\n<p>A <a target=\"_blank\" href=\"https://go.dev/blog/chacha8rand\">follow-up blog post</a> presents the <code>ChaCha8</code> generator in more detail.</p>\n </div>",
"author": "",
"favicon": "https://go.dev/images/favicon-gopher.svg",
"source": "go.dev",
"published": "",
"ttr": 668,
"type": ""
}