...

The Go Memory Model

Go 内存模型

Version of March 6, 2012 || Translators: Oling Cat and Ants Arks, with special thanks to Fall Ark for the help

Introduction

引言

The Go memory model specifies the conditions under which reads of a variable in one goroutine can be guaranteed to observe values produced by writes to the same variable in a different goroutine.

Go内存模型阐明了一个Go程对某变量的写入,如何才能确保被另一个读取该变量的Go程监测到。

Advice

忠告

Programs that modify data being simultaneously accessed by multiple goroutines must serialize such access.

程序在修改被多个Go程同时访问的数据时必须序列化该访问。

To serialize access, protect the data with channel operations or other synchronization primitives such as those in the sync and sync/atomic packages.

要序列化访问,需通过信道操作,或其它像 sync 和 sync/atomic 包中的同步原语来保护数据。
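
As a minimal sketch of the advice above, the following program serializes access to a shared counter with a sync.Mutex. The counter type and the countTo helper are hypothetical names introduced only for this illustration, and sync.WaitGroup is used here simply to wait for the workers to finish.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical shared value guarded by a mutex.
type counter struct {
	mu sync.Mutex
	n  int
}

// inc serializes every update through the mutex.
func (c *counter) inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// value reads n under the same lock that guards the writes.
func (c *counter) value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countTo starts the given number of goroutines, each incrementing once,
// and returns the final count after all of them have finished.
func countTo(workers int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.inc()
		}()
	}
	wg.Wait()
	return c.value()
}

func main() {
	fmt.Println(countTo(100))
}
```

Without the lock, the increments would be unsynchronized concurrent writes and the final count would not be guaranteed.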

If you must read the rest of this document to understand the behavior of your program, you are being too clever.

若您的程序行为必须通过阅读本文档才能理解,那...想必您一定十分聪明咯?

Don't be clever.

别自作聪明。

Happens Before

事件的发生次序

Within a single goroutine, reads and writes must behave as if they executed in the order specified by the program. That is, compilers and processors may reorder the reads and writes executed within a single goroutine only when the reordering does not change the behavior within that goroutine as defined by the language specification. Because of this reordering, the execution order observed by one goroutine may differ from the order perceived by another. For example, if one goroutine executes a = 1; b = 2;, another might observe the updated value of b before the updated value of a.

在单个Go程中,读取和写入的表现必须与程序指定的执行顺序相一致。换言之,仅在不会改变语言规范对Go程行为的定义时,编译器和处理器才会对读取和写入的执行重新排序。由于存在重新排序,一个Go程监测到的执行顺序可能与另一个Go程监测到的不同。例如,若一个Go程执行 a = 1; b = 2;,另一个Go程可能监测到 b 的值先于 a 更新。

To specify the requirements of reads and writes, we define happens before, a partial order on the execution of memory operations in a Go program. If event e1 happens before event e2, then we say that e2 happens after e1. Also, if e1 does not happen before e2 and does not happen after e2, then we say that e1 and e2 happen concurrently.

为了详细论述读取和写入的必要条件,我们定义了事件发生顺序,它表示Go程序中内存操作执行的偏序关系。若事件 e1 发生在 e2 之前,那么我们就说 e2 发生在 e1 之后。此外,若 e1 既未发生在 e2 之前,又未发生在 e2 之后,那么我们就说 e1 与 e2 是并发的。

Within a single goroutine, the happens-before order is the order expressed by the program.

在单一Go程中,事件发生的顺序即为程序所表达的顺序。

A read r of a variable v is allowed to observe a write w to v if both of the following hold:

若以下条件均成立,则对变量 v 的读取操作 r 允许监测到对 v 的写入操作 w:

  1. r does not happen before w.
  2. There is no other write w' to v that happens after w but before r.
  1. r 不发生在 w 之前。
  2. w 之后 r 之前,不存在其它对 v 进行的写入操作 w'

To guarantee that a read r of a variable v observes a particular write w to v, ensure that w is the only write r is allowed to observe. That is, r is guaranteed to observe w if both of the following hold:

为确保对变量 v 的读取操作 r 能够监测到特定的对 v 进行写入的操作 w,需确保 w 是唯一允许被 r 监测的写入操作。也就是说,若以下条件均成立,则 r 保证能监测到 w:

  1. w happens before r.
  2. Any other write to the shared variable v either happens before w or after r.
  1. w 发生在 r 之前。
  2. 对共享变量 v 的其它任何写入操作都只能发生在 w 之前或 r 之后。

This pair of conditions is stronger than the first pair; it requires that there are no other writes happening concurrently with w or r.

这对条件的要求比第一对更强,它需要确保没有其它写入操作与 w 或 r 并发。

Within a single goroutine, there is no concurrency, so the two definitions are equivalent: a read r observes the value written by the most recent write w to v. When multiple goroutines access a shared variable v, they must use synchronization events to establish happens-before conditions that ensure reads observe the desired writes.

在单个Go程中并不存在并发,因此这两条定义是等价的:读取操作 r 可监测到最近的写入操作 w 对 v 写入的值。当多个Go程访问共享变量 v 时,它们必须通过同步事件来建立发生顺序的条件,以此确保读取操作能监测到预期的写入。

The initialization of variable v with the zero value for v's type behaves as a write in the memory model.

以变量 v 所属类型的零值来对 v 进行初始化,其表现如同在内存模型中进行的写入操作。

Reads and writes of values larger than a single machine word behave as multiple machine-word-sized operations in an unspecified order.

对大于单个机器字的值进行读取和写入,其表现如同以不确定的顺序对多个机器字大小的值进行操作。

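Translator's note (Ants Arks):
That a does not happen before b does not mean that a happens after b; the two may be concurrent. Under the first pair of conditions, then, for two concurrent goroutines, one goroutine may or may not read the data written by the other. Under the second pair, r happens after w, there is no other write w' between w and r, and no write w'' concurrent with r, so the value r reads must be the one w wrote. The diagrams below illustrate this, where r stands for a read and w for a write, all operating on the same value.

Single-goroutine case:
-- w0 ---- r1 -- w1 ---- w2 ----  r2 ---- r3 ------>

Here the order is not just a partial order but a total order: any two reads or writes can be compared.

Two-goroutine case:
-- w0 -- r1 -- r2 ---- w3 ----  w4 ---- r5 -------->
-- w1 ----- w2 -- r3 ----  r4 ---- w5 -------->

Events within a single goroutine are ordered, but across two goroutines the situation is different. Even if r1 occurs before w2 in wall-clock time, each goroutine's execution can stretch and shrink like a rubber band, so the two have no logical order; in other words, they are concurrent.
For concurrent reads and writes, r3 may read the value of the preceding w2, or that of w3 in the upper goroutine, or even that of w4;
r5 may read the value of w4, or that of w1, w2, or w5, but never that of w3.

Two goroutines with cross synchronization:
-- r0 -- r1 ---|------ r2 ------------|-- w5 ------>
-- w1 --- w2 --|-- r3 ---- r4 -- w4 --|------->

Two synchronization points, marked |, have now been added. As a result, r3 happens after r1 and before w5.
The write preceding r2 is w2, but w4 is concurrent with r2, so the value r2 reads is undetermined: it may be that of w2 or that of w4.
The write preceding r4 is w2, and no write is concurrent with r4, so r4 reads the value of w2.

At this point the relationship between goroutines is clear. Without synchronization, all goroutines run "in parallel", concurrently with one another. In other words, without synchronization the goroutines other than main are meaningless, since main can be regarded as unrelated to them. Only once synchronization such as a lock or a channel is added do goroutines share common "nodes", with execution ordered on either side of each node; the stretches between two nodes can still expand and contract freely and remain unordered. Generalized, the synchronization of many goroutines forms a directed graph.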

Synchronization

同步

Initialization

初始化

Program initialization runs in a single goroutine, but that goroutine may create other goroutines, which run concurrently.

程序的初始化运行在单个Go程中,但该Go程可能会创建其它并发运行的Go程。

If a package p imports package q, the completion of q's init functions happens before the start of any of p's.

若包 p 导入了包 q,则 q 的 init 函数会在 p 的任何 init 函数启动前完成。

The start of the function main.main happens after all init functions have finished.

函数 main.main 会在所有的 init 函数结束后启动。
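
The ordering of initialization and main.main can be observed with a small single-package sketch. The trace slice and the record helper are hypothetical names used only for this illustration; the program relies on the language rule that package-level variables are initialized before init functions run.

```go
package main

import "fmt"

// trace records the order in which initialization steps run.
var trace []string

// level is a package-level variable; its initializer runs before any init function.
var level = record("var")

// record appends a step name to trace and returns it.
func record(step string) string {
	trace = append(trace, step)
	return step
}

func init() {
	record("init")
}

func main() {
	record("main")
	fmt.Println(trace) // prints [var init main]
}
```

Because everything here runs in the single initialization goroutine before main.main starts, no extra synchronization is needed to read trace.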

Goroutine creation

Go程的创建

The go statement that starts a new goroutine happens before the goroutine's execution begins.

启动新Go程的 go 语句,发生在该Go程开始执行之前。

For example, in this program:

例如,在此程序中,

var a string

func f() {
	print(a)
}

func hello() {
	a = "hello, world"
	go f()
}

calling hello will print "hello, world" at some point in the future (perhaps after hello has returned).

调用 hello 会在将来的某一时刻打印 "hello, world"(可能在 hello 返回之后)。
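
A checkable variant of the example, offered as a sketch: instead of printing, the new goroutine reports the value it observed on a channel. The out channel and the changed signature of hello are additions made for this illustration and are not part of the original program.

```go
package main

import "fmt"

var a string

// hello writes a, then starts a goroutine that reads it.
// The returned channel (an addition for this sketch) lets the caller
// wait for the value the goroutine observed.
func hello() <-chan string {
	out := make(chan string, 1)
	a = "hello, world"
	go func() {
		out <- a // the go statement happens before this read of a
	}()
	return out
}

func main() {
	fmt.Println(<-hello())
}
```

The write to a precedes the go statement in program order, and the go statement happens before the goroutine starts, so the goroutine is guaranteed to read "hello, world".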

Goroutine destruction

Go程的销毁

The exit of a goroutine is not guaranteed to happen before any event in the program. For example, in this program:

Go程无法确保在程序中的任何事件发生之前退出。例如,在此程序中:

var a string

func hello() {
	go func() { a = "hello" }()
	print(a)
}

the assignment to a is not followed by any synchronization event, so it is not guaranteed to be observed by any other goroutine. In fact, an aggressive compiler might delete the entire go statement.

对 a 进行赋值后并没有任何同步事件,因此它无法保证被其它任何Go程监测到。实际上,一个积极的编译器可能会删除整条 go 语句。

If the effects of a goroutine must be observed by another goroutine, use a synchronization mechanism such as a lock or channel communication to establish a relative ordering.

若一个Go程的作用必须被另一个Go程监测到,需使用锁或信道通信之类的同步机制来建立顺序关系。
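
One possible repair of the program above, sketched with an unbuffered channel: the goroutine signals after its write and the caller waits for that signal before reading. The done channel and the string return value are assumptions added for this illustration.

```go
package main

import "fmt"

var a string

// hello starts a goroutine that writes a, then waits for the goroutine
// to signal completion before reading a.
func hello() string {
	done := make(chan struct{})
	go func() {
		a = "hello"
		done <- struct{}{} // the write to a happens before this send
	}()
	<-done // the send happens before this receive completes
	return a
}

func main() {
	fmt.Println(hello())
}
```

The send and its matching receive supply the synchronization event that the original program lacked, so the read of a is ordered after the write.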

Channel communication

信道通信

Channel communication is the main method of synchronization between goroutines. Each send on a particular channel is matched to a corresponding receive from that channel, usually in a different goroutine.

信道通信是在Go程之间进行同步的主要方法。在特定信道上的每一次发送操作都有与其对应的接收操作相匹配,这通常发生在不同的Go程中。

A send on a channel happens before the corresponding receive from that channel completes.

信道上的发送操作总在对应的接收操作完成前发生。

This program:

此程序:

var c = make(chan int, 10)
var a string

func f() {
	a = "hello, world"
	c <- 0
}

func main() {
	go f()
	<-c
	print(a)
}

is guaranteed to print "hello, world". The write to a happens before the send on c, which happens before the corresponding receive on c completes, which happens before the print.

可保证打印出 "hello, world"。该程序首先对 a 进行写入, 然后在 c 上发送信号,随后从 c 接收对应的信号,最后执行 print 函数。

The closing of a channel happens before a receive that returns a zero value because the channel is closed.

信道的关闭,发生在因信道已关闭而返回零值的接收操作之前。

In the previous example, replacing c <- 0 with close(c) yields a program with the same guaranteed behavior.

在上一个例子中,用 close(c) 代替 c <- 0 仍能保证该程序产生相同的行为。
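
Because every receive on a closed channel returns, close can release any number of waiting goroutines at once. The sketch below assumes hypothetical names (config, ready, broadcast) and uses sync.WaitGroup only to collect the results.

```go
package main

import (
	"fmt"
	"sync"
)

var config string

// broadcast starts n readers, writes config once, closes ready,
// and returns what each reader observed.
func broadcast(n int) []string {
	ready := make(chan struct{})
	seen := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			<-ready          // returns once the channel is closed
			seen[i] = config // ordered after the write below
		}(i)
	}
	config = "hello, world"
	close(ready) // the close happens before each receive above returns
	wg.Wait()
	return seen
}

func main() {
	fmt.Println(broadcast(3))
}
```

A single send would wake only one receiver, whereas the close is observed by all of them.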

A receive from an unbuffered channel happens before the send on that channel completes.

从无缓冲信道进行的接收,要发生在对该信道进行的发送完成之前。

This program (as above, but with the send and receive statements swapped and using an unbuffered channel):

此程序(与上面的相同,但交换了发送和接收语句的位置,且使用无缓冲信道):

var c = make(chan int)
var a string

func f() {
	a = "hello, world"
	<-c
}

func main() {
	go f()
	c <- 0
	print(a)
}

is also guaranteed to print "hello, world". The write to a happens before the receive on c, which happens before the corresponding send on c completes, which happens before the print.

也可保证打印出 "hello, world"。该程序首先对 a 进行写入, 然后从 c 中接收信号,随后向 c 发送对应的信号,最后执行 print 函数。

If the channel were buffered (e.g., c = make(chan int, 1)) then the program would not be guaranteed to print "hello, world". (It might print the empty string, crash, or do something else.)

若该信道为带缓冲的(例如,c = make(chan int, 1)), 则该程序将无法保证打印出 "hello, world"。(它可能会打印出空字符串, 崩溃,或做些别的事情。)

The kth receive on a channel with capacity C happens before the (k+C)th send on that channel completes.

在容量为 C 的信道上进行的第 k 次接收,发生在该信道上第 k+C 次发送完成之前。

This rule generalizes the previous rule to buffered channels. It allows a counting semaphore to be modeled by a buffered channel: the number of items in the channel corresponds to the number of active uses, the capacity of the channel corresponds to the maximum number of simultaneous uses, sending an item acquires the semaphore, and receiving an item releases the semaphore. This is a common idiom for limiting concurrency.

This program starts a goroutine for every entry in the work list, but the goroutines coordinate using the limit channel to ensure that at most three are running work functions at a time.

var limit = make(chan int, 3)

func main() {
	for _, w := range work {
		go func(w func()) {
			limit <- 1
			w()
			<-limit
		}(w)
	}
	select{}
}
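
A self-contained variant of the program above, given as a sketch: it waits for the workers instead of blocking forever in select{}, and it records the peak number of functions running at once. The runLimited helper, the peak bookkeeping, and the use of sync.WaitGroup and sync.Mutex are assumptions added for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// runLimited runs every function in work with at most n running at once
// and returns the highest number of simultaneously running functions seen.
func runLimited(work []func(), n int) int {
	limit := make(chan struct{}, n)
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		running int
		peak    int
	)
	for _, w := range work {
		wg.Add(1)
		go func(w func()) {
			defer wg.Done()
			limit <- struct{}{} // acquire a slot; blocks while n are in use

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			w()

			mu.Lock()
			running--
			mu.Unlock()

			<-limit // release the slot
		}(w)
	}
	wg.Wait()
	return peak
}

func main() {
	work := make([]func(), 10)
	for i := range work {
		work[i] = func() {}
	}
	fmt.Println(runLimited(work, 3) <= 3) // prints true
}
```

The peak can never exceed the channel capacity, because a goroutine increments the counter only while it holds one of the n slots.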

Locks

The sync package implements two lock data types, sync.Mutex and sync.RWMutex.

sync 包实现了两种锁的数据类型:sync.Mutex 和 sync.RWMutex。

For any sync.Mutex or sync.RWMutex variable l and n < m, call n of l.Unlock() happens before call m of l.Lock() returns.

对于任何 sync.Mutex 或 sync.RWMutex 类型的变量 l 以及 n < m,对 l.Unlock() 的第 n 次调用在对 l.Lock() 的第 m 次调用返回前发生。

This program:

此程序:

var l sync.Mutex
var a string

func f() {
	a = "hello, world"
	l.Unlock()
}

func main() {
	l.Lock()
	go f()
	l.Lock()
	print(a)
}

is guaranteed to print "hello, world". The first call to l.Unlock() (in f) happens before the second call to l.Lock() (in main) returns, which happens before the print.

可保证打印出 "hello, world"。该程序首先(在 f 中)对 l.Unlock() 进行第一次调用,然后(在 main 中)对 l.Lock() 进行第二次调用,最后执行 print 函数。

For any call to l.RLock on a sync.RWMutex variable l, there is an n such that the l.RLock happens (returns) after call n to l.Unlock and the matching l.RUnlock happens before call n+1 to l.Lock.

对于任何对 sync.RWMutex 类型的变量 l 的 l.RLock 调用,存在一个这样的 n,使得 l.RLock 在对 l.Unlock 的第 n 次调用之后发生(返回),且与其相匹配的 l.RUnlock 在对 l.Lock 的第 n+1 次调用之前发生。
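
The reader/writer rule is easier to picture with a small read-mostly structure. The following is a sketch under stated assumptions: the settings type, its methods, and the demo function are hypothetical names, and sync.WaitGroup is used only to wait for the goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

// settings is a hypothetical read-mostly map guarded by an RWMutex.
type settings struct {
	mu sync.RWMutex
	m  map[string]string
}

// set takes the write lock, excluding all readers and other writers.
func (s *settings) set(k, v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[k] = v
}

// get takes the read lock, so several readers may proceed together.
func (s *settings) get(k string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.m[k]
	return v, ok
}

// demo runs one writer alongside several readers and returns the final value.
func demo() string {
	s := &settings{m: make(map[string]string)}
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		s.set("greeting", "hello, world")
	}()
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.get("greeting") // sees either no entry or the complete value
		}()
	}
	wg.Wait()
	v, _ := s.get("greeting")
	return v
}

func main() {
	fmt.Println(demo())
}
```

Each reader falls between two write-lock sections, so it observes the map either entirely before or entirely after the writer's update.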

Once

Once 类型

The sync package provides a safe mechanism for initialization in the presence of multiple goroutines through the use of the Once type. Multiple threads can execute once.Do(f) for a particular f, but only one will run f(), and the other calls block until f() has returned.

sync 包通过 Once 类型为存在多个Go程的初始化提供了安全的机制。 多个线程可为特定的 f 执行 once.Do(f),但只有一个会运行 f(),而其它调用会一直阻塞,直到 f() 返回。

A single call of f() from once.Do(f) happens (returns) before any call of once.Do(f) returns.

通过 once.Do(f) 对 f() 的单次调用,在任何 once.Do(f) 调用返回之前发生(返回)。

In this program:

在此程序中:

var a string
var once sync.Once

func setup() {
	a = "hello, world"
}

func doprint() {
	once.Do(setup)
	print(a)
}

func twoprint() {
	go doprint()
	go doprint()
}

calling twoprint causes "hello, world" to be printed twice. The first call to doprint runs setup once.

调用 twoprint 会打印两次 "hello, world" 。 第一次对 twoprint 的调用会运行一次 setup
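
To make the "runs setup once" claim checkable, the sketch below counts the calls to setup and returns the observed strings instead of printing them. The setups counter, the manyprint helper, and the use of sync.WaitGroup are assumptions added for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	a      string
	once   sync.Once
	setups int // written only inside setup, which Once runs a single time
)

func setup() {
	setups++
	a = "hello, world"
}

// doprint returns a instead of printing it so the result can be checked.
func doprint() string {
	once.Do(setup)
	return a
}

// manyprint calls doprint from n goroutines and collects what each one saw.
func manyprint(n int) []string {
	out := make([]string, n)
	var wg sync.WaitGroup
	for i := range out {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			out[i] = doprint()
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(manyprint(2), setups) // prints [hello, world hello, world] 1
}
```

Every call to once.Do returns only after setup has returned, so each goroutine reads the initialized value of a.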

Incorrect synchronization

错误的同步

Note that a read r may observe the value written by a write w that happens concurrently with r. Even if this occurs, it does not imply that reads happening after r will observe writes that happened before w.

请注意,读取操作 r 可能监测到与其并发的写入操作 w 写入的值。即便如此,也并不意味着发生在 r 之后的读取操作会监测到发生在 w 之前的写入操作。

In this program:

在此程序中:

var a, b int

func f() {
	a = 1
	b = 2
}

func g() {
	print(b)
	print(a)
}

func main() {
	go f()
	g()
}

it can happen that g prints 2 and then 0.

可能会发生 g 打印出 2 之后再打印出 0

This fact invalidates a few common idioms.

这个事实会使一些常见的惯用法失效。

Double-checked locking is an attempt to avoid the overhead of synchronization. For example, the twoprint program might be incorrectly written as:

双重检测锁是种避免同步开销的尝试。例如,twoprint 程序可能会错误地写成:

var a string
var done bool
var once sync.Once

func setup() {
	a = "hello, world"
	done = true
}

func doprint() {
	if !done {
		once.Do(setup)
	}
	print(a)
}

func twoprint() {
	go doprint()
	go doprint()
}

but there is no guarantee that, in doprint, observing the write to done implies observing the write to a. This version can (incorrectly) print an empty string instead of "hello, world".

但这里并不保证,在 doprint 中监测到对 done 的写入,就意味着也能监测到对 a 的写入。这个版本可能会(错误地)打印出一个空字符串而非 "hello, world"。

Another incorrect idiom is busy waiting for a value, as in:

另一种错误的惯用法是忙等待(busy waiting)某个值,就像这样:

var a string
var done bool

func setup() {
	a = "hello, world"
	done = true
}

func main() {
	go setup()
	for !done {
	}
	print(a)
}

As before, there is no guarantee that, in main, observing the write to done implies observing the write to a, so this program could print an empty string too. Worse, there is no guarantee that the write to done will ever be observed by main, since there are no synchronization events between the two threads. The loop in main is not guaranteed to finish.

和前面一样,这里不保证在 main 中对 done 的写入的监测, 蕴含对 a 的写入也进行监测,因此该程序也可能会打印出一个空字符串。 更糟的是,由于在两个线程之间没有同步事件,因此无法保证对 done 的写入总能被 main 监测到。main 中的循环不保证一定能结束。

There are subtler variants on this theme, such as this program.

这个主题有种微妙的变体,例如此程序:

type T struct {
	msg string
}

var g *T

func setup() {
	t := new(T)
	t.msg = "hello, world"
	g = t
}

func main() {
	go setup()
	for g == nil {
	}
	print(g.msg)
}

Even if main observes g != nil and exits its loop, there is no guarantee that it will observe the initialized value for g.msg.

即便 main 能够监测到 g != nil 并退出循环, 它也无法保证能监测到 g.msg 的初始化值。

In all these examples, the solution is the same: use explicit synchronization.

这里所有例子的解决方案都是相同的:使用显式的同步。