Go GC: Go 1.5 Will Solve the Latency Problem
Richard L. Hudson (Rick) is best known for his work in memory management including the invention of the Train, Sapphire, and Mississippi Delta algorithms as well as GC stack maps which enabled garbage collection in statically typed languages like Java, C#, and Go. He has published papers on language runtimes, memory management, concurrency, synchronization, memory models and transactional memory. Rick is a member of Google’s Go team where he is working on Go’s GC and runtime issues.
In economics, there is this concept of a virtuous cycle – a positive feedback loop between different processes that feed into one another. Traditionally in tech, there has been a virtuous cycle between software and hardware development. CPU hardware improves, which enables faster software to be written, which in turn drives further improvements in CPU speed and compute power. This cycle was healthy until about 2004, which is about when Moore’s Law started to end.
These days, 2x transistors != 2x faster programs. More transistors == more cores, but software has not evolved to be able to fully utilize more cores. Because software today is not able to adequately put multiple cores to work, the hardware guys are not going to keep putting more cores in. The cycle is sputtering.
A long term goal of Go is to reboot this virtuous cycle by enabling more concurrent, parallel programs. In the shorter term, we need to increase Go adoption. One of the biggest complaints with the Go runtime right now is that GC pauses are too long.
When the team first took on this problem, Rick jokes, their initial reaction as engineers was not to actually solve it, but to look for workarounds like:
Adding an eye tracker to the computer and running GC only when no one is looking
Popping up a network-wait icon during GC and blaming the pause on network latency or something else
But Russ Cox shot these ideas down for some reason, so they decided to roll up their sleeves and actually try to improve the Go GC. The algorithm they developed trades program execution throughput for reduced GC latency. Go programs will get a little bit slower in exchange for ensuring lower GC latencies.
How can we make latency tangible?
Nanosecond: Grace Hopper analogized time to distance. A nanosecond is 11.8 inches.
Microsecond: 5.4 microseconds is the time it takes light to travel 1 mile in a vacuum.
Milliseconds:
1 ms: read 1 MB sequentially from an SSD
20 ms: read 1 MB from a spinning disk
50 ms: perceptual causality (the eye/cursor response threshold)
50+ ms: various network delays
300 ms: an eye blink
So how much GC can we do in a millisecond?
Java GC vs. Go GC
Go:
thousands of goroutines
synchronization via channels
runtime written in Go, leveraging Go the same way users do
control of spatial locality (structs can be embedded in one another; interior pointers such as &foo.field are allowed)
Java:
tens of Java threads
synchronization via objects/locks
runtime written in C
objects linked with pointers
The biggest difference is the issue of spatial locality. In Java, everything is a pointer, whereas Go enables you to embed structs within one another. Following pointers many layers deep causes a lot of issues for a garbage collector.
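The talk does not include code, but a minimal sketch (with illustrative type names, nothing from the original slides) shows how embedding and interior pointers keep related data in one contiguous allocation:

```go
package main

import "fmt"

// Point is embedded by value inside Circle, so a Circle is a single
// contiguous object rather than a header plus a pointer to a separately
// allocated Point, as it would be in Java.
type Point struct {
	X, Y float64
}

type Circle struct {
	Center Point // embedded by value, not referenced through a pointer
	Radius float64
}

func main() {
	c := Circle{Center: Point{X: 1, Y: 2}, Radius: 3}

	// An interior pointer: p points into the middle of the Circle value.
	// The collector has to be able to find the enclosing object from p.
	p := &c.Center
	p.X = 10

	fmt.Println(c.Center.X, c.Radius) // 10 3
}
```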
GC basics
Here’s a quick primer on garbage collectors. They typically involve 2 phases:
Scan phase: Determine which things in the heap are reachable. This involves starting from the pointers in stacks, registers, and global variables, and following pointers into the heap.
Mark phase: Walk the pointer graph, marking objects as reachable as you go. From the GC’s point of view, it is simplest to stop the world so that pointers are not changing while the mark phase is happening. Truly concurrent GC is difficult because pointers are continually changing; the program uses something called a write barrier to tell the GC about pointer writes so that it does not collect an object that is still reachable. In practice, write barriers can be more expensive than stop-the-world pauses.
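As a rough mental model only (this is not the Go runtime’s actual representation or algorithm), a stop-the-world mark phase over a toy object graph might look like this:

```go
package main

import "fmt"

// object is a toy heap node: it may reference other nodes.
type object struct {
	name   string
	refs   []*object
	marked bool
}

// mark walks the pointer graph from a set of roots (think: pointers found
// in stacks, registers, and globals) and flags every reachable object.
// Nothing else mutates the graph while this runs, i.e. the world is stopped.
func mark(roots []*object) {
	work := append([]*object(nil), roots...)
	for len(work) > 0 {
		obj := work[len(work)-1]
		work = work[:len(work)-1]
		if obj == nil || obj.marked {
			continue
		}
		obj.marked = true
		work = append(work, obj.refs...)
	}
}

func main() {
	leaf := &object{name: "leaf"}
	root := &object{name: "root", refs: []*object{leaf}}
	garbage := &object{name: "garbage"}

	mark([]*object{root})

	// Unmarked objects (here, "garbage") are what a sweep would reclaim.
	fmt.Println(root.marked, leaf.marked, garbage.marked) // true true false
}
```

Once the program is allowed to keep running while marking happens, every pointer write has to be intercepted by a write barrier so the collector never misses a newly stored reference; that interception is where the throughput cost of concurrent GC comes from.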
Go GC
The Go GC algorithm uses a combination of write barriers and short stop-the-world pauses. Here are its phases:
[figure: phases of the Go GC algorithm]
Here’s what the GC algorithm looked like in Go 1.4:
[figure: GC timeline in Go 1.4]
Here it is in Go 1.5:
[figure: GC timeline in Go 1.5]
Note the shorter stop-the-world pauses. During concurrent GC, the GC uses 25% CPU.
Here are the benchmarks:
[figure: GC pause time benchmarks]
In previous versions of Go, GC pauses were in general much longer, and they grew as the heap size grew. In Go 1.5, GC pauses are more than an order of magnitude shorter.
Zooming in, there is still a slight positive correlation between heap size and GC pauses. But they know what the issue is and it will be fixed in Go 1.6.
[figure: GC pause time vs. heap size, zoomed in]
There is a slight throughput penalty with the new GC algorithm, and that penalty shrinks as the heap size grows:
[figure: throughput penalty vs. heap size]
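The benchmark figures above are from the talk, but if you want to see the pauses your own program experiences, one approach (not shown in the talk) is to read the runtime’s memory statistics, or simply run with GODEBUG=gctrace=1:

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Generate some garbage so at least one collection has happened.
	for i := 0; i < 100; i++ {
		_ = make([]byte, 1<<20)
	}
	runtime.GC() // force a collection so the stats below are populated

	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	// PauseNs is a circular buffer of recent stop-the-world pause durations;
	// the most recent one is at index (NumGC+255)%256.
	last := ms.PauseNs[(ms.NumGC+255)%256]
	fmt.Printf("collections: %d  last pause: %dns  total pause: %dns\n",
		ms.NumGC, last, ms.PauseTotalNs)
}
```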
Moving forward
Tell people that GC pauses are no longer an issue, thanks to Go’s low-latency GC. Moving forward, the team plans to tune for even lower latency, higher throughput, and more predictability, looking for the sweet spot among those tradeoffs. Development work for Go 1.6 will be driven by use cases and feedback, so let the team know about yours.
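The talk does not cover user-facing knobs, but the existing lever for trading heap size against collection frequency is the GOGC environment variable, also adjustable at runtime through runtime/debug.SetGCPercent (shown here as a general Go facility, not something specific to the 1.5 collector):

```go
package main

import (
	"fmt"
	"runtime/debug"
)

func main() {
	// GOGC (default 100) sets how much the heap may grow, relative to the
	// live data after the previous collection, before the next collection
	// starts. Raising it means fewer collections but a larger heap.
	old := debug.SetGCPercent(200)
	fmt.Println("previous GC percent:", old)
}
```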
The new low latency GC will make Go an even more viable replacement for manual-memory-management languages like C.
Q & A
Q: Any plans for heap compaction?
A: Our approach has been to adopt the techniques that have served the C language community well, which is to avoid fragmentation to begin with by storing objects of the same size in the same memory span.
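To make that answer concrete, here is a toy sketch of size-class, span-based allocation (purely illustrative; it is not the runtime’s allocator): objects of one size share a span, so a freed slot can always be reused by the next allocation of that size instead of leaving odd-sized holes.

```go
package main

import "fmt"

// span holds objects of a single size class. Because every slot is the
// same size, freed slots are interchangeable and fragmentation is contained.
type span struct {
	slotSize int
	free     []int  // indices of currently free slots
	data     []byte // backing storage for all slots
}

func newSpan(slotSize, slots int) *span {
	s := &span{slotSize: slotSize, data: make([]byte, slotSize*slots)}
	for i := 0; i < slots; i++ {
		s.free = append(s.free, i)
	}
	return s
}

// allocator keeps one span per size class. A real allocator would round
// sizes up to a small set of classes and add spans as they fill; this toy
// version assumes exact sizes and never fills a span.
type allocator struct {
	spans map[int]*span
}

func (a *allocator) alloc(size int) []byte {
	s, ok := a.spans[size]
	if !ok {
		s = newSpan(size, 64)
		a.spans[size] = s
	}
	idx := s.free[len(s.free)-1]
	s.free = s.free[:len(s.free)-1]
	return s.data[idx*size : (idx+1)*size]
}

func main() {
	a := &allocator{spans: map[int]*span{}}
	b1 := a.alloc(32)
	b2 := a.alloc(32)
	fmt.Println(len(b1), len(b2)) // 32 32, both carved from the same span
}
```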