Proposal: Automatic GOMEMLIMIT based on cgroup memory limits
Overview
Change the Go runtime on Linux to use cgroup memory limits to set the default value of GOMEMLIMIT.
This proposal extends the containerization awareness introduced for GOMAXPROCS to memory management, addressing similar issues where Go applications in containers have poor defaults that lead to out-of-memory kills and suboptimal garbage collection behavior.
Background
Go Memory Management
GOMEMLIMIT sets a soft memory limit for the Go runtime's memory usage. When the runtime approaches this limit, it triggers more aggressive garbage collection to stay within the boundary. This prevents the application from consuming unbounded memory while allowing for efficient memory utilization.
The garbage collector uses GOMEMLIMIT as a signal for when to become more aggressive. Specifically:
- The GC triggers more frequently as memory usage approaches the limit
- The GC target is adjusted to maintain memory usage below the limit
- The runtime returns free memory to the operating system more aggressively as usage approaches the limit
GOMEMLIMIT does not apply to memory allocated outside the Go heap (such as C memory via cgo, memory-mapped files, or off-heap data structures), similar to how GOMAXPROCS doesn't apply to C threads.
Linux cgroups Memory Control
Linux cgroups provide memory limit controls commonly used by container runtimes. Both cgroup v1 and v2 support memory limits, though with different interfaces.
cgroup v1: memory.limit_in_bytes sets the maximum memory (including cache) that processes in the cgroup may use. When this limit is exceeded, the kernel's OOM killer terminates processes in the cgroup.
cgroup v2: memory.max serves the same purpose as v1's memory.limit_in_bytes, setting a hard limit on memory usage.
Both versions also provide:
- Swap limits: Additional controls for swap usage
- Soft limits: Advisory limits that trigger reclaim pressure but don't cause OOM kills
- Usage statistics: Current memory consumption information
Container runtimes translate high-level memory specifications into these cgroup controls. When a container exceeds its memory limit, processes are killed by the OOM killer, often resulting in cryptic "exit code 137" errors.
Container Orchestration Memory Management
Docker provides the --memory (or -m) flag, which directly sets the cgroup memory limit.
Kubernetes uses memory requests and limits:
- Memory requests: The minimum memory guaranteed to be available. Used for scheduling decisions.
- Memory limits: The maximum memory the container can use. Corresponds directly to cgroup memory limits.
Unlike CPU limits, memory limits are hard limits - exceeding them results in process termination rather than throttling.
Mesos has similar request/limit concepts that map to cgroup memory controls.
Current GOMEMLIMIT Behavior
Since Go 1.19, GOMEMLIMIT can be set via:
- The GOMEMLIMIT environment variable
- The debug.SetMemoryLimit() function
- Default when unset: effectively unlimited (math.MaxInt64)
When GOMEMLIMIT is unset, Go applications can consume all available system memory, which works well for single-tenant systems but causes problems in containerized environments.
Problems in Containerized Environments
Go applications in containers with memory limits but no configured GOMEMLIMIT face several issues:
- OOM kills: Applications can consume memory up to the container limit and get killed by the OOM killer, often appearing as mysterious crashes.
- Suboptimal GC behavior: Without a memory pressure signal, the GC runs infrequently, leading to high memory usage that approaches the container limit before collection occurs.
- Memory inefficiency: Applications may hold onto memory that could be freed if they knew they were approaching a limit.
Proposal
Core Behavior
At startup, if GOMEMLIMIT is not set in the environment, the Go runtime will:
- Determine available system memory from /proc/meminfo
- Check cgroup memory limits by reading the cgroup memory controls
- Calculate the effective memory limit:
  - For each level of the cgroup hierarchy, read the memory limit
  - Take the minimum limit in the hierarchy as the "effective" limit
  - If no cgroup limits are found, use total system memory
- Set the adjusted memory limit:
  - Use 90% of the effective limit as the GOMEMLIMIT value
  - Ensure the limit is at least 100 MiB to avoid pathological behavior
  - Apply the limit using the existing GOMEMLIMIT mechanism
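The adjustment steps above can be sketched as a pure function. Names and constants here are illustrative, not the runtime's actual implementation:

```go
package main

import "fmt"

// effectiveLimit sketches the proposed adjustment: take the minimum limit
// found across the cgroup hierarchy (falling back to total system memory
// when none is set), apply a 90% headroom factor, and clamp to a 100 MiB
// floor to avoid pathological behavior.
func effectiveLimit(cgroupLimits []int64, systemMemory int64) int64 {
	limit := systemMemory
	for _, l := range cgroupLimits {
		if l < limit {
			limit = l // minimum across the hierarchy wins
		}
	}
	adjusted := limit / 10 * 9 // 90% of the effective limit
	const floor = 100 << 20    // 100 MiB minimum
	if adjusted < floor {
		adjusted = floor
	}
	return adjusted
}

func main() {
	// Container limited to 1 GiB inside a 16 GiB parent cgroup on a 64 GiB host.
	fmt.Println(effectiveLimit([]int64{16 << 30, 1 << 30}, 64<<30)) // 966367638 (~0.9 GiB)
}
```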
Compatibility and Control
This behavior is controlled by a GODEBUG setting: cgroupmemlimit=1. It defaults to cgroupmemlimit=0 for programs built against older language versions, maintaining backward compatibility.
Automatic Updates
The runtime will periodically check for changes to cgroup memory limits and update GOMEMLIMIT accordingly. This uses the same low-frequency scanning mechanism as the GOMAXPROCS implementation.
Discussion
Is 90% a good default percentage for the headroom factor? And is a fixed minimum limit (e.g. 100 MiB) appropriate?
Comparison to Other Runtimes
.NET: https://github.com/dotnet/designs/blob/main/accepted/2019/support-for-memory-limits.md
JVM: https://developers.redhat.com/articles/2022/04/19/java-17-whats-new-openjdks-container-awareness
Interaction with GOGC
GOMEMLIMIT works alongside GOGC (garbage collection target percentage):
- GOGC controls when GC runs based on heap growth
- GOMEMLIMIT provides an absolute ceiling that overrides GOGC
- Both mechanisms work together to provide responsive memory management
Testing Strategy
The implementation requires testing across:
- Different container runtimes (Docker, containerd, CRI-O)
- Different orchestration platforms (Kubernetes, Docker Swarm, Mesos)
- Mixed cgroup v1/v2 environments
- Various memory limit configurations
- Memory pressure scenarios
Related Work
- Go 1.25 GOMAXPROCS default implementation (runtime: CPU limit-aware GOMAXPROCS default, #73193)
- Go GC Guide: Existing documentation on memory management tuning
- Container memory management: Industry practices across language runtimes
This proposal aims to provide sensible defaults for Go applications in containerized environments, reducing the operational burden of manual memory tuning while preventing common failure modes like OOM kills.
As I understand it, there are downsides: a default limit might lead to poorer performance due to more frequent GC runs. Still, this is a sensible proposal for memory-intensive applications, aligns more closely with other runtimes' industry practice, and users would remain free to tune the GC themselves via the GOGC and GOMEMLIMIT variables (or the corresponding runtime/debug calls in code).