Skip to content

runc container escape and denial of service due to arbitrary write gadgets and procfs write redirects

High severity GitHub Reviewed Published Nov 5, 2025 in opencontainers/runc • Updated Nov 6, 2025

Package

gomod github.com/opencontainers/runc (Go)

Affected versions

<= 1.2.7
>= 1.3.0-rc.1, <= 1.3.2
>= 1.4.0-rc.1, <= 1.4.0-rc.2

Patched versions

1.2.8
1.3.3
1.4.0-rc.3
gomod github.com/opencontainers/selinux (Go)
<= 1.12.0
None

Description

Impact

This attack is primarily a more sophisticated version of CVE-2019-19921, which was a flaw which allowed an attacker to trick runc into writing the LSM process labels for a container process into a dummy tmpfs file and thus not apply the correct LSM labels to the container process. The mitigation runc applied for CVE-2019-19921 was fairly limited and effectively only caused runc to verify that when runc writes LSM labels that those labels are actual procfs files.

Rather than using a fake tmpfs file for /proc/self/attr/<label>, an attacker could instead (through various means) make /proc/self/attr/<label> reference a real procfs file, but one that would still be a no-op (such as /proc/self/sched). This would have the same effect but would clear the "is a procfs file" check. Runc is aware that this kind of attack would be possible (even going so far as to discuss this publicly as "future work" at conferences), and runc is working on a far more comprehensive mitigation of this attack, but this security issue was disclosed before runc could complete this work.

In all known versions of runc, an attacker can trick runc into misdirecting writes to /proc to other procfs files through the use of a racing container with shared mounts (runc has also verified this attack is possible to exploit using a standard Dockerfile with docker buildx build as that also permits triggering parallel execution of containers with custom shared mounts configured). This redirect could be through symbolic links in a tmpfs or theoretically other methods such as regular bind-mounts.

Note that while /proc/self/attr/<label> was the example used above (which is LSM-specific), this issue affect all writes to /proc in runc and thus also affects sysctls (written to /proc/sys/...) and some other APIs.

Additional Impacts

While investigating this issue, runc discovered that another risk with these redirected writes is that they could be redirected to dangerous files such as /proc/sysrq-trigger rather than just no-op files like /proc/self/sched. For instance, the default AppArmor profile name in Docker is docker-default, which when written to /proc/sysrq-trigger would cause the host system to crash.

When this was discovered, runc conducted an audit of other write operations within runc and found several possible areas where runc could be used as a semi-arbitrary write gadget when combined with the above race attacks. The most concerning attack scenario was the configuration of sysctls. Because the contents of the sysctl are free-form text, an attacker could use a misdirected write to write to /proc/sys/kernel/core_pattern and break out of the container (as described in CVE-2025-31133, kernel upcalls are not namespaced and so coredump helpers will run with complete root privileges on the host). Even if the attacker cannot configure custom sysctls, a valid sysctl string (when redirected to /proc/sysrq-trigger) can easily cause the machine to hang.

Note that the fact that this attack allows you to disable LSM labels makes it a very useful attack to combine with CVE-2025-31133 (as one of the only mitigations available to most users for that issue is AppArmor, and this attack would let you bypass that). However, the misdirected write issue above means that you could also achieve most of the same goals without needing to chain together attacks.

Patches

This advisory is being published as part of a set of three advisories:

The patches fixing this issue have accordingly been combined into a single patchset. The following patches from that patchset resolve the issues in this advisory:

  • db19bbed5348 ("internal/sys: add VerifyInode helper")
  • 6fc191449109 ("internal: move utils.MkdirAllInRoot to internal/pathrs")
  • ff94f9991bd3 ("*: switch to safer securejoin.Reopen")
  • 44a0fcf685db ("go.mod: update to github.com/cyphar/[email protected]")
  • 77889b56db93 ("internal: add wrappers for securejoin.Proc*")
  • fdcc9d3cad2f ("apparmor: use safe procfs API for labels")
  • ff6fe1324663 ("utils: use safe procfs for /proc/self/fd loop code")
  • b3dd1bc562ed ("utils: remove unneeded EnsureProcHandle")
  • 77d217c7c377 ("init: write sysctls using safe procfs API")
  • 435cc81be6b7 ("init: use securejoin for /proc/self/setgroups")
  • d61fd29d854b ("libct/system: use securejoin for /proc/$pid/stat")
  • 4b37cd93f86e ("libct: align param type for mountCgroupV1/V2 functions")
  • d40b3439a961 ("rootfs: switch to fd-based handling of mountpoint targets")
  • ed6b1693b8b3 ("selinux: use safe procfs API for labels")
    • Please note that this patch includes a private patch for github.com/opencontainers/selinux that could not be made public through a public pull request (as it would necessarily disclose this embargoed security issue).

      The patch includes a complete copy of the forked code and a replace directive (as well as go mod vendor applied), which should still work with downstream build systems. If you cannot apply this patch, you can safely drop it -- some of the other patches in this series should block these kinds of racing mount attacks entirely.

      See opencontainers/selinux#237 for the upstream patch.

  • 3f925525b44d ("rootfs: re-allow dangling symlinks in mount targets")
  • a41366e74080 ("openat2: improve resilience on busy systems")

runc 1.2.8, 1.3.3, and 1.4.0-rc.3 have been released and all contain fixes for these issues. As per runc's new release model, runc 1.1.x and earlier are no longer supported and thus have not been patched.

Mitigations

  • Do not run untrusted container images from unknown or unverified sources.

  • For the basic no-op attack, this attack allows a container process to run with the same LSM labels as runc. For most AppArmor deployments this means it will be unconfined, and for SELinux it will likely be container_runtime_t. Runc has not conducted in-depth testing of the impact on SELinux -- it is possible that it provides some reasonable protection but it seems likely that an attacker could cause harm to systems even with such an SELinux setup.

  • For the more involved redirect and write gadget attacks, unfortunately most LSM profiles (including the standard container-selinux profiles) provide the container runtime access to sysctl files (including /proc/sysrq-trigger) and so LSMs likely do not provide much protection against these attacks.

  • Using rootless containers provides some protection against these kinds of bugs (privileged writes in runc being redirected) -- by having runc itself be an unprivileged process, in general you would expect the impact scope of a runc bug to be less severe as it would only have the privileges afforded to the host user which spawned runc. For this particular bug, the privilege escalation caused by the inadvertent write issue is entirely mitigated with rootless containers because the unprivileged user that the runc process is executing as cannot write to the aforementioned procfs files (even intentionally).

Other Runtimes

As this vulnerability boils down to a fairly easy-to-make logic bug, runc has provided information to other OCI (crun, youki) and non-OCI (LXC) container runtimes about this vulnerability.

Based on discussions with other runtimes, it seems that crun and youki may have similar security issues and will release a co-ordinated security release along with runc. LXC appears to use the host's /proc for all procfs operations, and so is likely not vulnerable to this issue (this is a trade-off -- runc uses the container's procfs to avoid CVE-2016-9962-style attacks).

Credits

Thanks to Li Fubang (@lifubang from acmcoder.com, CIIC) and Tõnis Tiigi (@tonistiigi from Docker) for both independently discovering this vulnerability, as well as Aleksa Sarai (@cyphar from SUSE) for the original research into this class of security issues and solutions.

Additional thanks go to Tõnis Tiigi for finding some very useful exploit templates for these kinds of race attacks using docker buildx build.

References

@cyphar cyphar published to opencontainers/runc Nov 5, 2025
Published to the GitHub Advisory Database Nov 5, 2025
Reviewed Nov 5, 2025
Published by the National Vulnerability Database Nov 6, 2025
Last updated Nov 6, 2025

Severity

High

CVSS overall score

This score calculates overall vulnerability severity from 0 to 10 and is based on the Common Vulnerability Scoring System (CVSS).
/ 10

CVSS v4 base metrics

Exploitability Metrics
Attack Vector Local
Attack Complexity Low
Attack Requirements Present
Privileges Required Low
User interaction Active
Vulnerable System Impact Metrics
Confidentiality High
Integrity High
Availability High
Subsequent System Impact Metrics
Confidentiality High
Integrity High
Availability High

CVSS v4 base metrics

Exploitability Metrics
Attack Vector: This metric reflects the context by which vulnerability exploitation is possible. This metric value (and consequently the resulting severity) will be larger the more remote (logically, and physically) an attacker can be in order to exploit the vulnerable system. The assumption is that the number of potential attackers for a vulnerability that could be exploited from across a network is larger than the number of potential attackers that could exploit a vulnerability requiring physical access to a device, and therefore warrants a greater severity.
Attack Complexity: This metric captures measurable actions that must be taken by the attacker to actively evade or circumvent existing built-in security-enhancing conditions in order to obtain a working exploit. These are conditions whose primary purpose is to increase security and/or increase exploit engineering complexity. A vulnerability exploitable without a target-specific variable has a lower complexity than a vulnerability that would require non-trivial customization. This metric is meant to capture security mechanisms utilized by the vulnerable system.
Attack Requirements: This metric captures the prerequisite deployment and execution conditions or variables of the vulnerable system that enable the attack. These differ from security-enhancing techniques/technologies (ref Attack Complexity) as the primary purpose of these conditions is not to explicitly mitigate attacks, but rather, emerge naturally as a consequence of the deployment and execution of the vulnerable system.
Privileges Required: This metric describes the level of privileges an attacker must possess prior to successfully exploiting the vulnerability. The method by which the attacker obtains privileged credentials prior to the attack (e.g., free trial accounts), is outside the scope of this metric. Generally, self-service provisioned accounts do not constitute a privilege requirement if the attacker can grant themselves privileges as part of the attack.
User interaction: This metric captures the requirement for a human user, other than the attacker, to participate in the successful compromise of the vulnerable system. This metric determines whether the vulnerability can be exploited solely at the will of the attacker, or whether a separate user (or user-initiated process) must participate in some manner.
Vulnerable System Impact Metrics
Confidentiality: This metric measures the impact to the confidentiality of the information managed by the VULNERABLE SYSTEM due to a successfully exploited vulnerability. Confidentiality refers to limiting information access and disclosure to only authorized users, as well as preventing access by, or disclosure to, unauthorized ones.
Integrity: This metric measures the impact to integrity of a successfully exploited vulnerability. Integrity refers to the trustworthiness and veracity of information. Integrity of the VULNERABLE SYSTEM is impacted when an attacker makes unauthorized modification of system data. Integrity is also impacted when a system user can repudiate critical actions taken in the context of the system (e.g. due to insufficient logging).
Availability: This metric measures the impact to the availability of the VULNERABLE SYSTEM resulting from a successfully exploited vulnerability. While the Confidentiality and Integrity impact metrics apply to the loss of confidentiality or integrity of data (e.g., information, files) used by the system, this metric refers to the loss of availability of the impacted system itself, such as a networked service (e.g., web, database, email). Since availability refers to the accessibility of information resources, attacks that consume network bandwidth, processor cycles, or disk space all impact the availability of a system.
Subsequent System Impact Metrics
Confidentiality: This metric measures the impact to the confidentiality of the information managed by the SUBSEQUENT SYSTEM due to a successfully exploited vulnerability. Confidentiality refers to limiting information access and disclosure to only authorized users, as well as preventing access by, or disclosure to, unauthorized ones.
Integrity: This metric measures the impact to integrity of a successfully exploited vulnerability. Integrity refers to the trustworthiness and veracity of information. Integrity of the SUBSEQUENT SYSTEM is impacted when an attacker makes unauthorized modification of system data. Integrity is also impacted when a system user can repudiate critical actions taken in the context of the system (e.g. due to insufficient logging).
Availability: This metric measures the impact to the availability of the SUBSEQUENT SYSTEM resulting from a successfully exploited vulnerability. While the Confidentiality and Integrity impact metrics apply to the loss of confidentiality or integrity of data (e.g., information, files) used by the system, this metric refers to the loss of availability of the impacted system itself, such as a networked service (e.g., web, database, email). Since availability refers to the accessibility of information resources, attacks that consume network bandwidth, processor cycles, or disk space all impact the availability of a system.
CVSS:4.0/AV:L/AC:L/AT:P/PR:L/UI:A/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H

EPSS score

Weaknesses

UNIX Symbolic Link (Symlink) Following

The product, when opening a file or directory, does not sufficiently account for when the file is a symbolic link that resolves to a target outside of the intended control sphere. This could allow an attacker to cause the product to operate on unauthorized files. Learn more on MITRE.

Race Condition Enabling Link Following

The product checks the status of a file or directory before accessing it, which produces a race condition in which the file can be replaced with a link before the access is performed, causing the product to access the wrong file. Learn more on MITRE.

CVE ID

CVE-2025-52881

GHSA ID

GHSA-cgrx-mc8f-2prm

Source code

Credits

Loading Checking history
See something to contribute? Suggest improvements for this vulnerability.