Jump to mitigations: CVE-2026-31431 (copy.fail) · CVE-2026-43284 (Dirty Frag)

Summary

AWS shipped a kernel patch for CVE-2026-31431 in their Amazon Linux base AMIs on 2026-05-05 (see the AWS ALAS advisory). Action required varies by runner type on AWS:
  • RBE runners — no customer action required. Workflows does not pin the RBE runner base AMI, so newly launched instances pick up the patched AMI automatically; existing instances roll over on the next launch cycle.
  • CI runners using Aspect’s starter images — bump your deployment to starter image 20260509-0 or newer and re-apply. AL2 and AL2023 starter images are rebuilt on the patched AWS bases; Debian and Ubuntu starter images continue to ship the algif_aead modprobe blacklist until upstream patches are confirmed. Existing CI runner instances need to be replaced (ASG / instance refresh) for the new AMI to take effect.
  • CI runners using self-managed AMIs — either rebuild on a patched AWS base (AL2 ≥ 2.0.20260508.0, AL2023 ≥ 2023.11.20260509.0) or apply the algif_aead modprobe mitigation yourself (see Mitigations), then redeploy.
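For self-managed AMIs, one way to compare a base image release against the minimum versions listed above is a version-aware string comparison. The following is a minimal sketch, not an Aspect-provided tool: the function name version_at_least is hypothetical, it assumes GNU sort with the -V (version sort) option, and how you obtain the release string of your own AMI is left to you.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when version $1 is at or above minimum $2.
# Relies on GNU coreutils `sort -V` (version sort).
version_at_least() {
  local have="$1" min="$2"
  # The lower of the two versions sorts first; if that is the minimum, we pass.
  [ "$(printf '%s\n%s\n' "$have" "$min" | sort -V | head -n 1)" = "$min" ]
}

# Minimum patched releases, taken from the list above.
AL2_MIN="2.0.20260508.0"
AL2023_MIN="2023.11.20260509.0"

# Example:
#   version_at_least "<your-al2023-release>" "$AL2023_MIN" && echo "base is patched"
```

A release string that passes this check only tells you the base is new enough; it does not replace rebuilding and redeploying the AMI.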
CVE-2026-31431 (known as “copy.fail”) is a Linux kernel privilege escalation vulnerability present in distributions built between 2017 and the patch date. A logic bug in the authencesn component allows an unprivileged local user to perform a 4-byte page-cache write using AF_ALG sockets (the kernel crypto API) and the splice() system call. CI/CD runner environments, including multi-tenant systems and cloud platforms running user-supplied code, are considered high-risk targets.

CVE-2026-43284 is one of two Linux kernel privilege escalation vulnerabilities collectively known as “Dirty Frag,” publicly disclosed on 7 May 2026. The flaw is in ESP (Encapsulating Security Payload) in-place decryption: when MSG_SPLICE_PAGES attaches pipe pages to a socket buffer without setting the SKBFL_SHARED_FRAG flag, the ESP input path decrypts data in place over fragments it does not own. A second CVE in the same disclosure, still pending assignment, affects the RxRPC protocol (used by AFS). On non-containerised systems a working exploit for local privilege escalation to root exists; container escape is also considered possible, though no proof-of-concept has been published for that path. Canonical rates this CVSS 3.1 7.8 (High).

Affected Systems

All versions of Aspect Workflows are affected on all cloud providers. The affected Aspect Workflows components are:
  • RBE Workers
  • CI Runners
  • Kubernetes Clusters
Known affected upstream distributions include Ubuntu 24.04 LTS, Amazon Linux 2023, RHEL 10.1, Debian, Arch, Fedora, Rocky, and others running unpatched kernels. CVE-2026-43284 has a wider impact range, affecting all Ubuntu LTS releases from 14.04 (Trusty) through 26.04 (Resolute Raccoon), as well as other distributions shipping Linux kernel versions 4.11 through 7.0.5.

Mitigations

What Aspect Has Done

What Customers May Need to Do

If you self-manage CI runners or pin kernel versions, apply one of the following:
Preferred: Update to a kernel containing mainline commit a664bf3d603d, which reverts the problematic 2017 optimization.
Interim: If an immediate kernel update is not possible, disable the algif_aead module and block AF_ALG socket creation:
  1. Disable the module at boot:
echo "install algif_aead /bin/true" | sudo tee /etc/modprobe.d/disable-algif.conf
  2. Unload the module from the running kernel:
sudo modprobe -r algif_aead
  3. For untrusted workloads, add a seccomp filter to your runner configuration that blocks AF_ALG socket creation.
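To confirm that the first two interim steps took effect, you could check both the modprobe configuration and the list of loaded modules. The sketch below is illustrative: the function names module_loaded and module_blocked are hypothetical, and the file locations are parameters (defaulting to /proc/modules and /etc/modprobe.d) so the checks can be tried against fixtures.

```shell
#!/usr/bin/env bash
# Hypothetical helpers to confirm the interim mitigation took effect.

# Succeeds if the named module appears in a /proc/modules-style listing,
# which has one module per line with the module name in the first field.
module_loaded() {
  local name="$1" listing="${2:-/proc/modules}"
  awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$listing"
}

# Succeeds if a *.conf file in the modprobe directory carries an
# "install <module> /bin/true" override like the one written in step 1.
module_blocked() {
  local name="$1" dir="${2:-/etc/modprobe.d}"
  grep -Eqs "^[[:space:]]*install[[:space:]]+${name}[[:space:]]+/bin/true" "$dir"/*.conf
}

# Example:
#   if module_blocked algif_aead && ! module_loaded algif_aead; then
#     echo "algif_aead is blocked and not loaded"
#   fi
```

This only verifies the module state; it does not check the seccomp filter from step 3.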

Mitigating on Aspect’s CI Runners via a lifecycle hook

Since 5.13.9, a pre-bootstrap hook runs as root before Aspect’s bootstrapping logic, making it the right place to apply kernel-level mitigations until your AMIs have been updated. Because the hook runs as root, sudo is not required. Add the following to your runner’s pre-bootstrap hook; the || true keeps errexit from aborting the script if the module is not loaded.
#!/usr/bin/env bash
set -o errexit -o errtrace -o pipefail -o nounset

# Mitigate CVE-2026-31431 (copy.fail)
echo "install algif_aead /bin/true" | tee /etc/modprobe.d/disable-algif.conf
modprobe -r algif_aead || true
Upload this file to aw-hooks-HASH/runners/pre-bootstrap. For full hook setup instructions, see Lifecycle hooks.

Mitigating GKE clusters on GCP

After the mitigation completes, all nodes reboot and the Bazel remote cache is wiped. The cache will be empty until it is repopulated by subsequent builds. To restore performance immediately, trigger a full build or run a warming job against the cluster after the script exits successfully.
For deployments using Aspect Workflows on GCP, the kernel-level mitigation must also be applied to the GKE cluster nodes that run the remote cache and CI infrastructure. Aspect provides a script that applies the upstream GoogleCloudPlatform/k8s-node-tools DaemonSet, handles the storage-node taint, waits for all nodes to reboot, and automatically recovers remote cache storage after the reboot.

Prerequisites
  • Google Cloud CLI installed
  • kubectl, installed with gcloud components install kubectl
  • GKE auth plugin, installed with gcloud components install gke-gcloud-auth-plugin
  • An account with roles/container.admin and roles/compute.viewer on the target project (run gcloud auth login to authenticate)

Script
Save the following as cve-2026-31431-mitigate.sh and make it executable (chmod +x):

Running the script
Perform a dry run first to verify the cluster state without making changes:
./cve-2026-31431-mitigate.sh --dry-run <project-id> <region>
Apply the mitigation:
./cve-2026-31431-mitigate.sh <project-id> <region>
Replace <project-id> with the GCP project ID for the deployment and <region> with the cluster region (for example, us-west1).
The script prints a STATUS: line at the start indicating whether the mitigation is already applied. If applied, it proceeds to verify RAID and storage health and exits cleanly if everything is healthy.
The node reboot phase typically takes 15–30 minutes. The script polls until all nodes are patched and all storage services have recovered before exiting. If the script is interrupted, re-running it is safe: it detects the current state and picks up where it left off.
If the script exits with ERROR: One or more nodes failed RAID recovery, contact Aspect support and share the full output.
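As an independent spot check after the script exits, you could list any nodes that are not in the Ready state. This is a sketch under assumptions, not part of Aspect's script: the function name nodes_not_ready is hypothetical, and it expects the output of kubectl get nodes --no-headers on standard input, where the first column is the node name and the second is its status.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of nodes whose STATUS column is not
# exactly "Ready". Cordoned nodes ("Ready,SchedulingDisabled") are reported too.
nodes_not_ready() {
  awk '$2 != "Ready" { print $1 }'
}

# Example (requires cluster credentials):
#   kubectl get nodes --no-headers | nodes_not_ready
```

Empty output means every listed node reports Ready; it says nothing about RAID or remote cache storage health, which the mitigation script verifies itself.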

CVE-2026-43284 (Dirty Frag) Mitigations

What Aspect Has Done

  • Patched our starter images with the interim modprobe blacklist for esp4, esp6, and rxrpc.
  • Scripted GKE cluster mitigation for GCP (see below).
  • Documented interim mitigation via a pre-bootstrap lifecycle hook (see below).
  • Tracking upstream kernel patch availability — starter images will switch to the kernel fix once patched kernel packages ship in affected distributions.

What Customers May Need to Do

If you self-manage CI runners or pin kernel versions, apply one of the following:
Preferred: Update to a kernel containing the upstream netdev fix for MSG_SPLICE_PAGES handling (see kernel.org patches linked in the References section), which marks IPv4/IPv6 datagram splice fragments with SKBFL_SHARED_FRAG and adds an skb_cow_data() fallback in the ESP input path.
Interim: If an immediate kernel update is not possible, disable the affected modules:
  1. Block the modules at boot:
printf 'install esp4 /bin/true\ninstall esp6 /bin/true\ninstall rxrpc /bin/true\n' | sudo tee /etc/modprobe.d/dirty-frag.conf
  2. Unload the modules from the running kernel:
sudo modprobe -r esp4 || true
sudo modprobe -r esp6 || true
sudo modprobe -r rxrpc || true
  3. Regenerate the initramfs so the block persists across reboots:
sudo update-initramfs -u   # Debian/Ubuntu
# or
sudo dracut -f             # RHEL/Fedora/Rocky
Disabling esp4 and esp6 blocks IPsec ESP traffic. If your environment uses IPsec-based VPNs or encrypted tunnels, test thoroughly before applying. The rxrpc module is only required by AFS clients and is safe to disable on most CI systems.
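Before disabling esp4 and esp6, it may help to check whether the host currently has any ESP security associations. The following is a rough sketch, assuming the iproute2 ip xfrm state command is available; the function name count_esp_sas is hypothetical, and it reads that command's output from standard input.

```shell
#!/usr/bin/env bash
# Hypothetical pre-check: count ESP security associations in `ip xfrm state`
# output supplied on stdin. A non-zero count suggests IPsec ESP is in use.
count_esp_sas() {
  # Each security association includes a line of the form "proto esp spi ...".
  # grep -c exits non-zero when the count is 0, so mask that with `|| true`.
  grep -c 'proto esp' || true
}

# Example (usually needs root):
#   ip xfrm state | count_esp_sas
```

A count of zero only shows that no ESP associations are established at that moment; a tunnel that is configured but currently down would not appear, so this does not replace testing.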

Mitigating on Aspect’s CI Runners via a lifecycle hook

Since CI runners are ephemeral, unloading the modules for the lifetime of each runner is sufficient — no initramfs regeneration is required. The hook below covers both CVE-2026-31431 and CVE-2026-43284; if you already deployed a hook for CVE-2026-31431, replace it with this combined version:
#!/usr/bin/env bash
set -o errexit -o errtrace -o pipefail -o nounset

# Mitigate CVE-2026-31431 (copy.fail)
echo "install algif_aead /bin/true" | tee /etc/modprobe.d/disable-algif.conf
modprobe -r algif_aead || true

# Mitigate CVE-2026-43284 (Dirty Frag)
printf 'install esp4 /bin/true\ninstall esp6 /bin/true\ninstall rxrpc /bin/true\n' | tee /etc/modprobe.d/dirty-frag.conf
modprobe -r esp4 || true
modprobe -r esp6 || true
modprobe -r rxrpc || true
Upload this file to aw-hooks-HASH/runners/pre-bootstrap. For full hook setup instructions, see Lifecycle hooks.

Mitigating GKE clusters on GCP

For deployments using Aspect Workflows on GCP, the kernel-level mitigation must also be applied to the GKE cluster nodes that run the remote cache and CI infrastructure. Aspect provides a script that applies the disable-dirty-frag DaemonSet and waits for all pods to reach Running. Unlike the CVE-2026-31431 mitigation, no node reboots occur and the Bazel remote cache is not affected.

Prerequisites
  • Google Cloud CLI installed
  • kubectl, installed with gcloud components install kubectl
  • GKE auth plugin, installed with gcloud components install gke-gcloud-auth-plugin
  • An account with roles/container.admin on the target project (run gcloud auth login to authenticate)

Script
Save the following as cve-2026-43284-mitigate.sh and make it executable (chmod +x):

Running the script
Perform a dry run first to verify the cluster state without making changes:
./cve-2026-43284-mitigate.sh --dry-run <project-id> <region>
Apply the mitigation:
./cve-2026-43284-mitigate.sh <project-id> <region>
Replace <project-id> with the GCP project ID for the deployment and <region> with the cluster region (for example, us-west1).
The script prints a STATUS: line at the start indicating whether the mitigation is already applied. Mitigation pods should reach Running in under a minute.
If the script exits with a Forbidden error on the node-labeling step, the authenticated account lacks container.nodes.update in Cloud IAM for the target project; contact the project owner to grant roles/container.admin.
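To double-check the result yourself, you could compare the DESIRED and READY counts of the disable-dirty-frag DaemonSet. This is a sketch under assumptions: the function name daemonset_ready is hypothetical, the kube-system namespace in the example is a guess that you should replace with the namespace the script actually uses, and the helper expects kubectl get daemonset --no-headers output on standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when every DaemonSet row on stdin has DESIRED
# equal to READY. The columns of `kubectl get daemonset --no-headers` begin
# NAME DESIRED CURRENT READY. Fails when no rows are supplied.
daemonset_ready() {
  awk 'NF >= 4 { rows++; if ($2 != $4) bad = 1 } END { exit (rows == 0 || bad) }'
}

# Example; the namespace is an assumption, adjust as needed:
#   kubectl get daemonset disable-dirty-frag -n kube-system --no-headers | daemonset_ready
```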

Status

CVE | Severity | Disclosed | Aspect Mitigated | Status
CVE-2026-31431 (copy.fail) | Critical: local privilege escalation, multi-tenant environments | Pending | 2026-05-09 | Mitigated in starter images 20260509-0
CVE-2026-43284 (Dirty Frag) | High: CVSS 3.1 7.8 (CISA-ADP) | 2026-05-07 | 2026-05-09 | Mitigated in starter images 20260509-0

References

CVE-2026-31431 (copy.fail)

CVE-2026-43284 (Dirty Frag)