Sandbox vs. Apptainer Containers¶

Disclaimer: This is not a formal security audit or a complete vulnerability assessment. It reflects personal research and best-effort analysis of publicly available documentation, CVE databases, and source code. Claims may be incomplete or outdated. If you are making security decisions for your organization, consult your security team and verify the findings independently.

Docker requires root and is not available on shared HPC clusters. The natural comparison is therefore with Apptainer (formerly Singularity), the standard container runtime in HPC. This document puts the sandbox's security posture in perspective by comparing the two approaches head-to-head.

Design philosophy¶

Apptainer was designed for reproducibility: running the same software stack across different clusters. Its documented philosophy is "integration over isolation", meaning containers share the host PID space, network, IPC, and home directory by default. This is deliberate, since HPC workloads need access to GPUs, InfiniBand, parallel filesystems, and Slurm.

This sandbox was designed for containment: restricting what AI coding agents (Claude Code, Codex, Gemini CLI, Aider, OpenCode) can see and modify on the host. Its philosophy is isolation-first, with selective holes for what the agent needs (project directory, Slurm via chaperon proxy, agent-specific API keys via config.conf profiles).

These are opposite defaults. An Apptainer container is wide-open unless you lock it down; the sandbox is locked-down unless you open it up.

Throughout this comparison, "this sandbox" refers to the bwrap backend, which is the default and the primary supported configuration. The firejail and landlock backends are fallbacks for environments where bwrap is unavailable (e.g. AppArmor-restricted user namespaces on Ubuntu 24.04+, or older kernels without unprivileged userns); their isolation is weaker in specific dimensions, called out where relevant below.

Default isolation comparison¶

Isolation layer	This sandbox (bwrap)	Apptainer (default)	Apptainer (`--containall`)
Mount namespace	✓	✓	✓
PID namespace	✓	✗ (opt-in `--pid`)	✓
Network namespace	✗ (shared)	✗ (shared)	✗ (shared)
IPC namespace	✓ (bwrap/firejail; ✗ landlock)	✗ (opt-in `--ipc`)	✓
`/tmp` isolation	✓ (private tmpfs)	✗ (bind-mounts host `/tmp`)	✓
`/run` isolation	✓ (private tmpfs)	✗	✗
`/dev` filtering	✓ (minimal devtmpfs + targeted `DEVICES` bind, admin-blacklist enforced)	✗ (host `/dev` bind-mounted)	✓ (minimal `/dev`)
Home directory	Blank tmpfs + selective re-mount	Bind-mounts `$HOME`	Isolated (empty `$HOME`)
CWD bind mount	Project dir only	Full CWD	CWD
Host `/proc`	Isolated (unshare-pid)	Full host `/proc`	Isolated
Env var filtering	✓ (explicit names + credential patterns: SSH_, TOKEN, CI*, etc.)	✗ (inherits host environment)	Partial (`--cleanenv`)
Passwd/group filtering	✓ (system accounts + current user)	Generates container-local files, but includes user info	Same
Seccomp	✓ (bwrap: generated BPF; firejail: --seccomp.drop; landlock: custom)	✗ (not applied by default)	✗
io_uring blocked	✓ (all three backends)	✗	✗
Agent config isolation	✓ (merged instruction files, kernel-enforced read-only)	n/a	n/a
Slurm integration	Chaperon proxy (scoped sbatch/scancel/squeue, CWD validation)	Transparent (no wrapping)	Transparent

The sandbox provides stronger default containment in every category except network namespace (neither isolates the network by default, since both need it for munge/Slurm). Apptainer's --containall closes some gaps (PID, IPC, /tmp, home) but still does not filter environment variables, does not isolate /run, and does not apply seccomp.

Security track record¶

All three sandbox backends and Apptainer have public CVE histories. The differences are stark:

Tool	Total CVEs	Critical	High	Root exploits	Last CVE
Bubblewrap	4	0	3	0	2020
Firejail	18	2	13	12	2022
Landlock	0	0	0	0	n/a
Apptainer/Singularity	18+	2	~5	~4	2025

Bubblewrap (4 CVEs, none since 2020)¶

Bubblewrap has a remarkably clean record. Its ~1,500-line C codebase is small and auditable. The most commonly cited issue, CVE-2017-5226 (TIOCSTI sandbox escape, scored Critical), was actually a Linux kernel/terminal design flaw, not a bubblewrap bug. The kernel fixed it in Linux 6.2. The remaining three CVEs were a dumpable process issue (fixed in 0.1.3), a /tmp mount-point race (fixed in 0.3.3), and a setuid-mode privilege issue (fixed in 0.4.1).

Firejail (18 CVEs, 12 are local root)¶

Firejail's record is the worst of the group. 12 of its 18 CVEs are direct privilege escalation to root, all exploiting the setuid-root architecture. Notable examples:

CVE	Severity	Issue
CVE-2022-31214	High	Local root exploit via `--join` logic. Published PoC works on Debian, Arch, Fedora, openSUSE.
CVE-2020-17368	Critical (9.8)	Shell metacharacter injection via `--output`, enabling command injection
CVE-2019-12499	High	Sandboxed code can truncate the firejail binary on the host
CVE-2016-10122	High	Environment variables not cleaned (LD_PRELOAD), enabling root shell

A 2017 oss-security audit found "a lot of low hanging exploitable fruit" and concluded that the setuid-root model is fundamentally problematic. A 2022 SUSE security disclosure demonstrated a full local root exploit chain using CVE-2022-31214. This is the same class of risk as Apptainer's setuid helper, but more frequent and more severe.

Apptainer/Singularity (18+ CVEs)¶

Apptainer's CVEs are more varied in severity (more medium-rated issues around image verification and build permissions), but include serious setuid-related privilege escalation:

CVE	Severity	Issue
CVE-2023-30549	High	Setuid mode lets unprivileged users trigger kernel filesystem driver bugs (ext4 use-after-free) on user-writable image data, leading to privilege escalation
CVE-2023-38496	High	Ineffective privilege drop allows root-privileged code to run on attacker-controlled config
CVE-2020-15229	High	Path traversal in `unsquashfs`, allowing arbitrary host file overwrite
CVE-2025-65105	Moderate	Container can disable `--security apparmor:` and `--security selinux:` options

The advisory for CVE-2023-30549 notes that "many ext4 filesystem vulnerabilities similar to the one in CVE-2022-1184 continue to be found, and most of them do not ever have a CVE assigned." The setuid model systematically elevates moderate kernel bugs to exploitable privilege escalation.

Takeaway¶

The setuid-root architecture is the common thread. Both firejail and Apptainer's setuid mode have been repeatedly exploited for local root. Bubblewrap avoids this entirely by using unprivileged user namespaces, and Landlock avoids it by being a pure kernel LSM with no userspace privileged component.

When choosing a sandbox backend, this matters: bwrap has 4 CVEs and zero root exploits; firejail has 18 CVEs and 12 root exploits. Firejail provides strong isolation features (seccomp, caps dropping) but installs a setuid-root binary on every node. On systems where bwrap is available (or can be enabled via AppArmor), it is the safer choice. See the bwrap vs firejail comparison in Admin Install.

Architectural weaknesses unique to Apptainer¶

Admin restrictions are unenforceable in rootless mode. The Apptainer admin docs explicitly state that the limit container and allow container directives "are not effective if unprivileged user namespaces are enabled." On systems with unprivileged user namespaces (the default), a user can compile their own Apptainer binary with any configuration and bypass all administrative restrictions. The admin cannot enforce policy when the user controls the binary.

ECL (Execution Control List) is ineffective in rootless mode. The container signing and verification mechanism (ECL) is "only effectively applied when Apptainer is running in setuid mode." In rootless mode, users can run any container image regardless of signatures.

SIF image verification has had gaps. CVE-2020-13845 showed that ECL enforcement compared fingerprints against unsigned SIF descriptors rather than cryptographically validated signatures, a verification bypass.

The sandbox has none of these issues. It does not rely on image verification, does not have a rootless/setuid split in enforcement, and policy is enforced by kernel mechanisms (mount namespace, Landlock LSM) that the user cannot bypass without privilege.

Shared weaknesses and honest gaps¶

Neither approach provides complete isolation. Both share these weaknesses:

Gap	This sandbox	Apptainer
Network not isolated	Shared host network (all backends). Agent can exfiltrate data via HTTP, DNS, or SSH.	Shared host network by default. `--net` available but breaks Slurm/munge.
Abstract Unix sockets	Accessible since bwrap/firejail share the network namespace. `@/org/freedesktop/...` reachable.	Accessible (shared network namespace).
SSH escape	Hidden by default — `~/.ssh`, `~/.aws`, `~/.gnupg` are always-blocked regardless of `HOME_ACCESS` mode. Only reachable if a config explicitly re-exposes them.	`$HOME` bind-mounted by default, so `~/.ssh` is exposed unless `--contain` is used.
`/dev/shm` / IPC	Isolated on bwrap (`--unshare-ipc`) and firejail (`--ipc-namespace`). Shared on Landlock.	Writable and shared by default.
`memfd_create`	Not blocked (needed by CUDA, PyTorch, JAX). Docker's default seccomp profile also allows it. `userfaultfd` and `io_uring` are blocked by all three backends via seccomp.	Not blocked (no seccomp by default).
Slurm wrapping	Munge socket blocked (bwrap/firejail). Slurm binaries blocked (bwrap/firejail). Landlock: neither munge nor Slurm binaries are blocked — `AF_UNIX connect()` bypasses Landlock, so the chaperon is fully bypassable. Use bwrap or firejail for any deployment that needs a hard Slurm boundary.	No wrapping at all. Slurm fully accessible.

The sandbox has additional backend-specific gaps documented in the Known Limitations:

Landlock cannot block Unix socket connect() (D-Bus/systemd escape), has no PID namespace, no mount namespace (BLOCKED_FILES and PRIVATE_TMP ineffective), and no LDAP user enumeration filtering.
bwrap seccomp filter is generated at runtime (generate-seccomp.py) — verify it loads (no "seccomp" warnings on stderr).
Landlock leaves IPC namespace shared (/dev/shm writable by all same-UID processes). Bwrap and firejail isolate IPC by default.
All backends leave the network namespace shared. Abstract Unix sockets are covert channels between sandbox sessions.

The key difference is not that the sandbox has no gaps (it does), but that its gaps are smaller and better characterized. Apptainer's default posture exposes the entire host environment; the sandbox's default posture hides everything and selectively re-exposes what is needed. Both benefit from admin hardening for strong isolation (see Admin Hardening).

What Apptainer does better¶

Reproducible environments. Apptainer containers bundle the entire OS userland: a specific Python version, CUDA toolkit, library stack. The sandbox does not provide environment isolation; it restricts the agent within the host environment. If the goal is running a known-good software stack, Apptainer is the right tool.

Image distribution and caching. SIF images can be built once, signed, and distributed across clusters. The sandbox has no equivalent and relies on the host's installed software.

Community and ecosystem. Apptainer has broad HPC adoption, extensive documentation, and integration with registries (Docker Hub, ORAS, library://). The sandbox is purpose-built for AI coding agents.

The two approaches are complementary, not competing. An agent running inside the sandbox can submit Slurm jobs that use Apptainer containers. The sandbox controls what the agent can access on the host, while Apptainer provides the reproducible environment inside the job. The chaperon ensures that jobs submitted from inside the sandbox (on bwrap/firejail) are themselves sandboxed, regardless of whether they use Apptainer internally.

Bottom line¶

For AI agent containment on HPC, the sandbox provides stronger default isolation than Apptainer with less complexity. Apptainer's "integration over isolation" design means that a default container is barely more isolated than running directly on the host: it shares PID space, network, home directory, /tmp, and environment variables. Achieving comparable containment with Apptainer requires --containall --cleanenv --pid plus a custom seccomp profile, a configuration that most HPC users do not use and that breaks many workflows.

The sandbox achieves this containment out of the box, with HPC-specific accommodations (munge passthrough, Slurm wrapping, supplementary groups, LDAP filtering) built in. With the bwrap backend (recommended), the attack surface is minimal: no setuid helper, no image parsing, no SIF verification code, and only 4 CVEs in a decade (none since 2020). The firejail backend provides comparable isolation but carries a worse CVE record than Apptainer itself (see Security track record).

Neither tool is a complete solution. The sandbox does not isolate the network (an agent can exfiltrate data over HTTP or SSH), does not block memfd_create (needed by CUDA/PyTorch, also allowed by Docker's default seccomp), and its Slurm wrapping is a soft boundary. Apptainer shares all of these gaps and more. Both benefit from the admin hardening options described in Admin Hardening: dedicated accounts, network isolation, and audit logging close gaps that neither tool addresses alone. The trade-off between them is clear: the sandbox does not provide environment reproducibility, Apptainer does not provide agent containment, and each is the right tool for its purpose.