container: optionally map uid/gid 0 as init
All checks were successful
Test / Create distribution (push) Successful in 1m3s
Test / Sandbox (push) Successful in 2m48s
Test / Hakurei (push) Successful in 3m45s
Test / ShareFS (push) Successful in 3m55s
Test / Sandbox (race detector) (push) Successful in 5m15s
Test / Hakurei (race detector) (push) Successful in 6m31s
Test / Flake checks (push) Successful in 1m21s

Unfortunately required to work around flawed APIs like binfmt_misc.

Signed-off-by: Ophestra <cat@gensokyo.uk>
This commit is contained in:
2026-05-07 15:15:28 +09:00
parent bad66facbc
commit d4144fcf7f
4 changed files with 91 additions and 11 deletions

View File

@@ -91,6 +91,9 @@ type (
// Time to wait for processes lingering after the initial process terminates.
AdoptWaitDelay time.Duration
// Map uid/gid 0 in the init process. Requires [FstypeProc] attached to
// [fhs.Proc] in the container filesystem.
InitAsRoot bool
// Mapped Uid in user namespace.
Uid int
// Mapped Gid in user namespace.
@@ -286,6 +289,18 @@ func (p *Container) Start() error {
if !p.HostNet {
p.cmd.SysProcAttr.Cloneflags |= CLONE_NEWNET
}
if p.InitAsRoot {
p.cmd.SysProcAttr.AmbientCaps = append(p.cmd.SysProcAttr.AmbientCaps,
// mappings during init as root
CAP_SETFCAP,
)
if !p.SeccompDisable &&
len(p.SeccompRules) == 0 &&
p.SeccompPresets&std.PresetDenyNS != 0 {
return errors.New("container: as root requires late namespace creation")
}
}
// place setup pipe before user supplied extra files, this is later restored by init
if r, w, err := os.Pipe(); err != nil {