Reject CLONE_NEW* namespace flags in legacy clone(2)#46

Merged
jserv merged 1 commit into sysprog21:main from Max042004:fix-clone-namespace-flags on May 26, 2026

Max042004 (Collaborator) commented May 26, 2026

sys_clone forwarded raw flags to the posix_spawn-based fork path without inspecting CLONE_NEW* bits, so clone(CLONE_NEWPID, ...) silently created an unisolated child and returned a PID as if the namespace had been set up. clone3 already rejected the same flags with EINVAL, so the observable result depended on which entry point the caller used.

Mirror the clone3 policy in sys_clone: reject any namespace flag with EINVAL. The exit signal occupies the CSIGNAL low byte in clone(2), so mask it off before testing. CLONE_NEWTIME (0x80) lives in that byte and, like CLONE_INTO_CGROUP (bit 33) and set_tid, cannot be conveyed through clone(2), so only the higher namespace bits are reachable. Move the namespace flag macros to the top of the file so both entry points share them.
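The check described above can be sketched roughly as follows. This is an illustration, not the code from the PR: the helper names `clone_check_ns_flags` and `clone3_check_ns_flags` and the `CLONE_NS_MASK` macro are hypothetical, and the flag values are copied from the Linux UAPI headers so the snippet compiles on its own.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as defined by the Linux UAPI (linux/sched.h). */
#define CSIGNAL         0x000000ff /* exit-signal byte in clone(2) flags */
#define CLONE_NEWTIME   0x00000080 /* overlaps CSIGNAL */
#define CLONE_NEWNS     0x00020000
#define CLONE_NEWCGROUP 0x02000000
#define CLONE_NEWUTS    0x04000000
#define CLONE_NEWIPC    0x08000000
#define CLONE_NEWUSER   0x10000000
#define CLONE_NEWPID    0x20000000
#define CLONE_NEWNET    0x40000000

/* Hypothetical shared mask: every namespace flag, for both entry points. */
#define CLONE_NS_MASK                                             \
    (CLONE_NEWTIME | CLONE_NEWNS | CLONE_NEWCGROUP | CLONE_NEWUTS | \
     CLONE_NEWIPC | CLONE_NEWUSER | CLONE_NEWPID | CLONE_NEWNET)

/* clone3: the exit signal travels in its own field, so test flags as-is. */
int clone3_check_ns_flags(uint64_t flags)
{
    return (flags & CLONE_NS_MASK) ? -EINVAL : 0;
}

/* clone(2): the low byte is the exit signal, so strip it before testing.
 * After masking, bit 0x80 can never be seen as CLONE_NEWTIME. */
int clone_check_ns_flags(uint64_t raw_flags)
{
    uint64_t flags = raw_flags & ~(uint64_t) CSIGNAL;
    return (flags & CLONE_NS_MASK) ? -EINVAL : 0;
}
```

With this shape, `clone_check_ns_flags(CLONE_NEWPID | 17)` is rejected, while a plain `17` (SIGCHLD as the exit signal) or `0x80` in the low byte passes, because that byte is interpreted as a signal number rather than as flags.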

Add a regression test asserting clone(2) rejects each reachable CLONE_NEW* flag, matching the existing clone3 coverage.
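A table-driven test along these lines might look like the sketch below. It is not the test added in the PR: `count_not_rejected`, `raw_clone`, and the two stand-in callbacks are hypothetical names, and the stand-ins exist only so the loop can be exercised without the emulator. The flag values are taken from the Linux UAPI headers.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Namespace flags reachable through clone(2). CLONE_NEWTIME (0x80) is
 * left out because it overlaps the CSIGNAL exit-signal byte. */
static const struct {
    const char *name;
    unsigned long flag;
} ns_flags[] = {
    {"CLONE_NEWNS", 0x00020000UL},   {"CLONE_NEWCGROUP", 0x02000000UL},
    {"CLONE_NEWUTS", 0x04000000UL},  {"CLONE_NEWIPC", 0x08000000UL},
    {"CLONE_NEWUSER", 0x10000000UL}, {"CLONE_NEWPID", 0x20000000UL},
    {"CLONE_NEWNET", 0x40000000UL},
};

/* Returns a PID (0 in the child) on success, -1 with errno set on failure. */
typedef long (*clone_fn)(unsigned long flags);

/* Real entry point: raw clone(2) with SIGCHLD as the exit signal. */
long raw_clone(unsigned long flags)
{
    return syscall(SYS_clone, flags | SIGCHLD, 0L, 0L, 0L, 0L);
}

/* Stand-in for the old behaviour: flags ignored, a PID is returned. */
long buggy_clone(unsigned long flags)
{
    (void) flags;
    return 1234;
}

/* Stand-in for the fixed behaviour: every namespace flag gets EINVAL. */
long fixed_clone(unsigned long flags)
{
    (void) flags;
    errno = EINVAL;
    return -1;
}

/* Counts the flags that were NOT rejected with EINVAL. A fuller test
 * would also reap any child that was created by mistake. */
int count_not_rejected(clone_fn do_clone)
{
    int failures = 0;
    for (size_t i = 0; i < sizeof(ns_flags) / sizeof(ns_flags[0]); i++) {
        errno = 0;
        long ret = do_clone(ns_flags[i].flag);
        if (ret == 0)
            _exit(0); /* unexpectedly running as the child: leave quietly */
        if (ret != -1 || errno != EINVAL)
            failures++;
    }
    return failures;
}
```

Inside the guest, the check would be `count_not_rejected(raw_clone) == 0`. On a native Linux kernel the same call is expected to behave differently (for example, failing with EPERM for an unprivileged caller), so the assertion is only meaningful against an implementation that follows the reject-with-EINVAL policy.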

Closes #44


Summary by cubic

Reject CLONE_NEW* namespace flags in legacy clone(2) to match clone3 behavior. This fixes a bug where clone() would silently fork without isolation and pretend the namespace was created.

  • Bug Fixes
    • In sys_clone, mask off the CSIGNAL byte and return EINVAL when any CLONE_NEW* bit is set (namespaces aren’t implemented).
    • Share namespace flag macros across sys_clone and sys_clone3, and add a regression test that asserts clone(2) rejects all reachable flags (excluding CLONE_NEWTIME in CSIGNAL).

Written for commit 64ea4e3.

@jserv jserv merged commit b065c43 into sysprog21:main May 26, 2026
5 checks passed

jserv (Contributor) commented May 26, 2026

Thanks, @Max042004, for contributing!

Linked issue: clone(2) silently ignores CLONE_NEW* namespace flags while clone3(2) rejects them with EINVAL