Closed
fengmk2
approved these changes
Mar 20, 2026
The milestone PTY tests occasionally crash with SIGSEGV on Alpine/musl CI (https://github.com/voidzero-dev/vite-task/actions/runs/23328556726/job/67854932784). This stress test runs the same PTY milestone operations 20 times both sequentially and concurrently to amplify whatever race condition or memory issue triggers the crash in the musl environment. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
Disable all other CI jobs to iterate faster on reproducing the flaky SIGSEGV in milestone tests on Alpine/musl. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
- Increase from 20 to 100 iterations per stress test
- Add high-concurrency test (8 parallel PTY sessions)
- Add CI step that runs the milestone binary 200 times in a loop

https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
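For readers who want to picture the shape of such a stress test, here is a minimal sketch using only the standard library. The helper names `stress_sequential` and `stress_concurrent` are hypothetical and not taken from this PR, and the closure stands in for the real PTY milestone operation.

```rust
use std::sync::Arc;
use std::thread;

/// Runs `op` `iterations` times on the calling thread.
fn stress_sequential(iterations: usize, op: impl Fn(usize)) {
    for i in 0..iterations {
        op(i);
    }
}

/// Runs `op` `iterations` times on each of `workers` threads at once.
/// A panic in any worker is surfaced to the caller when it is joined.
fn stress_concurrent<F>(workers: usize, iterations: usize, op: F)
where
    F: Fn(usize, usize) + Send + Sync + 'static,
{
    let op = Arc::new(op);
    let handles: Vec<_> = (0..workers)
        .map(|worker| {
            let op = Arc::clone(&op);
            thread::spawn(move || {
                for i in 0..iterations {
                    op(worker, i);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("stress worker panicked");
    }
}
```

Calling `stress_concurrent(8, 100, ...)` would correspond to the 8-session, 100-iteration configuration described above. A SIGSEGV kills the whole process rather than a single thread, so a harness like this can only amplify the crash, not catch it.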
Install a signal handler that prints /proc/self/maps on SIGSEGV to help identify whether the crash is a stack overflow or memory corruption. Uses an alternate signal stack so it works even during stack overflows. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
Add the same signal handler with stack pointer and /proc/self/maps output to the milestone test binary (which is where the crash occurs). Increase loop to 500 iterations for more reliable reproduction. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
Add SA_SIGINFO handler that extracts si_addr (fault address) and crashing RSP/RIP from ucontext_t to identify which code runs on the tiny 8KB stack. Also add single-threaded CI step for comparison. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
Walk RBP frame pointers from the crashing context to produce a stack trace, and use addr2line in CI to resolve addresses to source locations. Also print handler fn address for PIE base calculation. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
Alpine's busybox grep doesn't support -P (perl regex). Use sed instead to extract hex addresses. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
On musl libc (Alpine Linux), concurrent openpty + fork/exec operations trigger SIGSEGV/SIGBUS inside musl internals (observed crashes in sysconf and fcntl). This is a known class of musl threading issues with fork. Serialize PTY creation with a process-wide mutex, guarded by #[cfg(target_env = "musl")]. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
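As an illustration of the approach, a spawn-scoped lock could look roughly like the sketch below. Only the `SPAWN_LOCK` name and the `#[cfg(target_env = "musl")]` guard come from the commits in this PR. The `Terminal` struct, `open_pty_and_spawn`, and the `spawn` signature are simplified placeholders, not the actual code.

```rust
/// Process-wide lock, compiled only for musl targets. The name mirrors the
/// SPAWN_LOCK mentioned in the commits; everything else is hypothetical.
#[cfg(target_env = "musl")]
static SPAWN_LOCK: std::sync::Mutex<()> = std::sync::Mutex::new(());

/// Placeholder for the real terminal handle.
#[derive(Debug)]
struct Terminal {
    id: u32,
}

/// Hypothetical stand-in for the real openpty + fork/exec sequence.
fn open_pty_and_spawn(id: u32) -> Terminal {
    Terminal { id }
}

fn spawn(id: u32) -> Terminal {
    // On musl, hold the lock for the duration of PTY creation. A poisoned
    // lock is recovered, since the guarded data is just `()`.
    #[cfg(target_env = "musl")]
    let _guard = SPAWN_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner());

    open_pty_and_spawn(id)
    // `_guard`, when present, is dropped here as `spawn` returns.
}
```

On non-musl targets both the static and the guard compile away, so callers see no behavioral difference there.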
Remove SIGSEGV signal handler, stress test, and CI modifications that were used to diagnose the musl libc race condition. The actual fix (SPAWN_LOCK in Terminal::spawn) is in the previous commit. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
The previous SPAWN_LOCK only serialized the openpty+fork/exec call, but concurrent PTY I/O operations after spawn also trigger SIGSEGV/SIGBUS in musl internals. Store the MutexGuard in the Terminal struct so the lock is held for the Terminal's entire lifetime, ensuring only one PTY is active at a time on musl. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
The new _pty_guard field only exists under #[cfg(target_env = "musl")], causing compilation failures on musl when destructuring Terminal without `..` to ignore inaccessible fields. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
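A small sketch of why `..` matters here, under the assumption that the struct looks something like the one below. Only the `_pty_guard` field name and its `cfg` gate come from the commit message. The other fields and the function names are made up for illustration.

```rust
#[cfg(target_env = "musl")]
static SPAWN_LOCK: std::sync::Mutex<()> = std::sync::Mutex::new(());

struct Terminal {
    reader: String,
    writer: String,
    // Exists only on musl, so the set of fields differs per target.
    #[cfg(target_env = "musl")]
    _pty_guard: std::sync::MutexGuard<'static, ()>,
}

fn new_terminal() -> Terminal {
    Terminal {
        reader: "reader".to_owned(),
        writer: "writer".to_owned(),
        #[cfg(target_env = "musl")]
        _pty_guard: SPAWN_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner()),
    }
}

fn split(terminal: Terminal) -> (String, String) {
    // `let Terminal { reader, writer } = terminal;` compiles on glibc but
    // not on musl, where `_pty_guard` would be an unmentioned field.
    // The trailing `..` makes the pattern valid on every target.
    let Terminal { reader, writer, .. } = terminal;
    (reader, writer)
}
```

The `..` form is accepted even when no fields remain, which is why the same pattern works on targets where the guard field is compiled out.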
Runs the full musl test suite 10 times in parallel to verify the PTY serialization fix is stable. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
The previous fix held the mutex for the Terminal's entire lifetime, which serialized all PTY tests within a binary. With 8 tests having 5-second timeouts, later tests would time out waiting for the lock (4/10 CI runs failed with exit code 101). The SIGSEGV occurs in musl's sysconf/fcntl during openpty + fork/exec, not during normal FD I/O on already-open PTYs. Restrict the lock to just the spawn section so tests can run concurrently after creation. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
All 10/10 parallel musl runs passed, confirming the spawn-only lock fix is stable. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
d559b93 to e19c5ba
d42d442 to a7b0d0a
e19c5ba to 97430c6
The SPAWN_LOCK only serialized openpty+fork, but background threads from previous spawns do FD cleanup (close on writer/slave) that races with the next openpty() call on musl-internal state, causing SIGSEGV in the parent process. Extend the lock to also cover the cleanup phase in background threads. https://claude.ai/code/session_011H8UR3gS6hoyQAf2x7Dfw8
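Roughly, extending the lock to the cleanup phase means the background thread re-acquires the same mutex before closing its descriptors. The sketch below assumes a simplified `Fd` wrapper and a placeholder `close_fd`. Neither is from this PR, and the musl `cfg` gate is omitted for brevity.

```rust
use std::sync::Mutex;
use std::thread::{self, JoinHandle};

static SPAWN_LOCK: Mutex<()> = Mutex::new(());

/// Hypothetical stand-in for an owned file descriptor.
struct Fd(i32);

/// Placeholder for the real close(2) call; returns the raw value it "closed".
fn close_fd(fd: Fd) -> i32 {
    fd.0
}

fn spawn_with_locked_cleanup(raw: i32) -> JoinHandle<i32> {
    // Phase 1: creation (stands in for openpty + fork) under the lock.
    let fd = {
        let _guard = SPAWN_LOCK.lock().unwrap_or_else(|p| p.into_inner());
        Fd(raw)
    };

    // Phase 2: the background thread takes the same lock again before
    // cleanup, so a close can never overlap with another thread's creation.
    thread::spawn(move || {
        let _guard = SPAWN_LOCK.lock().unwrap_or_else(|p| p.into_inner());
        close_fd(fd)
    })
}
```

The lock is released between the two phases, so normal I/O on an open PTY is not serialized. Only creation and cleanup exclude each other.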
38c6b63 to 29bea9f
Add -C target-feature=-crt-static to RUSTFLAGS in the musl CI job so that test binaries link against musl dynamically instead of statically. This ensures fspy preload shared libraries can be injected into dynamically-linked host processes (e.g. node on Alpine). https://claude.ai/code/session_01R3RoGqPDBRtNa2NRg3SeBM
Add -C target-feature=-crt-static to the musl target rustflags in .cargo/config.toml so it applies for all musl builds (local and cross). Keep it in the CI RUSTFLAGS override as well since the env var overrides both [build] and [target] level config. https://claude.ai/code/session_01R3RoGqPDBRtNa2NRg3SeBM
Keep dynamic musl linking only in CI RUSTFLAGS, not in the shared cargo config. https://claude.ai/code/session_01R3RoGqPDBRtNa2NRg3SeBM
vite-task ships as a NAPI module in vite+, and musl Node with native modules links to musl libc dynamically, so we must match. https://claude.ai/code/session_01R3RoGqPDBRtNa2NRg3SeBM
The global -crt-static flag (for dynamic musl linking) would make fspy_test_bin dynamically linked, but it must remain static so fspy can test its seccomp-based tracing path for static executables. Pass -static to the linker via build.rs to override the global flag. https://claude.ai/code/session_01R3RoGqPDBRtNa2NRg3SeBM
The previous build.rs approach (passing -static to the linker) broke on macOS, glibc Linux, and even musl Alpine (conflicting -Bstatic/-Bdynamic). The seccomp tracer intercepts syscalls at the kernel level and works for both static and dynamic binaries, so the static_executable tests are valid either way. Replace the hard assertion with an informational check. https://claude.ai/code/session_01R3RoGqPDBRtNa2NRg3SeBM
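One possible shape for an informational check is sketched below. It is an assumption, not the check used in this PR: it reports the `crt-static` setting of the crate being compiled via `cfg!`, whereas the real test may inspect the separate test binary instead. The function names are hypothetical.

```rust
/// Reports how the current crate was linked against the C runtime,
/// based on the `crt-static` target feature seen at compile time.
fn crt_linkage() -> &'static str {
    if cfg!(target_feature = "crt-static") {
        "static"
    } else {
        "dynamic"
    }
}

/// Builds a note for the test log instead of asserting on the linkage.
fn linkage_note() -> String {
    format!("note: C runtime linkage is {}", crt_linkage())
}
```

Printing the note with `eprintln!` from the test keeps the linkage visible in CI logs without failing the run when `-C target-feature=-crt-static` is in effect.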
The test binary is an artifact dep targeting musl, and when CI builds with -crt-static the binary becomes dynamically linked — defeating the purpose of these static-binary-specific tests. https://claude.ai/code/session_01R3RoGqPDBRtNa2NRg3SeBM
ctrlc::set_handler spawns a background thread to monitor signals. The subprocess closure runs during .init_array (via ctor), and on musl, newly-created threads cannot execute during init because musl holds a lock. This causes ctrlc's monitoring thread to never run, silently swallowing SIGINT and causing send_ctrl_c_interrupts_process to hang. Replace ctrlc with signal_hook::low_level::register on Unix, which installs a raw signal handler without spawning threads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
059df0a to 568ee34
branchseer pushed a commit that referenced this pull request Mar 20, 2026
All 10/10 parallel musl runs passed, confirming stability after merging #279 changes.

Summary
Updated the Alpine CI job's RUSTFLAGS to disable static C runtime linking, ensuring compatibility with dynamic musl libc linking required for NAPI modules in vite+.
Changes
- Updated `RUSTFLAGS` in the Alpine CI workflow to include `-C target-feature=-crt-static`
- Removed `[build].rustflags` and target-specific rustflags from `.cargo/config.toml`
Details
The vite-task NAPI module needs to link against musl libc dynamically rather than statically. By disabling the `crt-static` target feature, the build now matches the dynamic linking behavior expected by Node.js native modules on Alpine Linux systems.
https://claude.ai/code/session_01R3RoGqPDBRtNa2NRg3SeBM