Wrap sidecar in a Job Object/process group via process-wrap #41

Merged
nedtwigg merged 11 commits into main from robust-sidecar-cleanup
May 1, 2026

nedtwigg (Member) commented May 1, 2026

Summary

  • Replaces the Rust-side tauri-plugin-shell sidecar API with std::process::Command wrapped by process-wrap, so the Node.js sidecar runs inside a Windows Job Object (with KILL_ON_JOB_CLOSE) or a Unix process group.
  • The OS now guarantees the sidecar tree dies with the host regardless of which Tauri lifecycle events fired — fixes the orphan node.exe that was holding target\debug\node.exe open and breaking subsequent cargo builds with PermissionDenied from tauri-build.
  • Keeps tauri-plugin-shell registered as a plugin because the frontend uses open() from @tauri-apps/plugin-shell in updater.ts. Only the Rust-side sidecar spawn moves off it.

What changed

  • standalone/src-tauri/Cargo.toml — add process-wrap = { version = "9", features = ["std"] }, drop unused libc dep (was only used by the now-deleted kill_process_tree).
  • standalone/src-tauri/src/lib.rs:
    • start_sidecar now resolves the bundled node binary via a new resolve_node_binary_path() (looks alongside current_exe() for node-{TAURI_ENV_TARGET_TRIPLE}{ext} then node{ext}), spawns via process_wrap::std::CommandWrap + JobObject/ProcessGroup::leader, and reads stdout/stderr on plain std::threads using BufReader::lines() (replacing the old tauri::async_runtime::spawn + CommandEvent matcher).
    • SidecarState.child_pid: u32 becomes child: Arc<Mutex<Box<dyn ChildWrapper + Send + Sync>>>.
    • kill_process_tree (taskkill on Windows / libc::kill on Unix) replaced by kill_sidecar which calls start_kill() on the wrapped child. Same exit-time wiring as before — RunEvent::Exit and the shutdown_sidecar command both call it.
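
As an illustration of the lookup order described for resolve_node_binary_path(), here is a minimal sketch using only the standard library. The helper name `first_existing_node_binary` and its parameters are hypothetical and are not taken from the PR; the real function starts from current_exe() and reads the target triple from TAURI_ENV_TARGET_TRIPLE, while this sketch takes the directory, triple, and extension as arguments so it can be exercised in isolation.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: return the first candidate that exists in `dir`,
/// trying `node-{target_triple}{ext}` before the plain `node{ext}`.
fn first_existing_node_binary(dir: &Path, target_triple: &str, ext: &str) -> Option<PathBuf> {
    let candidates = [
        format!("node-{target_triple}{ext}"),
        format!("node{ext}"),
    ];
    candidates
        .iter()
        .map(|name| dir.join(name))
        // Only accept regular files; a directory with the same name is skipped.
        .find(|path| path.is_file())
}
```

On Windows the caller would presumably pass `".exe"` as `ext`, and an empty string elsewhere.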

Why not just upgrade Tauri?

We're already on the latest stable tauri (2.10.3) and tauri-plugin-shell (2.3.5). The relevant upstream feature request — a sidecar-lifecycle plugin that handles graceful shutdown and crash detection — is open and unimplemented (tauri-apps/plugins-workspace#3062). Job Objects are the industry-standard answer to "kill child when parent dies regardless of how parent dies" (Chromium, VS Code, etc. all use them); process-wrap is the canonical Rust crate that abstracts Job Objects + Unix process groups under one API.

Why not also remove tauri-plugin-shell entirely?

The frontend still uses open() for the changelog/issue links. Swapping that to tauri-plugin-opener is a separate small follow-up — keeping it out of this PR keeps the diff focused.

Test plan

  • cargo test -p mouseterm_lib passes (3 new tests cover find_node_binary paths, 5 existing tests unchanged).
  • pnpm dev:standalone launches and a terminal pane works (sidecar IPC functions over the new std::process::Command pipes).
  • Normal window close: no node.exe orphan remains.
  • Force-kill host (Stop-Process -Force on mouseterm.exe): sidecar node.exe disappears immediately. This was the failure mode the refactor exists to fix.
  • Manual verification on macOS / Linux (process group path) — I only have Windows locally.

🤖 Generated with Claude Code

On Windows, force-killing or force-closing the Tauri host left the
Node.js sidecar (and its node-pty grandchildren) running as orphans
that locked target/debug/node.exe and caused the next cargo build to
fail with PermissionDenied. The kill-on-exit path was gated on
RunEvent::Exit, which doesn't fire reliably across every close path
(tauri-apps/tauri#10555), and tauri-plugin-shell deliberately doesn't
manage child lifetime for you (tauri-apps/plugins-workspace#3062).

Replace the Rust-side spawn/kill with std::process::Command wrapped
by process-wrap: JobObject on Windows (KILL_ON_JOB_CLOSE), process
group on Unix. The OS now guarantees the sidecar tree dies with the
host regardless of which lifecycle events fired.

tauri-plugin-shell stays registered because the frontend still uses
its open() in updater.ts.

Verified by force-killing mouseterm.exe with Stop-Process -Force and
confirming the sidecar PID disappears immediately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cloudflare-workers-and-pages Bot commented May 1, 2026

Deploying mouseterm with Cloudflare Pages

Latest commit: 31c7e2e
Status: ✅ Deploy successful!
Preview URL: https://871169fd.mouseterm.pages.dev
Branch Preview URL: https://robust-sidecar-cleanup.mouseterm.pages.dev

nedtwigg and others added 10 commits April 30, 2026 22:41
`append_log` was called from the per-line stdout/stderr reader loops,
and each call resolved the log path from env vars, ran create_dir_all
on the parent, and reopened the file with OpenOptions. On a chatty
stderr stream that's three syscalls plus several allocations per line.

Cache the resolved PathBuf in a OnceLock and the open append-mode File
in OnceLock<Option<Mutex<File>>>; init_log keeps its own truncate-mode
open since it runs once at startup and the cache opens lazily after.
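
A rough sketch of that caching shape, assuming a hypothetical `EXAMPLE_LOG_DIR` environment variable and file name (the actual variables and path used by the app are not shown in this PR). The path is resolved once, and the append-mode handle is opened lazily on the first write; a failed open is cached as `None` so later calls do not retry.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::Write;
use std::path::PathBuf;
use std::sync::{Mutex, OnceLock};

static LOG_PATH: OnceLock<PathBuf> = OnceLock::new();
static LOG_FILE: OnceLock<Option<Mutex<File>>> = OnceLock::new();

/// Resolve the log path once. The env var and file name are assumptions.
fn log_path() -> &'static PathBuf {
    LOG_PATH.get_or_init(|| {
        let dir = std::env::var_os("EXAMPLE_LOG_DIR")
            .map(PathBuf::from)
            .unwrap_or_else(std::env::temp_dir);
        dir.join("example-sidecar.log")
    })
}

/// Append one line, reusing the cached handle after the first call.
fn append_log(line: &str) {
    let file = LOG_FILE.get_or_init(|| -> Option<Mutex<File>> {
        let path = log_path();
        if let Some(parent) = path.parent() {
            fs::create_dir_all(parent).ok()?;
        }
        let file = OpenOptions::new().create(true).append(true).open(path).ok()?;
        Some(Mutex::new(file))
    });
    if let Some(file) = file {
        if let Ok(mut guard) = file.lock() {
            // Logging is best-effort: a failed write is ignored.
            let _ = writeln!(guard, "{line}");
        }
    }
}
```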

Drive-by cleanup from the /simplify pass: drop a few narrating
comments, an over-wrapped Result, and a redundant `let mut` rebind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously a missing stdin/stdout/stderr would early-return Err while
the child kept running. On Windows the Job Object reaps it when the
handle drops, but on Unix the process group leader was leaked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Shutdown variant only broke the writer thread out of its loop; it
never sent anything to the sidecar over stdin, so the previous code
that did `tx.send(Shutdown); kill_sidecar(...)` gave the JS side no
graceful-cleanup window despite appearing to. Simplify: writer thread
now sends raw JSON lines and exits when the channel closes or the
post-kill stdin write fails. Document the kill semantics in
kill_sidecar.
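
The simplified writer loop might look roughly like the following. This is a sketch, not the code from lib.rs: `run_writer` is a hypothetical name, and it is generic over `Write` so the same loop can be driven by the child's stdin or by an in-memory buffer. It ends when every sender is dropped (channel closed) or when a write fails, which is what happens once the child has been killed and its stdin pipe is gone.

```rust
use std::io::Write;
use std::sync::mpsc::Receiver;

/// Hypothetical writer loop: forward each JSON line to `sink` and stop on
/// channel close or on the first failed write. Returns the sink for inspection.
fn run_writer<W: Write>(rx: Receiver<String>, mut sink: W) -> W {
    for line in rx {
        // One message per line; flush so the sidecar sees it promptly.
        let written = writeln!(sink, "{line}").and_then(|_| sink.flush());
        if written.is_err() {
            break;
        }
    }
    sink
}
```

With this shape there is no sentinel message at all: dropping the sender is the only shutdown signal the writer needs.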

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Old `tauri-plugin-shell` reader logged a `[sidecar] exited (code, signal)`
line from CommandEvent::Terminated. After the move to plain pipe readers
that signal was lost — the reader threads just broke silently on EOF
and any in-flight `request_from_sidecar_timeout` callers waited the
full timeout instead of failing fast.

Add a thread that polls try_wait on the shared child every 250ms; on
exit, log the status and clear the pending-requests map so blocked
callers wake immediately (their channels see Disconnected and surface
the existing timeout error).
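
A minimal sketch of such a reaper, under some assumptions: `Pending` and `spawn_reaper` are hypothetical names, and the exit check is passed in as a closure rather than calling try_wait on the wrapped child directly, so the sketch stays independent of process-wrap. In the real code the closure would presumably lock the shared child and call try_wait on it.

```rust
use std::collections::HashMap;
use std::sync::mpsc::Sender;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical alias: request id mapped to the channel its caller waits on.
type Pending = Arc<Mutex<HashMap<u64, Sender<String>>>>;

/// Poll `poll_exit` every 250ms; once it reports an exit status, clear the
/// pending map and return the status through the join handle.
fn spawn_reaper<S, P>(mut poll_exit: P, pending: Pending) -> JoinHandle<S>
where
    S: Send + 'static,
    P: FnMut() -> Option<S> + Send + 'static,
{
    thread::spawn(move || loop {
        if let Some(status) = poll_exit() {
            // Dropping every Sender disconnects the matching Receivers,
            // so blocked callers wake up instead of waiting out the timeout.
            pending.lock().unwrap().clear();
            break status;
        }
        thread::sleep(Duration::from_millis(250));
    })
}
```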

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When the reaper drops pending senders on sidecar exit, callers got
"timed out waiting for X" — misleading. Surface disconnect explicitly.
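
The distinction can be sketched with `recv_timeout`, which reports the two cases as separate error variants. `wait_for_reply` and the wording of the disconnect message are assumptions for illustration; only the "timed out waiting for X" text comes from the commit message above.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical helper: wait for a reply and report a dropped sender
/// separately from a genuine timeout.
fn wait_for_reply(rx: &Receiver<String>, what: &str, timeout: Duration) -> Result<String, String> {
    match rx.recv_timeout(timeout) {
        Ok(reply) => Ok(reply),
        // No reply arrived within the deadline.
        Err(RecvTimeoutError::Timeout) => Err(format!("timed out waiting for {what}")),
        // The sender was dropped, e.g. because the pending map was cleared.
        Err(RecvTimeoutError::Disconnected) => {
            Err(format!("sidecar exited while waiting for {what}"))
        }
    }
}
```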

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After dropping the SidecarMsg::Shutdown sentinel, the command no longer
performs a graceful shutdown — it just kills. Name it accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The about-menu construction is only consumed under cfg(target_os =
"macos"). Move pkg, about, and the AboutMetadata import behind the
same cfg so non-macOS builds compile clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nedtwigg nedtwigg merged commit 4cf927c into main May 1, 2026
3 checks passed
@nedtwigg nedtwigg deleted the robust-sidecar-cleanup branch May 1, 2026 21:11