ssh: fix proxycommand stdin deadlock when server closes first #1642

Open
techwolf359 wants to merge 1 commit into smallstep:master from techwolf359:fix/proxycommand-stdin-deadlock

@techwolf359

Fixes #1641

Summary

step ssh proxycommand hangs when the SSH server closes the connection before the client has closed stdin. The hang lasts until an OS-level timeout (~60s on macOS) kills the process.

Suspected cause

proxyDirect uses a sync.WaitGroup to wait for both goroutines to finish:

  • Goroutine 1 (stdin→server): blocked reading os.Stdin, which is a pipe from the SSH client
  • Goroutine 2 (server→client): finishes when the server closes the connection

When the server closes, goroutine 2 exits, but goroutine 1 stays stuck. os.Stdin won't close because the SSH client is waiting for the ProxyCommand to exit, and the ProxyCommand won't exit because goroutine 1 is still running. The result is a deadlock.

Calling os.Stdin.Close() from goroutine 2 does not reliably interrupt a blocked read() syscall on macOS when stdin is a pipe.

Fix

Return as soon as either goroutine completes. The ProxyCommand's job is to relay bytes; once one side closes there is nothing more to do. The process exits and the OS reclaims the blocked goroutine.

Changes

  • Extracts proxyDirectWithIO(host, port, in, out) to enable testing without depending on os.Stdin/os.Stdout
  • Replaces wg.Wait() (wait for both) with <-done (return on first)
  • Adds TestProxyDirectExitsWhenServerCloses — reproduces the deadlock and verifies the fix (fails before, passes after)

Testing

go test ./command/ssh/ -run TestProxyDirectExitsWhenServerCloses -v

proxyDirect uses two goroutines with wg.Wait() requiring both to finish.
When the server closes the connection, the server->client goroutine exits,
but the client->server goroutine blocks forever reading from os.Stdin —
stdin is a pipe from the SSH client, which won't close until the
ProxyCommand exits, which won't happen until both goroutines finish.

Calling os.Stdin.Close() from the server goroutine does not reliably
interrupt a blocked read() syscall on macOS when stdin is a pipe.

Fix by returning as soon as either goroutine finishes. When the server
closes, the process exits and the OS reclaims the blocked goroutine.
The ProxyCommand's only job is to relay bytes; once one side closes
there is nothing more to do.

Also extracts proxyDirectWithIO(host, port, in, out) to make the logic
testable without depending on os.Stdin/os.Stdout, and adds a regression
test that reproduces the deadlock and verifies the fix.

Fixes: smallstep#1641
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Development

Successfully merging this pull request may close these issues.

step ssh proxycommand hangs ~60s when server closes connection before stdin closes
