Websocket close timeout prevents gateway hangs
Gateway restarts should now complete cleanly instead of hanging until the 25-second watchdog timeout. Websocket shutdown now has bounded timeouts, with force-termination of stalled clients after a grace period.
When the gateway server restarted on production systems, the process would hang while waiting for websocket connections to close cleanly. The shutdown handler waited on wss.close(), which never resolved unless the server tracked clients and those clients cooperated with close callbacks. With no tracked clients, or with uncooperative ones, the gateway stayed stuck until the watchdog killed the process 25 seconds later.
Websocket shutdown now has two bounded windows. The close operation gets a one-second grace period to complete normally. If it exceeds that window, all tracked clients are force-terminated. A final 250-millisecond window handles any lingering close callbacks before the shutdown proceeds regardless.
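The two windows can be sketched roughly as follows. This is a minimal illustration and not the actual createGatewayCloseHandler code: the names closeWebSocketServerBounded, waitWithTimeout, ClosableServer, ClosableClient and CloseOutcome are hypothetical, and the interfaces only mirror the close(), clients and terminate() members that the ws library exposes. The 1000 ms and 250 ms defaults come from the description above.

```typescript
// Hypothetical structural types: only the members this sketch needs.
interface ClosableClient {
  terminate(): void;
}

interface ClosableServer {
  clients: Set<ClosableClient>;
  close(cb?: (err?: Error) => void): void;
}

const CLOSE_GRACE_MS = 1_000; // grace period for a normal close
const FORCE_WINDOW_MS = 250; // final window after force-termination

type CloseOutcome = "closed" | "closed-after-terminate" | "gave-up";

// Resolves true if `p` settles within `ms`, false if the timer fires first.
function waitWithTimeout(p: Promise<void>, ms: number): Promise<boolean> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve(false), ms);
    const done = () => {
      clearTimeout(timer);
      resolve(true);
    };
    p.then(done, done);
  });
}

async function closeWebSocketServerBounded(
  wss: ClosableServer,
  graceMs: number = CLOSE_GRACE_MS,
  forceMs: number = FORCE_WINDOW_MS,
): Promise<CloseOutcome> {
  // The close callback may never fire; that is the hang being bounded here.
  const closed = new Promise<void>((resolve) => wss.close(() => resolve()));

  // Window 1: give the normal close a chance to finish.
  if (await waitWithTimeout(closed, graceMs)) return "closed";

  // Grace period exceeded: force-terminate every tracked client.
  for (const client of wss.clients) {
    try {
      client.terminate();
    } catch {
      // A client that fails to terminate must not block shutdown.
    }
  }

  // Window 2: wait briefly for lingering close callbacks, then move on regardless.
  return (await waitWithTimeout(closed, forceMs)) ? "closed-after-terminate" : "gave-up";
}
```

Under these assumptions, the worst case is bounded by graceMs + forceMs (1.25 seconds with the defaults), even when the server tracks no clients and the close callback never fires.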
This fix lives in the gateway server shutdown path, which means every gateway restart and deployment will now complete within a predictable timeframe instead of relying on the watchdog as a safety net.
View Original GitHub Description
Summary
- bound gateway websocket shutdown so restart/stop can continue even if wss.close() never resolves
- terminate tracked websocket clients after a short grace window and continue after a final force window
- add regressions for both lingering-client and zero-tracked-client hangs
Why
On jpclawhq, gateway restarts were hanging in shutdown until the 25s watchdog killed the process. Root cause was createGatewayCloseHandler waiting forever on wss.close() unless the server both tracked clients and cooperated with close callbacks.
Testing
- bunx vitest run src/gateway/server-close.test.ts --pool forks --maxWorkers 1
- pnpm build in the clean worktree hit an unrelated existing @anthropic-ai/sdk unresolved-import failure from src/agents/anthropic-transport-stream.ts
- verified the same logic live on jpclawhq: restart now completes cleanly instead of timing out in websocket shutdown
AI
- AI-assisted
- fully tested on the touched shutdown path; broader maintainer gates still pending