Redis failover reconnection added
n8n instances can now automatically recover from Redis failover events instead of getting stuck in a READONLY error loop.
When Redis fails over, it promotes a replica to primary and demotes the old primary to read-only. Any n8n instance still holding a connection to the old primary receives READONLY errors on write operations and, without intervention, stays stuck.
A new configuration option enables the Redis client to detect these errors and reconnect automatically. When enabled, the client recognizes the READONLY error signature, logs a warning, and re-establishes the connection before retrying the operation. This handles failover events in multi-AZ setups without manual intervention.
The change lives in the scaling mode package where Redis client settings are configured. The new option is controlled via the QUEUE_BULL_REDIS_RECONNECT_ON_FAILOVER environment variable and, per the original PR description, is disabled by default.
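As a rough illustration of the mechanism, the sketch below shows how a READONLY check could be wired into a Redis client through a `reconnectOnError`-style hook, which is the option ioredis exposes for this purpose (returning `2` asks the client to reconnect and resend the failed command). This is not n8n's actual code: the helper names `makeReconnectOnReadonly` and `buildRedisOptions`, the local option type, and the assumption that the environment variable is enabled with the literal string `true` are all hypothetical.

```typescript
// Hypothetical sketch; names and env parsing are assumptions, not n8n's implementation.

// Local stand-in for the return type of an ioredis-style `reconnectOnError` hook.
type ReconnectDecision = boolean | 1 | 2;

interface RedisClientOptions {
  reconnectOnError?: (err: Error) => ReconnectDecision;
}

// Builds a hook that recognizes the READONLY error signature, logs a warning,
// and returns 2 (in ioredis: reconnect, then resend the failed command).
function makeReconnectOnReadonly(
  warn: (message: string) => void,
): (err: Error) => ReconnectDecision {
  return (err: Error): ReconnectDecision => {
    if (err.message.includes('READONLY')) {
      warn('Redis returned READONLY (possible failover); reconnecting');
      return 2;
    }
    // Any other error is left to the client's normal handling.
    return false;
  };
}

// Only installs the hook when the option is explicitly enabled.
// Assumption: the variable is turned on with the string "true".
function buildRedisOptions(
  env: Record<string, string | undefined>,
  warn: (message: string) => void,
): RedisClientOptions {
  const enabled = env.QUEUE_BULL_REDIS_RECONNECT_ON_FAILOVER === 'true';
  return enabled ? { reconnectOnError: makeReconnectOnReadonly(warn) } : {};
}
```

In a real deployment the returned options would be merged into the settings passed to the Redis client, and `warn` would be the application logger. Keeping the hook absent when the variable is unset preserves the client's existing behavior.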
Description
Backport of #25038 to 1.x.
Checklist for the author (@mfsiega) to go through.
- Review the backport changes
- Fix possible conflicts
- Merge to target branch
After this PR has been merged, it will be picked up in the next patch release for release track.
Original description
Summary
If the Redis server fails over, it will start to throw READONLY errors when you try to write because you're trying to write to a readonly server. In this case, the client should reconnect.
The client does not reconnect by default, so we leave this behavior disabled by default as well, but we expose an option to enable it.
Related Linear tickets, Github issues, and Community forum posts
Review / Merge checklist
- PR title and summary are descriptive. (conventions)
- Docs updated or follow-up ticket created.
- Tests included.
- PR Labeled with release/backport (if the PR is an urgent fix that needs to be backported)