Queue limit errors downgraded to warnings
Expected validation rejections like queue size limits are now logged as warnings instead of errors, reducing noise in monitoring and alerting systems.
Queue size limit errors were being logged at error level across the system, but these are expected validation rejections — not bugs. When a task can't be triggered because the queue is full, that's working as intended, not a failure.
The fix adds a logLevel property to ServiceValidationError in both the webapp and run-engine packages. All queue limit throws now explicitly set logLevel to "warn". The schedule engine detects these queue limit failures and categorizes them separately from system errors. The redis-worker reads the logLevel from thrown errors and logs at the appropriate level.
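As a rough illustration of the pattern described above, the sketch below shows an error class carrying a `logLevel` and a worker-side helper that reads it. Only `ServiceValidationError` and `logLevel` come from the change description. The constructor signature, `resolveLogLevel`, `logJobFailure`, and the `log` callback are assumptions made for this example, not the actual code in the webapp, run-engine, or redis-worker packages.

```typescript
// Illustrative sketch only: the constructor shape and helper names are assumed.
type LogLevel = "debug" | "info" | "warn" | "error";

class ServiceValidationError extends Error {
  constructor(
    message: string,
    // Optional level at which a catching layer should log this error.
    public readonly logLevel?: LogLevel
  ) {
    super(message);
    this.name = "ServiceValidationError";
  }
}

// Reads logLevel from any thrown value (duck-typed, so it does not depend on
// which package's error class was thrown). Falls back to "error" when the
// property is missing or not a recognised level.
function resolveLogLevel(error: unknown): LogLevel {
  if (typeof error === "object" && error !== null && "logLevel" in error) {
    const level = (error as { logLevel?: unknown }).logLevel;
    if (level === "debug" || level === "info" || level === "warn" || level === "error") {
      return level;
    }
  }
  return "error";
}

// Hypothetical worker-side failure handler: logs at the level the error asks for
// and returns that level.
function logJobFailure(
  error: unknown,
  log: (level: LogLevel, message: string) => void
): LogLevel {
  const level = resolveLogLevel(error);
  const message = error instanceof Error ? error.message : String(error);
  log(level, message);
  return level;
}
```

Under these assumptions, a queue limit throw would look like `throw new ServiceValidationError("Queue size limit exceeded", "warn")`, while errors that carry no `logLevel` continue to be logged at error level.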
In the webapp's schedule engine, failed triggers now return an errorType field that distinguishes "QUEUE_LIMIT" from "SYSTEM_ERROR". The schedule engine logs queue limit failures as warnings, separately from other task failures, and reports them with a distinct error_type in metrics. Some noisy info-level logs in the run-queue were also downgraded to debug.
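The classification step might be sketched as follows. The `errorType` values `"QUEUE_LIMIT"` and `"SYSTEM_ERROR"` come from the description. How the real code detects a queue limit failure is not stated, so the `isQueueLimit` flag here is a hypothetical stand-in, and `FailedTrigger`, `toFailedTrigger`, `reportFailure`, and the in-memory counter are likewise assumptions for illustration.

```typescript
// Illustrative sketch only: detection via an isQueueLimit flag is an assumption.
type TriggerErrorType = "QUEUE_LIMIT" | "SYSTEM_ERROR";

interface FailedTrigger {
  success: false;
  error: string;
  errorType: TriggerErrorType;
}

// Classifies a thrown value. Anything not explicitly flagged as a queue limit
// rejection is treated as a system error.
function classifyTriggerError(error: unknown): TriggerErrorType {
  const flagged =
    typeof error === "object" &&
    error !== null &&
    (error as { isQueueLimit?: unknown }).isQueueLimit === true;
  return flagged ? "QUEUE_LIMIT" : "SYSTEM_ERROR";
}

// Converts a thrown value into the result shape returned for a failed trigger.
function toFailedTrigger(error: unknown): FailedTrigger {
  return {
    success: false,
    error: error instanceof Error ? error.message : String(error),
    errorType: classifyTriggerError(error),
  };
}

// Increments a per-error_type counter (a stand-in for a real metrics client)
// and returns the log level the failure should be reported at.
function reportFailure(
  result: FailedTrigger,
  counts: Map<TriggerErrorType, number>
): "warn" | "error" {
  counts.set(result.errorType, (counts.get(result.errorType) ?? 0) + 1);
  return result.errorType === "QUEUE_LIMIT" ? "warn" : "error";
}
```

Keeping the two categories apart in both the returned result and the metric label means an alert can be keyed on `SYSTEM_ERROR` alone, so a full queue does not page anyone.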
This reduces alert fatigue and makes logs more meaningful: errors represent actual problems, while warnings represent expected rejections.
Original GitHub Description
Queue limit ServiceValidationErrors were being logged at error level. These are expected validation rejections, not bugs.
- Add logLevel property to ServiceValidationError (webapp + run-engine)
- Set logLevel: warn on all queue limit throws
- Schedule engine: detect queue limit failures and log as warn
- Redis-worker: respect logLevel on thrown errors