AI builder eval gains workflow debugging support

A new --keep-workflows flag preserves built workflows after evaluation runs, making it easier to debug AI builder behavior. The change also fixes n8n API compliance and improves error handling.
Previously, every built workflow was deleted after each evaluation run, so developers had no way to inspect what the AI actually generated. With --keep-workflows, the built workflows are preserved for post-mortem debugging, with no need to save them manually.
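As a rough illustration of how such a flag might gate cleanup, here is a minimal sketch. Only the flag name --keep-workflows comes from the PR; `EvalCliOptions`, `parseEvalArgs`, `cleanupWorkflows`, and the injected `deleteWorkflow` callback are hypothetical names, not the actual harness code.

```typescript
// Hypothetical options object; only the flag name comes from the PR.
interface EvalCliOptions {
  keepWorkflows: boolean;
}

// Detect the flag in a raw argument list (e.g. process.argv.slice(2)).
function parseEvalArgs(argv: string[]): EvalCliOptions {
  return { keepWorkflows: argv.includes("--keep-workflows") };
}

// Delete built workflows unless the caller asked to keep them.
// Returns the IDs that were preserved so they can be reported for debugging.
async function cleanupWorkflows(
  workflowIds: string[],
  options: EvalCliOptions,
  deleteWorkflow: (id: string) => Promise<void>, // assumed client call
): Promise<string[]> {
  if (options.keepWorkflows) {
    return workflowIds; // nothing is deleted
  }
  for (const id of workflowIds) {
    await deleteWorkflow(id);
  }
  return [];
}
```

Returning the preserved IDs lets the runner print them at the end of a run, which is one plausible way to make the kept workflows easy to find.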
The changes also address an API compliance issue: n8n now requires workflows to be archived before deletion. The client library now calls the archive endpoint before hard-deleting, preventing errors in cleanup routines.
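The archive-before-delete sequence can be sketched as follows. The `WorkflowApi` interface and its method names are assumptions made for illustration; the PR only states that `deleteWorkflow` archives first, and the real client's HTTP calls are not shown in the source.

```typescript
// Hypothetical transport abstraction; the real n8n-client's HTTP layer is not shown here.
interface WorkflowApi {
  archive(workflowId: string): Promise<void>;
  remove(workflowId: string): Promise<void>;
}

// Archive first, then hard-delete, because n8n now requires
// a workflow to be archived before it can be deleted.
async function deleteWorkflow(api: WorkflowApi, workflowId: string): Promise<void> {
  await api.archive(workflowId);
  await api.remove(workflowId);
}
```

Keeping both calls inside one function means cleanup routines do not need to know about the archive requirement.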
Error handling in EvalExecutionService was also improved: workflow execution errors are now caught and returned as proper evaluation results instead of propagating as 500 errors.
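The general pattern of converting an execution failure into a result value might look like the sketch below. The `EvalResult` shape and the `runScenario` function are hypothetical; the actual fields returned by EvalExecutionService are not described in the PR.

```typescript
// Hypothetical result shape; the real evaluation result type is not shown in the PR.
interface EvalResult {
  scenario: string;
  success: boolean;
  error?: string;
}

// Run a workflow execution and report failures as data instead of throwing,
// so the caller can return a normal response rather than a 500.
async function runScenario(
  scenario: string,
  execute: () => Promise<unknown>, // assumed: triggers the workflow execution
): Promise<EvalResult> {
  try {
    await execute();
    return { scenario, success: true };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { scenario, success: false, error: message };
  }
}
```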
In the AI builder package (packages/@n8n/instance-ai), the CLI flag flows through the argument parser into the test harness runner. The n8n-client handles the archive-before-delete sequence, and scenario hints now pass through to pin data generation so mock nodes reflect the actual evaluation context.
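To make the hint pass-through concrete, here is a small sketch. It assumes that pin data is generated from a text prompt, which the PR does not state. `PinDataRequest`, `buildPinDataPrompt`, and the prompt wording are all hypothetical.

```typescript
// Hypothetical request shape; field names are illustrative only.
interface PinDataRequest {
  nodeNames: string[];
  scenarioHints?: string; // free-text description of the evaluation scenario
}

// Build the instruction used to generate mock node output.
// When hints are present they are appended so the mocks match the scenario.
function buildPinDataPrompt(request: PinDataRequest): string {
  const lines = [`Generate mock output for nodes: ${request.nodeNames.join(", ")}`];
  if (request.scenarioHints) {
    lines.push(`Scenario context: ${request.scenarioHints}`);
  }
  return lines.join("\n");
}
```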
Summary
- Add --keep-workflows CLI flag to preserve built workflows after evaluation for debugging
- Fix workflow cleanup: n8n now requires archiving before deletion, so deleteWorkflow archives first
- Catch workflow execution errors in EvalExecutionService and return proper eval results instead of 500s
- Pass scenario hints to bypass pin data generation so AI/LangChain node mocks reflect the scenario
- Add Slack channel IDs to daily-slack-summary test case prompt for better builder node configuration
Related Linear ticket
https://linear.app/n8n/issue/TRUST-32
Test plan
- pnpm typecheck: both cli and instance-ai clean
- pnpm lint: both packages clean
- CLI tests: 261 passed (13 suites)
- eval-mock-helpers tests: 20 passed
- Manual workflow eval run: build + scenarios + archive + delete working end-to-end
🤖 Generated with Claude Code