AI builder eval gains workflow debugging support

JoseBra · Apr 7, 2026 · #28129

feat(ai-builder): Add --keep-workflows flag and fix eval execution errors (no-changelog)

A new --keep-workflows flag preserves built workflows after evaluation runs, making it easier to debug AI builder behavior. The change also brings workflow cleanup in line with n8n's archive-before-delete requirement and improves error handling for failed executions.

Previously, the evaluation harness deleted every built workflow after each run, so developers had no way to inspect what the AI actually generated. Passing --keep-workflows leaves the built workflows in place for post-mortem debugging, with no need to save them manually.

The changes also address an API compliance issue: n8n now requires workflows to be archived before deletion. The client library now calls the archive endpoint before hard-deleting, preventing errors in cleanup routines.
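The archive-then-delete ordering can be sketched as follows. This is a minimal illustration, not the actual n8n-client code: the `WorkflowApi` interface, its `archive` and `remove` methods, and the in-memory `FakeApi` are all hypothetical names introduced here. Only the ordering (archive first, then hard-delete) comes from the description above.

```typescript
// Hypothetical client surface; the real n8n-client methods may be named differently.
interface WorkflowApi {
  archive(id: string): Promise<void>;
  remove(id: string): Promise<void>;
}

// Archive first, then hard-delete, in the order the summary describes.
async function deleteWorkflow(api: WorkflowApi, id: string): Promise<void> {
  await api.archive(id);
  await api.remove(id);
}

// In-memory stand-in that, like the behavior described above,
// refuses to delete a workflow that has not been archived.
class FakeApi implements WorkflowApi {
  readonly calls: string[] = [];
  private readonly archived = new Set<string>();

  async archive(id: string): Promise<void> {
    this.calls.push(`archive:${id}`);
    this.archived.add(id);
  }

  async remove(id: string): Promise<void> {
    if (!this.archived.has(id)) {
      throw new Error(`workflow ${id} must be archived before deletion`);
    }
    this.calls.push(`delete:${id}`);
    this.archived.delete(id);
  }
}
```

Calling `remove` directly on the fake without archiving first rejects, which is the kind of cleanup failure the fix is meant to prevent.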

Error handling in EvalExecutionService was also improved: workflow execution failures are now caught and returned as proper evaluation results instead of propagating as 500 errors.
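A rough sketch of that pattern, assuming a simplified result shape: the `EvalResult` type and the `runEvaluation` function below are hypothetical and do not reflect the real EvalExecutionService signatures. The point is only that a thrown execution error becomes a failed result rather than an unhandled exception.

```typescript
// Hypothetical result shape; the real service's types are not shown in the source.
interface EvalResult {
  success: boolean;
  errors: string[];
}

// Run one execution and convert any thrown error into a failed result,
// so the caller always receives an EvalResult.
async function runEvaluation(execute: () => Promise<void>): Promise<EvalResult> {
  try {
    await execute();
    return { success: true, errors: [] };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { success: false, errors: [message] };
  }
}
```

With this shape, a caller can report the failure alongside other scenario results instead of aborting the whole evaluation run.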

In the AI builder package (packages/@n8n/instance-ai), the CLI flag flows through the argument parser into the test harness runner. The n8n-client handles the archive-before-delete sequence, and scenario hints now pass through to pin data generation so mock nodes reflect the actual evaluation context.
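As an illustration of how such a flag might travel from the command line to the cleanup step, here is a small sketch. The `EvalOptions` type and the `parseEvalArgs` and `cleanupAfterRun` functions are hypothetical; the actual argument parser and harness runner in the package are not shown in the source and may be structured differently.

```typescript
// Hypothetical options object handed from the argument parser to the runner.
interface EvalOptions {
  keepWorkflows: boolean;
}

// Minimal flag detection; a real parser would also handle other arguments.
function parseEvalArgs(argv: string[]): EvalOptions {
  return { keepWorkflows: argv.includes("--keep-workflows") };
}

// Delete built workflows unless the flag asks to keep them.
// Returns the IDs that were actually cleaned up.
function cleanupAfterRun(
  builtIds: string[],
  options: EvalOptions,
  remove: (id: string) => void,
): string[] {
  if (options.keepWorkflows) {
    return [];
  }
  for (const id of builtIds) {
    remove(id);
  }
  return [...builtIds];
}
```

Keeping the decision in a single options object means the runner needs only one check at cleanup time, and the default (no flag) preserves the earlier delete-after-run behavior.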

Original GitHub Description

Summary

Add --keep-workflows CLI flag to preserve built workflows after evaluation for debugging
Fix workflow cleanup: n8n now requires archiving before deletion — deleteWorkflow archives first
Catch workflow execution errors in EvalExecutionService and return proper eval results instead of 500s
Pass scenario hints to bypass pin data generation so AI/LangChain node mocks reflect the scenario
Add Slack channel IDs to daily-slack-summary test case prompt for better builder node configuration

Related Linear ticket

https://linear.app/n8n/issue/TRUST-32

Test plan

pnpm typecheck — both cli and instance-ai clean
pnpm lint — both packages clean
CLI tests: 261 passed (13 suites)
eval-mock-helpers tests: 20 passed
Manual workflow eval run: build + scenarios + archive + delete working end-to-end

🤖 Generated with Claude Code