Token exchange and embed login now expose Prometheus metrics
n8n operators can now monitor token exchange and embed login health directly from Prometheus — catching misconfiguration, replay attacks, and JIT provisioning spikes without touching logs.
n8n operators running token exchange and embed login flows previously had no built-in way to monitor their health. Troubleshooting meant wading through application logs to find sporadic failures, replayed tokens, or unexpected user provisioning events.
Six new Prometheus counter families are now available whenever metrics are enabled. Token exchange requests track success and failure rates, with failures broken down by normalized reason codes (invalid signature, unknown key, token replay, and others). Embed login gets the same treatment. Two additional counters surface just-in-time user provisioning and external identity linking events.
Failure reason labels are stable, typed codes — not raw error message strings. This means dashboards won't break if error text changes, and cardinality stays bounded even under high failure volumes.
The embed auth controller also had a monitoring blind spot: authentication failures propagated silently with no event emitted. Now a try/catch wrapper fires an embed-login-failed event before re-throwing, giving operators visibility into the failure path alongside success metrics.
These metrics live in the CLI package and are always registered when the metrics endpoint is active.
Summary
Adds Prometheus counters for token exchange and embed login operations so operators can monitor the health of these authentication flows — detecting key misconfiguration, replay attacks, and JIT provisioning bursts without digging through logs.
New metrics (always registered when N8N_METRICS=true)
| Metric | Labels | What it tracks |
|---|---|---|
| n8n_token_exchange_requests_total | result: success/failure | Overall exchange success/failure rate |
| n8n_token_exchange_failures_total | reason: <code> | Failure breakdown by cause |
| n8n_embed_login_requests_total | result: success/failure | Embed login success/failure rate |
| n8n_embed_login_failures_total | reason: <code> | Embed login failure breakdown by cause |
| n8n_token_exchange_jit_provisioning_total | (none) | Users JIT-provisioned via token exchange |
| n8n_token_exchange_identity_linked_total | (none) | External identities linked to existing users |
Failure reason labels are stable codes normalized from error messages (invalid_signature, unknown_key, token_replay, token_too_long, token_near_expiry, invalid_format, missing_kid, missing_iss, invalid_claims, internal_error, role_not_allowed, other), so dashboards won't break if error message text changes.
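The normalization step can be pictured as a small lookup from error text to a fixed set of codes. The following is a minimal sketch, not the PR's actual implementation: the function name `normalizeFailureReason`, the rule table, and the case-insensitive substring matching are assumptions made for illustration. Only the three role-related substrings and the fallback to `other` are taken from this description; the remaining substrings are placeholders.

```typescript
// Stable reason codes listed in the PR description.
type FailureReason =
  | 'invalid_signature' | 'unknown_key' | 'token_replay' | 'token_too_long'
  | 'token_near_expiry' | 'invalid_format' | 'missing_kid' | 'missing_iss'
  | 'invalid_claims' | 'internal_error' | 'role_not_allowed' | 'other';

// Ordered [substring, code] rules, matched case-insensitively (an assumption).
// Only the three role-related substrings come from the PR description;
// the others are hypothetical placeholders for real error wording.
const REASON_RULES: ReadonlyArray<readonly [string, FailureReason]> = [
  ['not allowed', 'role_not_allowed'],
  ['unrecognized role', 'role_not_allowed'],
  ['cannot provision', 'role_not_allowed'],
  ['invalid signature', 'invalid_signature'], // assumed wording
  ['unknown key', 'unknown_key'],             // assumed wording
  ['replay', 'token_replay'],                 // assumed wording
];

function normalizeFailureReason(message: string): FailureReason {
  const lower = message.toLowerCase();
  for (const [needle, code] of REASON_RULES) {
    if (lower.includes(needle)) return code;
  }
  // Unmatched messages collapse into one bucket, so the label set stays bounded.
  return 'other';
}
```

Because every return value belongs to the closed `FailureReason` union, the `reason` label can never take more values than the union has members, whatever text an error carries.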
Also: embed login failure visibility
The embed auth controller previously let errors propagate silently (no event emitted, no metric). This PR wraps handleLogin() in a try/catch that emits a new embed-login-failed event before re-throwing, closing the monitoring blind spot. The event is also wired into the log-streaming relay and audit event registry alongside the existing token exchange events.
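The emit-then-rethrow pattern described here can be sketched as follows. This is an illustration under assumptions rather than the controller's real code: `EventBus`, `loginWithEvents`, and the event payload shapes are hypothetical stand-ins, and only the event names `embed-login` and `embed-login-failed` and the `handleLogin()` call come from this description.

```typescript
type AuthEvent =
  | { name: 'embed-login'; userId: string }
  | { name: 'embed-login-failed'; message: string };

type Listener = (event: AuthEvent) => void;

// Hypothetical minimal event bus standing in for n8n's internal event service.
class EventBus {
  private listeners: Listener[] = [];

  on(listener: Listener): void {
    this.listeners.push(listener);
  }

  emit(event: AuthEvent): void {
    for (const listener of this.listeners) listener(event);
  }
}

// Wraps a login handler so the failure path emits an event before re-throwing.
async function loginWithEvents(
  bus: EventBus,
  handleLogin: () => Promise<string>, // resolves to a user id (assumed shape)
): Promise<string> {
  try {
    const userId = await handleLogin();
    bus.emit({ name: 'embed-login', userId });
    return userId;
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    bus.emit({ name: 'embed-login-failed', message });
    throw error; // callers still receive the original error
  }
}
```

Re-throwing the original error keeps the HTTP response behavior unchanged; the wrapper only adds the event that metrics, log streaming, and audit listeners subscribe to.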
How to test
# Start n8n with metrics enabled
N8N_METRICS=true n8n start
# Scrape metrics endpoint
curl http://localhost:5678/metrics | grep -E "token_exchange|embed_login"
# Expected output before any requests (pre-seeded counters at 0; the two failures_total families only emit samples once a failure is recorded):
# n8n_token_exchange_requests_total{result="success"} 0
# n8n_token_exchange_requests_total{result="failure"} 0
# n8n_embed_login_requests_total{result="success"} 0
# n8n_embed_login_requests_total{result="failure"} 0
# n8n_token_exchange_jit_provisioning_total 0
# n8n_token_exchange_identity_linked_total 0
# After a failed token exchange attempt, verify labelled failure counters appear:
# n8n_token_exchange_failures_total{reason="unknown_key"} 1
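For a scripted check, output like the above can be parsed and turned into a failure ratio. The sketch below is a simplified parser written for this example: the helper names `parseSamples` and `failureRatio` are hypothetical, it handles at most one label per sample, and it is not a full implementation of the Prometheus text format.

```typescript
interface Sample {
  name: string;
  labelValue: string; // empty string when the sample has no label
  value: number;
}

// Parse sample lines such as `name{label="value"} 3`; comment lines are skipped.
function parseSamples(text: string): Sample[] {
  const pattern = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{\w+="([^"]*)"\})?\s+(\S+)$/;
  const samples: Sample[] = [];
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '' || trimmed.startsWith('#')) continue;
    const match = pattern.exec(trimmed);
    if (!match) continue;
    samples.push({ name: match[1], labelValue: match[2] ?? '', value: Number(match[3]) });
  }
  return samples;
}

// Share of requests that failed, or null when no requests were recorded yet.
function failureRatio(text: string, family: string): number | null {
  let success = 0;
  let failure = 0;
  for (const sample of parseSamples(text)) {
    if (sample.name !== family) continue;
    if (sample.labelValue === 'success') success += sample.value;
    if (sample.labelValue === 'failure') failure += sample.value;
  }
  const total = success + failure;
  return total === 0 ? null : failure / total;
}
```

Returning `null` for an empty counter pair avoids a division by zero on a freshly started instance, where both `result` series are still at 0.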
Related Linear tickets, GitHub issues, and Community forum posts
https://linear.app/n8n/issue/IAM-475
Tests
Unit tests added in packages/cli/src/metrics/__tests__/prometheus-metrics.service.test.ts covering:
- All 6 counters are registered on init() (unconditional, no config flag required)
- result label combos (success/failure) are pre-seeded at 0 on startup
- token-exchange-succeeded → increments success counter
- token-exchange-failed → increments failure counter + maps error message to normalized reason label
- Unknown failure reason falls through to 'other' (cardinality safety)
- Role-related error strings ('not allowed', 'Unrecognized role', 'Cannot provision') map to 'role_not_allowed'
- embed-login → increments embed login success counter
- embed-login-failed → increments embed login failure counter + normalizes reason
- token-exchange-user-provisioned → increments JIT provisioning counter
- token-exchange-identity-linked → increments identity-linked counter
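The behavior these tests cover, pre-seeded `result` labels plus event-driven increments, can be sketched with an in-memory counter. `LabelledCounter` and `onTokenExchangeEvent` below are hypothetical stand-ins written for this example; they are not the prom-client API and not the code in prometheus-metrics.service.ts. Only the metric names, label names, and event names come from this description.

```typescript
// Hypothetical stand-in for a single-label Prometheus counter.
class LabelledCounter {
  readonly name: string;
  readonly labelName: string;
  private readonly values = new Map<string, number>();

  constructor(name: string, labelName: string) {
    this.name = name;
    this.labelName = labelName;
  }

  // Pre-seed a label value so its series is exported at 0 before any event.
  seed(labelValue: string): void {
    if (!this.values.has(labelValue)) this.values.set(labelValue, 0);
  }

  inc(labelValue: string): void {
    this.values.set(labelValue, (this.values.get(labelValue) ?? 0) + 1);
  }

  // Render samples in Prometheus text exposition style, in insertion order.
  render(): string[] {
    const lines: string[] = [];
    this.values.forEach((value, label) => {
      lines.push(`${this.name}{${this.labelName}="${label}"} ${value}`);
    });
    return lines;
  }
}

const requests = new LabelledCounter('n8n_token_exchange_requests_total', 'result');
const failures = new LabelledCounter('n8n_token_exchange_failures_total', 'reason');

// Both result values are seeded up front; reason values appear only on demand.
requests.seed('success');
requests.seed('failure');

// Map event names to counter updates; `reason` is an already-normalized code.
function onTokenExchangeEvent(name: string, reason?: string): void {
  if (name === 'token-exchange-succeeded') {
    requests.inc('success');
  } else if (name === 'token-exchange-failed') {
    requests.inc('failure');
    failures.inc(reason ?? 'other');
  }
}
```

Seeding only the `result` label keeps the request series visible from startup, while the `reason` series stay absent until a failure with that code occurs.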
Embed controller test updated: failure path now asserts embed-login-failed is emitted and embed-login (success event) is not.
Review / Merge checklist
- I have seen this code, I have run this code, and I take responsibility for this code.
- PR title and summary are descriptive. (conventions)
- Docs updated or follow-up ticket created.
- Tests included.
- PR Labeled with Backport to Beta, Backport to Stable, or Backport to v1 (if the PR is an urgent fix that needs to be backported)