Token exchange and embed login now expose Prometheus metrics

BGZStephen · Apr 16, 2026 · #28453 · feat: Add Prometheus counters for token exchange

n8n operators can now monitor token exchange and embed login health directly from Prometheus — catching misconfiguration, replay attacks, and JIT provisioning spikes without touching logs.

n8n operators running token exchange and embed login flows previously had no built-in way to monitor their health. Troubleshooting meant wading through application logs to find sporadic failures, replayed tokens, or unexpected user provisioning events.

Six new Prometheus counter families are now available whenever metrics are enabled. Token exchange requests track success and failure rates, with failures broken down by normalized reason codes (invalid signature, unknown key, token replay, and others). Embed login gets the same treatment. Two additional counters surface just-in-time user provisioning and external identity linking events.

Failure reason labels are stable, typed codes — not raw error message strings. This means dashboards won't break if error text changes, and cardinality stays bounded even under high failure volumes.

The embed auth controller also had a monitoring blind spot: authentication failures propagated silently with no event emitted. Now a try/catch wrapper fires an embed-login-failed event before re-throwing, giving operators visibility into the failure path alongside success metrics.

These metrics live in the CLI package and are always registered when the metrics endpoint is active.


Summary

Adds Prometheus counters for token exchange and embed login operations so operators can monitor the health of these authentication flows — detecting key misconfiguration, replay attacks, and JIT provisioning bursts without digging through logs.

New metrics (always registered when `N8N_METRICS=true`)

| Metric | Labels | What it tracks |
| --- | --- | --- |
| `n8n_token_exchange_requests_total` | `result: success\|failure` | Overall exchange success/failure rate |
| `n8n_token_exchange_failures_total` | `reason: <code>` | Failure breakdown by cause |
| `n8n_embed_login_requests_total` | `result: success\|failure` | Embed login success/failure rate |
| `n8n_embed_login_failures_total` | `reason: <code>` | Embed login failure breakdown by cause |
| `n8n_token_exchange_jit_provisioning_total` | none | Users JIT-provisioned via token exchange |
| `n8n_token_exchange_identity_linked_total` | none | External identities linked to existing users |
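To make the relationship between the `result`-labelled and `reason`-labelled counters concrete, here is a minimal sketch. It is not the PR's implementation: `LabelledCounter` and `onEvent` are hypothetical, the counter is a plain in-memory stand-in rather than a real Prometheus client, and it assumes a failed exchange increments both the `result="failure"` series and the matching `reason` series.

```typescript
// In-memory stand-in for a labelled Prometheus counter (hypothetical helper).
class LabelledCounter {
  private readonly values = new Map<string, number>();
  private readonly name: string;
  private readonly label: string;

  constructor(name: string, label: string, seed: string[] = []) {
    this.name = name;
    this.label = label;
    // Pre-seed known label values so they are exposed at 0 before any event.
    for (const value of seed) this.values.set(value, 0);
  }

  inc(labelValue: string): void {
    this.values.set(labelValue, (this.values.get(labelValue) ?? 0) + 1);
  }

  // Render one text line per label value, in insertion order.
  render(): string[] {
    return Array.from(this.values.entries()).map(
      ([value, count]) => `${this.name}{${this.label}="${value}"} ${count}`,
    );
  }
}

const requests = new LabelledCounter('n8n_token_exchange_requests_total', 'result', [
  'success',
  'failure',
]);
// No seed: reason series only appear once a failure with that reason occurs.
const failures = new LabelledCounter('n8n_token_exchange_failures_total', 'reason');

// Hypothetical handler; the event names are taken from the PR's test list.
function onEvent(
  event: 'token-exchange-succeeded' | 'token-exchange-failed',
  reason: string = 'other',
): void {
  if (event === 'token-exchange-succeeded') {
    requests.inc('success');
    return;
  }
  requests.inc('failure'); // assumed: failures count towards both families
  failures.inc(reason);
}

onEvent('token-exchange-failed', 'unknown_key');
console.log([...requests.render(), ...failures.render()].join('\n'));
```

Seeding only the `result` values keeps the success/failure series visible from startup, while the open-ended `reason` series stay absent until they carry information.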

Failure reason labels are stable codes normalised from error messages: `invalid_signature`, `unknown_key`, `token_replay`, `token_too_long`, `token_near_expiry`, `invalid_format`, `missing_kid`, `missing_iss`, `invalid_claims`, `internal_error`, `role_not_allowed`, and `other`. Dashboards won't break if error message text changes.
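One way such a normalisation could work is an ordered list of pattern rules with a catch-all, sketched below. `normalizeReason` and `RULES` are hypothetical names. Only the three role-related substrings are taken from this PR's test list; every other pattern is an assumed placeholder, since the actual error message wording is not shown here.

```typescript
// The bounded set of reason codes listed in the PR description.
const REASONS = [
  'invalid_signature', 'unknown_key', 'token_replay', 'token_too_long',
  'token_near_expiry', 'invalid_format', 'missing_kid', 'missing_iss',
  'invalid_claims', 'internal_error', 'role_not_allowed', 'other',
] as const;

type FailureReason = (typeof REASONS)[number];

// Ordered [pattern, code] rules; the first match wins.
const RULES: ReadonlyArray<[RegExp, FailureReason]> = [
  // These three substrings appear in the PR's test list.
  [/not allowed|unrecognized role|cannot provision/i, 'role_not_allowed'],
  // Assumed wording, for illustration only.
  [/signature/i, 'invalid_signature'],
  [/unknown key/i, 'unknown_key'],
  [/replay/i, 'token_replay'],
];

function normalizeReason(message: string): FailureReason {
  for (const [pattern, code] of RULES) {
    if (pattern.test(message)) return code;
  }
  // Anything unrecognised collapses into one label, keeping cardinality bounded.
  return 'other';
}

console.log(normalizeReason('Unrecognized role: editor'));
console.log(normalizeReason('a brand new error string'));
```

Because the return type is a closed union, a raw message can never leak into a label value: the worst case for an unforeseen error is one extra increment on `other`.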

Also: embed login failure visibility

The embed auth controller previously let errors propagate silently (no event emitted, no metric). This PR wraps handleLogin() in a try/catch that emits a new embed-login-failed event before re-throwing, closing the monitoring blind spot. The event is also wired into the log-streaming relay and audit event registry alongside the existing token exchange events.
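The emit-then-rethrow pattern described above can be sketched as follows. This is an illustration under assumptions, not the controller's code: the tiny event bus, the `login` wrapper, the payload shapes, and the body of `handleLogin` are all hypothetical; only the event names and the `handleLogin()` name come from the PR.

```typescript
// Tiny event bus so the sketch runs without dependencies (a stand-in for
// whatever event service the real controller uses).
type Payload = Record<string, unknown>;
const listeners = new Map<string, Array<(payload: Payload) => void>>();

function on(event: string, fn: (payload: Payload) => void): void {
  listeners.set(event, [...(listeners.get(event) ?? []), fn]);
}

function emit(event: string, payload: Payload): void {
  for (const fn of listeners.get(event) ?? []) fn(payload);
}

// Hypothetical login handler: rejects anything but the literal token "valid".
async function handleLogin(token: string): Promise<{ userId: string }> {
  if (token !== 'valid') throw new Error('Unknown key');
  return { userId: 'user-1' };
}

async function login(token: string): Promise<{ userId: string }> {
  let result: { userId: string };
  try {
    result = await handleLogin(token);
  } catch (error) {
    // Emit the failure event first, then re-throw so callers still see the error.
    const message = error instanceof Error ? error.message : String(error);
    emit('embed-login-failed', { message });
    throw error;
  }
  // The success event is emitted outside the try block, so a throwing
  // listener cannot be misreported as a login failure.
  emit('embed-login', { userId: result.userId });
  return result;
}
```

Re-throwing the original error keeps the HTTP-level behaviour unchanged; the wrapper only adds an observable side effect on the failure path.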

How to test

```bash
# Start n8n with metrics enabled
N8N_METRICS=true n8n start

# Scrape metrics endpoint
curl http://localhost:5678/metrics | grep -E "token_exchange|embed_login"

# Expected samples before any requests (six series at 0; the two
# reason-labelled failures_total counters have no samples until a failure occurs):
# n8n_token_exchange_requests_total{result="success"} 0
# n8n_token_exchange_requests_total{result="failure"} 0
# n8n_embed_login_requests_total{result="success"} 0
# n8n_embed_login_requests_total{result="failure"} 0
# n8n_token_exchange_jit_provisioning_total 0
# n8n_token_exchange_identity_linked_total 0

# After a failed token exchange attempt, verify labelled failure counters appear:
# n8n_token_exchange_failures_total{reason="unknown_key"} 1
```
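If you want to assert on the scraped output from a script rather than by eye, a small parser is enough. The sketch below is a hypothetical helper, not part of n8n. It assumes each sample line has the form `name{labels} value` with no trailing timestamp, and it runs against an inline sample instead of a live endpoint.

```typescript
// Parse sample lines into a map of series -> value, skipping "#" comment
// lines such as HELP and TYPE. Assumes no trailing timestamp on sample lines.
function parseSamples(text: string): Map<string, number> {
  const samples = new Map<string, number>();
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '' || trimmed.startsWith('#')) continue;
    const split = trimmed.lastIndexOf(' ');
    if (split === -1) continue; // not a "series value" pair
    samples.set(trimmed.slice(0, split), Number(trimmed.slice(split + 1)));
  }
  return samples;
}

// Inline sample mirroring the expected output; a real check would use the
// body returned by the /metrics endpoint instead.
const scraped = [
  '# TYPE n8n_token_exchange_requests_total counter',
  'n8n_token_exchange_requests_total{result="success"} 0',
  'n8n_token_exchange_requests_total{result="failure"} 1',
  'n8n_token_exchange_failures_total{reason="unknown_key"} 1',
  'n8n_token_exchange_jit_provisioning_total 0',
].join('\n');

const samples = parseSamples(scraped);
console.log(samples.get('n8n_token_exchange_failures_total{reason="unknown_key"}'));
```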

Related Linear tickets, GitHub issues, and Community forum posts

https://linear.app/n8n/issue/IAM-475

Tests

Unit tests added in `packages/cli/src/metrics/__tests__/prometheus-metrics.service.test.ts` covering:

- All 6 counters are registered on `init()` (unconditional, no config flag required)
- `result` label combos (`success`/`failure`) are pre-seeded at 0 on startup
- `token-exchange-succeeded` → increments success counter
- `token-exchange-failed` → increments failure counter + maps error message to normalized reason label
- Unknown failure reason falls through to `'other'` (cardinality safety)
- Role-related error strings (`'not allowed'`, `'Unrecognized role'`, `'Cannot provision'`) map to `'role_not_allowed'`
- `embed-login` → increments embed login success counter
- `embed-login-failed` → increments embed login failure counter + normalizes reason
- `token-exchange-user-provisioned` → increments JIT provisioning counter
- `token-exchange-identity-linked` → increments identity-linked counter

Embed controller test updated: failure path now asserts embed-login-failed is emitted and embed-login (success event) is not.

Review / Merge checklist

- I have seen this code, I have run this code, and I take responsibility for this code.
- PR title and summary are descriptive. (conventions)
- Docs updated or follow-up ticket created.
- Tests included.
- PR Labeled with Backport to Beta, Backport to Stable, or Backport to v1 (if the PR is an urgent fix that needs to be backported)