feat(DATAGO-130042): add local Entra ID JWT validation to auth middleware#1440
Closed
JKaram wants to merge 1 commit into
Closed
feat(DATAGO-130042): add local Entra ID JWT validation to auth middleware#1440JKaram wants to merge 1 commit into
JKaram wants to merge 1 commit into
Conversation
…ware Add a third branch in shared/auth/middleware.py that validates Entra ID JWTs locally via JWKS, positioned between the existing sam_access_token path and the IdP /user_info fallback. Motivation: CI-minted service-principal tokens (via GitHub OIDC federation) are app-only and lack a `sub` claim, so the existing fallback path's call to Microsoft Graph /oidc/userinfo rejects them. Local JWKS verification sidesteps Graph entirely. Security notes: - Algorithm whitelist RS256 blocks alg=none and HS256 key confusion - aud/tid mismatch post-kid-resolution returns 401 (confused-deputy defense), not fall-through - App-only tokens get a `.invalid` TLD sentinel email (RFC 2606) so they can never coincidentally match allowed_domains/shared_user_emails - No new dependencies; PyJWT + cryptography already present Includes 29 unit tests covering validator outcomes, middleware branch ordering, and soft-auth probe behaviour.
✅ FOSSA Guard: Licensing (
|
✅ FOSSA Guard: Vulnerability (
|
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.




What is the purpose of this change?
Adds a third authentication branch to
shared/auth/middleware.pythat validates Microsoft Entra ID JWTs locally via JWKS, positioned between the existingsam_access_tokenpath and the IdP/user_infofallback.Motivation. The existing IdP fallback path routes bearer tokens through the OAuth proxy's
/user_infoendpoint, which calls Microsoft Graph's/oidc/userinfo. Graph rejects app-only tokens (nosubclaim — onlyoidandappid) because the OIDCuserinfospec requiressub. This blocks any caller authenticating as a service principal, e.g. CI workflows minting tokens via GitHub OIDC federation.This PR adds a local-verification branch so app-only Entra ID tokens can reach the backend without touching Graph. Users with real Graph-compatible tokens are unaffected — they still flow through the existing IdP path.
Jira: DATAGO-130042
Companion PR
Consumed by SolaceDev/solace-chat#428, which:
Authorization: Bearer <token>for smoke tests against stagingaad_tenant_id/aad_audience/aad_issuer_overrideconfig keys to itswebui.yamlthat activate the branch introduced hereWithout this PR, those config keys are defined but have no consumer — the bearer would still 401 against
/user_info.How is this accomplished?
New module:
src/solace_agent_mesh/shared/auth/jwt_validator.pyGeneric local JWKS-based JWT verification, with Entra ID as the first supported issuer (module deliberately named
jwt_validatorrather thanaad_jwt_validatorso a future Okta/Auth0/etc. local-verify caller can reuse the shape). Public surface:AadValidatorConfig— tenant, audience, optional issuer/JWKS overrides, 60s leeway.accepted_audiences()normalises betweenapi://<guid>and bare<guid>so either CI configuration works.AadTokenValidator.validate(token) -> AadValidationOutcome— three-outcome tagged union (enum, not string):VALID,NOT_AAD,INVALID. No exceptions for control flow.AadClaims— structured view of the validated payload with anis_service_principalflag derived from theidtypclaim (falls back to sub/oid heuristic).build_validator()— factory.PyJWKClientconstruction is lightweight (no network I/O); the firstvalidate()call fetches JWKS and caches it for 300s.Three-outcome contract:
VALIDiss/aud/tid/exp/nbfall passrequest.state.user, 200 pathNOT_AADPyJWKClientError(kid unknown), non-JWT shape, issuer mismatchINVALIDThe
aud/tidmismatch →INVALID(notNOT_AAD) rule is a confused-deputy defense. Once akidresolves in Entra ID's JWKS, the token is claiming to be for us — any failure past that point is a hard reject rather than handing the token to another branch that might happen to accept it.Middleware integration (
src/solace_agent_mesh/shared/auth/middleware.py)_handle_authenticated_request:sam_access_token→ Entra ID JWT → IdP. The AAD branch only runs when bothaad_tenant_idandaad_audienceare configured._get_or_build_aad_validator(tenant, audience)helper caches one validator per(tenant, audience)tuple on the middleware instance. Lazy import ofjwt_validatorkeepsPyJWKClient/cryptographyoff the import path for deployments that don't opt in.infowhen enabled,warning+ disables the branch if only one of the pair is set (fail noisy, not silent 401s).INVALIDoutcome setsrequest.state.auth_probe = Trueand proceeds, matching the existing pattern for other branches.Identity shape
user_id= first ofsub,oid,appidemail= first ofemail,preferred_username,upn, elsesvc-principal+{appid}@aad-app-only.invalid. The IANA-reserved.invalidTLD (RFC 2606) means the sentinel can never coincidentally match anallowed_domains/shared_user_emailsentry.is_service_principal+service_principal_idadded torequest.state.userso downstream code can branch on identity type without regexing the email.auth_method = "aad_jwt"flag for downstream branching.Dependencies: no new ones.
pyjwt>=2.12.0andcryptography==46.0.7are already inpyproject.toml.Security notes
Verified non-issues (confirmed during security review):
algorithms=("RS256",)passed explicitly tojwt.decode. PyJWT 2.x removed auto-detect; passingalgorithmsis mandatory and RS256-only blocks both.jku/x5uheader injection —PyJWKClientignores these; onlykidis used against the configured URL.share.py::can_be_accessed_by_user,shared_user_emails,project_service.py::get_accessible_projects,routers/auth.py,routers/users.py,local_file_identity_service.py. All use exact-match equality or display-only; the.invalid-TLD sentinel is non-matchable across all of them._handle_authenticated_requestis not called whenfrontend_use_authorization=false(short-circuit earlier in the file). The new branch is unreachable in dev deployments.Tests
29 new unit tests — all passing, and the full community suite (4959/4959) remains green.
tests/unit/auth/test_jwt_validator.py(20 tests) — validator in isolation. Generates an RSA keypair viacryptography, signs test tokens with PyJWT, stubsPyJWKClient.get_signing_key_from_jwt:is_service_principalflag)INVALIDalg=none, HS256 key-confusion (hand-forged withhmac.new) →INVALIDINVALID(post-kid-resolution tightening)tid→INVALIDNOT_AADNOT_AADNOT_AADtid→INVALIDaccepted_audiences()normalisation both directions (api://<guid>↔ bare GUID)tests/unit/auth/test_middleware_aad.py(9 tests) — middleware integration. ReusesMockComponent/MockTrustManagerpatterns, stubs the validator to return canned outcomes:VALIDuser token /VALIDapp-only token (asserts.invalidTLD sentinel +is_service_principal)INVALID→ 401 JSON response, IdP path not calledINVALIDunder soft-auth →request.state.auth_probe = True, returnsFalseNOT_AAD→ IdP fall-through path runsAnything reviews should focus on/be aware of?
aad_tenant_id/aad_audienceunset (default), the branch is skipped entirely — zero behaviour change for existing deployments. The lazy import means even thePyJWKClientdependency isn't touched.aud/tidmismatch = hard 401). Deliberate — a previous draft of this plan fell through on audience mismatch, which relied on the downstream proxy also enforcing audience. Tighter rule here guards against proxy misconfiguration..invalidTLD sentinel. Intentionally non-matchable. Every email-keyed authorization call site was audited before merge; findings are in the plan doc under "Risks §A".sam_access_tokenlog downgrade (warning → debug). Once this lands, every real Entra ID token passes through the sam-tokenexceptclause before succeeding in the new branch. Warning-level volume would explode. Downgrade prevents that — failures are now only loud when they end up as 401s.sts.windows.netissuer, Azure sovereign clouds, conditional access / revocation checks, lifting the branch into a formal provider class. All tracked as follow-ups in the plan.solace-chat#428) must have its staging pod configured withAAD_TENANT_ID,AAD_AUDIENCE,FRONTEND_USE_AUTHORIZATION=truebefore the cron will succeed end-to-end.Verification
After merge + staging deploy, trigger
playwright-smoke-tests.yamlonsolace-chatviaworkflow_dispatch— expected: end-to-end green (OIDC mint → bearer → local JWKS verify → 200).