Reactive OAuth2 proxy/service built with Spring Boot 4 + Java 25, using WebFlux, R2DBC, Keycloak, and OTLP-first observability.
- `API.md` - endpoint reference and response/error contract
- `SECURITY.md` - security policy and runtime security model
- `CONTRIBUTING.md` - contribution workflow and coding/testing standards
- `CHANGELOG.md` - notable project changes
This project keeps the same business scenarios as jwt-demo, but the implementation is fully reactive:
- WebFlux controllers (`Mono<AppResponse<...>>`)
- R2DBC + PostgreSQL persistence
- Async request lifecycle (`PENDING -> PROCESSING -> COMPLETED|FAILED`)
- Security chain with opaque token introspection + DPoP + rate limiting
This project is licensed under the MIT License. See LICENSE for details.
- Username/password login, token refresh, logout (`/api/auth/*`)
- Opaque token introspection for protected APIs
- DPoP support for auth endpoints and protected endpoints
- Role-based authorization (`CLIENT_CREATE`, `CLIENT_GET`, `CLIENT_SEARCH`, `UPDATE_BALANCE`)
- Async client creation request queue + status endpoint
- Account balance updates (pessimistic and optimistic flows)
- OTLP-first observability:
- traces: Spring Boot -> OTel Collector -> Tempo
- logs: Spring Boot -> OTel Collector -> Loki
  - metrics: Prometheus scrapes `/actuator/prometheus`
- Java 25
- Spring Boot 4.0.3
- Spring WebFlux
- Spring Security OAuth2 Resource Server
- PostgreSQL + Flyway
- Spring Data R2DBC
- Bucket4j + Caffeine
- OpenTelemetry + Micrometer + Prometheus
- Grafana + Loki + Tempo + OTel Collector
- Testcontainers + WireMock + Awaitility
Create .env from the template:
```powershell
Set-Location <repo-root>
Copy-Item .env.example .env
```

Set real values for secrets in `.env`:

- `APP_DB_PASSWORD`
- `KEYCLOAK_ADMIN_PASSWORD`
- `KEYCLOAK_RESOURCE_CLIENT_SECRET`
- `GRAFANA_ADMIN_PASSWORD`
Resource server introspection credentials must match the Keycloak realm import (src/test/resources/keycloak/realm-export.json):
- `KEYCLOAK_RESOURCE_CLIENT_ID`
- `KEYCLOAK_RESOURCE_CLIENT_SECRET`
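As a rough sketch, a filled-in `.env` could look like the following. All values here are placeholders, not real secrets; only `KEYCLOAK_RESOURCE_CLIENT_ID=resource-server` is taken from this README, the rest must match your own setup and the realm import:

```
# Placeholder values - replace with real secrets before use
APP_DB_PASSWORD=change-me
KEYCLOAK_ADMIN_PASSWORD=change-me
KEYCLOAK_RESOURCE_CLIENT_ID=resource-server
KEYCLOAK_RESOURCE_CLIENT_SECRET=<secret-from-realm-export-or-keycloak>
GRAFANA_ADMIN_PASSWORD=change-me
```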
```powershell
Set-Location <repo-root>
docker compose up -d --build
```

Check status:

```powershell
docker compose ps
```

Stop:

```powershell
docker compose down
```

Stop and remove volumes:

```powershell
docker compose down -v
```

When running via `mvn spring-boot:run`, `.env` is not auto-loaded by Spring Boot.
Set required variables in your shell first (note the host Keycloak URL):
```powershell
$env:KEYCLOAK_RESOURCE_CLIENT_ID = "resource-server"
$env:KEYCLOAK_RESOURCE_CLIENT_SECRET = "<secret-from-realm-export-or-keycloak>"
$env:KEYCLOAK_AUTH_SERVER_URL = "http://localhost:8080"
Set-Location <repo-root>
docker compose up -d postgres keycloak
mvn spring-boot:run
```

Services in the compose stack:

- `app` - application (:8081)
- `postgres` - database (:5432)
- `keycloak` - auth server (:8080)
- `prometheus` - metrics (:9090)
- `grafana` - dashboards (:3000)
- `loki` - logs (:3100)
- `tempo` - traces (:3200)
- `otel-collector` - OTLP ingest/export
- API base: http://localhost:8081
- Swagger UI: http://localhost:8081/swagger-ui.html
- OpenAPI JSON: http://localhost:8081/v3/api-docs
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000
- Loki readiness: http://localhost:3100/ready
- Tempo: http://localhost:3200
- App metrics endpoint: http://localhost:8081/actuator/prometheus
The OpenAPI spec is generated at runtime.
Helpful links:
- Swagger UI: http://localhost:8081/swagger-ui.html
- OpenAPI JSON: http://localhost:8081/v3/api-docs
Tip: use Swagger UI for quick token-based checks after login (/api/auth/login), then call protected endpoints with Bearer or DPoP authorization.
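For a quick manual check outside Swagger UI, the login exchange has roughly the following shape. This is an illustrative sketch: the request field names follow the login sequence diagram, but the exact JSON layout of the `AppResponse` wrapper and token fields is defined by the application, not by this README:

```
POST /api/auth/login HTTP/1.1
Host: localhost:8081
Content-Type: application/json

{"username": "user", "password": "password", "clientId": "spring-app", "clientSecret": "<client-secret>"}

HTTP/1.1 200 OK
Content-Type: application/json

{"code": 0, "data": {"access_token": "<access-token>", "refresh_token": "<refresh-token>"}}
```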
Public routes:
- `/api/auth/**`
- `/v3/api-docs/**`, `/swagger-ui/**`, `/swagger-ui.html`
- `/actuator/prometheus`
All other routes require authentication.
Authorization options:
- `Authorization: Bearer <access_token>`
- `Authorization: DPoP <access_token>` together with `DPoP: <proof-jwt>`
| Endpoint | Method | Access | Required Role |
|---|---|---|---|
| `/api/auth/login` | POST | Public | - |
| `/api/auth/refresh` | POST | Public | - |
| `/api/auth/logout` | POST | Public | - |
| `/api/clients` | POST | Protected | CLIENT_CREATE |
| `/api/clients/{id}` | GET | Protected | CLIENT_GET |
| `/api/clients/search` | GET | Protected | CLIENT_SEARCH |
| `/api/requests/{id}` | GET | Protected | CLIENT_CREATE |
| `/api/accounts/balance/pessimistic` | POST | Protected | UPDATE_BALANCE |
| `/api/accounts/balance/optimistic` | POST | Protected | UPDATE_BALANCE |
| `/api/accounts/client/{clientId}` | GET | Protected | CLIENT_GET |
| Endpoint | Method | Required Role | Accepted Auth Scheme |
|---|---|---|---|
| `/api/clients` | POST | CLIENT_CREATE | Bearer or DPoP |
| `/api/requests/{id}` | GET | CLIENT_CREATE | Bearer or DPoP |
| `/api/clients/{id}` | GET | CLIENT_GET | Bearer or DPoP |
| `/api/clients/search` | GET | CLIENT_SEARCH | Bearer or DPoP |
| `/api/accounts/client/{clientId}` | GET | CLIENT_GET | Bearer or DPoP |
| `/api/accounts/balance/pessimistic` | POST | UPDATE_BALANCE | Bearer or DPoP |
| `/api/accounts/balance/optimistic` | POST | UPDATE_BALANCE | Bearer or DPoP |
```mermaid
sequenceDiagram
    actor C as Client
    participant S as Spring Boot (AuthController)
    participant K as Keycloak
    C->>S: 1) POST /api/auth/login<br/>{username, password, clientId, clientSecret}
    S->>K: 2) KeycloakReactiveAuthService.login()
    K->>K: 3) POST /realms/my-realm/protocol/openid-connect/token<br/>grant_type=password<br/>username, password<br/>client_id, client_secret
    K-->>S: 4) 200 OK<br/>{access_token, refresh_token}
    S-->>C: 5) AppResponse(code=0, data=tokens)
```
```mermaid
sequenceDiagram
    actor C as Client
    participant S as Spring Boot (AuthController)
    participant K as Keycloak
    C->>S: 1) POST /api/auth/refresh<br/>{refreshToken, clientId, clientSecret}
    S->>K: 2) KeycloakReactiveAuthService.refresh()
    K->>K: 3) POST /realms/my-realm/protocol/openid-connect/token<br/>grant_type=refresh_token<br/>refresh_token<br/>client_id, client_secret
    K-->>S: 4) 200 OK<br/>{new_access_token, new_refresh_token}
    S-->>C: 5) AppResponse(code=0, data=tokens)
```
```mermaid
sequenceDiagram
    actor C as Client
    participant S as Spring Boot (AuthController)
    participant K as Keycloak
    C->>S: 1) POST /api/auth/logout<br/>{refreshToken, clientId, clientSecret}
    S->>K: 2) KeycloakReactiveAuthService.logout()
    K->>K: 3) POST /realms/my-realm/protocol/openid-connect/logout<br/>client_id, client_secret<br/>refresh_token
    K-->>S: 4) 200 OK (Keycloak behavior)
    S-->>C: 5) AppResponse(code=0)
```
`POST /api/clients` does not create a client synchronously; it enqueues a `CLIENT_CREATE` request and returns a request id for polling.
```mermaid
sequenceDiagram
    actor C as Caller
    participant A as API (/api/clients)
    participant D as request table (PostgreSQL)
    participant W as Request Worker
    C->>A: POST /api/clients
    A->>A: Validate payload
    A->>D: INSERT type=CLIENT_CREATE, status=PENDING
    A-->>C: AppResponse(code=0, data={requestId})
    loop Poll until terminal status
        C->>A: GET /api/requests/{id}
        A->>D: SELECT status by id
        A-->>C: status=PENDING|PROCESSING|COMPLETED|FAILED
    end
    W->>D: Reclaim stale PROCESSING rows
    W->>D: Claim PENDING batch (FOR UPDATE SKIP LOCKED)
    W->>D: UPDATE status=PROCESSING
    alt Success
        W->>D: UPDATE status=COMPLETED, response_json
    else Failure
        W->>D: UPDATE status=FAILED, error_json
    end
```
For multi-instance safety, stale PROCESSING reclaim is implemented and indexed (V2__add_request_reclaim_index.sql).
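The lifecycle above, including the reclaim of stale `PROCESSING` rows back to `PENDING`, can be sketched as a small transition table. This is an illustrative model only; the class and method names are mine and not taken from the project's worker code:

```java
import java.util.Map;
import java.util.Set;

public class RequestLifecycle {

    enum Status { PENDING, PROCESSING, COMPLETED, FAILED }

    // Allowed transitions. PROCESSING -> PENDING models the stale-reclaim path;
    // COMPLETED and FAILED are terminal, so they allow no further transitions.
    static final Map<Status, Set<Status>> ALLOWED = Map.of(
            Status.PENDING, Set.of(Status.PROCESSING),
            Status.PROCESSING, Set.of(Status.COMPLETED, Status.FAILED, Status.PENDING),
            Status.COMPLETED, Set.of(),
            Status.FAILED, Set.of());

    static boolean canTransition(Status from, Status to) {
        return ALLOWED.get(from).contains(to);
    }

    public static void main(String[] args) {
        // A worker claims a pending request, then finishes it.
        System.out.println(canTransition(Status.PENDING, Status.PROCESSING));   // true
        System.out.println(canTransition(Status.PROCESSING, Status.COMPLETED)); // true
        // Terminal states never move again.
        System.out.println(canTransition(Status.COMPLETED, Status.PENDING));    // false
    }
}
```

Encoding the transitions as data rather than scattered `if` checks makes the reclaim rule (`PROCESSING -> PENDING`) explicit and easy to test.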
- Keycloak realm is auto-imported from `src/test/resources/keycloak/realm-export.json`.
- Demo users included in realm import: `user/password`, `admin/admin`
- Demo clients in realm import include: `spring-app`, `resource-server` (used for opaque token introspection)
- Client search supports trigram indexes if the `pg_trgm` extension exists; the migration creates indexes conditionally.
- Async worker tunables are configurable in `application.properties`:
  - `app.request.worker.batch-size`
  - `app.request.worker.max-concurrency` (env: `APP_REQUEST_WORKER_MAX_CONCURRENCY`)
  - `app.request.worker.interval-ms`
  - `app.request.worker.retry.max-attempts`
  - `app.request.worker.retry.backoff-ms`
  - `app.request.worker.processing-timeout`
- Reclaim path is optimized by index: `idx_request_status_type_status_changed_at` on `(status, type, status_changed_at)`
- Basic perf smoke scenario with before/after metric diff:
  - script: `ops/perf/perf-smoke.ps1`
  - usage docs: `ops/perf/README.md`
  - output reports: `target/perf/perf-smoke-*.{json,md}`
Telemetry pipelines:
- traces: `management.otlp.tracing.endpoint`
- logs: `management.otlp.logging.endpoint`
- metrics: Prometheus scrape of `/actuator/prometheus`
Custom business/security metrics:
- `auth.login{result}` - login attempts (success/failure)
- `auth.refresh{result}` - refresh attempts (success/failure)
- `auth.logout{result}` - logout attempts (success/failure)
- `security.http.responses{status,endpoint_group}` - counters for `401`/`403` responses from security handlers
- `security.dpop.rejected{reason}` - DPoP rejection counters grouped by normalized reason (`scheme_required`, `proof_missing`, `replay_detected`, etc.)
- `security.opaque_introspection.cache{result}` - opaque token introspection cache `hit`/`miss`
- `security.rate_limit.decisions{rule_id,key_strategy,decision}` - rate limit decisions (allowed/rejected) per rule
- `request.worker.reclaimed_count` - number of stale `PROCESSING` requests reclaimed back to `PENDING`
- `request.worker.stale_processing_age` - age of the oldest reclaimed stale `PROCESSING` request
- `request.worker.claim_lag_seconds` - claim lag distribution (request creation -> worker claim)
- `request.worker.claim_batch_size` - distribution of claimed batch size per worker iteration
- `request.worker.processing_duration{terminal_status}` - processing duration timer for `COMPLETED`/`FAILED`
- `request.worker.terminal_status{status}` - terminal status counters (`COMPLETED`/`FAILED`)
Recommended settings:
- `management.logging.export.otlp.enabled=true`
- `management.otlp.metrics.export.enabled=false`
- `management.tracing.sampling.probability=1.0`
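As a sketch, those settings together with the OTLP endpoints mentioned above might appear in `application.properties` like this. The property keys come from this README; the collector URL and paths are assumptions based on OTLP/HTTP defaults (port 4318), not values taken from the project:

```properties
management.logging.export.otlp.enabled=true
management.otlp.metrics.export.enabled=false
management.tracing.sampling.probability=1.0
# Assumed OTLP/HTTP collector endpoints - adjust to your collector address
management.otlp.tracing.endpoint=http://localhost:4318/v1/traces
management.otlp.logging.endpoint=http://localhost:4318/v1/logs
```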
```mermaid
flowchart LR
    subgraph App[Spring Boot jwt-demo-reactive]
        A1[HTTP metrics\nActuator /prometheus]
        A2[Traces OTLP\nmanagement.otlp.tracing.endpoint]
        A3[Logs OTLP\nmanagement.otlp.logging.endpoint]
    end
    subgraph Infra[Observability Infra]
        C[OTel Collector]
        T[Tempo]
        L[Loki]
        P[Prometheus]
        G[Grafana]
    end
    A1 -->|pull /actuator/prometheus| P
    A2 -->|OTLP traces| C
    A3 -->|OTLP logs| C
    C -->|traces| T
    C -->|logs| L
    P --> G
    T --> G
    L --> G
```
The following alerts are provisioned from `ops/grafana/provisioning/alerting/alerts.yml`.
| Alert UID | Signal | Threshold | For | Severity |
|---|---|---|---|---|
| `jwt-high-5xx-rate` | 5xx error rate | > 5% | 5m | warning |
| `jwt-high-p95-latency` | p95 HTTP latency | > 800 ms | 5m | warning |
| `jwt-high-cpu-saturation` | process CPU usage | > 90% | 10m | critical |
| `jwt-high-heap-saturation` | JVM heap usage | > 90% | 10m | critical |
| `jwt-high-rate-limit-reject-ratio` | rate-limit reject ratio | > 20% | 5m | warning |
| `jwt-dpop-reject-spike` | DPoP rejected requests | > 20 events / 5m | 5m | warning |
| `jwt-worker-failed-terminal-ratio` | async worker FAILED terminal ratio | > 10% | 10m | warning |
On-call quick actions:
- Open Grafana and locate the firing rule in Alerting.
- Check RED and saturation dashboards for trend confirmation.
- For request-level incidents, capture `X-Trace-Id` (when available) and pivot to Tempo and Loki.
- In Loki, filter by the same trace id and inspect related error/security logs.
- In Tempo, inspect the slow/error spans and identify the failing upstream or handler.
- For `critical` alerts (CPU/Heap), page immediately if sustained beyond the configured `for` window.
- For `warning` alerts, escalate if impact persists for two consecutive evaluation windows.
PromQL examples for custom metrics:
- Rate-limit reject ratio (5m): `100 * sum(rate(security_rate_limit_decisions_total{decision="rejected"}[5m])) / clamp_min(sum(rate(security_rate_limit_decisions_total[5m])), 1)`
- DPoP reject spike (5m): `sum(increase(security_dpop_rejected_total[5m]))`
- Worker failed ratio (10m): `100 * sum(rate(request_worker_terminal_status_total{status="FAILED"}[10m])) / clamp_min(sum(rate(request_worker_terminal_status_total[10m])), 1)`
Run unit tests:
```powershell
mvn test
```

Run integration tests:

```powershell
mvn verify
```

Main integration suites:

- `AuthControllerIT`
- `AuthValidationIT`
- `KeycloakIntegrationIT`
- `KeycloakNegativeIT`
- `DpopIntegrationIT`
- `RateLimitingIT`
- `SecurityChainRegressionIT`
- `RequestIntegrationIT`
- `RequestWorkerRetryIT`
- `RequestWorkerReclaimIT`
- `RequestWorkerMultiInstanceIT`
- `AccountIntegrationIT`
- `src/main/java/lt/satsyuk/controller` - REST entry points
- `src/main/java/lt/satsyuk/service` - business logic
- `src/main/java/lt/satsyuk/repository` - R2DBC repositories
- `src/main/resources/db/migration` - Flyway migrations
- `ops/` - observability configs (Prometheus/Loki/Tempo/OTel/Grafana)
- `docker-compose.yml` - local infrastructure stack
- `401 invalid_client` on protected API: verify `KEYCLOAK_RESOURCE_CLIENT_ID`/`KEYCLOAK_RESOURCE_CLIENT_SECRET`.
- `403` on protected API: verify token roles in `realm_access.roles`.
- No logs/traces in Grafana: verify `MANAGEMENT_OTLP_TRACING_ENDPOINT` and `MANAGEMENT_OTLP_LOGGING_ENDPOINT`.
- Integration tests fail without Docker: start Docker Desktop before `mvn verify`.
- For DPoP-bound tokens, send both headers:
  - `Authorization: DPoP <access_token>`
  - `DPoP: <proof-jwt>`
- Validate proof claims and binding:
  - `htm` and `htu` must match the exact request method and URL
  - `iat` must be within the allowed time window
  - `jti` must be unique (replay protection)
  - `ath` must match the access token hash
  - token `cnf.jkt` must match the proof key thumbprint
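As an illustration of the `ath` check: RFC 9449 defines `ath` as the base64url-encoded (unpadded) SHA-256 hash of the ASCII access token value. A minimal, self-contained sketch follows; the class and method names are mine, not part of this project:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class DpopAth {

    // Compute the DPoP "ath" claim: base64url(SHA-256(ascii(access_token))),
    // without padding, as specified in RFC 9449 section 4.2.
    static String computeAth(String accessToken) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(accessToken.getBytes(StandardCharsets.US_ASCII));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required by the JVM spec", e);
        }
    }

    public static void main(String[] args) {
        // A 32-byte SHA-256 digest always yields a 43-character base64url string.
        String ath = computeAth("<access-token-value>");
        System.out.println(ath.length()); // 43
    }
}
```

The server recomputes this value from the presented access token and rejects the request if it differs from the `ath` claim in the proof JWT.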
- When the API returns `X-Trace-Id`, use it as the primary correlation key.
- In Loki, filter by trace id from logs (MDC includes `traceId`/`spanId`).
- In Tempo, search by the same trace id to inspect the span timeline.
- This is the fastest path to diagnose `401`, `403`, and `429` scenarios across API + security filters.