Webhook Fanout
The webhook fanout system receives inbound webhooks from third parties (e.g. Juspay payment events), then durably re-delivers each event to any number of DB-configurable downstream HTTP endpoints, with per-endpoint retries.
It lives in the admin-api module under com.elivaas.pms.webhook. The transport is SNS → SQS (decoupling + buffering) and delivery is orchestrated by a Temporal Cloud workflow, so retries, backoff and durability are handled by Temporal rather than hand-rolled.
How It Works
Third party ──POST /api/webhooks/{source} (HTTP Basic Auth)──► WebhookInboundController
│ verify Basic Auth against webhook_inbound_credential (per source)
│ idempotency check on (source, external_event_id)
│ persist webhook_event (status RECEIVED) ──► publish to SNS ──► 200 OK
▼
SNS topic ──► SQS queue ──► WebhookSqsConsumer (@SqsListener)
│ start Temporal workflow, workflowId = eventId (dedupes redeliveries)
▼
WebhookFanoutWorkflow (Temporal Cloud, task queue "WEBHOOKS")
├─ prepare: resolve endpoints for (source, eventType); create PENDING webhook_delivery rows
├─ deliver (one activity per endpoint, concurrent, each with its own RetryPolicy)
│ POST payload + auth (NONE | BASIC | BEARER | HMAC) + custom headers
│ 2xx → SUCCESS · 4xx → non-retryable fail · 5xx/timeout → retry
└─ finishEvent: COMPLETED | PARTIAL | FAILED
The inbound endpoint always answers quickly: it returns 200 as soon as the event is safely persisted and published, so the sender does not retry. Downstream delivery happens asynchronously.
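The finishEvent step in the flow above rolls the per-endpoint results up into a single event status. The source does not spell out the exact rule, so the following is only a minimal sketch under stated assumptions: COMPLETED means every delivery succeeded, FAILED means none did, PARTIAL is anything in between, and an event with no matching endpoints counts as COMPLETED. The class and enum names are hypothetical.

```java
import java.util.List;

// Hypothetical enums mirroring the statuses named in the flow above.
enum DeliveryStatus { PENDING, SUCCESS, FAILED }
enum EventStatus { RECEIVED, COMPLETED, PARTIAL, FAILED }

final class EventStatusAggregator {
    // Assumed rule: all SUCCESS -> COMPLETED, no SUCCESS -> FAILED, otherwise PARTIAL.
    static EventStatus finish(List<DeliveryStatus> deliveries) {
        long succeeded = deliveries.stream()
                .filter(s -> s == DeliveryStatus.SUCCESS)
                .count();
        if (succeeded == deliveries.size()) {
            // Also covers the empty list (assumption: nothing to deliver is not a failure).
            return EventStatus.COMPLETED;
        }
        if (succeeded == 0) {
            return EventStatus.FAILED;
        }
        return EventStatus.PARTIAL;
    }
}
```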
Routing
Each downstream webhook_endpoint is keyed by source plus an optional event type:
- A row with a specific event_type receives only that event type.
- A row with event_type = NULL is a catch-all that receives every event for the source.
An event fans out to the union of the matching active endpoints.
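The routing rule can be illustrated with a small in-memory filter. This is a sketch rather than the module's actual query: the Endpoint record is a hypothetical stand-in for a webhook_endpoint row, and the event type names in the usage note are invented for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a webhook_endpoint row; eventType == null means catch-all.
record Endpoint(String url, String source, String eventType, boolean active) {}

final class EndpointRouter {
    // Returns the active endpoints for the source whose event_type is either
    // NULL (catch-all) or equal to the event's type, i.e. the union of both kinds.
    static List<Endpoint> resolve(List<Endpoint> all, String source, String eventType) {
        return all.stream()
                .filter(Endpoint::active)
                .filter(e -> e.source().equals(source))
                .filter(e -> e.eventType() == null || e.eventType().equals(eventType))
                .collect(Collectors.toList());
    }
}
```

For example, given one endpoint registered for a hypothetical ORDER_SUCCEEDED type and one catch-all endpoint on the same source, an ORDER_SUCCEEDED event resolves to both, while any other event type resolves to the catch-all only.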
Retries & Dead-Lettering
Two independent layers:
| Leg | Mechanism | Exhaustion |
|---|---|---|
| Ingest (SNS → SQS → consumer) | SQS redelivery on consumer error | After maxReceiveCount (5) the message moves to the webhook-fanout-dlq DLQ |
| Delivery (per endpoint) | Temporal RetryOptions (exponential backoff, max_attempts per endpoint) | The webhook_delivery row is left FAILED and is replayable |
A 4xx from an endpoint is treated as a configuration/contract error and is not retried; 5xx, timeouts and connection errors are retried.
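The response classification described above might look roughly like this. It is a sketch with hypothetical names; the source only covers 2xx, 4xx, 5xx and transport failures, so treating every other status (for example 3xx) as retryable is an assumption made here.

```java
// Hypothetical outcome of a single delivery attempt.
enum Outcome { SUCCESS, NON_RETRYABLE_FAILURE, RETRY }

final class DeliveryClassifier {
    // Maps an HTTP status code to an outcome.
    static Outcome classify(int httpStatus) {
        if (httpStatus >= 200 && httpStatus < 300) {
            return Outcome.SUCCESS;
        }
        if (httpStatus >= 400 && httpStatus < 500) {
            // Configuration/contract error: retrying would not help.
            return Outcome.NON_RETRYABLE_FAILURE;
        }
        // 5xx, plus any status the source does not mention (assumption).
        return Outcome.RETRY;
    }

    // Timeouts and connection errors surface as exceptions and are always retried.
    static Outcome classify(Exception transportError) {
        return Outcome.RETRY;
    }
}
```

In a Temporal activity, the non-retryable case would typically be signalled by throwing a failure marked as non-retryable so that the retry policy stops immediately; that wiring is omitted here.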
Idempotency
webhook_event has a partial unique index on (source, external_event_id). A duplicate inbound delivery (same source + external id) is detected on receipt and is not re-published. The Temporal workflow id is the event id, so a redelivered SQS message cannot start a second fanout for the same event.
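The receipt-time check can be sketched in memory as follows. In the real system the database index is what enforces uniqueness; here a concurrent set plays that role. The class name is hypothetical, and the handling of events without an external id (never deduplicated) is an assumption about what makes the index partial.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class InboundDeduplicator {
    // Hypothetical composite key mirroring the (source, external_event_id) index.
    private record Key(String source, String externalEventId) {}

    // In-memory stand-in for the unique index on webhook_event.
    private final Set<Key> seen = ConcurrentHashMap.newKeySet();

    // Returns true when the event is new and should be persisted and published,
    // false when the same (source, externalEventId) was already received.
    boolean firstReceipt(String source, String externalEventId) {
        if (externalEventId == null) {
            // Assumption: rows without an external id fall outside the partial index.
            return true;
        }
        return seen.add(new Key(source, externalEventId));
    }
}
```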
Data Model
Created by migration 052-add-webhook-fanout-tables.sql:
| Table | Grain | Holds |
|---|---|---|
| webhook_inbound_credential | per source | Basic Auth username + bcrypt password hash used to verify inbound calls |
| webhook_endpoint | per downstream target | URL, method, auth config, custom headers, max_attempts, timeout_seconds, routing (source + nullable event_type) |
| webhook_event | per received event | raw payload + headers, status, external_event_id, workflow_id |
| webhook_delivery | per (event, endpoint) | status, attempts, last response status/body, last error |
Outbound Authentication
Each endpoint chooses how its delivery requests are authenticated:
| auth_type | Header added |
|---|---|
| NONE | (none) |
| BASIC | Authorization: Basic … from auth_username + auth_secret |
| BEARER | Authorization: Bearer <auth_secret> |
| HMAC | X-Webhook-Signature: <base64(HMAC-SHA256(body, signing_secret))> |
Custom static headers (custom_headers, a JSON object) are added for every auth type.
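As an illustration of the table above, the header construction could be sketched with the JDK alone. The class and parameter names are hypothetical, UTF-8 encoding of the body and secrets is an assumption, and so is the precedence rule used here (an auth header overrides a custom header with the same name).

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

enum AuthType { NONE, BASIC, BEARER, HMAC }

final class OutboundAuthHeaders {
    // Builds the request headers for one delivery: custom headers first, then the auth header.
    static Map<String, String> build(AuthType type, String username, String secret,
                                     String signingSecret, String body,
                                     Map<String, String> customHeaders) {
        Map<String, String> headers = new LinkedHashMap<>(customHeaders);
        switch (type) {
            case BASIC: {
                byte[] pair = (username + ":" + secret).getBytes(StandardCharsets.UTF_8);
                headers.put("Authorization", "Basic " + Base64.getEncoder().encodeToString(pair));
                break;
            }
            case BEARER:
                headers.put("Authorization", "Bearer " + secret);
                break;
            case HMAC:
                headers.put("X-Webhook-Signature", sign(body, signingSecret));
                break;
            case NONE:
            default:
                break;
        }
        return headers;
    }

    // base64(HMAC-SHA256(body, signingSecret))
    private static String sign(String body, String signingSecret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(signingSecret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal(body.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 signing failed", e);
        }
    }
}
```

A receiver would verify an HMAC delivery by recomputing the signature over the raw request body with the shared signing secret and comparing it to the header value, ideally with a constant-time comparison such as MessageDigest.isEqual.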
See Configuration & Operations for environment setup and the admin API.