Lazy Devs · 5 min read

Webhooks done right

How to build webhook receivers that survive retries, duplicates, and 3am incidents: verify signatures, return fast, process async, stay idempotent.

Most webhook bugs do not show up in the demo. They show up three weeks later when Stripe retries a payment event you already processed, and a customer gets charged twice in your ledger. Webhooks look trivial (it is just an HTTP POST) but the part that bites you is everything around the handler: signatures, retries, ordering, and the fact that the sender does not care about your database.

Here is how to build a receiver that holds up in production.

The four rules that actually matter

If you only remember four things, remember these. Verify the signature before you trust the body. Return a 2xx fast. Process the work asynchronously. Make every handler idempotent. Everything below is just the detail behind those four.

Verify the signature first, on the raw body

The single most common mistake is verifying a signature against a parsed and re-serialized body. Signatures are computed over the exact bytes that were sent. If your framework parses JSON and you re-stringify it, key ordering and whitespace change, and the signature will not match.
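To make that concrete, here is a minimal sketch of the failure. The secret and payload are made up for illustration, and the plain string comparison is only acceptable because this is a demo, not a verifier:

```typescript
import { createHmac } from "node:crypto";

// Hypothetical secret and payload, for illustration only.
const secret = "whsec_example";
const rawBody = '{"id": "evt_1", "amount": 1000}'; // the body exactly as sent

function sign(body: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

// What the sender computed over the body it actually sent.
const sentSignature = sign(rawBody);

// Parsing and re-stringifying keeps the data but drops the whitespace,
// so the string that gets signed is no longer the one that was sent.
const reserialized = JSON.stringify(JSON.parse(rawBody));

const rawMatches = sign(rawBody) === sentSignature;
const reserializedMatches = sign(reserialized) === sentSignature;

console.log({ rawMatches, reserializedMatches });
```

The data is identical in both strings; only the bytes differ, and the bytes are what the signature covers.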

In Next.js App Router, read the raw text before anything touches it:

// app/api/webhooks/stripe/route.ts
import { NextRequest } from "next/server";
import Stripe from "stripe";
 
const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);
const webhookSecret = process.env.STRIPE_WEBHOOK_SECRET!;
 
export async function POST(req: NextRequest) {
  const body = await req.text(); // raw bytes, do NOT req.json()
  const signature = req.headers.get("stripe-signature");
 
  if (!signature) {
    return new Response("Missing signature", { status: 400 });
  }
 
  let event: Stripe.Event;
  try {
    event = stripe.webhooks.constructEvent(body, signature, webhookSecret);
  } catch (err) {
    // Bad signature means an unauthenticated caller. Reject loudly.
    return new Response("Invalid signature", { status: 400 });
  }
 
  await enqueue(event); // see below
  return new Response("ok", { status: 200 });
}

If you roll your own verification (some providers do not ship an SDK helper), use an HMAC and a constant-time comparison so that you do not leak timing information:

import { createHmac, timingSafeEqual } from "node:crypto";
 
function verify(rawBody: string, header: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(header);
  return a.length === b.length && timingSafeEqual(a, b);
}

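A plain === returns the same answer, but it can stop at the first differing character, and that timing difference can help an attacker guess a valid signature piece by piece. timingSafeEqual takes the same time regardless of where the inputs differ. Check the lengths first, because timingSafeEqual throws when its two buffers have unequal lengths.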
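Header formats vary by provider. GitHub, for example, sends the digest as sha256=<hex> in its X-Hub-Signature-256 header, so the prefix has to come off before comparing. A rough sketch of that variant follows; the function name verifyPrefixed is ours, and you should check your provider's docs for the exact header format before relying on it:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sketch: verify a header of the form "sha256=<hex digest>".
// `verifyPrefixed` is a hypothetical helper, not part of any SDK.
function verifyPrefixed(rawBody: string, header: string, secret: string): boolean {
  const prefix = "sha256=";
  if (!header.startsWith(prefix)) return false;

  // Compare raw digest bytes rather than hex strings.
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(header.slice(prefix.length), "hex");

  // timingSafeEqual throws on unequal lengths, so check the length first.
  // Malformed hex decodes to a shorter buffer and is rejected here too.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// Usage with made-up values:
const demoSecret = "example-secret";
const demoBody = '{"action":"opened"}';
const demoHeader =
  "sha256=" + createHmac("sha256", demoSecret).update(demoBody).digest("hex");

console.log(verifyPrefixed(demoBody, demoHeader, demoSecret));
```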

Return 2xx fast, do the work later

Webhook senders have timeouts, usually a few seconds. If your handler calls three downstream APIs, sends an email, and updates four tables before responding, you will eventually blow past that timeout. The sender then treats the request as failed and retries, even though your code actually ran. Now you have the same event twice.

The fix is to split receipt from processing. The handler does the minimum: verify, persist the raw event, return 200. A worker picks up the persisted event and does the real work, which is exactly where background jobs and queues earn their keep.

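async function enqueue(event: Stripe.Event) {
  // `db` is assumed to be a Postgres client such as a node-postgres Pool.
  // Insert the event. The unique constraint on event id makes this idempotent.
  await db.query(
    `INSERT INTO webhook_events (id, type, payload, status)
     VALUES ($1, $2, $3, 'pending')
     ON CONFLICT (id) DO NOTHING`,
    // Store the whole event, not just event.data, so it can be replayed later.
    [event.id, event.type, JSON.stringify(event)]
  );
}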

That ON CONFLICT (id) DO NOTHING is doing a lot of work. It is your first line of defense against duplicates, and it is enforced by the database rather than by application logic that you hope runs correctly under concurrency.
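It is often handy to know whether an insert was new or a duplicate, if only for logging. One way to get that, sketched here against a minimal Db interface that we are assuming (it mirrors the rowCount that node-postgres returns), is to check how many rows the insert affected. The names Db, IncomingEvent, and enqueueOnce are hypothetical:

```typescript
// Minimal client shape assumed for this sketch.
interface Db {
  query(sql: string, params: unknown[]): Promise<{ rowCount: number | null }>;
}

interface IncomingEvent {
  id: string;
  type: string;
}

// Returns true the first time an event id is seen, false for a duplicate.
async function enqueueOnce(db: Db, event: IncomingEvent): Promise<boolean> {
  const result = await db.query(
    `INSERT INTO webhook_events (id, type, payload, status)
     VALUES ($1, $2, $3, 'pending')
     ON CONFLICT (id) DO NOTHING`,
    [event.id, event.type, JSON.stringify(event)]
  );
  // With DO NOTHING, a conflicting insert affects zero rows.
  return (result.rowCount ?? 0) > 0;
}
```

Either way the handler still returns 200. A duplicate is not an error, it is the system working as designed.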

Idempotency is not optional

Every major provider documents that webhooks are delivered at least once, not exactly once. Stripe, GitHub, Shopify, and the rest will all occasionally send you the same event twice, sometimes seconds apart. Your handler must produce the same result whether it runs once or five times.

The cleanest pattern is a dedupe table keyed on the provider's event id. The table above already gives you that. When the worker processes a row, it does the side effect and the status update in one transaction:

BEGIN;
 
UPDATE webhook_events
SET status = 'processing'
WHERE id = $1 AND status = 'pending'
RETURNING id;
-- If zero rows returned, another worker grabbed it. Bail out.
 
-- ... do the actual work, e.g. mark an order paid ...
UPDATE orders SET paid_at = now() WHERE id = $2 AND paid_at IS NULL;
 
UPDATE webhook_events SET status = 'done', processed_at = now() WHERE id = $1;
 
COMMIT;

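The AND status = 'pending' clause on the claim, plus RETURNING, gives you a cheap lock so two workers do not process the same event. Under Postgres's default READ COMMITTED isolation, a second worker's UPDATE waits on the row until the first transaction finishes, then re-checks the WHERE clause, finds the status is no longer 'pending', and returns zero rows. The paid_at IS NULL guard means even if the same logical action runs twice, the second one is a no-op rather than a double effect.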
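Wrapped in worker code, the same transaction might look roughly like this. The Client interface and the processEvent name are assumptions for the sketch; in a real worker the client would be a single connection checked out from your pool, so that BEGIN and COMMIT run on the same session:

```typescript
// Minimal connection shape assumed for this sketch.
interface Client {
  query(sql: string, params?: unknown[]): Promise<{ rowCount: number | null }>;
}

// Returns true if this worker processed the event, false if it could not claim it.
async function processEvent(client: Client, eventId: string, orderId: string): Promise<boolean> {
  await client.query("BEGIN");
  try {
    const claim = await client.query(
      `UPDATE webhook_events SET status = 'processing'
       WHERE id = $1 AND status = 'pending'
       RETURNING id`,
      [eventId]
    );
    if ((claim.rowCount ?? 0) === 0) {
      // Already claimed or already done. Nothing to do.
      await client.query("ROLLBACK");
      return false;
    }

    // The actual work, guarded so a repeat is a no-op.
    await client.query(
      `UPDATE orders SET paid_at = now() WHERE id = $1 AND paid_at IS NULL`,
      [orderId]
    );
    await client.query(
      `UPDATE webhook_events SET status = 'done', processed_at = now() WHERE id = $1`,
      [eventId]
    );
    await client.query("COMMIT");
    return true;
  } catch (err) {
    // Any failure undoes the claim, so the row goes back to 'pending'.
    await client.query("ROLLBACK");
    throw err;
  }
}
```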

Do not key idempotency on payload contents or timestamps. The provider's event id is the only stable handle you get.

Order is not guaranteed

A subscription.updated can arrive before subscription.created. A delete can land before the create it logically follows. Providers parallelize delivery and retry on their own schedule, so the order in which events hit your endpoint is not necessarily the order in which they happened.

Two defenses. First, do not assume a prior event ran. If subscription.updated arrives and you have no subscription row, upsert one from the event payload rather than throwing. Second, when a payload includes a version or sequence field (Stripe events carry a created timestamp, many objects carry an updated-at), ignore events that are older than the state you already have:

UPDATE subscriptions
SET status = $1, updated_at = $2
WHERE id = $3 AND updated_at < $2;

If the incoming event is stale, the WHERE clause skips the write and you keep the newer state. This is the same last-write-wins idea you would use for any eventually consistent sync.
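Both defenses fit in a few lines. The sketch below uses an in-memory Map as a stand-in for the subscriptions table, and the type and function names are made up for illustration; the comparison mirrors the updated_at < $2 guard above:

```typescript
interface SubscriptionEvent {
  subscriptionId: string;
  status: string;
  occurredAt: number; // timestamp carried by the event
}

interface SubscriptionRow {
  status: string;
  updatedAt: number;
}

// Returns true if the event was applied, false if it was stale and skipped.
function applyEvent(store: Map<string, SubscriptionRow>, event: SubscriptionEvent): boolean {
  const current = store.get(event.subscriptionId);

  // Defense 2: ignore events that are not newer than the state we already have.
  if (current && current.updatedAt >= event.occurredAt) return false;

  // Defense 1: no row yet? Create one from the event instead of throwing.
  store.set(event.subscriptionId, { status: event.status, updatedAt: event.occurredAt });
  return true;
}

// Usage: the update arrives before the create it logically follows.
const subscriptions = new Map<string, SubscriptionRow>();
const appliedUpdate = applyEvent(subscriptions, { subscriptionId: "sub_1", status: "active", occurredAt: 200 });
const appliedCreate = applyEvent(subscriptions, { subscriptionId: "sub_1", status: "incomplete", occurredAt: 100 });
```

One caveat: timestamps with coarse resolution can tie, and with a strict comparison the event that arrives second loses the tie. If your provider exposes a real version or sequence number, prefer that.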

When processing fails

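Two failure modes need different responses. If verification fails or the payload is malformed, return a 4xx. Retrying will not fix a bad signature, so a 400 is the honest answer. Do not count on it to stop retries, though: some providers (Stripe among them) retry any non-2xx response, so check your provider's docs. If you cannot persist the event at all (database unreachable), return a 5xx so the provider retries, because a 200 at that point loses the event. If the failure is in processing (a dependency timing out, a downstream API down), you have a choice: process inline and return a 5xx so the provider retries on its schedule, or return 200 as soon as the event is stored and rely on your own worker retries.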

For anything beyond a toy, prefer your own retry loop. Provider retry windows vary and you do not control them. Once the raw event is safely persisted, your worker owns retries with backoff, and you get a dead-letter row after N attempts that an engineer can inspect:

UPDATE webhook_events
SET status = 'failed', attempts = attempts + 1, last_error = $2
WHERE id = $1;

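A simple attempts column and a job that runs every minute, finds rows with status = 'failed' AND attempts < 5, and sets them back to 'pending' so the worker can claim them again covers most needs without a queue product. Rows that reach five attempts stay 'failed' and serve as your dead-letter list.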
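If you want backoff rather than a fixed one-minute cadence, the retry decision can be a pair of small pure functions. This is a sketch under assumed numbers: the one-minute base, one-hour cap, and five-attempt limit are illustrative choices, not recommendations from any provider:

```typescript
// Illustrative constants; tune them for your workload.
const MAX_ATTEMPTS = 5;
const BASE_DELAY_MS = 60_000; // 1 minute
const MAX_DELAY_MS = 60 * 60_000; // 1 hour

// Delay before the next try, given how many attempts have already failed (1, 2, 3, ...).
function retryDelayMs(failedAttempts: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** (failedAttempts - 1), MAX_DELAY_MS);
}

// Mirrors the `attempts < 5` filter: past the limit, the row is left for a human.
function isRetryable(failedAttempts: number): boolean {
  return failedAttempts < MAX_ATTEMPTS;
}
```

The retry job would then pick up a failed row only when isRetryable is true and the time since its last attempt exceeds retryDelayMs, which implies storing a last-attempt timestamp alongside attempts.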

Things that save you at 3am

A few small habits pay off when something breaks.

Store the raw payload, not just the parsed fields you currently use. When a new bug appears you will want to replay events, and you cannot replay what you threw away.

Log the event id and type on every request. When support says "the order did not update," you want to grep for one id and see the full path.

Use separate endpoints or a routing key per provider. Mixing Stripe and GitHub events through one handler with a big switch statement turns into a mess fast, and the signature secrets are different anyway.

Build a replay tool early. A small script that reads a webhook_events row and re-runs the worker against it turns a production incident from a panic into a chore.
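A replay tool does not need to be much. Here is a rough sketch, assuming the payload is stored as JSON text and that your worker logic is reachable as a map of handlers keyed by event type; EventStore, Handler, and replay are hypothetical names:

```typescript
interface StoredEvent {
  id: string;
  type: string;
  payload: string; // raw JSON as stored in webhook_events
}

// Assumed lookup against the webhook_events table.
interface EventStore {
  findById(id: string): Promise<StoredEvent | undefined>;
}

type Handler = (payload: unknown) => Promise<void>;
type ReplayResult = "replayed" | "not_found" | "no_handler";

// Re-runs the handler for one stored event and reports what happened.
async function replay(
  store: EventStore,
  handlers: Map<string, Handler>,
  eventId: string
): Promise<ReplayResult> {
  const row = await store.findById(eventId);
  if (!row) return "not_found";

  const handler = handlers.get(row.type);
  if (!handler) return "no_handler";

  await handler(JSON.parse(row.payload));
  return "replayed";
}
```

Because the handlers are idempotent, replaying an event that already succeeded is safe, which is what makes a tool this simple usable during an incident.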

Takeaway

Treat the webhook endpoint as an untrusted, at-least-once, out-of-order delivery channel, because that is exactly what it is. Verify the raw body, persist fast, return 200, and let an idempotent worker do the actual work with the database enforcing uniqueness. Get those bones right and webhooks become boring, which is the goal.

If you are wiring up payments or third-party sync and want a second set of eyes before it ships, we are happy to take a look.
