Debugging Webhook Failures: A Systematic Approach
Webhook failures are among the most frustrating bugs to debug. The request originates from an external system you do not control, arrives asynchronously, and often fails silently. By the time you notice the failure, the original context is gone. This article provides a systematic framework for diagnosing and resolving webhook delivery issues.
The Debugging Framework
Every webhook failure falls into one of five categories. Working through them in order eliminates possibilities efficiently:
- Network layer — Can the request reach your server?
- Transport layer — Is TLS/SSL working correctly?
- Application layer — Does your endpoint accept and parse the request?
- Validation layer — Does signature verification pass?
- Processing layer — Does your handler execute correctly?
Start at layer 1 and work down. Most developers jump to layer 5 and waste hours when the problem is at layer 1.
Layer 1: Network Issues
Before looking at code, confirm that the webhook request can physically reach your server.
DNS Resolution
If you recently changed DNS providers or updated your domain configuration, the webhook source might still be resolving to the old IP address. Resolvers cache the old record until its TTL expires, so propagation can take up to 48 hours.
# Check what IP your domain resolves to
dig +short yourapp.com
# Check from Google's DNS (what external services see)
dig @8.8.8.8 +short yourapp.com
# Compare with your expected server IP
# If they don't match, DNS hasn't propagated
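The same comparison can be scripted, for example as a post-deploy check. This is a minimal sketch using Node's built-in `dns` module; the function names (`resolveVia`, `sameAddresses`, `checkPropagation`) and the choice of 8.8.8.8 as the public resolver are illustrative assumptions, not part of any provider's tooling.

```javascript
const { Resolver } = require('node:dns').promises;

// Resolve A records for a hostname through one specific DNS server.
async function resolveVia(server, hostname) {
  const resolver = new Resolver();
  resolver.setServers([server]);
  return resolver.resolve4(hostname);
}

// True when both lists contain the same set of addresses, in any order.
function sameAddresses(a, b) {
  const left = [...new Set(a)].sort();
  const right = [...new Set(b)].sort();
  return left.length === right.length && left.every((ip, i) => ip === right[i]);
}

// Compare what a public resolver returns with the IPs you expect to serve from.
async function checkPropagation(hostname, expectedIps) {
  const seen = await resolveVia('8.8.8.8', hostname);
  return { seen, propagated: sameAddresses(seen, expectedIps) };
}
```

Calling `checkPropagation('yourapp.com', ['203.0.113.10'])` (with your real server IP in place of the placeholder) resolves to an object whose `propagated` flag is false while the public resolver still returns an old address.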
Firewall and Security Groups
Cloud security groups, WAFs, and firewalls can silently block webhook requests. Check:
- Is port 443 open for inbound traffic?
- Does your WAF whitelist the webhook provider's IP ranges?
- Are there rate limiting rules that might block burst deliveries?
- Is your hosting provider blocking requests that look like bots?
Many webhook providers publish their sending IP ranges. Stripe and GitHub, for example, document their IPs specifically for firewall whitelisting; for other providers, check the documentation before assuming a static list exists. If you use a tool like InvokeBot to send test requests, you can confirm whether your endpoint is reachable from the public internet.
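When a request shows up in your firewall or access logs, it helps to check whether its source address falls inside the provider's published ranges. The sketch below uses Node's built-in `net.BlockList` for the CIDR matching; `buildRangeMatcher` is a hypothetical helper, and the ranges shown are reserved documentation blocks standing in for a real provider's list.

```javascript
const net = require('node:net');

// Build a matcher from CIDR strings such as "192.0.2.0/24".
function buildRangeMatcher(cidrs) {
  const list = new net.BlockList();
  for (const cidr of cidrs) {
    const [prefix, bits] = cidr.split('/');
    list.addSubnet(prefix, Number(bits), net.isIPv6(prefix) ? 'ipv6' : 'ipv4');
  }
  // The returned function reports whether an address is inside any range.
  return (ip) => list.check(ip, net.isIPv6(ip) ? 'ipv6' : 'ipv4');
}

// Placeholder ranges (documentation address blocks), not real provider IPs.
const isProviderIp = buildRangeMatcher(['192.0.2.0/24', '2001:db8::/32']);
```

If a blocked request's source IP passes `isProviderIp` but never reaches your application, the firewall or WAF rule is the likely culprit.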
Layer 2: TLS/SSL Issues
Webhook providers almost universally require HTTPS endpoints. SSL certificate problems are a common and frustrating failure mode because they are invisible from the server side.
# Check your SSL certificate
openssl s_client -connect yourapp.com:443 -servername yourapp.com
# Check certificate expiry
echo | openssl s_client -connect yourapp.com:443 2>/dev/null | \
openssl x509 -noout -dates
# Test the full chain
curl -vI https://yourapp.com/webhooks/test 2>&1 | grep -i ssl
Common SSL issues that break webhooks:
- Expired certificate (the most common cause)
- Self-signed certificate (not trusted by the provider)
- Incomplete certificate chain (missing intermediate cert)
- Mismatched domain (cert for www.yourapp.com but webhook points to yourapp.com)
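Expiry and trust problems can also be checked from a script, which makes them easy to alert on before a provider notices. The following is a rough sketch with Node's built-in `tls` module; `checkCertificate` and `daysUntilExpiry` are names invented for this example. It sets `rejectUnauthorized: false` purely so that a broken certificate can still be inspected, which is only appropriate for a diagnostic tool.

```javascript
const tls = require('node:tls');

// Whole days between `now` and the expiry date (negative once expired).
function daysUntilExpiry(validTo, now = new Date()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((new Date(validTo).getTime() - now.getTime()) / msPerDay);
}

// Connect, read the presented certificate, and report expiry and trust status.
function checkCertificate(host, port = 443) {
  return new Promise((resolve, reject) => {
    // rejectUnauthorized: false lets us inspect invalid certificates too.
    // Diagnostic use only; never disable verification for real traffic.
    const options = { host, port, servername: host, rejectUnauthorized: false };
    const socket = tls.connect(options, () => {
      const cert = socket.getPeerCertificate();
      const result = {
        validTo: cert.valid_to,
        daysLeft: daysUntilExpiry(cert.valid_to),
        trusted: socket.authorized,
        // Populated when verification failed (expired, untrusted chain, wrong host).
        problem: socket.authorized ? null : String(socket.authorizationError),
      };
      socket.end();
      resolve(result);
    });
    socket.on('error', reject);
  });
}
```

A result with `trusted: false` points at one of the issues listed above, and `problem` carries the verification error reported by Node.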
Layer 3: Application Layer
Your server is reachable and SSL is valid, but the webhook still fails. Now check the application layer.
Route Matching
Verify your endpoint URL matches exactly. Trailing slashes matter in many frameworks:
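// Express treats these as the same route by default
// (the "strict routing" setting is disabled unless you turn it on)
app.post('/webhooks/stripe', handler); // No trailing slash
app.post('/webhooks/stripe/', handler); // With trailing slash
// With strict routing enabled, or in a framework that matches paths exactly,
// a provider sending to /webhooks/stripe/ gets a 404
// if you only registered /webhooks/stripe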
Request Parsing
Middleware configuration errors are a frequent source of webhook failures. The body parser must match the content type the webhook provider sends:
// Wrong: using urlencoded parser for JSON webhooks
app.use(express.urlencoded({ extended: true }));
// Correct: using JSON parser
app.use(express.json());
// Best: use raw body for signature verification
app.post('/webhooks/stripe',
express.raw({ type: 'application/json' }),
(req, res) => {
const payload = req.body.toString();
// Now you can verify the signature against the raw payload
}
);
Timeout Issues
Most webhook providers expect a response within 5-30 seconds. If your handler takes longer, the provider marks it as failed and retries. This creates duplicate processing if you do not handle idempotency.
// Anti-pattern: synchronous processing
app.post('/webhooks/events', async (req, res) => {
await processEvent(req.body); // Takes 45 seconds
await sendNotification(req.body); // Takes 10 seconds
await updateDatabase(req.body); // Takes 5 seconds
res.status(200).send('OK'); // Too late - provider already retried
});
// Correct: acknowledge then process
app.post('/webhooks/events', (req, res) => {
res.status(200).send('OK'); // Respond immediately
// Process asynchronously
processEventAsync(req.body).catch(err => {
console.error('Webhook processing failed:', err);
});
});
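Because providers retry on timeouts, the same event can arrive more than once, so the handler needs a way to recognise repeats. Here is a minimal in-memory sketch of that idempotency check; `createDeduplicator` is a hypothetical helper, and it assumes each event carries a unique ID. A production system would more likely use a shared store (a unique database key, for instance) so that every server instance sees the same history.

```javascript
// Remembers event IDs for `ttlMs` milliseconds and flags repeat deliveries.
function createDeduplicator(ttlMs = 24 * 60 * 60 * 1000) {
  const seen = new Map(); // eventId -> expiry timestamp in ms

  // Returns true the first time an ID is seen within the TTL, false for repeats.
  return function isFirstDelivery(eventId, now = Date.now()) {
    // Sweep expired entries so the map does not grow without bound.
    for (const [id, expiresAt] of seen) {
      if (expiresAt <= now) seen.delete(id);
    }
    if (seen.has(eventId)) return false;
    seen.set(eventId, now + ttlMs);
    return true;
  };
}

const isFirstDelivery = createDeduplicator();
```

In a handler, a guard such as `if (!isFirstDelivery(req.body.id)) return;` placed before the processing call skips work for an event that was already accepted.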
Layer 4: Validation Failures
Your endpoint receives the request but rejects it during signature verification. This is actually the desired behavior when the signature does not match, but debugging false rejections requires care.
Common Signature Verification Bugs
// Bug 1: Using parsed body instead of raw body
// Signature is computed against the raw request body
const wrongSignature = computeHmac(req.body); // Wrong - parsed JSON
const signature = computeHmac(rawRequestBody); // Correct - raw string
// Bug 2: Wrong encoding (digest() can only be called once per Hmac object)
const expected = hmac.digest('hex'); // Provider uses hex
// const expected = hmac.digest('base64'); // You use base64 = mismatch
// Bug 3: Missing or stale webhook secret
// Check that WEBHOOK_SECRET env var is set and matches
// the secret shown in the provider's dashboard
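The bugs above are easier to avoid when verification lives in one small, testable function. The sketch below assumes the provider signs the raw body with HMAC-SHA256 and sends the result hex-encoded; that scheme, and the names `signPayload` and `verifySignature`, are assumptions for illustration. Some providers also sign a timestamp or prefix the header value, so check the provider's documentation for the exact format.

```javascript
const crypto = require('node:crypto');

// Hex-encoded HMAC-SHA256 of the raw request body.
function signPayload(rawBody, secret) {
  return crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
}

// Constant-time comparison of the expected and received signatures.
// Lengths are checked first because timingSafeEqual throws on a mismatch.
function verifySignature(rawBody, secret, receivedSignature) {
  const expected = Buffer.from(signPayload(rawBody, secret), 'utf8');
  const received = Buffer.from(String(receivedSignature || ''), 'utf8');
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}
```

Using `crypto.timingSafeEqual` rather than `===` avoids leaking how many leading characters of a forged signature were correct.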
Layer 5: Processing Errors
The webhook arrives, validates, and starts processing. But the handler throws an error. This is the most straightforward layer to debug because the error is in your code.
Structured Logging
Add structured logging to every webhook handler:
app.post('/webhooks/events', async (req, res) => {
const start = Date.now(); // Used below to report processing duration
const eventId = req.body.id;
const eventType = req.body.type;
console.log({
level: 'info',
message: 'Webhook received',
eventId,
eventType,
timestamp: new Date().toISOString()
});
try {
await processEvent(req.body);
console.log({
level: 'info',
message: 'Webhook processed',
eventId,
duration: Date.now() - start
});
} catch (err) {
console.error({
level: 'error',
message: 'Webhook processing failed',
eventId,
error: err.message,
stack: err.stack
});
}
res.status(200).send('OK');
});
Building a Retry Strategy
Even with perfect code, webhooks will occasionally fail due to temporary network issues, deployment windows, or database maintenance. A solid retry strategy is essential.
Exponential Backoff
If you are the webhook sender (not receiver), implement exponential backoff:
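// Promise-based delay helper used between retries
function sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
}
async function deliverWebhook(url, payload, attempt) {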
attempt = attempt || 0;
var MAX_ATTEMPTS = 5;
if (attempt >= MAX_ATTEMPTS) {
console.error('Webhook delivery failed after', MAX_ATTEMPTS, 'attempts');
moveToDeadLetterQueue(url, payload);
return;
}
try {
var response = await fetch(url, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload)
});
if (response.status >= 200 && response.status < 300) {
return; // Success
}
throw new Error('HTTP ' + response.status);
} catch (err) {
var delay = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s, 8s, 16s
console.log('Retry', attempt + 1, 'in', delay, 'ms');
await sleep(delay);
return deliverWebhook(url, payload, attempt + 1);
}
}
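Two refinements are common on top of plain exponential backoff: capping and randomising the delay (often called "full jitter") so that many failed deliveries do not all retry at the same instant, and skipping retries for responses that will not succeed on a second try. The sketch below shows both as separate helpers; the names, the default cap, and the exact set of retryable status codes are assumptions, not a standard.

```javascript
// Delay before retry number `attempt` (0-based): exponential growth, capped,
// then scaled by a random factor in [0, 1) so concurrent senders spread out.
function backoffDelay(attempt, { baseMs = 1000, capMs = 60000, random = Math.random } = {}) {
  const exponential = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * exponential);
}

// Retry on timeouts, rate limiting, and server errors. Other 4xx responses
// usually mean the request itself is wrong and will fail again unchanged.
function shouldRetry(status) {
  return status === 408 || status === 429 || (status >= 500 && status < 600);
}
```

Passing a fixed `random` function, as in `backoffDelay(3, { random: () => 0.5 })`, makes the schedule deterministic for tests.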
Monitoring Webhooks in Production
Prevention is better than debugging. Set up monitoring for:
- Delivery success rate — Alert if it drops below 99%
- Response time — Alert if p95 exceeds 3 seconds
- Error rate by type — Distinguish 4xx (your bug) from 5xx (transient)
- Queue depth — If using async processing, monitor the queue size
- Dead letter queue — Webhooks that failed all retry attempts need human attention
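The first two thresholds in the list above can be computed from a window of recent deliveries. This is a small sketch rather than a monitoring setup: it assumes each delivery is recorded as an object with `status` and `durationMs` fields (hypothetical names) and uses the nearest-rank definition of a percentile.

```javascript
// Nearest-rank percentile: the smallest sample with at least p% of values at or below it.
function percentile(values, p) {
  if (values.length === 0) return undefined;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Summarise a window of deliveries, e.g. [{ status: 200, durationMs: 120 }, ...]
function summarize(deliveries) {
  const ok = deliveries.filter(d => d.status >= 200 && d.status < 300).length;
  const successRate = deliveries.length ? ok / deliveries.length : 1;
  const p95Ms = percentile(deliveries.map(d => d.durationMs), 95);
  return {
    successRate,
    p95Ms,
    alerts: [
      ...(successRate < 0.99 ? ['success rate below 99%'] : []),
      ...(p95Ms > 3000 ? ['p95 above 3 seconds'] : []),
    ],
  };
}
```

In practice a metrics system would compute these figures, but the function makes the alert conditions concrete and easy to unit test.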
A Debugging Checklist
When a webhook fails, run through this checklist in order:
- Can you reach the endpoint URL from the public internet? (curl -I https://yourapp.com/webhooks/test)
- Is the SSL certificate valid and not expired?
- Does the route exist and accept POST requests?
- Is the body parser configured for the correct content type?
- Does your endpoint respond within the provider's timeout window?
- Is the webhook secret correct and matching?
- Are you verifying the signature against the raw body (not parsed JSON)?
- Does your handler throw any exceptions during processing?
- Are you handling duplicate deliveries with idempotency?
- Check the provider's webhook delivery logs for specific error messages.
Conclusion
Webhook debugging is systematic, not mysterious. Work through the five layers in order: network, TLS, application, validation, processing. Most failures are at the network or application layer. Invest in structured logging, signature verification, and idempotent processing from the start, and you will spend far less time debugging webhook issues in production.
The tools and techniques outlined here apply regardless of whether you are receiving webhooks from Stripe, GitHub, Shopify, or any custom integration. The protocol is the same, and the failure modes are predictable.