Designing an Idempotent Payment Integration for Distributed Systems
Preventing double charges, handling retry loops, and managing distributed ledger consistency
#The Nightmare of Double Charges
In online payments engineering, there is no error worse than charging a customer twice for a single transaction. Under high transaction loads, network timeouts between microservices and external gateways (like Razorpay, Stripe, or bank API hooks) are common. A timeout is ambiguous: the client cannot tell whether the gateway processed the charge before the connection dropped, so the client application starts a retry loop. If the receiving gateway is not fully idempotent, it processes the retry as a new payment request, resulting in a double charge.
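The failure mode can be reproduced in a few lines. The sketch below is illustrative only: `NaiveGateway` and `DoubleChargeDemo` are hypothetical names, the gateway is an in-memory stand-in rather than any real provider's API, and the timeout is simulated by throwing after the charge has already been recorded.

```java
import java.util.ArrayList;
import java.util.List;

class DoubleChargeDemo {
    // Naive client: retries on failure with nothing tying the attempts together.
    static String chargeWithRetry(NaiveGateway gateway, String orderId,
                                  long amountMinor, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return gateway.charge(orderId, amountMinor);
            } catch (IllegalStateException e) {
                // The client cannot tell whether the charge went through.
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        NaiveGateway gateway = new NaiveGateway(true);
        String result = chargeWithRetry(gateway, "order-42", 49_900, 3);
        // prints "OK, charges recorded: 2"
        System.out.println(result + ", charges recorded: " + gateway.charges.size());
    }
}

// Hypothetical stand-in for a gateway with no idempotency protection.
class NaiveGateway {
    final List<String> charges = new ArrayList<>();
    private boolean dropNextResponse;

    NaiveGateway(boolean dropFirstResponse) {
        this.dropNextResponse = dropFirstResponse;
    }

    // Records the charge first, then simulates losing the response on the wire.
    String charge(String orderId, long amountMinor) {
        charges.add(orderId + ":" + amountMinor);
        if (dropNextResponse) {
            dropNextResponse = false;
            throw new IllegalStateException("simulated timeout: response lost");
        }
        return "OK";
    }
}
```

The client sees one failure followed by one success, yet the gateway has recorded two charges for the same order.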
#Designing the Idempotency Layer
We built a strict idempotency layer on Redis, using Redisson distributed locks. A unique idempotency key is derived from transaction invariants (such as `user_id + cart_id + attempt_number`, or the order ID). Every input must stay fixed across automatic retries of the same payment; an attempt number that changed on each retry would produce a new key every time and defeat the check. Before invoking any payment gateway, the application must acquire a lock on the key. If the lock is held, or if the key already exists in the database with a 'processing' state, subsequent retries are blocked and instead poll for the primary request's result.
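As a minimal sketch of the key derivation and the 'processing' check described above, assuming a SHA-256 hash over the invariants and an in-memory map standing in for the Redis or database record. The class name `IdempotencyKeys`, the `payment:` prefix, and the `|` separator are hypothetical choices for illustration, not part of the original system.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class IdempotencyKeys {
    enum State { PROCESSING, COMPLETED }

    // In-memory stand-in for the persisted record behind each idempotency key.
    private final ConcurrentMap<String, State> states = new ConcurrentHashMap<>();

    // Deterministic derivation: identical inputs always yield the same key.
    // Assumes the IDs never contain the '|' separator.
    static String deriveKey(String userId, String cartId, int attemptNumber) {
        String material = userId + "|" + cartId + "|" + attemptNumber;
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(material.getBytes(StandardCharsets.UTF_8));
            StringBuilder key = new StringBuilder("payment:");
            for (byte b : hash) {
                key.append(String.format("%02x", b));
            }
            return key.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    // Atomically claims the key. Returns true only for the first caller;
    // later callers should poll stateOf(key) instead of charging again.
    boolean tryClaim(String key) {
        return states.putIfAbsent(key, State.PROCESSING) == null;
    }

    void markCompleted(String key) {
        states.put(key, State.COMPLETED);
    }

    State stateOf(String key) {
        return states.get(key);
    }

    public static void main(String[] args) {
        IdempotencyKeys store = new IdempotencyKeys();
        String key = deriveKey("user-7", "cart-19", 1);
        System.out.println("first claim:  " + store.tryClaim(key));
        System.out.println("retry claim:  " + store.tryClaim(key));
        System.out.println("state:        " + store.stateOf(key));
    }
}
```

Hashing keeps the key at a fixed length regardless of how long the underlying IDs are. A production version would also need the claimed state to survive process restarts, which an in-memory map does not provide.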
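```java
import java.util.concurrent.TimeUnit;
import org.redisson.api.RLock;

// Idempotency lock checker using Redisson distributed locks
public PaymentResponse processPayment(PaymentRequest request) {
    String idempotencyKey = "lock:payment:" + request.getOrderId();
    RLock lock = redissonClient.getLock(idempotencyKey);
    try {
        // Wait up to 2 seconds for the lock. The lease expires automatically
        // after 10 seconds, so the gateway call must finish within that window.
        if (lock.tryLock(2, 10, TimeUnit.SECONDS)) {
            // Check the database to see if a transaction already exists for this order
            Transaction tx = transactionRepo.findByOrderId(request.getOrderId());
            if (tx != null) {
                // Return the stored response instead of charging again
                return tx.getResponse();
            }
            // Execute external gateway call
            return executeGatewayPayment(request);
        } else {
            // Another request holds the lock; the caller should poll for its result
            throw new ConcurrentPaymentException("Payment request already in progress.");
        }
    } catch (InterruptedException e) {
        // Restore the interrupt flag so callers further up can observe it
        Thread.currentThread().interrupt();
        throw new PaymentException("Failed to acquire payment lock", e);
    } finally {
        // Release only if this thread still owns the lock (the lease may have expired)
        if (lock.isHeldByCurrentThread()) {
            lock.unlock();
        }
    }
}
```
#Dual-Write Ledger Consistency & Database Locking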
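To ensure accounting consistency, we use database dual-writes. We write a pending transaction record to the internal ledger database before contacting the external payment gateway, so a record of the attempt exists even if the call times out. We also use optimistic locking via a version column on the user's account entity. Optimistic locking does not block concurrent writers. Instead, each update succeeds only if the version it originally read is still current, so if a concurrent payment run changes the balance during the round trip to the external gateway, the stale update fails rather than silently overwriting the balance.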
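The pending-record-first ordering and the version check can be sketched with an in-memory model. This is an illustration under assumptions, not the production schema: `LedgerSketch`, `Account`, `TxState`, and every method name are hypothetical, and the `synchronized` methods stand in for the atomicity a single SQL statement of the form `UPDATE ... SET version = version + 1 WHERE id = ? AND version = ?` would provide.

```java
import java.util.HashMap;
import java.util.Map;

class LedgerSketch {
    // Immutable snapshot of an account row, including its version column.
    static final class Account {
        final long balanceMinor;
        final long version;

        Account(long balanceMinor, long version) {
            this.balanceMinor = balanceMinor;
            this.version = version;
        }
    }

    enum TxState { PENDING, SETTLED, FAILED }

    private final Map<String, Account> accounts = new HashMap<>();
    private final Map<String, TxState> ledger = new HashMap<>();

    synchronized void openAccount(String userId, long balanceMinor) {
        accounts.put(userId, new Account(balanceMinor, 0));
    }

    synchronized Account read(String userId) {
        return accounts.get(userId);
    }

    // First write: record the intent before any external call is made.
    synchronized void recordPending(String orderId) {
        ledger.put(orderId, TxState.PENDING);
    }

    // Applies the debit only if nobody changed the row since 'seen' was read.
    // Returns false when another writer bumped the version first.
    synchronized boolean debitIfUnchanged(String userId, Account seen, long amountMinor) {
        Account current = accounts.get(userId);
        if (current == null || current.version != seen.version) {
            return false;
        }
        accounts.put(userId,
                new Account(current.balanceMinor - amountMinor, current.version + 1));
        return true;
    }

    // Second write: resolve the pending record once the outcome is known.
    synchronized void settle(String orderId, boolean success) {
        ledger.put(orderId, success ? TxState.SETTLED : TxState.FAILED);
    }

    synchronized TxState stateOf(String orderId) {
        return ledger.get(orderId);
    }

    public static void main(String[] args) {
        LedgerSketch db = new LedgerSketch();
        db.openAccount("user-7", 10_000);
        db.recordPending("order-42");
        Account seen = db.read("user-7"); // snapshot taken before the gateway call

        // While the gateway call is in flight, another payment run updates the account.
        db.debitIfUnchanged("user-7", db.read("user-7"), 1_000);

        boolean stale = db.debitIfUnchanged("user-7", seen, 4_000);
        boolean retried = db.debitIfUnchanged("user-7", db.read("user-7"), 4_000);
        db.settle("order-42", retried);

        // prints "false true 5000 SETTLED"
        System.out.println(stale + " " + retried + " "
                + db.read("user-7").balanceMinor + " " + db.stateOf("order-42"));
    }
}
```

The stale write is rejected and the caller re-reads before retrying, which is the behavior a version column gives: conflicts are detected at write time rather than prevented up front.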