API Retry Wrapper with Exponential Backoff

Difficulty: medium | Language: Python

Lesson

Retry Logic and Exponential Backoff

Network calls and external API requests can fail for many reasons: temporary network issues, server overload, or rate limiting. Rather than giving up on the first failure, a resilient system retries with exponential backoff so that transient failures are handled gracefully.

Exponential backoff is a strategy where the delay between retry attempts increases exponentially. Instead of retrying immediately or with fixed intervals, you wait progressively longer: 1 second, then 2 seconds, then 4 seconds, and so on. This approach prevents overwhelming an already struggling service while giving it time to recover.
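
As a minimal sketch of the schedule described above, the delay before each retry can be computed as a base delay doubled once per attempt and limited by a cap. The name `backoff_delay` and the parameters `base` and `max_delay` are illustrative assumptions, not part of any library.

```python
def backoff_delay(attempt, base=1.0, max_delay=30.0):
    """Delay in seconds before retry number `attempt` (0-based).

    Hypothetical helper: doubles the delay each attempt and caps it
    at `max_delay` so the wait never grows without bound.
    """
    return min(base * (2 ** attempt), max_delay)


# Delays for the first six retries with the assumed defaults
schedule = [backoff_delay(n) for n in range(6)]
print(schedule)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

The sixth delay would be 32 seconds without the cap; with the assumed 30-second cap it stops growing there.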

Jitter adds randomness to retry delays to prevent the "thundering herd" problem. When many clients retry simultaneously with identical delays, they can create synchronized waves of requests that continue to overload the server. By adding random variation (typically ±25%), requests spread out more naturally.
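
One way to apply such jitter, sketched here under the assumption of a ±25% spread, is to multiply the computed delay by a random factor. The helper name `add_jitter` and its `spread` and `rng` parameters are hypothetical and chosen only for illustration.

```python
import random


def add_jitter(delay, spread=0.25, rng=random):
    """Scale `delay` by a random factor in [1 - spread, 1 + spread]."""
    return delay * rng.uniform(1 - spread, 1 + spread)


# Ten hypothetical clients that all computed the same 4-second backoff
rng = random.Random(42)  # seeded only so the demo is repeatable
delays = [add_jitter(4.0, rng=rng) for _ in range(10)]
print([round(d, 2) for d in delays])  # values spread between 3.0 and 5.0
```

Because each client draws its own factor, the retries land at different moments instead of arriving as one synchronized burst.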

The key components of effective retry logic are:

  1. Selective retrying: Only retry on transient errors (timeouts, rate limits), not on permanent failures (authentication errors, invalid requests)
  2. Bounded retries: Set maximum attempts to prevent infinite loops
  3. Delay management: Use exponential backoff with a maximum delay cap
  4. Error preservation: Return the original error when retries are exhausted
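
The four components above can be combined in a single wrapper. The following is a rough sketch rather than a definitive implementation: the name `call_with_retries`, its parameters, the default values, and the injectable `sleep` argument are all assumptions made for illustration.

```python
import random
import time


def call_with_retries(func, max_attempts=4, base=1.0, max_delay=30.0,
                      retry_on=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call func(), retrying only on the exception types in `retry_on`."""
    for attempt in range(max_attempts):                  # 2. bounded retries
        try:
            return func()
        except retry_on:                                 # 1. selective retrying
            if attempt == max_attempts - 1:
                raise                                    # 4. original error preserved
            delay = min(base * 2 ** attempt, max_delay)  # 3. capped exponential backoff
            sleep(delay * random.uniform(0.75, 1.25))    # jitter of ±25% (assumed)


# Usage: a simulated call that times out twice, then succeeds
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


waits = []
result = call_with_retries(flaky, sleep=waits.append)  # record waits instead of sleeping
print(result, calls["count"], len(waits))  # ok 3 2
```

Any exception type not listed in `retry_on` (a `ValueError`, for instance) propagates on the first call without a retry. Passing `sleep` as a parameter is a design choice that lets the wrapper be exercised in tests without real waiting.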

This pattern is ubiquitous in distributed systems, cloud services, and API integrations where network reliability cannot be guaranteed.

Example

    import time
    import random

    def retry_with_backoff(func, max_attempts=3):
        """Simple retry wrapper with exponential backoff"""
        for attempt in range(max_attempts):
            try:
                return func()  # Try the operation
            except (ConnectionError, TimeoutError):
                if attempt == max_attempts - 1:
                    raise  # Last attempt, re-raise the original error

                # Calculate exponential delay: 1, 2, 4 seconds...
                delay = 2 ** attempt
                # Add jitter to prevent thundering herd
                jittered_delay = delay * random.uniform(0.8, 1.2)

                print(f"Attempt {attempt + 1} failed, retrying in {jittered_delay:.2f}s")
                time.sleep(jittered_delay)

Notes on the example:

  • The except clause catches only ConnectionError and TimeoutError, so only specific transient error types are retried
  • Exponential backoff: the delay doubles after each failed attempt (1, 2, 4, 8... seconds)
  • Jitter adds ±20% randomness to prevent synchronized retries

Key Takeaways

  • Exponential backoff prevents overwhelming struggling services by increasing delays between retries
  • Jitter adds randomness to retry timing to avoid thundering herd problems
  • Only retry on transient errors (timeouts, rate limits), not permanent failures