# HTTP Request Retries
Pydantic AI provides retry functionality for HTTP requests made by model providers through custom HTTP transports. This is particularly useful for handling transient failures like rate limits, network timeouts, or temporary server errors.
## Overview

The retry functionality is built on top of the `tenacity` library and integrates seamlessly with httpx clients. You can configure retry behavior for any provider that accepts a custom HTTP client.
## Installation

To use the retry transports, you need to install `tenacity`, which you can do via the `retries` dependency group:

```bash
pip install 'pydantic-ai-slim[retries]'
```

or with uv:

```bash
uv add 'pydantic-ai-slim[retries]'
```
## Usage Example

Here's an example of adding retry functionality with smart retry handling:

```python
from httpx import AsyncClient, HTTPStatusError
from tenacity import (
    AsyncRetrying,
    retry_if_exception_type,
    stop_after_attempt,
    wait_exponential,
)

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.retries import AsyncTenacityTransport, wait_retry_after


def create_retrying_client():
    """Create a client with smart retry handling for multiple error types."""

    def should_retry_status(response):
        """Raise exceptions for retryable HTTP status codes."""
        if response.status_code in (429, 502, 503, 504):
            response.raise_for_status()  # This will raise HTTPStatusError

    transport = AsyncTenacityTransport(
        controller=AsyncRetrying(
            # Retry on HTTP errors and connection issues
            retry=retry_if_exception_type((HTTPStatusError, ConnectionError)),
            # Smart waiting: respects Retry-After headers, falls back to exponential backoff
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=1, max=60),
                max_wait=300,
            ),
            # Stop after 5 attempts
            stop=stop_after_attempt(5),
            # Re-raise the last exception if all retries fail
            reraise=True,
        ),
        validate_response=should_retry_status,
    )
    return AsyncClient(transport=transport)


# Use the retrying client with a model
client = create_retrying_client()
model = OpenAIModel('gpt-4o', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```
## Wait Strategies

### `wait_retry_after`

The `wait_retry_after` function is a smart wait strategy that automatically respects HTTP `Retry-After` headers:

```python
from tenacity import wait_exponential

from pydantic_ai.retries import wait_retry_after

# Basic usage - respects Retry-After headers, falls back to exponential backoff
wait_strategy_1 = wait_retry_after()

# Custom configuration
wait_strategy_2 = wait_retry_after(
    fallback_strategy=wait_exponential(multiplier=2, max=120),
    max_wait=600,  # Never wait more than 10 minutes
)
```
This wait strategy:

- Automatically parses `Retry-After` headers from HTTP 429 responses
- Supports both seconds format (`"30"`) and HTTP date format (`"Wed, 21 Oct 2015 07:28:00 GMT"`)
- Falls back to your chosen strategy when no header is present
- Respects the `max_wait` limit to prevent excessive delays
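To make the two header formats concrete, here is an illustrative sketch (not the library's actual implementation) of how a `Retry-After` value can be mapped to a wait in seconds, using only the standard library:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value: str) -> float:
    """Illustrative: convert a Retry-After header value to a wait in seconds."""
    try:
        # Seconds form, e.g. "30"
        return float(value)
    except ValueError:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT";
        # a date in the past clamps to zero wait
        target = parsedate_to_datetime(value)
        return max(0.0, (target - datetime.now(timezone.utc)).total_seconds())
```

A real wait strategy would additionally clamp the result to `max_wait` before sleeping.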
## Transport Classes

### AsyncTenacityTransport

For asynchronous HTTP clients (recommended for most use cases):

```python
from httpx import AsyncClient
from tenacity import AsyncRetrying, stop_after_attempt

from pydantic_ai.retries import AsyncTenacityTransport

# Create the basic components
async_retrying = AsyncRetrying(stop=stop_after_attempt(3), reraise=True)


def validator(response):
    """Treat responses with HTTP status 4xx/5xx as failures that need to be retried.

    Without a response validator, only network errors and timeouts will result in a retry.
    """
    response.raise_for_status()


# Create the transport
transport = AsyncTenacityTransport(
    controller=async_retrying,  # AsyncRetrying instance
    validate_response=validator,  # Optional response validator
)

# Create a client using the transport
client = AsyncClient(transport=transport)
```
### TenacityTransport

For synchronous HTTP clients:

```python
from httpx import Client
from tenacity import Retrying, stop_after_attempt

from pydantic_ai.retries import TenacityTransport

# Create the basic components
retrying = Retrying(stop=stop_after_attempt(3), reraise=True)


def validator(response):
    """Treat responses with HTTP status 4xx/5xx as failures that need to be retried.

    Without a response validator, only network errors and timeouts will result in a retry.
    """
    response.raise_for_status()


# Create the transport
transport = TenacityTransport(
    controller=retrying,  # Retrying instance
    validate_response=validator,  # Optional response validator
)

# Create a client using the transport
client = Client(transport=transport)
```
## Common Retry Patterns

### Rate Limit Handling with Retry-After Support

```python
from httpx import AsyncClient, HTTPStatusError
from tenacity import AsyncRetrying, retry_if_exception_type, stop_after_attempt, wait_exponential

from pydantic_ai.retries import AsyncTenacityTransport, wait_retry_after


def create_rate_limit_client():
    """Create a client that respects Retry-After headers from rate limiting responses."""
    transport = AsyncTenacityTransport(
        controller=AsyncRetrying(
            retry=retry_if_exception_type(HTTPStatusError),
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=1, max=60),
                max_wait=300,  # Don't wait more than 5 minutes
            ),
            stop=stop_after_attempt(10),
            reraise=True,
        ),
        validate_response=lambda r: r.raise_for_status(),  # Raises HTTPStatusError for 4xx/5xx
    )
    return AsyncClient(transport=transport)


# Example usage
client = create_rate_limit_client()
# Client is now ready to use with any HTTP requests and will respect Retry-After headers
```
The `wait_retry_after` function automatically detects `Retry-After` headers in 429 (rate limit) responses and waits for the specified time. If no header is present, it falls back to exponential backoff.
### Network Error Handling

```python
import httpx
from tenacity import AsyncRetrying, retry_if_exception_type, stop_after_attempt, wait_exponential

from pydantic_ai.retries import AsyncTenacityTransport


def create_network_resilient_client():
    """Create a client that handles network errors with retries."""
    transport = AsyncTenacityTransport(
        controller=AsyncRetrying(
            retry=retry_if_exception_type((
                httpx.TimeoutException,
                httpx.ConnectError,
                httpx.ReadError,
            )),
            wait=wait_exponential(multiplier=1, max=10),
            stop=stop_after_attempt(3),
            reraise=True,
        )
    )
    return httpx.AsyncClient(transport=transport)


# Example usage
client = create_network_resilient_client()
# Client will now retry on timeout, connection, and read errors
```
### Custom Retry Logic

```python
import httpx
from tenacity import AsyncRetrying, retry_if_exception, stop_after_attempt, wait_exponential

from pydantic_ai.retries import AsyncTenacityTransport, wait_retry_after


def create_custom_retry_client():
    """Create a client with custom retry logic."""

    def custom_retry_condition(exception):
        """Custom logic to determine if we should retry."""
        if isinstance(exception, httpx.HTTPStatusError):
            # Retry on server errors but not client errors
            return 500 <= exception.response.status_code < 600
        return isinstance(exception, (httpx.TimeoutException, httpx.ConnectError))

    transport = AsyncTenacityTransport(
        controller=AsyncRetrying(
            # retry_if_exception wraps a predicate so it receives the raised
            # exception (tenacity's bare `retry` callable gets a RetryCallState)
            retry=retry_if_exception(custom_retry_condition),
            # Use wait_retry_after for smart waiting on rate limits,
            # with custom exponential backoff as fallback
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=2, max=30),
                max_wait=120,
            ),
            stop=stop_after_attempt(5),
            reraise=True,
        ),
        validate_response=lambda r: r.raise_for_status(),
    )
    return httpx.AsyncClient(transport=transport)


client = create_custom_retry_client()
# Client will retry server errors (5xx) and network errors, but not client errors (4xx)
```
## Using with Different Providers

The retry transports work with any provider that accepts a custom HTTP client:

### OpenAI

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIModel('gpt-4o', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```

### Anthropic

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = AnthropicModel('claude-3-5-sonnet-20241022', provider=AnthropicProvider(http_client=client))
agent = Agent(model)
```

### Any OpenAI-Compatible Provider

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIModel(
    'your-model-name',  # Replace with actual model name
    provider=OpenAIProvider(
        base_url='https://api.example.com/v1',  # Replace with actual API URL
        api_key='your-api-key',  # Replace with actual API key
        http_client=client,
    ),
)
agent = Agent(model)
```
## Best Practices

1. **Start Conservative**: Begin with a small number of retries (3-5) and reasonable wait times.
2. **Use Exponential Backoff**: This helps avoid overwhelming servers during outages.
3. **Set Maximum Wait Times**: Prevent indefinite delays with reasonable maximum wait times.
4. **Handle Rate Limits Properly**: Respect `Retry-After` headers when possible.
5. **Log Retry Attempts**: Add logging to monitor retry behavior in production. (This will be picked up by Logfire automatically if you instrument httpx.)
6. **Consider Circuit Breakers**: For high-traffic applications, consider implementing circuit breaker patterns.
## Error Handling

The retry transports re-raise the last exception if all retry attempts fail (when the controller is configured with `reraise=True`, as in the examples above; otherwise tenacity raises a `RetryError`). Make sure to handle these exceptions appropriately in your application:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIModel('gpt-4o', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```
## Performance Considerations

- Retries add latency to requests, especially with exponential backoff
- Consider the total timeout for your application when configuring retry behavior
- Monitor retry rates to detect systemic issues
- Use async transports for better concurrency when handling multiple requests

For more advanced retry configurations, refer to the tenacity documentation.