Retrying HTTP requests with the Python requests library
The Python requests library is among the most popular today. According to GitHub, 2.3 million users 😱 are using it. One of the primary reasons for its widespread adoption is its user-friendly API.
By default, requests made with the requests library are not retried: once a request is sent and fails for any reason (such as network issues, server timeouts, or temporary outages), the library won't automatically attempt to send it again.
Sometimes we want to ensure reliable communication between systems. For example, in e-commerce platforms, when a customer places an order, it's essential that the order details are accurately communicated to inventory systems, payment gateways, and shipping providers. Any failure in these communications can result in missed shipments, incorrect billing, or overselling of inventory. In such critical scenarios, merely logging an error isn't enough; implementing retries becomes crucial to ensure that business operations run smoothly and customer trust is maintained.
Implementing retries requires the use of certain strategies. We cannot retry a request immediately after it fails; we need to wait a specific period of time for the service/API to become available again. For this, several strategies exist that can be employed to optimize the retry mechanism. One of the most common is exponential backoff. This strategy involves increasing the waiting time between retries exponentially, so that each retry waits longer than the previous one. This allows the service or API sufficient time to recover from any transient issues or high traffic.
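To make the idea concrete, here is a minimal, library-independent sketch of a retry loop with exponential backoff. The helper name call_with_backoff, its parameters, and the choice to retry only on ConnectionError are assumptions made for illustration; this is not how requests or urllib3 implement retries internally.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on ConnectionError with exponentially growing waits."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable when max_retries >= 0")
```

The sleep function is passed in as a parameter so the waiting can be replaced in tests; in real code the default time.sleep is used.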
How can the requests library help us implement retries?
The first thing to know is that the requests library is built on top of urllib3. This foundational library has built-in support for connection pooling and retries, which makes it a robust choice for web requests. Leveraging this underlying functionality, requests can be configured to implement retries with a wide variety of options.
To give requests retrying capabilities, we have to import the Retry class from the urllib3 library like this:
from urllib3.util import Retry
- If you installed requests, urllib3 is already installed too.
Now, let's explore which arguments the Retry class accepts:
Note
For a fuller description of the Retry arguments, see https://urllib3.readthedocs.io/en/stable/reference/urllib3.util.html#urllib3.util.Retry (version 2.4.0).
Retry(
    total=10,
    connect=None,
    read=None,
    redirect=None,
    status=None,
    other=None,
    allowed_methods=frozenset({'DELETE', 'GET', 'HEAD', 'OPTIONS', 'PUT', 'TRACE'}),
    status_forcelist=None,
    backoff_factor=0,
    backoff_max=120,
    raise_on_redirect=True,
    raise_on_status=True,
    history=None,
    respect_retry_after_header=True,
    remove_headers_on_redirect=frozenset({'Authorization'}),
    backoff_jitter=0.0
)
- total (int): Total number of retries to allow.
- connect (int): How many connection-related errors to retry on.
- read (int): How many times to retry on read errors.
- redirect (int): How many redirects to perform. Limit this to avoid infinite redirect loops.
- status (int): How many times to retry after bad status codes.
- other (int): How many times to retry on other types of errors.
- allowed_methods (frozenset): Set of uppercased HTTP method verbs that we should retry on.
- status_forcelist (list): A set of integer HTTP status codes that we should force a retry on.
- backoff_factor (float): A backoff factor to apply between attempts. urllib3 will sleep for {backoff factor} * (2 ** ({number of previous retries})) seconds, except before the first retry, which is sent without any delay. For instance, with backoff_factor set to 0.5, the sleep times would be [0s, 1s, 2s, 4s, 8s, …].
- backoff_max (int): Maximum sleep time between retries.
- raise_on_redirect (bool): Whether or not to raise an error on redirects.
- raise_on_status (bool): Whether or not to raise an error on status.
- history (tuple): The history of the request encountered during each call to increment().
- respect_retry_after_header (bool): Whether or not to respect the Retry-After header in responses.
- remove_headers_on_redirect (Collection): Sequence of headers to remove from the request when a response indicating a redirect is returned before firing off the redirected request.
- backoff_jitter (float): The amount of jitter (randomness) to apply to the delay before the next retry: a random value between 0 and backoff_jitter seconds is added to the computed delay.
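The interaction between backoff_factor and backoff_max is easier to see with numbers. The sketch below is a plain-Python approximation of the schedule as I understand it for urllib3 2.x (no wait before the first retry, doubling afterwards, capped at backoff_max, jitter left out). The function backoff_schedule is a hypothetical helper written for this post, not part of urllib3.

```python
def backoff_schedule(
    backoff_factor: float, retries: int, backoff_max: float = 120
) -> list[float]:
    """Approximate the waits (in seconds) before each retry."""
    delays = []
    for retry_number in range(1, retries + 1):
        if retry_number == 1:
            delay = 0.0  # the first retry is sent without waiting
        else:
            # retry_number - 1 is the number of previous retries
            delay = backoff_factor * 2 ** (retry_number - 1)
        delays.append(float(min(backoff_max, delay)))
    return delays


print(backoff_schedule(1, 3))      # [0.0, 2.0, 4.0]
print(backoff_schedule(1, 9)[-1])  # 120.0 (256 seconds capped by backoff_max)
```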
Now that we have imported the Retry class from urllib3 and gained a foundational understanding of its attributes, it's time to integrate it with the requests library. To achieve this, we'll use the Session class from the requests library. After creating a session, we can attach our retry logic by mounting an adapter (HTTPAdapter) to it. Here's how it's done:
from urllib3.util import Retry
from requests import Session # (1)!
from requests.adapters import HTTPAdapter
session = Session()
retries = Retry() # (2)!
session.mount('https://', HTTPAdapter(max_retries=retries)) # (3)!
- The requests version used in this post is 2.31.0.
- If we omit Retry's arguments, it retries the request up to 10 times, with no delay between attempts, in case of connection or read errors.
- If you want to make requests using the http protocol, you need to mount another HTTPAdapter like this: session.mount('http://', HTTPAdapter(max_retries=retries)).
The above code means that when we make an HTTP request using the session variable, it will have the retry logic we set up previously in the Retry class.
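Because the retry logic only applies to URL prefixes that have the adapter mounted, it can be convenient to wrap the setup in a small helper that covers both schemes. The function build_retrying_session and the Retry values used here are illustrative assumptions, not something requests provides.

```python
from requests import Session
from requests.adapters import HTTPAdapter
from urllib3.util import Retry


def build_retrying_session(retries: Retry) -> Session:
    """Return a Session whose http:// and https:// requests share one retry policy."""
    session = Session()
    adapter = HTTPAdapter(max_retries=retries)
    # Mount the same adapter on both prefixes so plain-HTTP calls are covered too.
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = build_retrying_session(Retry(total=3, backoff_factor=1))
```

Any request made through the returned session, for either scheme, then goes through the same Retry configuration.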
Retrying on specific status codes
Now consider a scenario where you're making an HTTP request to a service and you want to retry it when you get a 500 (indicating a server error) or 503 (often indicating service unavailability) status code. For this case, you could set up the retry logic using requests like so:
from urllib3.util import Retry
from urllib3 import add_stderr_logger
from requests import Session
from requests.adapters import HTTPAdapter
add_stderr_logger() # (1)!
session = Session()
retries = Retry(
    total=None,
    connect=False,
    read=False,
    status=3,
    backoff_factor=1,
    status_forcelist=[503, 500],
)
session.mount('https://', HTTPAdapter(max_retries=retries))
resp = session.get("https://my_service_url.com")
- This function helps us see the logs for each retry on stderr.
The above code sends an HTTP GET request to https://my_service_url.com. If the server responds with a status code found in status_forcelist, the request is retried up to 3 times. The wait times before each retry follow this pattern: [0s, 2s, 4s]. The delay never exceeds the value specified in backoff_max (you can change this value if needed). If the response includes a Retry-After header, its value overrides the wait time calculated from the backoff factor (you can disable this behaviour by setting respect_retry_after_header to False). In the above retry logic, it's important to note that if we exhaust all retries, we will receive a RetryError exception. Additionally, if a retry returns a status code that is not in status_forcelist (e.g., 403), the retries stop immediately and that response is returned as-is; requests raises no exception for it unless you call raise_for_status() on the response.
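Since exhausting the retries surfaces as a RetryError, calling code usually needs to decide what to do at that point. Here is a minimal sketch of one way to handle it. The helper get_or_none, the 5-second timeout, and the choice to return None are assumptions for illustration; your application may prefer to re-raise or to queue the request for later.

```python
from requests import Session
from requests.exceptions import RetryError


def get_or_none(session: Session, url: str):
    """GET url; return the response, or None once the retries are exhausted."""
    try:
        # The timeout is an illustrative value; requests has no default timeout.
        return session.get(url, timeout=5)
    except RetryError as exc:
        print(f"Giving up on {url}: {exc}")
        return None
```

The session passed in is expected to be one with a retrying adapter mounted, like the one configured above.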
Warning
You might be curious about why I set connect and read to False. The reason is that total is set to None, so if connect and read were left at their default value (None), connect or read errors would be retried indefinitely. Setting them to False disables retries for those errors, so they are raised immediately. I set total to None so that the retry logic falls back on the individual read, connect, and status counters.
Retry HTTP POST method
By default, the retryable HTTP methods are HEAD, GET, PUT, DELETE, OPTIONS, and TRACE. However, if you need to retry a POST request, you must update the allowed_methods parameter in the Retry configuration. See the example below for guidance.
# ... previous imports
session = Session()
retries = Retry(
    total=None,
    connect=False,
    read=False,
    status=3,
    backoff_factor=1,
    status_forcelist=[503, 500],
    allowed_methods={'POST'},  # retry POST requests (replaces the default methods)
)
session.mount('https://', HTTPAdapter(max_retries=retries))
resp = session.post("https://my_service_url.com", data={"key": "value"})
# ... more code
You can include any HTTP methods of your choice and specify the status codes you wish to retry on.
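Keep in mind that a set passed to allowed_methods replaces the default list rather than extending it. One way to add POST while keeping the defaults, sketched below, is to build on Retry.DEFAULT_ALLOWED_METHODS. The other values simply mirror the configuration used in this post. Also note that POST is generally not idempotent, so retrying it can repeat a side effect (for example, placing an order twice) unless the server protects against duplicates.

```python
from urllib3.util import Retry

retries = Retry(
    total=None,
    connect=False,
    read=False,
    status=3,
    backoff_factor=1,
    status_forcelist=[500, 503],
    # Keep the default retryable methods and add POST on top of them.
    allowed_methods=Retry.DEFAULT_ALLOWED_METHODS | {"POST"},
)
```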
To sum up!
As demonstrated, it's straightforward to implement retry logic for HTTP requests using the requests library. This is made possible by the Retry class from urllib3, which offers extensive configuration options. Experiment with parameters not covered in this post and see how they can fit into your use cases.
Moreover, remember not to over-retry, as this can increase the load on servers or even be perceived as a DDoS attack. Also, consider logging retries: when they occur, the logs are crucial for debugging.
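As a sketch of the logging advice, the standard logging module can be pointed at urllib3's loggers instead of using add_stderr_logger. This assumes that urllib3 emits its retry messages under the "urllib3" logger hierarchy, mostly at DEBUG level, which matches my understanding of current versions; the format string is just an example.

```python
import logging

# Send log records to stderr with a timestamp, then lower urllib3's threshold
# so its retry-related messages are emitted.
logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s %(message)s")
logging.getLogger("urllib3").setLevel(logging.DEBUG)
```

In production you would typically route these records to your usual log handlers rather than stderr.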
There are other libraries you can use to implement retries for your requests. Here are some that I know:
- Backoff - https://github.com/litl/backoff
- Tenacity - https://github.com/jd/tenacity
- Stamina - https://github.com/hynek/stamina
All of them use the decorator strategy and have some nice features that you can try.