How to perform HTTP requests with python – Part 2 – The requests Library

In the previous article we saw how to perform basic HTTP requests using the python3 standard library. When requests become more complex, or we just want to use less code, and we don’t mind adding a dependency to our project, it’s possible (and sometimes even recommended) to use the external requests module. The library, which adopted the “HTTP for Humans” motto, will be the focus of this article.

In this tutorial you will learn:

  • How to perform HTTP requests with python3 and the ‘requests’ library
  • How to manage server responses
  • How to work with sessions



Software Requirements and Conventions Used

Software Requirements and Linux Command Line Conventions

Category: Requirements, Conventions or Software Version Used
System: OS-independent
Software: Python3 and the “requests” library
Other: Knowledge of the basic concepts of Object Oriented Programming and Python
Conventions: # – requires given linux commands to be executed with root privileges either directly as a root user or by use of sudo command
             $ – requires given linux commands to be executed as a regular non-privileged user

Performing requests with the “requests” library

In the first part of this series, we performed basic HTTP requests using only the standard library. When requests become more complex, for example when we need to preserve cookies between one request and another, we can use the external requests library, which simplifies our job by performing a lot of operations under the hood for us. Since the library is not included in a default python3 installation, we must install it on our system before we can use it. A distribution-independent method to accomplish the task is to use pip, the python package manager:

$ pip3 install requests --user


Now that we have installed the library, let’s see some examples of how to use it.

Performing a get request

Remember the request we made using the NASA APIs, to retrieve the “image of the day” for a specific date? Building and sending the same request with the requests library requires just one line of code:

>>> import requests
>>> response = requests.get("https://api.nasa.gov/planetary/apod", params={"api_key": "DEMO_KEY", "date": "2019-04-11"})

We passed the URL and the query parameters (still as a dictionary), respectively as the first and the second argument of the get function. What does this function return? It returns an instance of the requests.models.Response class. Interacting with instances of this class is very easy. Do we want to retrieve the json-encoded content of the response? Easy! We just need to call the json method of the object:

>>> response.json()
{'date': '2019-04-11',
'explanation': 'What does a black hole look like? To find out, radio '
                'telescopes from around the Earth coordinated observations of '
                'black holes with the largest known event horizons on the '
                ...
                'immediate vicinity of the black hole in the center of our '
                'Milky Way Galaxy.',
'hdurl': 'https://apod.nasa.gov/apod/image/1904/M87bh_EHT_2629.jpg',
'media_type': 'image',
'service_version': 'v1',
'title': 'First Horizon-Scale Image of a Black Hole',
'url': 'https://apod.nasa.gov/apod/image/1904/M87bh_EHT_960.jpg'}

Do we want to obtain the response of the server as a string? All we have to do is access the text property:

>>> response.text

In the same fashion we can access the reason, status_code and headers of the response. We just have to access the respective properties:

>>> response.reason
'OK'
>>> response.status_code
200
>>> response.headers
{'Server': 'openresty', 'Date': 'Thu, 18 Apr 2019 10:46:26 GMT', 'Content-Type': 'application/json', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Vary': 'Accept-Encoding', 'X-RateLimit-Limit': '40', 'X-RateLimit-Remaining': '39', 'Via': '1.1 vegur, http/1.1 api-umbrella (ApacheTrafficServer [cMsSf ])', 'Age': '0', 'X-Cache': 'MISS', 'Access-Control-Allow-Origin': '*', 'Strict-Transport-Security': 'max-age=31536000; preload', 'Content-Encoding': 'gzip'}
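The properties above can be gathered in a small helper. What follows is only a minimal sketch: the summarize function and the keys of the dictionary it returns are hypothetical names chosen for illustration, not part of the library. It relies on the ok property (True when the status code is below 400) and on the fact that header lookups are case-insensitive.

```python
import requests


def summarize(response):
    """Collect the most commonly inspected fields of a Response in a dict.

    'summarize' and the dictionary keys are illustrative names only.
    """
    return {
        "ok": response.ok,  # True when the status code is below 400
        "status": response.status_code,
        "reason": response.reason,
        # response.headers is case-insensitive, so any casing works here
        "content_type": response.headers.get("content-type"),
    }
```

The helper could be called as summarize(requests.get(url, params=query, timeout=10)), where url and query stand for the values used earlier. The timeout argument (in seconds, an arbitrary value here) is worth passing in real code, since without it a request can wait indefinitely on an unresponsive server.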

Downloading a file

Downloading a file is also very easy. First of all we have to use the stream parameter of the get function. By default this parameter is set to False, and this means that the body of the response will be downloaded at once. Since we may want to download a large file, we want to set it to True: this way only the headers of the response will be immediately downloaded and the connection will remain open so we can further process it as we want:



>>> latest_kernel_tarball = "https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.0.7.tar.xz"
>>> with requests.get(latest_kernel_tarball, stream=True) as response:
...    with open("latest-kernel.tar.xz", "wb") as tarball:
...        for chunk in response.iter_content(16384):
...            tarball.write(chunk)

The code is similar to its standard library counterpart: the thing that changed is the use of the iter_content method of the response object. In the previous example we operated inside a while loop, which we interrupted only when the content of the response was consumed. Using this method, we can write to the destination file in a more elegant way, since we can iterate on the content of the response. The iter_content method accepts the optional argument chunk_size, an integer indicating the chunk size in bytes (the data to read in memory at each iteration).
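The download logic can be wrapped in reusable functions. This is a sketch under a few assumptions: save_response and download are hypothetical helper names, the 30 second timeout is an arbitrary value, and the call to raise_for_status (covered later in this article) is added so that an error page is not silently saved as if it were the requested file.

```python
import requests


def save_response(response, destination, chunk_size=16384):
    """Write the body of a streamed response to destination.

    Returns the number of bytes written. Hypothetical helper name.
    """
    written = 0
    with open(destination, "wb") as target:
        for chunk in response.iter_content(chunk_size):
            written += target.write(chunk)
    return written


def download(url, destination, timeout=30):
    """Stream url to destination, failing early on 4xx/5xx responses."""
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        return save_response(response, destination)
```

With these helpers, the example above would become download(latest_kernel_tarball, "latest-kernel.tar.xz").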

Sending form-encoded data or json in a request

Sending form-encoded data (for example in a POST request) with the “requests” library requires less code than the same operation performed using only the standard library:

>>> request_data = {
...     "variable1": "value1",
...     "variable2": "value2"
... }
>>> response = requests.post("https://httpbin.org/post", data=request_data)

To pass the same data, but as json:

>>> response = requests.post("https://httpbin.org/post", json=request_data)

By using the json parameter of the function, we don’t even have to worry about encoding the string using json.dumps: it will be done for us under the hood.
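To see the difference between the data and json parameters without contacting any server, both variants can be prepared but not sent, using the Request object described in the “Create a Request object” section of this article. This is just an illustrative sketch; the variable names form_request and json_request are arbitrary.

```python
import json

import requests

request_data = {"variable1": "value1", "variable2": "value2"}

# Prepare both variants without sending them, to compare what would be sent
form_request = requests.Request(
    "POST", "https://httpbin.org/post", data=request_data
).prepare()
json_request = requests.Request(
    "POST", "https://httpbin.org/post", json=request_data
).prepare()

print(form_request.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_request.body)                     # variable1=value1&variable2=value2
print(json_request.headers["Content-Type"])  # application/json
print(json.loads(json_request.body))         # decodes back to request_data
```

The library sets a Content-Type header that matches the chosen parameter, so in both cases the server can tell how to decode the body.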

Uploading a file

Uploading a file using the standard library can be a very tedious task, but it is very easy using the requests library. Say we want to upload a picture:

>>> response = requests.post(
...    "https://httpbin.org/post", files={'file': open('nasa_black_hole.png', 'rb')})

Impressively short code! We performed a post request, this time using the files argument. This argument must be a dictionary where the key is the field “name” and the value is a file object, in this case returned by the open function.
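In the one-liner above the file object is never explicitly closed. A possible variation, sketched here with the hypothetical helper names upload and prepare_upload and an arbitrary timeout, opens the file in a with block so that it is closed as soon as the request has been built or sent. The prepare_upload function only builds the multipart request, which is handy for inspecting what would be transmitted.

```python
import requests


def prepare_upload(url, path, field="file"):
    """Build (but do not send) a multipart POST request for the file at path."""
    # The file content is read while the request is prepared, so the file
    # can safely be closed when the with block ends.
    with open(path, "rb") as picture:
        return requests.Request("POST", url, files={field: picture}).prepare()


def upload(url, path, field="file", timeout=30):
    """Send the file at path as a multipart POST request."""
    with open(path, "rb") as picture:
        return requests.post(url, files={field: picture}, timeout=timeout)
```

For example, prepare_upload("https://httpbin.org/post", "nasa_black_hole.png").headers["Content-Type"] starts with multipart/form-data, followed by the boundary generated by the library.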

What about the other HTTP verbs? Each one of them is used with the accordingly named function: put, delete, head or options. All of them can be used with basically the same interface as the ones we saw before.

Working with sessions

The requests library allows us to use sessions: when requests are sent from a session context, cookies are preserved between one request and another. This is the recommended way of performing multiple requests to the same host, since even the same TCP connection will be reused. Let’s see how to create a session and send a request with it:

>>> session = requests.Session()
>>> response = session.get("https://httpbin.org/cookies/set?lastname=skywalker")


We created an instance of the requests.Session class and, instead of calling a module-level function as we did in the previous examples, we used the session method named after the HTTP verb (get in this case), which is used in the same manner. The request URL, this time, was https://httpbin.org/cookies/set, an endpoint which lets us set the cookie parameters we send in the query string. The call we made set a cookie which is now stored in the session and will be used in all the requests sent from the session context. To list all the cookies associated with a session we can access the cookies property, which is an instance of the requests.cookies.RequestsCookieJar class:

>>> session.cookies
<RequestsCookieJar[Cookie(version=0, name='lastname', value='skywalker', port=None, port_specified=False, domain='httpbin.org', domain_specified=False,
domain_initial_dot=False, path='/', path_specified=True, secure=False, expires=None, discard=True, comment=None, comment_url=None, rest={}, rfc2109=False)]>
>>> # Access the cookies keys
... session.cookies.keys()
['lastname']
>>>
>>> # Access the cookies values
... session.cookies.values()
['skywalker']
>>>
>>> # The iterkeys method returns an iterator of names of cookies
... session.cookies.iterkeys()
<generator object RequestsCookieJar.iterkeys at 0x7f414a1a5408>
>>> # The itervalues method does the same but for values
... session.cookies.itervalues()
<generator object RequestsCookieJar.itervalues at 0x7f414a1a5408>

To clear the cookies stored in the session we can use the clear method:

>>> session.cookies.clear()
>>> session.cookies
<RequestsCookieJar[]>
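The cookie jar can also be handled without contacting a server. The following sketch stores a cookie by hand with the set method, reads it back and then clears the jar; the domain and path values are assumptions chosen to mirror the httpbin example, and stored and remaining are arbitrary variable names. The session is used as a context manager so that its connections are released when the block ends.

```python
import requests

with requests.Session() as session:
    # Store a cookie by hand; domain and path are illustrative values
    session.cookies.set("lastname", "skywalker", domain="httpbin.org", path="/")

    print(session.cookies.get("lastname"))  # skywalker

    # get_dict returns the cookies as a plain name/value dictionary
    stored = session.cookies.get_dict()

    session.cookies.clear()
    remaining = session.cookies.get_dict()

print(stored)     # {'lastname': 'skywalker'}
print(remaining)  # {}
```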

Create a Request object

Until now we just used functions like get, post or put which basically create and send requests “on the fly”. There are cases in which we want to build a Request object but we don’t want to send it immediately. Here is how we can do it:

>>> request = requests.Request("GET", "https://httpbin.org/get")

The first argument of the Request constructor is the verb we want to use, and the second one is the destination URL. The same parameters we use when we send a request directly can be used: headers, params, data, json and files. Once we have created a Request we must “prepare” it before we can send it:



>>> session = requests.Session()
>>> request = requests.Request("GET", "https://httpbin.org/get")
>>> prepared_request = session.prepare_request(request)
>>> response = session.send(prepared_request)

We could also prepare a Request using the prepare method of the Request object itself, instead of calling session.prepare_request, but in this case the request would lose the advantages of being part of the session.
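The difference between the two ways of preparing a request can be checked offline by inspecting the prepared objects. In this sketch the X-Example header, the manually stored cookie and the date parameter are hypothetical values used only to make the session state visible.

```python
import requests

session = requests.Session()
session.headers.update({"X-Example": "demo"})  # hypothetical session-wide header
session.cookies.set("lastname", "skywalker", domain="httpbin.org", path="/")

request = requests.Request(
    "GET", "https://httpbin.org/get", params={"date": "2019-04-11"}
)

with_session = session.prepare_request(request)  # merges the session state
standalone = request.prepare()                   # knows nothing about the session

print(with_session.url)                       # https://httpbin.org/get?date=2019-04-11
print(with_session.headers.get("Cookie"))     # lastname=skywalker
print(with_session.headers.get("X-Example"))  # demo
print(standalone.headers.get("Cookie"))       # None
print(standalone.headers.get("X-Example"))    # None
```

Only the request prepared through the session carries the session cookies and default headers. The standalone one contains just what was passed to the Request constructor.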

Raise an exception when the response status code indicates an error

The status code returned by a server when a request is successful is usually 200 (more generally, a code in the 2xx range). When some error happens, for example when a resource is not found or when we are not authorized to access it, other codes are returned (in this case 404 and 403 respectively). By default the library does not raise an exception in these cases: if we want our code to raise one, we must call the raise_for_status method of the requests.models.Response object, which raises an exception for status codes in the 4xx and 5xx ranges. Let’s see how the code behaves differently when we use it. We send a POST request to an endpoint which accepts only the GET verb:

>>> response = requests.post('https://httpbin.org/get')
>>> response.status_code
405
>>> response.reason
'METHOD NOT ALLOWED'

As expected, since we used the wrong HTTP verb, the response status code was 405, and the corresponding “reason” is METHOD NOT ALLOWED. However, no exception was raised. To let a bad request raise an exception we must call the raise_for_status method after sending the request:

>>> response = requests.post('https://httpbin.org/get')
>>> response.raise_for_status()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/site-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 405 Client Error: METHOD NOT ALLOWED for url: https://httpbin.org/get

Since we called raise_for_status, this time the request raised a requests.exceptions.HTTPError exception.
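In a script we would normally catch the exception instead of letting the traceback reach the user. A minimal sketch follows, with describe_failure as a hypothetical helper name; it relies on the fact that the exception keeps a reference to the failed response in its response attribute.

```python
import requests


def describe_failure(response):
    """Return None for a non-error response, or a short error description."""
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError as error:
        # The failed response stays reachable through error.response
        return f"{error.response.status_code}: {error}"
    return None
```

Calling describe_failure(requests.post('https://httpbin.org/get')) would return a string starting with 405, while a response with a status code below 400 yields None.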

Conclusions

In this article, the second one of the series about performing HTTP requests with python, we focused
on the use of the external requests library, which lets us perform both simple and complex requests
in a few lines of code. Want to know more about it? The official documentation is just one click away!


