
I'm sending a request for a set of images to one of my APIs. The API returns these images as JSON. Each object contains data about the resource together with a single property that holds the image encoded as Base64.

An example of the JSON being returned:

{
 "id": 548613,
 "filename": "00548613.png",
 "pictureTaken": "2020-03-30T11:38:21.003",
 "isVisible": true,
 "lotcode": 23,
 "company": "05",
 "concern": "46",
 "base64": "..."
}

The correct content of the Base64
The incorrectly parsed Base64

This is done with the Python 3 requests library. When I receive a successful response from the API, I attempt to decode the body as JSON using:

url = self.__url__(f"/rest/all/V1/products/{sku}/images")
headers = self.__headers__()
r = requests.get(url=url, headers=headers)
if r.status_code == 200:
    return r.json()
elif r.status_code == 404:
    return None
else:
    raise IOError(
        f"Error retrieving product '{sku}', got {r.status_code}: '{r.text}'")

Calling .json() results in corrupted Base64 content: some parts are missing and others are replaced with different characters. After seeing this post, I tried decoding the content manually using r.content.decode() with both the utf-8 and ascii options to see whether the encoding was the problem. Sadly, this didn't work. I know the response from the server is correct: it works with Postman, and calling print(r.content) shows a JSON document containing the valid Base64.

How would I go about deserializing the response from the API to get the valid Base64?

asked Jun 15, 2020 at 17:00
  • @Trenton I assume you mean the Base64. Sadly, I cannot share it because I do not own the serialized resources. Commented Jun 15, 2020 at 18:00
  • @Harjan Take a random image of a duck. Convert it to Base64. Put that Base64 in a response like the one you provided and see if the problem arises. If it does, post that data so we can try. Commented Jun 15, 2020 at 20:05
  • @Trenton I have added some Base64. It should be a 1024x1024 picture of a pink and white box when parsed correctly. Commented Jun 16, 2020 at 7:10

1 Answer

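import base64
import re
...
# Pull the value of the "base64" property out of the raw response bytes.
# [^"]* stops at the closing quote instead of running to the last quote in the body.
match = re.search(rb'"base64":\s*"(?P<base>[^"]*)"', r.content)
# The decoded payload is binary image data, so keep it as bytes rather than decoding it as UTF-8.
image_bytes = base64.b64decode(match.group("base"))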

Since you're saying "calling print(r.content) results in the valid Base64", it's just a matter of decoding the base64.
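As an alternative to the regex, here is a minimal sketch that parses the raw bytes with json.loads and decodes the field strictly. It assumes the body is a single JSON object shaped like the example in the question; the function name extract_image_bytes and the sample payload are made up for illustration, and with requests the input would be r.content.

```python
import base64
import binascii
import json


def extract_image_bytes(raw_body: bytes, field: str = "base64") -> bytes:
    """Parse a JSON body from raw bytes and decode its Base64 field."""
    # json.loads accepts bytes directly (UTF-8, UTF-16 or UTF-32),
    # so no manual .decode() step is needed.
    document = json.loads(raw_body)
    try:
        # validate=True raises on characters outside the Base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(document[field], validate=True)
    except binascii.Error as exc:
        raise ValueError(f"field {field!r} is not valid Base64: {exc}") from exc


# Stand-in payload (hypothetical); with requests this would be r.content.
sample = json.dumps({
    "id": 548613,
    "filename": "00548613.png",
    "base64": base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii"),
}).encode("utf-8")

image_bytes = extract_image_bytes(sample)
```

Strict validation matters here because the default b64decode silently skips characters it does not recognise, which can hide exactly the kind of corruption described in the question.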

answered Jun 15, 2020 at 17:19

7 Comments

Good suggestion, I think this might have worked if only Base64 were being returned. Calling this on my content results in the entire JSON response being decoded as Base64.
@Harjan Then it's just a matter of extracting the Base64 data from the text directly; see my answer for an example implementation.
I tried your edited solution, but both r.content and r.text give the same corrupted Base64. Extracting works, but decoding is not possible because the data still contains the illegal characters.
@Harjan Check the Content-Type and charset of the response. The encoding that requests assumes by default is probably not the one your API is using; set the appropriate value using r.encoding and retry. Have you tried reproducing this behaviour with urllib?
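To narrow down where the illegal characters mentioned in these comments come from, it may help to list them with their positions. The following is only a diagnostic sketch: find_illegal_characters and the corrupted sample string are hypothetical, and it checks against the standard Base64 alphabet (not the URL-safe variant).

```python
import string
from typing import List, Tuple

# Standard Base64 alphabet plus the padding character.
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")


def find_illegal_characters(b64text: str, limit: int = 10) -> List[Tuple[int, str]]:
    """Return up to `limit` (index, character) pairs outside the Base64 alphabet."""
    hits = []
    for index, char in enumerate(b64text):
        if char not in BASE64_ALPHABET:
            hits.append((index, char))
            if len(hits) >= limit:
                break
    return hits


# Made-up corrupted value containing a Unicode replacement character.
corrupted = "iVBORw0K\ufffdGgo=AAA"
report = find_illegal_characters(corrupted)
```

Running the same check on the value taken from r.json() and on the value extracted from r.content shows whether the two already differ before any Base64 decoding takes place.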