2
\$\begingroup\$

I am requesting a gzipped JSON file. First, I download the file:

import urllib.request

# urlretrieve does the same job as the deprecated URLopener().retrieve()
urllib.request.urlretrieve("https://xxxxx.auth0.com/1537150574", "file.gz")

After that, I read file.gz and parse the data:

import gzip
import json

# Read the whole compressed file, then decode and parse it
with gzip.GzipFile("file.gz", 'r') as fin:
    json_bytes = fin.read()
json_str = json_bytes.decode('utf-8')
data = json.loads(json_str)
print(data)

This code works well for me, but I would like a faster and more concise way to do it. Any suggestions?

200_success
asked Sep 17, 2018 at 3:15
\$\endgroup\$
2
  • You probably can't speed this code up since you can't stream from a .gz file, so you need to load the whole thing into memory anyway, and the code is perfectly readable. If you want, you could one-line the read part of your code, but why? Commented Sep 17, 2018 at 3:19
  • Hi @Turksarama, in this case I want to read all the data in file.gz. Could you help me fix it? The data in the file looks like this: {"email":"[email protected]","provider":"Username-Password-Authentication"} {"email":"[email protected]","provider":"Username-Password-Authentication"} Commented Sep 17, 2018 at 5:26
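
If the file really holds several JSON objects back to back, as in the sample in the comment above, a single `json.loads` call over the whole text will fail with an "Extra data" error. A minimal sketch of one way to handle that, assuming each object sits on its own line (the comment does not show the actual separators). The function name `parse_gzipped_json_lines` and the sample addresses are made up for illustration:

```python
import gzip
import json


def parse_gzipped_json_lines(gz_bytes):
    """Decompress gzip bytes and parse one JSON object per non-empty line.

    Assumption: the payload is newline-delimited JSON.
    """
    text = gzip.decompress(gz_bytes).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # Hypothetical sample built in memory; nothing is downloaded here.
    sample = gzip.compress(
        b'{"email":"a@example.com","provider":"Username-Password-Authentication"}\n'
        b'{"email":"b@example.com","provider":"Username-Password-Authentication"}\n'
    )
    for record in parse_gzipped_json_lines(sample):
        print(record["email"])
```

If the objects turn out to be separated by something other than newlines, this approach would need adjusting.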

1 Answer

2
\$\begingroup\$

Your bottleneck is probably that you write the file to disk first and then read it back (I/O). If the file fits in your machine's RAM, decompressing it in memory might be a faster option:

from gzip import decompress
from json import loads

from requests import get


def get_gzipped_json(url):
    # Download into memory, decompress, and parse without touching the disk
    return loads(decompress(get(url).content))


if __name__ == '__main__':
    print(get_gzipped_json("https://xxxxx.auth0.com/1537150574"))

Also note that I put the running code under an if __name__ == '__main__': guard, so the module can be imported without triggering the download.
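
If installing `requests` is not an option, the same in-memory idea can be sketched with the standard library alone. This is a rough equivalent rather than a drop-in replacement. The names `decode_gzipped_json` and `fetch_gzipped_json` and the 30-second timeout are assumptions made for illustration:

```python
import gzip
import json
import urllib.request


def decode_gzipped_json(raw):
    """Decompress gzip bytes held in memory and parse them as one JSON document."""
    return json.loads(gzip.decompress(raw))


def fetch_gzipped_json(url, timeout=30):
    """Download the response body into memory (no temporary file) and decode it."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return decode_gzipped_json(response.read())
```

Calling `fetch_gzipped_json("https://xxxxx.auth0.com/1537150574")` would then stand in for the download and read steps from the question. Keeping the decoding step in its own function also makes it easy to check without network access.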

answered Sep 17, 2018 at 8:56
\$\endgroup\$
2
  • Does this approach save the compressed file, or does it just retrieve chunks, decompress them, and process them? Commented Sep 19, 2023 at 19:08
  • The solution above writes to a buffer and then decompresses its content. It is not stream-decompressed. See @Turksarama's comment on the original question. Commented Sep 19, 2023 at 19:28
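
If holding the whole payload in memory is a concern, `gzip.GzipFile` can wrap a file-like object through its `fileobj` argument and decompress as it is read. A minimal sketch of that idea, assuming the data is newline-delimited JSON so that records can be parsed one at a time. The helper name `iter_json_lines` is hypothetical:

```python
import gzip
import io
import json


def iter_json_lines(fileobj):
    """Yield one parsed object per non-empty line, decompressing incrementally.

    Assumption: fileobj is a binary file-like object containing gzip data
    with one JSON document per line.
    """
    with gzip.GzipFile(fileobj=fileobj) as gz:
        for line in gz:
            if line.strip():
                yield json.loads(line)


if __name__ == "__main__":
    # In-memory stand-in for a network response or an open file.
    buffer = io.BytesIO(gzip.compress(b'{"n": 1}\n{"n": 2}\n'))
    for record in iter_json_lines(buffer):
        print(record)
```

The object returned by `urllib.request.urlopen(url)` is file-like and could likely be passed in place of the `io.BytesIO` buffer, though whether this is faster than decompressing in one go depends on the file size and the network.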
