I have some code that is intended to be a package (so that it can be accessed via import cachejson
).
Here is the code for the package:
The update variable is whether the user wants to update the file or retrieve the older file? (True means use the file should be overwritten, false means it should be used.)
import requests
import json
import os
folder_name = 'json_cache'
class cj(object):
def __init__(self, url, update=False):
self.url = url
self.filename = self.make_filename() + '.json'
self.update = update
self.make_cache_directory()
def download(self):
return requests.get(URL).json()
def save(self):
# print('{}/{}'.format(folder_name, self.filename))
with open('{}/{}'.format(folder_name, self.filename), 'w') as out:
json.dump(self.raw, out)
def make_cache_directory(self):
try:
os.makedirs(folder_name)
print('new cache folder...')
except FileExistsError as e:
pass
def make_filename(self): # make the filename for the saved json file
new = self.url.replace('/', '=')
new = new.replace(':', '-')
return new
def file_exists(self): # see if the file already exists
return os.path.isfile('{}/{}'.format(folder_name, self.filename))
def load(self): # json from file to python obj
with open('{}/{}'.format(folder_name, self.filename)) as file:
return json.load(file)
def json(self):
if self.file_exists() and self.update == False:
print('file exists...')
return self.load()
else:
self.raw = self.download()
self.save()
print('new file saved...')
return self.raw
Then the usage would be something like:
repos = cachejson.cj(APIURL).json()
How can I improve this code to make it a better package?
1 Answer 1
How can I improve this code to make it a better package?
Remove print
s
It may have some debug value to you but your print
calls will be the most annoying things for your future users. Few possibilities are there to remove it (I’m thinking about redirect_stdout
to /dev/null
for instance) but it add a lot of noise to a program. Besides you don't get such messages from other libraries: defaultdict
don't say when they create the missing entry or they get the already existing one; json.loads
don't print a success message; etc.
Add documentation
If you're going to release it for use by others, you’ll want to at least populate the help
of your module with meaninfull informations. Add docstrings at all levels: module, class and functions. Some of your comments should already be docstrings anyway.
Use a standard folder location
I can cd
anywhere on my filesystem and fire a Python console there. This will mean, each time, I’ll create a new 'json_cache'
folder and won't benefit from past downloads. Make your folder in a unique, centralized, location so any use will benefit from it. The ~/.cache/
folder seems a good fit.
Use a better name
Your class cj
is really, really badly named. Do not fear using the same name for a class/function than the name of the module. Look at socket.socket
, pprint.pprint
and others, it is worse to have a bad, simplified, name than to repeat yourself in such cases.
Other than that, I would change a few other things:
"""TODO: Module docstring"""
import os
import json
import requests
class cache_json(object):
"""Manage a JSON object through the cache.
Download the associated resource from the provided URL
when need be and retrieve the JSON from a cached file
if possible.
"""
CACHE_FOLDER = 'json_cache'
def __init__(self, url, update=False):
self.url = url
self.filename = '{}.json'.format(url.translate({
ord('/'): '=',
ord(':'): '-',
}))
self.update = update
self.make_cache_directory()
@property
def cache_folder(self):
"""Path to the cache folder"""
return os.path.join(
os.path.expanduser('~'),
'.cache', self.CACHE_FOLDER)
def download(self):
"""Perform the retrieval of the requested JSON data"""
return requests.get(self.url).json()
def save(self, raw):
"""Save the provided raw JSON data into the cached file"""
filename = os.path.join(self.cache_folder, self.filename)
with open(filename, 'w') as out:
json.dump(raw, out)
def load(self):
"""Retrieve the saved JSON data from the cached file"""
filename = os.path.join(self.cache_folder, self.filename)
with open(filename) as cached:
return json.load(cached)
def make_cache_directory(self):
"""Create the cache directory if it doesn't exist"""
os.makedirs(self.cache_folder, exist_ok=True)
@property
def file_exists(self):
"""Whether the cached file already exist"""
filename = os.path.join(self.cache_folder, self.filename)
return os.path.isfile(filename)
@property
def json(self):
"""The JSON data associated to the given URL.
Either read from the cache of fetched from the Internet.
"""
if not self.update and self.file_exists:
return self.load()
raw = self.download()
self.save(raw)
return raw
Note the use of property
: the expected usage becomes:
repos = cachejson.cache_json(APIURL).json
-
\$\begingroup\$ That class name is still annoying. It should be
class CacheJson
to follow PEP8 conventions. \$\endgroup\$Grajdeanu Alex– Grajdeanu Alex2017年07月03日 10:03:21 +00:00Commented Jul 3, 2017 at 10:03 -
\$\begingroup\$ Also, in
download
method it should beself.url
instead ofURL
. More, in__init__
there's a missing)
at the end ofself.filename
;P \$\endgroup\$Grajdeanu Alex– Grajdeanu Alex2017年07月03日 10:04:49 +00:00Commented Jul 3, 2017 at 10:04 -
\$\begingroup\$ @MrGrj PEP8 says " Class names should normally use the CapWords convention. The naming convention for functions may be used instead in cases where the interface is documented and used primarily as a callable.". The
socket.socket
class is an example. \$\endgroup\$301_Moved_Permanently– 301_Moved_Permanently2017年07月03日 10:06:01 +00:00Commented Jul 3, 2017 at 10:06 -
\$\begingroup\$ @MrGrj Right for the
URL
parameter, just got it from the OP's code which, apparently, was broken to begin with. \$\endgroup\$301_Moved_Permanently– 301_Moved_Permanently2017年07月03日 10:06:56 +00:00Commented Jul 3, 2017 at 10:06 -
\$\begingroup\$ Why use the
@property
annotation for those specific methods? \$\endgroup\$werdna3232– werdna32322017年07月03日 19:37:15 +00:00Commented Jul 3, 2017 at 19:37