I'm trying to make my first Python package as a learning experience. There's a lot of things that I suspect I am doing poorly, but this post is specifically about my HttpRequest class. I made this class so that I can use retries and persistent session.
class HttpRequest:
__session = None
@staticmethod
def __init__session__():
if HttpRequest.__session is not None:
return
stderr.write("Initializing HTTPS Session\n")
HttpRequest.__session = HttpSession()
retries = Retry(total=3,
backoff_factor=1,
status_forcelist=[], # 429
allowed_methods=False)
HttpRequest.__session.mount("https://", HTTPAdapter(max_retries=retries))
@staticmethod
def get(url):
print(f"Querying {url}")
HttpRequest.__init__session__()
return HttpRequest.__session.get(url)
I use this class basically everywhere in the package, so I want to make sure I'm doing this well.
1 Answer 1
I tend to be pretty happy with requests
defaults,
but I imagine you need specific retry behavior,
so I won't comment on that.
__session = None
That's just weird.
Please don't do that -- name mangling is seldom helpful.
Prefer a single leading _
underscore to mark it private.
Rather than naming it _init_session
(a perfectly nice and helpful name),
you might prefer _get_session
.
I'm just looking at how it's used, is all.
Currently it is invoked for Singleton side effects,
and caller must carefully remember to call it.
When designing an API we strive for more than just clear names. We also want it to be easy to use, and hard to accidentally misuse. So a getter feels more natural, in this case.
if HttpRequest.__session is not None:
Feel free to abbreviate that to just:
if HttpRequest.__session:
@staticmethod
def get(url):
print(f"Querying {url}")
HttpRequest.__init__session__()
return HttpRequest.__session.get(url)
Both methods are declared static, yet they immediately dive into class attributes. Doesn't seem like a good match. Prefer:
@classmethod
def get(cls, url):
...
cls._init_session()
return cls._session.get(url)
#
# or maybe just:
# return cls._get_session().get(url)
Also, instead of print
, consider using a
logger.
That way callers can dial verbosity up and down as they wish.
Plus the logged timestamps can come in very handy
when diagnosing why an overnight run suddenly slowed down.
-
\$\begingroup\$ Thanks, this is just what I was looking for. I meant to specifically ask about the use of class vs. static methods and persistent session, but you addressed both topics anyway! \$\endgroup\$JTB– JTB2023年03月12日 18:48:11 +00:00Commented Mar 12, 2023 at 18:48
-
\$\begingroup\$ Oh, my! I didn't even notice this one:
stderr.write("Initializing HTTPS Session\n")
. Definitely a logger would be more friendly, so I don't have to remember to redirect file descriptor 2. Much better that wegrep
some favorite message out of a combined log stream. \$\endgroup\$J_H– J_H2023年03月12日 18:50:28 +00:00Commented Mar 12, 2023 at 18:50 -
-
1\$\begingroup\$ Certainly
print()
has its place. If you remember to include timestamps / elapsed times in your crawl messages then sure, go for it. My specific objection tostderr
is it just seems inconvenient to the caller. E.g. if I make this a cron job I get an email unless I2>&1
or maybe2> err.log
. The item being logged isn't an error, it is totally routine and expected. But whatever, I'm sure it works for your use case, hang onto it if you like it. Me, personally? I find value in logging "prefix1: %s" and "prefix2: %s" and then grep'ing out a certain prefix during an investigation. \$\endgroup\$J_H– J_H2023年03月12日 19:27:04 +00:00Commented Mar 12, 2023 at 19:27