1
\$\begingroup\$

I am making a program that parses profiles from a website. Some profiles are "public" and others are "private" with limited information. Public profiles contain a lot more data and have ~15 more attributes.

Should I have one Profile class and set all unobtainable attributes to None? Or should I have 2 different classes and simply exclude unobtainable attributes if a profile is private?

Option #1

import requests
from bs4 import BeautifulSoup
from dataclasses import dataclass
@dataclass
class Profile:
 url: str 
 is_private: bool
 summary: str = None
 level: int = None
class Parser:
 def __init__(self):
 self.session = requests.Session()
 def get_profile(self, url):
 resp = self.session.get(url)
 soup = BeautifulSoup(resp.text, "html.parser")
 
 profile_is_private = bool(soup.find("div", {"class": "profile_private_info"}))
 if profile_is_private:
 return Profile(url=url, is_private=profile_is_private)
 summary=soup.find("div", {"class": "profile_summary"}).text,
 level=int(soup.find("span", {"class": "friendPlayerLevelNum"}).text)
 return Profile(url=url, is_private=profile_is_private, summary=summary, level=level)
parser = Parser()
public_profile = parser.get_profile("https://steamcommunity.com/id/afarnsworth")
print(public_profile)
private_profile = parser.get_profile("https://steamcommunity.com/id/private")
print(private_profile)

Option #2

import requests
from bs4 import BeautifulSoup
from dataclasses import dataclass
@dataclass
class BaseProfile:
 url: str 
 is_private: bool
@dataclass
class ExtendedProfile(BaseProfile):
 summary: str = None
 level: int = None
class Parser:
 def __init__(self):
 self.session = requests.Session()
 def get_profile(self, url):
 resp = self.session.get(url)
 soup = BeautifulSoup(resp.text, "html.parser")
 
 profile_is_private = bool(soup.find("div", {"class": "profile_private_info"}))
 if profile_is_private:
 return BaseProfile(url=url, is_private=profile_is_private)
 summary=soup.find("div", {"class": "profile_summary"}).text,
 level=int(soup.find("span", class_="friendPlayerLevelNum").text)
 return ExtendedProfile(url=url, is_private=profile_is_private, summary=summary, level=level)
parser = Parser()
public_profile = parser.get_profile("https://steamcommunity.com/id/afarnsworth")
print(public_profile)
private_profile = parser.get_profile("https://steamcommunity.com/id/private")
print(private_profile)

Both options seem to have downsides.

In option #1, the function returns a Profile object that might have over 15 attributes set to None because the profile was private and they couldn't be obtained.

In option #2, the function returns an entirely different object each time. So the user would have to manually check which object the function returned--either BaseProfile or ExtendedProfile

Which is better practice for a public API? Or are there better alternatives for these kind of designs.

asked Feb 13, 2023 at 20:53
\$\endgroup\$
7
  • \$\begingroup\$ Use #2. Your downside is not actually true. I'll explain in an answer in a bit. \$\endgroup\$ Commented Feb 13, 2023 at 23:16
  • \$\begingroup\$ But you should elaborate - what is the user doing with these objects that varies between the two types? \$\endgroup\$ Commented Feb 13, 2023 at 23:17
  • \$\begingroup\$ Your is_private is based on pseudocode. We cannot meaningfully answer questions that have pseudocode in them. \$\endgroup\$ Commented Feb 13, 2023 at 23:19
  • \$\begingroup\$ @Reinderien Nothing's different functionality wise. I'm using the dataclasses as a method of storing profile data. I was initially going to give the profile class functions like get_friends and get_comments but instead decided to separate the profile object from the functions. Also yeah I was going to include the entire code but just tried to make it less verbose. The is_private just parses the HTML and checks the profile's privacy status and stores it in a bool. \$\endgroup\$ Commented Feb 13, 2023 at 23:24
  • \$\begingroup\$ Can you include that, please? \$\endgroup\$ Commented Feb 14, 2023 at 0:41

1 Answer 1

1
\$\begingroup\$

Definitely avoid #1. Something something billion dollar mistake: if you don't need nulls, eliminate them.

Option #2 is okay-ish. Why are summary and level still nullable? First, as you have them, their typehint is incorrect because it would be Optional[str] instead; but I don't see how those would ever be None (that's the whole reason you're using #2).

Don't bool(soup.find). Explicitly write soup.find() is not None.

get_profile should be hinted as get_profile(self, url: str) -> Union[BaseProfile, ExtendedProfile] - or use a pipe if you're fancy.

answered Feb 14, 2023 at 1:39
\$\endgroup\$
2
  • \$\begingroup\$ Thank you! The only thing is that some attributes in ExtendedProfile might not be always present. For example, maybe a user has an empty summary. But I suppose I could just do an empty string then. \$\endgroup\$ Commented Feb 14, 2023 at 1:53
  • \$\begingroup\$ For that, Optionals with Nones are probably unavoidable \$\endgroup\$ Commented Feb 14, 2023 at 1:59

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.