I'm using my crawler class in the following manner and I'm beginning to think it's bad practice:
crawler.py
import requests

class Crawler():
    def __init__(self, url):
        self.url = url

    def web_crawler(self):
        requests.get(self.url)
        return requests.text
main.py
for url in urls:
    crawler = Crawler(url)
    results = crawler.web_crawler()
Would it be better to move the url parameter out of Crawler's __init__ and into the web_crawler function? That way the class won't have to be reinitialized multiple times in main.py.
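For reference, the alternative described above might look roughly like the sketch below. It assumes the crawler is handed a session-like object (for instance a requests.Session) so that the instance has some state worth keeping; the `session` parameter is an assumption for illustration and is not part of the original class.

```python
class Crawler:
    """Crawler that keeps no per-URL state, so one instance serves every URL."""

    def __init__(self, session):
        # `session` is a hypothetical parameter: any object with a
        # get(url) method returning something with a .text attribute,
        # e.g. a requests.Session instance.
        self.session = session

    def web_crawler(self, url):
        # The URL arrives per call instead of being stored on the instance.
        response = self.session.get(url)
        return response.text
```

With this shape, main.py would build the crawler once before the loop, for example `crawler = Crawler(requests.Session())`, and call `crawler.web_crawler(url)` for each URL.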
2 Answers
As the Crawler class just has one method along with __init__, you can avoid a class altogether and write:
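import requests

def web_crawler(url):
    # Keep the response object; the text lives on it, not on the requests module
    response = requests.get(url)
    return response.text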
You now have to initialize exactly zero times, which removes the problem at its root:
for url in urls:
    results = web_crawler(url)
The code is also simplified, both in definition and usage.
You can also create a field named url, and use a getter and setter to obtain or change the value from outside the class.
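One way to read this suggestion in Python is a property, sketched below. The scheme check in the setter is an assumption added only to show why a setter might be worth having; without some such logic, a plain attribute (`crawler.url = new_url`) already does the same job.

```python
class Crawler:
    """Crawler whose target URL can be changed after construction."""

    def __init__(self, url):
        # Assigning through the property runs the same check as later updates.
        self.url = url

    @property
    def url(self):
        # Getter: return the currently configured URL.
        return self._url

    @url.setter
    def url(self, value):
        # Setter: the http/https check is an illustrative assumption,
        # not something the original code or answer requires.
        if not value.startswith(("http://", "https://")):
            raise ValueError("expected an http(s) URL, got %r" % (value,))
        self._url = value
```

A single instance could then be reused in the loop by assigning `crawler.url = url` before each call. Sharing one mutable instance across threads would need extra care.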
- This may be the start of a good review, but in its current form it doesn't provide much value. Would you care to expand a bit? – Feb 6, 2016 at 14:08
- Would've been a good idea but I don't think it's a good practice. This reminds me of Java. – Jonathan, Feb 6, 2016 at 14:46
- Well, I just noted that you can take the variable out and provide a mechanism for changing it; this way you will need to change a variable only. Also I must note that you may have problems with multithreading if it isn't implemented properly. – Planet_Earth, Feb 6, 2016 at 15:05
- return requests.text? Did you at least try to run this code?