Im trying to get more grip of Classes and also calling different classes to each other which will be more understandable when showing code before:
Utils.py
from proxymanager import ProxyManager # https://pypi.org/project/proxy-manager/
class ProxyServer:
def __init__(self):
self.proxy_manager_test = ProxyManager('./test.txt')
self.proxy_manager_test2 = ProxyManager('./test2.txt')
self.proxy_manager_test3 = ProxyManager('./test3.txt')
self.proxy_manager_test4 = ProxyManager('./test4.txt')
def hello(self):
return self.proxy_manager_test.random_proxy().get_dict()
def world(self):
return self.proxy_manager_test2.random_proxy().get_dict()
def hey(self):
return self.proxy_manager_test3.random_proxy().get_dict()
def yes(self):
return self.proxy_manager_test4.random_proxy().get_dict()
class Testing:
def GetData(self, value=None):
if value:
dataSaved = getattr(ProxyServer(), value)() # <--- Technically works but is it correct way to do it?
print(dataSaved)
Main.py
import random
import time
from .utils import Testing
scraper = Testing()
randomList = ["hello", "world", "hey", "yes"]
while True:
response = scraper.GetData(
value=random.choice(randomList)
)
time.sleep(10)
My concern is the dataSaved = getattr(ProxyServer(), value)()
which seem to work as expected but i'm not sure if its correct way to do it, I did get some review about it but the answer I got was that it will technically work but its wrong to do it this way and I want a second opinion from you guys, Is it actually wrong doing it this way?
And please, let me know if I am missing anything else here to provide missing information :)
-
\$\begingroup\$ Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers . Consider waiting till a full day has past and posting a follow-up question instead. \$\endgroup\$Mast– Mast ♦2020年11月24日 19:26:44 +00:00Commented Nov 24, 2020 at 19:26
-
1\$\begingroup\$ @Mast Woops im sorry! It is my fault, I was not able to find documenation about if it is ok to add additional code in my question (At the very bottom) based from Grajdeanu Alex. answer? \$\endgroup\$PythonNewbie– PythonNewbie2020年11月24日 19:29:33 +00:00Commented Nov 24, 2020 at 19:29
-
1\$\begingroup\$ No, consider asking a follow-up question as stated in my previous comment. Feel free to add a link to both this and the new question to the other one for bonus context. Don't touch the code. \$\endgroup\$Mast– Mast ♦2020年11月24日 19:33:35 +00:00Commented Nov 24, 2020 at 19:33
1 Answer 1
I think you've over-engineered the code a bit. If I were to write this code I wouldn't have used a class at all. Looking at what your code does I understand that:
- you have multiple proxies files
- you randomly choose one of them and perform whatever actions based on that
Why not just doing something along the following lines:
import random
import time
from proxymanager import ProxyManager
PROXY_FILES = (
'./test.txt',
'./test2.txt',
'./test3.txt',
'./test4.txt',
)
DELAY = 10
def get_random_proxy():
"""
Return a random proxy dict from a random file.
"""
random_file = random.choice(PROXY_FILES)
return ProxyManager(random_file).random_proxy().get_dict()
def main():
while True:
proxy = get_random_proxy()
print(f'Do something with {proxy["http"]}\n')
time.sleep(DELAY)
if __name__ == '__main__':
main()
This approach will save you a lot of development time since you:
- won't have to add a new method to your
ProxyServer
each time you have a new proxy file. - won't have to keep track of
random_list
(which is not quite random since it has to match the attributes of the class) - you won't need a
Testing()
class anymore
Notice that with this version is really easy to figure out what you're doing.
Recommendations (PEP8):
- use appropriate naming for your variables (snake_case naming convention for methods/functions, UPPER_CASE naming convention for constants);
- use docstrings to help your readers understand the code;
If you're really tied to use classes, I'd mix your solution with mine and do something along these lines:
import random
import time
import requests
from proxymanager import ProxyManager
PROXY_FILES = (
'./test.txt',
'./test2.txt',
'./test3.txt',
'./test4.txt',
)
DELAY = 10
class ProxyServer:
def __init__(self, random_proxy_file):
self.proxy_manager = ProxyManager(random_proxy_file)
def get_proxy(self):
"""
Return a random proxy dict.
"""
return self.proxy_manager.random_proxy().get_dict()
class Scrapper:
def __init__(self, base_url):
self.base_url = base_url
@staticmethod
def _get_random_proxy():
random_proxy_file = random.choice(PROXY_FILES)
return ProxyServer(random_proxy_file).get_proxy()
def get_url_data(self):
proxy = self._get_random_proxy()
data = requests.get(self.base_url, proxies={'http': proxy['http']})
return data
def do_something_with_data(self, data):
return data
def main():
scrapper = Scrapper('http://scrapeme.not/')
while True:
data = scrapper.get_url_data()
scrapper.do_something_with_data(data)
time.sleep(DELAY)
if __name__ == '__main__':
main()
But notice how this has become a lot more complex than it should've been. Some advantages that this has over your solution:
- it uses better naming;
- it avoids creating multiple
ProxyManager()
instances; - it avoids mapping the
getattr()
to a value from a list you'd have to maintain all the time as complexity of the scrapper increases;
And to answer your question: no, it's not wrong to use getattr()
like that but in this specific case I don't think it's necessary due to the above listed reasons.
-
\$\begingroup\$ Hello! This is very good review and I do really like it! There is so much I can easily improve with this code review and I do apprecaite it! There is one missing part that I might have needed to more explain, the reason why I had different functions for proxies is that my plan later on is that when I want to call a specific proxy, in my case etc: proxy "world" then I want to specific use the
./test2.txt
and in our case you are using a random function. For now I had a randomrandomList = ["hello", "world", "hey", "yes"]
for this part but later on I will instead call a specific one. \$\endgroup\$PythonNewbie– PythonNewbie2020年11月24日 17:39:08 +00:00Commented Nov 24, 2020 at 17:39 -
\$\begingroup\$ And here I might need in your case maybe a small change if it doesn't take up too much time for you! But the choice of calling a specific proxy is much needed and I would like to call the specific proxy when I do it later on. \$\endgroup\$PythonNewbie– PythonNewbie2020年11月24日 17:39:59 +00:00Commented Nov 24, 2020 at 17:39
-
\$\begingroup\$ Would you be able to put PROXY_FILES into a dict and then you can just do something like:
PROXY_FILES [proxyType].random_proxy().get_dict()
where proxyType can be etc:hello
? \$\endgroup\$PythonNewbie– PythonNewbie2020年11月24日 19:58:34 +00:00Commented Nov 24, 2020 at 19:58