1
\$\begingroup\$

Im trying to get more grip of Classes and also calling different classes to each other which will be more understandable when showing code before:

Utils.py

from proxymanager import ProxyManager # https://pypi.org/project/proxy-manager/
class ProxyServer:
 def __init__(self):
 self.proxy_manager_test = ProxyManager('./test.txt')
 self.proxy_manager_test2 = ProxyManager('./test2.txt')
 self.proxy_manager_test3 = ProxyManager('./test3.txt')
 self.proxy_manager_test4 = ProxyManager('./test4.txt')
 def hello(self):
 return self.proxy_manager_test.random_proxy().get_dict()
 
 def world(self):
 return self.proxy_manager_test2.random_proxy().get_dict()
 
 def hey(self):
 return self.proxy_manager_test3.random_proxy().get_dict()
 
 def yes(self):
 return self.proxy_manager_test4.random_proxy().get_dict()
 
class Testing:
 
 def GetData(self, value=None):
 if value:
 dataSaved = getattr(ProxyServer(), value)() # <--- Technically works but is it correct way to do it?
 print(dataSaved)

Main.py


import random
import time
from .utils import Testing
scraper = Testing()
randomList = ["hello", "world", "hey", "yes"]
while True:
 response = scraper.GetData(
 value=random.choice(randomList)
 )
 time.sleep(10)

My concern is the dataSaved = getattr(ProxyServer(), value)() which seem to work as expected but i'm not sure if its correct way to do it, I did get some review about it but the answer I got was that it will technically work but its wrong to do it this way and I want a second opinion from you guys, Is it actually wrong doing it this way?

And please, let me know if I am missing anything else here to provide missing information :)

Mast
13.8k12 gold badges57 silver badges127 bronze badges
asked Nov 24, 2020 at 9:23
\$\endgroup\$
3
  • \$\begingroup\$ Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers . Consider waiting till a full day has past and posting a follow-up question instead. \$\endgroup\$ Commented Nov 24, 2020 at 19:26
  • 1
    \$\begingroup\$ @Mast Woops im sorry! It is my fault, I was not able to find documenation about if it is ok to add additional code in my question (At the very bottom) based from Grajdeanu Alex. answer? \$\endgroup\$ Commented Nov 24, 2020 at 19:29
  • 1
    \$\begingroup\$ No, consider asking a follow-up question as stated in my previous comment. Feel free to add a link to both this and the new question to the other one for bonus context. Don't touch the code. \$\endgroup\$ Commented Nov 24, 2020 at 19:33

1 Answer 1

3
\$\begingroup\$

I think you've over-engineered the code a bit. If I were to write this code I wouldn't have used a class at all. Looking at what your code does I understand that:

  • you have multiple proxies files
  • you randomly choose one of them and perform whatever actions based on that

Why not just doing something along the following lines:

import random
import time
from proxymanager import ProxyManager
PROXY_FILES = (
 './test.txt',
 './test2.txt',
 './test3.txt',
 './test4.txt',
)
DELAY = 10
def get_random_proxy():
 """
 Return a random proxy dict from a random file.
 """
 random_file = random.choice(PROXY_FILES)
 return ProxyManager(random_file).random_proxy().get_dict()
def main():
 while True:
 proxy = get_random_proxy()
 print(f'Do something with {proxy["http"]}\n')
 time.sleep(DELAY)
if __name__ == '__main__':
 main()

This approach will save you a lot of development time since you:

  • won't have to add a new method to your ProxyServer each time you have a new proxy file.
  • won't have to keep track of random_list (which is not quite random since it has to match the attributes of the class)
  • you won't need a Testing() class anymore

Notice that with this version is really easy to figure out what you're doing.

Recommendations (PEP8):

  • use appropriate naming for your variables (snake_case naming convention for methods/functions, UPPER_CASE naming convention for constants);
  • use docstrings to help your readers understand the code;

If you're really tied to use classes, I'd mix your solution with mine and do something along these lines:

import random
import time
import requests
from proxymanager import ProxyManager
PROXY_FILES = (
 './test.txt',
 './test2.txt',
 './test3.txt',
 './test4.txt',
)
DELAY = 10
class ProxyServer:
 def __init__(self, random_proxy_file):
 self.proxy_manager = ProxyManager(random_proxy_file)
 def get_proxy(self):
 """
 Return a random proxy dict.
 """
 return self.proxy_manager.random_proxy().get_dict()
class Scrapper:
 def __init__(self, base_url):
 self.base_url = base_url
 @staticmethod
 def _get_random_proxy():
 random_proxy_file = random.choice(PROXY_FILES)
 return ProxyServer(random_proxy_file).get_proxy()
 def get_url_data(self):
 proxy = self._get_random_proxy()
 data = requests.get(self.base_url, proxies={'http': proxy['http']})
 return data
 def do_something_with_data(self, data):
 return data
def main():
 scrapper = Scrapper('http://scrapeme.not/')
 
 while True:
 data = scrapper.get_url_data()
 scrapper.do_something_with_data(data)
 time.sleep(DELAY)
if __name__ == '__main__':
 main()

But notice how this has become a lot more complex than it should've been. Some advantages that this has over your solution:

  • it uses better naming;
  • it avoids creating multiple ProxyManager() instances;
  • it avoids mapping the getattr() to a value from a list you'd have to maintain all the time as complexity of the scrapper increases;

And to answer your question: no, it's not wrong to use getattr() like that but in this specific case I don't think it's necessary due to the above listed reasons.

answered Nov 24, 2020 at 15:36
\$\endgroup\$
3
  • \$\begingroup\$ Hello! This is very good review and I do really like it! There is so much I can easily improve with this code review and I do apprecaite it! There is one missing part that I might have needed to more explain, the reason why I had different functions for proxies is that my plan later on is that when I want to call a specific proxy, in my case etc: proxy "world" then I want to specific use the ./test2.txt and in our case you are using a random function. For now I had a random randomList = ["hello", "world", "hey", "yes"] for this part but later on I will instead call a specific one. \$\endgroup\$ Commented Nov 24, 2020 at 17:39
  • \$\begingroup\$ And here I might need in your case maybe a small change if it doesn't take up too much time for you! But the choice of calling a specific proxy is much needed and I would like to call the specific proxy when I do it later on. \$\endgroup\$ Commented Nov 24, 2020 at 17:39
  • \$\begingroup\$ Would you be able to put PROXY_FILES into a dict and then you can just do something like: PROXY_FILES [proxyType].random_proxy().get_dict() where proxyType can be etc: hello ? \$\endgroup\$ Commented Nov 24, 2020 at 19:58

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.