I am writing a benchmarking tool from scratch in Python.
However, I can't match the performance of other benchmarking tools like wrk or wrk2. Using wrk2 I can make 42k requests/s, while my code manages at most 2200 requests/s. I have tried multiple ways to parallelize the code execution, including multiprocessing and parallel computing libraries like Dask, but I can't get better performance. I understand that wrk and wrk2 are written in C, which may be one reason, but 42k vs 2200 still seems like a very large difference.
I have tried different numbers of workers and different values of `number_of_request`, but the performance does not change much.
I am trying to understand if I am really hitting the upper limit or I am doing something wrong. The server is running on localhost and written in Java Spring.
This is my code using multiprocessing:
    import time
    import multiprocessing
    from collections import Counter, defaultdict
    import requests
    # import multiprocessing as mp

    num_workers = multiprocessing.cpu_count()
    output = multiprocessing.Queue()

    def runner(number_of_request):
        output=""
        for i in range(number_of_request):
            try:
                output+=str(requests.get("http://127.0.0.1:8000/").text)
            except:
                pass
        # print(output)
        return output

    if __name__ == '__main__':
        number_of_request = 1000
        start = time.time()
        pool = multiprocessing.Pool(processes=num_workers)
        outputs = [pool.apply_async(runner, args = (number_of_request,)) for x in range(num_workers)]
        pool.close()
        pool.join()
        duration = time.time() - start
        req_s = (number_of_request*num_workers)/duration
        print("duration =", time.time() - start)
        print(req_s)
- Bit hard to compare two ways if we see only one of them. (Manuel, Apr 22, 2021 at 11:44)
- To anyone in the close vote queue: code that is working as expected but not performing well is a good question for Code Review; this question does not belong in the close vote queue. (pacmaninbw, Apr 22, 2021 at 11:52)
- Welcome to Code Review! I changed the title so that it describes what the code does, per site goals: "State what your code does in your title, not your main concerns about it." Feel free to edit and give it a different title if there is something more appropriate. (Sᴀᴍ Onᴇᴌᴀ, Apr 22, 2021 at 16:07)
1 Answer
Performance notwithstanding, there's some other cleanup that's worth doing:
- Capitalize `NUM_WORKERS`, since it's a global constant.
- Do not shadow your global `output` with a local variable of the same name; ideally, don't have a global `output` at all.
- After your `get()` and before your call to `.text`, you need to check whether the request succeeded, either via `.ok` and a log entry showing the reason, or via `.raise_for_status()`.
- Never use `try` / `except` / `pass`. This is the broadest and most dangerous form of silent exception-swallowing. If `runner` is executing and the user attempts to terminate the application with Ctrl+C, that will be ignored here, and all other error information becomes invisible to you. Consider at least `except Exception:` and logging the exception using the standard logging framework.
- Consider annotating `runner` as `def runner(number_of_requests: int) -> str`.
- Do not cast the result of `.text` to a string; it's already a string.
- Do not use `time.time()` here; instead use `time.perf_counter()` for a sufficiently short duration, or `time.monotonic()` otherwise.
- The parens in `(number_of_request*num_workers)/duration` are redundant.
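As a rough illustration, the points above could be combined along the following lines. This is only a sketch, not a definitive rewrite: the names `URL`, `fetch_text`, `timed` and the `fetch` parameter are hypothetical additions that do not appear in the original code, and `fetch` exists only so the loop can be exercised without a running server. The multiprocessing pool is left out to keep the example short.

```python
import logging
import time
from typing import Callable, Tuple

logger = logging.getLogger(__name__)

URL = "http://127.0.0.1:8000/"  # assumed target, taken from the question


def fetch_text(url: str) -> str:
    """Fetch one page with requests and fail loudly on HTTP errors."""
    # Third-party dependency; imported lazily (an assumption of this sketch)
    # so that the rest of the module can be used without it.
    import requests

    response = requests.get(url)
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    return response.text  # already a str, so no cast is needed


def runner(number_of_requests: int,
           fetch: Callable[[str], str] = fetch_text) -> str:
    """Issue requests one after another and return the concatenated bodies."""
    bodies = []
    for _ in range(number_of_requests):
        try:
            bodies.append(fetch(URL))
        except Exception:
            # Unlike a bare except, this does not catch KeyboardInterrupt,
            # so Ctrl+C still terminates the run. The failure is logged
            # with its traceback instead of being silently dropped.
            logger.exception("request to %s failed", URL)
    return "".join(bodies)


def timed(func: Callable[..., str], *args) -> Tuple[str, float]:
    """Run func(*args) and return its result with the elapsed seconds."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

With a server listening on the assumed URL, `timed(runner, 1000)` returns the concatenated bodies and the elapsed time, and the rate for a single worker is then `1000 / duration` with no parentheses needed. In the pooled version, the same `runner` can be passed to `pool.apply_async` as in the question.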