
Below is a Python implementation of the quicksort algorithm using parallelism. It takes about 1 second for every 10 items in the list, which is hilariously unacceptable. Why is it so slow?

from multiprocessing import Pipe, Process

def quicksort(lyst, connection=None):
    if len(lyst) > 1:
        # Partition around the last element.
        pivot = lyst.pop(len(lyst) - 1)
        wall = 0
        for i in range(len(lyst)):
            if lyst[i] <= pivot:
                lyst[wall], lyst[i] = lyst[i], lyst[wall]
                wall += 1
        # Sort each partition in its own process and collect the results.
        receiveLeft, sendLeft = Pipe()
        receiveRight, sendRight = Pipe()
        Process(target=quicksort, args=(lyst[:wall], sendLeft)).start()
        Process(target=quicksort, args=(lyst[wall:], sendRight)).start()
        lyst = receiveLeft.recv() + [pivot] + receiveRight.recv()
    if connection:
        connection.send(lyst)
        connection.close()
    return lyst

if __name__ == '__main__':
    quicksort([8, 4, 6, 9, 1, 3, 10, 2, 7, 5])

EDIT: Thanks for the responses. As it turns out, switching to threads and limiting how many of them I was spawning sped things up. However, my sequential version of the algorithm still performed better.

asked Sep 11, 2016 at 15:51
  • Please consider adding your imports so your code will be easier to review, and consider adding example usage: how the function is called on example data and its result. Commented Sep 11, 2016 at 16:37

2 Answers


The problem seems to be that you are not working with threads (as you should), but with entire processes, which are far more expensive to create. Refer to this page for details on how to spawn threads.

Also, as a side note, it makes no sense to keep spawning new threads over and over again, since spawning a thread can easily take a couple of milliseconds, which would obliterate any performance gain. Instead, let the caller pass the maximum number of threads allowed, and never spawn more than that. Furthermore, it would be reasonable to skip spawning a new thread when the range it receives to sort is too small.
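For illustration, here is a minimal sketch of that approach (my own, not code from this answer): it partitions in place, hands one half to a new thread only while a depth budget remains, and falls back to plain recursion for small ranges. The names threaded_quicksort, max_depth and cutoff are made up for this example.

    import threading

    def threaded_quicksort(lyst, lo=0, hi=None, max_depth=2, cutoff=1000):
        if hi is None:
            hi = len(lyst) - 1
        if lo >= hi:
            return
        # Partition around the last element (same scheme as the question).
        pivot = lyst[hi]
        wall = lo
        for i in range(lo, hi):
            if lyst[i] <= pivot:
                lyst[wall], lyst[i] = lyst[i], lyst[wall]
                wall += 1
        lyst[wall], lyst[hi] = lyst[hi], lyst[wall]

        if max_depth <= 0 or hi - lo < cutoff:
            # Thread budget exhausted or range too small: recurse sequentially.
            threaded_quicksort(lyst, lo, wall - 1, 0, cutoff)
            threaded_quicksort(lyst, wall + 1, hi, 0, cutoff)
        else:
            # Sort one half in a new thread, the other half in this thread.
            left = threading.Thread(
                target=threaded_quicksort,
                args=(lyst, lo, wall - 1, max_depth - 1, cutoff))
            left.start()
            threaded_quicksort(lyst, wall + 1, hi, max_depth - 1, cutoff)
            left.join()

    if __name__ == '__main__':
        import random
        data = [random.randint(0, 10**6) for _ in range(10**5)]
        threaded_quicksort(data)
        print(data == sorted(data))  # True

With max_depth=2 this spawns at most a handful of threads in total, instead of one per recursive call.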

Hope that helps.

answered Sep 11, 2016 at 17:27
  • Is this really the accepted answer? There is no real parallel execution with threads in Python. Keep your processes, but be reasonable about how many of them you create. Alex Commented May 20, 2018 at 16:49

Spawning new processes is expensive (the cost also varies greatly between operating systems). One optimization would be to spawn the processes up front and keep them ready to accept tasks; then you only pay the overhead of passing the data and results between them.
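As a rough sketch of that "spawn once, reuse" idea (my own illustration, not code from this answer), a multiprocessing.Pool can be created a single time, each worker can sort one chunk, and the parent can merge the already-sorted chunks; the names pool_sort and workers are made up here.

    import multiprocessing
    from heapq import merge

    def pool_sort(lyst, workers=4):
        # Cut the list into roughly equal chunks, one per worker.
        size = (len(lyst) + workers - 1) // workers
        chunks = [lyst[i:i + size] for i in range(0, len(lyst), size)]
        # The pool's worker processes are created once and reused for all chunks.
        with multiprocessing.Pool(processes=workers) as pool:
            sorted_chunks = pool.map(sorted, chunks)
        # k-way merge of the sorted chunks in the parent process.
        return list(merge(*sorted_chunks))

    if __name__ == '__main__':
        import random
        data = [random.random() for _ in range(10**5)]
        print(pool_sort(data) == sorted(data))  # True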

A much better alternative is to use thread-based parallelism. However, Python has some issues in this regard, as explained here: https://softwareengineering.stackexchange.com/questions/186889/why-was-python-written-with-the-gil
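To make that caveat concrete, here is a small, hypothetical timing harness (my own sketch): because only one thread executes Python bytecode at a time under the GIL, the threaded run of this CPU-bound work is not expected to beat the sequential run.

    import time
    from concurrent.futures import ThreadPoolExecutor

    def cpu_bound(chunk):
        # Purely CPU-bound work; it holds the GIL while it runs.
        return sorted(chunk)

    if __name__ == '__main__':
        import random
        data = [random.random() for _ in range(2 * 10**6)]
        chunks = [data[i::4] for i in range(4)]

        start = time.perf_counter()
        for c in chunks:
            cpu_bound(c)
        print('sequential:', time.perf_counter() - start)

        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=4) as ex:
            list(ex.map(cpu_bound, chunks))
        print('threaded:  ', time.perf_counter() - start)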

answered Sep 11, 2016 at 18:06