Here's the simplest multiprocessing example I've found so far:
import multiprocessing

def calculate(value):
    return value * 10

if __name__ == '__main__':
    pool = multiprocessing.Pool(None)
    tasks = range(10000)
    results = []
    r = pool.map_async(calculate, tasks, callback=results.append)
    r.wait()  # Wait on the results
    print results
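Note that map_async's callback is called once with the entire list of results, so results above ends up as a list containing one list. If you only want the values, r.get() returns them directly; a minimal sketch:

import multiprocessing

def calculate(value):
    return value * 10

if __name__ == '__main__':
    pool = multiprocessing.Pool(None)
    r = pool.map_async(calculate, range(10000))
    print r.get()  # the full list of results, no callback needed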
I have two lists and one index to access the elements in each list. The ith position on the first list is related to the ith position on the second. I didn't use a dict because the lists are ordered.
What I was doing was something like:
for i in xrange(len(first_list)):
    # do something with first_list[i] and second_list[i]
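For reference, the same pairing can be written without the index, since Python 2's zip returns a list of tuples (the sample data here is hypothetical):

first_list = ['a', 'b', 'c']   # hypothetical sample data
second_list = [1, 2, 3]
for first, second in zip(first_list, second_list):
    print first, second        # do something with the pair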
So, using that example, I think I can make a function sort of like this:
# global variables first_list, second_list, i
first_list, second_list, i = None, None, 0

# initialize the lists
...

# have a function that does what the loop body did, and increments i inside it
def function():
    global i   # needed to rebind the module-level counter
    # do stuff with first_list[i] and second_list[i]
    i += 1
But that makes i a shared resource, and I'm not sure that's safe. It also seems to me that my design doesn't lend itself well to this multiprocessing approach, but I'm not sure how to fix it.
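A shape that avoids the shared counter entirely is to pass the index in as the task argument, which is what the working example below does. (With multiprocessing, each worker process gets its own copy of the globals anyway, so an i += 1 in one worker would never be seen by the others.) A minimal sketch, with process_pair as a hypothetical stand-in for the loop body:

import multiprocessing

first_list = [1, 2, 3]       # hypothetical sample data
second_list = [10, 20, 30]

def process_pair(i):
    # do the loop-body work for index i; a placeholder computation here
    return first_list[i] * second_list[i]

if __name__ == '__main__':
    pool = multiprocessing.Pool(None)
    print pool.map(process_pair, range(len(first_list)))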
Here's a working example of what I wanted (edit it to point at an image you actually want to use):
import multiprocessing
import subprocess, shlex

links = ['http://www.example.com/image.jpg'] * 10  # don't use this URL
names = [str(i) + '.jpg' for i in range(10)]

def download(i):
    command = 'wget -O ' + names[i] + ' ' + links[i]
    print command
    args = shlex.split(command)
    return subprocess.call(args, shell=False)

if __name__ == '__main__':
    pool = multiprocessing.Pool(None)
    tasks = range(10)
    r = pool.map_async(download, tasks)
    r.wait()  # Wait on the results
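One knob worth knowing: Pool(None) creates as many worker processes as multiprocessing.cpu_count(), so that's the maximum number of wget calls running at once. Passing an explicit number bounds the concurrency; a sketch assuming you want at most four downloads in flight:

if __name__ == '__main__':
    pool = multiprocessing.Pool(4)  # at most 4 concurrent downloads
    r = pool.map_async(download, range(10))
    r.wait()  # Wait on the results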
1 Answer
First off, it might be beneficial to make one list of tuples, for example:
new_list[i] = (first_list[i], second_list[i])
That way, as you change i, you ensure that you are always operating on the same items from first_list and second_list.
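In Python 2, zip builds that list of pairs in one step:

new_list = zip(first_list, second_list)  # [(first_list[0], second_list[0]), ...]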
Secondly, assuming there are no dependencies between the i and i-1 entries in your lists, you can use your function to operate on one given i value, and hand each i value to a pool worker. Consider:
indices = range(len(new_list))
results = []
r = pool.map_async(your_function, indices, callback=results.append)
r.wait() # Wait on the results
This should give you what you want.
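Putting the pieces together, you can equally map over the pairs themselves instead of the indices; a sketch, with your_function standing in for whatever the loop body does and hypothetical sample data:

import multiprocessing

first_list = [1, 2, 3]       # hypothetical sample data
second_list = [10, 20, 30]
new_list = zip(first_list, second_list)

def your_function(pair):
    first, second = pair
    # do something with first and second; placeholder computation here
    return first * second

if __name__ == '__main__':
    pool = multiprocessing.Pool(None)
    results = pool.map(your_function, new_list)
    print results  # [10, 40, 90]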
4 Comments

- I'm using subprocess.call, but then each thread ends up waiting for it to complete, and it means the multiprocessing approach doesn't work. If I use subprocess.Popen, it just spawns too many processes. Is there a way around that?
- subprocess.call and subprocess.Popen should function the same. Can you edit your question with the exact code?
- subprocess.call waits for the process to finish and subprocess.Popen just spawns multiple processes. At least that's been my experience with wget. docs.python.org/library/subprocess.html I guess they're the same if you do a Popen and wait(), though.
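(For reference, that last point is right: subprocess.call is essentially a Popen followed by wait(). A minimal illustration, reusing the question's placeholder URL:)

import subprocess

# These two are equivalent ways to run wget and block until it exits:
ret1 = subprocess.call(['wget', '-O', '0.jpg', 'http://www.example.com/image.jpg'])

p = subprocess.Popen(['wget', '-O', '0.jpg', 'http://www.example.com/image.jpg'])
ret2 = p.wait()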
- Instead of shlex.split(), just do: args = ['wget', '-O', names[i], links[i]].
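Applied to the example above, download() rewritten that way would look like this; the list form sidesteps quoting problems in filenames or URLs entirely:

def download(i):
    # Build the argument list directly instead of splitting a string.
    args = ['wget', '-O', names[i], links[i]]
    return subprocess.call(args)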