
Here's the simplest multiprocessing example I've found so far:

import multiprocessing

def calculate(value):
    return value * 10

if __name__ == '__main__':
    pool = multiprocessing.Pool(None)  # None = one worker process per CPU
    tasks = range(10000)
    results = []
    # note: the callback receives the complete result list in a single call,
    # so results ends up as [[0, 10, 20, ...]]
    r = pool.map_async(calculate, tasks, callback=results.append)
    r.wait()  # wait for all the workers to finish
    print(results)

I have two lists and one index to access the elements in each list. The ith position in the first list is related to the ith position in the second. I didn't use a dict because the lists are ordered.

What I was doing was something like:

for i in xrange(len(first_list)):
    # do something with first_list[i] and second_list[i]

So, using that example, I think I can make a function sort of like this:

# global variables: first_list, second_list, i
first_list, second_list, i = None, None, 0

# initialize the lists
...

# a function that does what the loop body did, incrementing i inside it
def function():
    global i  # required, since i is reassigned here
    # do stuff with first_list[i] and second_list[i]
    i += 1

But that makes i a shared resource, and I'm not sure that would be safe. It also seems to me that my design doesn't lend itself well to this multiprocessing approach, but I'm not sure how to fix it.
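
As an aside: with multiprocessing, each worker process gets its own copy of module-level globals, so an incremented i would not even be visible across workers. Safely sharing a counter takes something like multiprocessing.Value guarded by its lock. A minimal sketch, with made-up worker and counter names:

import multiprocessing

def init(shared_counter):
    # make the shared counter visible inside each worker process
    global counter
    counter = shared_counter

def increment(_):
    with counter.get_lock():  # serialize increments across processes
        counter.value += 1

if __name__ == '__main__':
    counter = multiprocessing.Value('i', 0)  # 'i' = C int, initial value 0
    pool = multiprocessing.Pool(None, initializer=init, initargs=(counter,))
    pool.map(increment, range(100))
    pool.close()
    pool.join()
    print(counter.value)  # prints 100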

Here's a working example of what I wanted (edit in an image URL you actually want to use):

import multiprocessing
import subprocess, shlex

links = ['http://www.example.com/image.jpg'] * 10  # don't use this URL
names = [str(i) + '.jpg' for i in range(10)]

def download(i):
    command = 'wget -O ' + names[i] + ' ' + links[i]
    print(command)
    args = shlex.split(command)  # split the command string into an argument list
    return subprocess.call(args, shell=False)

if __name__ == '__main__':
    pool = multiprocessing.Pool(None)  # None = one worker process per CPU
    tasks = range(10)
    r = pool.map_async(download, tasks)
    r.wait()  # wait for all the downloads to finish
asked Sep 19, 2011 at 3:51
  • You don't need shlex.split(); just do args = ['wget', '-O', names[i], links[i]]. Commented Sep 19, 2011 at 8:04
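
Following that suggestion, the download function from the question could be written without shlex at all (a sketch; names and links are the lists defined above):

def download(i):
    # build the argument list directly instead of parsing a command string
    args = ['wget', '-O', names[i], links[i]]
    return subprocess.call(args)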

1 Answer


First off, it might be beneficial to make one list of tuples, for example

new_list[i] = (first_list[i], second_list[i])

That way, as you change i, you ensure that you are always operating on the same items from first_list and second_list.
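
Building that list is a one-liner with zip; a quick sketch with placeholder contents:

first_list = ['a', 'b', 'c']
second_list = [1, 2, 3]

# zip pairs up corresponding elements: [('a', 1), ('b', 2), ('c', 3)]
new_list = list(zip(first_list, second_list))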

Secondly, assuming there are no relations between the i and i-1 entries in your lists, you can use your function to operate on one given i value, and hand each i value off to a pool worker. Consider:

pool = multiprocessing.Pool(None)  # one worker process per CPU
indices = range(len(new_list))
results = []
r = pool.map_async(your_function, indices, callback=results.append)
r.wait()  # Wait on the results

This should give you what you want.
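
Putting the pieces together, here is a minimal end-to-end sketch. The list contents and the body of your_function are placeholders, and mapping over the pairs directly (rather than over indices) is a small variation that avoids reaching into new_list through a global:

import multiprocessing

def your_function(pair):
    first_item, second_item = pair
    # do something with the paired items; here they are just combined
    return '%s-%s' % (first_item, second_item)

if __name__ == '__main__':
    first_list = ['a', 'b', 'c']
    second_list = [1, 2, 3]
    new_list = list(zip(first_list, second_list))

    pool = multiprocessing.Pool(None)
    results = pool.map(your_function, new_list)  # blocks until all workers finish
    pool.close()
    pool.join()
    print(results)  # ['a-1', 'b-2', 'c-3']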

answered Sep 19, 2011 at 4:15

4 Comments

I tried doing it, but it seems another complication prevents it from working. Inside the function I call subprocess.call, but then each worker ends up waiting for it to complete, which means the multiprocessing approach doesn't help. If I use subprocess.Popen, it just spawns too many processes. Is there a way around that?
subprocess.call and subprocess.Popen should function the same. Can you edit your question with the exact code?
subprocess.call waits for the process to finish, and subprocess.Popen just spawns the process and returns immediately. At least that's been my experience with wget. docs.python.org/library/subprocess.html I guess they're the same if you do a Popen and wait(), though.
I made a mistake elsewhere; never mind. Everything works as expected. I'll add an example to the OP.
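
For reference, these two forms behave the same way (a sketch; echo stands in for any command):

import subprocess

# subprocess.call is a convenience wrapper: spawn the process, then wait for it
ret = subprocess.call(['echo', 'hello'])

# the equivalent spelled out with Popen
proc = subprocess.Popen(['echo', 'hello'])
ret = proc.wait()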
