How to have one script call and run another one concurrently in python?

Question 1

What I am trying to accomplish is to stream tweets from Twitter for an hour, write the list of tweets to a file, clean and run analysis on the most recent hour of tweets, and then repeat the process indefinitely.

The problem I am running into is that if I run the cleaning and analysis of the tweets in the same script that handles the streaming, either by hard-coding it or by importing the functionality from a module, the whole script waits until these procedures are complete and only then resumes streaming. Is there a way to call the cleaning and analysis module from within the streaming script so that they run concurrently and the streaming doesn't stop while the cleaning and analysis are happening?

I've tried to achieve this with subprocess.call('python cleaner.py', shell=True) and subprocess.Popen('python cleaner.py', shell=True), but I don't really know how to use these tools properly. Both attempts resulted in the streaming stopping, cleaner.py running, and the streaming then resuming.

Answer

Subprocess

You can use subprocess.Popen, as you tried, to run a different script concurrently:

import subprocess

the_other_process = subprocess.Popen(['python', 'cleaner.py'])

That line alone does what you want. What you don't want to do is:

the_other_process.communicate()
# or
the_other_process.wait()

Those calls block the current process until the other one finishes. That is very useful in other circumstances, but not here.

If you want to know whether the subprocess is finished (but not wait for it):

# poll() returns None while the process is still running, otherwise its exit code
result = the_other_process.poll()
if result is not None:
    print('the other process has finished and returned %s' % result)
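To see that Popen returns immediately, here is a minimal sketch. It assumes a stand-in child (a one-line Python command that sleeps for a second) in place of the real cleaner.py, and it uses sys.executable instead of 'python' so the child runs under the same interpreter. Both are illustrative choices, not part of the original setup.

```python
import subprocess
import sys
import time

# Stand-in for cleaner.py (hypothetical): a child that just sleeps for a second.
child_code = "import time; time.sleep(1)"

started = time.monotonic()
child = subprocess.Popen([sys.executable, "-c", child_code])
launch_seconds = time.monotonic() - started  # Popen returned without waiting

# The main script keeps working (here: counting) while the child runs.
iterations = 0
while child.poll() is None:
    iterations += 1
    time.sleep(0.1)

exit_code = child.poll()
print("launch took %.3fs, looped %d times, child exit code %s"
      % (launch_seconds, iterations, exit_code))
```

In the streaming script, the counting loop would be replaced by the code that keeps collecting tweets, with an occasional poll() to find out whether the previous cleaning run is done.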

Thread

Concurrency can also be achieved using threads. In that case, you are not running a new process, you are just splitting the current process into concurrent parts. Try this:

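import threading
import time

def function_to_be_executed_concurrently():
    for i in range(5):
        time.sleep(1)
        print('running in separate thread', i)

# Start the function in its own thread; start() returns immediately.
thread = threading.Thread(target=function_to_be_executed_concurrently)
thread.start()

# Meanwhile, the main thread keeps running.
for i in range(5):
    time.sleep(1)
    print('running in main thread', i)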

The above code should produce interleaved output: lines saying running in separate thread mixed with lines saying running in main thread.
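Applied to the streaming scenario, one possible arrangement is a worker thread that receives each finished batch through a queue while the main thread carries on. This is only a sketch: clean_batches, the fake tweets, and the "cleaning" step (strip and lowercase) are made up for illustration, and the None sentinel is just one common way to tell the worker to stop.

```python
import queue
import threading

def clean_batches(batches, results):
    """Worker: clean each batch handed over by the streaming loop."""
    while True:
        batch = batches.get()
        if batch is None:  # sentinel: no more batches will arrive
            break
        # Hypothetical "cleaning": strip whitespace and lowercase each tweet.
        results.append([tweet.strip().lower() for tweet in batch])

batches = queue.Queue()
results = []
worker = threading.Thread(target=clean_batches, args=(batches, results))
worker.start()

# Stand-in for the streaming loop: it hands over each finished batch
# and immediately carries on collecting the next one.
for hour in range(3):
    batch = ["  Tweet %d-%d " % (hour, n) for n in range(2)]
    batches.put(batch)

batches.put(None)  # tell the worker to stop
worker.join()      # waiting here only so the demo can print the result
print(results)
```

queue.Queue handles the locking internally, so the two threads never touch a half-written batch.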

Thread vs process

Using subprocess, you can run anything that could be run standalone from the shell. It does not have to be Python.
Using threading, you can run any function in a concurrent thread of execution.
Threads share the same memory, so it is easy to share data between them (although there are issues when synchronization is needed). With processes, sharing data can become a problem. If a lot of data has to be shared, subprocesses can be much slower.
Starting a new process is slower and consumes more resources than starting a thread.
Since threads run in the same process, they are bound to the same GIL, which means only one thread executes Python code at a time and most work effectively runs on one CPU core. If slow, CPU-intensive tasks need to be sped up, running them in separate processes may be faster.
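The point about shared memory and synchronization can be sketched as follows. The counter dictionary and the count_tweets function are hypothetical; the lock guards the read-modify-write on the shared value so that no update is lost.

```python
import threading

counter = {"tweets_seen": 0}  # one object, visible to every thread in the process
lock = threading.Lock()

def count_tweets(n):
    """Increment the shared counter n times, one locked update at a time."""
    for _ in range(n):
        with lock:  # synchronize the read-modify-write
            counter["tweets_seen"] += 1

threads = [threading.Thread(target=count_tweets, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["tweets_seen"])  # 40000
```

A separate process would get its own copy of counter, so the same result would have to be passed back explicitly, for example through a file, a pipe, or a multiprocessing queue.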

Multiprocessing

The multiprocessing module provides an interface similar to threading, but it runs separate processes instead of threads. This is useful when you need to take full advantage of all CPU cores.
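A minimal sketch of the idea, assuming a CPU-bound task that can be split into independent pieces. The helper name factorials_in_parallel and the choice of math.factorial as the workload are illustrative only. The if __name__ == "__main__" guard matters on platforms where worker processes re-import the main module.

```python
import math
import multiprocessing

def factorials_in_parallel(numbers, workers=2):
    """Compute factorials in a pool of worker processes."""
    with multiprocessing.Pool(processes=workers) as pool:
        # map() splits the inputs across the workers and keeps the input order.
        return pool.map(math.factorial, numbers)

if __name__ == "__main__":
    print(factorials_in_parallel([5, 10, 15]))
```

multiprocessing.Process(target=...) with start() and join() mirrors the threading.Thread usage shown earlier, if a single long-running background task fits better than a pool.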

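Note: subprocess.Popen(['python', 'cleaner.py']) does the same thing here as subprocess.Popen('python cleaner.py', shell=True), but the former is the better practice to learn. With shell=True, the command string is handed to the system shell, which splits it into arguments itself.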

For example, if there is a space in the path, this will fail:

subprocess.Popen('python My Documents\\cleaner.py', shell=True)

It fails because the shell interprets My and Documents\cleaner.py as two separate arguments.

On the other hand, this will work as expected:

subprocess.Popen(['python', 'My Documents\\cleaner.py'])

It works because the arguments are explicitly separated by using a list.

The list form is especially useful when one of the arguments is in a variable:

subprocess.Popen(['python', path_to_file])
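The list form can be checked with a path that really contains a space. This sketch creates a throwaway "My Documents" folder and a one-line cleaner.py inside a temporary directory (both stand-ins, not the real files). It uses subprocess.run only because that makes it easy to capture the output for the demo; Popen accepts the same argument list.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a path containing a space, like 'My Documents\\cleaner.py'.
    folder = os.path.join(tmp, "My Documents")
    os.mkdir(folder)
    path_to_file = os.path.join(folder, "cleaner.py")
    with open(path_to_file, "w") as f:
        f.write("print('cleaning done')\n")

    # The path is a single list element, so the space needs no quoting.
    completed = subprocess.run(
        [sys.executable, path_to_file],
        capture_output=True,
        text=True,
    )
    output = completed.stdout.strip()

print(output)
```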

Comment 1

Thank you so much for your answer. Glad to know that I was on the right track. You say that 'shell=True' doesn't change how the 'subprocess.Popen

Comment 2

'subprocess.Popen' method works in my instance, but could you elaborate on what it does do? (Excuse the compound comment, I posted the first fragment with a stray click and couldn't figure out any way to delete or edit it)

Comment 3

@alvalentini I added the explanation about shell=True to the answer

zvone · Accepted Answer · 2016-10-13 06:24:08Z

