Last updated April 29, 2024

The ability to schedule background jobs is a requirement for most modern web apps. These jobs might be user-oriented, like sending emails; administrative, like taking backups or synchronizing data; or even a more integral part of the app itself.

On a single-server deployment, a system-level tool like cron is the obvious choice for this kind of scheduling. However, when deploying to a cloud platform like Heroku, something higher-level is required, since instances of the application run in a distributed environment where machine-local tools are not useful.

The Heroku Scheduler add-on is a good solution for simple tasks that need to run at 10-minute, hourly, or daily intervals (or multiples of those intervals). But what about tasks that need to run every 5 minutes or every 37 minutes, or those that need to run at a very specific time? For these more unusual and complicated use cases, running your own scheduling process can be very useful.

APScheduler

There are a few Python scheduling libraries to choose from. Celery is an extremely robust asynchronous task queue and message system that supports scheduled tasks.

For this example, we’re going to use APScheduler, a lightweight, in-process task scheduler. It provides a clean, easy-to-use scheduling API, has few dependencies, and is not tied to any specific job queuing system.

Install APScheduler easily with pip:

$ pip install apscheduler

And make sure to add it to your requirements.txt:

APScheduler>=3.10,<4.0

Execution schedule

Next, you’ll need to create a file that defines your schedule. The APScheduler Documentation has a lot of great examples that show the flexibility of the library.

Here’s a simple clock.py example file:

from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

@sched.scheduled_job('interval', minutes=3)
def timed_job():
    print('This job is run every three minutes.')

@sched.scheduled_job('cron', day_of_week='mon-fri', hour=17)
def scheduled_job():
    print('This job is run every weekday at 5pm.')

sched.start()

Here we’ve configured APScheduler to queue background jobs in two different ways. The first directive schedules an interval job that runs every 3 minutes, counted from the time the clock process is launched. The second schedules a cron-style job that runs once each weekday (Monday through Friday) at 5pm in the scheduler's timezone.
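To make the timing of those two schedules concrete, here is a minimal standard-library sketch of when each job would next run. It is only an illustration of the behavior described above, not APScheduler's implementation: the helper names next_interval_run and next_weekday_run are hypothetical, and the sketch assumes naive datetimes with no time zone or daylight-saving handling.

```python
from datetime import datetime, timedelta


def next_interval_run(started_at, now, minutes=3):
    """Next run of an interval job whose schedule is anchored at process start.

    Assumes now >= started_at; runs fall at started_at + 1, 2, 3, ... intervals.
    """
    step = timedelta(minutes=minutes)
    # Number of whole intervals that have already elapsed since launch.
    elapsed_steps = max((now - started_at) // step, 0)
    return started_at + (elapsed_steps + 1) * step


def next_weekday_run(now, hour=17):
    """Next Monday-to-Friday occurrence of the given hour, strictly after now."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    # weekday() returns 5 for Saturday and 6 for Sunday; skip both.
    while candidate.weekday() >= 5:
        candidate += timedelta(days=1)
    return candidate
```

For a clock process launched on a Tuesday evening, the interval job first runs three minutes after launch, while the weekday job waits until 5pm on Wednesday. A launch after 5pm on a Friday pushes the weekday job to the following Monday.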

While this is a trivial example, it’s important to note that no work should be done in the clock process itself, for reasons already covered in the clock processes article. Instead, have the clock process schedule a background job that performs the actual work.
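The separation between scheduling and doing the work can be sketched as follows. This is a rough illustration under stated assumptions, not a production setup: the in-process queue.Queue is only a stand-in for a real external job queue shared with worker dynos, and the names send_daily_report, enqueue_daily_report, and run_worker_once are hypothetical.

```python
import queue

# Stand-in for a real job queue. Assumption: in production this would be an
# external queue shared with separate worker processes, not an in-process object.
job_queue = queue.Queue()


def send_daily_report():
    """Hypothetical slow task; a worker should run this, not the clock."""
    return 'report sent'


def enqueue_daily_report():
    """What the clock process would schedule: record the work and return at once."""
    job_queue.put(send_daily_report)


def run_worker_once():
    """What a separate worker would do: take one job off the queue and run it."""
    job = job_queue.get_nowait()
    return job()
```

In clock.py, the function decorated with sched.scheduled_job would then be something like enqueue_daily_report, so the clock process returns immediately and stays free to fire the next job on time.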

Clock process type

Finally, you’ll need to define a process type in the Procfile. In this example we’ll call the process clock, so the Procfile should look something like this:

clock: python clock.py

Deployment

Commit the requirements.txt, Procfile, and clock.py changes and redeploy your application with a git push heroku master (or git push heroku main, if your default branch is named main).

The final step is to scale up the clock process. This is a singleton process, meaning you should never run more than one of these processes. If you run two, the work will be duplicated.

$ heroku ps:scale clock=1

You should see output similar to the following in your Heroku logs:

2023-05-30T20:59:38+00:00 heroku[clock.1]: State changed from created to starting
2023-05-30T20:59:38+00:00 heroku[api]: Scale to clock=1, web=3 by user@heroku.com
2023-05-30T20:59:40+00:00 heroku[clock.1]: Starting process with command `python clock.py`
2023-05-30T20:59:41+00:00 heroku[clock.1]: State changed from starting to up
2023-05-30T20:59:48+00:00 app[clock.1]: Starting clock for 1 events: [ Queueing interval job ]
2023-05-30T20:59:48+00:00 app[clock.1]: Queuing scheduled jobs

Now you have a custom clock process up and running. Check out the APScheduler Documentation for more info.