
The official Python docs include the following example:

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://nonexistant-subdomain.python.org/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

It works great, but I need to set a delay between the submitted futures, so that the requests are not all sent at the same time but are staggered by, for example, 100 ms.

Link to the docs: https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor-example

Can I implement this by changing only the line future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}, or will it require reworking all the code?

asked May 13, 2024 at 12:29
  • Instead of the dictionary comprehension, use a loop and add a sleep. Or submit via another function that has a sleep. (Commented May 13, 2024 at 12:55)
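For illustration, a minimal sketch of the loop-based variant the comment suggests, reusing the URLS, load_url, and executor names from the example above (the 100 ms figure comes from the question):

import time

future_to_url = {}
for url in URLS:
    # submit one task, then pause before submitting the next
    future_to_url[executor.submit(load_url, url, 60)] = url
    time.sleep(0.1)  # 100 ms between submissions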

1 Answer


Use a wrapper function that handles the delay as follows:

import concurrent.futures
import urllib.request
from time import sleep

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://nonexistant-subdomain.python.org/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# Submit a task, then pause before returning so the next
# submission happens roughly 100 ms later
def submit_with_delay(exe, func, url, timeout=60):
    future = exe.submit(func, url, timeout)
    sleep(0.1)
    return future

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {submit_with_delay(executor, load_url, url): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
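
Note that sleep(0.1) runs in the main thread while the dict comprehension is being evaluated, so successive submit calls are spaced roughly 100 ms apart; the rest of the code, including the as_completed loop, needs no changes.

If you would rather have the pause happen inside the worker threads, a sketch of one alternative (load_url_delayed is a hypothetical helper, not part of the original answer) is to hand each task its own start offset:

from time import sleep

def load_url_delayed(url, timeout, delay):
    sleep(delay)  # each worker waits its own offset before requesting
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

future_to_url = {
    executor.submit(load_url_delayed, url, 60, i * 0.1): url
    for i, url in enumerate(URLS)
}

This keeps submission instant, but the offsets are computed from submission order, so it only staggers the requests cleanly while every task has its own worker thread (as here, with five URLs and max_workers=5).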
answered May 13, 2024 at 13:01