How can this async url response checker be cleaner/faster?

Question 1

with open('things.txt') as things: 
 urls = [url.strip().lower() for url in things]
async def is_site_404(s, url):
 async with s.head(f"https://example.com/{url}") as r1:
 if r1.status == 404:
 print('hello i am working')
async def create_tasks(urls):
 tasks = []
 async with aiohttp.ClientSession() as s:
 for url in urls:
 if len(url) >= 5 and len(url) < 16 and url.isalnum():
 task = asyncio.create_task(is_site_404(s, url))
 tasks.append(task)
 return await asyncio.gather(*tasks)
while True:
 asyncio.get_event_loop().run_until_complete(create_tasks(urls))

Hi, this is a basic asynchronous url response checker that I created and it's pretty fast, but I'm wondering if there is any way to get more requests per second with the base of this code. I have it designed so that it runs forever and just prints whenever there is a 404 in this example basically. I'm pretty new to python and coding in general and I would like some guidance/advice from anyone who has more experience with this kind of thing.. maybe there is an aiohttp alternative I should use that's faster? ANY advice is greatly appreciated.

Question 2

Hi, please edit your question so the title states what your code does since reviewers would usually wanna see that first. This is mentioned in How to Ask. I would've done this for you, but since it's your question I think you can choose the most appropriate title

Question 3

Overall, your code looks pretty decent. The functions are doing what they should be (create_task shouldn't be running the tasks as well), coroutines are gathered after aggregation.

I'd suggest a few things to make it more readable (and maintainable)

`if name` block

Put script execution content inside the if __name__ == "__main__" block. Read more about why on stack overflow.

Variable naming

While you follow the PEP-8 convention on variable naming, the names still could use a rework, for eg. session instead of just s.

URL or path

URL refers to "Uniform Resource Locator", which is of the form:

scheme:[//authority]path[?query][#fragment]

You are dealing with only the path here, scheme and authority sections have been fixed as https://example.com/. This is again naming convenience.

Gather vs create

You are creating as well as gathering tasks in create_tasks function.

Type hinting

New in python-3.x is the type hinting feature. I suggest using it whenever possible.

Rewrite

import asyncio
import aiohttp
HOST = "https://example.com"
THINGS_FILE = "things.txt"
def validate_path(path: str) -> bool:
 return 5 <= len(path) < 16 and path.isalnum()
async def check_404(session: aiohttp.ClientSession, path: str):
 async with session.head(f"{HOST}/{path}") as response:
 if response.status == 404:
 print("hello i am working")
async def execute_requests(paths: list[str]):
 async with aiohttp.ClientSession() as session:
 tasks = []
 for path in paths:
 if validate_path(path):
 task = asyncio.create_task(check_404(session, path))
 tasks.append(task)
 return await asyncio.gather(*tasks)
def main():
 with open(THINGS_FILE) as things:
 paths = [line.strip().lower() for line in things]
 while True:
 asyncio.get_event_loop().run_until_complete(execute_requests(paths))
if __name__ == "__main__":
 main()

This can be further rewritten depending on the data being read from file, using a map/filter to only iterate over validated paths in file etc. The above is mostly a suggestion.

Question 4

thank you for the insight, really appreciate it and will apply this stuff in the future.

Question 5

@humid Why haven't you updated the title yet?

score 3 · Answer 1 · 2020-10-23 10:15:56Z

Overall, your code looks pretty decent. The functions are doing what they should be (create_task shouldn't be running the tasks as well), coroutines are gathered after aggregation.

I'd suggest a few things to make it more readable (and maintainable)

`if name` block

Put script execution content inside the if __name__ == "__main__" block. Read more about why on stack overflow.

Variable naming

While you follow the PEP-8 convention on variable naming, the names still could use a rework, for eg. session instead of just s.

URL or path

URL refers to "Uniform Resource Locator", which is of the form:

scheme:[//authority]path[?query][#fragment]

You are dealing with only the path here, scheme and authority sections have been fixed as https://example.com/. This is again naming convenience.

Gather vs create

You are creating as well as gathering tasks in create_tasks function.

Type hinting

New in python-3.x is the type hinting feature. I suggest using it whenever possible.

Rewrite

import asyncio
import aiohttp
HOST = "https://example.com"
THINGS_FILE = "things.txt"
def validate_path(path: str) -> bool:
 return 5 <= len(path) < 16 and path.isalnum()
async def check_404(session: aiohttp.ClientSession, path: str):
 async with session.head(f"{HOST}/{path}") as response:
 if response.status == 404:
 print("hello i am working")
async def execute_requests(paths: list[str]):
 async with aiohttp.ClientSession() as session:
 tasks = []
 for path in paths:
 if validate_path(path):
 task = asyncio.create_task(check_404(session, path))
 tasks.append(task)
 return await asyncio.gather(*tasks)
def main():
 with open(THINGS_FILE) as things:
 paths = [line.strip().lower() for line in things]
 while True:
 asyncio.get_event_loop().run_until_complete(execute_requests(paths))
if __name__ == "__main__":
 main()

This can be further rewritten depending on the data being read from file, using a map/filter to only iterate over validated paths in file etc. The above is mostly a suggestion.

thank you for the insight, really appreciate it and will apply this stuff in the future.

Stack Exchange Network

How can this async url response checker be cleaner/faster?

1 Answer 1

`if name` block

Variable naming

URL or path

Gather vs create

Type hinting

Rewrite

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

How can this async url response checker be cleaner/faster?

1 Answer 1

if __name__ block

Variable naming

URL or path

Gather vs create

Type hinting

Rewrite

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

`if name` block