I want to make requests with the Python requests module. I have a large database of URLs I want to download. The URLs are stored in the database in the form page.be/something/something.html

I get a lot of ConnectionErrors. If I open the URL in my browser, the page exists.

My Code:

if not webpage.url.startswith('http://www.'):
    new_html = requests.get(webpage.url, verify=True, timeout=10).text

An example of a page I'm trying to download is carlier.be/categorie/jobs.html. This gives me a ConnectionError, logged as below:

Connection error, Webpage not available for "carlier.be/categorie/jobs.html" with webpage_id "229998"

What seems to be the problem here? Why can't requests make the connection, while I can find the page in the browser?

altocumulus
asked Feb 9, 2016 at 13:20

1 Answer

The Requests library requires that you supply a scheme for it to connect with (the 'http://' part of the URL; Requests itself calls this a "schema"). Make sure that every URL has http:// or https:// in front of it. You may want a try/except block where you catch requests.exceptions.MissingSchema and try again with "http://" prepended to the URL.
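
As a rough sketch of that retry, assuming a scheme-less URL makes requests.get raise MissingSchema: the helper name fetch_html and its get parameter are hypothetical, not part of Requests. The get parameter only exists so the HTTP call can be swapped out when testing.

```python
import requests


def fetch_html(url, timeout=10, get=requests.get):
    """Return the page body, retrying once with 'http://' prepended.

    `fetch_html` and the `get` parameter are illustrative names only;
    `get` defaults to requests.get and can be replaced in tests.
    """
    try:
        return get(url, timeout=timeout).text
    except requests.exceptions.MissingSchema:
        # The URL had no scheme (e.g. "carlier.be/categorie/jobs.html"),
        # so try again with a plain http:// prefix.
        return get("http://" + url, timeout=timeout).text
```

Something like fetch_html("carlier.be/categorie/jobs.html") would then make the second attempt against http://carlier.be/categorie/jobs.html. Errors from that second attempt (ConnectionError, Timeout and so on) are deliberately left to propagate to the caller.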

answered Feb 9, 2016 at 13:29

2 Comments

So what would be a good code snippet for trying with http and https? Catching ConnectionError and retrying with https doesn't seem like a good way to do it.
@SandervanDorsten I would process the URL string before attempting to make the request. Are all of the URLs supposed to go over http[s]? If that's the case, then you could even just check the first 4 characters of the string and, if they're not http, prepend http:// or https:// to the URL before making the request. The other obvious answer is a regular expression to determine whether there is a scheme specifier in the string.
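
A minimal sketch of that pre-processing step, using only the standard library. The function name ensure_scheme and its default_scheme parameter are made up for illustration, and it assumes every URL in the database is meant to be fetched over http or https.

```python
import re

# Matches a leading "http://" or "https://", ignoring case.
_SCHEME_RE = re.compile(r"^https?://", re.IGNORECASE)


def ensure_scheme(url, default_scheme="http"):
    """Return url unchanged if it already has an http(s) scheme,
    otherwise prepend default_scheme. Names here are hypothetical."""
    url = url.strip()  # drop stray whitespace from the database value
    if _SCHEME_RE.match(url):
        return url
    return f"{default_scheme}://{url}"
```

With this, ensure_scheme("carlier.be/categorie/jobs.html") gives "http://carlier.be/categorie/jobs.html", and the result can be passed straight to requests.get. Whether http or https is the better default depends on the sites involved.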
