I'm investigating whether a single HTTP request in Python can retrieve both the HTML and the HTTP header info, instead of having to make 2 separate calls.

Does anyone know of any good ways?

Also, what are the performance differences between the different methods of making these requests, e.g. urllib2, HTTPConnection, etc.?

asked Jun 8, 2012 at 12:29

2 Answers

Just use urllib2.urlopen(). The HTML can be retrieved by calling the read() method of the returned object, and the headers are available in the headers attribute.

>>> import urllib2
>>> f = urllib2.urlopen('http://www.google.com')
>>> print f.headers
Date: Fri, 08 Jun 2012 12:57:25 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Connection: close
>>> print f.read()
<!doctype html><html itemscope itemtype="http://schema.org/WebPage"><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
... etc ...
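
urllib2 exists only in Python 2; in Python 3 the same functionality lives in urllib.request. A minimal sketch of the same single-request pattern there might look like the following. The `fetch` helper, the `DemoHandler` class, and the local test server are assumptions added only so the example runs without internet access; against a real site you would just pass its URL to `fetch`.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server, used only to make the example self-contained."""

    def do_GET(self):
        body = b"<!doctype html><html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo output quiet


def fetch(url):
    """Return (headers, body) obtained from one GET request."""
    with urllib.request.urlopen(url) as response:
        headers = response.headers  # parsed from the response that carries the body
        body = response.read()      # reads the body of that same response
    return headers, body


# Ignore any system proxy settings so the local demo request goes direct.
urllib.request.install_opener(
    urllib.request.build_opener(urllib.request.ProxyHandler({}))
)

server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

headers, body = fetch("http://127.0.0.1:%d/" % server.server_port)
print(headers["Content-Type"])
print(body.decode("utf-8"))

server.shutdown()
server.server_close()
```

Header lookup on the returned object is case-insensitive, so `headers["content-type"]` works as well.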
answered Jun 8, 2012 at 13:05

If you use HTTPResponse, you can get the headers and the content with two function calls, but that doesn't make two trips to the server.
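
As a rough illustration of that point, here is a sketch using Python 3's http.client (the successor to Python 2's httplib). The `CountingHandler` class, the `request_count` counter, and the local server are hypothetical helpers, added only so the example can show how many requests actually reach the server.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

request_count = 0  # incremented by the demo server for every request it handles


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical local server that counts incoming requests."""

    def do_GET(self):
        global request_count
        request_count += 1
        body = b"<html>hi</html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), CountingHandler)  # port 0: OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/")             # the only request sent to the server
response = conn.getresponse()        # returns an HTTPResponse
header_list = response.getheaders()  # call 1: list of (name, value) tuples
html = response.read()               # call 2: body of the same response
conn.close()

server.shutdown()
server.server_close()

print(response.status, dict(header_list).get("Content-Type"))
print(html)
print("requests seen by server:", request_count)
```

Both `getheaders()` and `read()` operate on the one response object returned by `getresponse()`, which is why the counter stays at a single request.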

answered Jun 8, 2012 at 12:32
