I am learning socket programming and tried to write a basic HTTP client of my own. Everything seems to run fine, but I am not receiving any data. Can you please tell me what I am missing?

CODE

import socket

def create_socket():
    return socket.socket( socket.AF_INET, socket.SOCK_STREAM )

def remove_socket(sock):
    sock.close()
    del sock

sock = create_socket()
print "Connecting"
sock.connect( ('en.wikipedia.org', 80) )
print "Sending Request"
print sock.sendall ('''GET /wiki/List_of_HTTP_header_fields HTTP/1.1
Host: en.wikipedia.org
Connection: close
User-Agent: Web-sniffer/1.0.37 (+http://web-sniffer.net/)
Accept-Encoding: gzip
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7
Cache-Control: no-cache
Accept-Language: de,en;q=0.7,en-us;q=0.3
Referer: d_r_G_o_s
''')
print "Receving Reponse"
while True:
    content = sock.recv(1024)
    if content:
        print content
    else:
        break
print "Completed"

OUTPUT

Connecting
Sending Request
298
Receving Reponse
Completed

I was expecting it to show me the HTML content of the Wikipedia page :'(

Also, it would be great if somebody could share some web resources or books where I can read in detail about Python socket programming for an HTTP request client.

Thanks!

asked Apr 10, 2012 at 7:07
  • Are your newlines proper newlines ('\r\n')? Also, after the headers you should have a single empty line; this tells the server that the headers are done. Commented Apr 10, 2012 at 7:10
  • No, they weren't. I thought \n would suffice, but it doesn't. I got it, thanks :) Commented Apr 10, 2012 at 7:57

1 Answer

For a minimal HTTP client, you definitely shouldn't send Accept-Encoding: gzip -- the server will most likely reply with a gzipped response you won't be able to make much sense of by eye. :)
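
To make the gzip point concrete, here is a minimal sketch in Python 3 (the code in this thread is Python 2, so this is a port in spirit, not the original). The `body` string is a made-up stand-in for the HTML a server would return; the sketch only shows that the compressed bytes are not readable text and must be decompressed first.

```python
import gzip

# Hypothetical response body, standing in for the HTML a server would send.
body = b"<html><body>Hello</body></html>"

# Roughly what travels on the wire when the server honours "Accept-Encoding: gzip".
compressed = gzip.compress(body)

# The stream starts with the gzip magic number, not with readable text.
print(compressed[:2])

# A client that asked for gzip has to undo it before printing.
print(gzip.decompress(compressed).decode("ascii"))
```

Leaving the header out of a hand-written request avoids this extra step entirely.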

You aren't sending the final double \r\n, nor are you actually terminating your lines with \r\n as per the spec (unless you happen to develop on Windows with Windows line endings, but that's just luck and not programming per se).
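
As an illustration of the line-ending rule, here is a small Python 3 sketch that assembles the request as bytes. The helper name `build_request` and its parameters are hypothetical, chosen only for this example; the point is that every line ends in \r\n and one extra \r\n (an empty line) closes the headers.

```python
CRLF = "\r\n"

def build_request(host, path, extra_headers=()):
    """Return the raw bytes of a minimal HTTP/1.1 GET request (illustrative helper)."""
    lines = [
        "GET {} HTTP/1.1".format(path),
        "Host: {}".format(host),
        "Connection: close",
    ]
    lines.extend(extra_headers)
    # Every line ends with CRLF; one more CRLF (an empty line) ends the headers.
    return (CRLF.join(lines) + CRLF + CRLF).encode("ascii")

request = build_request("en.wikipedia.org", "/wiki/List_of_HTTP_header_fields")
print(repr(request))
```

Printing the `repr` makes the line endings visible, which is a quick way to check what is really being sent.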

Also, del sock there does not do what you think it does.

Anyway -- this works:

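import socket

sock = socket.socket()
sock.connect(('en.wikipedia.org', 80))
# Send each request line terminated with CRLF, as HTTP requires.
for line in (
    "GET /wiki/List_of_HTTP_header_fields HTTP/1.1",
    "Host: en.wikipedia.org",
    "Connection: close",
):
    sock.sendall(line + "\r\n")
# An empty line marks the end of the headers.
sock.sendall("\r\n")
# Read until the server closes the connection (recv returns an empty string).
while True:
    content = sock.recv(1024)
    if content:
        print content
    else:
        break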
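
For readers on Python 3, the receive loop can be sketched as below. This is an assumption-laden port, not the code from the answer: `read_until_closed` is a hypothetical helper name, and a local `socket.socketpair()` stands in for the real server so the example runs without network access. In Python 3, `recv()` returns bytes, so the chunks are joined and decoded once at the end.

```python
import socket

def read_until_closed(sock, bufsize=1024):
    """Collect everything the peer sends until it closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # recv() returns b"" once the peer has closed
            break
        chunks.append(chunk)
    return b"".join(chunks)

# Stand-in for a server: a connected socket pair, no network needed.
client, fake_server = socket.socketpair()
fake_server.sendall(b"HTTP/1.1 200 OK\r\nConnection: close\r\n\r\nhello")
fake_server.close()

response = read_until_closed(client)
client.close()
print(response.decode("ascii"))
```

The loop relies on the peer closing the connection, which is why the request sends Connection: close.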

EDIT: As for resources/books/reference -- for a reference HTTP client implementation, look at Python's very own httplib.py. :)

answered Apr 10, 2012 at 7:14
4 Comments

Most probably, the missing "\r\n" is the problem in the original code; the Wikipedia webserver closes the connection as soon as it sees invalid/broken HTTP headers.
@AKX: Thank you. I was using only \n. And httplib.py as a reference sounds great! :)
@AKX: What is it that del sock does not do that I think it does? :p
@dragosrsupercool: it deletes the name from the scope it is in, namely the remove_socket function -- doing it there, just before the function is exited, does nothing worthwhile. (You very rarely actually need del in Python code anyway.)
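
The point about del can be checked with a small Python 3 sketch. `FakeSocket` is a hypothetical stand-in class used only so the example runs without a network; `remove_socket` mirrors the function from the question.

```python
class FakeSocket:
    """Hypothetical stand-in for a socket, so no network is needed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

def remove_socket(sock):
    sock.close()
    del sock  # removes only the local name "sock" inside this function

s = FakeSocket()
remove_socket(s)

# The caller's name "s" is untouched: it still refers to the (now closed) object.
print(s.closed)
```

Dropping the `del sock` line would leave the behaviour unchanged, since the local name disappears anyway when the function returns.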
