Python - Socket error

Question 1

My code:

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("www.python.org", 80))
s.sendall(b"GET https://www.python.org HTTP/1.0\n\n")
print(s.recv(4096))
s.close()

Why does the output show me this?

b'HTTP/1.1 500 Domain Not Found\r\nServer: Varnish\r\nRetry-After: 0\r\ncontent-type: text/html\r\nCache-Control: private, no-cache\r\nconnection: keep-alive\r\nContent-Length: 179\r\nAccept-Ranges: bytes\r\nDate: Tue, 11 Jul 2017 15:23:55 GMT\r\nVia: 1.1 varnish\r\nConnection: close\r\n\r\n\n\n\nFastly error: unknown domain \n\n\nFastly error: unknown domain: . Please check that this domain has been added to a service.'

How can I fix it?

Question 2

GET https://www.python.org -- I think you want "GET /" instead.

Question 3

@BrianCain is correct. After the HTTP verb you should provide the relative path to the resource you wish to access. Because you connected to the domain, your requests are already going to www.python.org. If you continue to have issues, add the Host HTTP header.
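As a rough illustration of that advice, here is a minimal sketch of a request that uses only the path and includes a Host header. The function names build_request and fetch_http are made up for this example, and it assumes Python 3 and plain HTTP on port 80. The server may well answer with a redirect to HTTPS rather than the page itself.

```python
import socket


def build_request(host, path="/"):
    """Build a minimal HTTP/1.0 GET request as bytes (illustrative helper)."""
    lines = [
        "GET {} HTTP/1.0".format(path),  # path only, not the full URL
        "Host: {}".format(host),         # tells the server which site is meant
        "",                              # the two empty strings produce the
        "",                              # blank line that ends the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_http(host, path="/", port=80):
    """Send the request over plain HTTP and return the raw response bytes."""
    with socket.create_connection((host, port), timeout=10) as s:
        s.sendall(build_request(host, path))
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:  # server closed the connection: response is complete
                break
            chunks.append(data)
    return b"".join(chunks)


# Example (needs network access):
# print(fetch_http("www.python.org")[:200])
```

Reading in a loop until recv() returns an empty bytes object matters here: a single recv(4096) call, as in the question, only returns whatever happens to have arrived so far.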

Question 4

When I do this it is shown in plain text?

Question 5

The issue may actually be that the resource in question is accessed over HTTPS. You have to do a bit more work when using a raw socket to connect to an HTTPS service.
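A rough sketch of what that extra work looks like, assuming Python 3 and the standard ssl module. The helper names make_tls_context and open_tls_connection are invented for this example; the ssl and socket calls themselves are standard library functions.

```python
import socket
import ssl


def make_tls_context():
    """Return a client-side context that verifies certificates and hostnames."""
    # create_default_context() loads the system CA certificates and turns on
    # certificate and hostname checking by default.
    return ssl.create_default_context()


def open_tls_connection(host, port=443):
    """Open a TCP connection to the HTTPS port and wrap it in TLS."""
    raw = socket.create_connection((host, port), timeout=10)
    try:
        # server_hostname is used for SNI and for hostname verification.
        return make_tls_context().wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()  # do not leak the TCP socket if the handshake fails
        raise


# Example (needs network access):
# tls = open_tls_connection("www.python.org")
# tls.sendall(b"GET / HTTP/1.0\r\nHost: www.python.org\r\n\r\n")
# print(tls.recv(4096))
# tls.close()
```

After the handshake, the wrapped socket is used like the plain one: the HTTP request is sent unchanged, only the transport underneath is encrypted.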

Question 6

This is wrong on multiple levels:

To access an HTTPS resource you need to create a TLS connection (i.e. wrap the existing TCP connection using the ssl module, e.g. SSLContext.wrap_socket, with proper certificate checking etc.) and then send the HTTP request. Of course, the TCP connection in this case should go to port 443 (https), not 80 (http).
The HTTP request should contain only the path, not the full URL.
The line ending must be \r\n, not \n.
You should send a Host header too, since many servers require it.

And that's only the request. Properly handling the response is a different topic.
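To give a rough idea of what response handling involves, here is a minimal sketch that splits a raw response into status, headers and body. It is not from the original answer: parse_response is a made-up helper, and the sample bytes are an abbreviated version of the output shown in the question. It ignores chunked transfer encoding, redirects, and compression, and keeps only the last value of any repeated header.

```python
def parse_response(raw):
    """Split a raw HTTP response into (status_code, reason, headers, body)."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # The status line looks like: HTTP/1.1 500 Domain Not Found
    parts = lines[0].split(" ", 2)
    code = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return code, reason, headers, body


sample = (
    b"HTTP/1.1 500 Domain Not Found\r\n"
    b"Server: Varnish\r\n"
    b"Content-Length: 179\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"Fastly error: unknown domain"
)
status, reason, headers, body = parse_response(sample)
print(status, reason)             # 500 Domain Not Found
print(headers["content-length"])  # 179
```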

I strongly recommend using an existing library like requests. HTTP(S) is considerably more complex than most people think who have only looked at a few traffic captures.

Question 7

Here is the requests quickstart docs.python-requests.org/en/master/user/quickstart/…

Question 8

I highly recommend the requests library instead of raw sockets, unless you want to learn the hard way.

Question 9

@Ch.Sohaib: Are you asking for sample code for requests? That would be print(requests.get('https://www.python.org').content). Or are you asking how to fix your code? I don't think that is worthwhile, since too much is wrong.

Question 10

@Ch.Sohaib: I use stackoverflow.com more as a way to help others create the right code and learn that way, rather than to write code for others. I've pointed out several problems with your code, which primarily come from too limited an understanding of how HTTP and HTTPS work. I recommend you first improve your understanding of HTTP(S) and try to fix the mentioned problems yourself. If you have specific problems with this I'm willing to help, but I won't just write the code for you. I recommend starting with plain HTTP and, once you manage that, continuing with HTTPS.

Question 11

Ok no problem bro.

Question 12

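import requests

# Fetch the page over HTTPS; requests handles TLS and certificate checks.
x = requests.get('https://www.python.org')
print(x.text)  # the parenthesized form works on Python 2.7 and Python 3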

With the requests library, HTTPS requests are very simple! If you're doing this with raw sockets, you have to do a lot more work, such as negotiating a cipher. Try the above code (Python 2.7).

I would also note that, in my experience, Python is excellent for doing things quickly. If you are learning about networking and cryptography, try writing an HTTPS client on your own using sockets. If you want to automate something quickly, use the tools that are available to you. I almost always use requests for this type of task. As an additional note, if you're interested in parsing HTML content, check out the PyQuery library. I've used it to automate interaction with many web services.

Requests

PyQuery

Steffen Ullrich · Accepted Answer · 2017-07-11 15:48:59Z



