homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: urlparse insufficient port property validation
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: ezio.melotti, martin.panter, ncoghlan, orsenthil, python-dev, r.david.murray, zulla
Priority: normal Keywords: patch

Created on 2012年02月16日 23:40 by zulla, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
urlparse.py zulla, 2012年02月17日 01:54
urlparse.diff zulla, 2012年02月17日 02:18
Messages (18)
msg153512 - (view) Author: zulla (zulla) Date: 2012年02月16日 23:40
The "port" component of a URL is not properly be sanitized or validated. This may lead to the evasion of netloc/hostname based filters or exceptions.
msg153522 - (view) Author: zulla (zulla) Date: 2012年02月17日 01:27
The "port" and "netloc" component of a ParsedResult-object is not properly sanitized or validated. This may lead to bypass-able hostname-based filters. Remote Crash vulnerabilities be be also possible.
msg153523 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012年02月17日 01:45
Did you upload urlparse.py to the issue by accident?
Can you please provide some examples of where you think the current code is producing incorrect results?
msg153524 - (view) Author: zulla (zulla) Date: 2012年02月17日 01:47
Hi. No, it's a patched version. It won't crash under circumstances like that [1] and won't succeed with invalid input:
>>> import urlparse
>>> urlparse.urlparse("http://www.google.com:foo")
ParseResult(scheme='http', netloc='www.google.com:foo', path='', params='', query='', fragment='')
>>> urlparse.urlparse("http://www.google.com:foo").port
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urlparse.py", line 105, in port
 port = int(netloc.split(':')[1], 10)
ValueError: invalid literal for int() with base 10: 'foo'
>>>
msg153525 - (view) Author: zulla (zulla) Date: 2012年02月17日 01:54
Whops. I forgot an int() :-)
Here's the right patch.
msg153527 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012年02月17日 02:05
It's not a patch if it is the whole file. A diff would be much more useful, since then we could see the changes easily.
This kind of change would require a bit of discussion. I'm doubtful that it would be applied as a bug fix, and we might even want the validation to be optional and not the default. Part of the issue is that urlparse was originally based on the older standards, as you can see from the docstring.
You may find others to agree with you, but personally it doesn't look to me like this rises to the level of a security issue.
msg153528 - (view) Author: zulla (zulla) Date: 2012年02月17日 02:18
I understand your point of view, but I disagree.
Various libraries and projects rely on urlparse.urlparse and urllib.parse.urlparse.
This bug just blew up in my face. I'm working with Cython and PyQt4.
When a developer relies on ParseResult().netloc being a valid netloc, and .port being None [bool(False)] or a integer between 1-65535 really bad things can happen in a environment that has 0-tolerance for security issues (like C/C++ mixed in python).
I agree that the 
if self.scheme == "http":
 return 80
elif self.scheme == "https":
 [...]
part of my patch is debetable, but we should _at least_ ensure that IF there is a ParseResult().port, the developer can be sure that it is a valid port between 1-65545.
i apologize for upload the whole file; i attached the diff now.
regards,
dan
msg153545 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2012年02月17日 13:58
Could you provide some failing examples?
The suggestion also seems to run slightly at odds with itself - in one part, silently replacing an invalid port specification with a different value, in another adding additional validation checks.
Also, rather than hardcoding default port numbers for selected protocols, it would make more sense to just look them up via socket.getservbyname(scheme) (and still return None if the scheme isn't recognised). However, I'll need to think about that one for a while - urlparse is designed to be almost purely a string *parsing* library. Looking up default port numbers if one isn't specified really isn't its job. (For one thing, you'd break the ability to exactly recreate the original URL text from the parsed version).
There should definitely by a try/except around that conversion to int(), though - it's just that I'm not yet sure what an appropriate return value is at that point. The raw port string? None? Should there be a separate "raw_port" descriptor that always returns some kind of string, with the port descriptor promising to always return a valid 16-bit port number or None?
msg154833 - (view) Author: zulla (zulla) Date: 2012年03月03日 12:38
>>> u("http://www.google.com:99999999999999999999999999").port
99999999999999999999999999L
msg155204 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012年03月09日 03:07
Couple of points:
1. On your last example, which webserver treats 'L' as part of port number? I can't of anything.
2. Can you write a "real application" which is listening to beyond 65535? Which platform would it be?
Current way of handling invalid port like, int('foo') by raising ValueError seems to be a better than returning a None. A better error message could be desirable, but that does not make it a security issue.
Additionally, for the example of int changing long integer to 'L' appended one would a 2.x case as it is no longer the behavior in 3.x
Also, I would advice to look at getPort function in a C library or a Java library and see what it does. The only difference I see is, they return -1 where Python returns None.
I am changing the request type to an enhancement, because there is not a valid argument to support that it is a security issue.
msg161275 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012年05月21日 14:51
I am not sure if anything should be done to this request. Saying that 
int("99999999999999999999999999",10) is converting to 99999999999999999999999999L in Python2.7 it is a bug/security issue is hypothetical. Practically, such high port numbers cannot exist.
I suggest, we close this issue as won't fix. 
Ezio, I noticed that you changed from pending to open. If you had any ideas for this issue, please share, otherwise you can consider closing this too. 
Thanks!
msg161276 - (view) Author: zulla (zulla) Date: 2012年05月21日 15:06
Your comment is completely senseless, sorry.
Of course such high port numbers do not exist.
An attacker is counting on that. Imagine something like that
pass_to_cython(urlparse("http://google.de:999999**999999[to be calculated]").port)
msg161277 - (view) Author: zulla (zulla) Date: 2012年05月21日 15:09
we should at least check if the .port attribute is an intereger >= 1 and <= 65535. _because_ this is the only valid port range. otherwise, it is no valid port. but it may be a integer overflow attack attempt
when a developer uses .port, he is counting on the result being valid
msg161278 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012年05月21日 15:23
pass_to_cython(urlparse("http://google.de:999999**999999[to be calculated]").port)
is no different than sending
pass_to_cython(999999**999999[to be calculated])
In that case, would the former make a security loop hole in urlparse? Looks pretty contrived to me as an example for .port bug. 
However, I agree with one point in your assertion, namely that port be checked that it is within the range integer >= 1 and <= 65535. If it is not, return None as a response in port.
msg161279 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012年05月21日 15:32
> Ezio, I noticed that you changed from pending to open.
That was an accident, I just meant to add my self to the nosy.
I didn't have time yet to read all the messages and comment on the issue.
msg161506 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012年05月24日 13:57
New changeset 988903cf24c5 by Senthil Kumaran in branch '2.7':
Issue #14036: return None when port in urlparse cross 65535
http://hg.python.org/cpython/rev/988903cf24c5
New changeset d769e64aed79 by Senthil Kumaran in branch '3.2':
Issue #14036: return None when port in urlparse cross 65535
http://hg.python.org/cpython/rev/d769e64aed79
New changeset b4d257c64db7 by Senthil Kumaran in branch 'default':
Issue #14036: return None when port in urlparse cross 65535
http://hg.python.org/cpython/rev/b4d257c64db7 
msg161507 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012年05月24日 13:59
This is taken care. I was not really convinced on the need as likely seemed a non issue from "urlparse" standpoint, But still there is no harm in returning valid port as semantically the attribute stands for a port.
Thanks!
msg235661 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015年02月10日 03:16
See Issue 20059 proposing to change this to raise ValueError
History
Date User Action Args
2022年04月11日 14:57:26adminsetgithub: 58244
2015年02月10日 03:16:40martin.pantersetnosy: + martin.panter
messages: + msg235661
2012年05月24日 13:59:22orsenthilsetstatus: open -> closed
type: enhancement -> behavior
messages: + msg161507

assignee: orsenthil
resolution: fixed
stage: resolved
2012年05月24日 13:57:55python-devsetnosy: + python-dev
messages: + msg161506
2012年05月21日 15:32:01ezio.melottisetmessages: + msg161279
2012年05月21日 15:23:16orsenthilsetmessages: + msg161278
2012年05月21日 15:09:21zullasetmessages: + msg161277
2012年05月21日 15:06:40zullasetmessages: + msg161276
2012年05月21日 14:51:03orsenthilsetmessages: + msg161275
2012年05月19日 14:17:33ezio.melottisetstatus: pending -> open
nosy: + ezio.melotti
2012年03月09日 03:07:01orsenthilsetstatus: open -> pending
type: security -> enhancement
messages: + msg155204
2012年03月03日 12:38:43zullasetmessages: + msg154833
2012年02月17日 13:58:32ncoghlansetnosy: + ncoghlan

messages: + msg153545
versions: - Python 3.1, Python 3.4
2012年02月17日 02:18:54zullasetfiles: + urlparse.diff
keywords: + patch
messages: + msg153528
2012年02月17日 02:05:52r.david.murraysetnosy: + orsenthil
messages: + msg153527
2012年02月17日 01:54:33zullasetfiles: - urlparse.py
2012年02月17日 01:54:09zullasetfiles: + urlparse.py

messages: + msg153525
2012年02月17日 01:47:03zullasetmessages: + msg153524
2012年02月17日 01:45:08r.david.murraysetnosy: + r.david.murray
messages: + msg153523
2012年02月17日 01:27:49zullasetmessages: + msg153522
2012年02月16日 23:40:38zullacreate

AltStyle によって変換されたページ (->オリジナル) /