Issue 15824: mutable urlparse return type


This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/60028

classification

Title:	mutable urlparse return type
Type:	enhancement	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 3.4

process

Dependencies:	Superseder:
Status:	closed	Resolution:	works for me
Assigned To:	Nosy List:	eric.araujo, ezio.melotti, mastahyeti, mhcptg, orsenthil, r.david.murray
Priority:	normal	Keywords:	patch

Created on 2012-08-30 18:17 by mastahyeti, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description
urlparse_patch.patch	mastahyeti, 2012-08-30 18:17

Messages (13)
msg169474	Author: mastahyeti (mastahyeti)	Date: 2012-08-30 18:17
This patch removes the inheritance from namedtuple and attempts to add the necessary methods to make it backwards compatible.

When parsing a url with urlparse.urlparse, the return type is non-mutable (named tuple). This is really inconvenient, because one of the most common (imop) use cases for urlparse is to parse a url, make an adjustment or change, and then unparse it. Currently, something like this is required:

    import urlparse
    url = list(urlparse.urlparse('http://www.example.com/foo/bar?hehe=haha'))
    url[1] = 'python.com'
    new_url = urlparse.urlunparse(url)

I think this is really clunky. Moving to a mutable return type is challenging because (to my knowledge) there are no types that are mutable and compatible with tuple. This patch removes the inheritance from namedtuple and attempts to add the necessary methods to make it backwards compatible. Does anyone know of a better way to do this? It would be nice if there were a namedlist type that acted like namedtuple but was mutable.

With these updates, urlparse can be used as follows:

    import urlparse
    url = list(urlparse.urlparse('http://www.example.com/foo/bar?hehe=haha'))
    url.netloc = 'www.python.com'
    urlparse.urlunparse(url)

I think this is much better. Let me know if you disagree...

Also, I ran the script through autopep8 because it was messy. Also, I'm not sure if I'm supposed to duplicate this patch over to Python3. I can do that if necessary.
msg169475	Author: mastahyeti (mastahyeti)	Date: 2012-08-30 18:21
TYPO!!! After my patch, urlparse can be used as such:

    import urlparse
    url = urlparse.urlparse('http://www.example.com/foo/bar?hehe=haha')
    url.netloc = 'www.python.com'
    urlparse.urlunparse(url)

The difference being that the result doesn't need to be cast to a list in order to be mutated...
msg169476	Author: Ezio Melotti (ezio.melotti) * (Python committer)	Date: 2012-08-30 18:21
This is a new feature, so it can't go in 2.7.
msg169477	Author: mastahyeti (mastahyeti)	Date: 2012-08-30 18:24
This is my first patch for python. Is there a feature freeze? Does it need to go in Python3? Thanks.
msg169478	Author: R. David Murray (r.david.murray) * (Python committer)	Date: 2012-08-30 18:25
I think the first step is probably to get consensus on whether this is desirable or not. That might require a trip to python-ideas, or it might not :) As for the patch itself, you should definitely not include any changes other than the ones you are proposing. Otherwise reviewing the patch is very difficult. As Ezio said, as a new feature this could only go into 3.4, so the patch should be against the default branch in the Mercurial repository.
msg169479	Author: Senthil Kumaran (orsenthil) * (Python committer)	Date: 2012-08-30 18:38
On Thu, Aug 30, 2012 at 11:17 AM, mastahyeti <report@bugs.python.org> wrote:
> When parsing a url with urlparse.urlparse, the return type is non-mutable (named tuple). This is really inconvenient, because one of the most common (imop) use cases for urlparse is to parse a url, make an adjustment or change and then unparse it. Currently, something like this is required:

Not really. Using the namedtuple is a convenience, and working through it may help you generate your target url in a more meaningful way. Also remember that we moved to namedtuple after understanding that it is more meaningful to use that for the parsed result. So, my vote for this proposal is -1. And if you need to discuss strategies for how to use it, you can ask over at python-help or related lists.
msg169482	Author: mastahyeti (mastahyeti)	Date: 2012-08-30 18:58
Senthil, can you give an example of how namedtuple would be more convenient? It is definitely more convenient than an ordinary tuple, but it's inconvenient having its items not be assignable. As I showed in my example above, it is usable as-is, but it is clunky. As David says above, this obviously needs to be moved to another list for discussion of whether the current behavior is desirable.
msg169483	Author: R. David Murray (r.david.murray) * (Python committer)	Date: 2012-08-30 19:03
Actually, Senthil is right. What you want is the _replace method of namedtuple to satisfy your use case.
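
For reference, a minimal sketch of the `_replace` approach applied to the URL from the original report. It assumes Python 3, where the `urlparse` module discussed in this thread lives at `urllib.parse`; the variable names are illustrative only.

```python
from urllib.parse import urlparse, urlunparse

url = urlparse('http://www.example.com/foo/bar?hehe=haha')

# _replace returns a new ParseResult with the named field swapped out;
# the original `url` object is not modified.
new_url = url._replace(netloc='www.python.com')

print(urlunparse(new_url))  # http://www.python.com/foo/bar?hehe=haha
print(new_url.geturl())     # same string, via the result object's own method
```

No conversion to a list is needed, and the fields are addressed by name rather than by index.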
msg169485	Author: mastahyeti (mastahyeti)	Date: 2012-08-30 19:10
I can live with that; it just seems that ordinary item assignment is more pythonic....
msg169487	Author: R. David Murray (r.david.murray) * (Python committer)	Date: 2012-08-30 19:16
Not in this case. We are treating the URL as an immutable object, so the Pythonic thing to do is create a new object of the same type with the change applied. Similar to "abcd".replace('a', 'z') returning a new string.
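
The parallel with strings can be checked directly. The following is a small sketch, again assuming Python 3's `urllib.parse` rather than the Python 2 `urlparse` module used elsewhere in the thread; the names `original`, `changed`, and `assignment_error` are made up for the example.

```python
from urllib.parse import urlparse

original = urlparse('http://www.example.com/foo/bar?hehe=haha')

# Direct attribute assignment is rejected because the result is a named tuple.
try:
    original.netloc = 'www.python.com'
    assignment_error = None
except AttributeError as exc:
    assignment_error = exc

# _replace builds a new object of the same type and leaves the original alone,
# in the same way that "abcd".replace('a', 'z') leaves "abcd" alone.
changed = original._replace(netloc='www.python.com')

print(type(changed) is type(original))        # True
print(original.netloc, '->', changed.netloc)  # www.example.com -> www.python.com
print("abcd".replace('a', 'z'))               # zbcd
```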
msg169488	Author: mastahyeti (mastahyeti)	Date: 2012-08-30 19:22
Hrmm. Okay. I concede.
msg230715	Author: Matthew Hall (mhcptg)	Date: 2014-11-05 21:57
I don't think having to call a method with a weird secret underscored name to update a value in a URL named tuple is very elegant. Neither is creating a handful of pointless objects to make one simple validator function like the one I had to code today. I would urge some reconsideration of this, like a way to get back a named yet mutable object when needed, instead of trying to force everybody to do this one way which isn't always that great.

    def validate_url(url):
        parts = urlparse.urlparse(url.strip())
        # scheme, netloc, path, params, query, fragment

        # XXX: preserve backward compatibility w/ old code
        if not parts.scheme:
            parts = parts._replace(scheme='http', netloc=parts.path.strip('/'), path='')

        # remove params, query, and fragment
        # params is nearly never used anywhere
        # (NOTE: it does NOT mean the stuff after '?')
        # it actually means this http://domain/page.py;param1=foo?query1=bar
        # query and fragment are used but aren't helpful for our application
        parts = parts._replace(params='', query='', fragment='')

        if parts.scheme not in URI_SCHEMES:
            raise ValueError('scheme=%s is not valid' % parts.scheme)

        if '.' not in parts.netloc:
            raise ValueError('location=%s does not contain a domain' % parts.netloc)

        if len(parts.path) and not parts.path.startswith('/'):
            raise ValueError('path=%s appears invalid' % parts.path)
        elif not parts.path:
            parts = parts._replace(path='/')

        validated_url = parts.geturl()

        return validated_url, parts
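
One possible way to cut down on intermediate objects in a function like this is to collect the changes in a plain dict and apply them with a single `_replace` call. The sketch below is not from the thread: it assumes Python 3's `urllib.parse`, the name `normalize_url` is hypothetical, the value of `URI_SCHEMES` is a stand-in because the post does not show its definition, and the path check is omitted for brevity.

```python
from urllib.parse import urlparse

# Hypothetical stand-in; the original post does not show how this is defined.
URI_SCHEMES = {'http', 'https'}


def normalize_url(url):
    parts = urlparse(url.strip())

    # params, query, and fragment are always dropped.
    changes = {'params': '', 'query': '', 'fragment': ''}

    if not parts.scheme:
        # Treat a bare "example.com/" as an http URL for that host.
        changes.update(scheme='http', netloc=parts.path.strip('/'), path='/')
    elif not parts.path:
        changes['path'] = '/'

    # A single _replace call, hence a single new ParseResult.
    parts = parts._replace(**changes)

    if parts.scheme not in URI_SCHEMES:
        raise ValueError('scheme=%s is not valid' % parts.scheme)
    if '.' not in parts.netloc:
        raise ValueError('location=%s does not contain a domain' % parts.netloc)

    return parts.geturl(), parts


print(normalize_url('example.com/')[0])                # http://example.com/
print(normalize_url('http://example.com?x=1#top')[0])  # http://example.com/
```

The result is still an immutable named tuple; the dict only defers the copy until all changes are known.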
msg230718	Author: R. David Murray (r.david.murray) * (Python committer)	Date: 2014-11-05 22:18
Think of it as immutable like a string is immutable. The cases are exactly parallel (the string function is of course named 'replace' since it doesn't have to deal with the 'arbitrary attribute names' problem namedtuple does), except that it is much easier to address the parts of a url using the namedtuple. _replace is not a "weird secret method"; it is part of the public API of namedtuple. I agree that using '_' is unfortunate. I would have preferred a name pattern like _replace_, to make it clearer that it is not a private method.
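
The "arbitrary attribute names" problem can be illustrated with a named tuple that has a field called `replace`. This is a sketch only; the `Edit` type and its field names are invented for the example.

```python
from collections import namedtuple

# A hypothetical record type that happens to have a field called "replace".
Edit = namedtuple('Edit', ['find', 'replace'])

edit = Edit(find='a', replace='z')

# Attribute access yields the field value, so a method could not share the name.
print(edit.replace)  # z

# The underscore-prefixed method cannot collide with a field, because
# namedtuple rejects field names that start with an underscore.
print(edit._replace(replace='y'))  # Edit(find='a', replace='y')

try:
    namedtuple('Bad', ['_replace'])
    underscore_field_error = None
except ValueError as exc:
    underscore_field_error = exc

print(type(underscore_field_error).__name__)  # ValueError
```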

History
Date	User	Action	Args
2022-04-11 14:57:35	admin	set	github: 60028
2014-11-05 22:18:47	r.david.murray	set	messages: + msg230718
2014-11-05 21:57:40	mhcptg	set	nosy: + mhcptg messages: + msg230715
2012-08-30 19:58:49	eric.araujo	set	nosy: + eric.araujo
2012-08-30 19:22:39	mastahyeti	set	messages: + msg169488
2012-08-30 19:16:46	r.david.murray	set	messages: + msg169487
2012-08-30 19:10:21	mastahyeti	set	messages: + msg169485
2012-08-30 19:03:40	r.david.murray	set	status: open -> closed resolution: works for me messages: + msg169483 stage: resolved
2012-08-30 18:58:07	mastahyeti	set	messages: + msg169482
2012-08-30 18:38:23	orsenthil	set	messages: + msg169479
2012-08-30 18:25:34	r.david.murray	set	nosy: + r.david.murray messages: + msg169478 stage: needs patch -> (no value)
2012-08-30 18:24:57	mastahyeti	set	messages: + msg169477
2012-08-30 18:22:00	ezio.melotti	set	versions: + Python 3.4, - Python 2.7 nosy: + ezio.melotti, orsenthil messages: + msg169476 stage: needs patch
2012-08-30 18:21:01	mastahyeti	set	messages: + msg169475
2012-08-30 18:17:44	mastahyeti	create
