
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: merge urllib and urlparse functionality
Type: enhancement
Stage:
Components: Library (Lib)
Versions: Python 3.0
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To: brett.cannon
Nosy List: brett.cannon, christian.heimes, facundobatista, gvanrossum, orsenthil, techtonik
Priority: normal
Keywords:

Created on 2007-10-26 12:16 by techtonik, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (7)
msg56787 - Author: anatoly techtonik (techtonik) Date: 2007-10-26 12:16
The purpose is to encapsulate all URL handling functions in one module.
At the moment there are three modules that dissect URLs for various bits
of information: urlparse, to split a URL into components; urllib, to
decode the split data; and cgi, to parse the query component.
To outline the API of the proposed module I'll start with urlparse:
http://docs.python.org/lib/module-urlparse.html
1. There are two near-identical functions, urlparse and urlsplit, that
perform the same parsing operation but differ in the format of the
return value. They could be replaced with one (let's call it urlsplitex)
that returns the result as a class with named attributes: not a subclass
of list, but rather a dictionary subclass, because positional arguments
are evil and you always have to look at the reference to find the
correct order when you read or debug the code.
2. The returned class should not be immutable. It must be possible to
modify the results to unset extra parts (like fragments) or edit required
parts as needed, and to get the target URL via urlunsplitex or a method
of the same class. The arguments "default_scheme" and "allow_fragments"
would then be unnecessary, as would the function urldefrag.
3. urlparsex, the replacement for the "parsing" function in the new
library, should be a high-level function that dissects URL information
into a tree-like structure with atomic leaves. This includes decoding
URL entities and splitting parameters into child structures. The
proposed structure of the url class attributes is:
scheme          string
netloc          class
  username      string
  password      string
  server        string
  port          integer
path            list with objects of class
  part          string
  param         list with objects of class
    name        string
    value       string
query           list with objects of class
  name          string
  value         string
fragment        string
4. urlunparsex will be provided to reassemble the class back into a URL.
This will deprecate a series of functions from urllib such as quote,
unquote and urlencode, and also the functions parse_qs and parse_qsl
from the cgi module.
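The four points above can be sketched with dataclasses on top of today's urllib.parse. This is only an illustration of the proposed shape, not code from this issue or from the sandbox urilib: the class names NetLoc, Pair, PathPart and Url are hypothetical, urlparsex and urlunparsex borrow the names suggested above, and the sketch assumes absolute paths and ignores edge cases such as IPv6 literals or parameters without a value.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import parse_qsl, quote, unquote, urlencode, urlsplit, urlunsplit


@dataclass
class NetLoc:  # hypothetical name for the "netloc class"
    username: Optional[str] = None
    password: Optional[str] = None
    server: Optional[str] = None
    port: Optional[int] = None


@dataclass
class Pair:  # hypothetical name/value leaf used by params and query
    name: str
    value: str


@dataclass
class PathPart:  # hypothetical: one path segment plus its ;params
    part: str
    param: List[Pair] = field(default_factory=list)


@dataclass
class Url:  # mutable on purpose, as point 2 asks
    scheme: str = ""
    netloc: NetLoc = field(default_factory=NetLoc)
    path: List[PathPart] = field(default_factory=list)
    query: List[Pair] = field(default_factory=list)
    fragment: str = ""


def _decode(text):
    # urlsplit leaves credentials percent-encoded and may return None.
    return None if text is None else unquote(text)


def urlparsex(url: str) -> Url:
    parts = urlsplit(url)
    path = []
    # Assumption: the path is absolute, so the piece before the first "/" is empty.
    for segment in parts.path.split("/")[1:]:
        name, *raw_params = segment.split(";")
        params = []
        for raw in raw_params:
            key, _, value = raw.partition("=")
            params.append(Pair(unquote(key), unquote(value)))
        path.append(PathPart(unquote(name), params))
    query = [Pair(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)]
    netloc = NetLoc(_decode(parts.username), _decode(parts.password),
                    parts.hostname, parts.port)
    return Url(parts.scheme, netloc, path, query, unquote(parts.fragment))


def urlunparsex(url: Url) -> str:
    loc = url.netloc
    host = loc.server or ""
    if loc.port is not None:
        host = f"{host}:{loc.port}"
    if loc.username is not None:
        cred = quote(loc.username, safe="")
        if loc.password is not None:
            cred += ":" + quote(loc.password, safe="")
        host = f"{cred}@{host}"
    segments = []
    for p in url.path:
        seg = quote(p.part, safe="")
        for pr in p.param:
            seg += ";" + quote(pr.name, safe="") + "=" + quote(pr.value, safe="")
        segments.append(seg)
    path = "/" + "/".join(segments) if segments else ""
    query = urlencode([(q.name, q.value) for q in url.query])
    return urlunsplit((url.scheme, host, path, query, quote(url.fragment, safe="")))


url = urlparsex("http://user:pw@example.com:8080/a%20b/c;x=1?q=hello+world&n=2#frag")
url.fragment = ""                   # unset the fragment instead of calling urldefrag
url.netloc.port = None              # edit a part in place
url.query.append(Pair("page", "3"))
print(urlunparsex(url))
# http://user:pw@example.com/a%20b/c;x=1?q=hello+world&n=2&page=3
```

Keeping every leaf as a plain decoded string means callers never touch quote or unquote themselves; the cost is that a round trip normalizes the encoding rather than preserving the original bytes.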
References:
http://mail.python.org/pipermail/patches/2005-February/016972.html
http://bugs.python.org/issue1722348
http://bugs.python.org/issue1462525 
msg56804 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2007-10-26 17:55
You missed urllib2 I think. :-)
I agree it's a mess. I'm sure it all started out with backwards
compatibility in mind. I find myself often importing cgi only to use
the tiny function escape() that is defined there...
I wonder if web-sig wouldn't be a good place to get some kindred spirits
together to redesign these APIs for Py3k?
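For readers on current Python: the escape() function mentioned here later gained a home outside cgi. As far as I know, html.escape() has been available since Python 3.2 and cgi.escape() was removed in 3.8, so the import below is the present-day equivalent. The sample string is made up for illustration.

```python
from html import escape, unescape

# Made-up sample markup containing the characters that need escaping.
snippet = '<a href="?q=1&r=2">link</a>'

escaped = escape(snippet)  # quote=True by default, so quotes are escaped too
print(escaped)
# &lt;a href=&quot;?q=1&amp;r=2&quot;&gt;link&lt;/a&gt;

# unescape() reverses the transformation.
print(unescape(escaped) == snippet)
```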
msg56954 - Author: Senthil Kumaran (orsenthil) (Python committer) Date: 2007-10-30 03:55
I have started this work at
http://svn.python.org/projects/sandbox/trunk/urilib/
as part of GSoC. Yes, taking it to web-sig would be appropriate and I
shall do so. techtonik, you might want to review urilib and we can
discuss it further.
Thanks,
msg58600 - Author: Christian Heimes (christian.heimes) (Python committer) Date: 2007-12-14 00:31
Please contact Brett Cannon. He organized the stdlib cleanup.
msg58601 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2007-12-14 00:38
Yes, the modules should probably all get merged somehow. But discussing
it in the web-sig is fine with me and I am happy to look at what the
web-sig ends up recommending.
msg67140 - Author: Facundo Batista (facundobatista) (Python committer) Date: 2008-05-21 00:05
Brett, in consideration of PEP 3108... shouldn't we close this issue?
The urilib module in the sandbox hasn't been updated in the last seven months.
Or do we just keep this open as a reminder? (Of what?)
Thanks!
msg67164 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2008-05-21 17:56
While the work is appreciated, PEP 3108 is taking this in a different 
direction.
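For context, the direction PEP 3108 took in Python 3 was to fold urlparse into the urllib package as urllib.parse, where the quoting helpers from urllib and parse_qs/parse_qsl from cgi also live. The results stay immutable named tuples rather than the mutable tree proposed above. A short sketch of that API follows; the example URL is made up.

```python
from urllib.parse import (parse_qs, quote, unquote, urldefrag, urlencode,
                          urlsplit, urlunsplit)

url = "http://example.com/docs/a%20b?tag=x&tag=y#top"  # made-up example URL

parts = urlsplit(url)                 # SplitResult: a named tuple with attributes
print(parts.scheme, parts.netloc)     # http example.com
print(unquote(parts.path))            # /docs/a b
print(parse_qs(parts.query))          # {'tag': ['x', 'y']}

# Dropping the fragment is still a separate helper...
print(urldefrag(url).url)             # http://example.com/docs/a%20b?tag=x&tag=y

# ...and "editing" means building a new tuple with _replace().
edited = parts._replace(query=urlencode({"tag": "z"}), fragment="")
print(urlunsplit(edited))             # http://example.com/docs/a%20b?tag=z

print(quote("a b/c"))                 # a%20b/c
```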
History
Date                 User                 Action  Args
2022-04-11 14:56:27  admin                set     github: 45674
2008-05-21 17:56:02  brett.cannon         set     status: open -> closed
                                                  resolution: out of date
                                                  messages: + msg67164
2008-05-21 00:05:27  facundobatista       set     nosy: + facundobatista
                                                  messages: + msg67140
2008-01-06 22:29:45  admin                set     keywords: - py3k
                                                  versions: Python 3.0
2007-12-14 00:38:33  brett.cannon         set     messages: + msg58601
2007-12-14 00:31:55  christian.heimes     set     assignee: brett.cannon
                                                  messages: + msg58600
                                                  nosy: + brett.cannon, christian.heimes
2007-12-13 18:26:16  alexandre.vassalotti set     resolution: accepted -> (no value)
2007-11-08 14:54:39  christian.heimes     set     priority: normal
                                                  keywords: + py3k
                                                  resolution: accepted
2007-10-30 03:55:36  orsenthil            set     nosy: + orsenthil
                                                  messages: + msg56954
2007-10-26 17:55:29  gvanrossum           set     nosy: + gvanrossum
                                                  messages: + msg56804
2007-10-26 12:16:28  techtonik            create
