[Python-checkins] python/nondist/peps pep-0305.txt,1.7,1.8

montanaro@users.sourceforge.net
31 Jan 2003 13:49:35 -0800


Update of /cvsroot/python/python/nondist/peps
In directory sc8-pr-cvs1:/tmp/cvs-serv9751
Modified Files:
	pep-0305.txt 
Log Message:
various cleanups
expanded Rationale a tad
added Post-History date (announcing it in a moment)
added pointer to sandbox implementation
mentioned implementation in the (massive ;-) Testing section
Index: pep-0305.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0305.txt,v
retrieving revision 1.7
retrieving revision 1.8
diff -C2 -d -r1.7 -r1.8
*** pep-0305.txt	30 Jan 2003 13:34:29 -0000	1.7
--- pep-0305.txt	31 Jan 2003 21:49:32 -0000	1.8
***************
*** 12,16 ****
 Content-Type: text/x-rst
 Created: 26-Jan-2003
! Post-History: 
 
 
--- 12,16 ----
 Content-Type: text/x-rst
 Created: 26-Jan-2003
! Post-History: 31-Jan-2003
 
 
***************
*** 25,29 ****
 PEP defines an API for reading and writing CSV files which should make
 it possible for programmers to select a CSV module which meets their
! requirements.
 
 
--- 25,30 ----
 PEP defines an API for reading and writing CSV files which should make
 it possible for programmers to select a CSV module which meets their
! requirements. It is accompanied by a corresponding module which
! implements the API.
 
 
***************
*** 47,55 ****
 CSV files:
 
! - Object Craft's CSV module [1]_
 
! - Cliff Wells's Python-DSV module [2]_
 
! - Laurence Tratt's ASV module [3]_
 
 Each has a different API, making it somewhat difficult for programmers
--- 48,56 ----
 CSV files:
 
! - Object Craft's CSV module [2]_
 
! - Cliff Wells' Python-DSV module [3]_
 
! - Laurence Tratt's ASV module [4]_
 
 Each has a different API, making it somewhat difficult for programmers
***************
*** 70,73 ****
--- 71,89 ----
 distribution.
 
+ CSV formats are not well-defined and different implementations have a
+ number of subtle corner cases. It has been suggested that the "V" in
+ the acronym stands for "Vague" instead of "Values". Different
+ delimiters and quoting characters are just the start. Some programs
+ generate whitespace after the delimiter. Others quote embedded
+ quoting characters by doubling them or prefixing them with an escape
+ character. The list of weird ways to do things seems nearly endless.
+ 
+ Unfortunately, all this variability and subtlety means it is difficult
+ for programmers to reliably parse CSV files from many sources or
+ generate CSV files designed to be fed to specific external programs
+ without deep knowledge of those sources and programs. This PEP and
+ the software which accompany it attempt to make the process less
+ fragile.
+ 
 
 Module Interface
***************
*** 77,81 ****
 writing. The basic reading interface is::
 
! reader(fileobj [, dialect='excel2000'] [optional keyword args])
 
 A reader object is an iterable which takes a file-like object opened
--- 93,98 ----
 writing. The basic reading interface is::
 
! obj = reader(fileobj [, dialect='excel2000']
! [optional keyword args])
 
 A reader object is an iterable which takes a file-like object opened
***************
*** 92,102 ****
 The writing interface is similar::
 
! writer(fileobj [, dialect='excel2000'], [, fieldnames=list]
! [optional keyword args])
 
 A writer object is a wrapper around a file-like object opened for
 writing. It accepts the same optional keyword parameters as the
 reader constructor. In addition, it accepts an optional fieldnames
! argument. This is a list which defines the order of fields in the
 output file. It allows the write() method to accept mapping objects
 as well as sequence objects.
--- 109,119 ----
 The writing interface is similar::
 
! obj = writer(fileobj [, dialect='excel2000'], [, fieldnames=seq]
! [optional keyword args])
 
 A writer object is a wrapper around a file-like object opened for
 writing. It accepts the same optional keyword parameters as the
 reader constructor. In addition, it accepts an optional fieldnames
! argument. This is a sequence that defines the order of fields in the
 output file. It allows the write() method to accept mapping objects
 as well as sequence objects.
***************
*** 116,119 ****
--- 133,138 ----
 csvwriter.write(row)
 
+ or arrange for it to be the first row in the iterable being written.
+ 
 
 Dialects
***************
*** 123,141 ****
 convenient handle on a group of lower level parameters.
 
! When dialect is a string it identifies one of the dialect which is
 known to the module, otherwise it is processed as a dialect class as
 described below.
! 
 Dialects will generally be named after applications or organizations
 which define specific sets of format constraints. The initial dialect
! is excel2000, which describes the format constraints of Excel 2000's
! CSV format. Another possible dialect (used here only as an example)
! might be "gnumeric".
 
! Dialects are implemented as attribute only classes to enable user to
! construct variant dialects by subclassing. The excel2000 dialect is
 implemented as follows::
 
! class excel2000:
 quotechar = '"'
 delimiter = ','
--- 142,160 ----
 convenient handle on a group of lower level parameters.
 
! When dialect is a string it identifies one of the dialects which is
 known to the module, otherwise it is processed as a dialect class as
 described below.
! 
 Dialects will generally be named after applications or organizations
 which define specific sets of format constraints. The initial dialect
! is "excel", which describes the format constraints of Excel 97 and
! Excel 2000 regarding CSV input and output. Another possible dialect
! (used here only as an example) might be "gnumeric".
 
! Dialects are implemented as attribute only classes to enable users to
! construct variant dialects by subclassing. The "excel" dialect is
 implemented as follows::
 
! class excel:
 quotechar = '"'
 delimiter = ','
***************
*** 151,161 ****
 delimiter = '\t'
 
! Two functions are defined in the API to set and retrieve dialects::
 
 set_dialect(name, dialect)
 dialect = get_dialect(name)
 
 The dialect parameter is a class or instance whose attributes are the
! formatting parameters defined in the next section.
 
 
--- 170,184 ----
 delimiter = '\t'
 
! Three functions are defined in the API to set, get and list dialects::
 
 set_dialect(name, dialect)
 dialect = get_dialect(name)
+ known_dialects = list_dialects()
 
 The dialect parameter is a class or instance whose attributes are the
! formatting parameters defined in the next section. The
! list_dialects() function returns all the registered dialect names as
! given in previous set_dialect() calls (both predefined and
! user-defined).
 
 
***************
*** 168,209 ****
 for the set_dialect() and get_dialect() module functions.
 
! - quotechar specifies a one-character string to use as the quoting
 character. It defaults to '"'.
 
! - delimiter specifies a one-character string to use as the field
 separator. It defaults to ','.
 
! - escapechar specifies a one character string used to escape the
 delimiter when quotechar is set to None.
 
! - skipinitialspace specifies how to interpret whitespace which
 immediately follows a delimiter. It defaults to False, which means
! that whitespace immediate following a delimiter is part of the
 following field.
 
! - lineterminator specifies the character sequence which should
 terminate rows.
 
! - quoting controls when quotes should be generated by the
! writer.
 
! "minimal" means only when required, for example, when a field
! contains either the quotechar or the delimiter
 
! "always" means that quotes are always placed around fields.
 
! "nonnumeric" means that quotes are always placed around fields
! which contain characters other than [+-0-9.].
 
! ... XXX More to come XXX ...
 
 When processing a dialect setting and one or more of the other
 optional parameters, the dialect parameter is processed first, then
 the others are processed. This makes it easy to choose a dialect,
! then override one or more of the settings. For example, if a CSV file
! was generated by Excel 2000 using single quotes as the quote
! character and TAB as the delimiter, you could create a reader like::
 
! csvreader = csv.reader(file("some.csv"), dialect="excel2000",
 quotechar="'", delimiter='\t')
 
--- 191,235 ----
 for the set_dialect() and get_dialect() module functions.
 
! - ``quotechar`` specifies a one-character string to use as the quoting
 character. It defaults to '"'.
 
! - ``delimiter`` specifies a one-character string to use as the field
 separator. It defaults to ','.
 
! - ``escapechar`` specifies a one character string used to escape the
 delimiter when quotechar is set to None.
 
! - ``skipinitialspace`` specifies how to interpret whitespace which
 immediately follows a delimiter. It defaults to False, which means
! that whitespace immediately following a delimiter is part of the
 following field.
 
! - ``lineterminator`` specifies the character sequence which should
 terminate rows.
 
! - ``quoting`` controls when quotes should be generated by the
! writer. It can take on any of the following module constants::
 
! csv.QUOTE_MINIMAL means only when required, for example, when a
! field contains either the quotechar or the delimiter
 
! csv.QUOTE_ALL means that quotes are always placed around fields.
 
! csv.QUOTE_NONNUMERIC means that quotes are always placed around
! fields which contain characters other than [+-0-9.].
 
! - ``doublequote`` (tbd)
! 
! - are there more to come?
 
 When processing a dialect setting and one or more of the other
 optional parameters, the dialect parameter is processed first, then
 the others are processed. This makes it easy to choose a dialect,
! then override one or more of the settings without defining a new
! dialect class. For example, if a CSV file was generated by Excel 2000
! using single quotes as the quote character and TAB as the delimiter,
! you could create a reader like::
 
! csvreader = csv.reader(file("some.csv"), dialect="excel",
 quotechar="'", delimiter='\t')
 
***************
*** 212,219 ****
 
 
 Testing
 =======
 
! TBD.
 
 
--- 238,253 ----
 
 
+ Implementation
+ ==============
+ 
+ There is a sample implementation available. [1]_ The goal is for it
+ to efficiently implement the API described in the PEP. It is heavily
+ based on the Object Craft csv module. [2]_
+ 
+ 
 Testing
 =======
 
! The sample implementation [1]_ includes a set of test cases.
 
 
***************
*** 284,294 ****
 ==========
 
! .. [1] csv module, Object Craft
! (http://www.object-craft.com.au/projects/csv) 
 
! .. [2] Python-DSV module, Wells
! (http://sourceforge.net/projects/python-dsv/) 
 
! .. [3] ASV module, Tratt
 (http://tratt.net/laurie/python/asv/)
 
--- 318,331 ----
 ==========
 
! .. [1] csv module, Python Sandbox
! (http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/nondist/sandbox/csv/)
 
! .. [2] csv module, Object Craft
! (http://www.object-craft.com.au/projects/csv)
 
! .. [3] Python-DSV module, Wells
! (http://sourceforge.net/projects/python-dsv/)
! 
! .. [4] ASV module, Tratt
 (http://tratt.net/laurie/python/asv/)
 

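The Dialects section describes building variant dialects by subclassing and registering them by name, and the quoting parameter now takes module constants. The following is a rough sketch of both ideas using the csv module as released in the standard library, not the draft API in this revision: the released module spells the draft's set_dialect() as register_dialect(), and its QUOTE_NONNUMERIC decides by Python type rather than by the characters in the field. The dialect name "excel_pipe" and the helper render() are hypothetical names chosen for this example.

```python
import csv
import io


# A variant dialect created by subclassing the stock "excel" dialect.
# "excel_pipe" is a hypothetical name used only in this sketch.
class excel_pipe(csv.excel):
    delimiter = "|"


# register_dialect() is the released spelling of the draft's set_dialect().
csv.register_dialect("excel_pipe", excel_pipe)
known = csv.list_dialects()


def render(row, quoting):
    """Write one row with the given quoting constant and return the text."""
    buf = io.StringIO(newline="")
    csv.writer(buf, dialect="excel_pipe", quoting=quoting).writerow(row)
    return buf.getvalue()


row = ["plain", "has|pipe", 42]
minimal = render(row, csv.QUOTE_MINIMAL)        # quote only where required
everything = render(row, csv.QUOTE_ALL)         # quote every field
nonnumeric = render(row, csv.QUOTE_NONNUMERIC)  # quote non-numeric fields

for label, text in [("minimal", minimal), ("all", everything),
                    ("nonnumeric", nonnumeric)]:
    print(label, repr(text))
```

Passing quoting as a keyword argument alongside dialect follows the same "dialect first, then overrides" order that the PEP describes for readers.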