[Python-3000] PEP: str(container) should call str(item), not repr(item)

Oleg Broytmann phd at phd.pp.ru
Thu May 29 21:21:57 CEST 2008


Hello. A draft for a discussion.
PEP: XXX
Title: str(container) should call str(item), not repr(item)
Version: $Revision$
Last-Modified: $Date$
Author: Oleg Broytmann <phd at phd.pp.ru>,
 Jim Jewett <jimjjewett at gmail.com>
Discussions-To: python-3000 at python.org
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 27-May-2008
Post-History: 28-May-2008
Abstract
 This document discusses the advantages and disadvantages of the
 current implementation of str(container). It also discusses the
 pros and cons of a different approach - to call str(item) instead
 of repr(item).
Motivation
 Currently str(container) calls repr on items. Arguments for it:
 -- containers refuse to guess what the user wants to see on
 str(container) - surroundings, delimiters, and so on;
 -- repr(item) usually displays type information - apostrophes
 around strings, class names, etc.
 Arguments against:
 -- it's illogical; str() is expected to call __str__ if it exists,
 not __repr__;
 -- there is no standard way to print a container's content by
 calling the items' __str__, which is inconvenient in cases where
 __str__ and __repr__ return different results;
 -- repr(item) sometimes does the wrong thing (for example, it
 hex-escapes non-ASCII strings).
 This PEP proposes to change how str(container) works. It is
 proposed to mimic how repr(container) works except for one detail
 - call str on items instead of repr. This allows a user to choose
 which result she wants to get - from item.__repr__ or item.__str__.
Current situation
 Most container types (tuples, lists, dicts, sets, etc.) do not
 implement a __str__ method, so str(container) calls
 container.__repr__, and container.__repr__, once called, has no
 way to know it was called from str and always calls repr on the
 container's items.
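 The fallback described above can be checked directly. This is a
 minimal sketch, assuming CPython 3; the class name Item and the
 variable names are hypothetical and exist only for illustration.

```python
# Builtin containers define __repr__ but not __str__, so str() falls
# back to object.__str__, which simply calls repr() on the container.
for container_type in (list, tuple, dict, set):
    assert '__repr__' in vars(container_type)
    assert '__str__' not in vars(container_type)


class Item:
    def __str__(self):
        return "STR"

    def __repr__(self):
        return "REPR"


# Both calls end up in list.__repr__, which calls repr() on each item.
as_str = str([Item()])
as_repr = repr([Item()])
print(as_str, as_repr)
```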
 This behaviour has advantages and disadvantages. One advantage is
 that most items are represented with type information - strings
 are surrounded by apostrophes, instances may have both class name
 and instance data:
 >>> print([42, '42'])
 [42, '42']
 >>> print([Decimal('42'), datetime.now()])
 [Decimal("42"), datetime.datetime(2008, 5, 27, 19, 57, 43, 485028)]
 The disadvantage is that __repr__ often returns technical data
 (like '<object at address>') or an unreadable string (a hex-escaped
 string if the input is a non-ASCII string):
 >>> print(['тест'])
 ['\xd4\xc5\xd3\xd4']
 One of the motivations for PEP 3138 is that neither repr nor str
 will allow the sensible printing of dicts whose keys are non-ascii
 text strings. Now that unicode identifiers are allowed, this
 includes Python's own attribute dicts. It also includes JSON
 serialization (and forced the json lib to jump through some hoops).
 PEP 3138 proposes to fix this by breaking the "repr is safe ASCII"
 invariant, and changing the way repr (which is used for
 persistence) outputs some objects, with system-dependent failures.
 Changing how str(container) works would allow easy debugging in
 the normal case, and retain the safety of ASCII-only for the
 machine-readable case. The only downside is that str(x) and
 repr(x) would more often be different -- but only in those cases
 where the current almost-the-same version is insufficient.
 It also seems illogical that str(container) calls repr on items
 instead of str. It's only logical to expect the following code
 class Test:
     def __str__(self):
         return "STR"
     def __repr__(self):
         return "REPR"
 test = Test()
 print(test)
 print(repr(test))
 print([test])
 print(str([test]))
 to print
 STR
 REPR
 [STR]
 [STR]
 where it actually prints
 STR
 REPR
 [REPR]
 [REPR]
 It is especially illogical that print in Python 2 uses str
 when it is called on what seems to be a tuple:
 >>> print Decimal('42'), datetime.now()
 42 2008-05-27 20:16:22.534285
 where on an actual tuple it prints
 >>> print((Decimal('42'), datetime.now()))
 (Decimal("42"), datetime.datetime(2008, 5, 27, 20, 16, 27, 937911))
A different approach - call str(item)
 For example, with numbers it is often only the value that people
 care about.
 >>> print Decimal('3')
 3
 But putting the value in a list forces users to read the type
 information, exactly as if repr had been called for the benefit of
 a machine:
 >>> print [Decimal('3')]
 [Decimal("3")]
 After this change, the type information would not clutter the str
 output:
 >>> print "%s" % ([Decimal('3')],)
 [3]
 >>> print str([Decimal('3')])  # equivalent
 [3]
 But it would still be available if desired:
 >>> print "%r" % ([Decimal('3')],)
 [Decimal('3')]
 >>> print repr([Decimal('3')])  # equivalent
 [Decimal('3')]
 There are a number of strategies to fix the problem. The most
 radical is to change __repr__ so it accepts a new parameter (flag)
 meaning "called from str, so call str on items, not repr". The
 drawback of the proposal is that every __repr__ implementation
 must be changed. Introspection could help a bit (inspect __repr__
 before calling to see whether it accepts 2 or 3 parameters), but
 introspection doesn't work on classes written in C, like all
 builtin containers.
 A less radical proposal is to implement __str__ methods for builtin
 container types. The obvious drawback is a duplication of effort
 - all those __str__ and __repr__ implementations differ only
 in one small detail - whether they call str or repr on items.
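 The idea of a container-level __str__ can be sketched in pure
 Python. The subclass name StrList is hypothetical and this is not
 the builtin implementation; it assumes Python 3 and only
 illustrates the "one small detail" that differs from __repr__.

```python
from decimal import Decimal


class StrList(list):
    """Hypothetical list subclass used only to illustrate the idea."""

    def __str__(self):
        # Same layout as list.__repr__, but str() is applied to items.
        return "[" + ", ".join(str(item) for item in self) + "]"


values = StrList([Decimal("3"), "abc"])
print(str(values))   # items rendered with str()
print(repr(values))  # inherited list.__repr__, items rendered with repr()
```

 Note that this sketch does not handle self-referential lists, and
 a plain list nested inside a StrList would still render its own
 items with repr.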
 The most conservative proposal is not to change str at all but
 to allow developers to implement their own application- or
 library-specific pretty-printers. The drawback is again
 a multiplication of effort and proliferation of many small
 specific container-traversal algorithms.
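 An application-specific printer of the kind described above might
 look like the following sketch. The function name str_items is
 hypothetical; it assumes Python 3 and covers only lists, tuples
 and dicts, which shows how each project would end up writing its
 own traversal.

```python
from decimal import Decimal


def str_items(obj):
    """Hypothetical helper: format containers, calling str() on leaves."""
    if isinstance(obj, list):
        return "[" + ", ".join(str_items(x) for x in obj) + "]"
    if isinstance(obj, tuple):
        inner = ", ".join(str_items(x) for x in obj)
        # A one-element tuple keeps its trailing comma, as in repr().
        return "(" + inner + ("," if len(obj) == 1 else "") + ")"
    if isinstance(obj, dict):
        pairs = (str_items(k) + ": " + str_items(v) for k, v in obj.items())
        return "{" + ", ".join(pairs) + "}"
    # Anything that is not a known container is a leaf.
    return str(obj)


result = str_items([Decimal("3"), (1,), {"a": [2]}])
print(result)
```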
Backward compatibility
 In those cases where type information is more important than
 usual, it will still be possible to get the current results by
 calling repr explicitly.
Copyright
 This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
Oleg.
-- 
 Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru
 Programmers don't die, they just GOSUB without RETURN.

