This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.

classification
Title: unittest.TestCase.assertEqual does not show diff when comparing str with unicode
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: ezio.melotti, jaap.karssenberg, michael.foord, r.david.murray
Priority: normal
Keywords:

Created on 2012-02-15 21:00 by jaap.karssenberg, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (7)
msg153434 - Author: Jaap Karssenberg (jaap.karssenberg) Date: 2012-02-15 20:59
When you compare two multiline strings with unittest.TestCase.assertEqual, it is supposed to dispatch to assertMultiLineEqual and show a diff when the strings differ.
However, this only works for two strings of the same type (str or unicode). Mixing the two just gives the default message that they differ, without the diff. This is due to the way the dispatch checks types. It probably needs an exception for comparing str with unicode.
Note that if the contents of both strings are the same, the assert regards them as equal, so it is not a strict check on the type per se.
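The report above can be approximated on Python 3, where str/unicode mixing no longer exists. This is only a sketch under an assumption: `Text` is a hypothetical str subclass standing in for the "other" string type, and `failure_message` is a helper invented here to capture the assertion text. It shows that assertEqual produces the diff only when both arguments have exactly the same type.

```python
import unittest


class Text(str):
    """Hypothetical str subclass standing in for a second string type."""


def failure_message(first, second):
    """Return the text of the AssertionError raised by assertEqual, or None."""
    case = unittest.TestCase()
    try:
        case.assertEqual(first, second)
    except AssertionError as exc:
        return str(exc)
    return None


# Both arguments are plain str: assertEqual dispatches to assertMultiLineEqual.
same_type = failure_message("spam\neggs\n", "spam\nham\n")

# The types differ: assertEqual falls back to the plain one-line message.
mixed_type = failure_message("spam\neggs\n", Text("spam\nham\n"))

# Equal content passes regardless of the type mismatch.
equal_mixed = failure_message("spam\n", Text("spam\n"))

print(same_type)
print(mixed_type)
print(equal_mixed)
```

The first message carries a line-by-line diff, the second only the one-line summary, and the third call raises nothing.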
msg153435 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-02-15 21:04
The latter is arguably a bug. The former is working as designed, as far as I know. In Python 3, bytes and str do not compare equal.
msg153440 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-02-15 21:39
In case it isn't clear, by "arguably a bug" I mean in a theoretical sense. Even if Michael agrees with me we can't change the fact that 2.7 unittest treats str and unicode with the same content as equal.
For the other it might have been a backport-from-3.x oversight, or it might have been intentional.
msg153449 - Author: Michael Foord (michael.foord) * (Python committer) Date: 2012-02-15 23:29
assertEqual uses Python equality semantics - so if a str instance and a unicode instance compare equal then assertEqual passes. This is by design.
The type check in assertEqual, that delegates to the different comparison methods, is strict because we can't know that using the error message algorithms is sane for arbitrary subclasses - all we can know is whether an equality comparison fails or succeeds.
Using a diff algorithm for creating an error message only makes sense for text, which is why it is only done for unicode. For binary strings a diff is more likely to be unintelligible nonsense.
For comparing unicode to strings you can call assertMultiLineEqual directly.
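A minimal sketch of the direct call suggested above, written for Python 3. It assumes a hypothetical str subclass `Text` as a stand-in for the second string type, since str and unicode are no longer distinct there. Calling assertMultiLineEqual directly skips the type-based dispatch, so the diff appears even though the argument types differ.

```python
import unittest


class Text(str):
    """Hypothetical str subclass standing in for a second string type."""


case = unittest.TestCase()
try:
    # Bypass assertEqual's dispatch and ask for the multi-line diff explicitly.
    case.assertMultiLineEqual("spam\neggs\n", Text("spam\nham\n"))
except AssertionError as exc:
    direct_message = str(exc)
else:
    direct_message = None

print(direct_message)
```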
msg153469 - Author: Jaap Karssenberg (jaap.karssenberg) Date: 2012-02-16 09:15
On Thu, Feb 16, 2012 at 12:29 AM, Michael Foord <report@bugs.python.org> wrote:
> The type check in assertEqual, that delegates to the different comparison
> methods, is strict because we can't know that using the error message
> algorithms is sane for arbitrary subclasses - all we can know is whether an
> equality comparison fails or succeeds.
>
So would you allow me to register a method for type "basestring" and have
assertEqual dispatch to that method when both arguments are of this type?
That way at least I could customize the behavior in subclasses.
Thanks,
Jaap
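For context on the question above, here is a sketch of how the existing registration hook, `TestCase.addTypeEqualityFunc`, behaves on Python 3. `Base`, `Child`, and `assert_base_equal` are hypothetical names used only for illustration. The registered function runs only when both arguments have exactly the registered type, so registering a base class does not cover its subclasses.

```python
import unittest


class Base:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Base) and self.value == other.value


class Child(Base):
    pass


case = unittest.TestCase()
calls = []  # records which argument types reached the registered function


def assert_base_equal(first, second, msg=None):
    """Type-specific equality function, registered for Base only."""
    calls.append((type(first).__name__, type(second).__name__))
    if first != second:
        raise case.failureException(msg or "Base values differ")


case.addTypeEqualityFunc(Base, assert_base_equal)

case.assertEqual(Base(1), Base(1))    # exact type match: the function is used
case.assertEqual(Child(1), Child(1))  # same type, but Child is not registered
case.assertEqual(Base(1), Child(1))   # different types: default comparison

print(calls)
```

Only the first assertEqual call reaches `assert_base_equal`; the other two fall through to the default equality check.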
msg153518 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-02-17 00:23
If you really want the diff you could use assertMultiLineEqual, but even on Python 2 you shouldn't mix str and unicode. I would rather fix the code to return unicode than use assertMultiLineEqual to get a diff between str and unicode. Moreover, assertMultiLineEqual only works if the str happens to be ASCII-only:
>>> class MyTest(TestCase):
...     def test_foo(self):
...         self.assertMultiLineEqual('bàr', u'bàz')
...
>>> unittest.main(exit=False)
E
======================================================================
ERROR: test_foo (__main__.MyTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<stdin>", line 3, in test_foo
  File "/home/wolf/dev/py/2.7/Lib/unittest/case.py", line 920, in assertMultiLineEqual
    diff = '\n' + ''.join(difflib.ndiff(firstlines, secondlines))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
----------------------------------------------------------------------
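The error in the traceback above comes from Python 2 implicitly decoding the byte string as ASCII when it is combined with unicode. The following Python 3 sketch makes that decode explicit. It assumes the byte string held UTF-8 data, which the 0xc3 byte in the traceback suggests but does not prove. It also shows the fix recommended above: decode to text with the real encoding before comparing.

```python
# UTF-8 bytes for 'bàr' (assumption: the original str was UTF-8 encoded).
raw = b"b\xc3\xa0r"

try:
    # Python 2 attempts this decode implicitly when str meets unicode.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    ascii_error = exc
else:
    ascii_error = None

# Decoding with the actual encoding up front yields text that is safe to diff.
text = raw.decode("utf-8")

print(ascii_error)
print(text == "b\u00e0r")
```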
msg153535 - Author: Jaap Karssenberg (jaap.karssenberg) Date: 2012-02-17 08:37
On Fri, Feb 17, 2012 at 1:23 AM, Ezio Melotti <report@bugs.python.org> wrote:
>
> Ezio Melotti <ezio.melotti@gmail.com> added the comment:
>
> If you really want the diff you could use assertMultiLineEqual, but even
> on Python 2 you shouldn't mix str and unicode. I would rather fix the code
> to return unicode than use assertMultiLineEqual to get a diff between str
> and unicode. Moreover assertMultiLineEqual only works if the str happens
> to be ASCII-only:
>
Yes, I'm aware of that. However, to my mind there is an inconsistency between
having assertEqual dispatch per type and having to call assertMultiLineEqual
explicitly. If assertMultiLineEqual accepts basestring, I should
be able to register it as such.
More practically, I have a large suite of code using assertEqual to compare
mixed str and unicode. This code was written before the diff function was
available (in fact I had a custom diff function in the subclass). As long
as the tests pass this works fine, so I would rather not touch them, but if
they fail I don't get the output I need.
Anyway, since I feel there is no consensus on this, I went ahead and
patched assertEqual in my custom subclass and moved ahead. I can submit a
formal patch if there is a chance of it being accepted.
Regards,
Jaap
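The actual patch is not shown in this thread. As a rough illustration of the kind of subclass override described, here is a Python 3 sketch. `DiffingTestCase` and `Text` are hypothetical names, and `isinstance(..., str)` plays the role that `basestring` would play on Python 2.7.

```python
import unittest


class Text(str):
    """Hypothetical str subclass standing in for a second string type."""


class DiffingTestCase(unittest.TestCase):
    """Hypothetical base class that diffs any pair of string arguments."""

    def assertEqual(self, first, second, msg=None):
        if isinstance(first, str) and isinstance(second, str):
            # Relax the exact-type dispatch for strings only.
            self.assertMultiLineEqual(first, second, msg)
        else:
            super().assertEqual(first, second, msg)


case = DiffingTestCase()
try:
    case.assertEqual("spam\neggs\n", Text("spam\nham\n"))
except AssertionError as exc:
    patched_message = str(exc)
else:
    patched_message = None

case.assertEqual(1, 1)  # non-string arguments keep the default behaviour

print(patched_message)
```

On Python 2.7 the caveat raised earlier in the thread would still apply: such an override only produces a diff when the byte string is ASCII-only.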
History
Date                 User              Action  Args
2022-04-11 14:57:26  admin             set     github: 58233
2012-02-17 08:37:18  jaap.karssenberg  set     messages: + msg153535
2012-02-17 00:23:33  ezio.melotti      set     messages: + msg153518
2012-02-16 09:15:09  jaap.karssenberg  set     messages: + msg153469
2012-02-15 23:29:04  michael.foord     set     status: open -> closed
                                               resolution: not a bug
                                               messages: + msg153449
2012-02-15 21:39:51  r.david.murray    set     messages: + msg153440
2012-02-15 21:06:41  ezio.melotti      set     nosy: + ezio.melotti
2012-02-15 21:04:01  r.david.murray    set     nosy: + r.david.murray, michael.foord
                                               messages: + msg153435
2012-02-15 21:00:00  jaap.karssenberg  create
