Created on 2009-10-07 12:42 by rszefler, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| utffixedsysloghandler.py | Remi.Broemeling, 2010-09-02 20:19 | UTFFixedSysLogHandler, an overridden logging.handlers.SysLogHandler that fixes the BOM bug introduced in r75586 |
| Messages (14) | |||
|---|---|---|---|
| msg93694 | Author: Robert Szefler (rszefler) | Date: 2009-10-07 12:42 | |
Trying to .emit() a Unicode string causes an awkward exception to be thrown:
Traceback (most recent call last):
  File "/usr/lib/python2.5/logging/handlers.py", line 672, in emit
    self.socket.sendto(msg, self.address)
TypeError: sendto() takes exactly 3 arguments (2 given)
The issue is fixed simply by adding some sort of encoding coercion before the sendto, for example:
if type(msg) == unicode:
    msg = msg.encode('utf-8') |
|||
| msg93724 | Author: Vinay Sajip (vinay.sajip) * (Python committer) | Date: 2009-10-07 21:15 | |
To do this in a non-arbitrary way, it would make sense for the SysLogHandler (and perhaps the other socket-based handlers, too) to grow an optional encoding argument to their constructors, to be used to encode when converting from unicode to str (str -> bytes for Py3K). How does that sound? |
|||
| msg93734 | Author: Robert Szefler (rszefler) | Date: 2009-10-08 08:09 | |
Fine with me, though problems would arise. Default encoding, for example. If encoding selection is mandatory, it would break compatibility. Using the default locale is not such a good idea: the local machine's locale generally has no correlation to the remote logger's. Maybe the best solution would be to coerce the text to ASCII by default (so as not to break current semantics), but fix the exception thrown (throw a Unicode*Error) and allow an optional encoding parameter to handle non-ASCII characters? |
|||
| msg93737 | Author: Vinay Sajip (vinay.sajip) * (Python committer) | Date: 2009-10-08 09:03 | |
> Robert Szefler added the comment:
>
> Fine with me, though problems would arise. Default encoding for example.
> If encoding selection is mandatory it would break compatibility. Using
> default locale is not such a good idea - local machine's locale would
> generally not need to have any correlation to the remote logger's.

I'm not planning to make encoding selection mandatory: I would provide a parameter encoding=None so backward compatibility is preserved.

On 2.x: During output, a check will be made for Unicode. If not found, the data is output as is. Otherwise (if Unicode) it's encoded using either the specified encoding (if not None) or some default - for example, locale.getpreferredencoding(). I understand what you're saying about the locales of two different machines not being the same - but there's no way around this, because if a socket receives bytes representing text, it needs to know what encoding was used so that it can reconstruct the Unicode correctly. So that information at least needs to be known at the receiving end, rather than guessed. While 'utf-8' might be a reasonable choice, I'm not sure it should be enforced. So the code sending the bytes can specify e.g. 'cp1251' and the other end has to know so it can decode correctly. I've posted on python-dev for advice about what encoding to use if none is specified.

On 3.x: We will always be passing Unicode in so we will always need to convert to bytes using some encoding. Again, if not specified, a suitable default encoding needs to be chosen.

> Maybe the best solution would be to coerce the text to ASCII per default
> (such as not to break current semantics) but fix the exception thrown
> (throw an Unicode*Error) and allow an optional encoding parameter to
> handle non-ASCII characters?

I'm not exactly sure what you mean, but I think I've covered it in my comments above.

To summarise: On 2.x, encoding is not mandatory but if Unicode is passed in, either a specified encoding or a suitable default encoding will be used to encode the Unicode into str. On 3.x, encoding is not mandatory and Unicode should always be passed in, which will be encoded to bytes using either a specified encoding or a suitable default encoding. |
|||
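The encoding rule summarised in the message above can be sketched roughly as follows, in Python 3 terms. This only illustrates the proposal as it stood at this point in the thread, not code that was committed; the function name `encode_for_socket` and its `encoding=None` parameter are hypothetical.

```python
import locale


def encode_for_socket(msg, encoding=None):
    """Return bytes suitable for socket.sendto().

    Hypothetical helper: bytes pass through unchanged, text is encoded
    with the caller's encoding or, failing that, a locale-derived default.
    """
    if isinstance(msg, bytes):
        # Already encoded: output as is.
        return msg
    if encoding is None:
        # Assumed default, as floated in the message above.
        encoding = locale.getpreferredencoding(False)
    return msg.encode(encoding)


# The sender picks 'cp1251'; the receiver has to know that to decode.
payload = encode_for_socket("лог", encoding="cp1251")
```

The receiving end would need to call `payload.decode("cp1251")` with the same encoding, which is the coordination problem the message describes.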
| msg94141 | Author: Vinay Sajip (vinay.sajip) * (Python committer) | Date: 2009-10-16 16:28 | |
According to information from Martin von Löwis - see http://mail.python.org/pipermail/python-dev/2009-October/092825.html - UTF-8 should always be used, with a BOM, when sending Unicode (according to RFC 5424). The fix will use this approach. No encoding parameter will be added. |
|||
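As a rough illustration of "UTF-8 with a BOM", the message text can be encoded as UTF-8 and prefixed with the UTF-8 byte order mark. This is a minimal sketch, not the committed fix; the helper name `encode_with_bom` is an assumption made for the example.

```python
import codecs


def encode_with_bom(msg):
    """Encode text as UTF-8 preceded by the UTF-8 byte order mark."""
    return codecs.BOM_UTF8 + msg.encode("utf-8")


data = encode_with_bom("zażółć")

# A receiver can strip the BOM by decoding with the 'utf-8-sig' codec.
text = data.decode("utf-8-sig")
```

Because the encoding is fixed, the receiver no longer needs out-of-band knowledge of which codec the sender chose.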
| msg94323 | Author: Vinay Sajip (vinay.sajip) * (Python committer) | Date: 2009-10-21 20:34 | |
Fix checked into trunk and py3k (r75586). Please verify in your environment and post your results here. There are no plans to backport this to 2.6 or earlier. |
|||
| msg94937 | Author: Vinay Sajip (vinay.sajip) * (Python committer) | Date: 2009-11-05 17:53 | |
No adverse feedback on fix, so closing. |
|||
| msg111646 | Author: Georg Brandl (georg.brandl) * (Python committer) | Date: 2010-07-26 17:16 | |
There is indeed a problem with the patch: the BOM is put in front of the angle brackets indicating the priority/facility, so the syslog can't find it anymore. The BOM should be put after the brackets. |
|||
| msg114413 | Author: Vinay Sajip (vinay.sajip) * (Python committer) | Date: 2010-08-19 22:20 | |
Updated implementation so that <n> + BOM + message is sent, for py3k branch only (r84218). Please verify fix in your environment. |
|||
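The `<n>` + BOM + message ordering can be sketched like this. It is an illustration under assumptions rather than the code from r84222: the function `build_frame` is hypothetical, and it computes `<n>` as facility times 8 plus severity from the constants on `logging.handlers.SysLogHandler`.

```python
import codecs
from logging.handlers import SysLogHandler


def build_frame(text,
                facility=SysLogHandler.LOG_USER,
                severity=SysLogHandler.LOG_INFO):
    """Return b'<n>' + BOM + UTF-8 message, with n = facility * 8 + severity."""
    prefix = "<%d>" % ((facility << 3) | severity)
    return prefix.encode("ascii") + codecs.BOM_UTF8 + text.encode("utf-8")


frame = build_frame("użytkownik zalogowany")

# The priority marker is the first thing a syslog daemon sees;
# the BOM follows it instead of preceding it.
starts_with_priority = frame.startswith(b"<14>")
```

Putting the BOM first, as reported in msg111646, would leave the frame starting with the bytes EF BB BF, so a daemon looking for a leading `<` would not find the priority.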
| msg114422 | Author: Vinay Sajip (vinay.sajip) * (Python committer) | Date: 2010-08-20 08:44 | |
Err, make that r84222. |
|||
| msg114566 | Author: Georg Brandl (georg.brandl) * (Python committer) | Date: 2010-08-21 21:32 | |
Looks good to me. |
|||
| msg115382 | Author: Remi Broemeling (Remi.Broemeling) | Date: 2010-09-02 14:54 | |
I was encountering the logging.handlers.SysLogHandler bug described by Georg Brandl yesterday/today -- it took quite a while to track down the issue as I assumed it would be in either my code or possibly the framework code (Django). I didn't take into account that the issue might well be in Python's core logging library. I've applied the patch (r84222) to my environment and it works; I would vote this issue up as needing to be fixed in stable releases of Python ASAP. There's not much worse than sporadically failing logging (i.e. if you're logging both UTF and non-UTF log lines). |
|||
| msg115402 | Author: Remi Broemeling (Remi.Broemeling) | Date: 2010-09-02 20:19 | |
Attaching UTFFixedSysLogHandler, which is a sub-class of logging.handlers.SysLogHandler. The sub-class re-implements the emit() code to put the BOM in the right place (a re-implementation of r84218 and r84222). It can be used with existing Python codebases as a bug-fixed drop-in replacement for logging.handlers.SysLogHandler without having to edit the core Python module. Just use UTFFixedSysLogHandler instead of logging.handlers.SysLogHandler. Only necessary until such time as the bug fixes are pulled into stable versions of Python. |
|||
| msg115424 | Author: Vinay Sajip (vinay.sajip) * (Python committer) | Date: 2010-09-03 09:10 | |
Fix backported to release27-maint (r84445). Python 2.6 and earlier are in security-fix-only mode. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:53 | admin | set | github: 51326 |
| 2010-09-03 09:10:03 | vinay.sajip | set | messages: + msg115424 |
| 2010-09-02 20:19:40 | Remi.Broemeling | set | files: + utffixedsysloghandler.py messages: + msg115402 |
| 2010-09-02 14:59:07 | vstinner | set | nosy: + vstinner |
| 2010-09-02 14:54:45 | Remi.Broemeling | set | nosy: + Remi.Broemeling messages: + msg115382 |
| 2010-08-21 21:32:42 | georg.brandl | set | status: pending -> closed messages: + msg114566 |
| 2010-08-20 08:45:01 | vinay.sajip | set | status: open -> pending |
| 2010-08-20 08:44:09 | vinay.sajip | set | status: pending -> open messages: + msg114422 |
| 2010-08-19 22:20:55 | vinay.sajip | set | status: open -> pending messages: + msg114413 |
| 2010-08-04 22:54:14 | terry.reedy | set | stage: commit review versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.5 |
| 2010-07-26 17:16:28 | georg.brandl | set | status: closed -> open nosy: + georg.brandl messages: + msg111646 |
| 2009-11-05 17:53:11 | vinay.sajip | set | status: pending -> closed messages: + msg94937 |
| 2009-10-21 20:34:43 | vinay.sajip | set | status: open -> pending resolution: fixed messages: + msg94323 |
| 2009-10-16 16:28:47 | vinay.sajip | set | messages: + msg94141 |
| 2009-10-09 21:45:55 | vinay.sajip | set | assignee: vinay.sajip |
| 2009-10-08 09:03:32 | vinay.sajip | set | messages: + msg93737 |
| 2009-10-08 08:09:16 | rszefler | set | messages: + msg93734 |
| 2009-10-07 21:15:19 | vinay.sajip | set | messages: + msg93724 |
| 2009-10-07 19:42:57 | r.david.murray | set | nosy: + vinay.sajip |
| 2009-10-07 12:42:37 | rszefler | create | |