
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author sdaoden
Recipients r.david.murray, sdaoden, wally1980
Date 2011-06-13.13:56:23
Message-id <1307973385.02.0.3665715882.issue11728@psf.upfronthosting.co.za>
In-reply-to
Content
Hello Valery Masiutsin, I recently stumbled over this while searching
for the link to the standard I've stored in another issue.
(Without being logged in, say.)
The de-facto standard (http://qmail.org/man/man5/mbox.html) says:
HOW A MESSAGE IS READ
 A reader scans through an mbox file looking for From_ lines.
 Any From_ line marks the beginning of a message. The reader
 should not attempt to take advantage of the fact that every
 From_ line (past the beginning of the file) is preceded by a
 blank line.
This is, however, the recent version. The "mbox" manpage of my up-to-date
Mac OS X 10.6.7, for example, does not state this. It's from 2002.
However, all known MBOX standards, i.e. MBOXO, MBOXRD, MBOXCL, require
proper quoting of non-From_ "From " lines (by preceding them with '>').
So your example should not fail in Python.
(But hey - are you sure *that* has been produced by Perl?)
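To make the reading rule quoted above concrete, here is a minimal sketch of a reader that treats every line starting with "From " as the beginning of a message, whether or not a blank line precedes it. The function name split_mbox and the sample data are made up for illustration; this is not how Python's mailbox module is implemented.

```python
def split_mbox(text):
    """Split mbox text into messages.

    Any line starting with "From " begins a new message; the reader
    does not rely on a preceding blank line.
    """
    messages = []
    current = None
    for line in text.splitlines(keepends=True):
        if line.startswith("From "):
            # A From_ line closes the previous message (if any).
            if current is not None:
                messages.append("".join(current))
            current = [line]
        elif current is not None:
            current.append(line)
        # Lines before the first From_ line are ignored.
    if current is not None:
        messages.append("".join(current))
    return messages


# Hypothetical sample: the second From_ line is NOT preceded by a blank
# line, and the body line of the first message is properly quoted.
sample = (
    "From alice@example.org Mon Jun 13 13:56:23 2011\n"
    "Subject: one\n"
    "\n"
    ">From the start, this body line is quoted.\n"
    "From bob@example.org Mon Jun 13 13:57:00 2011\n"
    "Subject: two\n"
    "\n"
    "body\n"
)

parts = split_mbox(sample)
```

With properly quoted body lines, such a reader finds exactly two messages in the sample; an unquoted "From " line in a body would be mistaken for a message boundary, which is why all the variants require quoting.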
You're right, however, that Python seems to support only the old MBOXO
way of un-escaping, which maps only a plain "From " to/from ">From ". That
is not even mentioned anymore in the current standard, which describes only
MBOXRD ("(>*From )" -> ">"+match.group(1)).
(Lucky me: I own Mac OS X, otherwise I wouldn't even know.)
Thus you're in trouble if the unescaping is performed before the split.
This is another issue, though: "MBOX parser uses MBOXO algorithm".
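The difference between the two quoting schemes can be sketched as follows. The helper names are hypothetical and operate on single body lines; they only illustrate the rules as described above, not the code of any particular library.

```python
import re

# MBOXO: only a plain "From " at the start of a line is quoted.
def quote_mboxo(line):
    return ">" + line if line.startswith("From ") else line

def unquote_mboxo(line):
    return line[1:] if line.startswith(">From ") else line

# MBOXRD: any line matching ^>*From  gets one more ">" prepended,
# and unquoting strips exactly one ">" from lines matching ^>+From .
_RD_QUOTE = re.compile(r">*From ")
_RD_UNQUOTE = re.compile(r">+From ")

def quote_mboxrd(line):
    return ">" + line if _RD_QUOTE.match(line) else line

def unquote_mboxrd(line):
    return line[1:] if _RD_UNQUOTE.match(line) else line


# A body line that already begins with ">From " in the original message.
original = ">From here on, things get ambiguous.\n"

# MBOXO leaves it untouched when writing, then strips ">" when reading.
mboxo_roundtrip = unquote_mboxo(quote_mboxo(original))

# MBOXRD adds one ">" when writing and removes one when reading.
mboxrd_roundtrip = unquote_mboxrd(quote_mboxrd(original))
```

In this sketch the MBOXO round trip turns the line into "From here on, ...", so the original text is lost, while the MBOXRD round trip returns the line unchanged.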
;> - Ciao, Steffen
History
Date User Action Args
2011-06-13 13:56:25 sdaoden set recipients: + sdaoden, r.david.murray, wally1980
2011-06-13 13:56:25 sdaoden set messageid: <1307973385.02.0.3665715882.issue11728@psf.upfronthosting.co.za>
2011-06-13 13:56:24 sdaoden link issue11728 messages
2011-06-13 13:56:23 sdaoden create
