Message138245
| Author | sdaoden |
| Recipients | r.david.murray, sdaoden, wally1980 |
| Date | 2011-06-13.13:56:23 |
| SpamBayes Score | 4.973578e-07 |
| Marked as misclassified | No |
| Message-id | <1307973385.02.0.3665715882.issue11728@psf.upfronthosting.co.za> |
| In-reply-to | |
| Content |
Hello Valery Masiutsin, I recently stumbled over this while searching
for the link to the standard I stored in another issue.
(Without being logged in, say.)
The de facto standard (http://qmail.org/man/man5/mbox.html) says:
HOW A MESSAGE IS READ
A reader scans through an mbox file looking for From_ lines.
Any From_ line marks the beginning of a message. The reader
should not attempt to take advantage of the fact that every
From_ line (past the beginning of the file) is preceded by a
blank line.
This is, however, the recent version. The "mbox" manpage of my up-to-date
Mac OS X 10.6.7, for example, does not state this; it is from 2002.
However, all known MBOX standards, i.e. MBOXO, MBOXRD, MBOXCL, require
proper quoting of non-From_ "From " lines (by preceding them with '>').
So your example should not fail in Python.
(But hey - are you sure *that* has been produced by Perl?)
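A minimal sketch of the reading rule quoted above, assuming the mailbox is already held in memory as text. The name split_mbox and the sample data are hypothetical and only for illustration; this is not what Python's mailbox module does internally. Every line starting with "From " begins a new message, whether or not a blank line precedes it, and a properly quoted ">From " body line does not.

```python
def split_mbox(text):
    """Split mbox text into messages at every line that starts with 'From '.

    Hypothetical helper: it does not require a blank line before a From_ line.
    """
    messages = []
    current = None  # lines of the message being collected, if any
    for line in text.splitlines(keepends=True):
        if line.startswith("From "):
            # A From_ line always starts a new message.
            if current is not None:
                messages.append("".join(current))
            current = [line]
        elif current is not None:
            current.append(line)
        # Lines before the first From_ line are ignored.
    if current is not None:
        messages.append("".join(current))
    return messages


# Made-up sample: the second From_ line is NOT preceded by a blank line,
# and the first body contains a quoted ">From " line.
sample = (
    "From alice@example.org Mon Jun 13 13:56:23 2011\n"
    "Subject: one\n"
    "\n"
    ">From here on the body line is quoted.\n"
    "From bob@example.org Mon Jun 13 13:57:00 2011\n"
    "Subject: two\n"
    "\n"
    "body\n"
)

messages = split_mbox(sample)
```

With this sample, the quoted body line stays inside the first message, and the unseparated From_ line still opens the second one.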
You're right, however, that Python seems to support only the old MBOXO
way of escaping and unescaping plain "From " to/from ">From ", which is not
even mentioned anymore in the current standard; that only describes
MBOXRD ("(>*From )" -> ">"+match.group(1)).
(Lucky me: I own Mac OS X, otherwise I wouldn't even know.)
Thus you're in trouble if the unescaping is performed before the split.
This is another issue, though: "MBOX parser uses MBOXO algorithm".
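To make the difference between the two quoting schemes concrete, here is a rough sketch. All four function names are hypothetical and are not part of the mailbox module; the MBOXRD pattern is the one given above, and the unquoting counterparts are my assumption of the obvious inverse.

```python
import re

# MBOXRD: any line of the form ">*From " gets one more '>' when written,
# and loses exactly one '>' when read back.
_MBOXRD_QUOTE = re.compile(r"^(>*From )", re.MULTILINE)
_MBOXRD_UNQUOTE = re.compile(r"^>(>*From )", re.MULTILINE)


def mboxo_quote(body):
    """MBOXO: only a plain 'From ' at line start becomes '>From '."""
    return re.sub(r"^From ", ">From ", body, flags=re.MULTILINE)


def mboxo_unquote(body):
    """MBOXO: only '>From ' at line start becomes 'From ' again."""
    return re.sub(r"^>From ", "From ", body, flags=re.MULTILINE)


def mboxrd_quote(body):
    """MBOXRD: '(>*From )' -> '>' + match.group(1)."""
    return _MBOXRD_QUOTE.sub(lambda m: ">" + m.group(1), body)


def mboxrd_unquote(body):
    """MBOXRD: strip exactly one leading '>' from '>+From ' lines."""
    return _MBOXRD_UNQUOTE.sub(lambda m: m.group(1), body)


# Made-up body with an unquoted and two already-quoted "From " lines.
body = "From a\n>From b\n>>From c\ntext\n"

rd_roundtrip = mboxrd_unquote(mboxrd_quote(body))
o_roundtrip = mboxo_unquote(mboxo_quote(body))
```

In this sketch the MBOXRD round trip gives back the original body, while the MBOXO round trip turns the pre-existing ">From b" line into "From b", which is why the older scheme is lossy.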
;> - Ciao, Steffen |
|
History
| Date | User | Action | Args |
|---|---|---|---|
| 2011-06-13 13:56:25 | sdaoden | set | recipients: + sdaoden, r.david.murray, wally1980 |
| 2011-06-13 13:56:25 | sdaoden | set | messageid: <1307973385.02.0.3665715882.issue11728@psf.upfronthosting.co.za> |
| 2011-06-13 13:56:24 | sdaoden | link | issue11728 messages |
| 2011-06-13 13:56:23 | sdaoden | create | |