Created on 2010-11-01 19:58 by jelie, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| nntpover.patch | pitrou, 2010-11-02 22:52 | |
| nntpover2.patch | pitrou, 2010-11-02 23:19 | |
| Messages (12) | |||
|---|---|---|---|
| msg120158 | Author: Julien ÉLIE (jelie) | Date: 2010-11-01 19:58 | |
Following the first example of the documentation:
import nntplib
s = nntplib.NNTP('news.trigofacile.com')
resp, count, first, last, name = s.group('fr.comp.lang.python')
print('Group', name, 'has', count, 'articles, range', first, 'to', last)
resp, overviews = s.over((last - 9, last))
for id, over in overviews:
    print(id, nntplib.decode_header(over['subject']))
s.quit()
An exception is raised:
"OVER/XOVER response doesn't include names of additional headers"
I believe the issue comes from the fact that the source code does not
handle the case described in Section 8.3.2 of RFC 3977:
For all fields, the value is processed by first removing all CRLF
pairs (that is, undoing any folding and removing the terminating
CRLF) and then replacing each TAB with a single space. If there is
no such header in the article, no such metadata item, or no header or
item stored in the database for that article, the corresponding field
MUST be empty.
Example of a successful retrieval of overview information for a range
of articles:
[C] GROUP misc.test
[S] 211 1234 3000234 3002322 misc.test
[C] OVER 3000234-3000240
[S] 224 Overview information follows
[S] 3000234|I am just a test article|"Demo User"
    <nobody@example.com>|6 Oct 1998 04:38:40 -0500|
    <45223423@example.com>|<45454@example.net>|1234|
    17|Xref: news.example.com misc.test:3000363
[S] 3000235|Another test article|nobody@nowhere.to
    (Demo User)|6 Oct 1998 04:38:45 -0500|<45223425@to.to>||
    4818|37||Distribution: fi
[S] 3000238|Re: I am just a test article|somebody@elsewhere.to|
    7 Oct 1998 11:38:40 +1200|<kfwer3v@elsewhere.to>|
    <45223423@to.to>|9234|51
[S] .
Note the missing "References" and Xref headers in the second line,
the missing trailing fields in the first and last lines, and that
there are only results for those articles that still exist.
Please also note that nntplib should work even when the database is not consistent. Some news servers might be broken and not follow the MUST NOT in the following passage:
The LIST OVERVIEW.FMT command SHOULD list all the fields for which
the database is consistent at that moment. It MAY omit such fields
(for example, if it is not known whether the database is consistent
or inconsistent). It MUST NOT include fields for which the database
is inconsistent or that are not stored in the database. Therefore,
if a header appears in the LIST OVERVIEW.FMT output but not in the
OVER output for a given article, that header does not appear in the
article (similarly for metadata items).
|
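The RFC passage quoted above suggests a tolerant way of reading an OVER response line: split on TAB, treat missing trailing fields as empty, and report empty fields as absent. The following is a minimal sketch of that idea, not the nntplib code. The helper name `parse_overview_line` and the `FIELDS` list (a plausible LIST OVERVIEW.FMT result for the RFC example) are assumptions made for illustration.

```python
# Hypothetical field list, as a server's LIST OVERVIEW.FMT might report it.
FIELDS = ["subject", "from", "date", "message-id", "references",
          ":bytes", ":lines", "xref", "distribution"]


def parse_overview_line(line, fields=FIELDS):
    """Split one OVER response line into (article_number, dict).

    Empty or missing fields map to None. Tokens are kept raw, so an
    additional header still carries its "Name: " prefix.
    """
    tokens = line.rstrip("\r\n").split("\t")
    art_num = int(tokens[0])
    values = tokens[1:]
    # Servers may omit trailing empty fields; pad so every name gets a value.
    values += [""] * (len(fields) - len(values))
    return art_num, {name: (tok or None) for name, tok in zip(fields, values)}


# Third line of the RFC example (TAB is shown as "|" in the RFC text):
# it carries neither Xref nor Distribution.
line = "\t".join([
    "3000238", "Re: I am just a test article", "somebody@elsewhere.to",
    "7 Oct 1998 11:38:40 +1200", "<kfwer3v@elsewhere.to>",
    "<45223423@to.to>", "9234", "51",
])
num, over = parse_overview_line(line)
print(num, over["subject"], over["xref"])
```

With this approach a caller can test `over["xref"] is None` instead of catching an exception when a server leaves trailing fields out.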
|||
| msg120272 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010-11-02 22:41 | |
I am wondering how to return the corresponding information. Should the field be totally absent from the returned dictionary, should it map to the empty string, or should it map to None? I'm leaning towards the latter (map to None), but perhaps the empty string is better? |
|||
| msg120275 | Author: Julien ÉLIE (jelie) | Date: 2010-11-02 22:45 | |
The empty string would mean the header exists, and is empty (though not RFC-compliant). For instance: "User-Agent: \r\n"
I believe None is better. |
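To make the distinction concrete, here is a small sketch of how calling code could tell the three cases apart if absent fields map to None. The function `header_state` and the sample dictionary are hypothetical and only illustrate the convention being discussed.

```python
def header_state(overview, name):
    """Classify a field under the 'absent maps to None' convention."""
    value = overview[name]  # the key is always present in this convention
    if value is None:
        return "absent"
    if value == "":
        return "present but empty"
    return "present"


# Hypothetical overview entry: an empty User-Agent header and no Xref header.
overview = {"subject": "Hello", "user-agent": "", "xref": None}
for name in overview:
    print(name, "->", header_state(overview, name))
```

If absent fields mapped to the empty string instead, the second and third entries would be indistinguishable.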
|||
| msg120278 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010-11-02 22:52 | |
Here is a patch for returning None on absent fields. (works with trigofacile.com) |
|||
| msg120282 | Author: Julien ÉLIE (jelie) | Date: 2010-11-02 23:05 | |
OK, thanks.
By the way, why is the token stripped?
token = token[len(h):].lstrip(" ")
"X-Header: test \r\n" in a header is kept in the overview as-is.
I do not see why " test " should not be the value returned.
Also, with:
token = token or None
"X-Header: \r\n" becomes None if I understand how the source code works... Yet, it is a real '', not None.
|
|||
| msg120283 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010-11-02 23:08 | |
> OK, thanks.
> By the way, why is the token stripped?
> token = token[len(h):].lstrip(" ")
>
> "X-Header: test \r\n" in a header is kept in the overview as-is.
> I do not see why " test " should not be the value returned.
It's a simple way of handling "Xref: foo" and returning "foo" rather
than " foo". If spaces are supposed to be significant I can just strip
the first one, though.
> Also, with:
> token = token or None
>
> "X-Header: \r\n" becomes None if I understand how the source code
> works... Yet, it is a real '', not None.
Er, so you're disagreeing with your previous message? Or am I missing
something? :)
|
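The two lines under discussion can be tried in isolation. This sketch only reproduces the quoted statements on sample tokens; the token strings are assumptions (the first one has two spaces after the colon, so that the header value is " test ").

```python
h = "X-Header:"

# Assumed overview token: two spaces after the colon, so the value is " test ".
token = "X-Header:  test "
print(repr(token[len(h):].lstrip(" ")))  # prints 'test ': the leading space of the value is lost

# Assumed token for a header that exists but has an empty value.
empty = "X-Header: "
value = empty[len(h):].lstrip(" ")
print(repr(value or None))  # prints None: '' is falsy, so it collapses to None
```

So `lstrip(" ")` removes every leading space, and `token or None` makes an existing but empty header look the same as a missing one.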
|||
| msg120286 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010-11-02 23:18 | |
Here is a patch trying to better handle whitespace. Would it be ok for you? |
|||
| msg120287 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010-11-02 23:19 | |
Oops, sorry. |
|||
| msg120299 | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2010-11-03 02:00 | |
My conclusion in working on the email package is that only the first space after the ':', if it exists, should be stripped. That is, even though the RFC (for email) reads as if the space after the colon is part of the value, in practice it is part of the delimiter, but is optional (and almost always present, in email). Whether additional leading spaces are significant depends on why they are there. Since they are an unusual case, I would choose to preserve them on the theory that someone might care, and that someone who doesn't care can strip them. |
|||
| msg120332 | Author: Julien ÉLIE (jelie) | Date: 2010-11-03 17:55 | |
> Er, so you're disagreeing with your previous message?
> Or am I missing something? :)
I was saying that if an empty string is returned, then it means that the header exists and is empty. An example was "User-Agent: \r\n".
And my remark "I believe None is better." concerned your initial question "Should the field be totally absent [...]" regarding how to deal with a header that does not exist.
Therefore, "User-Agent: \r\n" becomes a real '', not None. None applies only when the User-Agent: header field is absent from the headers.
> Here is a patch trying to better handle whitespace.
> Would it be ok for you?
Yes Antoine, thanks! |
|||
| msg120333 | Author: Julien ÉLIE (jelie) | Date: 2010-11-03 18:01 | |
> My conclusion in working on the email package is that only
> the first space after the ':', if it exists, should be stripped.
> That is, even though the RFC (for email) reads as if the space
> after the colon is part of the value, in practice it is part
> of the delimiter, but is optional (and almost always present,
> in email).
That is why the RFC (for netnews) explicitly mentions that the space after the colon is not part of the value. See the grammar for OVER in RFC 3977:
hdr-n-content = [(header-name ":" / metadata-name) SP hdr-content]
So yes, only the first space should be stripped. |
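Read literally, the grammar says an additional field is either completely empty or a name followed by exactly one SP and then the content. A minimal sketch of that reading follows; it is not the committed patch, and the helper name `parse_extra_field` as well as its strictness (raising on a missing SP) are assumptions.

```python
def parse_extra_field(token, name):
    """Extract hdr-content from an additional overview field.

    `name` is the header name with its colon (e.g. "Xref:") or a
    metadata name (e.g. ":bytes"). Returns None when the field is
    empty, which means the header is absent from the article.
    """
    if not token:
        return None
    prefix = name + " "  # exactly one SP belongs to the delimiter
    # Header names are matched case-insensitively.
    if token[:len(prefix)].lower() != prefix.lower():
        raise ValueError("field does not start with %r" % prefix)
    return token[len(prefix):]  # any further whitespace is part of the value


print(repr(parse_extra_field("Xref: news.example.com misc.test:3000363", "Xref:")))
print(repr(parse_extra_field("X-Header:  test ", "X-Header:")))
print(repr(parse_extra_field("User-Agent: ", "User-Agent:")))
print(repr(parse_extra_field("", "User-Agent:")))
```

A more lenient variant could accept a token with no space after the name; whether that is desirable depends on how tolerant the client should be of broken servers.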
|||
| msg120335 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010-11-03 18:19 | |
Ok, committed in r86139. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:08 | admin | set | github: 54490 |
| 2010-11-03 18:19:27 | pitrou | set | status: open -> closed resolution: fixed messages: + msg120335 stage: resolved |
| 2010-11-03 18:01:08 | jelie | set | messages: + msg120333 |
| 2010-11-03 17:55:54 | jelie | set | messages: + msg120332 |
| 2010-11-03 02:30:10 | rhettinger | set | messages: - msg120300 |
| 2010-11-03 02:13:20 | rhettinger | set | nosy: + rhettinger messages: + msg120300 |
| 2010-11-03 02:00:49 | r.david.murray | set | nosy: + r.david.murray messages: + msg120299 |
| 2010-11-02 23:19:31 | pitrou | set | files: + nntpover2.patch messages: + msg120287 |
| 2010-11-02 23:19:20 | pitrou | set | files: - nntpover2.patch |
| 2010-11-02 23:18:46 | pitrou | set | files: + nntpover2.patch messages: + msg120286 |
| 2010-11-02 23:08:37 | pitrou | set | messages: + msg120283 |
| 2010-11-02 23:05:08 | jelie | set | messages: + msg120282 |
| 2010-11-02 22:52:33 | pitrou | set | files: + nntpover.patch keywords: + patch messages: + msg120278 |
| 2010-11-02 22:45:10 | jelie | set | messages: + msg120275 |
| 2010-11-02 22:41:23 | pitrou | set | messages: + msg120272 |
| 2010-11-01 20:02:43 | pitrou | set | nosy: + pitrou |
| 2010-11-01 19:58:22 | jelie | create | |