getting fileinput to do errors='ignore' or 'replace'?

MRAB python at mrabarnett.plus.com
Thu Dec 3 11:12:55 EST 2015


On 2015-12-03 15:12, Adam Funk wrote:
> I'm having trouble with some input files that are almost all proper
> UTF-8 but with a couple of troublesome characters mixed in, which I'd
> like to ignore instead of throwing ValueError. I've found the
> openhook for the encoding
>
> for line in fileinput.input(options.files, openhook=fileinput.hook_encoded("utf-8")):
>     do_stuff(line)
>
> which the documentation describes as "a hook which opens each file
> with codecs.open(), using the given encoding to read the file", but
> I'd like codecs.open() to also have the errors='ignore' or
> errors='replace' effect. Is it possible to do this?
>
It looks like it's not possible with the standard "hook_encoded", but
you could write your own alternative:
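
import codecs
import fileinput

def my_hook_encoded(encoding, errors):
    # Return an openhook that passes both encoding and errors to codecs.open().
    def opener(path, mode):
        return codecs.open(path, mode, encoding=encoding, errors=errors)
    return opener

# my_hook_encoded is defined above, not in the fileinput module.
for line in fileinput.input(options.files,
                            openhook=my_hook_encoded("utf-8", "ignore")):
    do_stuff(line)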
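
To see what the two error handlers do in practice, here is a small self-contained sketch. The temporary file and its contents are made up for illustration, and read_lines is just a hypothetical helper name; only my_hook_encoded comes from the suggestion above.

```python
import codecs
import fileinput
import os
import tempfile


def my_hook_encoded(encoding, errors):
    """Return an openhook that forwards encoding and errors to codecs.open()."""
    def opener(path, mode):
        return codecs.open(path, mode, encoding=encoding, errors=errors)
    return opener


def read_lines(path, errors):
    """Hypothetical helper: read every line of path with the given error handler."""
    with fileinput.input([path], openhook=my_hook_encoded("utf-8", errors)) as f:
        return list(f)


# Made-up sample data: valid UTF-8 with one stray 0xFF byte mixed in.
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"good line\nbad \xff byte\n")

ignored = read_lines(sample_path, "ignore")    # the bad byte is dropped
replaced = read_lines(sample_path, "replace")  # the bad byte becomes U+FFFD
os.remove(sample_path)

print(ignored)
print(replaced)
```

With "ignore" the undecodable byte silently disappears, while "replace" leaves a U+FFFD marker so you can still tell where the damage was. As far as I know, from Python 3.6 onwards fileinput.hook_encoded() itself accepts an optional errors argument, so on newer versions the custom hook may not be needed.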

