Recode bytes which cannot be decoded in UTF-8 in Python

- June 15, 2015

I am reading in text files, and there is one byte that is causing me issues when decoding:

    import unicodecsv

    with open(input_filename_and_director, 'rb') as f:
        r = unicodecsv.reader(f, delimiter="|")

This results in the error message:

    UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 26: invalid continuation byte

Is there any way to specify how I want these bytes handled (i.e. read the byte in as a character)?

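Depending on what you want, try using unicodecsv.reader(f, delimiter="|", errors='replace') or unicodecsv.reader(f, delimiter="|", errors='ignore'). unicodecsv passes the errors parameter through to the unicode decoding step, so 'replace' substitutes a replacement character for each undecodable byte, while 'ignore' drops it. See the documentation for unicode for more information.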
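To see what the two error handlers do, here is a minimal sketch that uses only the built-in bytes.decode rather than unicodecsv itself. The sample bytes and the helper name decode_with are made up for illustration; the sample imitates the situation in the question, a lone 0xc3 lead byte followed by the "|" delimiter instead of a continuation byte.

```python
# Hypothetical sample row: 0xc3 starts a two-byte UTF-8 sequence,
# but the next byte is "|" (0x7c), which is not a valid continuation byte.
raw = b"caf\xc3|bar"


def decode_with(errors):
    """Decode the sample bytes as UTF-8 with the given error handler."""
    return raw.decode("utf-8", errors)


# 'replace' swaps the undecodable byte for U+FFFD (the replacement character).
replaced = decode_with("replace")

# 'ignore' silently drops the undecodable byte.
ignored = decode_with("ignore")

# 'strict' is the default and raises, as in the question.
try:
    decode_with("strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

Both handlers lose the original byte, so 'replace' is usually the safer choice when you want to spot the damaged fields afterwards.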
