Recode bytes which cannot be decoded in UTF-8 in Python


I am reading in txt files, and there is one byte that is causing me decoding issues:

    with open(input_filename_and_director, 'rb') as f:
        r = unicodecsv.reader(f, delimiter="|")

This results in the error message:

    UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 26: invalid continuation byte

Is there any way to specify how I want these bytes handled (i.e. read the byte in as a character)?

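Depending upon what you want, try using unicodecsv.reader(f, delimiter="|", errors='replace') or unicodecsv.reader(f, delimiter="|", errors='ignore'). unicodecsv passes the errors parameter through to the unicode decoding step, so 'replace' substitutes a replacement character for each undecodable byte, while 'ignore' silently drops it. See the Python documentation for unicode for more information.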
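
To see what the two error handlers actually do to a bad byte, here is a minimal sketch using only the Python 3 standard library (csv and io) rather than unicodecsv. The sample bytes and the read_rows helper are hypothetical, made up for illustration; the stray 0xc3 is followed by the "|" delimiter, which makes it an invalid UTF-8 sequence like the one in the question.

```python
import csv
import io

# Hypothetical sample data: 0xC3 starts a two-byte UTF-8 sequence, but the
# next byte is "|" rather than a continuation byte, so strict decoding fails.
raw = b"id|name\n1|caf\xc3|x\n"


def read_rows(data, errors):
    """Decode pipe-delimited bytes as UTF-8 using the given error handler."""
    text = io.TextIOWrapper(
        io.BytesIO(data), encoding="utf-8", errors=errors, newline=""
    )
    return list(csv.reader(text, delimiter="|"))


# 'replace' keeps a visible marker (U+FFFD) where the bad byte was.
replaced = read_rows(raw, "replace")

# 'ignore' drops the bad byte without leaving any trace.
ignored = read_rows(raw, "ignore")

# 'strict' (the default) raises, as in the question.
try:
    read_rows(raw, "strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

'replace' is usually the safer choice when you want to find the damaged fields afterwards, since you can search the output for U+FFFD; 'ignore' loses that information.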

