UnicodeDecodeError: codec can't decode byte

Python tried to decode bytes into a string using a specific encoding (usually UTF-8) but encountered a byte sequence that isn't valid in that encoding. The file is likely encoded in a different character set.

Common causes

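Opening a file without specifying an encoding, so Python falls back to the platform default (UTF-8 on most Linux and macOS systems, often Windows-1252 on Windows), which can't decode the file's bytes
The file was saved in a legacy encoding such as Latin-1 (ISO-8859-1) or Windows-1252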
A binary file (image, PDF, executable) is being opened in text mode instead of binary mode ('rb')
A CSV or text file containing special characters (accented letters, symbols) was exported from software that does not write UTF-8

How to fix it

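Specify the correct encoding when opening the file: open('file.csv', encoding='latin-1'). Note that latin-1 maps every byte to a character, so it never raises this error, but the text will be garbled if the file is really in another encoding
Use errors='replace' to substitute each undecodable byte with the replacement character (U+FFFD), or errors='ignore' to drop it: open('file', encoding='utf-8', errors='replace'). Both options lose data, so treat them as a last resort
Try to detect the encoding: pip install chardet, then open the file in binary mode and pass the raw bytes to chardet.detect(), for example chardet.detect(open('file.csv', 'rb').read()). The result is a guess with a confidence score, not a guarantee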
Save the file as UTF-8 in your text editor or spreadsheet tool
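
The fixes above can be combined into a small helper that tries several encodings in turn. This is only a minimal sketch: the function name read_text_with_fallback and the candidate order (UTF-8, then Windows-1252, then Latin-1) are assumptions made for illustration, not a standard API, and the demo file is created just so the snippet runs on its own.

```python
import os
import tempfile


def read_text_with_fallback(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding_used), trying each candidate encoding in order.

    Hypothetical helper for illustration; the candidate list is an assumption
    and should reflect where your files actually come from.
    """
    with open(path, "rb") as f:  # read raw bytes so decoding is under our control
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding; try the next one
    # Only reached if every candidate failed (latin-1 itself never fails).
    return raw.decode("utf-8", errors="replace"), "utf-8+replace"


# Demo: write a small Windows-1252 file, then read it back.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write("café;naïve\n".encode("cp1252"))
    demo_path = tmp.name

text, used = read_text_with_fallback(demo_path)
os.remove(demo_path)
print(used, repr(text))
```

A successful decode does not prove the encoding is right. Because latin-1 accepts any byte sequence, it should stay last in the list, and the result is worth checking by eye for garbled characters.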

Example

Traceback (most recent call last):
  File "app.py", line 3, in <module>
    content = f.read()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 42: invalid continuation byte

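This traceback comes from opening a CSV file saved in Latin-1 without passing the encoding parameter. Byte 0xe9 is "é" in Latin-1, but in UTF-8 it starts a multi-byte sequence, so decoding fails when the next byte is not a valid continuation byte.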
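
The same failure can be reproduced in memory, without a file, which can help when checking which byte is responsible. The snippet below is a sketch with made-up data: 42 ASCII filler bytes followed by a Latin-1 "é", chosen only so the offset matches the traceback above.

```python
# 42 ASCII bytes, then "é" encoded as Latin-1 (the single byte 0xe9), then more text.
data = ("x" * 42).encode("ascii") + "é,1\n".encode("latin-1")

try:
    data.decode("utf-8")
except UnicodeDecodeError as exc:
    message = str(exc)       # same wording as the last line of the traceback
    bad_offset = exc.start   # index of the first byte that could not be decoded
    print(message)

# Decoding with the encoding the data was actually written in succeeds.
fixed = data.decode("latin-1")
print(repr(fixed[-4:]))
```

The exception's start attribute gives the byte offset reported as "position" in the message, which is useful for inspecting the surrounding bytes in a large file.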
