UnicodeDecodeError: codec can't decode byte

Python tried to decode bytes into a string using a specific encoding (usually UTF-8) but encountered a byte sequence that isn't valid in that encoding. The file is likely encoded in a different character set.

Common causes

  • Opening a file without specifying an encoding, so Python falls back to the platform default (UTF-8 on most Linux and macOS systems, often cp1252 on Windows), which can't read it
  • The file was saved in a legacy encoding like Latin-1 (ISO-8859-1) or Windows-1252
  • A binary file (image, PDF, executable) is being opened in text mode
  • A CSV or text file containing special characters (accented letters, symbols) that was exported by software that does not write UTF-8
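
To check the first cause, it can help to see which encoding open() uses when none is given. The snippet below is a minimal sketch using only the standard library; the variable name default_encoding is just for illustration.

```python
import locale
import sys

# Encoding that open() falls back to when no encoding argument is passed.
default_encoding = locale.getpreferredencoding(False)
print("open() default:", default_encoding)

# True when Python runs in UTF-8 mode (python -X utf8 or PYTHONUTF8=1),
# which makes UTF-8 the default regardless of the locale.
print("UTF-8 mode:", bool(sys.flags.utf8_mode))
```

If the printed default differs from the encoding the file was saved in, passing encoding explicitly is the safer choice.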

How to fix it

  1. Specify the correct encoding when opening the file: open('file.csv', encoding='latin-1')
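  2. Use errors='replace' or errors='ignore' to get past bad bytes: open('file', encoding='utf-8', errors='replace'). 'replace' substitutes U+FFFD for each undecodable byte and 'ignore' drops it, so both lose data
  3. Try to detect the encoding: pip install chardet, then read the file in binary mode and pass the bytes to chardet.detect(), e.g. chardet.detect(open('file.csv', 'rb').read()). The result is a guess with a confidence score, not a guarantee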
  4. Save the file as UTF-8 in your text editor or spreadsheet tool
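
The fixes above can be sketched roughly as follows. This is an illustrative example, not the only way to do it: the sample data, the temporary file, and the variable names (decoded, replaced, ignored, utf8_path) are assumptions made up for the demonstration, and it uses only the standard library.

```python
import os
import tempfile

# Hypothetical sample: a small CSV encoded as Latin-1, where é is the single byte 0xE9.
raw = "name,notes\ncafé,résumé\n".encode("latin-1")

with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(raw)
    path = tmp.name

# Fix 1: name the file's real encoding explicitly.
with open(path, encoding="latin-1") as f:
    decoded = f.read()

# Fix 2a: keep UTF-8 but substitute U+FFFD for each undecodable byte (lossy).
with open(path, encoding="utf-8", errors="replace") as f:
    replaced = f.read()

# Fix 2b: keep UTF-8 and silently drop undecodable bytes (also lossy).
with open(path, encoding="utf-8", errors="ignore") as f:
    ignored = f.read()

# Fix 4, done in code: write the correctly decoded text back out as UTF-8.
# newline="" keeps the line endings exactly as they are in the decoded text.
utf8_path = path + ".utf8.csv"
with open(utf8_path, "w", encoding="utf-8", newline="") as f:
    f.write(decoded)

print(repr(decoded))
print(repr(replaced))
print(repr(ignored))
```

Only the first option preserves the accented characters, so errors='replace' and errors='ignore' are best kept for cases where losing a few characters is acceptable.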

Example

Traceback (most recent call last):
  File "app.py", line 3, in <module>
    content = f.read()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 42: invalid continuation byte

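This happens when a CSV file saved in Latin-1 is opened without the encoding parameter. The byte 0xe9 is é in Latin-1, but in UTF-8 it starts a multi-byte sequence that the next byte does not continue.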
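
The same kind of failure can be reproduced without a file. The sketch below uses made-up sample text and hypothetical variable names (data, bad_byte, position, reason); it decodes Latin-1 bytes as UTF-8 and reads the details from the exception object.

```python
# Hypothetical sample: text encoded as Latin-1, so each é becomes the single byte 0xE9.
data = "café,résumé".encode("latin-1")

bad_byte = position = reason = None
try:
    data.decode("utf-8")
except UnicodeDecodeError as exc:
    # exc.object holds the input bytes, exc.start the offset of the first bad byte.
    bad_byte = exc.object[exc.start]
    position = exc.start
    reason = exc.reason
    print(f"byte 0x{bad_byte:02x} at position {position}: {reason}")

# Decoding with the encoding the bytes were actually written in succeeds.
text = data.decode("latin-1")
print(text)
```

The position reported in the error points at the first undecodable byte, which is a quick way to find the offending character in the original file.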
