Chardet: encoding autodetection for Python

Last week I was fighting the hordes of character encodings. My new task from the job-tasks pool is to develop a friendly web app to manage the configuration files of cluster apps. Oh, great idea! Why didn't I think of it before? (irony) I only need four things: one parser for each config file syntax, a stable and secure way to get and save remote files, a way to cope with files in assorted, uncontrolled character encodings, ... and a fancy GUI. As you can guess, the task is not exactly a small app, but this post is only about a very useful Python lib, chardet, which I discovered a few days ago. It can auto-detect the encoding of a file quite reliably. I suggest you visit the chardet home site to see some clear examples.

Using it in a couple of lines of code:

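import io
import chardet

# Read the raw bytes: chardet.detect() works on bytes, not on decoded text.
with io.open('channels_info', 'rb') as f:
    raw = f.read()

# For example:
# fileencoding = "iso-8859-15"
fe = chardet.detect(raw)['encoding']
fl = raw.decode(fe, errors='replace')
print(fl)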
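
One thing the snippet above does not handle: chardet.detect() returns a dict that also carries a 'confidence' value, and 'encoding' can be None when it cannot make a guess. Here is a minimal sketch of a more defensive version. The helper names (decode_with_guess, read_text), the 0.5 threshold and the UTF-8 fallback are my own assumptions for illustration, not part of chardet.

```python
def decode_with_guess(raw, guess, min_confidence=0.5, fallback="utf-8"):
    """Decode raw bytes using a chardet-style result dict.

    `guess` has the shape returned by chardet.detect(): a dict with
    'encoding' (may be None) and 'confidence' (0.0 to 1.0).
    `min_confidence` and `fallback` are illustrative defaults.
    Returns the decoded text and the encoding that was actually used.
    """
    encoding = guess.get("encoding")
    confidence = guess.get("confidence") or 0.0
    # Fall back when chardet has no guess or is not sure enough.
    if encoding is None or confidence < min_confidence:
        encoding = fallback
    # errors="replace" keeps going on undecodable bytes instead of raising.
    return raw.decode(encoding, errors="replace"), encoding


def read_text(path):
    """Read a file as bytes, let chardet guess the encoding, then decode."""
    import chardet  # third-party; imported here so the helper above has no dependency

    with open(path, "rb") as f:
        raw = f.read()
    return decode_with_guess(raw, chardet.detect(raw))
```

With this, read_text('channels_info') would return both the text and the encoding used, which is handy when the file has to be saved back in its original encoding.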

