"Achim Domma" <do***@procoders.net> writes:
I need to convert Big5 or GB encoded Chinese strings to Unicode. It would
also be nice to be able to detect the encoding of the original string.
Searching with groups.google.com I found some links to different projects, but
none of them looks very active. Can somebody give me a short overview of the
status of processing Chinese texts with Python?
The very short summary: Use the CJK codecs package; it supports all
encodings you might encounter, and it is actively maintained.
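
A minimal sketch of the conversion, assuming the codecs are registered
under the names "big5" and "gb2312" (the CJK codecs package registers them,
and Python 2.4 and later ship them in the standard library). The helper
name decode_chinese is only illustrative, and the sketch is written for
Python 3:

```python
def decode_chinese(raw, encoding):
    """Decode a byte string whose encoding has been declared by the sender.

    Raises UnicodeDecodeError if the bytes are not valid in that encoding.
    """
    return raw.decode(encoding)


# Two Chinese characters, written as escapes so the source file stays ASCII.
sample = "\u4e2d\u6587"

# The same text produces different byte sequences in the two encodings.
big5_bytes = sample.encode("big5")
gb_bytes = sample.encode("gb2312")

# Decoding with the declared encoding recovers the original text.
print(decode_chinese(big5_bytes, "big5") == sample)
print(decode_chinese(gb_bytes, "gb2312") == sample)
```
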
As for detecting the encoding of the original string: Forget it. Tell
your communication partners to always properly declare the encoding.
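
To illustrate why guessing is unreliable, here is a rough sketch that
simply tries a few candidate codecs on the same bytes. The function name
plausible_encodings and the candidate list are assumptions made for this
example. A successful decode only shows that the bytes are legal in that
encoding, not that the resulting text is what the sender meant:

```python
def plausible_encodings(raw, candidates=("big5", "gb2312", "gbk")):
    """Return the candidate encodings under which raw decodes without error."""
    ok = []
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok


text = "\u4e2d\u6587"
raw = text.encode("big5")

# Any encoding listed here accepts the bytes, so the byte values alone
# do not say which one the sender actually used.
for name in plausible_encodings(raw):
    print(name, raw.decode(name) == text)
```
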
Regards,
Martin