Dirk Hagemann wrote:
When I receive data from Microsoft Active Directory it is an
"ad_object" and has the type unicode. When I try to convert it to a
string I get this error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in
position 26: ordinal not in range(128)
This is caused by characters like the German ä, ö or ü.
But I (think I) need this as a string. Is there a simple solution???
A Unicode string is also a string.
If you want an 8-bit string, you need to decide what encoding you want to
use. Common encodings are us-ascii (which is the default if you convert from
unicode to 8-bit strings in Python), ISO-8859-1 (aka Latin-1), and UTF-8.
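To make the difference between these encodings concrete, here is a minimal sketch. The sample text in u is made up for illustration. The u"..." literals also run on Python 3.3 and later, where the results are bytes objects rather than 8-bit str.

```python
u = u"f\xfcr"  # hypothetical sample text: "f", u-umlaut, "r"

latin1 = u.encode("iso-8859-1")  # Latin-1: every character here fits in one byte
utf8 = u.encode("utf-8")         # UTF-8: the u-umlaut takes two bytes

# us-ascii has no u-umlaut at all, so a strict conversion raises an error
try:
    u.encode("us-ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

The same three characters give three bytes in Latin-1 and four in UTF-8, which is why the choice has to be made explicitly.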
For example, if you want Latin-1 strings, you can use one of the following:
s = u.encode("iso-8859-1") # fail if some character cannot be converted
s = u.encode("iso-8859-1", "replace") # instead of failing, replace with ?
s = u.encode("iso-8859-1", "ignore") # instead of failing, leave it out
If you want an ascii string, replace "iso-8859-1" above with "ascii".
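A small sketch of how the three error-handling modes behave. The sample text is hypothetical; it ends with a euro sign, which Latin-1 cannot represent, so the difference between the modes becomes visible. On Python 3 the results are bytes objects.

```python
u = u"M\xfcller \u20ac"  # hypothetical sample; the euro sign is not in Latin-1

# default (strict) mode: raises because of the euro sign
try:
    u.encode("iso-8859-1")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

replaced = u.encode("iso-8859-1", "replace")  # euro sign becomes "?"
ignored = u.encode("iso-8859-1", "ignore")    # euro sign is dropped
```

Note that both "replace" and "ignore" lose information, so they are best kept for display purposes rather than for data you need to read back.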
If you want to output the data to a web browser or an XML file, you can use
import cgi
s = cgi.escape(u).encode("ascii", "xmlcharrefreplace")
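On Python 3.8 and later cgi.escape is no longer available; a rough equivalent, assuming html.escape as the replacement, might look like the sketch below. The sample text is made up. Passing quote=False mirrors the default behaviour of cgi.escape, which leaves quote characters alone.

```python
from html import escape  # assumed stand-in for cgi.escape on newer Pythons

u = "Gr\xfc\xdfe <b>& more</b>"  # hypothetical sample text

# First escape the markup characters, then turn every non-ASCII
# character into a numeric character reference.
s = escape(u, quote=False).encode("ascii", "xmlcharrefreplace")
```

The result is pure ASCII, so it can be sent to a browser or written to an XML file regardless of the encoding declared there.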
</F>