@Glenton
Hi,
Thanks for your reply. Let me elaborate on the problem:
I used the urllib module to open and read a web site. The script looks like this:
import urllib
txt = urllib.urlopen("http://www.terme-catez.si").read()
txt
This gives a result like the one below:
....some earlier portion is skipped....
<div class="noga">\r\n <p>\r\n Vse gradivo\r\n © 1999-\r\n 2010\r\n
<a href="http://www.terme-catez.si" target="_blank">Terme \xc4\x8cate\xc5\xbe</a>\r\n
Slovenija\r\n <br />\r\n Spletne re\xc5\xa1itve\r\n © 1996-\r\n 2010\r\n
<a href="http://www.tmedia.biz" target="_blank">(T)media</a></p>\r\n </div>\r\n</div>\r\n<div class="o
As you can see in the output above, accented characters look like:
Terme \xc4\x8cate\xc5\xbe (original is Terme Čatež).
However, I want Terme Čatež to become Terme Catez. So byte sequences like \xc4\x8c or \xc5\xbe (which look like the UTF-8 encodings of Č and ž) should be converted into unaccented characters.
Is there any way to replace all such sequences with unaccented characters?
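To make the goal concrete, here is a rough sketch of the kind of conversion I mean, not a confirmed solution. It assumes the page bytes are UTF-8 (which the \xc4\x8c sequences suggest), decodes them, splits each accented letter into a base letter plus combining mark with unicodedata.normalize, and drops the marks. The name strip_accents is just a hypothetical helper for illustration.

```python
import unicodedata

def strip_accents(raw_bytes, encoding="utf-8"):
    # Hypothetical helper: the encoding is an assumption, not read from the page.
    text = raw_bytes.decode(encoding)
    # NFD splits e.g. U+010C (C with caron) into "C" + U+030C (combining caron).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    return u"".join(ch for ch in decomposed if not unicodedata.combining(ch))

sample = b"Terme \xc4\x8cate\xc5\xbe"
print(strip_accents(sample))  # Terme Catez
```

Letters that have no decomposition (for example đ) would pass through unchanged with this approach, so it may not cover every case.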
Thanks