
Yet another UnicodeDecodeError problem

ashitpro
Expert 100+
P: 542
    getResponse = "HTTP/1.1 200 OK\r\nContent-Length: %d\r\nCache-Control: no-cache\r\n\r\n%s" % (payloadLength, byteString)
    getResponse = unicode(getResponse)
The second line raises a UnicodeDecodeError.
I am trying to replace "200 OK" in the getResponse string with "500", using getResponse.replace(...).

Internally, replace converts the string to unicode; I have made that conversion explicit here to make the problem easier to see.

Any help?
Sep 7 '10 #1
6 Replies


bvdet
Expert Mod 2.5K+
P: 2,851
What version of Python? Is your default encoding "ascii"? When converting from a standard string to a unicode string, a UnicodeError exception may be raised if a character that cannot be converted is encountered. A full traceback may provide the direct cause of the exception.
    >>> s = "abcdef%s" % ("\xfc")
    >>> print s
    abcdefü
    >>> unicode(s)
    Traceback (most recent call last):
      File "<interactive input>", line 1, in ?
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 6: ordinal not in range(128)
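For what it's worth, here is a minimal sketch of how naming a codec explicitly avoids the error. It assumes the stray byte is Latin-1 text, which is only a guess; the thread does not say what encoding the data actually uses. The variable names are made up for illustration.

```python
# A byte string with one non-ASCII byte (0xfc is "ü" in Latin-1).
raw = b"abcdef\xfc"

# Decoding as ASCII fails; this is what unicode(s) does implicitly
# when the default encoding is "ascii".
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Naming a codec that covers the byte succeeds
# (assumption: the data really is Latin-1).
text = raw.decode("latin-1")

# errors="replace" never raises, but it substitutes U+FFFD for each
# undecodable byte, so the original byte is lost.
lossy = raw.decode("ascii", "replace")
```

Note that this only makes sense for text. If the byte is part of a binary payload, decoding it with any codec is probably the wrong move.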
Sep 7 '10 #2

ashitpro
Expert 100+
P: 542
I am using Python 2.6.
The default encoding is "ascii".

How do you tackle this problem?

Let me put it more plainly:

getResponse = getResponse + (some binary data)

getResponse.replace("this_str", "that_str")

Obviously, the second statement will try to decode the string as 'ascii' and throw an exception.

Is there a standard way to deal with binary data in a string that would solve my problem?
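One possible approach, sketched below, is to never decode at all: keep the response and both replace arguments as byte strings, and touch only the header block so the binary body is left alone. The helper name `replace_status` and the sample payload are hypothetical, not taken from the original code. In Python 2.6 the implicit ASCII decode is triggered when a byte string meets a unicode object, so the assumption here is that every piece stays a plain byte string.

```python
def replace_status(response, old=b"200 OK", new=b"500"):
    """Swap the status text in the header block only; the body is untouched."""
    # The first blank line (CRLF CRLF) ends the HTTP headers.
    head, sep, body = response.partition(b"\r\n\r\n")
    # count=1: change only the first occurrence, i.e. the status line.
    return head.replace(old, new, 1) + sep + body


# Hypothetical binary payload that happens to contain "200 OK" as well.
payload = b"\xfc\x00\xff 200 OK inside the body"
response = (b"HTTP/1.1 200 OK\r\nContent-Length: "
            + str(len(payload)).encode("ascii")
            + b"\r\nCache-Control: no-cache\r\n\r\n"
            + payload)

fixed = replace_status(response)
```

Splitting on the blank line first means a "200 OK" that appears by chance inside the binary data is not rewritten, and Content-Length stays correct because the body is unchanged.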
Sep 7 '10 #3

bvdet
Expert Mod 2.5K+
P: 2,851
Check out the struct module.
Sep 7 '10 #4

Expert 100+
P: 624
Someday I will have to take the time to learn Unicode, or just switch to Python 3.x. A workaround until then is to drop down to decimal (ordinal) values and compare those instead of the characters.
    s = "abcdef%s200 OK\r\notherstuff" % ("\xfc")
    to_find = [ord(ltr) for ltr in "200 OK"]
    new_str_list = []
    len_s = len(s)
    for ctr in range(len_s):
        found = False
        ltr = s[ctr]
        if ord(ltr) == to_find[0]:     ## first characters match
            ## only a candidate if the whole pattern fits before the end of s
            found = ctr + len(to_find) <= len_s
            for x in range(len(to_find)):
                if found and ord(s[ctr+x]) != to_find[x]:
                    found = False         ## not a match
        if found:
            new_str_list.append("5")      ## replace "2", giving "500 OK"
        else:
            new_str_list.append(ltr)
    print "".join(new_str_list)
Sep 8 '10 #5

ashitpro
Expert 100+
P: 542
I have no problem switching to Python 3.x.

Will that make any difference? If yes, how?
Sep 9 '10 #6

Expert 100+
P: 624
"In Python 3, all strings are sequences of Unicode characters."
See Section 4.3 at Dive Into Python 3 for more info.
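As a rough illustration of what that changes in practice (Python 3 only, with made-up sample data): text is `str`, raw data is `bytes`, and the two are never converted into each other implicitly. Mixing them raises a TypeError straight away instead of attempting a hidden ASCII decode.

```python
# Raw response data as bytes, including a non-ASCII byte in the body.
data = b"HTTP/1.1 200 OK\r\n\r\n\xfc"

# bytes methods take bytes arguments; nothing is decoded.
patched = data.replace(b"200 OK", b"500")

# Passing str arguments to a bytes method is a TypeError in Python 3,
# not a silent conversion that fails later on the 0xfc byte.
try:
    data.replace("200 OK", "500")
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Going from bytes to text is always an explicit step with a named codec.
status_line = data.split(b"\r\n", 1)[0].decode("ascii")
```

So the switch does not remove the need to think about encodings, but it does make the boundary between binary data and text explicit.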
Sep 9 '10 #7
