urllib.unquote + unicode

Hello all,

I am using urllib.unquote_plus to unquote a string. Sometimes I get a
strange string to unquote, for example "spolu%u017E%E1ci.cz". The
problem is that some application decided to quote a non-ASCII
character as %uxxxx directly, instead of encoding it and quoting the
result byte by byte.

Python (2.4.1) simply returns 'spolu%u017E\xe1ci.cz', which is likely
not what the application meant.

My question is: is this %u quoting a standard (i.e., urllib is in the
wrong), or is it not (i.e., the application is in the wrong and urllib
silently ignores the '%u0' - why?)? And most importantly, is there a
simple workaround to get it working as expected?

Cheers!

Nov 13 '07 #1
On Tue, 13 Nov 2007 13:14:18 -0300, koara <ko***@atlas.cz> wrote:

> I am using urllib.unquote_plus to unquote a string. Sometimes I get a
> strange string like for example "spolu%u017E%E1ci.cz" to unquote. Here
> the problem is that some application decided to quote a non-ASCII
> character as %uxxxx directly, instead of using an encoding and quoting
> byte per byte.
>
> Python (2.4.1) simply returns 'spolu%u017E\xe1ci.cz', which is likely
> not what the application meant.
>
> My question is, is this %u quoting a standard (i.e., urllib is in the
> wrong),

Not that I know of (and that doesn't prove anything).

> is it not (i.e., the application is in the wrong and urllib
> silently ignores the '%u0' - why?), and most importantly, is there a
> simple workaround to get it working as expected?

Try this (untested):

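from urllib import unquote_plus  # Python 2 module layout

def unquote_plus_u(source):
    # Decode the standard %XX escapes (and '+' as space) first.
    result = unquote_plus(source)
    if '%u' in result:
        # Rewrite each %uXXXX as \uXXXX and let the codec decode it.
        result = result.replace('%u', '\\u').decode('unicode_escape')
    return result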

--
Gabriel Genellina

Nov 14 '07 #2
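
For readers on Python 3, where str has no .decode() method, the same idea can be sketched with a regular expression. The %uXXXX form is what JavaScript's legacy escape() function emits; it is not part of the percent-encoding defined in RFC 3986, which is why the standard library leaves it alone. This is a minimal sketch, not a definitive fix: the function name is reused from the reply above, the `encoding` parameter and its latin-1 default are assumptions about how the plain %XX bytes were produced, and `_PERCENT_U` is a name made up for illustration.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical helper pattern: the non-standard %uXXXX form, exactly four hex digits.
_PERCENT_U = re.compile(r"%u([0-9a-fA-F]{4})")


def unquote_plus_u(source, encoding="latin-1"):
    """Unquote `source`, also decoding non-standard %uXXXX escapes.

    `encoding` (an assumption) says how ordinary %XX bytes are interpreted.
    """
    # re.split with one capture group yields alternating pieces:
    # even indexes are ordinary text, odd indexes are the captured hex digits.
    pieces = _PERCENT_U.split(source)
    out = []
    for index, piece in enumerate(pieces):
        if index % 2:
            # Captured hex digits of a %uXXXX escape: emit that code point.
            out.append(chr(int(piece, 16)))
        else:
            # Ordinary text: let the standard library handle %XX and '+'.
            out.append(unquote_plus(piece, encoding=encoding))
    return "".join(out)


# The standard library on its own leaves %u017E untouched:
print(unquote_plus("spolu%u017E%E1ci.cz", encoding="latin-1"))  # spolu%u017Eáci.cz
# The sketch decodes both kinds of escape:
print(unquote_plus_u("spolu%u017E%E1ci.cz"))  # spolužáci.cz
```

Handling each piece separately means an escaped percent sign such as "%25u017E" still comes out as the literal text "%u017E" instead of being decoded a second time. Characters outside the Basic Multilingual Plane would arrive as two %uXXXX surrogate halves, which this sketch does not recombine.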
