Hi newsgroup,
I am trying to replace German special characters in strings like
str = re.sub('ö', 'oe', str)
When I work with this, I always get the message
UniCode Error: ASCII decoding error : ordinal not in range(128)
Yes, I have googled, I searched the FAQ, the manual and the Python library
and searched all known sources of information. I played with the Python
builtin function encode to enforce the right encoding, but the error
stays the same. I've read a lot about Unicode and the internal conversion
of strings done by Python, but somehow I've missed the clue.
Nope, Python says Huuups... ordinal not in range(128), ;-(
Does any of you have an idea? Seems like I am too stupid to read
documentation carefully, perhaps I misunderstand something...
Thanks for your help in advance
Sebastian
Sebastian Meyer wrote:
> I am trying to replace German special characters in strings like
> str = re.sub('ö', 'oe', str)
> When I work with this, I always get the message
> UniCode Error: ASCII decoding error : ordinal not in range(128)
I'm experiencing something similar at the moment. I try to
base64-encode Unicode strings and I get the exact same error message.
>>> s = u'ö'
>>> s
u'\xf6'
>>> s.encode('base64')
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\Python23\lib\encodings\base64_codec.py", line 24, in base64_encode
    output = base64.encodestring(input)
  File "C:\Python23\lib\base64.py", line 39, in encodestring
    pieces.append(binascii.b2a_base64(chunk))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)
When I don't specify it's unicode it works:
>>> s = 'ö'
>>> s
'\xf6'
>>> s.encode('base64')
'9g==\n'
The reason I want to base64-encode these unicode strings is because I
get those as input and want to store them in a MySQL database using
SQLObject.
"Sebastian Meyer" <s.*****@technology-network.de> writes:
> I am trying to replace German special characters in strings like
> str = re.sub('ö', 'oe', str)
1) str is the name of a builtin -- often a bad idea to use that as a
variable name.
2) I presume `str' is a unicode string? Try writing the literal as
u'ö' instead (and adding the appropriate coding cookie to your
source file if using Python 2.3). Or I guess you could write it
u'\N{LATIN SMALL LETTER O WITH DIAERESIS}'
Cheers,
mwh
--
Usenet is like a herd of performing elephants with diarrhea --
massive, difficult to redirect, awe-inspiring, entertaining, and
a source of mind-boggling amounts of excrement when you least
expect it. -- spaf (1992)
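For readers on Python 3, where every str is already a Unicode string, the advice above might look roughly like the sketch below. The name `replace_umlauts` and the mapping table are assumptions made for illustration; nothing in the thread prescribes them.

```python
import re

# Hypothetical mapping for illustration; extend it as needed.
UMLAUT_MAP = {
    "\N{LATIN SMALL LETTER A WITH DIAERESIS}": "ae",
    "\N{LATIN SMALL LETTER O WITH DIAERESIS}": "oe",
    "\N{LATIN SMALL LETTER U WITH DIAERESIS}": "ue",
    "\N{LATIN SMALL LETTER SHARP S}": "ss",
}

# One character class built from the keys of the mapping.
_PATTERN = re.compile("[" + "".join(UMLAUT_MAP) + "]")


def replace_umlauts(text):
    """Replace each mapped character with its ASCII transliteration."""
    return _PATTERN.sub(lambda match: UMLAUT_MAP[match.group(0)], text)


result = replace_umlauts("Döspaddel")
print(result)
```

Writing the characters as `\N{...}` escapes keeps the source file pure ASCII, so the result does not depend on how the editor saved the file.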
On Thu, 06 Nov 2003 13:39:25 +0000, Michael Hudson wrote:
> 1) str is the name of a builtin -- often a bad idea to use that as a
> variable name.
It was only the example name for the variable; be assured that I don't
use any builtins as variable names.
Maybe not a good example ... thanks for the hint.
> 2) I presume `str' is a unicode string? Try writing the literal as
> u'ö' instead (and adding the appropriate coding cookie to your
> source file if using Python 2.3). Or I guess you could write it
> u'\N{LATIN SMALL LETTER O WITH DIAERESIS}'
I'll try and report back...
"Sebastian Meyer" <s.*****@technology-network.de> wrote in message
news:pa***************************@technology-network.de...
> I am trying to replace German special characters in strings like
> str = re.sub('ö', 'oe', str)
> When I work with this, I always get the message
> UniCode Error: ASCII decoding error : ordinal not in range(128)
Try adding
sys.setdefaultencoding('latin-1')
to your site.py module, or rewrite your fragment as
old = u'ö'
new = u'oe'
s = re.sub(old.encode('latin-1'), new.encode('latin-1'), s)
If you are running on Windows you might want to change 'latin-1' to 'mbcs',
as that seems to be the most forgiving codec, but it is Windows only.
Joe
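The explicit-encoding variant can be sketched for Python 3 as below, where the data is assumed to arrive as Latin-1 encoded bytes (that assumption, and the variable names, are mine and not from the thread). The point it illustrates is that pattern, replacement and subject must all be of the same kind, either all bytes or all text.

```python
import re

# Assumption: the input arrives as Latin-1 encoded bytes.
raw = "D\u00f6spaddel".encode("latin-1")

# Build a bytes pattern from the same encoding as the data.
pattern = re.escape("\u00f6".encode("latin-1"))
fixed = re.sub(pattern, b"oe", raw)
print(fixed)

# Mixing a text pattern with bytes data is rejected outright in Python 3.
try:
    re.sub("\u00f6", "oe", raw)
    mixed_types_rejected = False
except TypeError:
    mixed_types_rejected = True
```

Python 3 has no `sys.setdefaultencoding`, so only the explicit-encoding route carries over.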
Rudy Schockaert <ru*************@pandoraSTOPSPAM.be> writes:
> I'm experiencing something similar for the moment. I try to
> base64-encode Unicode strings and I get the exact same errormessage.
"base64-encoding Unicode strings" is not a particularly well defined
operation. "base64-encoding" is a way of turning *binary data* into a
particularly "safe" sequence of ascii characters.
Unicode (in some sense) is a family of ways of representing strings of
characters as binary data.
So to base-64 encode a Unicode string, you need to choose *which*
member of this family you're going to use, which is to say the
encoding. UTF-8 would seem a good bet.
But...
> >>> s = u'ö'
> >>> s
> u'\xf6'
> >>> s.encode('base64')
> Traceback (most recent call last):
>   File "<interactive input>", line 1, in ?
>   File "C:\Python23\lib\encodings\base64_codec.py", line 24, in base64_encode
>     output = base64.encodestring(input)
>   File "C:\Python23\lib\base64.py", line 39, in encodestring
>     pieces.append(binascii.b2a_base64(chunk))
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)
>>> u'ö'.encode('utf-8').encode('base64')
'w7Y=\n'
> When I don't specify it's unicode it works:
> >>> s = 'ö'
> >>> s
> '\xf6'
> >>> s.encode('base64')
> '9g==\n'
Well, this works because your terminal seems to be latin-1:
>>> u'ö'.encode('latin-1').encode('base64')
'9g==\n'
What would you like to do with a character that isn't in latin-1?
> The reason I want to base64-encode these unicode strings is because I
> get those as input and want to store them in a MySQL database using
> SQLObject.
! Why can't you just encode them as utf-8 strings? (Or, thinking
about it, why doesn't SQLObject support unicode?)
Cheers,
mwh
--
I think if we have the choice, I'd rather we didn't explicitly put
flaws in the reST syntax for the sole purpose of not insulting the
almighty. -- /will on the doc-sig
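The two-step idea described above (first pick an encoding to turn text into bytes, then base64-encode the bytes) can be sketched in Python 3 with the standard `base64` module. The variable names are illustrative only.

```python
import base64

text = "\u00f6"  # LATIN SMALL LETTER O WITH DIAERESIS

# Step 1: choose an encoding to get bytes. Step 2: base64-encode the bytes.
utf8_b64 = base64.b64encode(text.encode("utf-8"))
latin1_b64 = base64.b64encode(text.encode("latin-1"))
print(utf8_b64, latin1_b64)

# Decoding reverses both steps in the opposite order.
restored = base64.b64decode(utf8_b64).decode("utf-8")
```

Unlike the Python 2 `'base64'` codec shown in the thread, `base64.b64encode` does not append a trailing newline. The different results for UTF-8 and Latin-1 show why the encoding has to be chosen explicitly.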
Sebastian Meyer wrote:
> I am trying to replace German special characters in strings like
> str = re.sub('ö', 'oe', str)
> When I work with this, I always get the message
> UniCode Error: ASCII decoding error : ordinal not in range(128)
Works here, even with my older snake:
Python 2.2.1 (#1, Sep 10 2002, 17:49:17)
[GCC 3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.sub("ö", "oe", "Döspaddel")
'Doespaddel'
>>> re.sub("ö", "oe", u"Döspaddel")
u'Doespaddel'
>>> re.sub("ö", u"oe", u"Döspaddel")
u'Doespaddel'
>>> re.sub(u"ö", u"oe", u"Döspaddel")
u'Doespaddel'
To provoke a UnicodeError, I have to convert a unicode string with umlauts
to str without providing the encoding:
>>> str(u"Döspaddel")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: ASCII encoding error: ordinal not in range(128)
I suspect that you have something similar hidden in your code (i.e.
characters >= 128 that are not converted). The remedy is to explicitly
encode with the appropriate encoding:
>>> u"Döspaddel".encode("latin-1")
'D\xf6spaddel'
Try to build a minimal script that shows the reported behaviour and fix it
or post it for more detailed advice. By the way, don't use str as a
variable name, it's the type of "ordinary" strings.
Peter
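In Python 3 the implicit ASCII conversion no longer happens behind your back, but the same failure can still be provoked by asking for ASCII explicitly. A small sketch with illustrative variable names:

```python
word = "D\u00f6spaddel"

# ASCII cannot represent U+00F6, so this conversion fails.
try:
    word.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Naming a suitable encoding makes the conversion succeed.
latin1_bytes = word.encode("latin-1")
utf8_bytes = word.encode("utf-8")
print(ascii_failed, latin1_bytes, utf8_bytes)
```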
Joe Fromm wrote:
> Try adding
> sys.setdefaultencoding('latin-1')
> to your site.py module, or rewrite your fragment as
At the end of site.py you can enable a piece of code that sets your
default encoding to the current locale of your computer:
if 1:
    # Enable to support locale aware default string encodings.
    import locale
    loc = locale.getdefaultlocale()
    if loc[1]:
        encoding = loc[1]
This works great for me.
Thanks for pointing me to site.py
P.S. I really need some weeks off so I can read all the available
documentation ;-)
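Python 3 has no site.py hook of this kind, but the encoding the platform prefers can still be inspected. The sketch below uses `locale.getpreferredencoding`, which is my substitution for the `getdefaultlocale` call in the snippet above, and it only reports the value; it does not change any default.

```python
import codecs
import locale

# Ask the platform which encoding it prefers for text data.
# Passing False avoids changing the process-wide locale as a side effect.
preferred = locale.getpreferredencoding(False)

# Look the name up to confirm Python has a codec for it.
codec_info = codecs.lookup(preferred)
print(preferred, codec_info.name)
```

The result varies from machine to machine, which is the main argument against relying on a locale-derived default for data that moves between systems.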
Michael Hudson wrote:
> >>> u'ö'.encode('utf-8').encode('base64')
> 'w7Y=\n'
This works indeed. And thanks to Joe Fromm's hint (site.py) I don't have
to worry about it anymore.
> What would you like to do with a character that isn't in latin-1?
Actually, I don't care as long as the encode and decode on the same
machine give me back the original value.
> > The reason I want to base64-encode these unicode strings is because I
> > get those as input and want to store them in a MySQL database using
> > SQLObject.
> ! Why can't you just encode them as utf-8 strings? (Or, thinking
> about it, why doesn't SQLObject support unicode?)
The actual input strings don't really contain unicode text values, but
rather binary values I get as the result of calling win32.NetUserEnum.
The manual of SQLObject (great product btw) explains how you can easily
store binary data in a SQL table by encoding it when setting and
decoding it when getting the value. That is just what I was trying to do.
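The encode-on-set, decode-on-get idea can be sketched with a plain Python 3 property. `UserRecord` and its attributes are hypothetical names for illustration; this is not SQLObject's API, only the general pattern.

```python
import base64


class UserRecord:
    """Hypothetical record class; NOT SQLObject's API."""

    def __init__(self):
        # Stands in for the text column that would be written to the table.
        self._stored = ""

    @property
    def payload(self):
        # Decode when getting the value.
        return base64.b64decode(self._stored)

    @payload.setter
    def payload(self, value):
        # Encode when setting the value; base64 output is pure ASCII.
        self._stored = base64.b64encode(value).decode("ascii")


record = UserRecord()
record.payload = b"\x00\xf6\xff binary"
print(record._stored, record.payload)
```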
Rudy Schockaert <ru*************@pandoraSTOPSPAM.be> writes:
> > >>> u'ö'.encode('utf-8').encode('base64')
> > 'w7Y=\n'
> This works indeed. And thanks to Joe Fromm's hint (site.py) I don't
> have to worry about it anymore.
Well, I'm from the setdefaultencoding-is-evil camp, but it sounds like
you're in a pretty icky situation.
> > What would you like to do with a character that isn't in latin-1?
> Actually, I don't care as long as the encode and decode on the same
> machine give me back the original value.
Huh?
> > ! Why can't you just encode them as utf-8 strings? (Or, thinking
> > about it, why doesn't SQLObject support unicode?)
> The actual input strings don't really contain unicode text values, but
> rather binary values i get as result from calling win32.NetUserEnum.
Oh, so they're not really unicode strings at all? Blech. That's
really really nasty. Binary data should really be represented as
(narrow) strings in Python. Perhaps the utf-16-le codec would be the
most appropriate...
Cheers,
mwh
--
Q: What are 1000 lawyers at the bottom of the ocean?
A: A good start.
(A lawyer told me this joke.)
-- Michael Ströder, comp.lang.python
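The utf-16-le suggestion can be sketched in Python 3 as follows, assuming (my assumption, not confirmed in the thread) that the "unicode string" is really a carrier for 16-bit units of binary data. The sample value is made up for illustration.

```python
import base64

# Assumption: each character stands for one 16-bit unit of binary data.
wrapped = "\u00f6\u0100\uffff"

# utf-16-le writes every such unit as two bytes, low byte first.
raw = wrapped.encode("utf-16-le")

# The bytes can then be base64-encoded for storage and recovered later.
stored = base64.b64encode(raw).decode("ascii")
restored = base64.b64decode(stored).decode("utf-16-le")
print(raw, stored)
```

One caveat: units in the surrogate range (U+D800 to U+DFFF) are rejected by the codec's default error handling, so truly arbitrary 16-bit data would need extra care.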