string.replace non-ascii characters

Greetings Pythonistas. I have recently discovered a strange anomaly
with string.replace. Seemingly at random, it does not deal with
characters of ordinal value > 127. I ran into this problem while
downloading auction web pages from ebay and trying to replace the
"\xa0" (dec 160, nbsp char in iso-8859-1) in the string I got from
urllib2. Yet today, all is fine, no problems whatsoever. Sadly, I
did not save the exact error message, but I believe it was a
ValueError thrown on string.replace and the message was something to
the effect "character value not within range(128)".

Some googling seemed to indicate other people have reported similar
troubles:

http://mail.python.org/pipermail/pyt...ly/391617.html

Anyone have any enlightening advice for me?

--
Sam Peterson
skpeterson At nospam ucdavis.edu
"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown
Feb 12 '07 #1
Samuel Karl Peterson wrote:
Greetings Pythonistas. I have recently discovered a strange anomaly
with string.replace. Seemingly at random, it does not deal with
characters of ordinal value > 127. I ran into this problem while
downloading auction web pages from ebay and trying to replace the
"\xa0" (dec 160, nbsp char in iso-8859-1) in the string I got from
urllib2. Yet today, all is fine, no problems whatsoever. Sadly, I
did not save the exact error message, but I believe it was a
ValueError thrown on string.replace and the message was something to
the effect "character value not within range(128)".
Was it something like this?
>>> u'\xa0'.replace('\xa0', '')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 0: ordinal not in range(128)

You might get that if you're mixing str and unicode. If both strings are
of one type or the other, you should be okay:
>>> u'\xa0'.replace(u'\xa0', '')
u''
>>> '\xa0'.replace('\xa0', '')
''
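
For anyone reading this on Python 3, here is a minimal sketch of the same rule, assuming the two types are now bytes and str. There the mix is rejected with a TypeError rather than the UnicodeDecodeError shown above. The names strip_nbsp and mixing_fails are made up for illustration only.

```python
# Sketch for Python 3, where the two string types are bytes and str.
NBSP_TEXT = "\xa0"    # U+00A0 as text
NBSP_BYTES = b"\xa0"  # the same character encoded as ISO-8859-1


def strip_nbsp(value):
    """Remove non-breaking spaces, using arguments that match value's type."""
    if isinstance(value, bytes):
        return value.replace(NBSP_BYTES, b"")
    return value.replace(NBSP_TEXT, "")


def mixing_fails():
    """Return True if searching for bytes inside a str raises TypeError."""
    try:
        "\xa0".replace(b"\xa0", "")
    except TypeError:
        return True
    return False
```

The point is the same as in the interactive session: keep both the string and the replace() arguments in one type.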

STeVe
Feb 12 '07 #2
Steven Bethard <st************@gmail.com> on Sun, 11 Feb 2007 22:23:59
-0700 didst step forth and proclaim thus:
Samuel Karl Peterson wrote:
Greetings Pythonistas. I have recently discovered a strange anomaly
with string.replace. Seemingly at random, it does not deal with
characters of ordinal value > 127. I ran into this problem while
downloading auction web pages from ebay and trying to replace the
"\xa0" (dec 160, nbsp char in iso-8859-1) in the string I got from
urllib2. Yet today, all is fine, no problems whatsoever. Sadly, I
did not save the exact error message, but I believe it was a
ValueError thrown on string.replace and the message was something to
the effect "character value not within range(128)".

Was it something like this?
>>> u'\xa0'.replace('\xa0', '')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 0: ordinal not in range(128)

Yeah that looks like exactly what was happening, thank you. I wonder
why I had a unicode string though. I thought urllib2 always spat out
a plain string. Oh well.

u'\xa0'.encode('latin-1').replace('\xa0', " ")

Hooray.
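
A possible alternative for Python 3 readers, going the other direction from the encode() call above: decode the downloaded bytes once, then do all replacing on text. This is only a sketch. It assumes the page really is ISO-8859-1, as described in the first post, and normalize_nbsp is a hypothetical helper name, not part of urllib2 or any other library.

```python
def normalize_nbsp(raw, encoding="iso-8859-1"):
    """Decode downloaded bytes (if needed) and turn NBSP into plain spaces.

    The default encoding is an assumption taken from the thread; a real
    page should be decoded with whatever charset it actually declares.
    """
    text = raw.decode(encoding) if isinstance(raw, bytes) else raw
    return text.replace("\xa0", " ")
```

Because the function accepts either type and always returns text, the caller no longer has to know which type the download produced.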
--
Sam Peterson
skpeterson At nospam ucdavis.edu
"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown
Feb 12 '07 #3
On Mon, 12 Feb 2007 02:38:29 -0300, Samuel Karl Peterson
<sk********@nospam.please.ucdavis.edu> wrote:

Sorry to steal the thread! This is only related to your signature:
"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown
I just did that last week. Around 250 useless lines removed from a
1000-line module. I think the original coder didn't read the tutorial past the
dictionary examples: *all* functions returned a dictionary or list of
dictionaries! Of course using different names for the same thing here and
there, ugh... I just threw in a few classes and containers, removed all
the nonsensical packing/unpacking of data going back and forth, for a net
decrease of 25% in size (and a great increase in robustness,
maintainability, etc).
If I were paid for the number of lines *written* that would not be a great
deal :)

--
Gabriel Genellina

Feb 12 '07 #4
On Mon, 12 Feb 2007 02:38:29 -0300, Samuel Karl Peterson
<sk********@nospam.please.ucdavis.edu> wrote:

Sorry to steal the thread! This is only related to your signature:
"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown
I just did that last week. Around 250 useless lines removed from a
1000-line module. I think the original coder didn't read the tutorial past the
dictionary examples: *all* functions returned a dictionary or list of
dictionaries! Of course using different names for the same thing here and
there, ugh... I just threw in a few classes and containers, removed all
the nonsensical packing/unpacking of data going back and forth, for a net
decrease of 25% in size (and a great increase in robustness,
maintainability, etc).
If I were paid for the number of lines *written* that would not be a great
deal :)

--
Gabriel Genellina

Feb 12 '07 #5
On Mon, 12 Feb 2007 02:38:29 -0300, Samuel Karl Peterson
<sk********@nospam.please.ucdavis.edu> wrote:

Sorry to steal the thread! This is only related to your signature:
"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown
I just did that last week. Around 250 useless lines removed from a
1000-line module. I think the original coder didn't read the tutorial past the
dictionary examples: *all* functions returned a dictionary or list of
dictionaries! Of course using different names for the same thing here and
there, ugh... I just threw in a few classes and containers, removed all
the nonsensical packing/unpacking of data going back and forth, for a net
decrease of 25% in size (and a great increase in robustness,
maintainability, etc).
If I were paid for the number of lines *written* that would not be a great
deal :)

--
Gabriel Genellina

Feb 12 '07 #6
On Mon, 12 Feb 2007 03:01:55 -0300, Gabriel Genellina wrote:
On Mon, 12 Feb 2007 02:38:29 -0300, Samuel Karl Peterson
<sk********@nospam.please.ucdavis.edu> wrote:

Sorry to steal the thread! This is only related to your signature:
"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown

I just did that last week. Around 250 useless lines removed from a
1000-line module.
[snip]

Hot out of uni, my first programming job was assisting a consultant who
was writing an application in Apple's "HyperTalk", a so-called "fourth
generation language" with an English-like syntax, aimed at non-programmers.

Virtually the first thing I did was refactor part of his code that looked
something like this:

set the name of button id 1 to 1
set the name of button id 2 to 2
set the name of button id 3 to 3
....
set the name of button id 399 to 399
set the name of button id 400 to 400
into something like this:

for i = 1 to 400:
    set the name of button id i to i
--
Steven D'Aprano

Feb 12 '07 #7
"Gabriel Genellina" <ga******@yahoo.com.ar> wrote in triplicate:
If I were paid for the number of lines *written* that would not be a
great deal :)
You don't by any chance get paid by the number of posts to c.l.python?
Feb 12 '07 #8
Duncan Booth wrote:
"Gabriel Genellina" <ga******@yahoo.com.ar> wrote in triplicate:
If I were paid for the number of lines *written* that would not be a
great deal :)

You don't by any chance get paid by the number of posts to c.l.python?
I was thinking the same thing.
Feb 12 '07 #9
On Feb 12, 11:44 pm, Deniz Dogan <kristn...@nospam.com> wrote:
Duncan Booth wrote:
"Gabriel Genellina" <gagsl...@yahoo.com.ar> wrote in triplicate:
If I were paid for the number of lines *written* that would not be a
great deal :)
You don't by any chance get paid by the number of posts to c.l.python?

I was thinking the same thing.
O maker of the monstrous millisecond-muncher, I was thinking that you
were paid by the number of times that you typed 3600000 :-)

Feb 12 '07 #10
On Mon, 12 Feb 2007 07:44:14 -0300, Duncan Booth
<du**********@invalid.invalid> wrote:
"Gabriel Genellina" <ga******@yahoo.com.ar> wrote in triplicate:
If I were paid for the number of lines *written* that would not be a
great deal :)

You don't by any chance get paid by the number of posts to c.l.python?
I post a few messages but certainly I'm not the most prolific poster here!

--
Gabriel Genellina

Feb 12 '07 #11
