re.sub() backreference bug?

using this code:

import re
s = 'HelloWorld19-FooBar'
s = re.sub(r'([A-Z]+)([A-Z][a-z])', "\1_\2", s)
s = re.sub(r'([a-z\d])([A-Z])', "\1_\2", s)
s = re.sub('-', '_', s)
s = s.lower()
print "s: %s" % s

i expect to get:
hello_world19_foo_bar

but instead i get:
hell☺_☻orld19_fo☺_☻ar

(in case the above doesn't come across the same, it's:
hellX_Yorld19_foX_Yar, where X is a white smiley face and Y is a black
smiley face !!)

is this a bug, or am i doing something wrong?

tested on
Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)]
on win32

and
Python 2.4.4c0 (#2, Jul 30 2006, 15:43:58) [GCC 4.1.2 20060715
(prerelease) (Debian 4.1.1-9)] on linux2

Aug 17 '06 #1
> s = re.sub(r'([A-Z]+)([A-Z][a-z])', "\1_\2", s)
> s = re.sub(r'([a-z\d])([A-Z])', "\1_\2", s)
> i expect to get:
> hello_world19_foo_bar
>
> but instead i get:
> hell☺_☻orld19_fo☺_☻ar

Looks like you need to be using "raw" strings for your
replacements as well:

s = re.sub(r'([A-Z]+)([A-Z][a-z])', r"\1_\2", s)
s = re.sub(r'([a-z\d])([A-Z])', r"\1_\2", s)

This should allow the backslashes to be parsed as backslashes,
not as escape-sequences (which in this case are likely getting
interpreted as octal numbers)
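
A minimal sketch of the difference, written in Python 3 syntax (the thread itself uses Python 2). The helper name `to_snake_case` is hypothetical and only wraps the corrected substitutions for illustration:

```python
import re

# In a normal string literal, "\1" is an octal escape for the character
# with code 1, so the replacement is three characters long.
assert "\1" == "\x01" and len("\1_\2") == 3

# In a raw string literal the backslash survives, so re.sub receives
# the five characters \1_\2 and treats \1 and \2 as group references.
assert len(r"\1_\2") == 5


def to_snake_case(name):
    """Hypothetical helper: the thread's substitutions with raw replacements."""
    name = re.sub(r'([A-Z]+)([A-Z][a-z])', r'\1_\2', name)
    name = re.sub(r'([a-z\d])([A-Z])', r'\1_\2', name)
    return name.replace('-', '_').lower()


print(to_snake_case('HelloWorld19-FooBar'))
```

With the raw replacement strings, the sample input from the question comes out as hello_world19_foo_bar.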

-tkc

Aug 17 '06 #2

je*******@gmail.com wrote:
> using this code:
>
> import re
> s = 'HelloWorld19-FooBar'
> s = re.sub(r'([A-Z]+)([A-Z][a-z])', "\1_\2", s)
> s = re.sub(r'([a-z\d])([A-Z])', "\1_\2", s)
> s = re.sub('-', '_', s)
> s = s.lower()
> print "s: %s" % s
>
> i expect to get:
> hello_world19_foo_bar
>
> but instead i get:
> hell☺_☻orld19_fo☺_☻ar
>
> (in case the above doesn't come across the same, it's:
> hellX_Yorld19_foX_Yar, where X is a white smiley face and Y is a black
> smiley face !!)
>
> is this a bug, or am i doing something wrong?

Tim's given you the solution to the problem: with the re module,
*always* use raw strings in regexes and substitution strings.

Here's a simple diagnostic tool that you can use when the visual
presentation of a result leaves you wondering [did you get smiley faces
on Windows in IDLE? on Linux?]:

>>> print repr(s)
'hell\x01_\x02orld19_fo\x01_\x02ar'
>>> print "s: %r" % s
s: 'hell\x01_\x02orld19_fo\x01_\x02ar'
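
The same diagnostic can be sketched as a standalone script. This version assumes Python 3 syntax (print as a function), and the variable name `buggy` is just illustrative; it reproduces the non-raw replacement from the question and then inspects the result:

```python
import re

# Reproduce the problem: "\1_\2" is chr(1) + "_" + chr(2), not a backreference.
buggy = re.sub(r'([a-z\d])([A-Z])', "\1_\2", 'HelloWorld19-FooBar')
buggy = buggy.replace('-', '_').lower()

# repr() escapes non-printable characters instead of sending them to the terminal.
print(repr(buggy))

# %r applies repr() while formatting.
print("s: %r" % buggy)

# List the code points of anything that is not printable.
print([hex(ord(c)) for c in buggy if not c.isprintable()])
```

The control characters with codes 1 and 2 are what some consoles (for example, ones using code page 437) draw as the white and black smiley faces.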

HTH,
John

Aug 17 '06 #3
> Tim's given you the solution to the problem: with the re module,
> *always* use raw strings in regexes and substitution strings.

"always" is so...um...carved in stone. One can forgo using raw
strings if one prefers having one's strings look like they were
trampled by a stampede of creatures with backslash-shaped hooves...

uh...yeah...stick with raw strings. :)
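
For comparison, a small sketch (Python 3 syntax; the names `escaped` and `raw` are illustrative) of what the non-raw alternative looks like. Every backslash meant for the regex engine has to be doubled:

```python
import re

# Without the r prefix, each backslash intended for re must be written twice.
assert "\\1_\\2" == r"\1_\2"
assert "([a-z\\d])([A-Z])" == r"([a-z\d])([A-Z])"

# Both spellings hand identical strings to re.sub, so the results match.
escaped = re.sub("([a-z\\d])([A-Z])", "\\1_\\2", "FooBar")
raw = re.sub(r"([a-z\d])([A-Z])", r"\1_\2", "FooBar")
print(escaped, raw)
```

The two forms are equivalent; the raw version is simply easier to read and harder to get wrong.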

-tkc

Aug 17 '06 #4
thanks - that's the trick.

On 8/17/06, Tim Chase <py*********@tim.thechases.com> wrote:
> Looks like you need to be using "raw" strings for your
> replacements as well:
>
> s = re.sub(r'([A-Z]+)([A-Z][a-z])', r"\1_\2", s)
> s = re.sub(r'([a-z\d])([A-Z])', r"\1_\2", s)
>
> This should allow the backslashes to be parsed as backslashes,
> not as escape-sequences (which in this case are likely getting
> interpreted as octal numbers)
>
> -tkc
Aug 18 '06 #5
