473,395 Members | 1,678 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,395 software developers and data experts.

re sub help

hi

i have a string :
a =
"this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"

inside the string, there are "\n". I don't want to substitute the '\n'
in between
the [startdelim] and [enddelim] to ''. I only want to get rid of the
'\n' everywhere else.

i have read the tutorial and came across negative/positive lookahead
and i think it can solve the problem.but am confused on how to use it.
anyone can give me some advice? or is there better way other than
lookaheads ...thanks..

Nov 5 '05 #1
7 1356
s9************@yahoo.com writes:
hi

i have a string :
a =
"this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"

inside the string, there are "\n". I don't want to substitute the '\n'
in between
the [startdelim] and [enddelim] to ''. I only want to get rid of the
'\n' everywhere else.


Well, I'm not an expert on re's - I've only been using them for three
decades - but I'm not sure this can be done with a single re, as the
pattern you're interested in depends on context, and re's don't handle
that well.

On the

--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Nov 5 '05 #2
s9************@yahoo.com writes:
hi

i have a string :
a =
"this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"

inside the string, there are "\n". I don't want to substitute the '\n'
in between
the [startdelim] and [enddelim] to ''. I only want to get rid of the
'\n' everywhere else.


Well, I'm not an expert on re's - I've only been using them for three
decades - but I'm not sure this can be done with a single re, as the
pattern you're interested in depends on context, and re's don't handle
that well.

On the other hand, this is fairly straightforward with simple string
operations:
a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
sd = '[startdelim]'
ed = '[enddelim]'
s, r = a.split(sd, 1)
m, e = r.split(ed, 1)
a = s + sd + m.replace('\n', '') + ed + e
a 'this\nis\na\nsentence[startdelim]thisisanother[enddelim]this\nis\n'


<mike

--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Nov 5 '05 #3
thanks for the reply.

i am still interested about using re, i find it useful. am still
learning it's uses.
so i did something like this for a start, trying to get everything in
between [startdelim] and [enddelim]

a =
"this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"

t = re.compile(r"\[startdelim\](.*)\[enddelim\]")

t.findall(a)
but it gives me []. it's the "\n" that prevents the results.
why can't (.*) work in this case? Or am i missing some steps to "read"
in the "\n"..?
thanks.

Nov 5 '05 #4
<s9************@yahoo.com> wrote:
i am still interested about using re, i find it useful. am still
learning it's uses.
so i did something like this for a start, trying to get everything in
between [startdelim] and [enddelim]

a =
"this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"

t = re.compile(r"\[startdelim\](.*)\[enddelim\]")
"*" is greedy (=searches backwards from the right end), so that won't
do the right thing if you have multiple delimiters

to fix this, use "*?" instead.
t.findall(a)
but it gives me []. it's the "\n" that prevents the results.
why can't (.*) work in this case? Or am i missing some steps to "read"
in the "\n"..?


http://docs.python.org/lib/re-syntax.html

(Dot.) In the default mode, this matches any character except
a newline. If the DOTALL flag has been specified, this matches any
character including a newline.

to fix this, pass in re.DOTALL or re.S as the flag argument, or
prepend (?s) to the expression.

</F>

Nov 5 '05 #5
s9************@yahoo.com writes:
i am still interested about using re, i find it useful. am still
learning it's uses.
so i did something like this for a start, trying to get everything in
between [startdelim] and [enddelim]

a =
"this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"

t = re.compile(r"\[startdelim\](.*)\[enddelim\]")

t.findall(a)
but it gives me []. it's the "\n" that prevents the results.
why can't (.*) work in this case? Or am i missing some steps to "read"
in the "\n"..?
thanks.


Newlines are magic to regular expressions. You use the flags in re to
change that. In this case, you want . to match them, so you use the
DOTALL flag:
a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
t = re.compile(r"\[startdelim\](.*)\[enddelim\]", re.DOTALL)
t.findall(a) ['this\nis\nanother']


<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Nov 5 '05 #6
s9************@yahoo.com wrote:
hi

i have a string :
a =
"this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"

inside the string, there are "\n". I don't want to substitute the '\n'
in between
the [startdelim] and [enddelim] to ''. I only want to get rid of the
'\n' everywhere else.


Here is a solution using re.sub and a class that maintains state. It works when the input text contains multiple startdelim/enddelim pairs.

import re

a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n" * 2

class subber(object):
def __init__(self):
self.delimiterSeen = False

def __call__(self, m):
text = m.group()
if text == 'startdelim':
self.delimiterSeen = True
return text

if text == 'enddelim':
self.delimiterSeen = False
return text

if self.delimiterSeen:
return text

return ''

delimRe = re.compile('\n|startdelim|enddelim')

newText = delimRe.sub(subber(), a)
print repr(newText)
Kent
Nov 5 '05 #7
On 4 Nov 2005 22:49:03 -0800, s9************@yahoo.com wrote:
hi

i have a string :
a =
"this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"

inside the string, there are "\n". I don't want to substitute the '\n'
in between
the [startdelim] and [enddelim] to ''. I only want to get rid of the
'\n' everywhere else.

i have read the tutorial and came across negative/positive lookahead
and i think it can solve the problem.but am confused on how to use it.
anyone can give me some advice? or is there better way other than
lookaheads ...thanks..


Sometimes splitting and processing the pieces selectively can be a solution, e.g.,
if delimiters are properly paired, splitting (with parens to keep matches) should
give you a repeating pattern modulo 4 of
<"everywhere else" as you said><first delim><between><second delim> ...
a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
import re
splitter = re.compile(r'(?s)(\[startdelim\]|\[enddelim\])')
sp = splitter.split(a)
sp ['this\nis\na\nsentence', '[startdelim]', 'this\nis\nanother', '[enddelim]', 'this\nis\n'] ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)]) 'thisisasentence[startdelim]this\nis\nanother[enddelim]thisis' print ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)]) thisisasentence[startdelim]this
is
another[enddelim]thisis

I haven't checked for corner cases, but HTH
Maybe I'll try two pairs of delimiters:
a += "2222\n33\n4\n55555555[startdelim]6666\n77\n8888888[enddelim]9999\n00\n"
sp = splitter.split(a)
print ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)]) thisisasentence[startdelim]this
is
another[enddelim]thisis222233455555555[startdelim]6666
77
8888888[enddelim]999900

which came from sp ['this\nis\na\nsentence', '[startdelim]', 'this\nis\nanother', '[enddelim]', 'this\nis\n2222\n33
\n4\n55555555', '[startdelim]', '6666\n77\n8888888', '[enddelim]', '9999\n00\n']

Which had the replacing when not i%4 was true
for i,s in enumerate(sp): print '%6s: %r'%(not i%4,s)

...
True: 'this\nis\na\nsentence'
False: '[startdelim]'
False: 'this\nis\nanother'
False: '[enddelim]'
True: 'this\nis\n2222\n33\n4\n55555555'
False: '[startdelim]'
False: '6666\n77\n8888888'
False: '[enddelim]'
True: '9999\n00\n'

Regards,
Bengt Richter
Nov 6 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

21
by: Dave | last post by:
After following Microsofts admonition to reformat my system before doing a final compilation of my app I got many warnings/errors upon compiling an rtf file created in word. I used the Help...
9
by: Tom | last post by:
A question for gui application programmers. . . I 've got some GUI programs, written in Python/wxPython, and I've got a help button and a help menu item. Also, I've got a compiled file made with...
6
by: wukexin | last post by:
Help me, good men. I find mang books that introduce bit "mang header files",they talk too bit,in fact it is my too fool, I don't learn it, I have do a test program, but I have no correct doing...
3
by: Colin J. Williams | last post by:
Python advertises some basic service: C:\Python24>python Python 2.4.1 (#65, Mar 30 2005, 09:13:57) on win32 Type "help", "copyright", "credits" or "license" for more information. >>> With...
7
by: Corepaul | last post by:
Missing Help Files When I enter "recordset" as the keyword and search the Visual Basic Help index, I get many topics of interest in the resulting list. But there isn't any information available...
5
by: Steve | last post by:
I have written a help file (chm) for a DLL and referenced it using Help.ShowHelp My expectation is that a developer using my DLL would be able to access this help file during his development time...
8
by: Mark | last post by:
I have loaded Visual Studio .net on my home computer and my laptop, but my home computer has an abbreviated help screen not 2% of the help on my laptop. All the settings look the same on both...
10
by: JonathanOrlev | last post by:
Hello everybody, I wrote this comment in another message of mine, but decided to post it again as a standalone message. I think that Microsoft's Office 2003 help system is horrible, probably...
1
by: trunxnirvana007 | last post by:
'UPGRADE_WARNING: Array has a new behavior. Click for more: 'ms-help://MS.VSCC.v80/dv_commoner/local/redirect.htm?keyword="9B7D5ADD-D8FE-4819-A36C-6DEDAF088CC7"' 'UPGRADE_WARNING: Couldn't resolve...
0
by: hitencontractor | last post by:
I am working on .NET Version 2003 making an SDI application that calls MS Excel 2003. I added a menu item called "MyApp Help" in the end of the menu bar to show Help-> About. The application...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.