re sub help

s99999999s2003

hi

i have a string:
a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"

inside the string, there are "\n". I don't want to replace the '\n'
between [startdelim] and [enddelim] with ''. I only want to get rid of
the '\n' everywhere else.

i have read the tutorial and came across negative/positive lookahead,
and i think it can solve the problem, but am confused about how to use it.
can anyone give me some advice? or is there a better way than
lookaheads? thanks.

Nov 5 '05 #1



Mike Meyer

s9************@yahoo.com writes:

> hi
>
> i have a string:
> a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>
> inside the string, there are "\n". I don't want to replace the '\n'
> between [startdelim] and [enddelim] with ''. I only want to get rid of
> the '\n' everywhere else.

Well, I'm not an expert on re's - I've only been using them for three
decades - but I'm not sure this can be done with a single re, as the
pattern you're interested in depends on context, and re's don't handle
that well.

On the other hand, this is fairly straightforward with simple string
operations:

>>> a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>>> sd = '[startdelim]'
>>> ed = '[enddelim]'
>>> s, r = a.split(sd, 1)
>>> m, e = r.split(ed, 1)
>>> a = s + sd + m.replace('\n', '') + ed + e
>>> a
'this\nis\na\nsentence[startdelim]thisisanother[enddelim]this\nis\n'
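
Note that the session above strips the newlines between the delimiters, which is the reverse of what the question asks for. A minimal sketch of the other direction, assuming a single well-formed delimiter pair; the function name strip_newlines_outside is made up for illustration, and the code is written for Python 3:

```python
def strip_newlines_outside(text, start="[startdelim]", end="[enddelim]"):
    """Remove newlines everywhere except between the first start/end pair."""
    # partition() returns (before, separator, after); if the separator is
    # missing, the separator and the remainder come back as empty strings.
    before, sd, rest = text.partition(start)
    inside, ed, after = rest.partition(end)
    # Only the text outside the delimiters loses its newlines.
    return before.replace("\n", "") + sd + inside + ed + after.replace("\n", "")


a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
result = strip_newlines_outside(a)
print(repr(result))
```

Like the split version, this only looks at the first pair of delimiters; anything after the first [enddelim] is treated as outside text.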

<mike

--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

Nov 5 '05 #3

s99999999s2003

thanks for the reply.

i am still interested in using re, i find it useful. am still
learning its uses.
so i did something like this for a start, trying to get everything
between [startdelim] and [enddelim]:

a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"

t = re.compile(r"\[startdelim\](.*)\[enddelim\]")

t.findall(a)

but it gives me []. it's the "\n" that prevents the results.
why can't (.*) work in this case? Or am i missing some steps to "read"
in the "\n"..?
thanks.

Nov 5 '05 #4

Fredrik Lundh

<s9************@yahoo.com> wrote:

> i am still interested in using re, i find it useful. am still
> learning its uses.
> so i did something like this for a start, trying to get everything
> between [startdelim] and [enddelim]:
>
> a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>
> t = re.compile(r"\[startdelim\](.*)\[enddelim\]")

"*" is greedy (it matches as much as it can, then backtracks from the
right end), so that won't do the right thing if you have multiple
delimiters.

to fix this, use "*?" instead.

> t.findall(a)
> but it gives me []. it's the "\n" that prevents the results.
> why can't (.*) work in this case? Or am i missing some steps to "read"
> in the "\n"..?

http://docs.python.org/lib/re-syntax.html

    "." (Dot.) In the default mode, this matches any character except
    a newline. If the DOTALL flag has been specified, this matches any
    character including a newline.

to fix this, pass in re.DOTALL or re.S as the flag argument, or
prepend (?s) to the expression.
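
A small sketch of both points, written for Python 3. The doubled string `two` is an assumption added here so that there is more than one delimiter pair to show the greedy/non-greedy difference:

```python
import re

a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
two = a * 2  # two delimiter pairs, to expose greedy matching

pattern = r"\[startdelim\](.*)\[enddelim\]"

# Without DOTALL, "." stops at "\n", so nothing matches.
no_flag = re.findall(pattern, a)

# With DOTALL but a greedy "*", one match runs from the first
# [startdelim] all the way to the last [enddelim].
greedy = re.findall(pattern, two, re.DOTALL)

# Non-greedy "*?" stops at the nearest [enddelim], giving one match per pair.
lazy = re.findall(r"\[startdelim\](.*?)\[enddelim\]", two, re.DOTALL)

# The inline (?s) prefix is equivalent to passing re.DOTALL (or re.S).
inline = re.findall(r"(?s)\[startdelim\](.*?)\[enddelim\]", two)

print(no_flag)
print(len(greedy), len(lazy))
print(lazy == inline)
```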

</F>

Nov 5 '05 #5

Mike Meyer

s9************@yahoo.com writes:

> i am still interested in using re, i find it useful. am still
> learning its uses.
> so i did something like this for a start, trying to get everything
> between [startdelim] and [enddelim]:
>
> a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>
> t = re.compile(r"\[startdelim\](.*)\[enddelim\]")
>
> t.findall(a)
> but it gives me []. it's the "\n" that prevents the results.
> why can't (.*) work in this case? Or am i missing some steps to "read"
> in the "\n"..?
> thanks.

Newlines are magic to regular expressions: by default, . matches
everything except a newline. You use the flags in re to change that.
In this case, you want . to match them, so you use the DOTALL flag:

>>> a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>>> t = re.compile(r"\[startdelim\](.*)\[enddelim\]", re.DOTALL)
>>> t.findall(a)
['this\nis\nanother']
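
Building on the DOTALL pattern, the original task can probably be handled in one substitution after all. This is only a sketch, written for Python 3; the names block_or_newline and keep_blocks are made up here, and it assumes the delimiters are properly paired and not nested. The idea is to match either a whole delimited block or a single newline, then put the block back unchanged and drop the newline:

```python
import re

a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"

# Alternative 1 captures a complete delimited block (non-greedy, DOTALL).
# Alternative 2 matches a lone newline, which can only be outside a block
# because blocks are consumed whole by alternative 1.
block_or_newline = re.compile(r"(\[startdelim\].*?\[enddelim\])|\n", re.DOTALL)


def keep_blocks(match):
    # group(1) is None when the newline alternative matched.
    return match.group(1) or ""


result = block_or_newline.sub(keep_blocks, a)
print(repr(result))
```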

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

Nov 5 '05 #6

Kent Johnson

s9************@yahoo.com wrote:

> hi
>
> i have a string:
> a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>
> inside the string, there are "\n". I don't want to replace the '\n'
> between [startdelim] and [enddelim] with ''. I only want to get rid of
> the '\n' everywhere else.

Here is a solution using re.sub and a class that maintains state. It works when the input text contains multiple startdelim/enddelim pairs.

import re

a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n" * 2

class subber(object):
    def __init__(self):
        # True while we are between startdelim and enddelim
        self.delimiterSeen = False

    def __call__(self, m):
        text = m.group()
        if text == 'startdelim':
            self.delimiterSeen = True
            return text

        if text == 'enddelim':
            self.delimiterSeen = False
            return text

        # text is a newline: keep it inside the delimiters, drop it elsewhere
        if self.delimiterSeen:
            return text

        return ''

delimRe = re.compile('\n|startdelim|enddelim')

newText = delimRe.sub(subber(), a)
print(repr(newText))

Kent

Nov 5 '05 #7

Bengt Richter

On 4 Nov 2005 22:49:03 -0800, s9************@yahoo.com wrote:

> hi
>
> i have a string:
> a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>
> inside the string, there are "\n". I don't want to replace the '\n'
> between [startdelim] and [enddelim] with ''. I only want to get rid of
> the '\n' everywhere else.
>
> i have read the tutorial and came across negative/positive lookahead,
> and i think it can solve the problem, but am confused about how to use it.
> can anyone give me some advice? or is there a better way than
> lookaheads? thanks.

Sometimes splitting and processing the pieces selectively can be a solution, e.g.,
if delimiters are properly paired, splitting (with parens to keep matches) should
give you a repeating pattern modulo 4 of
<"everywhere else" as you said><first delim><between><second delim> ...

>>> a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>>> import re
>>> splitter = re.compile(r'(?s)(\[startdelim\]|\[enddelim\])')
>>> sp = splitter.split(a)
>>> sp
['this\nis\na\nsentence', '[startdelim]', 'this\nis\nanother', '[enddelim]', 'this\nis\n']
>>> ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)])
'thisisasentence[startdelim]this\nis\nanother[enddelim]thisis'
>>> print ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)])
thisisasentence[startdelim]this
is
another[enddelim]thisis

I haven't checked for corner cases, but HTH
Maybe I'll try two pairs of delimiters:

>>> a += "2222\n33\n4\n55555555[startdelim]6666\n77\n8888888[enddelim]9999\n00\n"
>>> sp = splitter.split(a)
>>> print ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)])
thisisasentence[startdelim]this
is
another[enddelim]thisis222233455555555[startdelim]6666
77
8888888[enddelim]999900

which came from

>>> sp
['this\nis\na\nsentence', '[startdelim]', 'this\nis\nanother', '[enddelim]', 'this\nis\n2222\n33\n4\n55555555', '[startdelim]', '6666\n77\n8888888', '[enddelim]', '9999\n00\n']

Which had the replacing when not i%4 was true

>>> for i,s in enumerate(sp): print '%6s: %r'%(not i%4,s)
...
  True: 'this\nis\na\nsentence'
 False: '[startdelim]'
 False: 'this\nis\nanother'
 False: '[enddelim]'
  True: 'this\nis\n2222\n33\n4\n55555555'
 False: '[startdelim]'
 False: '6666\n77\n8888888'
 False: '[enddelim]'
  True: '9999\n00\n'
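
The same split-and-rejoin idea can be written without indexing into a tuple of lambdas. A sketch for Python 3, under the same assumption that the delimiters are properly paired; the names DELIMS and strip_outside are made up for illustration:

```python
import re

# The capturing group makes split() keep the delimiters in the result, so
# the pieces repeat as: outside, start delim, inside, end delim, outside, ...
DELIMS = re.compile(r"(\[startdelim\]|\[enddelim\])")


def strip_outside(text):
    parts = DELIMS.split(text)
    # Every fourth piece (index 0, 4, 8, ...) is "everywhere else" text.
    return "".join(
        part.replace("\n", "") if index % 4 == 0 else part
        for index, part in enumerate(parts)
    )


a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
a += "2222\n33\n4\n55555555[startdelim]6666\n77\n8888888[enddelim]9999\n00\n"
result = strip_outside(a)
print(result)
```

If a delimiter is missing or out of order, the modulo-4 bookkeeping drifts, so unpaired input would need an explicit check first.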

Regards,
Bengt Richter

Nov 6 '05 #8
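
The question asked about lookaheads, and none of the replies ended up using one. For what it's worth, a negative lookahead can do it in a single pattern. This is only a sketch, written for Python 3: it assumes the delimiters are properly paired and not nested, and the names OUTSIDE_NEWLINE and strip_outside_lookahead are made up here. A newline is inside a block exactly when the next delimiter after it is [enddelim], so the pattern removes every newline for which that is not the case:

```python
import re

# \n                          a newline ...
# (?! ... \[enddelim\])       ... NOT followed by an [enddelim] that is
# (?:(?!\[startdelim\]).)*    reachable without crossing a [startdelim]
OUTSIDE_NEWLINE = re.compile(
    r"\n(?!(?:(?!\[startdelim\]).)*\[enddelim\])",
    re.DOTALL,
)


def strip_outside_lookahead(text):
    return OUTSIDE_NEWLINE.sub("", text)


a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
result = strip_outside_lookahead(a)
print(repr(result))
```

The lookahead rescans the rest of the string for each newline, so on long inputs the split-based or callback-based approaches are likely to be faster and easier to read.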
