Regex -- Replace "-*?\n" with ""

Take this string:

"----------------------------------------
"

(i.e., hyphens followed by a newline)

I thought I could match it simply with this Regex:

"-*?\n"

(my interpretation: one or more of "-" followed by a newline)

But when I run it, it seems to match all newlines, regardless of being
preceded by the "-". I know I'm missing something. Any ideas of how to
do this match properly?

Many thanks.

--Brent
Nov 19 '05 #1
4 Replies
Hi Brent,

A newline is \r\n on DOS/Windows systems.

So if you use
"-*?\r\n"
on
"----------------------------------------
"
"----------------------------------------
"

it will match twice.
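
The count can be checked with a few lines of Python. This is only a sketch: the thread does not say which regex engine is in use, so the language and the sample string are assumptions.

```python
import re

# Two dashed lines, each terminated by a Windows-style \r\n (assumed sample data).
text = "----------------------------------------\r\n" * 2

matches = re.findall(r"-*?\r\n", text)
print(len(matches))  # 2
```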

A word of caution may be in order here. A lazy (non-greedy) quantifier
matches the shortest string possible.

So if you let the regex
-*?
loose on five dashes
-----
the result will be 6 *empty* matches (each m marks one): m-m-m-m-m-m

Why empty? Because the star in your regex allows 0 occurrences, so the shortest
string possible is the empty string.

Compare this with the regex
-*?a
on
-----a

Now the match has to end with an a, so the shortest string possible
is now: -----a
This is the whole original input string, and it is the only match.
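
The contrast between the two patterns can be sketched in Python (an assumed language; the thread names none). Only the first match of the bare lazy star is shown, because engines differ in how they step past empty matches.

```python
import re

# The lazy star is satisfied with zero dashes, so the first match is empty.
first = re.match(r"-*?", "-----")
print(repr(first.group()))  # ''

# With a required trailing "a", the lazy star has to expand until "a" can match.
with_a = re.findall(r"-*?a", "-----a")
print(with_a)  # ['-----a']
```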

You say you want a Windows newline, so you're doing fine.
Let me know if you have any more questions.

Cheers,
Tom Pester

Nov 19 '05 #2
Thanks Tom!

I hacked around some more with this stuff, and it turns out that
".*?--+\n" works. Some of the "----" strings were preceded by spaces, so
this seems to be a better, uh, match.
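
A minimal Python sketch of that pattern on an indented rule line. The sample text is made up, and the narrower second pattern is an assumption added for comparison, not something from the thread.

```python
import re

# Assumed sample: a rule line indented with spaces, between two ordinary lines.
text = "Title\n   ------\nbody\n"

# The pattern above: any leading characters (lazily), two or more dashes, newline.
loose = re.findall(r".*?--+\n", text)
print(loose)  # ['   ------\n']

# A narrower variant (hypothetical): only spaces or tabs may precede the dashes.
strict = re.findall(r"[ \t]*-+\n", text)
print(strict)  # ['   ------\n']
```

Note that ".*?" also accepts ordinary text before the dashes, so a line such as "foo--" followed by a newline would match the first pattern but not start its match at the same place in the second.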

And, yes, your reply does help.

Oh...the text is being sucked in over HTTP directly from an Apache
server, so the \n seems to be appropriate.

Tricky things, these regexes!

--Brent

Nov 19 '05 #3
Actually, HTTP 1.1 uses "\r\n" to terminate its header lines, so it is worth
checking whether the Apache server serves the HTML body with "\r\n" as well.
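
If it is unclear which line ending the text actually arrives with, one option is to make the carriage return optional. A sketch in Python, with the language and sample strings as assumptions:

```python
import re

# One or more dashes, then either \n or \r\n.
pattern = re.compile(r"-+\r?\n")

unix_hits = pattern.findall("----\n")
windows_hits = pattern.findall("----\r\n")
print(unix_hits)     # ['----\n']
print(windows_hits)  # ['----\r\n']
```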

-- bruce (sqlwork.com)

Nov 19 '05 #4
Try:

-+\r\n

Not sure what you were trying to accomplish with your *?
(literally: zero or more, as few as possible). That would always match any
newline, assuming you used \n or \r\n as your newline character.
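
The difference is easy to see side by side. This is a Python sketch with made-up sample text, using bare \n line endings as in Brent's data; swap in \r\n if that is what the input contains.

```python
import re

text = "first\n-----\nsecond\n"

# Zero dashes are allowed, so every newline matches on its own.
lazy_star = re.findall(r"-*?\n", text)
print(lazy_star)  # ['\n', '-----\n', '\n']

# Requiring at least one dash leaves the ordinary newlines alone.
one_or_more = re.findall(r"-+\n", text)
print(one_or_more)  # ['-----\n']

# Replacing the dashed line with "", as in the thread title.
cleaned = re.sub(r"-+\n", "", text)
print(repr(cleaned))  # 'first\nsecond\n'
```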
Jason Kester
Expat Software Consulting Services
http://www.expatsoftware.com/

Nov 19 '05 #5
