Removing dots - please help me out

Could you please help me out with regular expressions? I'm trying to
write a Perl script that processes some text, and I'm stuck at the
following:

I need to remove from the text:
1. dots followed by a space and a word starting with a lowercase letter
2. dots followed directly by a word starting with a lowercase letter

i.e.

"pure text here bla bla bla. more text follows" --> changes to
"pure text here bla bla bla more text follows"

and

"pure text here bla bla bla.more text follows" --> changes to
"pure text here bla bla bla more text follows"

I need to remove just the dots, not the letters.

No matter how hard I tried, I could not make it work. I tried various
things, e.g. $line =~ s/\. (?=[a-z])/ /g; I'd really appreciate your
help; it's a must-do and I don't have anyone else to help out.

Thanks in advance.
Jul 19 '05 #1
Aristotle wrote:
need to remove from the text
1. dots followed by a space and a word starting with a lowercase letter
2. dots followed directly by a word starting with a lowercase letter

ie

"pure text here bla bla bla. more text follows" --> changes to
"pure text here bla bla bla more text follows"

and

"pure text here bla bla bla.more text follows" --> changes to
"pure text here bla bla bla more text follows"

Need to remove just the dots, not letters.

No matter how hard i tried, i could not make it work. Tried various
things eg $line =~ s/\. (?=[a-z])/ /g;


You seem to be close, but since the dot may or may not be followed by
a space, you'd better say so:

    s/\. ?(?=[a-z])/ /g;
---------^
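
To check the suggestion against the two sample sentences from the question, a minimal sketch follows. The helper name `remove_dots` and the third sample line are assumptions added for illustration; only the substitution itself comes from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove a dot, plus one optional
# following space, when the next character is a lowercase ASCII letter.
sub remove_dots {
    my ($line) = @_;
    # \.         a literal dot
    #  ?         an optional single space after it
    # (?=[a-z])  lookahead: the next character must be lowercase, but it is
    #            not consumed, so the letter itself is kept
    $line =~ s/\. ?(?=[a-z])/ /g;
    return $line;
}

my @samples = (
    "pure text here bla bla bla. more text follows",
    "pure text here bla bla bla.more text follows",
    "End of sentence. Next sentence stays intact.",   # made-up control case
);

print remove_dots($_), "\n" for @samples;
```

The first two lines should both come out as "pure text here bla bla bla more text follows", while the control line is left unchanged because its dots are followed by an uppercase letter or by nothing.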

HTH

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Jul 19 '05 #2
Thanks, it does seem to work correctly.
However, there are still some dots that are not being removed,
for example when there is a 'return' after the dot:

"Asthma in children of sycotic parents ( Nat .
s ) Spasms of the glottis with clucking in the larynx ; air "

Of course, I'm first using
$line =~ s/\n//g;
in order to remove the return characters and then
$line =~ s/\. ?(?=[a-z])/ /g;
but these dots still escape.

Any ideas why it doesn't work?
Gunnar Hjalmarsson <no*****@gunnar.cc> wrote in message news:<HVH5c.52520$mU6.
You seem to be close, but since the dot may or may not be followed by
a space, you'd better say so:

    s/\. ?(?=[a-z])/ /g;
---------^

Jul 19 '05 #3
Please ignore my last message about some dots after returns escaping the regex.
It was my mistake; I needed to apply the regex a second time at a different point.
Thanks
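
The poster does not say exactly what the mistake was. One plausible cause (an assumption) is that the text was processed line by line, so a dot at the end of one line never sees the lowercase letter at the start of the next. A minimal sketch of applying both substitutions to the whole text at once, using a hypothetical helper `remove_dots_multiline`:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): operate on the whole text as
# one string so that a dot before a line break can be matched against the
# first letter of the following line.
sub remove_dots_multiline {
    my ($text) = @_;
    $text =~ s/\r?\n//g;             # remove the line breaks first
    $text =~ s/\. ?(?=[a-z])/ /g;    # then apply the substitution from the thread
    return $text;
}

# The example from the thread, with the line break after the dot.
my $text = "Asthma in children of sycotic parents ( Nat .\n"
         . "s ) Spasms of the glottis with clucking in the larynx ; air";

print remove_dots_multiline($text), "\n";
```

When reading from a file, the same effect could be had by slurping the input into a single string before calling the helper, rather than running the substitutions on each line separately.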
Jul 19 '05 #4
