Bytes IT Community

Pattern matching help! grep emails from file!

Hello, I have a file with email addresses and a lot of junk data. I want
to get the email addresses out of that file so that each email address
is stored on a new line. I am trying to do a wo**@word.word
substitution:
$filestring=<FILE>;
$filestring = s/(\w+\@\w+\.\w+)/\n$1\n/;

The file is like:
"testin" kj*****@ksjdf.com, <testing>js****@ksjdf.com
kj***@kjd.com
"ks***@kdjk.com"

Expected output:
kj*****@ksjdf.com
js****@ksjdf.com
kj***@kjd.com
ks***@kdjk.com

Thanks guyz.
Jul 19 '05 #1
3 Replies


On Fri, 22 Aug 2003 10:46:22 -0400, danpres2k wrote:
> Hello, I have a file with email addresses and a lot of junk data. I want to
> get the email addresses out of that file so that each email address is
> stored on a new line. I am trying to do a wo**@word.word substitution:
> $filestring=<FILE>;
> $filestring = s/(\w+\@\w+\.\w+)/\n$1\n/;
>
> The file is like:
> "testin" kj*****@ksjdf.com, <testing>js****@ksjdf.com
> kj***@kjd.com
> "ks***@kdjk.com"
>
> Expected output:
> kj*****@ksjdf.com
> js****@ksjdf.com
> kj***@kjd.com
> ks***@kdjk.com
>
> Thanks guyz.

Two things:

1. Do you really want 2 newlines for each output?

2. Since the regex matches the e-mail address and ONLY the
e-mail address, you're actually telling the s/// to search the
string for an e-mail address and replace that address with itself
(plus the newlines), not, as you intend, to replace the entire
string with the address. You may want to add a .* after the closing
parenthesis.

Instead of: $filestring = s/(\w+\@\w+\.\w+)/\n$1\n/;
Try: $filestring = s/(\w+\@\w+\.\w+).*/\n$1\n/;
Or Possibly: $filestring = s/.*(\w+\@\w+\.\w+).*/\n$1\n/;


Untested, but I had a similar problem recently, and the
principle is the same.

Shawn
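
As a rough illustration of how the two .* variants above behave once the substitution is bound with =~, here is a minimal Perl sketch. The sample line and its addresses are made up for the example, and the final statement uses a global match in list context, which is a different technique from the one suggested in this post:

```perl
use strict;
use warnings;

# Hypothetical sample line, modelled on the data in the question.
my $line = '"testin" alice@example.com, <testing>bob@example.org';

# Variant 1: a trailing .* swallows everything after the first address,
# but the text in front of the address is left in place.
(my $v1 = $line) =~ s/(\w+\@\w+\.\w+).*/\n$1\n/;

# Variant 2: the leading .* is greedy, so it consumes as much as it can
# and leaves \w+ with only one character of the last address.
(my $v2 = $line) =~ s/.*(\w+\@\w+\.\w+).*/\n$1\n/;

# A global match in list context returns every address on the line.
my @found = $line =~ /(\w+\@\w+\.\w+)/g;

print "variant 1: $v1";
print "variant 2: $v2";
print "$_\n" for @found;
```

Under these assumptions, neither .* variant yields more than one address per line, whereas the /g match collects each address without having to describe the surrounding junk.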
Jul 19 '05 #2

Shawn,

Thanks for your help. But I couldn't get that to work either. I am getting
an empty value for $filestring when I print it:

$filestring = <FILE>;
$filestring = s/.*(\w+\@\w+\.\w+).*/$1/;
print $filestring;

Got any suggestions?
Thanks.


Jul 19 '05 #3

Thanks again, Shawn. It did work, but it only printed part of the last
email on the first line. How do I handle the newline characters in
$filestring? I am storing the string from the file handle in
$filestring. Is this correct?

thanks.
d
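
On the question of reading from the file handle: in scalar context, <FILE> returns only one line per call, so a single assignment sees just the first line of the file. A minimal sketch of one way to handle this is to loop over the lines and collect every match. The in-memory file handle, the sample data, and the variable names are assumptions made so the example runs on its own:

```perl
use strict;
use warnings;

# Stand-in for the real file: an in-memory file handle over made-up data.
my $data = <<'END';
"testin" alice@example.com, <testing>bob@example.org
carol@example.net
"dave@example.com"
END

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @emails;
# In scalar context <$fh> returns one line per call, so loop until EOF.
while (my $line = <$fh>) {
    # /g in list context returns every address found on the current line.
    push @emails, $line =~ /(\w+\@\w+\.\w+)/g;
}
close $fh;

# One address per output line.
print "$_\n" for @emails;
```

An alternative would be to read the whole file into one string by localizing $/ to undef before reading, then run the same /g match once over that string.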

Shawn Milochik <Sh***@Linurati.net> wrote in message news:<pa*********************************@Linurati .net>...
On Fri, 22 Aug 2003 16:57:00 -0400, danpres2k wrote:
> Shawn,
>
> Thanks for your help. But I couldn't get that to work either. I am getting
> an empty value for $filestring when I print it:
>
> $filestring = <FILE>;
> $filestring = s/.*(\w+\@\w+\.\w+).*/$1/;
> print $filestring;
>
> Got any suggestions?
> Thanks.


Yeah, just a typo. In the s/// line, replace = with =~ so that the
substitution is applied to $filestring.

I didn't catch that in the OP.

Shawn
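
To make the difference between = and =~ concrete, here is a small hedged sketch. With =, the s/// operates on $_ and its return value is assigned to the variable; with =~, the substitution edits the variable itself. The input string and the $assigned variable are invented for the example:

```perl
use strict;
use warnings;

my $filestring = 'contact: alice@example.com';   # made-up input

# With "=", s/// runs against $_ (not $filestring) and the result of the
# operation is what gets assigned: a false, empty value when nothing matched.
my $assigned;
{
    local $_ = 'nothing to match here';
    $assigned = s/(\w+\@\w+\.\w+)/\n$1\n/;
}

# With "=~", the substitution is bound to $filestring and edits it in place.
$filestring =~ s/(\w+\@\w+\.\w+)/\n$1\n/;

print "assigned: '$assigned'\n";
print "bound: $filestring";
```

This would be consistent with the empty value reported earlier in the thread: the assignment overwrote $filestring with the result of a substitution that never touched it.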

Jul 19 '05 #4
