Pattern matching help! grep emails from file!

Hello, I have a file with email addresses and a lot of junk data. I want
to get the email addresses out of that file so that each email address
is stored on a new line. I am trying to do a wo**@word.word
substitution:
$filestring=<FILE>;
$filestring = s/(\w+\@\w+\.\w+)/\n$1\n/;

The file is like:
"testin" kj*****@ksjdf.com, <testing>js****@ksjdf.com
kj***@kjd.com
"ks***@kdjk.com"

Expected output:
kj*****@ksjdf.com
js****@ksjdf.com
kj***@kjd.com
ks***@kdjk.com

Thanks guyz.
Jul 19 '05 #1
On Fri, 22 Aug 2003 10:46:22 -0400, danpres2k wrote:
> Hello, I have a file with email addresses and a lot of junk data. I want to
> get the email addresses out of that file so that each email address is
> stored on a new line. I am trying to do a wo**@word.word substitution:
> $filestring=<FILE>;
> $filestring = s/(\w+\@\w+\.\w+)/\n$1\n/;
>
> The file is like:
> "testin" kj*****@ksjdf.com, <testing>js****@ksjdf.com
> kj***@kjd.com
> "ks***@kdjk.com"
>
> Expected output:
> kj*****@ksjdf.com
> js****@ksjdf.com
> kj***@kjd.com
> ks***@kdjk.com
>
> Thanks guyz.

Two things:

1. Do you really want 2 newlines for each output?

2. Since the first regex is matching the e-mail address and
ONLY the e-mail address, you're actually telling the s/// to
search the entire string for an e-mail address and substitute
the e-mail address with itself, not, as you intend, to substitute
the entire string with itself. You want to add a .* after the closing
parenthesis, maybe.

Instead of: $filestring = s/(\w+\@\w+\.\w+)/\n$1\n/;
Try: $filestring = s/(\w+\@\w+\.\w+).*/\n$1\n/;
Or Possibly: $filestring = s/.*(\w+\@\w+\.\w+).*/\n$1\n/;


Untested, but I had a similar problem recently, and the
principle is the same.

Shawn
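
For what it's worth, here is a minimal sketch of how the `.*` variant behaves on one line, compared with a global match. It assumes the binding operator `=~` is used. The addresses are made up (the ones in the thread are masked), and `$line`, `$greedy`, and `@found` are names chosen only for illustration.

```perl
use strict;
use warnings;

# Made-up sample line; the addresses in the thread are masked.
my $line = qq{"testin" alice\@example.com, <testing>bob\@example.com\n};

# Variant with .* on both sides: the leading .* is greedy, so it swallows
# as much as it can while still leaving one word character before the "@".
(my $greedy = $line) =~ s/.*(\w+\@\w+\.\w+).*/$1/;

# Global match in list context: returns every captured address.
my @found = $line =~ /(\w+\@\w+\.\w+)/g;

print "greedy: $greedy";
print "found:  $_\n" for @found;
```

With this input the `.*` variant keeps only `b@example.com`, a truncated piece of the last address on the line, while the `/g` match in list context returns both addresses intact. The global match is probably closer to the expected output in the original question.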
Jul 19 '05 #2
Shawn,

Thanks for your help. But that didn't work either. I am getting a
null value for $filestring when I print it:

$filestring = <FILE>;
$filestring = s/.*(\w+\@\w+\.\w+).*/$1/;
print $filestring;

Got any suggestions?
Thanks.

Jul 19 '05 #3
Thanks again Shawn. It did work, but it only printed part of the last
email in the first line. How do I deal with the newline characters in
$filestring? I am storing the string from the file handle in
$filestring. Is this correct?

thanks.
d
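
On the newline question, a hedged sketch: in scalar context `<FILE>` returns only one line, so `$filestring` never holds the whole file. One option is to loop over the lines and collect every match on each. The in-memory filehandle, the sample addresses, and the names `$data`, `$fh`, and `@emails` below are assumptions made so that the example runs on its own; a real script would open the actual file instead.

```perl
use strict;
use warnings;

# Stand-in for the real file: an in-memory filehandle over made-up data.
my $data = <<'END';
"testin" alice@example.com, <testing>bob@example.com
carol@example.org
"dave@example.net"
END
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @emails;
# In scalar context <$fh> returns one line at a time, so loop until EOF.
while (my $line = <$fh>) {
    # /g in list context collects every address on the current line.
    push @emails, $line =~ /(\w+\@\w+\.\w+)/g;
}
close $fh;

# One address per output line.
print "$_\n" for @emails;
```

The pattern is the one from the thread and is deliberately simple: `\w+` does not match dots or hyphens, so addresses with those characters, or with domains of more than two labels, would be cut short.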

Shawn Milochik <Sh***@Linurati.net> wrote in message news:<pa*********************************@Linurati .net>...
On Fri, 22 Aug 2003 16:57:00 -0400, danpres2k wrote:
> Shawn,
>
> Thanks for your help. But that didn't work either. I am getting a null
> value for $filestring when I print it:
>
> $filestring = <FILE>;
> $filestring = s/.*(\w+\@\w+\.\w+).*/$1/;
> print $filestring;
>
> Got any suggestions?
> Thanks.

Yeah, just a typo. Replace
=
with:
=~

I didn't catch that in the OP.

Shawn
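
A small sketch of why the `=` version prints nothing, using a made-up address and illustrative variable names (`$text`, `$assigned`, `$bound`). With `=`, the `s///` operates on `$_` rather than on the variable, and the variable receives the return value of `s///`, which is an empty string when nothing was substituted.

```perl
use strict;
use warnings;

my $text = 'contact: alice@example.com (made-up address)';

# With "=", s/// runs against $_ (not $text), and its return value,
# the number of substitutions made, is what gets assigned.
my $assigned;
{
    local $_ = 'nothing that looks like an address';
    $assigned = s/(\w+\@\w+\.\w+)/<$1>/;
}

# With "=~", s/// is bound to $bound and edits it in place.
(my $bound = $text) =~ s/(\w+\@\w+\.\w+)/<$1>/;

print "assigned: '$assigned'\n";
print "bound:    $bound\n";
```

Here `$assigned` ends up as the empty string, which matches the "null value" reported earlier in the thread, while `$bound` contains the text with the address wrapped in angle brackets.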

Jul 19 '05 #4
