Extract email addresses from big file.

5 New Member

I have a big text file with data, and I want to extract the email addresses from it.

How can I do it?
May 17 '07
3 New Member
Guys, can this Perl script be used on websites? Can I replace the file with a web address? How can I get the emails included in a website?

Let's say www.domain.com/aa.php=1 has some emails saved inside, and www.domain.com/aa.php=2 also has some emails. How can I make a loop over all the aa.php=variable pages and get the emails from all of them?
Thanks in advance, and sorry for my English.
Feb 9 '08 #11
David Akpan
1 New Member
I have a big file with many email addresses. How do I extract only the email addresses? If possible, please include the software I can use.
Mar 19 '08 #12
1 New Member
How would I use a script like this on a group of files that are in a directory to retrieve email addresses from all of them?
May 21 '08 #13
1,275 Recognized Expert Top Contributor
How would I use a script like this on a group of files that are in a directory to retrieve email addresses from all of them?

Try combining the find command with xargs and the Perl script given here, like this:

find . -name "*.txt" | xargs perl <script given here>
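
Where find and xargs are not available (a native Windows prompt, for example), the directory walk can be done in Perl itself. The following is only a sketch: the sub names are made up for illustration, the pattern is the loose one from this thread with the @ escaped, and it assumes the files of interest end in .txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Loose pattern from this thread; it is not a full address validator.
my $EMAIL_RE = qr/[\w.\-]+\@[\w.\-]+\w+/;

# Return every address-like string found in one file.
sub emails_in_file {
    my ($path) = @_;
    my @found;
    open my $fh, '<', $path or do {
        warn "skipping $path: $!\n";
        return;
    };
    while (my $line = <$fh>) {
        push @found, $line =~ /$EMAIL_RE/g;
    }
    close $fh;
    return @found;
}

# Walk $dir recursively and collect unique addresses from *.txt files.
sub emails_in_dir {
    my ($dir) = @_;
    my %seen;
    find(
        sub {
            # File::Find has already changed into the file's directory,
            # so the bare name in $_ can be opened directly.
            return unless -f $_ && /\.txt\z/i;
            $seen{$_}++ for emails_in_file($_);
        },
        $dir
    );
    return sort keys %seen;
}

# Run only when a directory is given on the command line.
if (@ARGV) {
    print "$_\n" for emails_in_dir($ARGV[0]);
}
```

Saved under a name of your choice (say ExtractDir.pl, a hypothetical name), it could be run as "perl ExtractDir.pl ." to scan the current directory and everything below it.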

May 22 '08 #14
1 New Member
I tried the above example but it didn't work for me.

I got the following error:

C:\Documents and Settings\user\Desktop\abc\trunk\docs>perl -wne'while(/[\w\.\-]+@[\w\.\-]+\w+/g){print "$&\n"}'db_em
ails.txt | sort -u > output.txt
Can't find string terminator "'" anywhere before EOF at -e line 1.
-uThe system cannot find the file specified.
Dec 28 '09 #15
254 Recognized Expert Contributor

You are apparently trying to do this in a MS Windows environment rather than a *nix environment. The error you are seeing comes from the Windows command-line parser. I am going to assume (which may get me into trouble) that you entered the offending command on a single line. If so, then the only thing that leaps out at me is the lack of space between the final ' and the apparent input file "db_em". Did you copy and paste your example directly from your command window? If so, the first thing I'd try would be to make sure that you do have a space there.

If that does not help, then we have to look a little further. The prior discussion here has been under a Linux/Unix assumption, and the *nix shells do parse command lines differently from Windows. The Perl itself should be OK, especially with peripatetic's modification. However, you may have to run it differently. If the Windows command-line parser can't handle this as a one-liner, you can always just put the Perl into a file (e.g., "ExtractEmail.pl") and then you should be able to run it that way as:

C:\...>perl ExtractEmail.pl <db_em

or the like.

Let us know if this meets your needs.

Dec 31 '09 #16
1 New Member
Hey Paul,

I'm in the same boat as RADEP. I can use the script just fine on my Linux machine, but not when I try it on my Windows VM.

I tried copying the code verbatim into a .pl file and running it from the command line per your suggestion, with output similar to what RADEP got.

As for the "'" terminator, I have no clue, but I am going to guess that Windows does not support the 'sort -u' command near the end. What are your thoughts?

And by the way, thanks for everyone's help in this. It's forums like these that help me get through the work day. :)

Dec 31 '09 #17
254 Recognized Expert Contributor
Hi Scott,

I'm afraid I was too lazy the other day.

If you're going to create a file to do the same job, you have to do the read from STDIN explicitly. So the file ExtractEmail.pl could look like this:

while (<STDIN>) {
    while (/[\w\.\-]+@[\w\.\-]+\w+/g) {
        print "$&\n";
    }
}
Then you can invoke it like this:

C:\...>perl ExtractEmail.pl <test.txt >out.txt
Of course, sorting as in a *nix environment is not available in the native Windows environment. There are several ways to get the capability. You could install Cygwin, which is a port of the bash shell with utilities including sort. (Then you should be able to use the original one-liner.) If you search for "windows unix sort" you should find some advice (which I have not tested) on other ports of the sort utility.
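
Another option is to do the de-duplication and sorting inside Perl, so that no external sort is needed. Keeping the code in a file also avoids the one-liner's quoting: cmd.exe does not treat single quotes as quoting characters, which is a likely cause of the "Can't find string terminator" error quoted earlier in the thread. What follows is a sketch, not a tested replacement. The sub name is made up for illustration, the @ in the pattern is escaped, and the script takes the input file as an argument instead of reading STDIN.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect address-like strings from an open filehandle, without duplicates.
sub unique_emails {
    my ($fh) = @_;
    my %seen;
    while (my $line = <$fh>) {
        # Same loose pattern as ExtractEmail.pl; not a full address validator.
        $seen{$_} = 1 for $line =~ /[\w.\-]+\@[\w.\-]+\w+/g;
    }
    return sort keys %seen;    # plain string sort, one entry per address
}

# Run only when a file name is given on the command line.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
    print "$_\n" for unique_emails($fh);
    close $fh;
}
```

Saved as, say, ExtractUnique.pl (a hypothetical name), it could be invoked as "perl ExtractUnique.pl test.txt >out.txt". The order comes from Perl's default string comparison, so it may differ slightly from what a locale-aware sort -u produces.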

Jan 4 '10 #18
1 New Member
I keep getting this error:

syntax error at email.pl line 1, near "){"
Can't find string terminator "'" anywhere before EOF at email.pl line 1
and I'm using this code:

perl -wne'while(/[\w\.\-]+@[\w\.\-]+\w+/g){print "$&\n"}' emails.txt | sort -u > output.txt
Jun 1 '10 #19
2 New Member
How about this...

What if for some reason the file lost some spaces or got extra letters, and there are cases like these:

onani12@yahoo.coms@ <--- or what about this: lacama@yaho.co

In those cases I want to create onani12@yahoo.com and also keep the one that was caught, onani12@yahoo.coms (notice it doesn't have the @ at the end). I also want to fix the second one to lacama@yaho.com (which is certainly a public email provider), but keep the one we found just in case, and add lacama@yahoo.com.

Many years ago I had some code that did just that, and I'm going to try to find it. But if you have a regular expression or short code that can fix this, that would be wonderful!
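
One way to approach this is a small lookup table of known bad domains, applied after stripping any stray trailing @. The sketch below is only an illustration: the table entries and the sub name are hypothetical, and a regular expression alone cannot know which misspellings are worth repairing, so the table has to be filled in from your own data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical table: damaged domain => domains to suggest instead.
my %DOMAIN_GUESSES = (
    'yahoo.coms' => ['yahoo.com'],
    'yaho.co'    => ['yaho.com', 'yahoo.com'],
);

# Return the address as found (minus any trailing "@") plus guessed repairs.
sub address_variants {
    my ($raw) = @_;
    (my $addr = $raw) =~ s/\@+\z//;    # drop stray trailing "@"
    my ($user, $domain) = $addr =~ /\A([\w.\-]+)\@([\w.\-]+)\z/
        or return;                     # not address-like at all
    my @guesses = @{ $DOMAIN_GUESSES{lc $domain} || [] };
    return ($addr, map { "$user\@$_" } @guesses);
}

# Example use with the two cases from the post.
print "$_\n" for address_variants('onani12@yahoo.coms@');
print "$_\n" for address_variants('lacama@yaho.co');
```

The found address always comes first in the returned list, so the caller can decide whether to keep it, replace it, or store the guesses alongside it.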

Sep 9 '10 #20
