isolate email addresses

Hi,

I have a very large text file export of email messages from an email
client. I've imported it into Access.

The data is not split into fields in any way; each record is one message,
which contains, among everything else, an email address somewhere in the body.
IOW, every record looks more or less like this (nonsense text substituted
in):

posuere dignissim, iaculis eget, turpis. Suspendisse consectetuer neque id
odio. Ut ullamcorper. Quisque hendrerit my*****@mydomain.com neque eu dui.
In lacinia purus in felis. Sed nonummy, nisi pharetra facilisis blandit,
lacus urna bibendum diam,

If you look closely, you'll see the email address in the middle:

my*****@mydomain.com

Every record is more or less like this.

What I need to do is delete all other text from the records except what
qualifies as an email address.

So, I guess logically - I need to first find the @ character, then move
left and right until I encounter the first space in both directions.
Everything in between those spaces would be the email address. Everything
else needs to be tossed.

Help?
Apr 30 '06 #1
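
The approach described in the question (find the @, then walk left and right to the nearest space) can be sketched directly in VBA with InStr and InStrRev. This is only an illustration: the function name TokenAroundAt is made up, it looks at the first @ only, and it assumes the address really is delimited by whitespace, so trailing punctuation such as a comma would be kept as part of the result.

```vba
' Hypothetical helper: returns the whitespace-delimited token around the
' first "@" in s, or "" when s contains no "@".
Public Function TokenAroundAt(ByVal s As String) As String
    Dim atPos As Long
    Dim startPos As Long
    Dim endPos As Long

    ' Treat line breaks and tabs as spaces so they also delimit the token.
    s = Replace(Replace(Replace(s, vbCr, " "), vbLf, " "), vbTab, " ")

    atPos = InStr(1, s, "@")
    If atPos = 0 Then Exit Function

    ' Last space before the "@". InStrRev returns 0 if there is none,
    ' in which case the token starts at position 1.
    startPos = InStrRev(s, " ", atPos) + 1

    ' First space after the "@". If there is none, the token runs to the end.
    endPos = InStr(atPos, s, " ")
    If endPos = 0 Then endPos = Len(s) + 1

    TokenAroundAt = Mid$(s, startPos, endPos - startPos)
End Function
```

A record with several addresses would need a loop that restarts the search after each token, which is one reason a regular expression may be the easier route.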
an**@anon.ocm (Jim Bradstreet) wrote in
news:97*******************@216.196.97.136:
[snip]


Public Function ExtractedEmailAddresses(ByVal s As String) As String
    ' Requires that VBScript be installed (the default for Windows).
    Dim r As Object
    Dim m As Variant
    Dim ms As Variant
    ' Late bound, so no project reference to the RegExp library is needed.
    Set r = CreateObject("VBScript.RegExp")
    With r
        .Global = True        ' find every match, not just the first
        .IgnoreCase = True
        .Pattern = "[\w-\.]{1,}\@([\da-zA-Z-]{1,}\.){1,}[\da-zA-Z-]{2,3}"
    End With
    Set ms = r.Execute(s)
    ' Build a ";"-separated list of all matches.
    For Each m In ms
        ExtractedEmailAddresses = ExtractedEmailAddresses & ";" & m.Value
    Next m
    ' Strip the leading ";" left by the loop above.
    ExtractedEmailAddresses = Replace(ExtractedEmailAddresses, ";", "", , 1)
End Function

Private Sub TestExtractedEmailAddresses()
    ' A string literal cannot span lines in VBA, so the pieces are joined with & _
    Debug.Print ExtractedEmailAddresses( _
        "posuere dignissim, iaculis eget, turpis. Suspendisse consectetuer " & _
        "neque id odio. Ut ullamcorper. Quisque hendrerit my*****@mydomain.com " & _
        "neque eu dui. In lacinia purus in felis. Sed nonummy, nisi pharetra " & _
        "facilisis blandit,lacus urna bibendum diam")
    ' prints my*****@mydomain.com
End Sub
--
Lyle Fairfield
Apr 30 '06 #2
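
To apply the function from post #2 to every record, a DAO loop along these lines might work. It is a sketch under several assumptions: tblMessages and MessageText are made-up table and field names, ExtractedEmailAddresses is already in a standard module of the same database, and the project has a DAO reference set. It overwrites the field, so it is worth trying on a copy of the table first.

```vba
' Sketch only: tblMessages and MessageText are hypothetical names.
' Relies on ExtractedEmailAddresses (post #2) being in a standard module.
Public Sub KeepOnlyEmailAddresses()
    Dim db As DAO.Database
    Dim rs As DAO.Recordset
    Dim result As String

    Set db = CurrentDb
    Set rs = db.OpenRecordset("SELECT MessageText FROM tblMessages", dbOpenDynaset)

    Do Until rs.EOF
        ' Nz guards against Null fields, which a String parameter cannot accept.
        result = ExtractedEmailAddresses(Nz(rs!MessageText, ""))
        rs.Edit
        If Len(result) = 0 Then
            ' No address found: store Null rather than a zero-length string,
            ' which some fields are set up to reject.
            rs!MessageText = Null
        Else
            rs!MessageText = result
        End If
        rs.Update
        rs.MoveNext
    Loop

    rs.Close
    Set rs = Nothing
    Set db = Nothing
End Sub
```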
On Sun, 30 Apr 2006 23:11:41 GMT, Lyle Fairfield
<ly***********@aim.com> wrote:

We'll leave it up to the interested programmer to support top level
domain names like .info in this otherwise elegant solution.

-Tom.


May 1 '06 #3
Tom van Stiphout <no*************@cox.net> wrote in
news:u9********************************@4ax.com:

Yeppers. It's kinda old and I didn't have info on info when I wrote the
guts of it.

Maybe I'll attend to it some day.

--
Lyle Fairfield
May 1 '06 #4
Lyle Fairfield <ly***********@aim.com> wrote in
news:Xn*********************************@216.221.81.119:

Quick fix is to change the pattern to:

[\w-\.]{1,}\@([\da-zA-Z-]{1,}\.){1,}[\da-zA-Z-]{2,4}

This will find

so*****@domain.info

but the problem is that if we have

so*****@domain.comdon't write this guy

the address will be

so*****@domain.comd

That is, we will have to delimit the address with white space to be 100%
accurate. Not sure what to do about that; maybe just expecting the
file to be that way is enough.

--
Lyle Fairfield
May 1 '06 #5
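
One possible way to handle the run-on case from post #5 is a negative lookahead after the last label, which VBScript.RegExp supports. The sketch below keeps the {2,4} pattern from the thread and appends (?![\da-zA-Z-]), so a match is rejected when the final label is immediately followed by another letter, digit or hyphen. The function name is made up for illustration, and this is a trade-off rather than a complete fix: an address run into the following word is skipped instead of being cut short, and so is any address whose top-level domain is longer than four characters.

```vba
' Hypothetical variant of ExtractedEmailAddresses: same pattern as the
' {2,4} version in the thread, plus a negative lookahead at the end.
Public Function ExtractDelimitedEmailAddresses(ByVal s As String) As String
    Dim r As Object
    Dim m As Object
    Dim result As String

    Set r = CreateObject("VBScript.RegExp")
    With r
        .Global = True
        .IgnoreCase = True
        ' (?![\da-zA-Z-]) : the last label must not be followed by another
        ' label character, so "domain.comdon't" no longer yields "domain.comd".
        .Pattern = "[\w-\.]{1,}\@([\da-zA-Z-]{1,}\.){1,}[\da-zA-Z-]{2,4}(?![\da-zA-Z-])"
    End With

    ' Join all matches with ";" without leaving a leading separator.
    For Each m In r.Execute(s)
        If Len(result) > 0 Then result = result & ";"
        result = result & m.Value
    Next m

    ExtractDelimitedEmailAddresses = result
End Function
```

Because the lookahead only excludes label characters, an address followed by a comma, a period or the end of the text should still be picked up, which a strict "must be followed by white space" rule would miss.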
