Regular Expression Function to remove email address in string

I have a form with a text description field, but some people keep putting email addresses, telephone numbers, HTML and URLs in the descriptions.

I can remove the HTML tags using a regular expression, but I can't figure out how to remove email addresses, telephone numbers and URLs from a text string, e.g.

'Lorem ipsum dolor sit amet, some.email@mydomain.com consectetur adipisicing elit, anotheremail@somedomain.com sed do eiusmod tempor incididunt ut labore 01234 567 891 et dolore +341234 567891 magna aliqua. 012 345 6789 Ut enim http://www.somedomain.com ad minim www.somedoma.com veniam, quis somedomain.com nostrud exercitation http://somedomain.com ullamco laboris nisi ut aliquip ex ea commodo consequat.'

This is driving me round the bend! Can anyone save me??

Many thanks!

Dan
Mar 17 '10 #1
jhardman
3,405 Expert 2GB
If someone posts URLs to me, I just delete the whole post and ban the user or add the IP address to a blocked list. I don't see any reasonable way to block a phone number, and I think you would have to block '@' to block emails.
Mar 25 '10 #2
Hi jhardman

Thanks for your reply. In essence I agree with you; unfortunately, the peeps entering the info are paying customers. No matter how much I tell the sales dept. to tell their clients not to do this, they will always try to push it, which annoys all the other customers who do not.

With a DB of over 1 million description records, I could really use a function to do this rather than wasting my life checking every record in the DB manually.

Can you still help?
Mar 26 '10 #3
jhardman
3,405 Expert 2GB
OK, how about this: neither URLs nor email addresses can contain white space. For emails, how about you delete from the @ symbol to the first white space on either side? For URLs, most should start with "http://" or "https://", and most of the rest probably start with "www." (this won't get rid of all of them, but it will catch the vast majority). I say delete from that prefix to the next white space. For phone numbers, you would need to search for a string of at least 7 characters that contains nothing but numbers and punctuation (spaces, periods, and hyphens are all commonly used to separate the parts of a phone number, and possibly other marks as well). Does that sound reasonable? If that will work for you, and if you need it, I can help you write the regexes.
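The approach described above can be sketched in Python roughly as follows. This is only one reading of the prose, not code from the thread: the names `EMAIL`, `URL`, `PHONE` and `scrub` are made up for illustration, and the phone pattern additionally assumes a number starts and ends with a digit.

```python
import re

# Assumption: these patterns are one interpretation of the description above.
# Email: everything from the '@' out to the nearest whitespace on either side.
EMAIL = re.compile(r"\S*@\S*")
# URL: a known prefix, then everything up to the next whitespace.
URL = re.compile(r"(?:https?://|www\.)\S*", re.IGNORECASE)
# Phone: at least 7 characters of digits, spaces, periods and hyphens,
# starting and ending with a digit (optional leading '+').
PHONE = re.compile(r"\+?\d[\d .\-]{5,}\d")


def scrub(text: str) -> str:
    """Remove emails, prefixed URLs and phone-like runs, then tidy spacing."""
    # Emails go first so that a later pattern cannot eat part of an address.
    for pattern in (EMAIL, URL, PHONE):
        text = pattern.sub("", text)
    # Collapse the gaps left behind by the removals.
    return re.sub(r"\s{2,}", " ", text).strip()


cleaned = scrub("call 01234 567 891 or mail a.b@c.com see http://x.com/y now")
print(cleaned)
```

Because the URL rule relies on a prefix, a bare domain such as "mydomain.com" passes through untouched in this sketch.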

Jared
Mar 26 '10 #4
Hi jhardman

Thanks, I see your logic, but I am struggling with the syntax. Can you help me write the regexes?

Kind regards

Dan
Mar 26 '10 #5
jhardman
3,405 Expert 2GB
This should recognize phone numbers:
  [\d\s\W]{7,}
(a string of at least seven characters containing only numerals, white space and non-word characters).
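A quick way to see what the phone pattern does is to run it over part of Dan's sample text. The sketch below is in Python purely for illustration. `TIGHTER` is a hypothetical variant that is not from this thread; it assumes a phone number starts and ends with a digit, which stops it from swallowing the surrounding spaces or matching plain runs of punctuation.

```python
import re

# The pattern exactly as posted above.
ORIGINAL = re.compile(r"[\d\s\W]{7,}")
# Hypothetical tighter variant (an assumption, not from the thread):
# optional '+', a digit, 5+ digits/whitespace/periods/hyphens, a final digit.
TIGHTER = re.compile(r"\+?\d[\d\s.\-]{5,}\d")

sample = "labore 01234 567 891 et dolore +341234 567891 magna"

# The original pattern also captures the spaces around each number.
original_hits = ORIGINAL.findall(sample)
tighter_hits = TIGHTER.findall(sample)
print(original_hits)
print(tighter_hits)

# A run of dots has no digits at all, yet the original pattern matches it.
dots = "Wait....... what"
print(ORIGINAL.search(dots) is not None, TIGHTER.search(dots) is not None)
```

If the original pattern is used for deletion, replacing each match with a single space rather than an empty string keeps the neighbouring words from running together.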

This should recognize email addresses:
  [.\w]{3,}@[.\w]{5,}
(at least 3 characters that can include periods or word characters, followed by an @ sign, followed by at least 5 characters that include word characters and periods)
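Here is a small Python check of the email pattern, included only as an illustration. It also shows one gap: hyphens are not in the character class, so an address with a hyphenated domain is missed. `WIDER` is a hypothetical alternative that is not from this thread and is not a full address validator.

```python
import re

# The pattern exactly as posted above.
EMAIL = re.compile(r"[.\w]{3,}@[.\w]{5,}")
# Hypothetical wider variant (an assumption): also allows '+' and '-' in the
# local part and '-' in the domain, and requires at least one dot in the domain.
WIDER = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

plain = "amet, some.email@mydomain.com consectetur"
hyphenated = "write to first-last@my-domain.com please"

# The posted pattern removes the plain address...
plain_result = EMAIL.sub("", plain)
# ...but leaves the hyphenated one alone, because "my" is shorter than 5 characters.
hyphen_result = EMAIL.sub("", hyphenated)
wider_result = WIDER.sub("", hyphenated)

print(repr(plain_result))
print(repr(hyphen_result))
print(repr(wider_result))
```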

Between them, these 2 should recognize MOST URLs:
  https?://[.\w]{3,}
  www\.[.\w]{3,}
(http, an optional s, then :// followed by at least 3 word and period characters, and www. followed by at least 3 word and period characters.) This doesn't catch domains without a prefix (like "mydomain.com"). If you wanted, you could try something like [.\w]{3,}\.com, [.\w]{3,}\.net, etc.
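To illustrate the difference the bare-domain idea makes, here is a Python sketch that combines the URL patterns into one alternation. The dots before the TLDs are escaped so they match a literal period. The TLD list (com, net, org) and the trailing `\b` are assumptions for the example, not a complete rule.

```python
import re

# The two prefixed patterns joined with '|'.
PREFIXED = re.compile(r"https?://[.\w]{3,}|www\.[.\w]{3,}")
# Same, plus the suggested bare-domain pattern for a few assumed TLDs.
WITH_BARE = re.compile(
    r"https?://[.\w]{3,}|www\.[.\w]{3,}|[.\w]{3,}\.(?:com|net|org)\b"
)

sample = (
    "minim www.somedoma.com veniam, quis somedomain.com nostrud "
    "exercitation http://somedomain.com ullamco"
)

prefixed_hits = PREFIXED.findall(sample)
bare_hits = WITH_BARE.findall(sample)
print(prefixed_hits)
print(bare_hits)
```

Note that the bare-domain alternative would also match the domain half of an email address, so it is safer to strip emails before applying it. The character class also excludes "/" and "-", so a path after the host name would be left behind.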

I wrote a quick script to check, feel free to use it: regex test

Let me know if this helps. Did you need help with the code as well?

Jared
Mar 27 '10 #6
Hi Jared

Thanks for this! I'm gonna give it a whirl ASAP...

I'm ok with the code, thanks for the offer!

I'll post back after testing

Thanks again

Dan
Apr 1 '10 #7
Hi Jared

Can you keep your test script live until I post back so I have something to verify against..?

Thanks again!

Dan
Apr 1 '10 #8
Hi Jared

Great news! It worked!

Take a look at this link to see my results Test Script

Thank you so much!

Kind regards

Dan
Apr 1 '10 #9
jhardman
3,405 Expert 2GB
Dan,

Glad to hear it worked. The hard part for me is always defining what characters to delete. Anyway, thanks for posting back.

Jared
Apr 1 '10 #10
