Phone number regular expression...

Hello everyone!

First, I apologize if this posting isn't proper "netiquette" for this
group.

I've been working with perl for almost 2 years now. However, my regular
expression knowledge is pretty limited. I wrote the following expression to
take (hopefully) any _reasonable_ phone number input, and format it as
(999) 999-9999 x 9999.

Here's what I've come up with. I would like your comments, if you've got the
time. I'm really interested in regular expressions, and I want to know if
what I'm doing is inefficient, slow, etc...

# area code
\({0,1}\s*(\d{3}){0,1}\s*\){0,1}
# optional parentheses, 3 digits, optional parentheses
(?=[-| ]*(\d{3}){1}[-| ]*(\d{4}){1}) # match only if the first match is followed by
                                     # what looks like a phone number
# this is the same match as the standard 7 digit phone number below

# main phone number
[-| ]*
(\d{3}){1} # first 3 digits
[-| ]*
(\d{4}){0,1} # second 4 digits

# extension
[-| |x|X]*
(\d{3,4}){0,1} # extension

For example, here's a question I have. Is there a way to use the look-ahead
match in the area code section _again_ for matching the main number, since
they are the same? I also know that I could use ? instead of {0,1}
(correct?), but I always get confused between that and non-greedy
quantifier. Does that make sense?
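Both questions can be tried out directly. The following is a minimal sketch, not a definitive answer: `split_number`, `$seven_digits` and `$phone` are hypothetical names that do not appear in the post, and `[- ]` is used in place of `[-| ]` because a `|` inside a character class is a literal pipe rather than alternation.

```perl
use strict;
use warnings;

# 1. {0,1} and ? are the same quantifier: zero or one occurrence.
my ($with_braces) = '(310)' =~ /\({0,1}(\d{3})\){0,1}/;
my ($with_qmark)  = '(310)' =~ /\(?(\d{3})\)?/;
print "braces: $with_braces, question mark: $with_qmark\n";

# 2. A ? placed AFTER another quantifier makes that quantifier non-greedy.
my ($greedy_opt) = '12' =~ /^(\d?)\d+$/;     # optional digit, taken if possible
my ($lazy_opt)   = '12' =~ /^(\d??)\d+$/;    # optional digit, skipped if possible
print "greedy: '$greedy_opt', non-greedy: '$lazy_opt'\n";

# 3. Reuse: compile the shared piece once with qr// and interpolate it twice.
my $seven_digits = qr/[- ]*(\d{3})[- ]*(\d{4})/;
my $phone        = qr/\(?\s*(\d{3})?\s*\)?(?=$seven_digits)$seven_digits/;

# Hypothetical helper: returns (area code, first 3 digits, last 4 digits).
sub split_number {
    my ($text) = @_;
    my @groups = $text =~ $phone or return;
    # Groups 2 and 3 belong to the look-ahead copy, 4 and 5 to the real match.
    return @groups[0, 3, 4];
}

for my $input ('(310) 123-4567', '123-4567') {
    my @parts = map { defined($_) ? $_ : '-' } split_number($input);
    print "$input => @parts\n";
}
```

Interpolating a `qr//` object keeps the two copies of the seven-digit pattern in sync, at the cost of extra capture groups from the look-ahead copy.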

I wrote a script to test it (it generates many different possible phone
number inputs, and then applies the regular expression), and it _seems_ to
work. But like I said, I kinda don't know what I'm doing. I've been using
http://www.perldoc.com/perl5.6/pod/perlre.html heavily. It's pretty useful.

Here's another question, do people ever have extensions less than 3, or
greater than 4 numbers?

Thanks for your help!

Joe
Jul 19 '05 #1
joemono wrote:

(snipped)
> I wrote the following expression to take (hopefully) any _reasonable_
> phone number input, and format it as
> (999) 999-9999 x 9999.

Parameter is "reasonable" American style phone numbers.

> what I'm doing is inefficient, slow, etc...

(snipped a lot of regex matching)

Yes, very slow, very inefficient. Do not invoke a
regex engine unless you have no choice, or a regex
actually "proves" to be the most efficient method
found within a collection of tested methods.

> Is there a way to use the look-ahead match

Never use look-ahead unless you have no choice.
Using any style of look-ahead will almost always
be slow and inefficient compared to other methods.

Note my "almost always" does not mean "always" as some
might ignorantly claim. In some cases, a look-ahead
could be your only choice, or most efficient choice.
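Claims about speed are easiest to settle by measuring. As a hedged sketch, the core `Benchmark` module can compare candidate methods on sample input; the two subs below are hypothetical stand-ins for "regex" and "non-regex" digit stripping, and no particular result is claimed here since timings vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @samples = ('(310) 123-4567', '310-123-4567 ext 890', '123 4567');

# Hypothetical candidates: both return only the digits of the input.
sub digits_via_regex { my ($s) = @_; $s =~ s/\D//g;     return $s }
sub digits_via_tr    { my ($s) = @_; $s =~ tr/0-9//cd;  return $s }

# Run each candidate for at least one CPU second, then print a rate table.
cmpthese(-1, {
    regex => sub { digits_via_regex($_) for @samples },
    tr    => sub { digits_via_tr($_)    for @samples },
});
```

Whatever the table shows, it only applies to these inputs; a pattern with look-ahead would need its own entry in the hash to be compared fairly.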

> do people ever have extensions less than 3, or greater than 4 numbers?


Extensions cannot be predicted. Length of an extension is
directly controlled by an internal PBX system. An extension
length can literally be any length.

What is the length of those extensions you hear during a
recorded menu selection? Is there more than one extension?
These types of numbers could be a problem.

1-800-tru-idiots
if you are stupid, press 1 now
*next menu*
if you are stupid and gullible, press 2 now
*next menu*
if you are stupid, gullible and tired of this, press 3 now
*next menu*
Thank you for calling America Onlame! You are an idiot! Goodbye!
*dial tone*

I count three extensions each with a length of one.

Your methodology allows parentheses, hyphens and such, then
tries to match for all possible combinations. This is quite
inefficient and prone to error.

Remove all characters except numbers, then work with your data.
You are interested in phone numbers, are you not? So work with
numbers, nothing else.

Keep in mind, regardless of what methodology you employ, there
is a good chance there will be false positives and false negatives.
Parsing phone numbers is similar to parsing email addresses; it
is difficult and unpredictable.

Look over my method below. This method eliminates all characters
except numbers, then generates a very uniform output appropriate
for a data file. Output is also easy on the human eye.
Ever wonder why people use "spelled" phone numbers, like

1-800-bite-me

When someone tries to give me a spelled number, I say,

"Don't bother. I will not call you."
Purl Gurl
--
Rock Midis! Science Fiction! Amazing Androids!
http://www.purlgurl.net/~callgirl

My $test_it is used to exemplify a non-destructive
method, needed for a print of invalid numbers. You
could easily use $_ throughout as well, but this
defeats "full" printing of an invalid phone number.

#!perl

while (<DATA>)
{
    my $test_it = $_;
    $test_it =~ s/[^\d+]//g;

    if ($test_it =~ tr/0-9// == 7)
    {
        substr ($test_it, 3, 0, " ");
        print "$test_it\n";
    }
    elsif ($test_it =~ tr/0-9// == 10)
    {
        substr ($test_it, 3, 0, " ");
        substr ($test_it, 7, 0, " ");
        print "$test_it\n";
    }
    elsif ($test_it =~ tr/0-9// > 10)
    {
        substr ($test_it, 3, 0, " ");
        substr ($test_it, 7, 0, " ");
        substr ($test_it, 12, 0, " ");
        print "$test_it\n";
    }
    else
    { print "Phone Number Appears Invalid: $_\n"; }
}
__DATA__
123-4567
123 4567
(310) 123 4567
310-123-4567
310-123-4567 ext 890
310 123 4567 890
123-4567FUBAR
310 123 FUBAR

PRINTED RESULTS:
________________

123 4567
123 4567
310 123 4567
310 123 4567
310 123 4567 890
310 123 4567 890
123 4567
Phone Number Appears Invalid: 310 123 FUBAR
Jul 19 '05 #2
I thought that you made a few odd (either esoteric or not Lazy enough)
implementation decisions.

Purl Gurl <pu******@purlgurl.net> wrote in message news:<3F***************@purlgurl.net>...
> [...]You could easily use $_ throughout as well, but this
> defeats "full" printing of an invalid phone number.


Instead of preserving $_ and working on $test_it, you could have saved
a copy and then worked on $_ itself.

You used s/[^\d+]//g instead of tr/0-9//dc to remove all non-digits.

You used tr/0-9// instead of length.
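The differences between these idioms can be seen side by side. This is a small sketch with a made-up input string; the variable names are hypothetical.

```perl
use strict;
use warnings;

my $raw = "+1 (310) 123-4567\n";

# [^\d+] is a character class meaning "anything except a digit or a
# literal +", so plus signs survive the substitution.
(my $via_subst = $raw) =~ s/[^\d+]//g;

# tr with /d (delete) and /c (complement) removes everything outside 0-9.
(my $via_tr = $raw) =~ tr/0-9//dc;

# Once only digits remain, length() equals the count returned by tr/0-9//.
my $count_tr  = ($via_tr =~ tr/0-9//);
my $count_len = length $via_tr;

print "s///:   $via_subst\n";
print "tr///:  $via_tr\n";
print "counts: $count_tr (tr) vs $count_len (length)\n";
```

The two strippers only disagree when the input contains a `+`, which matters if international prefixes are expected.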

The use of the 4-argument version of substr() was neat, but a
judicious pattern match instead of length-checking makes for tighter
code:

while (<DATA>) {
    my $save = $_;
    tr/0-9//dc;
    if (/(...)?(...)(....)/) {
        printf "%3s %s %s %s\n", $1, $2, $3, $';
    }
    else {
        print "Invalid phone number: $save\n";
    }
}

Now let's go back to the issue of stripping all non-numerics. If you
do that, you can't distinguish 123-4567 x890 from (123) 456 7890.
Granted, when you dial, the phone doesn't know the difference, but
there may be some difference in how the person doing the dialing has
to behave.
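The ambiguity is easy to reproduce. In this short sketch, `digits_only` is a hypothetical helper that strips non-digits the same way as the loop above.

```perl
use strict;
use warnings;

# Hypothetical helper: return only the digits of the input.
sub digits_only {
    my ($text) = @_;
    $text =~ tr/0-9//dc;
    return $text;
}

my $with_extension = digits_only('123-4567 x890');
my $with_area_code = digits_only('(123) 456 7890');

print "extension form: $with_extension\n";
print "area code form: $with_area_code\n";
print "indistinguishable after stripping\n" if $with_extension eq $with_area_code;
```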

If, instead of stripping the non-digits, you just look for groups of
digits (optional 3, then mandatory 3 and 4, then optional however
many) amongst the non-digits, you can address that:

#!perl
while (<DATA>) {
    my $save = $_;
    if (/^\D*(?:(\d{3})\D+)?(\d{3})\D+(\d{4})(?:\D+(\d+))?/) {
        printf "%3s %s %s %s\n", $1, $2, $3, $4;
    }
    else {
        print "Invalid phone number: $save\n";
    }
}

__DATA__
123-4567
123 4567
123 4567 x890 <-- note
(310) 123 4567
310-123-4567
310-123-4567 ext 890
310 123 4567 890
123-4567FUBAR
310 123 FUBAR
Output is:
123 4567
123 4567
123 4567 890
310 123 4567
310 123 4567
310 123 4567 890
310 123 4567 890
123 4567
Invalid phone number: 310 123 FUBAR
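The same pattern can also produce the "(999) 999-9999 x 9999" layout asked for in the original question. A minimal sketch, assuming a missing area code or extension should simply be left out; `format_phone` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the formatted number, or nothing on no match.
sub format_phone {
    my ($text) = @_;
    my ($area, $prefix, $line, $ext) =
        $text =~ /^\D*(?:(\d{3})\D+)?(\d{3})\D+(\d{4})(?:\D+(\d+))?/
        or return;

    # Optional groups that did not take part in the match are undef.
    my $out = defined $area ? "($area) " : '';
    $out .= "$prefix-$line";
    $out .= " x $ext" if defined $ext;
    return $out;
}

for my $input ('310-123-4567 ext 890', '(310) 123 4567', '123 4567', '310 123 FUBAR') {
    my $formatted = format_phone($input);
    print defined $formatted ? "$formatted\n" : "Invalid phone number: $input\n";
}
```

Testing each optional group with `defined` avoids the undefined-value warnings that `printf` would emit under `use warnings` when the area code or extension is absent.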
Jul 19 '05 #3
joemono wrote:
> I wrote the following expression to take (hopefully) any
> _reasonable_ phone number input, and format it as (999) 999-9999 x
> 9999.


Hi Joe,

I don't know the likelihood in your case that people outside the US
are asked to enter their phone numbers. The reason why I mention it is
that I have tried to enter my non-US number at quite a few US based
web sites, resulting in error messages...

So, from that experience, I'd say that strict phone number
checking is sometimes a really bad idea. ;-)
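One way to stay lenient is to check only the character set and the digit count. The sketch below rests on stated assumptions rather than any rule from this thread: the 15-digit ceiling follows the ITU-T E.164 maximum, the 7-digit floor is arbitrary, and `looks_like_phone` and the sample numbers are made up.

```perl
use strict;
use warnings;

# Hypothetical lenient check: an optional leading +, then only digits,
# whitespace and common punctuation, with 7 to 15 digits in total.
sub looks_like_phone {
    my ($text) = @_;
    return 0 unless $text =~ m{^\s*\+?[\d\s()./-]+$};

    (my $digits = $text) =~ tr/0-9//dc;
    my $count = length $digits;
    return ($count >= 7 && $count <= 15) ? 1 : 0;
}

for my $input ('+46 8 123 456 78', '(310) 123-4567', '310 123 FUBAR', '12345') {
    printf "%-18s %s\n", $input, looks_like_phone($input) ? 'accepted' : 'rejected';
}
```

A check like this accepts some strings that are not real numbers, which is the trade-off for not rejecting valid non-US ones.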

Gunnar
(Sweden)

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Jul 19 '05 #4
Roy Johnson wrote:
> Purl Gurl wrote in message
> I thought that you made a few odd (either esoteric or not Lazy enough)
> implementation decisions.


I have no interest in reading Code Cop Crap.

It is annoying to open an article only to discover
this type of troll mule manure you write.

Respond to the originating author as you should.

You are wasting your time and the time of readers.
Purl Gurl
Jul 19 '05 #5
Purl Gurl <pu******@purlgurl.net> wrote in message news:<3F***************@purlgurl.net>...
> I have no interest in reading Code Cop Crap.


Interesting. I have no interest in your critiques of my posts that
have nothing to do with Perl.

It's not "trolling" to point out that you're doing bizarre things when
straightforward methods are available. My code was much more clear
than yours, as well as being shorter.

delete $shoulder->{'chip'}
Jul 19 '05 #6
