We do something like this. (Note that *we here* do not translate letters to
numbers, as in 1-800-CALL-NOW.)
Imports System.Text.RegularExpressions

' This is one of my classes that does dozens
' of different kinds of string manipulation
Public Class Transform
    ' Matches every non-digit character
    Private _cleanPhone As New Regex("\D")
    ' Other declarations...

    ' Note that since strings are immutable, Replace returns a
    ' new string. I pass the input to this method as ByRef instead
    ' of ByVal so the result is assigned straight back to the
    ' caller's variable.
    Public Sub CleanPhoneNumber(ByRef value As String)
        value = _cleanPhone.Replace(value, "")
    End Sub

    ' Other Methods...
End Class
So in the client, if I do this:
Dim tx As New Transform()
Dim phone As String
' load my data table (dt)
For Each dr As DataRow In dt.Rows
    phone = dr("phone").ToString()
    ' CleanPhoneNumber is a Sub that updates its ByRef argument,
    ' so there is no return value to assign
    tx.CleanPhoneNumber(phone)
    ' now I have 1234567890
    ' and then I have a line like Joseph's code snippet:
    dr("phone") = _
        "(" & phone.Substring(0, 3) & ") " & _
        phone.Substring(3, 3) & "-" & _
        phone.Substring(6, 4)
    ' now I have (123) 456-7890
Next
da.Update(dt) ' give it to a SqlDataAdapter
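One thing the loop above takes for granted is that every cleaned value is exactly 10 digits long; Substring throws an ArgumentOutOfRangeException on anything shorter. Here is a minimal sketch of a guarded version. The module name PhoneFormatting and the function FormatNanp are made up for illustration and are not part of my Transform class:

```vbnet
Imports System.Text.RegularExpressions

Public Module PhoneFormatting
    ' One shared, reusable Regex that strips every non-digit character.
    Private ReadOnly NonDigits As New Regex("\D")

    ' Hypothetical helper: returns "(123) 456-7890" for a 10-digit NANP
    ' number, or Nothing when the cleaned input is not exactly 10 digits.
    Public Function FormatNanp(ByVal raw As String) As String
        If raw Is Nothing Then Return Nothing

        Dim digits As String = NonDigits.Replace(raw, "")
        If digits.Length <> 10 Then Return Nothing

        Return "(" & digits.Substring(0, 3) & ") " & _
               digits.Substring(3, 3) & "-" & _
               digits.Substring(6, 4)
    End Function
End Module
```

In the loop you would call FormatNanp(dr("phone").ToString()) and only write the result back to the row when it is not Nothing, leaving the odd-length values alone for manual review.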
Note that this format only works with NANP (3 digits)-(7 digits) phone
numbers, as are found in the US, Canada and a few other countries. Here's a
good reference for other international phone numbers:
http://www.wtng.info
If you have to deal with international phone numbers, you will have to build
a table of all countries you're interested in, that contains a country code
or name, International Prefix, Trunk Prefix and maybe some other data from
the web site if it concerns you.
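To give a rough idea of the shape of such a table, here is a sketch of an in-memory version. This is NOT the system I built; the type and member names are invented for this post, and the two sample rows are only illustrative, so check every value against the reference site before relying on it:

```vbnet
Imports System.Collections.Generic

' Hypothetical row type for a per-country dialing table.
Public Class CountryDialingInfo
    Public Name As String
    Public CountryCode As String          ' dialed after the caller's international prefix
    Public InternationalPrefix As String  ' what callers in this country dial to call abroad
    Public TrunkPrefix As String          ' what callers dial for a domestic long-distance call

    Public Sub New(ByVal countryName As String, ByVal code As String, _
                   ByVal intlPrefix As String, ByVal trunk As String)
        Name = countryName
        CountryCode = code
        InternationalPrefix = intlPrefix
        TrunkPrefix = trunk
    End Sub
End Class

Public Module DialingTable
    ' Builds a tiny sample table keyed by a two-letter country name code.
    ' A real table would be loaded from a database, not hard-coded.
    Public Function BuildSample() As Dictionary(Of String, CountryDialingInfo)
        Dim table As New Dictionary(Of String, CountryDialingInfo)
        table.Add("US", New CountryDialingInfo("United States", "1", "011", "1"))
        table.Add("GB", New CountryDialingInfo("United Kingdom", "44", "00", "0"))
        Return table
    End Function
End Module
```

Everything is stored as a String rather than a number because prefixes such as "00" and "011" lose their leading zeros otherwise.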
I cannot share details because of intellectual property issues, but I spent
4 months last year developing a phone number parsing system compatible with
auto-dialers that can reliably dial any phone number on the planet *from*
anywhere on the planet.
The above website was my guide to all the telephone mysteries. :)
While I'm here, I will point out that using Regex.Replace() directly,
without creating an instance of Regex, is bad if you're doing it in a loop
(and I bet you are).
The static methods of the Regex class incur the overhead of creating and
destroying a Regex instance. This means that if you do something like this:
For Each dr As DataRow In dt.Rows
    Regex.Replace(...)
Next
What you're doing is the same as this:
For Each dr As DataRow In dt.Rows
    Dim rx As New Regex(...)
    rx.Replace(...)
    rx = Nothing
Next
So for something like an immutable Regex instance, you should declare an
instance of it and *then* run your loop:
Dim rx As New Regex(...)
For Each dr As DataRow In dt.Rows
    rx.Replace(...)
Next
The difference between the first loop and the last loop is that the first
one will create and destroy one instance of Regex for each record in your
DataTable. The last loop, directly above, uses only a single instance of
Regex. How much that matters depends on how many records you're
processing. In our case, the difference was astronomical. When we clean
phone numbers, it could be close to a million at a time. Just imagine what
the first loop would have done to the Garbage Collector! :)
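If you want to see the gap on your own machine rather than take my word for it, here is a rough timing sketch. The iteration count and the sample string are arbitrary assumptions, and the numbers will vary with the runtime version; newer runtimes cache recently used patterns for the static methods, so the gap may be smaller than what we saw:

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Text.RegularExpressions

Module RegexTiming
    Sub Main()
        Const Iterations As Integer = 100000   ' assumed workload size
        Dim sample As String = "(123) 456-7890"
        Dim result As String = Nothing

        ' Static call inside the loop
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            result = Regex.Replace(sample, "\D", "")
        Next
        sw.Stop()
        Console.WriteLine("Static call:     {0} ms", sw.ElapsedMilliseconds)

        ' One instance created before the loop and reused
        Dim rx As New Regex("\D")
        sw = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            result = rx.Replace(sample, "")
        Next
        sw.Stop()
        Console.WriteLine("Shared instance: {0} ms", sw.ElapsedMilliseconds)
    End Sub
End Module
```

Run it a few times and ignore the first pass, since JIT compilation skews whichever loop goes first.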
--
Peace & happy computing,
Mike Labosh, MCSD
"Mr. McKittrick, after very careful consideration, I have
come to the conclusion that this new system SUCKS."
-- General Barringer, "War Games"