Find Credit Card numbers in text field 
May 19th, 2009, 06:17 PM
|  | Expert | | Join Date: Mar 2007 Location: England
Posts: 1,063
Provided Answers: 2 | |
Data Protection etc.... shouldn't be there.
How do I find credit card numbers buried amongst any length of text.
Googling finds credit card validation claims, which are too complicated.
I need to find groups of numbers that resemble credit card numbers.
I think the criteria is 13 - 24 digits in groups of 4 or 5
seperated by spaces or dashes or bunched together.
It looks like a regex using PATINDEX is the answer but regex is black magic to me
Have I got this right and could I please have some pattern suggestions? - SELECT notes FROM cust_order WHERE PATINDEX('pattern',notes) > 0
| 
May 20th, 2009, 06:50 AM
|  | Member | | Join Date: Nov 2008
Posts: 47
| | | re: Find Credit Card numbers in text field
Using a similar function to the one I posted last time you could get something that might help you on your way (I'm not sure what CK had in mind for an alternative approach with String Parsing so I'm bumbling on as before !): - CREATE FUNCTION [dbo].[R_Test_func2](@sub nvarchar(1000))
-
RETURNS int
-
AS
-
BEGIN
-
-
DECLARE @Output int
-
DECLARE @myOutput int
-
SET @Output = 0
-
SET @myOutput = 0
-
-
WHILE patindex('%[0123456789]%', @sub) <> 0
-
BEGIN
-
SET @Output =
-
len(substring
-
(
-
substring(@sub,patindex('%[0123456789]%',@sub),len(@sub))
-
,1
-
,CASE
-
WHEN patindex('%[^0123456789 -]%',substring(@sub,patindex('%[0123456789]%',@sub),len(@sub))) = 0
-
THEN len(@sub)
-
ELSE patindex('%[^0123456789 -]%',substring(@sub,patindex('%[0123456789]%',@sub),len(@sub)))-1
-
END
-
))
-
-
SET @sub =
-
CASE
-
WHEN patindex('%[^0123456789 -]%',substring(@sub,patindex('%[0123456789]%',@sub),len(@sub))) = 0
-
THEN 'a'
-
ELSE substring
-
(
-
substring(@sub,patindex('%[0123456789]%',@sub),len(@sub))
-
,patindex('%[^0123456789 -]%',substring(@sub,patindex('%[0123456789]%',@sub),len(@sub)))
-
,len(@sub)
-
)
-
END
-
-
SET @myOutput =
-
CASE
-
WHEN @Output > @myOutput THEN @Output ELSE @myOutput
-
END
-
END
-
RETURN @myOutput
-
-
END
The general principle is the same as before
It looks for the 1st numeric character: - patindex('%[0123456789]%',@sub)
Then for the 1st non numeric character thereafter (this time also allowing for a space or -): - patindex('%[^0123456789 -]%',substring(@sub,patindex('%[0123456789]%',@sub),len(@sub)))
The length of this string is stored and then the rest of the string is searched.
The resulting output will be the length of the longest string of consecutive numbers (including spaces and -)
So, you could use a WHERE statement such as: - WHERE dbo.R_Test_Func2(notes) > 12
You could get rid of one of the two variables @Output and @myOutput by doing the SET as a CASE statement in one go but I felt this may complicate the (already less than easy to read) code too much!
As a note, another fix from the original function I posted is that it will now work if the last character is non-numeric, hence the addition of a couple of CASE statements buried in there. In the previous version if the last character was not a number then it would try and return a string of either length 0 or starting from the 0th character which I discovered it did not like very much!
Hope this helps
| 
May 20th, 2009, 08:41 AM
|  | Expert | | Join Date: Mar 2007 Location: England
Posts: 1,063
Provided Answers: 2 | | | re: Find Credit Card numbers in text field
Looks like it could work Uncle Dickie.
I am under a bit of pressure with this one now, so will give it a try
| 
May 20th, 2009, 06:51 PM
|  | Expert | | Join Date: Jun 2007
Posts: 1,914
Provided Answers: 1 | | | re: Find Credit Card numbers in text field
It meant the same thing :) You're basically parsing the TEXT which is by common definition is a string. As string parsing is reading the string one character at a time or finding patterns within it.
As I always said, if that algorithm gave you the performance you need, that should be fine, until someone suggested a better one :)
Thanks Uncle Dickie and Happy Coding CG :)
--- CK
| 
May 21st, 2009, 10:13 AM
|  | Member | | Join Date: Nov 2008
Posts: 47
| | | re: Find Credit Card numbers in text field
Having looked a bit more at the patindex function, a much easier option than my previous function would simply be: -
SELECT notes
-
FROM cust_order
-
WHERE patindex('%[0-9][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -]%',TranID) > 0
-
This is just looking for a string of 13 consecutive numbers, spaces or '-' (it must start with a number).
|  | | | | /bytes/about
We are a network of experts and professionals in IT and software development that help one another with answers to tough questions and share insights.
Get the best answers to your questions from over 225,662 network members.
|