
regex pro

here's the deal...csv, tick-encapsulated data. trying to use regexes to
validate records. here's an example row:

'AD','BF','132465','06/09/2004','','BNSF','A','TYPE','1278','','BR','2999','','LX','','01','09','1','','','','','','','','','CUSTOM JOB CODE TEST'

record type is in the 8th column ('1278'). using regex b/c there are a
myriad of types that cause other data within the record (or related records) to
be in/valid. i'm having problems getting a match on the generalization of
the first 7 columns:

something like this:

(?=(?:(?<!',').*(?!=',')',')){7}(?:'1278(?=','))

(?=(?:(?<!',').*(?!=',')',')){7} represents the first 7 columns.

if someone can help me generalize that pattern, i'd appreciate it very much.

tia,

steve
Nov 20 '05 #1
good thing i posted the explanation of the problem...led me right to the
solution. record type is in column 7!!!

(?=(?:(?<!',').*(?!=',')',')){8}('((12((7[89])|(8[0-7])|(9[0-468])))|(13((0[0-3])|(1[16])|(2[0135])))|1612)')
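
For anyone following along, one way to express "skip the leading fields, then look at the record type" is to anchor at the start of the row and consume whole tick-quoted fields. This is a minimal sketch in Python rather than the thread's VB.NET, and it is not steve's pattern. It assumes fields never contain a tick and that eight fields precede the record type (as the {8} above implies). FIELD, RECORD_TYPE, SAMPLE_ROW and record_type are names made up for illustration.

```python
import re

# One tick-quoted field followed by its comma; assumes no ticks inside fields.
FIELD = r"'[^']*',"

# Skip 8 leading fields, then capture the contents of the next one.
RECORD_TYPE = re.compile(r"^(?:" + FIELD + r"){8}'([^']*)'")

# Shortened copy of the example row (an assumption: first field is 'AD').
SAMPLE_ROW = ("'AD','BF','132465','06/09/2004','','BNSF','A','TYPE','1278',"
              "'','BR','2999','','LX','','01','09','1'")


def record_type(row):
    """Return the record type of a row, or None if the row is malformed."""
    m = RECORD_TYPE.match(row)
    return m.group(1) if m else None


print(record_type(SAMPLE_ROW))  # 1278
```

Capturing the field first and checking its value afterwards keeps the "where is the column" question separate from the "which types are valid" question.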

thanks for looking though...if you did.

;^)
Nov 20 '05 #2
Hi,

Regular expression tester.
http://royo.is-a-geek.com/iserializable/regulator/

Ken
Nov 20 '05 #3
Glad u solved it.

I personally hate RegExp. I know there is often little other sensible or
efficient way of doing stuff without it, but I still really, really hate it.

--

OHM ( Terry Burns )
. . . One-Handed-Man . . .
Nov 20 '05 #4
depends on what you need to use it for. in this instance, it saves my remote
app from having to program for the business rules applied formally on
another, remote server. it's a generic way for the remote server to tell my
application what a valid set of results should "look like". otherwise, i'd
have fun rebuilding and deploying my app each time new or changed rules
appeared. each "rule" gets a description, whether or not the pattern should
match (exist)...etc. works great...and all i have to do is collect the
rules.
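
The rule scheme described above (a description, a pattern, and whether the pattern should or should not be found) could look roughly like this. It's a sketch in Python under stated assumptions, not steve's actual code: the Rule class, failed_rules and the two sample rules are all hypothetical, and how the rules travel from the remote server is left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    description: str   # reported when the rule fails
    pattern: str       # regex source, as delivered by the server
    must_match: bool   # True: pattern must be found; False: it must not be


def failed_rules(record, rules):
    """Return the descriptions of every rule the record violates."""
    failures = []
    for rule in rules:
        found = re.search(rule.pattern, record) is not None
        if found != rule.must_match:
            failures.append(rule.description)
    return failures


# Hypothetical rules; in the setup described they would come from the server.
RULES = [
    Rule("row starts with a tick-quoted field", r"^'[^']*'", True),
    Rule("no empty date field", r"'//'", False),
]

print(failed_rules("'AD','BF','132465'", RULES))  # []
print(failed_rules("AD,'//'", RULES))
```

Because the rules are plain data, adding or changing one means updating the list, not rebuilding the application.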

hate's a strong word...regex is, or should be, your best friend when dealing w/
text. there's no tool in my toolbox that i wouldn't give up first before a good
regex program.

but, to each their own.
Nov 20 '05 #5
thanks ken...read about it in the msdn mag too. i have it and only have a
couple of issues w/ it. i also have a couple of other .net based regex
validators i've found on the net as well. they're all helpful. as is your
response.

thanks again,

steve
Nov 20 '05 #6
Well, I know it's versatile, I accept that; it's just that it seems so, so
'Unixy'.

--

OHM ( Terry Burns )
. . . One-Handed-Man . . .
Nov 20 '05 #7
Hi Terry,

Once again I have exactly the same idea about it as you; I call it
scripting-like.

Cor
Nov 20 '05 #8
it has nothing to do w/ unix. even vb.not.net and vb.net have a "like"
operator. consider a regex to be like "like" on acid. it also has nothing
to do with scripting. it is simply the best way to recognize/find patterns
within text w/o having to know beforehand exactly what the pattern is or what
the text to be searched is.

whether or not it is "unixy" or "scriptish", why start discarding
options/tools based on something other than performance, flexibility, or
cost? that's dangerous programming in a project manager's mind.
Nov 20 '05 #9
I don't discard it. I simply don't like it. Every time I look at a complex
RegExp, it reminds me of the old days when people used to cram code in so
tight because of memory constraints that it was unreadable, or of die hard C
programmers who like to write code which is impossible to understand.

That's my point. If you don't agree with me, then that's fine.

--

OHM ( Terry Burns )
. . . One-Handed-Man . . .
Nov 20 '05 #10
Hi Terry,

Again exactly the same:

| I don't discard it. I simply don't like it. Every time I look at a complex
| RegExp, it reminds me of the old days when people used to cram code in so
| tight because of memory constraints that it was unreadable, or of die hard C
| programmers who like to write code which is impossible to understand.
|
| That's my point. If you don't agree with me, then that's fine.

With the addition that I sometimes think it is used as a way of
obfuscating the code for others.

However, your last sentence (OHM) goes for me as well.

Cor
Nov 20 '05 #11
If you have ever processed lots of text, like screen scraping, you will end
up using regex. I agree that it is somewhat hard to see what it does, but
once you learn it, like everything, you will wonder why you did not use it
before. But I had to buy a book, since resources on the web are sparse and
MS documentation is pretty much non-existent.

Lloyd Sheen
Nov 20 '05 #12
Hi Lloyd,

I never say never; it is not my first choice, but I am not saying it should
never be used.
When you have to make complex text changes in one pass through unorganized
documents, I believe there are not many alternatives; however, I often see
it used here for simple changes.

Cor

Nov 20 '05 #13
| With the addition that I sometimes think that it is used as a kind of
| obfuscating the code for others.

set the regex options to ignore whitespace and then indent and add comments
within the regex. what you have in that case is a programming language. regex
is like a sql statement...but for text. consider what it would take to
program the following:

testString:

'abcdefg', 'lkasjdflk', 'kslthhtjkehsljt', '.zx,mv.zmx', 'abcdefg',
'lkasjdflk', 'kslthhtjkehsljt', '.zx,mv.zmx', 'abcdefg', 'lkasjdflk',
'kslthhtjkehsljt', '.zx,mv.zmx', 'abcdefg', 'lkasjdflk', 'kslthhtjkehsljt',
'.zx,mv.zmx'

your job:

find all ticked strings, then replace the ones that repeat with only one
instance.

well, w/ regex, it is as simple as:

dim regex as new regex("('[^']*?', )(?:'[^']*?', )*(\1)")
testString = regex.replace(testString, "$1")
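
The pattern can be tried out in Python's re module with the same syntax. A minimal sketch (Python rather than VB.NET; test_string is the data above rebuilt on one line, which is an assumption about how it was originally laid out). One thing worth knowing before relying on it: the replacement drops everything between the first copy of an item and its last repeat, so it only amounts to de-duplication when every item in between is itself a repeat, as happens to be the case with this data.

```python
import re

ITEMS = ["'abcdefg'", "'lkasjdflk'", "'kslthhtjkehsljt'", "'.zx,mv.zmx'"]
# Four repetitions of the same four ticked strings, separated by ", ".
test_string = ", ".join(ITEMS * 4)

# Group 1 is one item plus its separator; the pattern then skips any number
# of items and requires group 1 to appear again (the \1 backreference).
pattern = re.compile(r"('[^']*?', )(?:'[^']*?', )*(\1)")

# Replace the whole matched run with the first item only.
result = pattern.sub(r"\1", test_string)
print(result)

# Caveat: 'x' is not a repeat, yet it vanishes with the rest of the run.
print(pattern.sub(r"\1", "'a', 'x', 'a', 'b'"))  # 'a', 'b'
```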

if the pattern looks complex, ignore whitespace and add your comments and/or
indenting or whatever. pretty simple.

but again, to each their own.
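
To illustrate the "ignore whitespace and add comments" advice, here is the record-type alternation from the second post laid out with Python's re.VERBOSE flag, the counterpart of .NET's RegexOptions.IgnorePatternWhitespace. It's a sketch only: the range comments are one reading of the pattern, the redundant capture groups have been swapped for non-capturing ones, and RECORD_TYPES is a made-up name.

```python
import re

RECORD_TYPES = re.compile(r"""
    '                      # opening tick
    (
        12 (?: 7[89]       # 1278-1279
             | 8[0-7]      # 1280-1287
             | 9[0-468]    # 1290-1294, 1296, 1298
           )
      | 13 (?: 0[0-3]      # 1300-1303
             | 1[16]       # 1311, 1316
             | 2[0135]     # 1320, 1321, 1323, 1325
           )
      | 1612
    )
    '                      # closing tick
""", re.VERBOSE)

for field in ("'1278'", "'1295'", "'1612'"):
    print(field, bool(RECORD_TYPES.fullmatch(field)))
```

Laid out this way, a gap such as 1295 or 1297 is visible at a glance, which is hard to say of the one-line form.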
Nov 20 '05 #14
Hi Steve,
| so, both C and regex are not your friends? ;^)

Although I wrote that there are circumstances in which I would not rule out
things like regex forever, I get the idea that you are trying (without saying
it directly, but with that single ;^) to take a shot at the knowledge of OHM
and me.

Try this

' Test data: the same four ticked strings repeated, separated by commas.
Dim teststring1 As String = _
    "'abcdefg','lkasjdflk','kslthhtjkehsljt','.zx,mv.zmx','abcdefg'," & _
    "'lkasjdflk','kslthhtjkehsljt','.zx,mv.zmx','abcdefg','lkasjdflk'," & _
    "'kslthhtjkehsljt','.zx,mv.zmx','lkasjdflk','abcdefg','lkasjdflk'," & _
    "'kslthhtjkehsljt','.zx,mv.zmx'"

' Version 1: the regex (built inside the loop, so construction is timed too).
Dim start As Integer = Environment.TickCount
Dim teststring2 As String = ""
For i As Integer = 0 To 10000
    Dim regex As New System.Text.RegularExpressions.Regex( _
        "('[^']*?',)(?:'[^']*?',)*(\1)")
    teststring2 = regex.Replace(teststring1, "$1")
Next
Console.Write(teststring2 & " time: " & _
    (Environment.TickCount - start).ToString & vbCrLf)

' Version 2: split on the ',' between ticks and append unseen items.
start = Environment.TickCount
Dim teststring3 As String = ""
For i As Integer = 0 To 10000
    Dim sb As New System.Text.StringBuilder
    Dim sp As String() = Split(teststring1, "','")
    ' Strip the opening tick of the first item and the closing tick of the last.
    sp(0) = sp(0).Substring(1)
    sp(sp.Length - 1) = sp(sp.Length - 1).Substring(0, _
        sp(sp.Length - 1).Length - 1)
    For Each da1 As String In sp
        Dim da2 As String = "'" & da1 & "'"
        If sb.ToString.IndexOf(da2) = -1 Then
            sb.Append(da2)
            sb.Append(",")
        End If
    Next
    ' Drop the trailing comma.
    teststring3 = sb.ToString.Substring(0, sb.ToString.Length - 1)
Next
Console.Write(teststring3 & " time: " & _
    (Environment.TickCount - start).ToString & vbCrLf)

You will see that the second one, without the regex, is 4 times faster and
gives the same result.
As for style, we can discuss whether the second version also ends up in a kind
of obfuscated style; for me, however, it is no worse than the Regex sample.

Cor
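
A rough Python port of the same comparison, for anyone who wants to repeat it outside .NET. It's only a sketch: dedupe_regex and dedupe_split are names invented here, the split version uses an ordered dict where Cor uses a StringBuilder lookup, and no timings are claimed, since they depend on the runtime and on whether the pattern is compiled once or on every pass.

```python
import re
import timeit

TEST = ("'abcdefg','lkasjdflk','kslthhtjkehsljt','.zx,mv.zmx','abcdefg',"
        "'lkasjdflk','kslthhtjkehsljt','.zx,mv.zmx','abcdefg','lkasjdflk',"
        "'kslthhtjkehsljt','.zx,mv.zmx','lkasjdflk','abcdefg','lkasjdflk',"
        "'kslthhtjkehsljt','.zx,mv.zmx'")

# Compiled once, outside the timed calls.
PATTERN = re.compile(r"('[^']*?',)(?:'[^']*?',)*(\1)")


def dedupe_regex(text):
    # Collapse each run that starts and ends with the same item.
    return PATTERN.sub(r"\1", text)


def dedupe_split(text):
    # Strip the outer ticks, split on ',' between ticks, keep first sightings.
    items = text[1:-1].split("','")
    return ",".join("'" + item + "'" for item in dict.fromkeys(items))


print(dedupe_regex(TEST) == dedupe_split(TEST))  # True
print(timeit.timeit(lambda: dedupe_regex(TEST), number=10000))
print(timeit.timeit(lambda: dedupe_split(TEST), number=10000))
```

The two functions agree on this input; they would diverge on data where a non-repeated item sits between two copies of another, because the regex removes the whole run.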
Nov 20 '05 #15
'C' was a great language as far as I am concerned. I used it for several
years quite happily, so I would credit myself with a reasonable
understanding of it.

However, I personally know two programmers who used to write code in a way
which was designed to demonstrate (to the knowledgeable eye) their
expertise in understanding the nuances of the compiler. I've seen a piece of
code which printf'ed a poem with no obvious source. All of it looked like
garbage, not unlike a RegExp.

--

OHM ( Terry Burns )
. . . One-Handed-Man . . .
"steve" <a@b.com> wrote in message
news:10*************@corp.supernews.com...
this just hit me as a funny logical consequence...

| die hard C
| programmers who like to write code which is impossible to understand.

punctuated differently:

"die hard C programmers who like to write code, which is impossible to
understand."

there are only two logical assumptions...only die hard C programmers like to
write code that is impossible to understand...or...ockham's razor - any code
can be made very hard to understand; few, if any, programmers of any
language "like" to write hard to understand code; understanding is
inference from experience; therefore, it is simplest to say that the
language of C is impossible for you to understand (since understanding is
wholly an individual endeavor). and, as we all know, the simplest answer or
solution is, more oft' than not, the correct one.

so, both C and regex are not your friends? ;^)

just playing w/ you ohm.

cheers.

Nov 20 '05 #16
i was kidding w/ ohm.

ahhhh...i wonder what the difference would be if the data itself contained no
identifying marker by which you could perform a nifty split operation? that
would require you to rewrite your entire function...i'd just have to change
the pattern in my example and still be left w/ two lines of code to
maintain. the regex pattern itself is no longer considered obfuscation if,
like i said, one were to place comments within it (just as you would w/ any
language).

but i digress...i can see my humor was lost to offense. apologies should be
applied where they are needed, to both you and ohm.

later,

steve
Nov 20 '05 #17
No apology needed, I was smirking while both reading and replying. RegExp
does have its place, as you say.

Cheers

OHM ( Terry Burns )
. . . One-Handed-Man . . .
Nov 20 '05 #18
