regex pro

here's the deal...csv, tick-encapsulated data. trying to use regexes to
validate records. here's an example row:

'AD,'BF','132465','06/09/2004','','BNSF','A','TYPE','1278','','BR','2999',' ','LX','','01','09','1','','','','','','','','','CUSTOM JOB CODE TEST'

record type is in the 8th column ('1278'). using regex b/c there are a
myriad of types that cause other data w/n the record (or related records) to
be in/valid. i'm having problems getting a match on the generalization of
the first 7 columns:

something like this:

(?=(?:(?<!',').*(?!=',')',')){7}(?:'1278(?=','))

(?=(?:(?<!',').*(?!=',')',')){7} represents the first 7 columns.

if someone can help me generalize that pattern, i'd appreciate it very much.

tia,

steve
Nov 20 '05 #1
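
One way to express "skip the leading columns, then test the record type" without lookarounds is to consume each ticked field as `'[^']*',`. The following is a minimal sketch in Python rather than the thread's VB.NET, under the assumptions that every field is wrapped in ticks and no field contains a tick. The row is a cleaned-up, hypothetical stand-in for the sample above, and `TYPE_INDEX` and `record_type` are illustrative names, not from the original post.

```python
import re

# Hypothetical, well-formed row: every field is wrapped in ticks and
# no field contains a tick itself (an assumption, not from the thread).
row = "'AD','BF','132465','06/09/2004','','BNSF','A','TYPE','1278','','BR'"

# Zero-based index of the record-type field in this hypothetical row.
TYPE_INDEX = 8


def record_type(line, index=TYPE_INDEX):
    """Return the field at `index`, or None if the row is too short."""
    # ^(?:'[^']*',){index} skips exactly `index` complete fields,
    # then '([^']*)' captures the next one.
    pattern = r"^(?:'[^']*',){%d}'([^']*)'" % index
    match = re.match(pattern, line)
    return match.group(1) if match else None


print(record_type(row))  # 1278
```

Anchoring at the start of the row and matching whole fields keeps the column count explicit, so an off-by-one in the column number shows up as a wrong value instead of a silent non-match.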
17 Replies


good thing i posted the explanation of the problem...led me right to the
solution. record type is in column 7!!!

(?=(?:(?<!',').*(?!=',')',')){8}('((12((7[89])|(8[0-7])|(9[0-468])))|(13((0[0-3])|(1[16])|(2[0135])))|1612)')

thanks for looking though...if you did.
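
The record-type alternation at the end of that pattern is easier to review when it is expanded into the codes it accepts. A small sketch, in Python rather than VB.NET, that tries every four-digit code against the alternation (copied from the post with the wrapped line rejoined); the variable names are illustrative only.

```python
import re

# Record-type alternation copied from the post above (ticks included).
RECORD_TYPE = re.compile(
    r"'((12((7[89])|(8[0-7])|(9[0-468])))|(13((0[0-3])|(1[16])|(2[0135])))|1612)'"
)

# Try every four-digit code and keep the ones the pattern accepts.
accepted = [n for n in range(1000, 10000) if RECORD_TYPE.fullmatch(f"'{n}'")]

print(len(accepted))  # 28
print(accepted)
```

Listing the accepted codes this way makes it quick to compare the pattern against the intended list of record types before deploying a rule.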

;^)
Nov 20 '05 #2

Hi,

Regular expression tester.
http://royo.is-a-geek.com/iserializable/regulator/

Ken
Nov 20 '05 #3

Glad u solved it.

I personally hate RegExp. I know there is often no other sensible or
efficient way of doing stuff, but I still really, really hate it.

--

OHM ( Terry Burns )
. . . One-Handed-Man . . .
Nov 20 '05 #4

depends on what you need to use it for. in this instance, it saves my remote
app from having to program for the business rules applied formally on
another, remote server. a generic way for the remote server to tell my
application what a valid set of results should "look like". otherwise, i'd
have fun rebuilding and deploying my app each time new or changed rules
appeared. each "rule" gets a description, whether or not the pattern should
match (exist)...etc. works great...and, all i have to do is collect the
rules.
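
The "rules as data" idea described above can be sketched roughly as follows. This is Python rather than the VB.NET used elsewhere in the thread, and everything in it is hypothetical: the `Rule` fields mirror the description (a description, a pattern, and whether the pattern should match), while the two sample rules are invented for illustration and would come from the remote server in the setup described.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    description: str   # human-readable text reported when the rule fails
    pattern: str       # regex supplied by the remote server
    must_match: bool   # True: pattern must be found; False: it must not be


def failed_rules(record, rules):
    """Return the descriptions of every rule the record violates."""
    failures = []
    for rule in rules:
        found = re.search(rule.pattern, record) is not None
        if found != rule.must_match:
            failures.append(rule.description)
    return failures


# Hypothetical rules; hard-coded here only to keep the sketch self-contained.
rules = [
    Rule("row starts with a ticked field", r"^'[^']*'", True),
    Rule("no unticked empty column", r",,", False),
]

print(failed_rules("'AD','BF','1278'", rules))  # []
print(failed_rules("'AD',,'1278'", rules))      # ['no unticked empty column']
```

Because the rules are plain data, adding or changing one does not require rebuilding the client, which is the benefit the post describes.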

hate's a strong word...regex is, or s/b, your best friend when dealing w/
text. there's no tool in my toolbox that i wouldn't give up first b4 a good
regex program.

but, to each their own.
Nov 20 '05 #5

thanks ken...read about it in the msdn mag too. i have it and only have a
couple of issues w/ it. i also have a couple of other .net based regex
validators i've found on the net as well. they're all helpful. as is your
response.

thanks again,

steve
Nov 20 '05 #6

Well, I know it's versatile, I accept that, it's just that it seems so, so
'Unixy'-like.

--

OHM ( Terry Burns )
. . . One-Handed-Man . . .
Nov 20 '05 #7

Hi Terry,

Again I have exactly the same idea about it as you; I call it
scripting-like.

Cor
Nov 20 '05 #8

it has nothing to do w/ unix. even vb.not.net and vb.net have a "like"
statement. consider a regex to be like "like" on acid. it also has nothing
to do with scripting. it is simply the best way to recognize/find patterns
w/n text w/o having to know exactly what the pattern is or what the text to
be searched is beforehand.

whether or not it is "unixy" or "scriptish", why start discarding
options/tools based on something other than performance, flexibility, or
cost? that's dangerous programming in a project manager's mind.
Nov 20 '05 #9

I don't discard it. I simply don't like it. Every time I look at a complex
regExp, it reminds me of the old days where people used to cram code in so
tight because of memory constraints that it was unreadable, or of die hard C
programmers who like to write code which is impossible to understand.

That's my point. If you don't agree with me, then that's fine.

--

OHM ( Terry Burns )
. . . One-Handed-Man . . .
Nov 20 '05 #10

Hi Terry,

Again exactly the same

With the addition that I sometimes think that it is used as a kind of
obfuscating the code for others.

However, your last sentence (OHM) goes for me as well.

Cor
Nov 20 '05 #11

If you have ever processed lots of text, like screen scraping, you will end
up using regex. I agree that it is somewhat hard to see what it does, but
once you learn it, like everything, you will wonder why you did not use it
before. But I had to buy a book since resources on the web are sparse and
MS documentation is pretty much non-existent.

Lloyd Sheen
Nov 20 '05 #12

Hi Lloyd,

I never say never; however, it is not my first choice, although I am not
saying you should never use it.
When you have to make complex text changes in one pass through unorganized
documents, I believe there are not many alternatives; however, I often see
it used here specifically for simple changes.

Cor
Nov 20 '05 #13

| With the addition that I sometimes think that it is used as a kind of
| obfuscating the code for others.

set the regex options to ignore white space and then indent and add comments
w/n the regex. what you have in that case is a programming language. regex
is like a sql statement...but for text. consider what it would take to
program the following:

testString:

'abcdefg', 'lkasjdflk', 'kslthhtjkehsljt', '.zx,mv.zmx', 'abcdefg',
'lkasjdflk', 'kslthhtjkehsljt', '.zx,mv.zmx', 'abcdefg', 'lkasjdflk',
'kslthhtjkehsljt', '.zx,mv.zmx', 'abcdefg', 'lkasjdflk', 'kslthhtjkehsljt',
'.zx,mv.zmx'

your job:

find all ticked strings, then replace the ones that repeat with only one
instance.

well, w/ regex, it is as simple as:

' assumes Imports System.Text.RegularExpressions at the top of the file
Dim regex As New Regex("('[^']*?', )(?:'[^']*?', )*(\1)")
testString = regex.Replace(testString, "$1")

if the pattern looks complex, ignore whitespace and add your comments and/or
indenting or whatever. pretty simple.

but again, to each their own.
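
To make the "ignore whitespace and add comments" suggestion concrete, here is a sketch of the same pattern laid out in verbose mode. It is written in Python, where the flag is `re.VERBOSE`; the .NET counterpart is `RegexOptions.IgnorePatternWhitespace`. One detail to watch in either engine: once whitespace is ignored, the literal space after each comma has to be written explicitly, here as `[ ]`.

```python
import re

# The thread's pattern in verbose mode. Unescaped whitespace is ignored,
# so the literal space after each comma is written as [ ].
DEDUPE = re.compile(
    r"""
    ('[^']*?',[ ])        # group 1: one ticked item plus its separator
    (?:'[^']*?',[ ])*     # any number of further items in between
    \1                    # a later repeat of group 1
    """,
    re.VERBOSE,
)

test_string = (
    "'abcdefg', 'lkasjdflk', 'kslthhtjkehsljt', '.zx,mv.zmx', "
    "'abcdefg', 'lkasjdflk', 'kslthhtjkehsljt', '.zx,mv.zmx', "
    "'abcdefg', 'lkasjdflk', 'kslthhtjkehsljt', '.zx,mv.zmx', "
    "'abcdefg', 'lkasjdflk', 'kslthhtjkehsljt', '.zx,mv.zmx'"
)

# \1 in the replacement plays the role of "$1" in the VB.NET call.
result = DEDUPE.sub(r"\1", test_string)
print(result)
```

With the comments in place, each line of the pattern reads like one step of the search, which is the point being made about readability.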
Nov 20 '05 #14

Hi Steve,
> so, both C and regex are not your friends? ;^)


Although I wrote that there are circumstances in which I would not rule out
things like regex forever, I get the idea that you are trying (without saying
it directly, but with that single ;^) to take a shot at the knowledge of OHM
and me.

Try this

' Test data: the same four ticked items repeated in varying order.
Dim teststring1 As String = _
    "'abcdefg','lkasjdflk','kslthhtjkehsljt','.zx,mv.zmx','abcdefg'," & _
    "'lkasjdflk','kslthhtjkehsljt','.zx,mv.zmx','abcdefg','lkasjdflk'," & _
    "'kslthhtjkehsljt','.zx,mv.zmx','lkasjdflk','abcdefg','lkasjdflk'," & _
    "'kslthhtjkehsljt','.zx,mv.zmx'"

' Version 1: regex replace, timed over 10001 iterations.
Dim start As Integer = Environment.TickCount
Dim teststring2 As String = ""
For i As Integer = 0 To 10000
    Dim regex As New System.Text.RegularExpressions.Regex( _
        "('[^']*?',)(?:'[^']*?',)*(\1)")
    teststring2 = regex.Replace(teststring1, "$1")
Next
Console.Write(teststring2 & "time: " & _
    (Environment.TickCount - start).ToString & vbCrLf)

' Version 2: split on the ',' separator and keep only unseen items.
start = Environment.TickCount
Dim teststring3 As String = ""
For i As Integer = 0 To 10000
    Dim sb As New System.Text.StringBuilder
    Dim sp As String() = Split(teststring1, "','")
    ' Strip the leading tick of the first item and the trailing tick of the last.
    sp(0) = sp(0).Substring(1)
    sp(sp.Length - 1) = sp(sp.Length - 1).Substring(0, _
        sp(sp.Length - 1).Length - 1)
    For Each da1 As String In sp
        Dim da2 As String = "'" & da1 & "'"
        If sb.ToString.IndexOf(da2) = -1 Then
            sb.Append(da2)
            sb.Append(",")
        End If
    Next
    ' Drop the trailing comma.
    teststring3 = sb.ToString.Substring(0, sb.ToString.Length - 1)
Next
Console.Write(teststring3 & "time: " & _
    (Environment.TickCount - start).ToString & vbCrLf)

You will see that the second one without the regex is 4 times faster and
gives the same result.
As for style, we can discuss whether the second one also reaches a kind of
obfuscated style; for me, however, it is the same as with the Regex sample.

Cor
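
For readers who want to check that the two approaches agree on this test string, here is a rough Python rendering of both. It is not a port of the VB.NET above: the split version swaps the StringBuilder/IndexOf check for an order-preserving `dict.fromkeys`, and no timing is attempted. The function names are illustrative.

```python
import re

test_string = (
    "'abcdefg','lkasjdflk','kslthhtjkehsljt','.zx,mv.zmx','abcdefg',"
    "'lkasjdflk','kslthhtjkehsljt','.zx,mv.zmx','abcdefg','lkasjdflk',"
    "'kslthhtjkehsljt','.zx,mv.zmx','lkasjdflk','abcdefg','lkasjdflk',"
    "'kslthhtjkehsljt','.zx,mv.zmx'"
)


def dedupe_split(text):
    """Split on the ',' separator and keep the first copy of each item."""
    items = text[1:-1].split("','")       # drop the outer ticks first
    unique = list(dict.fromkeys(items))   # dicts preserve insertion order
    return ",".join(f"'{item}'" for item in unique)


def dedupe_regex(text):
    """The regex from the thread, without the space after each comma."""
    return re.sub(r"('[^']*?',)(?:'[^']*?',)*\1", r"\1", text)


print(dedupe_split(test_string))
print(dedupe_regex(test_string))
```

Both print the same four items for this input. That agreement depends on the data: the regex collapses everything between the first and last copy of a repeated item, so on a string such as `'a','b','c','a','d'` it would drop `'b'` and `'c'`, while the split version would keep them.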
Nov 20 '05 #15

'C' was a great language as far as I am concerned, I used it for several
years quite happily, so I would credit myself with a reasonable
understanding of it.

However, I personally know two programmers who used to write code in a way
which was designed to demonstrate ( to the knowledgeable eye ) their
expertise in understanding the nuances of the compiler. I've seen a piece of
code which printf'ed a poem with no obvious source. All of it looked like
garbage, not unlike a RegExp.

--

OHM ( Terry Burns )
. . . One-Handed-Man . . .
"steve" <a@b.com> wrote in message
news:10*************@corp.supernews.com...
> this just hit me as a funny logical consequence...
>
> | die hard C
> | programmers who like to write code which is impossible to understand.
>
> punctuated differently:
>
> "die hard C programmers who like to write code, which is impossible to
> understand."
>
> there are only two logical assumptions...only die hard C programmers like to
> write code that is impossible to understand...or...ockham's razor - any code
> can be made very hard to understand; few, if any, programmers of any
> language "like" to write hard to understand code; understanding is inference
> from experience; therefore, it is simplest to say that the language of C is
> impossible for you to understand (since understanding is wholly an
> individual endeavor). and, as we all know, this simplest answer or solution
> is, more oft' than not, the correct one.
>
> so, both C and regex are not your friends? ;^)
>
> just playing w/ you ohm.
>
> cheers.

Nov 20 '05 #16

i was kidding w/ ohm.

ahhhh...i wonder what the difference w/b if the data itself contained no
identifying marker by which you could perform a nifty split operation? that
would require you to rewrite your entire function...i'd just have to change
the pattern in my example and still be left w/ two lines of code to
maintain. the regex pattern itself is no longer considered obfuscation if,
like i said, one were to place comments w/n it (just as you would w/ any
language).
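
The "only the pattern changes" argument can be illustrated with a small sketch, again in Python rather than VB.NET. The helper `collapse_repeats` and the pipe-terminated variant are hypothetical; the sketch assumes each item can be described by a regex fragment that matches the item together with its separator.

```python
import re


def collapse_repeats(text, item):
    """Collapse the span from an item to its last repeat into one copy.

    `item` is a regex fragment matching one item plus its separator.
    """
    return re.sub(f"({item})(?:{item})*\\1", r"\1", text)


# Ticked, comma-separated items, as in the thread.
print(collapse_repeats("'a','b','a','b','c'", r"'[^']*?',"))  # 'a','b','c'

# Hypothetical variant: bare items terminated by a pipe, with no ticks.
print(collapse_repeats("a|b|a|b|c", r"[^|]*\|"))              # a|b|c
```

The calling code stays the same for both formats; only the fragment describing an item differs.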

but i digress...i can see my humor was lost to offense. apologies s/b
applied where they are needed to both you and ohm.

later,

steve
Nov 20 '05 #17

No apology needed, I was smirking while both reading and replying. RegExp
does have its place, as you say.

Cheers

OHM ( Terry Burns )
. . . One-Handed-Man . . .
Nov 20 '05 #18
