Bytes IT Community

Regular expression : Grouping decimal values and double quote

Hi all,

I'm having trouble constructing a regular expression in .NET.

I have a comma-separated string and I want to capture each field as its own
group, but I can't get a number with a decimal part captured as a single group.

Example string : 1, 2.3, "two"," three"

So I want to split this string into 4 groups: (1), (2.3), (two) and (three).

The best regular expression that I have so far is:
(?:^|\s*\,\s*)((?:"(?<SubString>(?:""|[^"])*)")+)|((?<SubString>(\d))+)

But this regex will return (1), (2), (3), (two) and (three).

So, what is the right regular expression to do this? Please help.

Thanks.
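
A note on why 2.3 comes back as (2) and (3): the numeric branch of the expression is built only from \d, and \d never matches the decimal point. The sketch below isolates just that part. DigitDemo and AllMatches are names made up for this illustration, and \d+(?:\.\d+)? is one possible pattern for a plain decimal, not the only one.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DigitDemo
{
    // Collects the text of every match of `pattern` found in `input`.
    public static List<string> AllMatches(string input, string pattern)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
            result.Add(m.Value);
        return result;
    }

    public static void Main()
    {
        // \d consumes one digit per match, so the '.' splits the number.
        Console.WriteLine(string.Join(" | ", AllMatches("2.3", @"\d")));             // 2 | 3

        // Digits, then an optional fraction, matched as a single unit.
        Console.WriteLine(string.Join(" | ", AllMatches("2.3", @"\d+(?:\.\d+)?")));  // 2.3
    }
}
```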


Nov 15 '05 #1
8 Replies


Will String.Split(",") do?

<Kalpesh/>

Nov 15 '05 #2

No. Consider this string:
string s = "1, 2.3, \"ab,\"\"c\", \"e.ff,;$\"";

I want to split it into (1), (2.3), (ab,""c) and (e.ff,;$).
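
To make the problem concrete, here is a minimal sketch of what a plain comma split does to that string. SplitDemo and NaiveSplit are names invented for the illustration.

```csharp
using System;

public static class SplitDemo
{
    // Naive approach: cut at every comma, whether or not it sits inside quotes.
    public static string[] NaiveSplit(string line)
    {
        return line.Split(',');
    }

    public static void Main()
    {
        string s = "1, 2.3, \"ab,\"\"c\", \"e.ff,;$\"";
        string[] parts = NaiveSplit(s);

        // 6 pieces instead of the 4 fields wanted: both quoted fields are cut in half.
        Console.WriteLine(parts.Length);
        foreach (string part in parts)
            Console.WriteLine("[" + part.Trim() + "]");
    }
}
```

The commas inside the quoted fields are treated as separators, which is exactly what has to be avoided.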

Got my point? Please help.

Nov 15 '05 #3

Hello

Try this expression
(?:^|\s*\,\s*)(?:(?:"(?<SubString>(?:""|[^"])*)"\s*)|(?:\s*(?<SubString>(?:\s*[^\s,]+)*)\s*))

Best regards
Sherif
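
For anyone trying Sherif's expression, here is a minimal sketch of how it might be driven from C#. It assumes the expression is applied with Regex.Matches and that the SubString group is read from each match. FirstAttemptDemo and Fields are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FirstAttemptDemo
{
    // The expression from the reply as a verbatim string:
    // inside @"..." every literal double quote has to be written twice.
    static readonly Regex Field = new Regex(
        @"(?:^|\s*\,\s*)(?:(?:""(?<SubString>(?:""""|[^""])*)""\s*)|(?:\s*(?<SubString>(?:\s*[^\s,]+)*)\s*))");

    // Returns the SubString capture of every match, in order of appearance.
    public static List<string> Fields(string line)
    {
        var fields = new List<string>();
        foreach (Match m in Field.Matches(line))
            fields.Add(m.Groups["SubString"].Value);
        return fields;
    }

    public static void Main()
    {
        // Prints [1], [2.3], [two] and [ three], one per line.
        foreach (string f in Fields("1, 2.3, \"two\",\" three\""))
            Console.WriteLine("[" + f + "]");
    }
}
```

Two things to note: whitespace inside the quotes is kept (hence the leading space in " three", which can be removed with Trim() if it is not wanted), and the unquoted branch accepts any run of non-comma text, so an input such as 1.23aa is captured as-is.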

Nov 15 '05 #4

Thanks a lot, ElMetainy!

It works, but that regex also matches invalid decimal values. For example, it
still matches the value 1.23aa. The same problem occurs with quoted strings: I
want a quoted field to start and end with a double quote, with nothing after
the closing quote except a comma or the end of the string. Got my point?

Just to be clear:
- 1.23, "abc"qwe" = valid
- 1.23x, "abc"qwe" = invalid
- 1.23, "abc"qwe"xx = invalid

One more thing: can anyone recommend a good (but free) regex tutorial, either
an ebook or a website?

Still hoping for assistance here.

Thank you.


Nov 15 '05 #5

Hello Ahmad

Try this one
^(?:(?:(?:^|,)\s*)(?:(?:"(?<SubString>(""|[^"])*)")|(?<SubString>\d+(?:\.\d+)?))(?=\s*(?:$|,))\s*)+$

Best regards,
Sherif
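
Sherif's second expression matches the whole line at once, so the individual fields are not separate matches. One way to get at them, sketched here under that assumption, is the Captures collection of the SubString group, which in .NET records every value the group captured during the match. WholeLineDemo and TryParse are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WholeLineDemo
{
    // Whole-line expression from the reply (quotes doubled for the verbatim string).
    static readonly Regex Line = new Regex(
        @"^(?:(?:(?:^|,)\s*)(?:(?:""(?<SubString>(""""|[^""])*)"")|(?<SubString>\d+(?:\.\d+)?))(?=\s*(?:$|,))\s*)+$");

    // Returns false when the line as a whole is rejected;
    // otherwise fills `fields` with every SubString capture, in order.
    public static bool TryParse(string line, out List<string> fields)
    {
        fields = new List<string>();
        Match m = Line.Match(line);
        if (!m.Success) return false;
        foreach (Capture c in m.Groups["SubString"].Captures)
            fields.Add(c.Value);
        return true;
    }

    public static void Main()
    {
        string[] samples = { "1.23, \"abc\"", "1.23x, \"abc\"", "1.23, \"abc\"xx" };
        foreach (string s in samples)
        {
            List<string> fields;
            bool ok = TryParse(s, out fields);
            // First sample prints "1.23 | abc"; the other two print "invalid".
            Console.WriteLine(s + " -> " + (ok ? string.Join(" | ", fields) : "invalid"));
        }
    }
}
```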

Nov 15 '05 #6

Hi ElMetainy,

That one works, but I also need to allow a double-quote character inside the
quoted string.
As in my previous post:

1.23, "abc"qwe" = valid (and by using MatchCollection on <SubString>, this
should return [1.23] and [abc"qwe])
1.23, "abc"qwe"xx = invalid
1.23xx, "abc"qwe" = invalid

Can you help me just a little bit more? :) You almost got it right.

p/s: Sorry, I still got no time to learn regex. But I really need a quick
solution right now.


Nov 15 '05 #7

Hello

This can get too complicated.

How should the double quote and the comma be treated? A ',' between double
quotes is considered part of the string, and a double quote between double
quotes is also considered part of the string.
Imagine this:

1.23,"aa,ddd",123 should match [1.23], [aa,ddd] and [123].

Should 1.23,"abc"qwe,"ee",124 match [1.23], [abc"qwe,"ee] and [124], or be
considered invalid?
To make that decision you have to understand the nature of the data (for
example, being able to distinguish a contact's first name from his
nickname), which is not possible with regular expressions.

This is why it is difficult to match a single double quote between two double
quotes.

This is where "" resolves the ambiguity:
1.23,"abc""qwe,""ee",124 should match [1.23], [abc"qwe,"ee] and [124],
meaning that two consecutive double quotes within double quotes are
treated as one double quote that is part of the string. The "" convention is
standard in formats like CSV.

Best regards,
Sherif
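
The doubled-quote convention Sherif describes can be sketched for a single field as follows. This is only an illustration under the assumption that "" is the sole escape. QuoteDemo and TryUnquote are invented names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteDemo
{
    // Opening quote, then any mix of "" pairs and non-quote characters, then closing quote.
    static readonly Regex Quoted = new Regex(@"^""(?<body>(?:""""|[^""])*)""$");

    // On success, `value` holds the field content with each "" collapsed to a single ".
    public static bool TryUnquote(string raw, out string value)
    {
        Match m = Quoted.Match(raw);
        value = m.Success ? m.Groups["body"].Value.Replace("\"\"", "\"") : "";
        return m.Success;
    }

    public static void Main()
    {
        string value;

        // Doubled quotes are unambiguous: prints abc"qwe,"ee
        Console.WriteLine(TryUnquote("\"abc\"\"qwe,\"\"ee\"", out value) ? value : "rejected");

        // A lone quote in the middle is ambiguous and is rejected: prints rejected
        Console.WriteLine(TryUnquote("\"abc\"qwe\"", out value) ? value : "rejected");
    }
}
```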



Nov 15 '05 #8

Hi,

I know it's complicated; that's why I'm here.

But I think I have a way out now. I can use a MatchCollection to break the
string apart at the commas, and then use another regex to validate each
piece. :)
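
A rough sketch of that two-step idea, with one substitution stated plainly: the splitting step here is a small hand-written scanner rather than a MatchCollection, because tracking "inside quotes" by hand is easier to reason about. It assumes embedded quotes are doubled ("") as in CSV. TwoPassDemo, SplitFields and IsValidField are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class TwoPassDemo
{
    // Pass 1: cut the line at commas that are outside double quotes.
    public static List<string> SplitFields(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        foreach (char ch in line)
        {
            if (ch == '"') inQuotes = !inQuotes;   // a "" pair toggles twice, leaving the state unchanged
            if (ch == ',' && !inQuotes)
            {
                fields.Add(current.ToString().Trim());
                current.Clear();
            }
            else
            {
                current.Append(ch);
            }
        }
        fields.Add(current.ToString().Trim());
        return fields;
    }

    // Pass 2: a field must be a plain decimal number or a fully quoted string.
    static readonly Regex ValidField = new Regex(@"^(?:\d+(?:\.\d+)?|""(?:""""|[^""])*"")$");

    public static bool IsValidField(string field)
    {
        return ValidField.IsMatch(field);
    }

    public static void Main()
    {
        // Prints: 1.23x invalid, "ab,c" valid, "e"xx invalid, 7 valid.
        foreach (string f in SplitFields("1.23x, \"ab,c\", \"e\"xx, 7"))
            Console.WriteLine(f + " -> " + (IsValidField(f) ? "valid" : "invalid"));
    }
}
```

Splitting first and validating afterwards keeps each regex small, at the cost of making two passes over the data.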

Anyway, you've been a great help ElMetainy. Thanks a lot.

Bye.



Nov 15 '05 #9

This discussion thread is closed
