Finding formatting items in a string

Hi there,

Given a standard .NET string, does anyone know what the regular expression
would be to locate each (optional) formatting item in the string (or, more
likely, does anyone have a link that will show me this)? For instance, given
the following simple string:

"My phone number is {0} and my SSN is {1}"

I want to enumerate (or create a collection of) all formatting items in the
string which would be "{0}" and "{1}" in this (trivial) example. The regular
expression itself should handle all legal cases of course (as described
under "composite formatting" in MSDN - see here:
http://msdn2.microsoft.com/en-us/library/txafckwd.aspx). Any help would be
appreciated. Thanks.
Feb 17 '08 #1
Hello Jack,

The following expression will take care of most of what you want:

(?<!([^\{]|^)\{(\{{2})*)\{[0-9]+(,[-]?[0-9]+)?(:[^\}]+)?\}(?!\}(\}{2})*([^\}]|$))

I'll try to explain what it does:

(?<!([^\{]|^)\{(\{{2})*)
This part checks whether we're dealing with an even number of opening {. In that
case all of them are escaped and should therefore be ignored.
Because there is no easy way to check for odd or even counts,
I've done it as follows:
- First make sure we're either at the beginning of a line or that we match
a character that is not a {. That way we know where we start counting.
- Now chop off the first {, followed by any number of groups of 2 extra {'s.

\{
- If that still leaves us with one {, then we're in business.
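
For readers who want to see the odd/even check outside of a regex, here is a minimal Python sketch of the same idea. Python's built-in re module only accepts fixed-width lookbehind, so the run of { is counted by hand here. The function name is_unescaped_open is made up for this illustration and is not part of any library.

```python
def is_unescaped_open(text: str, index: int) -> bool:
    """Return True if the '{' at text[index] can start a formatting item.

    Mirrors the lookbehind: walk left over the run of '{' directly before
    index. An even-length run (including zero) means those braces pair up
    as escaped "{{", which leaves this one unescaped.
    """
    if text[index] != "{":
        raise ValueError("text[index] must be '{'")
    run = 0
    pos = index - 1
    while pos >= 0 and text[pos] == "{":
        run += 1
        pos -= 1
    return run % 2 == 0


print(is_unescaped_open("{0}", 0))      # True
print(is_unescaped_open("{{0}}", 1))    # False
print(is_unescaped_open("{{{0}}}", 2))  # True
```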

[0-9]+
- Now accept the numbered part. I've made it pretty simple here, any number
will do.

(,[-]?[0-9]+)?
- Now accept the optional alignment. I think you could write the [-] as [+-],
but I'm not sure off the top of my head that a plus is allowed for the alignment.
I guess it is though.

(:[^\}]+)?
- Accept almost anything as the optional formatting mask. As you can specify
the formatting mask for each and every type differently based on the TypeFormatter,
I guess there's no use in limiting the possible formats in any way.
- So chop off everything that's not a closing }

\}
- Pick off the closing }
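
To make the pieces above concrete, here is a small Python sketch of just the core of the expression (index, optional alignment, optional format mask), without the escaping checks at either end. The sample items are only illustrations.

```python
import re

# Core of the expression only: {index[,alignment][:format]}
CORE = re.compile(r"\{([0-9]+)(,-?[0-9]+)?(:[^}]+)?\}")

for sample in ("{0}", "{1,8}", "{0,-10:C2}"):
    match = CORE.fullmatch(sample)
    print(sample, match.groups())
# {0} ('0', None, None)
# {1,8} ('1', ',8', None)
# {0,-10:C2} ('0', ',-10', ':C2')
```

Each optional part that is absent comes back as None, which makes it easy to tell a bare {0} from one that carries an alignment or a format mask.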

(?!\}(\}{2})*([^\}]|$))
- But only if it's followed by zero or an even number of further closing }'s. This uses
the same logic as above.
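
The trailing check can be tried on its own. This is a sketch in Python (lookahead may be variable-width there, unlike lookbehind). It assumes that "the same logic as above" means pairs of } inside the lookahead, mirroring the pairs of { in the leading check.

```python
import re

# Item followed by a negative lookahead that rejects an odd run of extra '}'.
TRAILING = re.compile(r"\{[0-9]+\}(?!\}(?:\}\})*(?:[^}]|$))")

for sample in ("{0}", "{0}}", "{0}}}", "{0}}}}"):
    print(sample, bool(TRAILING.search(sample)))
# {0} True
# {0}} False
# {0}}} True
# {0}}}} False
```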

You could make the regex more specific, but I guess this should get you started.

Also note that I haven't taken any whitespace into account, as I haven't
had time to experiment where you would be allowed to add whitespace and where
not.

If you still have any questions on how to improve or further limit the expression,
feel free to ask.

--
Jesse Houwing
jesse.houwing at sogeti.nl
Feb 18 '08 #2
Hello Jack,

I even found a FAQ on this... I'm going to write a blog post about this pattern
at some point, I guess. It has a lot of interesting regex things in it.

The FAQ:
http://msdn2.microsoft.com/en-us/net.../aa569608.aspx

And a further completed regex, including the fact that you can use { and
} within the custom pattern if you want to... Just escape them again. Which
makes the whole escaping { harder and harder to understand...

This is the expression I've got so far:
(?<!([^\{]|^){({{)*){(?<item>[0-9]+)(?<alignment>,[-+]?[0-9]+)?(?<format>:([^{}]|{{|}})+)?}(?!}(}})*([^}]|$))

Note also that I removed the \ before most, if not all, of the { and } in
the expression. It seems that the .NET regex parser is quite content with
this. Only if the braces form a quantifier of the form {n(,(m)?)?} do you
need to escape the { and the }. I found that by accident. Not that it makes
the expression any easier to read... *sigh*...
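
In case it helps anyone trying this outside .NET: Python's built-in re module rejects the variable-length lookbehind used above, so the sketch below swaps in a different technique. It scans left to right and consumes "{{" and "}}" as tokens of their own, so an escaped brace can never be mistaken for the start of an item. The group names match the expression above; the function name find_format_items is invented for this example. Note that it only locates items and does not validate the rest of the string (a stray single brace is silently skipped).

```python
import re

TOKEN = re.compile(
    r"""
    \{\{ | \}\}                                 # escaped brace: consume and ignore
    |
    \{
        (?P<item>[0-9]+)                        # argument index
        (?P<alignment>,[-+]?[0-9]+)?            # optional alignment
        (?P<format>:(?:[^{}]|\{\{|\}\})+)?      # optional format string
    \}
    """,
    re.VERBOSE,
)


def find_format_items(text):
    """Return (item, alignment, format) tuples for each formatting item."""
    return [
        (m.group("item"), m.group("alignment"), m.group("format"))
        for m in TOKEN.finditer(text)
        if m.group("item") is not None
    ]


print(find_format_items("My phone number is {0} and my SSN is {1}"))
# [('0', None, None), ('1', None, None)]
print(find_format_items("{{0}} is literal, {0,-10:C2} is not"))
# [('0', ',-10', ':C2')]
```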

--
Jesse Houwing
jesse.houwing at sogeti.nl
Feb 18 '08 #3
BTW, I came across this gem during my own research. You may be interested in
checking it out (it even breaks down your regex expression into plain
English)

http://www.ultrapico.com/Expresso.htm
Feb 18 '08 #4
Just an update that it looks very good so far (thanks). I haven't unravelled
the opening and closing brace (stuff) yet but I think there may be a (rare)
problem with the handling of the ":[formatString]". Any pairs of "{{" or
"}}" are valid in "formatString" if I understand the docs correctly so they
should be ignored. I'm still reviewing the situation however (and your code)
so this is just a heads-up before you start tackling your article :)
Feb 18 '08 #5
Ok. It appears that your expression:

(?<!([^\{]|^)\{(\{{2})*)\{(?<item>[0-9]+)(?<alignment>,[-+]?[0-9]+)?(?<format>:[^\}]+)?\}(?!\}(\}{2})*([^\}]|$))

may need to be modified slightly:

(?<!([^\{]|^)\{(\{{2})*)\{(?<item>[0-9]+)(?<alignment>,[-+]?[0-9]+)?(?<format>:([^\}]|\}{2})+)?\}(?!\}(\}{2})*([^\}]|$))

I've simply changed the "format" group so that, in addition to allowing one or
more of any character except a "}" (as per your original expression), it
now also allows pairs of "}}" (before the final "}" that
terminates it). I'm still digging through it all, though, as I rarely
work with regular expressions.
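
A quick way to see the difference between the two "format" groups is to run both on an item whose format string contains an escaped "}}". The sketch below is Python and uses only the middle of each expression (no escaping checks at either end). The sample item "{0:a}}b}" is made up.

```python
import re

# "format" group as in the original expression: stops at the first '}'
original = re.compile(r"\{(?P<item>[0-9]+)(?P<format>:[^}]+)?\}")
# "format" group as modified: also accepts pairs of '}}'
modified = re.compile(r"\{(?P<item>[0-9]+)(?P<format>:(?:[^}]|\}{2})+)?\}")

sample = "{0:a}}b}"
print(original.search(sample).group(0))  # {0:a}
print(modified.search(sample).group(0))  # {0:a}}b}
```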
Feb 18 '08 #6
Hello Jack,

That would indeed solve the issue. I've been experimenting a bit more and
came to the same conclusion...

To make it even less readable, but shorter, you can remove the escapes from
the \{ and \} to make it the following expression:

(?<!([^{]|^){({{)*){(?<item>[0-9]+)(?<alignment>,[-+]?[0-9]+)?(?<format>:([^{}]|}}|{{)+)?}(?!}(}})*([^}]|$))

Also as far as I can tell the opening { must be escaped in the format pattern
as well. I adjusted the above expression for that.
--
Jesse Houwing
jesse.houwing at sogeti.nl
Feb 18 '08 #7
