ae********@yahoo.com wrote in
news:11**********************@z14g2000cwz.googlegroups.com:
expanding this message to microsoft.public.dotnet.xml
Greetings
Please direct me to the right group if this is an inappropriate
place to post this question. Thanks.
I want to format a numeric value according to an arbitrary
regular expression.
Background:
I have an XML schema that I have no control over. It is filled
with simpleTypes with restrictions that include xsd:pattern
elements, eg:
<xsd:restriction base="xsd:float">
<xsd:minInclusive value="0.000" />
<xsd:maxInclusive value="10.000" />
<xsd:pattern value="\d{1,2}\.\d{3}" />
</xsd:restriction>
These patterns and BaseTypes vary widely.
I am dynamically creating an XmlDocument at runtime and
validating it against the schema. Each
XmlDocumentElement.InnerText property is populated from a data
structure that contains an object reference that points to a
value of the appropriate type. That is: The object references
may point to a float or to a string or whatever is called for,
depending on the BaseType of the corresponding simpleType.
(Using strings for everything is not an option.)
The problem is that in order to create the XmlDocument, I have
to convert an object that might actually be a float or an int
into a formatted string.
I can't just go
myElement.InnerText = Convert.ToString( Convert.ToDouble(
myObject ) );
because myElement.InnerText will then contain strings that look
like this
1.000 ==> "1"
1.050 ==> "1.05"
which cannot be guaranteed to satisfy the schema. It certainly
doesn't satisfy the one above. In the case of that schema, I
would obviously want to have conversions like this:
1.000 ==> "1.000"
1.050 ==> "1.050"
Yes, I know I can easily write code to do this for a single
regular expression, but the problem is that there is no
guarantee what the xsd:pattern is going to be. Hard coding
something that uses three decimal places just won't do.
Here's what I would like to do:
myElement.InnerText = FormatAccordingToRegEx( Convert.ToDouble(
myObject ), myXsdPattern );
Remember, myXsdPattern could be anything, so I can't do
something like
Regex.Replace( Convert.ToString( Convert.ToDouble( myObject ) ),
myXsdPattern, "${1},${2}.${3}" ); // not sure about syntax here
That won't work if myXsdPattern = "\d{1,}.\d{3}" or
"\d{1,}.\d{1}" or "-{0,1}\d{1,2}.\d{1}" (to list a few
examples); and even if it did, I think I would need to have
access to the schema so I could insert parentheses into the
regular expressions/xsd:patterns. As I said, I have no control
over the schema. (Although, the ultimate solution may indeed
require me to read the xsd:pattern in, insert parentheses
according to some algorithm and then do something like the code
fragment above. I don't know.)
Now, I'm pretty sure that xsd:pattern will only be used with
numeric values like xsd:float and xsd:int in the schema. I don't
think it would be possible to do this with an arbitrary string
anyway because in many cases, several formatted outputs are
possible from one input... so I would think this simplifies the
problem a bit. (Several outputs are possible from one input with
numbers as well, but they all mean the same thing -- leading
zeros and commas are pretty irrelevant. A number is still a
number.)
I hope I'm being clear. I'm grammatically challenged today.
I know this is a bit backward, but I thought I'd check and see
if anyone has already written code to do this. I will be
grateful for any and all suggestions -- even guesses about what
to try. Thanks in advance.
Tony,
Interesting problem. Regexes are usually used to see if a given piece of
data matches a given format. I'm not aware of any way to directly use a
regex to *make* the data match the format.
FWIW, .Net has numeric format strings that supply this functionality.
http://msdn.microsoft.com/library/de...matstrings.asp
or
http://tinyurl.com/4tus6
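As a minimal sketch of what those format strings do for the first pattern
in your post: the custom format "0.000" forces exactly three decimal
places. The class and method names are made up for illustration, and
passing CultureInfo.InvariantCulture is my own addition so the decimal
separator stays "." regardless of the machine's regional settings.

```csharp
using System;
using System.Globalization;

public class FormatStringDemo
{
    // Formats a value with at least one integer digit and exactly
    // three decimal places, e.g. 1.05 becomes "1.050".
    public static string ThreePlaces(double value)
    {
        return value.ToString("0.000", CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(ThreePlaces(1.0));   // 1.000
        Console.WriteLine(ThreePlaces(1.05));  // 1.050
    }
}
```

This only covers one hard-coded pattern, which is exactly the limitation
you describe, but it shows the kind of string the schema expects.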
I know you stated you have no direct control over the XSD, but if it's
feasible, you should find a way to get the author of the XSD to insert
an additional element in the xsd:restriction element. Something like:
<xsd:format value="0#.###"/>
If that's not possible, there may be a way to hack something together...
Correct me if I'm wrong, but it seems the algorithm you're trying to
produce is:
int input = 4;
string regex = @"\d{1,2}\.\d{3}"; // No optional clauses.
string result = BlackBox(input, regex);
// result would be "4.000".

input = -4;
regex = @"-{0,1}\d{1,2}.\d{1}"; // Optional minus sign.
result = BlackBox(input, regex);
// result would be "-4.0".
I'm assuming that illegal input values that are out of range of the
supplied regular expression will be caught using xsd:minInclusive
and xsd:maxInclusive.
I'm also assuming the format of the regex roughly matches the format
of the xsd:minInclusive and xsd:maxInclusive values. If that's the
case, then you may be able to use that knowledge to use the supplied
regex to format the input value.
It appears from the regex examples you give that the number of digits
to the right of the decimal point is usually fixed, whereas the
number of digits to the left of the decimal point is variable.
Using that observation, it's possible to pad the input value with
zeros on the right side of the decimal point, and then use the regex
to extract the formatted number:
// Compile with "csc /t:exe example.cs"
using System;
using System.Text.RegularExpressions;

namespace Example
{
    public class TestClass
    {
        public static int Main(string[] args)
        {
            Console.WriteLine(BlackBox("4", @"\d{1,2}\.\d{3}", "10.000"));
            Console.WriteLine(BlackBox("4", @"\d{1,}.\d{3}", "1000.000"));
            Console.WriteLine(BlackBox("4", @"\d{1,}.\d{1}", "1000.0"));
            Console.WriteLine(BlackBox("-4", @"-{0,1}\d{1,2}.\d{3}", "10.000"));
            return 0;
        }

        // Pads input to the precision of maxInput, then returns the first
        // substring of the padded input that matches regex.
        public static string BlackBox(string input, string regex, string maxInput)
        {
            const string decimalPoint = ".";

            if (maxInput.IndexOf(decimalPoint) > 0)
            {
                if (input.IndexOf(decimalPoint) < 0)
                    input += decimalPoint;

                // Pad input with zeros to the right of the decimal
                // point so it matches the number of decimal places in
                // maxInput (we don't care about numbers on the left side of
                // the decimal point).
                // For example, if maxInput = 12.3456, and
                // input = 9.8, pad input so it equals 9.8000.
                // Math.Max guards against a negative count when input
                // already has more decimal places than maxInput.
                input += new string('0', Math.Max(0,
                    GetNumberOfDecimalPlaces(maxInput) -
                    GetNumberOfDecimalPlaces(input)));
            }

            return Regex.Match(input, regex).Groups[0].ToString();
        }

        // Counts the digits that follow the first decimal point in input.
        public static int GetNumberOfDecimalPlaces(string input)
        {
            return Regex.Match(input, @"\.(?<digits>\d*)",
                RegexOptions.ExplicitCapture).Groups["digits"].ToString().Length;
        }
    }
}
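
A different way to hack this together, sketched under heavy assumptions,
is to translate the pattern into a numeric format string instead of
padding by hand. The sketch below only understands patterns shaped like
the examples in this thread (an optional "-{0,1}", then \d{m,n}, a dot,
then \d{k}) and rejects everything else. PatternFormatter, Shape and
Format are hypothetical names, not part of any library. Because an
xsd:pattern has to match the whole value, the result is checked against
the pattern with explicit ^ and $ anchors before it is returned.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public class PatternFormatter
{
    // Recognizes only patterns of the form  -{0,1}\d{m,n}.\d{k}
    // (sign clause optional, upper bound n optional, dot escaped or not).
    private static readonly Regex Shape = new Regex(
        @"^(?<sign>-(\{0,1\}|\?))?\\d\{(?<min>\d+)(,\d*)?\}\\?\.\\d\{(?<frac>\d+)\}$");

    public static string Format(double value, string xsdPattern)
    {
        Match m = Shape.Match(xsdPattern);
        if (!m.Success)
            throw new NotSupportedException("Unsupported pattern: " + xsdPattern);

        int minInt = int.Parse(m.Groups["min"].Value);
        int frac = int.Parse(m.Groups["frac"].Value);

        // Build a custom format such as "0.000": one '0' per required digit.
        string format = new string('0', Math.Max(1, minInt)) + "." + new string('0', frac);
        string text = value.ToString(format, CultureInfo.InvariantCulture);

        // Anchor the pattern so it must match the entire string.
        if (!Regex.IsMatch(text, "^(?:" + xsdPattern + ")$"))
            throw new FormatException(text + " does not satisfy " + xsdPattern);
        return text;
    }

    public static void Main()
    {
        Console.WriteLine(Format(4, @"\d{1,2}\.\d{3}"));       // 4.000
        Console.WriteLine(Format(-4, @"-{0,1}\d{1,2}.\d{1}")); // -4.0
    }
}
```

A value with too many integer digits for the pattern (for example 123.4
against \d{1,2}\.\d{3}) ends up as a FormatException here rather than as
a silently truncated string.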
--
Hope this helps.
Chris.
-------------
C.R. Timmons Consulting, Inc.
http://www.crtimmonsinc.com/