On 14 Feb, 14:59, Max <a...@tiscali.it> wrote:
Hello Pete!
I have written this regular expression:
<!\\[CDATA\\[(((?:\\u0009|\\u000A|\\u000D|[\\u0020-\\uD7FF]|[\\uE000-\\uFFFD]|[\\u10000-\\u10FFFF])*?)(]]>(?:\\u0009|\\u000A|\\u000D|[\\u0020-\\uD7FF]|[\\uE000-\\uFFFD]|[\\u10000-\\u10FFFF])*?)*)]]>
I break it into these component parts:
XParser.CHAR =
"(?:\\u0009|\\u000A|\\u000D|[\\u0020-\\uD7FF]|[\\uE000-\\uFFFD]|[\\u10000-\\u10FFFF])";
XParser.CDSTART = "<!\\[CDATA\\[";
XParser.CDATA = "((" + XParser.CHAR + "*?)(]]>" + XParser.CHAR + "*?)*)";
XParser.CDEND = "]]>";
XParser.CDSECT = XParser.CDSTART + XParser.CDATA + XParser.CDEND;
XML code example:
<![CDATA[this child is of <<<>nodeType CDATA]]>
The problem arose when I extended the simple regular expression for
CDATA ('(" + XParser.CHAR + "*?)') so that it could also capture
additional ']]>' markup.
But this way it also captures two or more CDSECTs as a single match...
Example:
1 Tag: <![CDATA[this child is of <<<>nodeType CDATA]]>
Capture: this child is of <<<>nodeType CDATA
2 Tag: <![CDATA[this child is of <<<>nodeType CDATA]]><![CDATA[this child is of <<<>nodeType CDATA]]>
Capture: this child is of <<<>nodeType CDATA]]><![CDATA[this child is of <<<>nodeType CDATA
Is it possible to resolve this?
Thanks in advance,
Max
Hi Max,
In this case I think you need to rework your XParser.CDATA rule along
the lines of the following:
// You could write these using a similar approach to your XParser.CHAR
// if you prefer.
// Each alternative matches one step; the outer ()* does the repetition.
var no_bracket = "[^\\]]";
var one_bracket = "][^\\]]";
var two_brackets = "]][^>]";
XParser.CDATA = "(" + no_bracket + "|" + one_bracket + "|" +
two_brackets + ")*" + "]*";
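To check the rule against the two-section example from the question, a minimal sketch along these lines may help. It makes a few assumptions that are not stated in the thread: XParser is a plain object, each alternative matches a single step, the whole rule is wrapped in one capturing group so the content can be read back, and the "g" flag with matchAll is used to walk over consecutive sections. The XParser.CHAR restriction is left out for brevity.

```javascript
// Hypothetical stand-in for the XParser object used in the thread.
var XParser = {};

XParser.CDSTART = "<!\\[CDATA\\[";
XParser.CDEND = "]]>";

// The three alternatives of the rule, one "step" each.
var no_bracket = "[^\\]]";      // any character except ]
var one_bracket = "][^\\]]";    // a single ] followed by a non-]
var two_brackets = "]][^>]";    // ]] followed by anything except >

// One capturing group holds the CDATA content (an assumption of this
// sketch); the trailing ]* absorbs extra ] characters that sit directly
// in front of the closing ]]>.
XParser.CDATA = "((?:" + no_bracket + "|" + one_bracket + "|" +
                two_brackets + ")*" + "]*)";
XParser.CDSECT = XParser.CDSTART + XParser.CDATA + XParser.CDEND;

// Return the content of every CDATA section found in the input.
function captureCdata(input) {
  var re = new RegExp(XParser.CDSECT, "g");
  return Array.from(input.matchAll(re), function (m) { return m[1]; });
}

var section = "<![CDATA[this child is of <<<>nodeType CDATA]]>";
console.log(captureCdata(section));            // one capture
console.log(captureCdata(section + section));  // two separate captures
```

With the two sections placed back to back, each one should come out as its own capture instead of being merged into one.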
The logic is basically:
if( current char is not ] ||
    current char is ] AND next char is NOT ] ||
    current char is ] AND the next char is ] AND the next one is NOT >
)
then OK;
which is more easily understood as:
if( current char is not ] ) then OK;
else if( current char is ] AND next char is NOT ] ) then OK;
else if( current char is ] AND the next char is ] AND the next one is
NOT > ) then OK;
The "]*" at the end just allows any number of ] characters directly before the closing ]]>, if necessary.
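One edge case that seems worth checking is a run of three or more ] in front of the >, e.g. x]]]>y]]>. As far as I can tell, the ]][^>] alternative consumes the third ] as its "next one", so the match can run on past the first ]]>. A possible alternative, sketched below, tests the following characters with a negative lookahead instead of consuming them, which follows the "next char" wording of the logic more literally. It assumes JavaScript's RegExp, which supports (?!...); the names CDATA_LOOKAHEAD, CDSECT_LOOKAHEAD and firstCdata are made up for illustration.

```javascript
// Hypothetical names, not from the thread.
// Content is any run of: a non-] character, or a ] that does not
// start the terminator "]]>". The lookahead consumes nothing.
var CDATA_LOOKAHEAD = "((?:[^\\]]|](?!]>))*)";
var CDSECT_LOOKAHEAD = "<!\\[CDATA\\[" + CDATA_LOOKAHEAD + "]]>";

// Return the content of the first CDATA section, or null if none is found.
function firstCdata(input) {
  var m = new RegExp(CDSECT_LOOKAHEAD).exec(input);
  return m ? m[1] : null;
}

// The section ends at the first "]]>", so one ] stays in the content.
console.log(firstCdata("<![CDATA[x]]]>y]]>"));
// Brackets that never form "]]>" are kept as ordinary content.
console.log(firstCdata("<![CDATA[a]b]]c]]>"));
```

Because each alternative consumes exactly one character, there is no nested repetition, and the trailing ]* is no longer needed.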
HTH,
Pete.
--
=============================================
Pete Cordell
Tech-Know-Ware Ltd
for XML to C++ data binding visit
http://www.tech-know-ware.com/lmx
(or
http://www.xml2cpp.com)
=============================================