Convert CDATA expression to Javascript RegExp

Max
Hello everyone!

Can anyone help me to convert the CDATA expression "CDATA ::= (Char* -
(Char* ']]>' Char*))" to a Javascript regular expression?

Thanks,

Max
Feb 13 '07 #1
Translation to English: A CDATA's value can contain any legal XML
characters except the three-character sequence ]]> (which is used to
terminate the value).

I don't do Javascript, so you'll have to translate it the rest of the
way yourself.
--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Feb 13 '07 #2
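
Read literally, the rule says that a string is acceptable CDATA content as long as the terminator never occurs in it. A minimal JavaScript sketch of just that check follows. It assumes the text has already been verified to contain only legal XML characters, and the function name isCDataContent is made up for illustration.

```javascript
// Hypothetical helper: true when `text` could be the content of a CDATA
// section, i.e. it never contains the terminator "]]>".
// Assumption: `text` is already known to consist of legal XML characters.
function isCDataContent(text) {
  return text.indexOf("]]>") === -1;
}

console.log(isCDataContent("x < y && y > z")); // true
console.log(isCDataContent("a[b[0]] > 1"));    // true: "]] >" is not "]]>"
console.log(isCDataContent("a[b[0]]> 1"));     // false
```

A plain substring search like this is enough when the content is already isolated; a regular expression only becomes necessary when the check has to be part of a larger pattern.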
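Writing a regular expression for text that is terminated by a string of
characters is slightly involved. You need to do something like:

/^(?:[^\]]|\][^\]]|\]\]+[^\]>])*\]*$/

Not the easiest thing to read! Maybe the best thing is to break it
into its component parts, e.g.:

// Backslashes are doubled because these are string literals.
var no_bracket = "[^\\]]";            // any character other than ]
var one_bracket = "\\][^\\]]";        // a single ] not followed by another ]
var two_brackets = "\\]\\]+[^\\]>]";  // two or more ] followed by neither ] nor >
var end_brackets = "\\]*";            // any ] characters left at the very end

var expr = new RegExp("^(?:" + no_bracket + "|" + one_bracket + "|" +
    two_brackets + ")*" + end_brackets + "$");

I'll admit I haven't tested it, but hopefully it gives you an idea!
(JavaScript has no \Z anchor. Without the m flag, $ matches only at the
end of the input, which is what is wanted here.)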

HTH,

Pete.
--
=============================================
Pete Cordell
Tech-Know-Ware Ltd
for XML to C++ data binding visit
http://www.tech-know-ware.com/lmx
http://www.codalogic.com/lmx
(or http://www.xml2cpp.com)
=============================================

Feb 13 '07 #3
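
For comparison, JavaScript regular expressions support negative lookahead, which lets the "no ]]> anywhere" rule be written more directly. The following is a sketch of that alternative technique, not the pattern from the post above; the name CDATA_CONTENT is an assumption made for the example.

```javascript
// Matches a whole string that contains no "]]>".
// Each step consumes either a character other than "]" or a "]" that is
// not the start of the sequence "]]>" (checked with negative lookahead).
var CDATA_CONTENT = /^(?:[^\]]|\](?!\]>))*$/;

console.log(CDATA_CONTENT.test("plain text"));  // true
console.log(CDATA_CONTENT.test("a]]b ]> c]]")); // true
console.log(CDATA_CONTENT.test("a]]]>b"));      // false
```

Because the two alternatives start with different characters, the engine never has more than one way to consume a given character, which keeps matching fast even on long inputs that fail.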
I was thinking more about this overnight. The details of the regular
expression depend on what input string you want to apply the matching
to. If you could give an idea of the types of strings you want to apply
the match to (e.g. a whole XML message, or element text, etc.), it might
be possible to come up with a better pattern.

Pete.


Feb 14 '07 #4
Max
Hello Pete!

I have written this regular expression:

<!\\[CDATA\\[(((?:\\u0009|\\u000A|\\u000D|[\\u0020-\\uD7FF]|[\\uE000-\\uFFFD]|[\\u10000-\\u10FFFF])*?)(]]>(?:\\u0009|\\u000A|\\u000D|[\\u0020-\\uD7FF]|[\\uE000-\\uFFFD]|[\\u10000-\\u10FFFF])*?)*)]]>

I break it into these component parts:

XParser.CHAR =
"(?:\\u0009|\\u000A|\\u000D|[\\u0020-\\uD7FF]|[\\uE000-\\uFFFD]|[\\u10000-\\u10FFFF])";
XParser.CDSTART = "<!\\[CDATA\\[";
XParser.CDATA = "((" + XParser.CHAR + "*?)(]]>" + XParser.CHAR + "*?)*)";
XParser.CDEND = "]]>";
XParser.CDSECT = XParser.CDSTART + XParser.CDATA + XParser.CDEND;

XML code example:

<![CDATA[this child is of <<<>nodeType CDATA]]>

The problem arose when I expanded the simple regular expression for
CDATA ('(" + XParser.CHAR + "*?)') so that it could also capture
additional ']]>' markup.
But written this way it also captures two or more CDSECTs as one match...

Example:
1 Tag: <![CDATA[this child is of <<<>nodeType CDATA]]>
Capture: this child is of <<<>nodeType CDATA

2 Tag: <![CDATA[this child is of <<<>nodeType CDATA]]><![CDATA[this
child is of <<<>nodeType CDATA]]>
Capture: this child is of <<<>nodeType CDATA]]><![CDATA[this child is of
<<<>nodeType CDATA

Is it possible to resolve this?

Thanks in advance,

Max
Feb 14 '07 #5
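
One detail in the CHAR pattern above is worth checking: in JavaScript a \u escape takes exactly four hexadecimal digits, so \u10000 is read as \u1000 followed by a literal 0, and the last character class does not cover the range U+10000 to U+10FFFF. The sketch below shows one possible way to express the XML 1.0 Char production. It assumes an engine that supports the ES2015 u flag and \u{...} escapes, which did not exist when this thread was written; XML_CHAR and isXmlChars are names invented for the example.

```javascript
// "\u10000" is two characters: U+1000 followed by the digit "0".
console.log("\u10000".length); // 2

// XML 1.0 Char:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
// Written as a string literal, hence the doubled backslashes.
var XML_CHAR =
  "[\\u0009\\u000A\\u000D\\u0020-\\uD7FF\\uE000-\\uFFFD\\u{10000}-\\u{10FFFF}]";

// Hypothetical helper: true when every code point in `text` is a legal Char.
// The "u" flag makes the regex operate on code points and enables \u{...}.
function isXmlChars(text) {
  return new RegExp("^" + XML_CHAR + "*$", "u").test(text);
}

console.log(isXmlChars("tab\there"));       // true
console.log(isXmlChars("emoji \u{1F600}")); // true: U+1F600 is above U+FFFF
console.log(isXmlChars("bell \u0007"));     // false: U+0007 is not a legal Char
```

On engines without the u flag, the supplementary range would instead have to be written as a surrogate pair, that is, a character in \uD800-\uDBFF followed by one in \uDC00-\uDFFF.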
This sounds like it's really a Javascript programming question rather
than an XML question, since the question is how to express something in
that language's reg-exp syntax rather than what to express. So you might
get better answers by asking in a Javascript newsgroup than here.

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Feb 14 '07 #6
(After all, most of us just use an existing XML parser and let *it* deal
with syntax.)

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Feb 14 '07 #7
Hi Max,

In this case I think you need to rework your XParser.CDATA rule along
the lines of the following:

// You could write these using a similar approach to your XParser.CHAR
// if you prefer.
var no_bracket = "[^\\]]";
var one_bracket = "\\][^\\]]";
var two_brackets = "\\]\\]+[^\\]>]";

XParser.CDATA = "((?:" + no_bracket + "|" + one_bracket + "|" +
    two_brackets + ")*\\]*)";

The logic is basically:

if( current char is NOT ] ||
    current char is ] AND next char is NOT ] ||
    current chars are two or more ] AND the next one is neither ] nor > )
then OK;

which is more easily understood as:

if( current char is NOT ] ) then OK;
else if( current char is ] AND next char is NOT ] ) then OK;
else if( current chars are two or more ] AND the next one is neither
] nor > ) then OK;

The ]* at the end just allows any number of ] characters immediately
before the closing ]]> if necessary.

HTH,

Pete.

Feb 14 '07 #8
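
To see the effect on Max's second example, here is a small sketch that assembles a complete CDATA-section pattern in the same style and pulls out the content of each section separately. It uses a two-bracket rule of the form ]]+[^\]>] (two or more ] followed by a character that is neither ] nor >) so that runs such as ]]]> are also handled, and it ignores the XML Char restriction for brevity. The constant names and the function extractCData are made up for illustration.

```javascript
// Building blocks, written as string literals (hence the doubled backslashes).
var NO_BRACKET = "[^\\]]";           // any character other than ]
var ONE_BRACKET = "\\][^\\]]";       // one ] followed by something other than ]
var TWO_BRACKETS = "\\]\\]+[^\\]>]"; // two or more ] followed by neither ] nor >

var CDSTART = "<!\\[CDATA\\[";
var CDATA = "((?:" + NO_BRACKET + "|" + ONE_BRACKET + "|" + TWO_BRACKETS + ")*\\]*)";
var CDEND = "\\]\\]>";

// Hypothetical helper: returns the content of every CDATA section in `xml`.
function extractCData(xml) {
  var re = new RegExp(CDSTART + CDATA + CDEND, "g");
  var contents = [];
  var match;
  while ((match = re.exec(xml)) !== null) {
    contents.push(match[1]); // group 1 is the text between the delimiters
  }
  return contents;
}

var section = "<![CDATA[this child is of <<<>nodeType CDATA]]>";
console.log(extractCData(section + section).length); // 2
console.log(extractCData("<![CDATA[a[b[0]]]]>")[0]); // a[b[0]]
```

The content pattern can never step past the first ] of a ]]> terminator, so each match stops at the end of its own section instead of running on into the next one.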
