RegEx to find CFML tags nested in HTML tags

I'm currently developing a tool in Perl to find potential XSS
(Cross Site Scripting) vulnerabilities in a ColdFusion-based web app
and correct them. I've had great success so far, but one scenario
has me banging my head against the wall.

I need a regex to find all <cfoutput ...>...</cfoutput> blocks in a
CFM template that are *not* nested inside an HTML tag itself.
(<cfoutput> blocks located between an opening and a closing tag are
OK.) In other words, a tag that looks like the following:

<link rel="<cfoutput> #directory#/foo.css</cfoutput>">

should be ignored by the regex.

I have tried this a number of ways, but I have not come up with a
solution yet.

The basic RegEx to match the <cfoutput> block is:

$text =~ s/(<cfoutput[^>]*>.*?<\/cfoutput>)/process_cfoutput($1)/sige;

This of course doesn't provide the tag exclusion that I am looking for.

The next RegEx starts at the <cfoutput> block inside the tag and keeps
matching through the first <cfoutput> block it finds that is not nested
in a tag, returning too much data.

$text =~ s/(<cfoutput[^>]*>.*?<\/cfoutput>)(?=[^>]*(<|$))/process_cfoutput($1)/sige;
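The over-match can be reproduced in isolation. The sketch below runs the same pattern as a plain match against a two-line sample; the sample template is an assumption made up for illustration. When the lookahead fails right after the in-tag </cfoutput>, the lazy .*? does not give up. It keeps expanding until it reaches a later </cfoutput> where the lookahead succeeds, so both blocks end up in a single match.

```perl
use strict;
use warnings;

# Hypothetical sample: one <cfoutput> inside a tag, one between tags.
my $text = '<link rel="<cfoutput>#directory#/foo.css</cfoutput>">' . "\n"
         . '<p><cfoutput>#name#</cfoutput></p>';

# Same pattern as the substitution above, used as a plain match.
my @matches;
while ($text =~ /(<cfoutput[^>]*>.*?<\/cfoutput>)(?=[^>]*(<|$))/sig) {
    push @matches, $1;
}

# A single match comes back, running from the <cfoutput> inside the
# <link> tag through the </cfoutput> inside the <p> element.
print scalar(@matches), " match(es)\n";
print "$_\n" for @matches;
```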

From this point forward I tried various lookaround constructs to limit
the scope of the match without any luck. Any help that can be offered
is greatly appreciated.
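
One possible direction, offered as a sketch rather than a tested solution: instead of a lookaround, let the regex also consume whole HTML tags, including any <cfoutput> blocks inside them, and hand those back unchanged. Only blocks matched by the first alternative are processed. The process_cfoutput() below is a hypothetical stand-in that just brackets the block, and rewrite_top_level_cfoutput() and the sample text are assumed names and data for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real process_cfoutput(); it only marks the block.
sub process_cfoutput {
    my ($block) = @_;
    return '[[' . $block . ']]';
}

sub rewrite_top_level_cfoutput {
    my ($text) = @_;
    $text =~ s{
        ( <cfoutput\b[^>]*> .*? </cfoutput> )      # group 1: block outside any tag
      |
        ( < (?!/?cfoutput\b) [a-zA-Z/!]            # group 2: any other tag, which
          (?: <cfoutput\b[^>]*> .*? </cfoutput>    #   may hold cfoutput blocks
            | [^<>]                                #   or ordinary characters
          )*
          > )
    }{ defined $1 ? process_cfoutput($1) : $2 }gsixe;
    return $text;
}

my $sample = '<link rel="<cfoutput>#directory#/foo.css</cfoutput>">' . "\n"
           . '<p><cfoutput>#name#</cfoutput></p>';

# The <link ...> line comes back untouched; only the block between
# <p> and </p> is passed to process_cfoutput().
print rewrite_top_level_cfoutput($sample), "\n";
```

This does not try to handle HTML comments, <script> bodies, or attribute values that contain a literal > outside a <cfoutput> block; a real template set may need those cases covered separately.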
Jul 19 '05 #1

