regex question - can't get unclosed quote marks inside of HTML tags

I do not know much about regex.

I'm worried about lines like this:

<a href="myFile>my file</a>

There is only one quote mark in that html.

I wanted to fix this problem, so I tried this:
function command($string = false) {
    $pattern = '/(.*)<a (.*)"(.*)>/i';
    $replacement = '$1<$2"$3">';
    $newString = preg_replace($pattern, $replacement, $string);
    return $newString;
}

This finds no matches, even when I feed it the line above as a test.

What have I done wrong?

Dec 5 '05 #1
This isn't really a solution to your problem, just a hint about what's
wrong: a quote is also a character, so (.*) in your expression will
also match a quote.
What you want to look into is negative lookarounds: (?<!x)y matches a
'y' not preceded by an 'x', and x(?!y) matches an 'x' not followed by a
'y'.
But if you're not a regexp ninja I'd recommend finding an easier
solution... negative lookarounds aren't trivial.
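
To make the point about (.*) concrete, here is a minimal sketch (an illustration, not code from the original post) that runs the pattern from the question against the test line and shows what each group captured, followed by the two lookaround forms. The sample strings and variable names are made up for the example.

```php
<?php
// Illustrative sketch only: sample strings and variable names are assumptions.

// 1. Greedy (.*) happily runs across quote marks and '>' characters.
$line = '<a href="myFile>my file</a>';
preg_match('/(.*)<a (.*)"(.*)>/i', $line, $groups);
// $groups[1] is ''                  (nothing precedes "<a ")
// $groups[2] is 'href='             (everything up to the last quote it can reach)
// $groups[3] is 'myFile>my file</a' (everything up to the LAST '>')
print_r($groups);

// 2. Negative lookbehind: a 'y' that is not preceded by an 'x'.
preg_match_all('/(?<!x)y/', 'xy ay', $behind, PREG_OFFSET_CAPTURE);
// Only the 'y' at offset 4 qualifies; the one at offset 1 follows an 'x'.
print_r($behind[0]);

// 3. Negative lookahead: an 'x' that is not followed by a 'y'.
preg_match_all('/x(?!y)/', 'xy xa', $ahead, PREG_OFFSET_CAPTURE);
// Only the 'x' at offset 3 qualifies; the one at offset 0 precedes a 'y'.
print_r($ahead[0]);
```

Note that the pattern from the question does match the test line here; the trouble is where the greedy groups end, not that nothing matches.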

Dec 5 '05 #2

I see. So if I have a string like this:

<a href="myfile">myfile</a>

and I feed it to this function:

function command($string = false) {
    $pattern = '/(.*)<a (.*)"(.*)>/i';
    $replacement = '$1<$2"$3">';
    $newString = preg_replace($pattern, $replacement, $string);
    return $newString;
}

I will get a match? But does that mean I'll end up with this:

<a href="myfile"">myfile</a>

With an extra quote mark? That is not the problem I was having.

But you are saying that I need to replace the final (.*) with something
that says "everything but a quote mark"?
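
For what it's worth, "everything but a quote mark" is normally written as a negated character class, [^"]. A rough sketch of the difference, using the test line from the first post; the variable names are invented for this example.

```php
<?php
// Sketch only: $line and the variable names are assumptions for illustration.
$line = '<a href="myFile>my file</a>';

// [^"]* means "any run of characters that are not a double quote".
preg_match('/"([^"]*)/', $line, $noQuote);
// There is no second quote, so the run continues to the end of the line:
// $noQuote[1] is 'myFile>my file</a>'.

// Adding '>' to the class also stops the run at the end of the tag.
preg_match('/"([^">]*)/', $line, $noQuoteOrBracket);
// $noQuoteOrBracket[1] is 'myFile'.

echo $noQuote[1], "\n", $noQuoteOrBracket[1], "\n";
```

Excluding the quote alone is not enough for this input, because the broken tag has no second quote to stop at; excluding '>' as well keeps the match inside the tag.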

Dec 6 '05 #3

So I want this?

function command($string = false) {

    $pattern = '/(.*)<a (.*)"[^"]+>/i';
    $replacement = '$1<a $2"$3">';
    $newString = preg_replace($pattern, $replacement, $string);
    return $newString;
    return $string;
}
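
One thing worth checking with this version: the pattern now has only two capturing groups, so $3 in the replacement refers to nothing, and preg_replace substitutes an empty string for a group that does not exist. A small sketch of the effect, assuming the test line from the first post:

```php
<?php
// Sketch only: the test line is an assumption borrowed from the first post.
$line = '<a href="myFile>my file</a>';

// [^"]+ is not wrapped in parentheses, so the text it matches is not captured.
$pattern     = '/(.*)<a (.*)"[^"]+>/i';
$replacement = '$1<a $2"$3">';   // $3 has no matching group and expands to ''

$result = preg_replace($pattern, $replacement, $line);

// The whole line matched, and everything after the quote was dropped.
echo $result, "\n";   // <a href="">
```

Wrapping the class in parentheses, ([^"]+), would capture the text, but it is still greedy up to the last '>' on the line.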

Dec 6 '05 #4

Thanks for the reply. I just created this file and I uploaded it to my
server for testing:

<?php

echo "hey";
flush();

function command($string = false) {
    echo "hey";
    $pattern = '/(.*)<a (.*)"[^"]+>/i';
    $replacement = '$1<a $2"$3">';
    // $newString = preg_replace($pattern, $replacement, $string);
    return $newString;
}

$string = "<p><a href=\"myfile>myfile</a> ";
echo $string;
flush();
$string = command($string);
echo $string;

?>
When I don't comment out the preg_replace line then I get one "hey"
echoed to the screen and then the script apparently dies without error.
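
A script that stops without any message often means errors are being hidden rather than absent. Below is a minimal sketch of how one might surface them. It assumes a PHP version that provides preg_last_error() (added in PHP 5.2, so later than this thread), and the helper name safe_replace is made up for this example.

```php
<?php
// Show every error and warning in the output while debugging.
// (Assumption: this is a test box; keep display_errors off on a live site.)
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Hypothetical helper: wraps preg_replace and reports PCRE failures.
function safe_replace($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        // preg_replace returns null on error; preg_last_error() gives the code.
        echo "preg_replace failed, PCRE error code: ", preg_last_error(), "\n";
        return $subject;   // fall back to the untouched input
    }
    return $result;
}

$string = "<p><a href=\"myfile>myfile</a> ";
$fixed  = safe_replace('/(.*)<a (.*)"[^"]+>/i', '$1<a $2"$3">', $string);
echo $fixed, "\n";
```

If the call succeeds, the output at least shows what the pattern really did to the line, which is easier to reason about than a blank page.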

Dec 6 '05 #5

Well, okay, for anyone interested, I worked it out and found the
correct pattern is this:

<a ([^">]+)"([^">]+)>
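
For anyone who wants to drop that pattern into the function from the first post, here is one way it might look. This is a sketch, not the poster's final code: the replacement string, the /i flag, and the function name fix_unclosed_quote are assumptions.

```php
<?php
// Hypothetical wrapper around the pattern above.
function fix_unclosed_quote($string)
{
    // Group 1: attribute text before the quote (e.g. href=).
    // Group 2: the value after the quote, up to the closing '>'.
    // Neither group may contain '"' or '>', so a tag that already has
    // both quote marks does not match and is left alone.
    $pattern     = '/<a ([^">]+)"([^">]+)>/i';
    $replacement = '<a $1"$2">';
    return preg_replace($pattern, $replacement, $string);
}

$broken = fix_unclosed_quote('<a href="myFile>my file</a>');
$intact = fix_unclosed_quote('<a href="myfile">myfile</a>');

echo $broken, "\n";   // <a href="myFile">my file</a>
echo $intact, "\n";   // <a href="myfile">myfile</a>
```

As written, this only covers an <a> tag with a single quote mark in it; a tag with several attributes would need a different approach.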

Dec 6 '05 #6
