regex question - can't get unclosed quote marks inside of HTML tags

I do not know much about regex.

I'm worried about lines like this:

<a href="myFile>my file</a>

There is only one quote mark in that html.

I wanted to fix this problem, so I tried this:
function command($string=false) {
    $pattern = '/(.*)<a (.*)"(.*)>/i';
    $replacement = '$1<$2"$3">';
    $newString = preg_replace($pattern, $replacement, $string);
    return $newString;
}
This finds no matches, even when I feed it the line above as a test.

What have I done wrong?

Dec 5 '05 #1
This isn't really a solution to your problem, just a hint about what's
wrong: a quote is also a character, i.e. the (.*) in your expression will
also match a quote.
What you want to look into is negative lookarounds: (?<!x)y matches a
'y' not preceded by an 'x', and x(?!y) matches an 'x' not followed by a
'y'.
But if you're not a regexp ninja, I'd recommend you find an easier
solution... negative lookarounds aren't trivial.
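To make the hint concrete, here is a small sketch showing that the dot matches a quote mark, that a negated character class such as [^"] does not, and how the two negative lookarounds behave. The sample strings and variable names are made up for illustration.

```php
<?php
$tag = '<a href="myFile>';

// (.*) happily swallows the quote mark, because "." matches any
// character except a newline.
preg_match('/<a (.*)>/', $tag, $m);
// $m[1] is: href="myFile

// [^"]* stops in front of the first quote mark.
preg_match('/<a ([^"]*)/', $tag, $n);
// $n[1] is: href=

// Negative lookbehind: (?<!x)y matches a "y" not preceded by an "x".
$lookbehind = preg_replace('/(?<!x)y/', '_', 'xy ay y');
// $lookbehind is: xy a_ _

// Negative lookahead: x(?!y) matches an "x" not followed by a "y".
$lookahead = preg_replace('/x(?!y)/', '_', 'xy xa x');
// $lookahead is: xy _a _
```

For the problem in this thread, the negated character class is probably the simpler tool of the two.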

Dec 5 '05 #2

I see. So if I have a string like this:

<a href="myfile">myfile</a>

and I feed it to this function:

function command($string=false) {
    $pattern = '/(.*)<a (.*)"(.*)>/i';
    $replacement = '$1<$2"$3">';
    $newString = preg_replace($pattern, $replacement, $string);
    return $newString;
}

I will get a match? But does that mean I'll end up with this:

<a href="myfile"">myfile</a>

With an extra quote mark? That is not the problem I was having.

But you are saying that I need to replace the final (.*) with something
that says "everything but a quote mark"?
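For what it's worth, tracing the original pattern against the well-formed link suggests that it does match, but that the extra quote mark does not land right after the href value. A sketch of that check, using only the pattern and replacement posted above (the variable names $m and $result are mine):

```php
<?php
$pattern = '/(.*)<a (.*)"(.*)>/i';
$string = '<a href="myfile">myfile</a>';

// Look at what each group captures.
preg_match($pattern, $string, $m);
// $m[1] is an empty string
// $m[2] is: href="myfile
// $m[3] is: >myfile</a

$result = preg_replace($pattern, '$1<$2"$3">', $string);
// $result is: <href="myfile">myfile</a">
```

All three (.*) groups are greedy, so the last one runs to the end of the line and the added quote mark ends up inside the closing tag. The replacement string also drops the "a " after the "<".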

Dec 6 '05 #3

So I want this?

function command($string=false) {
    $pattern = '/(.*)<a (.*)"[^"]+>/i';
    $replacement = '$1<a $2"$3">';
    $newString = preg_replace($pattern, $replacement, $string);
    return $newString;
}
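One thing that may be worth checking in this version: [^"]+ is not wrapped in parentheses, so the pattern has only two capture groups and $3 in the replacement refers to nothing. preg_replace() substitutes an empty string for a reference to a group that does not exist. A small sketch, using the malformed test line from this thread (the variable names are mine):

```php
<?php
$pattern = '/(.*)<a (.*)"[^"]+>/i';
$string = '<p><a href="myfile>myfile</a> ';

preg_match($pattern, $string, $m);
// count($m) is 3: the whole match plus groups 1 and 2. There is no group 3.
// $m[1] is: <p>
// $m[2] is: href=

$result = preg_replace($pattern, '$1<a $2"$3">', $string);
// $result is '<p><a href=""> ' because the uncaptured [^"]+ swallowed
// everything from "myfile" up to the last ">" and $3 put nothing back.
```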

Dec 6 '05 #4

Thanks for the reply. I just created this file and I uploaded it to my
server for testing:

<?php

echo "hey";
flush();

function command($string=false) {
    echo "hey";
    $pattern = '/(.*)<a (.*)"[^"]+>/i';
    $replacement = '$1<a $2"$3">';
    // $newString = preg_replace($pattern, $replacement, $string);
    return $newString;
}

$string = "<p><a href=\"myfile>myfile</a> ";
echo $string;
flush();
$string = command($string);
echo $string;

?>
When I uncomment the preg_replace line, I get one "hey" echoed to the
screen and then the script apparently dies without an error.
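Nothing in the post shows why the script stops. A generic way to narrow it down, offered as a sketch rather than a diagnosis, is to turn on error display and check what preg_replace() returns: it returns null on failure, and preg_last_error() then gives an error code. The typed signature assumes PHP 7 or later.

```php
<?php
// Show every notice, warning and error instead of failing silently.
error_reporting(E_ALL);
ini_set('display_errors', '1');

function command(string $string): string
{
    $pattern = '/(.*)<a (.*)"[^"]+>/i';
    $replacement = '$1<a $2"$3">';
    $newString = preg_replace($pattern, $replacement, $string);

    // preg_replace() returns null on failure; report the PCRE error code
    // and hand back the input unchanged.
    if ($newString === null) {
        echo 'PCRE error code: ' . preg_last_error() . "\n";
        return $string;
    }
    return $newString;
}

$string = "<p><a href=\"myfile>myfile</a> ";
echo command($string);
```

If preg_last_error() reports PREG_NO_ERROR and the page is still blank, the web server's error log is the next place to look.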

Dec 6 '05 #5

Well, okay, for anyone interested, I worked it out and found the
correct pattern is this:

<a ([^">]+)"([^">]+)>
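The post gives only the pattern, so here is a minimal sketch of how it might be dropped into the earlier function. The function name closeHrefQuote, the delimiters, the /i flag and the replacement string are my assumptions, carried over from the earlier attempts, and the typed signature assumes PHP 7 or later.

```php
<?php
// Hypothetical wrapper around the pattern from the post above.
function closeHrefQuote(string $html): string
{
    // Both groups exclude quote marks and ">", so the pattern only matches
    // an <a ...> tag that contains exactly one quote mark.
    $pattern = '/<a ([^">]+)"([^">]+)>/i';
    $replacement = '<a $1"$2">';

    // preg_replace() returns null on failure; fall back to the input.
    return preg_replace($pattern, $replacement, $html) ?? $html;
}

$broken = '<p><a href="myfile>myfile</a> ';
$fixed = closeHrefQuote($broken);
// $fixed is: <p><a href="myfile">myfile</a> 

$alreadyFine = closeHrefQuote('<a href="myfile">myfile</a>');
// $alreadyFine is unchanged: <a href="myfile">myfile</a>
```

Because the pattern insists on exactly one quote mark inside the tag, a well-formed link is left alone, but so is a tag with several attributes where only the last quote is missing.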

Dec 6 '05 #6
