Help me with a regular expression for PHP

I have no idea where to get help with regular expressions. Since it's
for a PHP app, I thought I would ask here in case there are some RE pros
around. Basically I'm doing some template stuff and want to use
preg_replace_callback to call another function whenever the regular
expression matches, but I have no idea how to accomplish it.

So I start with this:
/<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/

but I need to modify it so that it matches only if the name contains
'{' characters, and does not match otherwise.

So this would not match:
<input name="test">

But this would match:
<input name="test{0}">

Thanks much in advance.

Oct 29 '06 #1
cendrizzi wrote:
> So I start with this:
> /<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/

You'd better not use regular expressions to validate HTML.
The following line is perfectly valid HTML (I think in any version)

<input type="text" name="x><y" id="xy">

> but need to modify it so it only matches if it has '{' characters in
> the name but to not match if it does not.
>
> So this would not match:
> <input name="test">
>
> But this would match:
> <input name="test{0}">

Get the name. Verify it has '{' and '}' (in that order and once only?)

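<?php
$name = get_name('<input name="test{0}">'); // 'test{0}'
if (name_is_valid($name)) {
    // whatever
}

// Placeholder: extracting the name attribute from the HTML is left out.
function get_name($html) {
    return 'test{0}'; // sorry!
}

// Valid only if the name has exactly one '{' and exactly one '}',
// with the '{' before the '}'.
function name_is_valid($name) {
    if (($p1 = strpos($name, '{')) === false) return false; // no '{'
    if (strpos($name, '{', $p1+1) !== false) return false;  // a second '{'
    if (($p2 = strpos($name, '}')) === false) return false; // no '}'
    if (strpos($name, '}', $p2+1) !== false) return false;  // a second '}'
    return $p1 < $p2;                                       // '{' comes first
}
?>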
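
The get_name() stub above could be filled in along these lines. This is
only a sketch: the function name get_name_sketch and the assumption that
the attribute is always double-quoted and preceded by whitespace are
illustrative, not part of the post. Because it reads up to the closing
quote rather than up to the next '>', it also copes with name="x><y".

```php
<?php
// Hypothetical replacement for the get_name() stub.
// Assumes the name attribute is double-quoted and preceded by whitespace.
function get_name_sketch(string $html): ?string
{
    // Capture everything between the quotes, so a '>' inside the value is fine.
    if (preg_match('/\sname\s*=\s*"([^"]*)"/i', $html, $m) === 1) {
        return $m[1];
    }
    return null; // no double-quoted name attribute found
}

echo get_name_sketch('<input name="test{0}">'), "\n";
echo get_name_sketch('<input type="text" name="x><y" id="xy">'), "\n";
```

Single-quoted or unquoted attribute values are not handled here; a real
HTML parser would be the safer choice if those can occur.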

--
I (almost) never check the dodgeit address.
If you *really* need to mail me, use the address in the Reply-To
header with a message in *plain* *text* *without* *attachments*.
Oct 29 '06 #2
It's not for validation. It's for some custom template stuff that
tells my code where to store the value of the form element in the
session. That may not make sense, but it's what I need for my
application. So I use ob_start and the related output-buffering
functions, and run regular expressions against the buffer to manipulate
the HTML or change the behavior of certain elements. I could just get
the name of each element and check it using strpos or strstr for the
'{' character, but I hoped I could use an RE to check for it from the
start so it wouldn't require the extra string searches.

Hope that makes sense, it's always a bit of a challenge to explain
things clearly, especially if the program is quite a big one.
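
A minimal sketch of the setup described above, assuming an output-buffer
callback that runs preg_replace_callback over the page. The function name
rewrite_templated_fields and the "<!-- templated -->" marker are made up
for illustration; the real callback would do the session bookkeeping
instead.

```php
<?php
// Hypothetical output-buffer callback: receives the whole page as a string.
function rewrite_templated_fields(string $buffer): string
{
    // Matches only form elements whose name contains a {...} part.
    $pattern = '/<(?:input|select|textarea)\b[^>]*\sname\s*=\s*"[\w\s]*\{[\w\s]*\}[\w\s]*"[^>]*>/';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Placeholder action: mark the element so the effect is visible.
            return '<!-- templated -->' . $m[0];
        },
        $buffer
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $buffer;
}

ob_start('rewrite_templated_fields');
echo '<input name="test">', "\n";
echo '<input name="test{0}">', "\n";
ob_end_flush(); // the callback runs here, on the buffered output
```

Only the second element is passed to the inner callback; the first one
goes through untouched because its name has no braces.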

Oct 30 '06 #3
cendrizzi top-posted and totally messed it up:
> I hoped I could use RE to check from the start if it had that so it
> wouldn't require the extra string searches.

<?php
// Sample tags: only names with a {...} part made of word characters
// and whitespace should match.
$data = array(
    '<input type="text" name="no!" id="test0">',
    '<input type="text" name="no{!}" id="test0">',
    '<input type="text" name="test0" id="test0">',
    '<input type="text" name="test 0" id="test0">',
    '<input type="text" name="test{0}" id="test0">',
    '<input type="text" name="test {0}" id="test0">',
    '<input type="text" name="test{0}test" id="test0">',
    '<input type="text" name="test {0} test" id="test0">',
);
$rx = '/<(input|select|textarea)[^>]*' .
    # 'name\s*\=\s*\"[_a-zA-Z0-9\s]*\"' . // your original version
    'name\s*\=\s*\"[_a-zA-Z0-9\s]*{[_a-zA-Z0-9\s]*}[_a-zA-Z0-9\s]*\"' .
    #                          ---^---         ---^---
    '[^>]*>/';
### I think there's a few \ too many in there,
### I didn't look at it very attentively

foreach ($data as $val) {
    echo $val, ' :: ';
    if (preg_match($rx, $val)) {
        echo 'M';
    } else {
        echo 'No m';
    }
    echo "atch.\n";
}
?>

Oct 30 '06 #4

Pedro Graca wrote:
> The following line is perfectly valid HTML (I think in any version)
>
> <input type="text" name="x><y" id="xy">

I would have to disagree
<input type="text" name="x is invalid: no closing quote around
name value
<y" id="xy" is invalid. y" isn't a valid cname (only
alphanumeric?)

if you want 'x><y' as a value you'd need to use name="x&gt;&lt;y"

Oct 30 '06 #5
I had a similar RE problem and never figured it out or found an
answer. I basically ended up either using two callbacks, or doing the
second check (does it contain "x") inside the first callback:

Capture all name values and send them to the first callback (whether or
not they contain the '{'), then check inside that callback whether the
name value contains '{'.
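
Roughly, the single-callback variant looks like the sketch below. The
names handle_name_sketch and the "[templated:...]" replacement are
placeholders for illustration only, and the pattern assumes
double-quoted name values with no '>' before the name attribute.

```php
<?php
// Hypothetical callback: $m[0] is the whole tag, $m[1] the name value.
function handle_name_sketch(array $m): string
{
    // Second check done inside the callback: leave plain names alone.
    if (strpos($m[1], '{') === false) {
        return $m[0];
    }
    // Placeholder for whatever should happen to templated names.
    return '[templated:' . $m[1] . ']';
}

// Matches every input/select/textarea with a name, braces or not.
$pattern = '/<(?:input|select|textarea)\b[^>]*\sname\s*=\s*"([^"]*)"[^>]*>/';

$html = '<input name="test"><input name="test{0}">';
$result = preg_replace_callback($pattern, 'handle_name_sketch', $html);

echo $result, "\n";
```

The cost is one strpos call per matched tag, which is the extra string
search the original question hoped to avoid.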

Oct 30 '06 #6

cendrizzi wrote:
> I have no idea where to get help on RE stuff. Since it's for a PHP app
> I thought I would ask here to see if there was some RE pros. Basically
> I'm doing some template stuff and I wanted to use a
> preg_replace_callback function to call another function when the
> criteria of the RE expression is matched but have no idea how to
> accomplish it.
>
> So I start with this:
> /<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/
>
> but need to modify it so it only matches if it has '{' characters in
> the name but to not match if it does not.
>
> So this would not match:
> <input name="test">
>
> But this would match:
> <input name="test{0}">
>
> Thanks much in advance.

Well, just change the [_a-zA-Z0-9\s]* part to [\w\s]*{[\w\s]*}. Of
course, you'll need to do proper capturing in order to form the
replacement string.

\w is equivalent to [_a-zA-Z0-9] by the way.

Oct 30 '06 #7
No, I didn't know that \w was the same. What do you mean by proper
capturing? I really am a 2 year old when it comes to RE stuff.

Thanks!

Oct 30 '06 #8
BKDotCom:
> Pedro Graca wrote:
>> The following line is perfectly valid HTML (I think in any version)
>>
>> <input type="text" name="x><y" id="xy">

Yes, yes it is. In any version.

> I would have to disagree

Run it through a validator. You'll find it's valid.

The 'name' attribute is defined as CDATA, so pretty much anything goes
if the attribute value is quoted, including literal less-than and
greater-than signs.

> <input type="text" name="x is invalid: no closing quote around
> name value

Yes, as a start-tag _in itself_. That wasn't Pedro's example though;
his example was the whole

| <input type="text" name="x><y" id="xy">

> <y" id="xy" is invalid. y" isn't a valid cname

As a tag in itself, it is invalid HTML, yes. It isn't invalid as part
of the example above.

> (only alphanumeric?)

Generic identifiers (aka, element type names) must begin with upper- or
lowercase letters.

> if you want 'x><y' as a value you'd need to use name="x&gt;&lt;y"

No. You only need to replace '<' and '>' with references where they
would be understood as something other than character data.

--
Jock

Oct 30 '06 #9
Chung Leong wrote:
> \w is equivalent to [_a-zA-Z0-9] by the way.

It is /almost/ equivalent:

~$ php -r 'echo (preg_match("/^\w+$/", "Graça"))?("yes"):("no"), "\n";'
yes
~$ php -r 'echo (preg_match("/^[_a-zA-Z0-9]+$/", "Graça"))?("yes"):("no"), "\n";'
no
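
What \w matches outside ASCII depends on PCRE's character tables, which
can vary with the locale, whereas the explicit class is always
ASCII-only. A small sketch of making the behaviour explicit with the /u
modifier, assuming the string is UTF-8 and PCRE was built with Unicode
support (neither is stated in the post):

```php
<?php
$word = "Gra\u{00E7}a"; // "Graça" as UTF-8 (assumed encoding)

// The explicit ASCII class never matches the non-ASCII letter.
$asciiOnly = preg_match('/^[_a-zA-Z0-9]+$/', $word);

// With /u, pattern and subject are treated as UTF-8 and, in PHP,
// \w also covers non-ASCII letters.
$unicodeWord = preg_match('/^\w+$/u', $word);

echo $asciiOnly === 1 ? "yes" : "no", "\n";
echo $unicodeWord === 1 ? "yes" : "no", "\n";
```

If only ASCII names should be accepted, spelling out [_a-zA-Z0-9] is the
more predictable choice.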

Oct 30 '06 #10
BKDotCom wrote:
> Pedro Graca wrote:
>> The following line is perfectly valid HTML (I think in any version)
>>
>> <input type="text" name="x><y" id="xy">
>
> I would have to disagree
> <input type="text" name="x is invalid: no closing quote around
> name value
> <y" id="xy" is invalid. y" isn't a valid cname (only
> alphanumeric?)
>
> if you want 'x><y' as a value you'd need to use name="x&gt;&lt;y"

Actually, it is legal. name="x><y" is a perfectly valid tag and value.
&lt; and &gt; aren't required here because they are within a quoted
string in a tag.

You do need &lt; and &gt; in plain text, however, when they may be
mistaken for the start/end of a tag.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Oct 30 '06 #11

cendrizzi wrote:
> No I didn't know that \w was the same. What do you mean by proper
> capturing. I really am a 2 year old when it comes to RE stuff.
>
> Thanks!

By that I mean you need to grab the substrings that precede and
follow the text inside the quotation marks. If the input is

<input name="test{0}" size="40">

you'd want

<input name="

and

" size="40">

so that you can form the replacement <input name=" + DATA + "
size="40">.

Presumably you'd want 'test' and '0' as well for looking up the data.
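
A sketch of that capturing with preg_replace_callback. The $fields array
is a made-up stand-in for wherever the data really lives, and the
pattern assumes a double-quoted name of the form word{word}.

```php
<?php
// Hypothetical lookup table standing in for the real data source.
$fields = ['test' => ['first value']];

// Group 1: everything up to and including the opening quote of the name.
// Groups 2 and 3: the two parts of the name, e.g. 'test' and '0'.
// Group 4: the closing quote and the rest of the tag.
$pattern = '/(<(?:input|select|textarea)\b[^>]*\sname\s*=\s*")(\w+)\{(\w+)\}("[^>]*>)/';

$html = '<input name="test{0}" size="40">';

$out = preg_replace_callback(
    $pattern,
    function (array $m) use ($fields): string {
        // DATA: look up the value; fall back to an empty string if missing.
        $data = $fields[$m[2]][$m[3]] ?? '';
        // Rebuild the tag as prefix + DATA + suffix.
        return $m[1] . htmlspecialchars($data) . $m[4];
    },
    $html
);

echo $out, "\n";
```

Tags whose name has no braces do not match the pattern at all, so they
pass through unchanged.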

Oct 30 '06 #12

John Dunlop wrote:
> Run it through a validator. You'll find it's valid.

Will I?

http://validator.w3.org/check
Warning character "<" is the first character of a delimiter but
occurred as data
This message may appear in several cases:
* You tried to include the "<" character in your page: you should
escape it as "&lt;"
* You used an unescaped ampersand "&": this may be valid in some
contexts, but it is recommended to use "&amp;", which is always safe.
* Another possibility is that you forgot to close quotes in a
previous tag.

Oct 30 '06 #13
On 30 Oct 2006 09:02:18 -0800, "BKDotCom" <bk***********@yahoo.com> wrote:
> John Dunlop wrote:
>> Run it through a validator. You'll find it's valid.
>
> Will I?

You certainly should. I've just tried it against the W3C validator, and it
agreed it's valid.

> http://validator.w3.org/check
> Warning character "<" is the first character of a delimiter but
> occurred as data
> This message may appear in several cases:
> * You tried to include the "<" character in your page: you should
> escape it as "&lt;"
> * You used an unescaped ampersand "&": this may be valid in some
> contexts, but it is recommended to use "&amp;", which is always safe.
> * Another possibility is that you forgot to close quotes in a
> previous tag.

Result: Passed validation
File: test.html
Encoding: iso-8859-1
Doctype: HTML 4.01 Transitional
This Page Is Valid HTML 4.01 Transitional!

Here's what I uploaded:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>test</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<form method="post" action="test.php">
<input type="text" name="x><y" id="xy">
</form>
</body>
</html>

(the <meta> being there because I validated it by upload rather than from a
real site that would have sent the relevant HTTP header instead)

What did you upload?

--
Andy Hassall :: an**@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool
Oct 30 '06 #14
