Help me with a regular expression for PHP

I have no idea where to get help on RE stuff. Since it's for a PHP app
I thought I would ask here to see if there were some RE pros. Basically
I'm doing some template stuff and I want to use
preg_replace_callback to call another function whenever the
RE matches, but I have no idea how to
accomplish it.

So I start with this:
/<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/

but I need to modify it so that it only matches when the name contains
'{' characters, and does not match otherwise.

So this would not match:
<input name="test">

But this would match:
<input name="test{0}">

Thanks much in advance.

Oct 29 '06 #1
cendrizzi wrote:
> So I start with this:
> /<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/

You'd better not use regular expressions to validate HTML.
The following line is perfectly valid HTML (I think in any version)

<input type="text" name="x><y" id="xy">

> but need to modify it so it only matches if it has '{' characters in
> the name but to not match if it does not.
>
> So this would not match:
> <input name="test">
>
> But this would match:
> <input name="test{0}">

Get the name. Verify it has '{' and '}' (in that order and once only?)

<?php
$name = get_name('<input name="test{0}">'); // 'test{0}'
if (name_is_valid($name)) {
    // whatever
}

// Placeholder: extracting the name from the tag is not implemented here.
function get_name($html) {
    return 'test{0}'; // sorry!
}

// Valid means: exactly one '{', exactly one '}', and the '{' comes first.
function name_is_valid($name) {
    if (($p1 = strpos($name, '{')) === false) return false;  // no '{' at all
    if (strpos($name, '{', $p1+1) !== false) return false;   // a second '{'
    if (($p2 = strpos($name, '}')) === false) return false;  // no '}' at all
    if (strpos($name, '}', $p2+1) !== false) return false;   // a second '}'
    return $p1 < $p2;
}
?>
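
The get_name() stub above could be filled in without any regular expression by letting an HTML parser find the attribute. What follows is only a rough sketch, assuming the DOM extension is available; the function name get_name_dom is made up for this illustration and is not part of the post above.

```php
<?php
// Hypothetical helper: returns the name attribute of the first <input>
// found in $html, or null when there is no <input> or no name attribute.
function get_name_dom(string $html): ?string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc->loadHTML($html);                        // $html must not be empty
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $inputs = $doc->getElementsByTagName('input');
    if ($inputs->length === 0) {
        return null;
    }
    $field = $inputs->item(0);
    return $field->hasAttribute('name') ? $field->getAttribute('name') : null;
}

var_dump(get_name_dom('<input name="test{0}">'));
```

The returned string can then be handed to name_is_valid() as in the post. A parser is slower than a single regular expression, so this is a trade of speed for robustness.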

--
I (almost) never check the dodgeit address.
If you *really* need to mail me, use the address in the Reply-To
header with a message in *plain* *text* *without* *attachments*.
Oct 29 '06 #2
It's not for validation. It's for some custom template stuff that
tells my code where to store the value of the form element in the
session. That may not make sense but it's what I need for my
application. So I use ob_start and the related output-buffering
functions and run regular expressions against the buffer to manipulate
the HTML or change the behavior of certain elements. I could just get
the name of each element and check it with strpos or strstr for the '{'
character, but I hoped the RE could check for that from the start so it
wouldn't require the extra string searches.

Hope that makes sense, it's always a bit of a challenge to explain
things clearly, especially if the program is quite a big one.
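
The output-buffering setup described in this post might look roughly like the sketch below. It is an illustration only: the function names tag_session_fields and render_filtered and the data-session-key attribute are invented here and are not taken from the poster's application.

```php
<?php
// Hypothetical filter: every form field whose name contains "{...}" gets an
// extra data-session-key attribute; all other markup is left untouched.
function tag_session_fields(string $html): string
{
    $rx = '/<(input|select|textarea)\b([^>]*\bname\s*=\s*"([\w\s]*\{[\w\s]*\}[\w\s]*)"[^>]*)>/';
    return preg_replace_callback($rx, function (array $m): string {
        // $m[1] = tag name, $m[2] = all attributes, $m[3] = value of name
        return '<' . $m[1] . ' data-session-key="' . $m[3] . '"' . $m[2] . '>';
    }, $html);
}

// Runs $render with the filter installed as an output-buffer callback
// and returns the filtered output as a string.
function render_filtered(callable $render): string
{
    ob_start();                      // outer buffer collects the filtered output
    ob_start('tag_session_fields');  // inner buffer is passed through the filter
    $render();
    ob_end_flush();                  // callback runs, result lands in the outer buffer
    return ob_get_clean();
}

echo render_filtered(function () {
    echo '<input name="test"> <input name="test{0}">';
}), "\n";
```

Only tags whose name contains the braces reach the callback, so no second strpos check is needed inside it.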

On Oct 29, 4:17 pm, Pedro Graca <hex...@dodgeit.com> wrote:
> You'd better not use regular expressions to validate HTML.
>
> Get the name. Verify it has '{' and '}' (in that order and once only?)
Oct 30 '06 #3
cendrizzi top-posted and totally messed it up:
> I hoped I could use RE to check from the start if it had that so it
> wouldn't require the extra string searches.

<?php
// Sample tags: only names that contain a {...} group should match.
$data = array(
    '<input type="text" name="no!" id="test0">',
    '<input type="text" name="no{!}" id="test0">',
    '<input type="text" name="test0" id="test0">',
    '<input type="text" name="test 0" id="test0">',
    '<input type="text" name="test{0}" id="test0">',
    '<input type="text" name="test {0}" id="test0">',
    '<input type="text" name="test{0}test" id="test0">',
    '<input type="text" name="test {0} test" id="test0">',
);
$rx = '/<(input|select|textarea)[^>]*' .
      # 'name\s*\=\s*\"[_a-zA-Z0-9\s]*\"' . // your original version
      'name\s*\=\s*\"[_a-zA-Z0-9\s]*{[_a-zA-Z0-9\s]*}[_a-zA-Z0-9\s]*\"' .
      #                          ---^---         ---^---
      '[^>]*>/';
### I think there's a few \ too many in there,
### I didn't look at it very attentively

// Print each sample followed by "Match." or "No match."
foreach ($data as $val) {
    echo $val, ' :: ';
    if (preg_match($rx, $val)) {
        echo 'M';
    } else {
        echo 'No m';
    }
    echo "atch.\n";
}
?>

Oct 30 '06 #4

Pedro Graca wrote:
> The following line is perfectly valid HTML (I think in any version)
>
> <input type="text" name="x><y" id="xy">

I would have to disagree
<input type="text" name="x is invalid: no closing quote around
name value
<y" id="xy" is invalid. y" isn't a valid cname (only
alphanumeric?)

if you want 'x><y' as a value you'd need to use name="x&gt;&lt;y"

Oct 30 '06 #5
I had a similar RE problem and never figured it out, or found an
answer. I basically ended up using two callbacks, or doing the second
check (does it contain "x") inside the first callback:

Capture all name values and send them to the first callback (whether or
not they contain the '{'), then check inside that callback whether the
name value contains '{'.

Oct 30 '06 #6

cendrizzi wrote:
> So I start with this:
> /<(input|select|textarea)[^>]*name\s*\=\s*\"[_a-zA-Z0-9\s]*\"[^>]*>/
>
> but need to modify it so it only matches if it has '{' characters in
> the name but to not match if it does not.

Well, just change the [_a-zA-Z0-9\s]* part to [\w\s]*{[\w\s]*}. Of
course, you'll need to do proper capturing in order to form the
replacement string.

\w is equivalent to [_a-zA-Z0-9] by the way.
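
For what "proper capturing" can mean in practice, here is a minimal sketch. The rewrite it performs (turning name="test{0}" into name="test[0]") and the callback name braces_to_brackets are invented for the example; the point is only that each part of the tag the replacement needs gets its own pair of parentheses.

```php
<?php
// Capture the pieces that the replacement has to reuse:
//   1 = everything from "<tag" up to the opening quote of name
//   2 = text before "{", 3 = text inside the braces, 4 = text after "}"
//   5 = the closing quote and the rest of the tag
$rx = '/(<(?:input|select|textarea)\b[^>]*\bname\s*=\s*")([\w\s]*)\{([\w\s]*)\}([\w\s]*)("[^>]*>)/';

// Hypothetical callback: rebuilds the tag from the captured pieces.
function braces_to_brackets(array $m): string
{
    return $m[1] . $m[2] . '[' . $m[3] . ']' . $m[4] . $m[5];
}

$html = '<input name="test"> <input type="text" name="test{0}" id="t0">';
$out  = preg_replace_callback($rx, 'braces_to_brackets', $html);
echo $out, "\n";
// prints: <input name="test"> <input type="text" name="test[0]" id="t0">
```

The first tag has no braces in its name, so the pattern does not match it and it passes through unchanged.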

Oct 30 '06 #7
No, I didn't know that \w was the same. What do you mean by proper
capturing? I really am a 2 year old when it comes to RE stuff.

Thanks!

On Oct 29, 10:04 pm, "Chung Leong" <chernyshev...@hotmail.com> wrote:
> Well, just change the [_a-zA-Z0-9\s]* part to [\w\s]*{[\w\s]*}. Of
> course, you'll need to do proper capturing in order to form the
> replacement string.
>
> \w is equivalent to [_a-zA-Z0-9] by the way.
Oct 30 '06 #8
BKDotCom:
> Pedro Graca wrote:
>> The following line is perfectly valid HTML (I think in any version)
>>
>> <input type="text" name="x><y" id="xy">

Yes, yes it is. In any version.

> I would have to disagree

Run it through a validator. You'll find it's valid.

The 'name' attribute is defined as CDATA, so pretty much anything goes
if the attribute value is quoted, including literal less-than and
greater-than signs.

> <input type="text" name="x is invalid: no closing quote around
> name value

Yes, as a start-tag _in itself_. That wasn't Pedro's example though;
his example was the whole

| <input type="text" name="x><y" id="xy">

> <y" id="xy" is invalid. y" isn't a valid cname

As a tag in itself, it is invalid HTML, yes. It isn't invalid as part
of the example above.

> (only alphanumeric?)

Generic identifiers (aka, element type names) must begin with upper- or
lowercase letters.

> if you want 'x><y' as a value you'd need to use name="x&gt;&lt;y"

No. You only need to replace '<' and '>' with references where they
would be understood as something other than character data.
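
The practical consequence for the regular expression in this thread can be seen with a short test. This is a sketch only; the second, quote-aware pattern is one possible workaround and was not proposed by anyone in the thread.

```php
<?php
$tag = '<input type="text" name="x><y" id="xy">';

// [^>]* cannot look inside quotes, so the match ends at the first ">".
preg_match('/<input[^>]*>/', $tag, $naive);
echo $naive[0], "\n";   // prints: <input type="text" name="x>

// Treating quoted attribute values as units lets the match reach the real end.
preg_match('/<input(?:[^>"\']|"[^"]*"|\'[^\']*\')*>/', $tag, $aware);
echo $aware[0], "\n";   // prints: <input type="text" name="x><y" id="xy">
```

Even the second pattern still ignores comments, scripts and other places where tag-like text can appear, which is why a parser is the safer tool for arbitrary HTML.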

--
Jock

Oct 30 '06 #9
Chung Leong wrote:
> \w is equivalent to [_a-zA-Z0-9] by the way.

It is /almost/ equivalent:

~$ php -r 'echo (preg_match("/^\w+$/", "Graça"))?("yes"):("no"), "\n";'
yes
~$ php -r 'echo (preg_match("/^[_a-zA-Z0-9]+$/", "Graça"))?("yes"):("no"), "\n";'
no
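
Whether \w accepts the "ç" without further options depends on the locale and on how the string is encoded, so the "yes" above may not reproduce everywhere. A small sketch of the UTF-8 case, assuming PHP 7 or later (for the \u{...} escape) and a PCRE build with Unicode support:

```php
<?php
// "Graça", written with an escape so the file encoding does not matter.
$word = "Gra\u{e7}a";

// With /u the subject is treated as UTF-8 and \w follows Unicode properties.
var_dump(preg_match('/^\w+$/u', $word));            // int(1)

// The explicit ASCII class never accepts the "ç".
var_dump(preg_match('/^[_a-zA-Z0-9]+$/u', $word));  // int(0)

// Without /u the result for \w depends on locale and byte encoding.
var_dump(preg_match('/^\w+$/', $word));
```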

Oct 30 '06 #10
