preg_match doesn't work properly!?

I might have found a problem with how preg_match works though I'm not
sure.
Let's say you have a regular expression that you want to match a string
of numbers. You might write the code like this:
preg_match( '/^[0-9]+$/', $TestString );

OK everything seems fine. However, did you know if you pass the
following to preg_match: "12345\n" it will return that a match
occurred?!? Even though the newline is not a valid character in our
regular expression.

Here is the test program, *please run the program as written below*:

<?php
$TestString = "12345\n";
print preg_match( '/^[0-9]+$/', $TestString );
?>

You will find it prints 1 even though the newline character isn't a
valid part of our regular expression. What other characters I wonder
can be put in a regular expression and have the string match!? Any
ideas on this? Why is this undocumented behavior present in PHP?!?
For regular expressions to not work as expected or documented seems
like a pretty serious bug in PHP. I don't think there is a problem
with the regular expression.

Thoughts?
Jun 2 '08 #1
I found this link about the topic:
http://blog.php-security.org/archive...h-filters.html

Apparently '$' isn't the end of the string unless you add the 'D' to
the end as in:
print preg_match( '/^[0-9]+$/D', $TestString );
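
A minimal sketch of the difference the D modifier makes, assuming PHP 7 or later for the type declarations. The helper name is_digits_only is hypothetical and does not come from the thread or from PHP itself:

```php
<?php
// Hypothetical helper (not part of PHP): true only if $s consists of
// one or more ASCII digits and nothing else.
function is_digits_only(string $s): bool
{
    // D (PCRE_DOLLAR_ENDONLY): $ matches only at the very end of the subject.
    return preg_match('/^[0-9]+$/D', $s) === 1;
}

var_dump(preg_match('/^[0-9]+$/', "12345\n")); // int(1): $ also matches before a final newline
var_dump(is_digits_only("12345\n"));           // bool(false)
var_dump(is_digits_only("12345"));             // bool(true)
```

Comparing the result against 1 keeps the helper strict, since preg_match can also return false when the pattern itself fails.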

The page says 'even documented in the PHP manual is that $...', but
I looked at the preg_match page on php.net and found no mention of
this or of the /D switch. Any ideas what the author was referring
to?

I am new to PHP but I would certainly consider this a 'gotcha'
especially since it is relatively undocumented.
Jun 2 '08 #2
ch************* ****@yahoo.com wrote:
> I might have found a problem with how preg_match works though I'm not
> sure.
> Let's say you have a regular expression that you want to match a string
> of numbers. You might write the code like this:
> preg_match( '/^[0-9]+$/', $TestString );
>
> OK everything seems fine. However, did you know if you pass the
> following to preg_match: "12345\n" it will return that a match
> occurred?!? Even though the newline is not a valid character in our
> regular expression.
>
> Here is the test program, *please run the program as written below*:
>
> <?php
> $TestString = "12345\n";
> print preg_match( '/^[0-9]+$/', $TestString );
> ?>
>
> You will find it prints 1 even though the newline character isn't a
> valid part of our regular expression. What other characters I wonder
> can be put in a regular expression and have the string match!? Any
> ideas on this? Why is this undocumented behavior present in PHP?!?
> For regular expressions to not work as expected or documented seems
> like a pretty serious bug in PHP. I don't think there is a problem
> with the regular expression.
>
> Thoughts?

'/^[0-9]+$/D'

http://nl2.php.net/manual/en/referen....modifiers.php
D (PCRE_DOLLAR_ENDONLY)
If this modifier is set, a dollar metacharacter in the pattern matches
only at the end of the subject string. Without this modifier, a dollar
also matches immediately before the final character if it is a newline
(but not before any other newlines). This modifier is ignored if m
modifier is set. There is no equivalent to this modifier in Perl.

Yes, I also think this is weird. If I want to match for newlines, I'll
match for newlines :).
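
The manual text quoted above can be checked with a short probe. This is only a sketch: the subjects and their labels are made up for illustration, and the pattern is the one from the original post without the D modifier:

```php
<?php
// Illustrative subjects; the array keys are display labels only.
$subjects = [
    'no trailing character' => "12345",
    'one trailing newline'  => "12345\n",
    'two trailing newlines' => "12345\n\n",
    'trailing space'        => "12345 ",
    'newline in the middle' => "123\n45",
];

$results = [];
foreach ($subjects as $label => $subject) {
    // Without D, $ matches at the very end or just before a final newline.
    $results[$label] = preg_match('/^[0-9]+$/', $subject);
    printf("%-22s => %d\n", $label, $results[$label]);
}
```

Only the first two subjects match, so a single trailing newline is the one extra character this pattern lets through.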
--
Rik Wasmus
....spamrun finished
Jun 2 '08 #3
ch************* ****@yahoo.com wrote:
> I might have found a problem with how preg_match works though I'm not
> sure.
> Let's say you have a regular expression that you want to match a string
> of numbers. You might write the code like this:
> preg_match( '/^[0-9]+$/', $TestString );
>
> OK everything seems fine. However, did you know if you pass the
> following to preg_match: "12345\n" it will return that a match
> occurred?!? Even though the newline is not a valid character in our
> regular expression.

Yes, I did, but only because that's what it says in the manual:

D (PCRE_DOLLAR_ENDONLY)

If this modifier is set, a dollar metacharacter in the pattern matches only
at the end of the subject string. Without this modifier, a dollar also
matches immediately before the final character if it is a newline (but not
before any other newlines). This modifier is ignored if m modifier is set.
There is no equivalent to this modifier in Perl.

> Here is the test program, *please run the program as written below*:
>
> <?php
> $TestString = "12345\n";
> print preg_match( '/^[0-9]+$/', $TestString );
> ?>
>
> You will find it prints 1 even though the newline character isn't a
> valid part of our regular expression. What other characters I wonder
> can be put in a regular expression and have the string match!? Any
> ideas on this? Why is this undocumented behavior present in PHP?!?

It isn't since it is documented.

> For regular expressions to not work as expected or documented seems
> like a pretty serious bug in PHP.

If this was the case then I would agree. However since the cause is not that
it is not in the documentation, but simply that you did not read it in the
documentation.. ...

> I don't think there is a problem
> with the regular expression.

Neither do I.
Jun 2 '08 #4
In our last episode,
<15************ *************** *******@k30g2000hse.googlegroups.com>, the
lovely and talented ch************* ****@yahoo.com broadcast on
comp.lang.php:
> I might have found a problem with how preg_match works though I'm not
> sure. Let's say you have a regular expression that you want to match a
> string of numbers. You might write the code like this: preg_match(
> '/^[0-9]+$/', $TestString );
> OK everything seems fine. However, did you know if you pass the
> following to preg_match: "12345\n" it will return that a match
> occurred?!?

Right, because it did.

> Even though the newline is not a valid character in our regular
> expression.

Doesn't matter. The whole expression matches before the newline.

> Here is the test program, *please run the program as written below*:
> <?php
> $TestString = "12345\n";
> print preg_match( '/^[0-9]+$/', $TestString );
> ?>
> You will find it prints 1 even though the newline character isn't a
> valid part of our regular expression.

It returns 1 (a match exists) because all of the pattern is found
in $TestString. That is how perl regular expressions work.

preg_match('/dog/','catisnotadogbubba')

matches because all of 'dog' is in 'catisnotadogbubba'.

> What other characters I wonder can be put in a regular expression and have
> the string match!?

You can put just about anything in if the pattern matches some part of the
string.

> Any ideas on this? Why is this undocumented behavior present in PHP?!?

Of course it is not undocumented. The manual page makes it perfectly clear
what a match consists of.

> For regular expressions to not work as expected or documented seems
> like a pretty serious bug in PHP. I don't think there is a problem
> with the regular expression.

There isn't. There is a serious problem in your understanding of what a
match is --- or possibly what $ means in a perl regular expression. You
do know the p in preg_match means perl.

> Thoughts?

man perlre

--
Lars Eighner <http://larseighner.com/> us****@larseighner.com
Countdown: 237 days to go.
Jun 2 '08 #5
Lars Eighner wrote:
> There isn't. There is a serious problem in your understanding of what a
> match is --- or possibly what $ means in a perl regular expression. You
> do know the p in preg_match means perl.

First, we're not talking about Perl, but PHP function "preg_replace",
which uses PCRE syntax, and not Perl syntax.

Second, PCRE (just like Perl actually O_o) defines ^ and $ as being
start and end of string/line (cf.
http://www.pcre.org/pcre.txt "PCRE_MULTILINE") (Perl defines them as
start/end of string and start/end of line if used with /m).
POSIX doesn't define them, but that's not the point here.

Pattern ^[0-9]+$ should not match, because in "12345\n" there is a "\n"
between the last number and the end of string, basically "between the
plus and the dollar".
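
For reference, a small sketch of how the m modifier (PCRE_MULTILINE) changes what ^ and $ mean in preg_match. The subject string and variable names are invented for illustration:

```php
<?php
// Two lines: digits on the first, letters on the second.
$subject = "12345\nabc\n";

// Without m, ^ and $ are tied to the start and end of the whole subject
// ($ may also sit before a final newline), so the "abc" line blocks the match.
$plain = preg_match('/^[0-9]+$/', $subject);

// With m, ^ and $ also match after and before every internal newline,
// so the first line on its own is enough.
$multi = preg_match('/^[0-9]+$/m', $subject);

// D is ignored when m is set, so adding it changes nothing here.
$multiD = preg_match('/^[0-9]+$/mD', $subject);

printf("plain=%d multi=%d multiD=%d\n", $plain, $multi, $multiD);
```

This prints plain=0 multi=1 multiD=1, which is why validating a whole string with the m modifier is usually a mistake.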

Regards,
--
Guillaume
Jun 2 '08 #6
On Tue, 27 May 2008 18:47:07 +0200, Lars Eighner <us****@larseighner.com>
wrote:
> In our last episode,
> <15************ *************** *******@k30g2000hse.googlegroups.com>, the
> lovely and talented ch************* ****@yahoo.com broadcast on
> comp.lang.php:
>> I might have found a problem with how preg_match works though I'm not
>> sure. Let's say you have a regular expression that you want to match a
>> string of numbers. You might write the code like this: preg_match(
>> '/^[0-9]+$/', $TestString );
>> OK everything seems fine. However, did you know if you pass the
>> following to preg_match: "12345\n" it will return that a match
>> occurred?!?
>
> Right, because it did.
>
>> Even though the newline is not a valid character in our regular
>> expression.
>
> Doesn't matter. The whole expression matches before the newline.
>
>> Here is the test program, *please run the program as written below*:
>> <?php
>> $TestString = "12345\n";
>> print preg_match( '/^[0-9]+$/', $TestString );
>> ?>
>> You will find it prints 1 even though the newline character isn't a
>> valid part of our regular expression.
>
> It returns 1 (a match exists) because all of the pattern is found
> in $TestString. That is how perl regular expressions work.
>
> preg_match('/dog/','catisnotadogbubba')
> <SNIPPED more>

With all due respect, you're talking nonsense. You apparently missed that
the match is anchored to the start & end of string. Nothing of your story
has any relevance to the op's problem (which he already googled & solved
himself just before I answered him :) ).
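
The point about anchoring can be seen in a short sketch that runs the same subjects with and without ^ and $. The second subject is invented for illustration:

```php
<?php
$subject = 'catisnotadogbubba';

// Unanchored: the pattern may match anywhere inside the subject.
$unanchored = preg_match('/dog/', $subject);

// Anchored: the whole subject has to be "dog" (optionally plus a final newline).
$anchored = preg_match('/^dog$/', $subject);

// Same contrast with the digit pattern from the original post.
$digitsUnanchored = preg_match('/[0-9]+/', "abc12345\nxyz");
$digitsAnchored   = preg_match('/^[0-9]+$/', "abc12345\nxyz");

printf("%d %d %d %d\n", $unanchored, $anchored, $digitsUnanchored, $digitsAnchored);
```

This prints 1 0 1 0: the surrounding text only stops the anchored patterns, so the "dog" example says nothing about why "12345\n" matches.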
--
Rik Wasmus
....spamrun finished
Jun 2 '08 #7
> You do know the p in preg_match means perl.

Well I come from a Perl background and that's where the original
misunderstanding came from. Assuming preg_match operated like a Perl
regular expression (how stupid could I be?) in a function named after
Perl...

I now submit that preg_match should really be named
klpbnratagybrtdcidreg_match which stands for:
"Kinda Like Perl But Not Really There Are Gotchas You Better Read The
Documentation In Detail regular expression" matching. Though maybe
others have ideas for a shorter name. :)

Chad. :)
Jun 2 '08 #8
Actually, I have to correct myself! Much to my surprise this is
actually how Perl works after I tried it out. As documented here:
http://www.regular-expressions.info/anchors.html

So in Perl:

my $x = "12345\n";
if ( $x =~ /^[0-9]+$/ )
{
    print 1;
}
else
{
    print 0;
}

Prints 1 whereas:

$x = "12345\n";
if ( $x =~ /^[0-9]+\z/ )
{
    print 1;
}
else
{
    print 0;
}

Prints 0. So I guess preg_match is a good name... :)
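
The same comparison can be sketched in PHP, since PCRE accepts the \Z and \z assertions as well. The variable names are only illustrative:

```php
<?php
$subject = "12345\n";

// $ without the D modifier: end of subject, or just before a final newline.
$dollar = preg_match('/^[0-9]+$/', $subject);

// \Z: end of subject, or just before a final newline.
$upperZ = preg_match('/^[0-9]+\Z/', $subject);

// \z: only the very end of the subject.
$lowerZ = preg_match('/^[0-9]+\z/', $subject);

printf("dollar=%d upperZ=%d lowerZ=%d\n", $dollar, $upperZ, $lowerZ);
```

This prints dollar=1 upperZ=1 lowerZ=0, so \z gives the strict end-of-string check in PHP without needing the D modifier.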
Jun 2 '08 #9
In our last episode, <g1**********@biggoron.nerim.net>, the lovely and
talented Guillaume broadcast on comp.lang.php:
> Lars Eighner wrote:
>> There isn't. There is a serious problem in your understanding of what a
>> match is --- or possibly what $ means in a perl regular expression. You
>> do know the p in preg_match means perl.
> First, we're not talking about Perl, but PHP function "preg_replace",
> which uses PCRE syntax, and not Perl syntax.
> Second, PCRE (just like Perl actually O_o) defines ^ and $ as being start
> and end of string/line (cf. http://www.pcre.org/pcre.txt "PCRE_MULTILINE")
> (Perl defines them as start/end of string and start/end of line if used
> with /m). POSIX doesn't define them, but that's not the point here.
> Pattern ^[0-9]+$ should not match, because in "12345\n" there is a "\n"
> between the last number and the end of string, basically "between the
> plus and the dollar".

This is absurd. $ matches the end of the line. You see that is why a
"newline" is called a newline. It is after the end of the line.
--
Lars Eighner <http://larseighner.com/> us****@larseighner.com
Countdown: 237 days to go.
Jun 2 '08 #10
