
regexp test function behavior

I couldn't understand some behavior of the RegExp.test function.

Example html code:
----------------
<html><head></head><body><script type="text/javascript">
var r = /^https?:\/\//g;
document.write( [
r.test('http://a'),
r.test('http://b'),
r.test('http://c'),
r.test('http://d')
]);
</script></body></html>
---------------------

The page displays true, false, true, false. (in Opera, Firefox and IE)
This is strange because I expected it would display true, true, true,
true. There must be something I didn't know about the function
RegExp.test.

Oct 29 '06 #1
HopfZ wrote:

[snip]

> var r = /^https?:\/\//g;
> document.write( [
> r.test('http://a'),
> r.test('http://b'),
> r.test('http://c'),
> r.test('http://d')
> ]);

[snip]

> The page displays true, false, true, false. (in Opera, Firefox and IE)
> This is strange because I expected it would display true, true, true,
> true. There must be something I didn't know about the function
> RegExp.test.

The global flag is the cause of your confusion. It doesn't even make
sense for it to be included: you're using an expression with an
input-start assertion[1] (^) and that could only ever match once.

The RegExp.prototype.test method is equivalent to the expression,

re.exec(str) != null

and the global flag is significant when the RegExp.prototype.exec method
is used. After a match, the lastIndex property of the regular expression
object is modified to point just beyond the end of the previously
matched sub-string. On the next invocation of the exec method, this
position is used to begin the next search.

At the end of the first call, the lastIndex property will point beyond
the end of the match (to the character, 'a'). Whilst attempting to match
the input-start assertion (^) in the second call, the assertion will
fail (the match is attempted after the start of the string). These
attempts will continue until the end of the string is reached, at which
point the lastIndex property is reset to zero and null is returned. With
the lastIndex property reset, the third call can proceed normally like
the first. The fourth call will be a repeat of the second.
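
A minimal sketch of the sequence described above, recording lastIndex before and after each call. The pattern and inputs are the ones from the original post; the trace array and the loop are illustrative additions, not part of the thread.

```javascript
// Record lastIndex around each test() call on one shared global pattern.
var r = /^https?:\/\//g;
var inputs = ['http://a', 'http://b', 'http://c', 'http://d'];
var trace = [];

for (var i = 0; i < inputs.length; i++) {
  var before = r.lastIndex;          // where this search will start
  var matched = r.test(inputs[i]);   // same as r.exec(inputs[i]) != null
  trace.push({
    input: inputs[i],
    before: before,
    matched: matched,
    after: r.lastIndex               // 7 after a match, 0 after a failure
  });
}

for (var j = 0; j < trace.length; j++) {
  console.log(trace[j].input, trace[j].before, trace[j].matched, trace[j].after);
}
// http://a 0 true 7
// http://b 7 false 0
// http://c 0 true 7
// http://d 7 false 0
```

Dropping the g flag from the pattern makes every call start at index 0, so all four calls return true.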

Mike
[1] With the multi-line flag, it also acts as a line-start
assertion, but that doesn't apply here.
Oct 29 '06 #2
Michael Winter wrote on 29 Oct 2006 in comp.lang.javascript:

> The global flag is the cause of your confusion. It doesn't even make
> sense for it to be included: you're using an expression with an
> input-start assertion[1] (^) and that could only ever match once.

Even more so, setting the global flag in a test() never makes any sense.

> At the end of the first call, the lastIndex property will point beyond
> the end of the match

A good explanation.

Even so it is a bug!!!!

The global flag should either lead to an error,
or be disregarded in test().

===================================

Testing:

<script type='text/javascript'>

// IE7 tested

var r = /x/g;
document.write(r.test('x')+'<br>'); // true
document.write(r.test('x')+'<br>'); // false
document.write(r.test('x')+'<br>'); // true
document.write(r.test('x')+'<br>'); // false
document.write('<br>');
r = /x/g;
document.write(r.test('x')+'<br>'); // true
r = /x/g;
document.write(r.test('x')+'<br>'); // true
r = /x/g;
document.write(r.test('x')+'<br>'); // true
r = /x/g;
document.write(r.test('x')+'<br>'); // true

document.write('<br>');
r = /x/;
document.write(r.test('x')+'<br>'); // true
document.write(r.test('x')+'<br>'); // true
document.write(r.test('x')+'<br>'); // true
document.write(r.test('x')+'<br>'); // true
</script>

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Oct 29 '06 #3
Evertjan. wrote:

[snip]
> Even more so, setting the global flag in a test() never makes any
> sense.

Never? I don't know about that. Rare, certainly.

For example, one way to count the number of occurrences of a pattern
within a string is to use the String.prototype.match method[1]:

var result = string.match(regExp),
count = result ? result.length : 0;

where regExp is a regular expression object with the global flag set.
However, one could also do the same with the RegExp.prototype.test method:

function countMatches(string, pattern) {
    var count = 0,
        index = pattern.lastIndex = 0;

    while (pattern.test(string)) {
        ++count;
        if (pattern.lastIndex == index) {++pattern.lastIndex;}
    }
    return count;
}

var count = countMatches(string, regExp);

Marginally easier to use, and slightly more efficient in Firefox and
Opera (my test was simple: a case-sensitive, single-character search).
Although this approach is slower than match in MSIE, MSIE still performs
better than either Firefox or Opera.

[snip]
> Even so it is a bug!!!!
>
> The global flag should either lead to an error,
> or be disregarded in test().

Not at all. The blame would fall on the developer who used a global flag
where it didn't belong, or failed to reset the lastIndex property after
a previous invocation.

[snip]

Mike
[1] Browsers return null from the String.prototype.match method
when both the global flag is set for the regular expression
object and no matches are found. It seems to me that
15.5.4.10, ECMA-262 3rd Ed. would call for an empty array.
Not that big a deal, but it would make the example above a
bit simpler.
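
A short sketch of the behaviour this endnote describes. Current engines return null rather than an empty array when a global pattern finds nothing, and later editions of ECMA-262 specify null for that case, so the guard in the example above is still needed. The helper name countWithMatch is hypothetical.

```javascript
// String.prototype.match with a global pattern: null when nothing matches.
var none = 'abc'.match(/z/g);
var some = 'banana'.match(/a/g);

console.log(none);        // null
console.log(some.length); // 3

// The guard from the post above, wrapped in a hypothetical helper.
function countWithMatch(string, regExp) {
  var result = string.match(regExp);
  return result ? result.length : 0;
}

console.log(countWithMatch('abc', /z/g));    // 0
console.log(countWithMatch('banana', /a/g)); // 3
```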
Oct 30 '06 #4
Michael Winter wrote on 30 Oct 2006 in comp.lang.javascript:

> Evertjan. wrote:
>
> [snip]
>
>> Even more so, setting the global flag in a test() never makes any
>> sense.
>
> Never? I don't know about that. Rare, certainly.

Never!

> For example, one way to count the number of occurrences of a pattern
> within a string is to use the String.prototype.match method[1]:
>
> var result = string.match(regExp),
> count = result ? result.length : 0;

Michael, I said: in "test()"

"match()" is not "test()"

[snap]

> [snip]
>
>> Even so it is a bug!!!!
>>
>> The global flag should either lead to an error,
>> or be disregarded in test().
>
> Not at all.

But it is.

The subject of this thread is:
"regexp test function behavior"
not:
"regexp match function behavior"

"match()" is not "test()"

[snap]

> [1] Browsers return null from the String.prototype.match method
> [snip]

"match()" is not "test()"

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Oct 30 '06 #5
Evertjan. wrote:

> Michael Winter wrote on 30 Oct 2006 in comp.lang.javascript:
>> Evertjan. wrote:
>>
>> [snip]
>>
>>> Even more so, setting the global flag in a test() never makes any
>>> sense.
>>
>> Never? I don't know about that. Rare, certainly.
>
> Never!

Care to state a reason?

The information that can be gleaned from using the RegExp.prototype.test
method in this way is limited[1], which is why such usage would be rare.
However, that is a far cry from claiming that it makes no sense. Indeed,
my previous post demonstrated a reasonable use.

The point of the global flag is to allow repetitive processing, where
the lastIndex property indicates the position from which the next
invocation starts. This would allow the test method to assert that there
is more than one match, or even that one begins after a certain point
should the lastIndex property be set explicitly. If that's all that's
required, then there's no need to use a method that would return more
information (and be wasteful, in the process).
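
A minimal sketch of the two uses mentioned in this paragraph. The helper names hasMatchFrom and hasAtLeastTwoMatches are hypothetical, and both assume a pattern with the global flag set (without it, lastIndex is ignored and every search starts at index 0). The second helper also assumes the pattern cannot match an empty string.

```javascript
// Does the pattern match anywhere at or after position `start`?
function hasMatchFrom(pattern, string, start) {
  pattern.lastIndex = start;          // set the starting point explicitly
  var found = pattern.test(string);
  pattern.lastIndex = 0;              // leave the object in a known state
  return found;
}

// Is there more than one match? Assumes non-empty matches.
function hasAtLeastTwoMatches(pattern, string) {
  pattern.lastIndex = 0;
  // The second test() resumes from where the first match ended.
  var result = pattern.test(string) && pattern.test(string);
  pattern.lastIndex = 0;
  return result;
}

console.log(hasMatchFrom(/a/g, 'a-b-a', 1));       // true
console.log(hasMatchFrom(/a/g, 'a-b', 1));         // false
console.log(hasAtLeastTwoMatches(/a/g, 'a-b-a'));  // true
console.log(hasAtLeastTwoMatches(/a/g, 'a-b'));    // false
```

Neither helper builds a result array, which is the saving the paragraph above has in mind.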

>> For example, one way to count the number of occurrences of a pattern
>> within a string is to use the String.prototype.match method[1]:
>>
>> var result = string.match(regExp),
>> count = result ? result.length : 0;
>
> Michael, I said: in "test()"

I read what you wrote.

> "match()" is not "test()"

If you read past the part that you quoted, you would notice that I go on
to present an equivalent using the test method and a regular expression
with the global flag set. Mentioning the String.prototype.match method
was merely a comparison.

>>> Even so it is a bug!!!!
>>>
>>> The global flag should either lead to an error,
>>> or be disregarded in test().
>>
>> Not at all.

I wrote a little more than that.

> But it is.

Again, would you like to actually provide an explanation?

The test method is /defined/ in terms of the exec method; it is the
behavioural equivalent of:

regExp.exec(string) == null

/including/ all of the side effects that the exec method introduces. The
method should be used with that in mind, and if it's not, then it's the
fault of the developer and nobody else.

Note that an implementation doesn't have to use that exact expression.
Instead, it might copy the algorithm of the exec method (see 15.10.6.2),
except returning false instead of null in step 6, and returning true
instead of steps 12 and 13. This would save some time whilst providing
the same behaviour, however this latter issue is the most significant.
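
A small sketch of that equivalence, written with the not-equal comparison. The function name testViaExec is hypothetical; the point is only that the return values and the lastIndex side effects line up with the built-in test method.

```javascript
// test() expressed in terms of exec(), side effects included.
function testViaExec(pattern, string) {
  return pattern.exec(string) != null;
}

var a = /x/g;
var b = /x/g;

// Array elements are evaluated left to right, so each lastIndex read
// reflects the call immediately before it.
var viaTest = [a.test('x'), a.lastIndex, a.test('x'), a.lastIndex];
var viaExec = [testViaExec(b, 'x'), b.lastIndex, testViaExec(b, 'x'), b.lastIndex];

console.log(viaTest.join(',')); // true,1,false,0
console.log(viaExec.join(',')); // true,1,false,0
```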

> The subject of this thread is:
> "regexp test function behavior"
> not:
> "regexp match function behavior"

I know. I answered the OP's question, did I not? Even so, threads drift.

> "match()" is not "test()"

I hope you're going to feel a little silly now after banging on about
that so irrationally.

>> [1] Browsers return null from the String.prototype.match method
>>
>> [snip]
>
> "match()" is not "test()"

That comment was an aside, which was why I presented it as an endnote.

Mike
[1] As far as I can see, only three facts can be obtained:

1. Whether the string matched the pattern (the return value
of the method itself),
2. The location of the first character to follow the match just
obtained (the value of the lastIndex property), and
3. Whether the pattern matched a zero-length string (the
lastIndex property will not have changed).
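
The three facts can be read off after a single call, as in this sketch. The helper name probe and the shape of the object it returns are hypothetical; the pattern is assumed to have the global flag set so that lastIndex is honoured.

```javascript
// Gather what one test() call reveals when started at `start`.
function probe(pattern, string, start) {
  pattern.lastIndex = start;
  var matched = pattern.test(string);
  var facts = {
    matched: matched,                                     // fact 1
    next: matched ? pattern.lastIndex : -1,               // fact 2
    emptyAtStart: matched && pattern.lastIndex === start  // fact 3
  };
  pattern.lastIndex = 0; // reset so the pattern can be reused safely
  return facts;
}

var atZero = probe(/o*/g, 'foo', 0); // empty match before the 'f'
var atOne = probe(/o*/g, 'foo', 1);  // matches 'oo', ending at index 3
var miss = probe(/z/g, 'foo', 0);    // no match at all

console.log(atZero.matched, atZero.next, atZero.emptyAtStart); // true 0 true
console.log(atOne.matched, atOne.next, atOne.emptyAtStart);    // true 3 false
console.log(miss.matched, miss.next, miss.emptyAtStart);       // false -1 false
```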
Oct 30 '06 #6
Michael Winter wrote on 30 Oct 2006 in comp.lang.javascript:

> I wrote a little more than that.
>
>> But it is.
>
> Again, would you like to actually provide an explanation?
>
> The test method is /defined/ in terms of the exec method; it is the
> behavioural equivalent of:
>
> regExp.exec(string) == null

That is neither here nor there. A method is not defined as a behavioural
equivalent, its behaviour is described. Its implementation could be
defined as a behavioural equivalent, but that could make the method
buggy, as it does in this case.

Having a global flag in a test makes no sense, since the result is
stable at the first match, and further searching should be aborted.

The possible "defining" of test() in the sense of having a search
starting point left over by an earlier test(), only if the regex string
variable is not refreshed, is so strange, we can only call that a bug.
> I know. I answered the OP's question, did I not? Even so, threads
> drift.

You specifically said that my assertion was wrong, by stating an unrelated
code, not using test() but match().

>> "match()" is not "test()"
>
> I hope you're going to feel a little silly now after banging on about
> that so irrationally.

Shall we keep on subject, Michael, or do you feel attacked in person?
--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Oct 30 '06 #7
Evertjan. wrote:

> Michael Winter wrote on 30 Oct 2006 in comp.lang.javascript:

[snip]

>> The test method is /defined/ in terms of the exec method; it is the
>> behavioural equivalent of:
>>
>> regExp.exec(string) == null

A typo on my part: the comparison operator should be not-equal (!=), of
course.

> That is neither here nor there.

How so? It is a very succinct description of the behaviour of the method.

> A method is not defined as a behavioural equivalent, its behaviour
> is described.

And it is: if the exec method were to return null or undefined, the test
method should return false. By examining the algorithm for the former,
one can ascertain precisely what is returned, where, and for what
reason, and how to modify the process to return booleans instead.

> Its implementation could be defined as a behavioural equivalent, but
> that could make the method buggy, as it does in this case.

I fail to see how.

> Having a global flag in a test makes no sense, since the result is
> stable at the first match, and further searching should be aborted.

That depends on what the test method is meant to do. Clearly, you have
decided upon a very limited definition. That does not make the
language faulty; it means that your expectations are. The global flag
changes the behaviour of several methods related to regular expressions,
so it should only be used where that behaviour is desired.

> The possible "defining" of test() in the sense of having a search
> starting point left over by an earlier test(), only if the regex
>                                                ^^^^^^^^^^^^^^^^^
> string variable is not refreshed, is so strange, we can only call
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> that a bug.

What?

[snip]
> You specifically said that my assertion was wrong, by stating an
> unrelated code, not using test() but match().

If you really believe that, you didn't read my second post properly. In
fact, that would seem to indicate that you didn't read the previous one
properly, either, where I wrote (emphasis added):

    ... I go on to present an equivalent *using the test method*
    and a regular expression with the global flag set. Mentioning
    the String.prototype.match method was merely a comparison.

[snip]

Mike
Oct 31 '06 #8
Michael Winter wrote on 31 Oct 2006 in comp.lang.javascript:

>> The possible "defining" of test() in the sense of having a search
>> starting point left over by an earlier test(), only if the regex
>>                                                ^^^^^^^^^^^^^^^^^
>> string variable is not refreshed, is so strange, we can only call
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> that a bug.
>
> What?

As I wrote about before in this thread:

var r = /x/g;
// r, the regex string variable, will not be refreshed here:
document.write(r.test('x')+'<br>'); // true
document.write(r.test('x')+'<br>'); // false
document.write(r.test('x')+'<br>'); // true
document.write(r.test('x')+'<br>'); // false
document.write('<br>');
r = /x/g;
// r, will NOW be refreshed every time:
document.write(r.test('x')+'<br>'); // true
r = /x/g;
document.write(r.test('x')+'<br>'); // true
r = /x/g;
document.write(r.test('x')+'<br>'); // true
r = /x/g;
document.write(r.test('x')+'<br>'); // true
// if a literal regex string is used, that works like refreshed:
document.write(/x/g.test('x')+'<br>'); // true
document.write(/x/g.test('x')+'<br>'); // true
document.write(/x/g.test('x')+'<br>'); // true

[..]
> ... I go on to present an equivalent *using the test method*
> and a regular expression with the global flag set. Mentioning
> the String.prototype.match method was merely a comparison.

You could perhaps be correctly explaining the behaviour of test(), but
I still fail to see why an explanation of the behaviour of a js-method
prevents that behaviour from being a bug.

It certainly helps in understanding a bug, so that we can programme
"around" it.

However, the above-mentioned difference in behaviour when refreshing the
regex string variable is not explained, methinks.

Either way, I am still convinced we should call this a bug.

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Oct 31 '06 #9
Evertjan. wrote:

[snip]
> var r = /x/g;
> // r, the regex string variable, will not be refreshed here:
> document.write(r.test('x')+'<br>'); // true
> document.write(r.test('x')+'<br>'); // false
> document.write(r.test('x')+'<br>'); // true
> document.write(r.test('x')+'<br>'); // false
> document.write('<br>');
> r = /x/g;
> // r, will NOW be refreshed every time:
> document.write(r.test('x')+'<br>'); // true
> r = /x/g;

[snip]

Ah, I see. If you had written "regex object", that would have been more obvious.

Creating a new regular expression object is hardly necessary. Just set
the lastIndex property to zero:

var re = /x/g;

document.write(re.test('x') + '<br>'); // true
document.write(re.test('x') + '<br>'); // false
document.write(re.test('x') + '<br>'); // true
re.lastIndex = 0;
document.write(re.test('x') + '<br>'); // true

[snip]
> You could perhaps be correctly explaining the behaviour of test(), but I
> still fail to see why an explanation of the behaviour of a js-method
> prevents that behaviour from being a bug.

It doesn't, not automatically. Specifications can be badly thought out,
but, in my opinion, that doesn't apply in this case.

[snip]
> However, the above-mentioned difference in behaviour when refreshing the
> regex string variable is not explained, methinks.

Each literal evaluates to an object reference (the object itself is
created before execution begins as the literal is scanned), and each of
those objects is completely different - they do not compare as equal
even if the literal is exactly the same. The test method will alter the
lastIndex property of the referenced object, but that object will
eventually be discarded and replaced by a new one.
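
A brief sketch of the point about literals. Two literals in the source are two objects, each with its own lastIndex. One caveat worth stating: the edition of ECMA-262 current when this thread was written created the object once, as the literal was scanned, whereas ES5 and later create a new object every time a literal is evaluated, so the function and loop cases below assume an ES5-or-later engine. The name freshPattern is hypothetical.

```javascript
// Two literals with identical source text are still two objects.
var first = /x/g;
var second = /x/g;
console.log(first === second);                  // false

first.test('x');                                // advances first.lastIndex only
console.log(first.lastIndex, second.lastIndex); // 1 0

// ES5 and later: the same literal evaluated twice yields two objects.
function freshPattern() {
  return /x/g;
}
var p = freshPattern();
var q = freshPattern();
console.log(p === q);                           // false under ES5 and later

// So a literal used inline starts from lastIndex 0 on every evaluation.
var results = [];
for (var i = 0; i < 3; i++) {
  results.push(/x/g.test('x'));
}
console.log(results.join(','));                 // true,true,true
```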

[snip]

Mike
Nov 1 '06 #10
Michael Winter wrote on 01 Nov 2006 in comp.lang.javascript:
[..]
>> However, the above-mentioned difference in behaviour when refreshing the
>> regex string variable is not explained, methinks.
>
> Each literal evaluates to an object reference (the object itself is
> created before execution begins as the literal is scanned), and each of
> those objects is completely different - they do not compare as equal
> even if the literal is exactly the same. The test method will alter the
> lastIndex property of the referenced object, but that object will
> eventually be discarded and replaced by a new one.

I begin to see.

However, I think this construction, while useful in match() and exec(),
is a bad one in test(). I would never have allowed test() to change any
property of the regex object, even its lastIndex property.
--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Nov 1 '06 #11
Michael Winter wrote:

[snip]
function countMatches(string, pattern) {
    var count = 0,
        index = pattern.lastIndex = 0;

    while (pattern.test(string)) {
        ++count;
        if (pattern.lastIndex == index) {++pattern.lastIndex;}
        index = pattern.lastIndex;
    }
    return count;
}
Forgot to update the index variable.
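
A usage sketch for the corrected function, compared against the match-based count from earlier in the thread. The sample strings are made up for illustration. The comparison is restricted to patterns that consume at least one character; as far as I can trace it, a pattern that matches an empty string somewhere after the starting index can be counted twice by this loop, so treat that case as unverified.

```javascript
// The corrected function from the post above, reproduced for a self-contained run.
function countMatches(string, pattern) {
  var count = 0,
      index = pattern.lastIndex = 0;

  while (pattern.test(string)) {
    ++count;
    // Step past an empty match at the current position.
    if (pattern.lastIndex == index) { ++pattern.lastIndex; }
    index = pattern.lastIndex;
  }
  return count;
}

// The match-based equivalent, guarded against the null return.
function countWithMatch(string, pattern) {
  var result = string.match(pattern);
  return result ? result.length : 0;
}

console.log(countMatches('banana', /a/g));    // 3
console.log(countWithMatch('banana', /a/g));  // 3
console.log(countMatches('banana', /an/g));   // 2
console.log(countMatches('banana', /z/g));    // 0

// The loop ends on a failed test(), which resets lastIndex to 0,
// so the same pattern object can be reused straight away.
var shared = /n/g;
console.log(countMatches('banana', shared), shared.lastIndex); // 2 0
```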

[snip]

Mike
Nov 3 '06 #12
