
Regular expression capture dependent on token order?

So I'm trying to write a WDDX deserializer (I know they exist, but I'm
not fond of what I've seen) and so far everything is hunky-dory.
Rather than writing a full lexical analyzer, I'm just using REs to
match the pieces of the WDDX schema that I care about. Everything
works except for the recordset element. Below are the string I'm
trying to match and the expression I'm using. The RE matches and
populates $2 and $3 with the correct captures, but $1 is left
undefined. If I switch the order of the rowCount and fieldNames
attributes of the recordset element, then the RE matches and populates
all three captures correctly. Any ideas?

var testString = " <recordset rowCount='2' fieldNames='FIRST,LAST,AGE'>\n"+
" <field name='FIRST'><string>Scott</string><string>Jack</string></field>\n"+
" <field name='age'><number>27</number><number>69</number></field>\n"+
" <field name='LAST'><string>Hussey</string><string>Hussey</string></field>\n"+
" </recordset>\n";

var RE = /^\s*<recordset(?:(\s+rowCount='\d+?')|(\s+fieldNames='[A-Za-z0-9,]+?')){2}\s*>((?:.|\s)+?)<\/recordset>/;

var result = RE.exec(testString);

// From Firebug:
// result[0] = entirety of testString
// result[1] = undefined as above; " rowCount='2'" if I switch the
//             rowCount and fieldNames attributes
// result[2] = " fieldNames='FIRST,LAST,AGE'"
// result[3] = the string within the recordset tags
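
One way to sidestep the question of which capture survives a repeated alternation is to stop relying on it: grab the whole attribute text once, then look up each attribute with its own RE. The following is only a minimal sketch, assuming that rowCount and fieldNames are the only attributes of interest and that they are single-quoted as in the test string. `parseRecordset` is a hypothetical helper name, not part of any WDDX library.

```javascript
// Hypothetical helper (not from the original post): extract the recordset
// attributes in two steps so the result does not depend on attribute order.
function parseRecordset(input) {
  // Step 1: capture the raw attribute text and the element body.
  var outer = /^\s*<recordset\b([^>]*)>([\s\S]+?)<\/recordset>/.exec(input);
  if (!outer) {
    return null; // no recordset element at the start of the input
  }
  var attrText = outer[1];

  // Step 2: look up each attribute independently of the other.
  var rowCount = /\srowCount='(\d+)'/.exec(attrText);
  var fieldNames = /\sfieldNames='([A-Za-z0-9,]+)'/.exec(attrText);

  return {
    rowCount: rowCount ? parseInt(rowCount[1], 10) : null,
    fieldNames: fieldNames ? fieldNames[1].split(",") : null,
    body: outer[2]
  };
}
```

Called as `parseRecordset(testString)`, this should give the same rowCount and fieldNames whichever attribute comes first, at the cost of running three small REs instead of one large one.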

Feb 8 '07 #1

So a little more testing shows this to be a bug in the Firefox RE
engine. It has been reported.
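
For anyone trying to reproduce this, a stripped-down case may help. My understanding of the ECMAScript spec is that captures inside a quantified group are reset at the start of each iteration, so a conforming engine reports only the captures from the final pass. Whether that matches what the engine in the original report did is not something this thread confirms. The pattern and input strings below are made up for illustration.

```javascript
// Minimal reproduction: two capturing alternatives inside a group repeated twice.
var re = /(?:(a)|(b)){2}/;

// Iteration 1 matches "a" via group 1; iteration 2 matches "b" via group 2.
var first = re.exec("ab");

// Same pattern, alternatives encountered in the opposite order.
var second = re.exec("ba");

// On a spec-conforming engine, only the last iteration's capture is kept,
// so the other group reads as undefined in each case.
console.log(first[1], first[2]);   // undefined "b"
console.log(second[1], second[2]); // "a" undefined
```

If an engine reports both captures for one ordering and only one for the other, comparing its output against this small case is probably the quickest way to pin down the difference.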

Feb 9 '07 #2
