Bytes IT Community

RegExp Assistance

Hi, I am working with this regexp to extract address: city, state, and
zip. This version kinda works but it extracts one element of an array
instead of three and keeps my "city" too long, including all text
before it.
.....................
var regex = /\s*(.*)\s*,\s*([A-Z]{2})\s+(\d{5}(\-\d{4})?)\s*/g;
function doit(){
var arr = d.innerHTML.match(regex);
if(arr.length=3){
d2.innerHTML = arr[0]+" | "+arr[1]+" | "+arr[2];
}else{
d2.innerHTML = "Found "+arr.length+" matches";
}
}

//-->
</script>
.......................
<div id="myDiv">
Some text here, not always break after <br>New Haven, CT 06460 plus
whatever text here too
</div>

Thanks.

Oct 17 '07 #1
3 Replies


VUNETdotUS wrote on 17 Oct 2007 in comp.lang.javascript:
> Hi, I am working with this regexp to extract address: city, state, and
> zip. This version kinda works but it extracts one element of an array
> instead of three and keeps my "city" too long, including all text
> before it.
> ....................
> var regex = /\s*(.*)\s*,\s*([A-Z]{2})\s+(\d{5}(\-\d{4})?)\s*/g;
> function doit(){
> var arr = d.innerHTML.match(regex);

What is d?

> if(arr.length=3){

'=' is an assignment operator, not an equality operator. You want:

(arr.length == 3)

You also made the mistake of thinking that match() gives 3 array
members per location. With the 'g' flag it returns one array member
per whole match and no captured groups. Better read up on match().

> d2.innerHTML = arr[0]+" | "+arr[1]+" | "+arr[2];
> }else{
> d2.innerHTML = "Found "+arr.length+" matches";
> }
> }
>
> //-->

Do not use last-century code; skip this line.

> </script>
> ......................
> <div id="myDiv">
> Some text here, not always break after <br>New Haven, CT 06460 plus
> whatever text here too
> </div>

Try:

<script type='text/javascript'>

var regex = /((\s*\b[A-Z]\w+)+),\s*([A-Z]{2})\s+(\d{5}(\-\d{4})?)/g;

var d = 'Some text here, not always break after'+
' <br>New Haven, CT 06460 plus whatever text here too';

// d = d + ' Buffalo, NY 12345 '; // dual test
// d = 'abc'; // empty test

var arr = d.match(regex);

if (arr) {
alert(arr.length + ' location(s) found');
for (var i = 0;i<arr.length;i++)
alert( arr[i].replace(/(,)/,' |').replace(/([A-Z]{2})/,'$1 |') );
};

</script>
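
If you want city, state, and zip separately for each location, here is a minimal sketch that loops with exec() instead of calling match() with the 'g' flag. The helper name extractLocations, the non-capturing (?:...) groups, and the variable names are assumptions added for illustration; they are not part of the code above.

```javascript
// Sample input from the thread, with the "dual test" location appended.
var text = 'Some text here, not always break after <br>New Haven, CT 06460 plus' +
  ' whatever text here too Buffalo, NY 12345 ';

// Same idea as the pattern above, but the inner groups are non-capturing,
// so m[1] is the city, m[2] the state, and m[3] the zip.
var regex = /((?:\s*\b[A-Z]\w+)+),\s*([A-Z]{2})\s+(\d{5}(?:-\d{4})?)/g;

// Hypothetical helper: collects one {city, state, zip} object per location.
function extractLocations(str) {
  var results = [];
  var m;
  regex.lastIndex = 0; // start from the beginning on every call
  while ((m = regex.exec(str)) !== null) {
    results.push({
      city: m[1].replace(/^\s+/, ''), // drop leading whitespace picked up by \s*
      state: m[2],
      zip: m[3]
    });
  }
  return results;
}

// With the 'g' flag, match() returns only the whole matches, no groups.
var whole = text.match(regex);

// exec() in a loop exposes the captured groups of every match.
var locations = extractLocations(text);
```

Here whole holds two strings (one per location), while locations holds two objects with the three parts already separated.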


--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Oct 18 '07 #2

pr
VUNETdotUS wrote:
> Hi, I am working with this regexp to extract address: city, state, and
> zip. This version kinda works but it extracts one element of an array

Do you mean result.length == 1?

> instead of three and keeps my "city" too long, including all text
> before it.
> ....................
> var regex = /\s*(.*)\s*,\s*([A-Z]{2})\s+(\d{5}(\-\d{4})?)\s*/g;

Looks reasonable to me, although I'm no expert on what a zip code should
contain. Try to use fewer *s though, because they frequently match zero
occurrences (out of zero-or-more), which can be unintended. + is better
wherever you can use it. And you don't need a 'g' flag on a single match.

> function doit(){
> var arr = d.innerHTML.match(regex);
> if(arr.length=3){

Assuming success and no 'g' flag, arr.length will be 5 (all the
parentheticals plus one).

> d2.innerHTML = arr[0]+" | "+arr[1]+" | "+arr[2];

Don't forget arr[0] is the *entire match*, arr[1] is the first bracketed
subexpression, arr[2] the second, etc.

> }else{
> d2.innerHTML = "Found "+arr.length+" matches";
> }
> }
>
> //-->
> </script>
> ......................
> <div id="myDiv">
> Some text here, not always break after <br>New Haven, CT 06460 plus

If your city is one or more words preceded by one or more words, then
it's impossible to tell where it starts, unless perhaps it is the only
thing that starts with initial capitals. Something to think about.

> whatever text here too
> </div>
>
> Thanks.
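
To make the indexing concrete, here is a rough sketch that runs the original pattern without the 'g' flag against the sample text. The variable names text, summary, zeroOrMore, and oneOrMore are assumptions for illustration only.

```javascript
// Sample text from the question.
var text = 'Some text here, not always break after <br>New Haven, CT 06460 plus' +
  ' whatever text here too';

// The pattern from the question, with the 'g' flag dropped so that
// match() returns the whole match plus each captured group.
var regex = /\s*(.*)\s*,\s*([A-Z]{2})\s+(\d{5}(\-\d{4})?)\s*/;

var arr = text.match(regex);

// arr[0] is the entire match, arr[1] the "city" group, arr[2] the state,
// arr[3] the zip, and arr[4] the optional +4 part (undefined here).
var summary = arr
  ? arr[1] + ' | ' + arr[2] + ' | ' + arr[3]
  : 'No match';

// '*' happily matches zero occurrences; '+' requires at least one.
var zeroOrMore = 'abc'.match(/\d*/); // matches the empty string at index 0
var oneOrMore = 'abc'.match(/\d+/);  // null, because there are no digits
```

Note that arr[1] still contains everything before ", CT", because the greedy (.*) has no way of knowing where the city name begins. That is the "city too long" problem from the question.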
Oct 18 '07 #3

On Oct 18, 4:15 am, "Evertjan." <exjxw.hannivo...@interxnl.net> wrote:
> Try:
>
> var regex = /((\s*\b[A-Z]\w+)+),\s*([A-Z]{2})\s+(\d{5}(\-\d{4})?)/g;

This worked fine for me. Thanks for the advice.

Oct 18 '07 #4
