By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
437,605 Members | 2,137 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 437,605 IT Pros & Developers. It's quick & easy.

sort() array names sequentially with numbers

P: n/a

Hi all, A while back I asked how to sort an array of strings which would
have numerals and I wanted to put them in sequential numerical order.

For example:

myArray[0] = "file1";
myArray[1] = "file2";
myArray[2] = "file3";
myArray[3] = "file4";
myArray[4] = "file5";
myArray[5] = "file6";
myArray[6] = "file7";
myArray[7] = "file8";
myArray[8] = "file9";
myArray[9] = "file10";
myArray[10] = "file11";

etc...

However using the normal sort function below for numerical sorting...
---------------------------------------
function compare(a,b){
return(a-b)
}

myArray.sort(compare);
---------------------------------------
returns the array in an order I do not want and is not consistent.

For example:
myArray[0] = "file1";
myArray[1] = "file10";
myArray[2] = "file2";
myArray[3] = "file3";
myArray[4] = "file4";
etc..

Same thing happens when it gets to 20 etc..

I was given this code below which works great in most modern browsers but
crashes IE5, actually it pops an alert that the script has been running a
long time and wants me to bail. Can anyone shed some light as to why this
code fails or know of another way to accomplish this?
function cmp(a,b) {
return (b<a)-(a<b);
}
function numCmp(a,b) {
var re1 = /(\d+)|\D+/g;
var re2 = /(\d+)|\D+/g;
re1.lastIndex = 0;
re2.lastIndex=0;
var res = 0;
do{ match1 = re1.exec(a);match2 = re2.exec(b);
if(match1){if (match2){
if(match1[1]) {
if(match2[1]) {
res = Number(match1[1]) - Number(match2[1]) ||
cmp(match1[0],match2[0]);
}else{res = -1;}
}else{if(match2[1]){res = 1;
}else{res = cmp(match1[0],match2[0]);}}
}else{res = 1;}}else{if(match2){res = -1;}else{res = 0; break;
}
}
} while(res == 0);
return res;
}
--
Thanks David
Sep 29 '05 #1
Share this Question
Share on Google+
19 Replies


P: n/a
ASM
David a écrit :
Hi all, A while back I asked how to sort an array of strings which would
have numerals and I wanted to put them in sequential numerical order.

For example:

myArray[0] = "file1";
myArray[1] = "file2";
myArray[2] = "file3";
myArray[3] = "file4";
myArray[4] = "file5";
myArray[5] = "file6";
myArray[6] = "file7";
myArray[7] = "file8";
myArray[8] = "file9";
myArray[9] = "file10";
myArray[10] = "file11";

etc...

However using the normal sort function below for numerical sorting...

and ... why not rename your files ?

myArray[0] = "file01";
myArray[1] = "file02";
myArray[2] = "file03";
myArray[3] = "file04";
myArray[4] = "file05";
myArray[5] = "file06";
myArray[6] = "file07";
myArray[7] = "file08";
myArray[8] = "file09";
myArray[9] = "file10";
myArray[10] = "file11";

would fix your myArray.sort(compare)
---------------------------------------
function compare(a,b){
return(a-b)
}

myArray.sort(compare);
---------------------------------------
returns the array in an order I do not want and is not consistent.

For example:
myArray[0] = "file1";
myArray[1] = "file10";
myArray[2] = "file2";
myArray[3] = "file3";
myArray[4] = "file4";
etc..

Same thing happens when it gets to 20 etc..

--
Stephane Moriaux et son [moins] vieux Mac
Sep 29 '05 #2

P: n/a
> and ... why not rename your files ?
Because they won't be my files and I will have no control over what they are
named, however they must be sorted in a specific order.
David


Sep 29 '05 #3

P: n/a

David <ri***@dd.com> wrote in message news:NhG_e.11469$qC4.1168@trnddc02...

Hi all, A while back I asked how to sort an array of strings which would
have numerals and I wanted to put them in sequential numerical order.
However using the normal sort function below for numerical sorting...
---------------------------------------
function compare(a,b){
return(a-b)
}

myArray.sort(compare);
---------------------------------------
returns the array in an order I do not want and is not consistent.

For example:
myArray[0] = "file1";
myArray[1] = "file10";
myArray[2] = "file2";
myArray[3] = "file3";
myArray[4] = "file4";
etc..

Same thing happens when it gets to 20 etc..


This routine compares the numerical value of the suffixes (if present).

var fNames=["file10","file1","file12","file123","fileNAN","fil e5","file11","file3","file02"];

function bubbleSortOnSuffix( f )
{
var didSwap=true,tmp,n1,n2;

while(didSwap)
for(var i=0, didSwap=false; i<f.length-1; i++)
{
if( isNaN( n1=parseInt( f[i].match( /\d+/ ) ) ) )
n1=0;

if( isNaN( n2=parseInt( f[i+1].match( /\d+/ ) ) ) )
n2=0;

if( n1 > n2 )
{
tmp=f[i];
f[i]=f[i+1];
f[i+1]=tmp;
didSwap=true;
}
}
}

bubbleSortOnSuffix( fNames );

alert(fNames);

--
S.C.
Sep 29 '05 #4

P: n/a
David wrote:
Hi all, A while back I asked how to sort an array of strings which would
have numerals and I wanted to put them in sequential numerical order.

For example:

myArray[0] = "file1";
myArray[1] = "file2";
myArray[2] = "file3";
myArray[3] = "file4";
myArray[4] = "file5";
myArray[5] = "file6";
myArray[6] = "file7";
myArray[7] = "file8";
myArray[8] = "file9";
myArray[9] = "file10";
myArray[10] = "file11";

etc...

However using the normal sort function below for numerical sorting...
---------------------------------------
function compare(a,b){
return(a-b)
}

myArray.sort(compare);
---------------------------------------
returns the array in an order I do not want and is not consistent.

For example:
myArray[0] = "file1";
myArray[1] = "file10";
myArray[2] = "file2";
myArray[3] = "file3";
myArray[4] = "file4";
etc..

Same thing happens when it gets to 20 etc..

I was given this code below which works great in most modern browsers but
crashes IE5, actually it pops an alert that the script has been running a
long time and wants me to bail. Can anyone shed some light as to why this
code fails or know of another way to accomplish this?
Using a comparison function for sort was introduced with Navigator 3 and
you say that sort(cmp) works (though the result is not what you want) so
we can rule that out as the issue.

The logic in numCmp() is implemented using regular expressions, a much
more likely source of your (or rather IE 5's) problem.

Reduce the size of your array to say ['file1','file2','file10'] just for
convenience and insert an alert in numCmp() that spits out the values of
re1 and re2 immediately after they are set - you may well find that IE5
barfs about there.

If it gets further, move the alert until you find out what the problem
is. Once you've done that, you know which bit of logic you have to
replace. I don't have IE 5 so can't test it for you.

What numCmp is doing is breaking the file names into arrays of digits
and non-digit parts, then compares them - chars as chars and numbers as
numbers. If one is a number and one a char, they are compared as chars.
The value of the first mis-match is returned - if no differences are
found, 0 is returned.

You can replicate the logic say using split and arrays, but it is
somewhat longer and more tedious to keep the bits in the right order.

As a quick and dirty solution, if the 'file' part is always identical
and only the number changes, then numCmp can be:

function numCmp2(a,b)
{
var x = a.replace(/\D+/g,'');
var y = b.replace(/\D+/g,'');
return cmp(+x, +y);
}



function cmp(a,b) {
return (b<a)-(a<b);
}
function numCmp(a,b) {
var re1 = /(\d+)|\D+/g;
var re2 = /(\d+)|\D+/g;
re1.lastIndex = 0;
re2.lastIndex=0;
var res = 0;
do{ match1 = re1.exec(a);match2 = re2.exec(b);
if(match1){if (match2){
if(match1[1]) {
if(match2[1]) {
res = Number(match1[1]) - Number(match2[1]) ||
cmp(match1[0],match2[0]);
}else{res = -1;}
}else{if(match2[1]){res = 1;
}else{res = cmp(match1[0],match2[0]);}}
}else{res = 1;}}else{if(match2){res = -1;}else{res = 0; break;
}
}
} while(res == 0);
return res;
}
--
Thanks David

--
Rob
Sep 29 '05 #5

P: n/a

Thanks Stephen. Your example works well and IE 5 isn't choking on it. I
think the problem is solved.

--
David


Sep 29 '05 #6

P: n/a
Thanks for the detailed explanation Rob. I will go through this function
with the alerts to see where the issue is. Stephens example works well and
IE5 seems to like it as well.

--
David
Sep 29 '05 #7

P: n/a
David wrote:
Thanks for the detailed explanation Rob. I will go through this function
with the alerts to see where the issue is. Stephens example works well and
IE5 seems to like it as well.

--
David


Stephen's function does not seem to handle mixed digits and letters very
well, you might as well use the numCmp2() I posted.

Anyhow, below is aversion that implements the logic using arrays instead
of RegExp - actually it still uses a regular expression but not the same
way. I also removed the call to cmp() by including its functionality.

As far as I can tell it works exactly the same in Firefox and IE 6, can
you test it in IE 5 to see if it works?

<div id="xx"></div>

<script type="text/javascript">

var fileArray = [
"file02fin.jpg","file01.jpg","afile30.jpg","file30 ",
"file4.gif","5","5.jpg","file6.bat","file7","zfile 8","file9",
"fi6le10","file11","1file20.jpg","file02","zfile08 ","zfile8",
"file4.bat","file4"
];

function cmpNumStr(a, b)
{
// Turn a & b into arrays of digits & non-digit elements
var A = a.match(/\d+|\D+/g)
var B = b.match(/\d+|\D+/g)
var res = 0;
var i = 0;
do {
if(A[i]){
if(B[i]){ // Compare as numbers or chars if that fails
res = (A[i] - B[i]) || ((B[i]<A[i]) - (A[i]<B[i]));
} else {
return 1; // Theres an A element but no B, A wins
}
} else if (B[i]){
return -1; // Theres a B element but no A, B wins
} else {
return res; // Run out of elements, so even
}
i++;
} while ( res == 0 ); // Stop as soon non-equal elements found
return res;
}

var xx = document.getElementById('xx');
xx.innerHTML = '<b>Before:</b><br>' + fileArray.join('<br>')
+ '<hr><b>After:</b><br>'
+ fileArray.sort(cmpNumStr).join('<br>');

</script>
--
Rob
Sep 29 '05 #8

P: n/a
RobG wrote:
[...]
As far as I can tell it works exactly the same in Firefox and IE 6, can
you test it in IE 5 to see if it works?

It works in Safari 1.0.3 (Mac OS X 10.2.8) and IE 5.2 on Mac if that's
any help.
--
Rob
Sep 29 '05 #9

P: n/a
As far as I can tell it works exactly the same in Firefox and IE 6, can
you test it in IE 5 to see if it works?

It works in Safari 1.0.3 (Mac OS X 10.2.8) and IE 5.2 on Mac if that's any
help.


Let me give it a whirl in the real world use here. I will let you know....

--
David
Sep 29 '05 #10

P: n/a
Lee
David said:


Hi all, A while back I asked how to sort an array of strings which would
have numerals and I wanted to put them in sequential numerical order.

For example:

myArray[0] = "file1";
myArray[1] = "file2";
myArray[2] = "file3";
myArray[3] = "file4";
myArray[4] = "file5";
myArray[5] = "file6";
myArray[6] = "file7";
myArray[7] = "file8";
myArray[8] = "file9";
myArray[9] = "file10";
myArray[10] = "file11";

etc...

However using the normal sort function below for numerical sorting... ...returns the array in an order I do not want and is not consistent. I was given this code below which works great in most modern browsers but
crashes IE5, actually it pops an alert that the script has been running a
long time and wants me to bail.

The default sort is actually very consistent. It's lexical order.
The code that you were given is very slow because it performs
expensive string manipulations for each and every comparison in
the sort. That's many times for each name.

It would be much more efficient to temporarily rename the files,
sort them lexically, and then name them back. That way, each
name is only manipulated twice.

Sep 29 '05 #11

P: n/a
JRS: In article <43**********@mk-nntp-2.news.uk.tiscali.com>, dated
Thu, 29 Sep 2005 03:28:20, seen in news:comp.lang.javascript, Stephen
Chalmers <ig******@lycos.co.uk> posted :

This routine compares the numerical value of the suffixes (if present). if( isNaN( n1=parseInt( f[i].match( /\d+/ ) ) ) )

Has it been tried with varying leading zeroes in the numeric parts?

For compactness of writing, and certainty of symmetry, the comparison
function should be of the nature

function C(a, b) { var A = Fn(a), B = Fn(b) ; return (A>B)-(B>A} }

where Fn does the transforming.

But for speed with large data, the elements of the data arrays should be
transformed into elements for which the default sort is appropriate, and
de-transformed after.

There should be a <FAQENTRY> on sorting.

--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 IE 4 ©
<URL:http://www.jibbering.com/faq/> JL/RC: FAQ of news:comp.lang.javascript
<URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.
Sep 29 '05 #12

P: n/a
>> As far as I can tell it works exactly the same in Firefox and IE 6, can
you test it in IE 5 to see if it works?
It works in Safari 1.0.3 (Mac OS X 10.2.8) and IE 5.2 on Mac if that's any
help.


Ok, I've tested all of the examples and the last one you provided performed
the most consistent with various file names. It does work in IE5 as well.

Thanks RobG for your help. As well, thanks to everyone else for thier input.
I have learned quite a bit from all of you.

--
David
Sep 29 '05 #13

P: n/a
Dr John Stockton wrote:
[...]
But for speed with large data, the elements of the data arrays should be
transformed into elements for which the default sort is appropriate, and
de-transformed after.
Yes (it was suggested in a way by ASM). Do you have an efficient method
of determining an appropriate number of digits to use for padding? The
only ways I can think of involve breaking the string into digit and
non-digit components, getting the longest digit component, padding to
that length and putting the strings back together and sorting that.

I can't see that would be faster than just comparing the elements and
returning as soon as a non-equal pair are found (I don't presume that
I'm correct about that, it just seems reasonable).

If a function is used for a sort comparison, does that mean that the
sort becomes a 'bubble' sort? If so, then the performance can be
expected to degrade (exponentially?) as the number of things to sort
gets longer and more disorganised.

There should be a <FAQENTRY> on sorting.


Yes.

--
Rob
Sep 30 '05 #14

P: n/a
Dr John Stockton <jr*@merlyn.demon.co.uk> wrote in message news:IP**************@merlyn.demon.co.uk...
JRS: In article <43**********@mk-nntp-2.news.uk.tiscali.com>, dated
Thu, 29 Sep 2005 03:28:20, seen in news:comp.lang.javascript, Stephen
Chalmers <ig******@lycos.co.uk> posted :

This routine compares the numerical value of the suffixes (if present).
if( isNaN( n1=parseInt( f[i].match( /\d+/ ) ) ) )

Has it been tried with varying leading zeroes in the numeric parts?

It was written to handle a specified consistent format, whose data should
contain no duplicates. The expected numeric suffix evaluates to the same
integer regardless of the number of leading zeroes, therefore
"file0002","file02","file002", would be placed contiguously in an order
reflecting their relative positions in the original array.
For compactness of writing, and certainty of symmetry, the comparison
function should be of the nature

function C(a, b) { var A = Fn(a), B = Fn(b) ; return (A>B)-(B>A} }

where Fn does the transforming.
Introducing a subroutine invariably makes the calling code more compact,
albeit at the expense of speed; it doesn't mean that the total code will be
reduced.
But for speed with large data, the elements of the data arrays should be
transformed into elements for which the default sort is appropriate, and
de-transformed after.


Provided that the ignored section of the data is the same for all elements,
otherwise the re-construction routine has to make comparisons with the
original data.

--
S.C.
Sep 30 '05 #15

P: n/a
The default sort is actually very consistent. It's lexical order.
The code that you were given is very slow because it performs
expensive string manipulations for each and every comparison in
the sort. That's many times for each name.

It would be much more efficient to temporarily rename the files,
sort them lexically, and then name them back. That way, each
name is only manipulated twice.

Yes the default is very consistent but in my case the default won't suffice.
The files to be sorted are not of my creation and will come in with any
given and possibly mixed names. They will all be images however. People name
thier images some strange things, but the thing is, they will want them in
an order that say thier Operating Systems file management program orders
them.

The standard arry.sort() doesn't do this. If I were to rename the files, I
would still need to sort them first. Otherwise if I give them names as they
first appear in the unsorted array I am back where I started.

I do have it working with RobG's example. I don't mind a little overhead to
accomplish the task, more importantly it sorts in a manner consistent with
what the user expects. I don't expect the number of files to sort to go much
over 100 if that.

Thanks ..

--
David


Sep 30 '05 #16

P: n/a
Lee
David said:

The default sort is actually very consistent. It's lexical order.
The code that you were given is very slow because it performs
expensive string manipulations for each and every comparison in
the sort. That's many times for each name.

It would be much more efficient to temporarily rename the files,
sort them lexically, and then name them back. That way, each
name is only manipulated twice.

Yes the default is very consistent but in my case the default won't suffice.
The files to be sorted are not of my creation and will come in with any
given and possibly mixed names. They will all be images however. People name
thier images some strange things, but the thing is, they will want them in
an order that say thier Operating Systems file management program orders
them.

The standard arry.sort() doesn't do this. If I were to rename the files, I
would still need to sort them first. Otherwise if I give them names as they
first appear in the unsorted array I am back where I started.


No, no, no. You rename them in a consistent way, such as
image6 -> image0006
then sort the results lexically.
then name them back, either by keeping a mapping of the changed name
to the original name, or by assuming that the original never has
leading zeros.

Sep 30 '05 #17

P: n/a
> No, no, no. You rename them in a consistent way, such as
image6 -> image0006
then sort the results lexically.
then name them back, either by keeping a mapping of the changed name
to the original name, or by assuming that the original never has
leading zeros.


Ah, I see now. hmm... I'll give that a try as well. Thank you.

--
David
Oct 1 '05 #18

P: n/a
JRS: In article <9U*****************@news.optus.net.au>, dated Fri, 30
Sep 2005 00:44:53, seen in news:comp.lang.javascript, RobG
<rg***@iinet.net.au> posted :
Dr John Stockton wrote:
[...]
But for speed with large data, the elements of the data arrays should be
transformed into elements for which the default sort is appropriate, and
de-transformed after.
Yes (it was suggested in a way by ASM). Do you have an efficient method
of determining an appropriate number of digits to use for padding?


The programmer should know how many digits may be needed. Otherwise, it
takes just another O(N) scan to find out.

The
only ways I can think of involve breaking the string into digit and
non-digit components, getting the longest digit component, padding to
that length and putting the strings back together and sorting that.
I would prepend the padded key to the existing string. That cannot be
much slower than re-ordering the existing string, and should be quicker
to undo. But, if speed matters, that should be tested.

I can't see that would be faster than just comparing the elements and
returning as soon as a non-equal pair are found (I don't presume that
I'm correct about that, it just seems reasonable).
It's not a matter of whether individual operations are faster, but of
how many of them are done.

Pre- and post- processing is done (of the order of) once per item;
comparison during sorting is done more than once per item, by a factor
which increases as the number of items increases.

If a function is used for a sort comparison, does that mean that the
sort becomes a 'bubble' sort?
No.
If so, then the performance can be
expected to degrade (exponentially?) as the number of things to sort
gets longer and more disorganised.


Bubble sort takes only O(N^2) time.
Efficient sorts take O(N log N) or O(N^x) for x about 1.3.
We expect all javascript implementations to use an efficient sort.

--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 IE 4 ©
<URL:http://www.jibbering.com/faq/> JL/RC: FAQ of news:comp.lang.javascript
<URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.
Oct 1 '05 #19

P: n/a
JRS: In article <43**********@mk-nntp-2.news.uk.tiscali.com>, dated
Fri, 30 Sep 2005 01:51:50, seen in news:comp.lang.javascript, Stephen
Chalmers <ig******@lycos.co.uk> posted :
Dr John Stockton <jr*@merlyn.demon.co.uk> wrote in message news:IPFOPDBDoGPDFwvh
@merlyn.demon.co.uk...
JRS: In article <43**********@mk-nntp-2.news.uk.tiscali.com>, dated
Thu, 29 Sep 2005 03:28:20, seen in news:comp.lang.javascript, Stephen
Chalmers <ig******@lycos.co.uk> posted :
>
>This routine compares the numerical value of the suffixes (if present).

> if( isNaN( n1=parseInt( f[i].match( /\d+/ ) ) ) )

Has it been tried with varying leading zeroes in the numeric parts?

It was written to handle a specified consistent format, whose data should
contain no duplicates. The expected numeric suffix evaluates to the same
integer regardless of the number of leading zeroes, therefore
"file0002","file02","file002", would be placed contiguously in an order
reflecting their relative positions in the original array.


Other readers with similar problems may not have that consistency.

AIUI, in an arbitrary efficient sort method, there is *no* guarantee
that that "equal" items will retain their original order.

For compactness of writing, and certainty of symmetry, the comparison
function should be of the nature

function C(a, b) { var A = Fn(a), B = Fn(b) ; return (A>B)-(B>A} }

where Fn does the transforming.


Introducing a subroutine invariably makes the calling code more compact,
albeit at the expense of speed; it doesn't mean that the total code will be
reduced.


Not invariably, I agree, but it seems rather common that the contents of
Fn will be larger than the contents of C, which is more or less the
condition that code using C will be smaller.

But for speed with large data, the elements of the data arrays should be
transformed into elements for which the default sort is appropriate, and
de-transformed after.


Provided that the ignored section of the data is the same for all elements,
otherwise the re-construction routine has to make comparisons with the
original data.


I was not proposing to ignore any data, just to make it irrelevant for
the sorting decisions. Prepend a fixed-length key to the unaltered
data, and reconstruction is merely decapitation.

I wonder what the effect of sort is on an Array whose elements are
similar Objects such as {a:1, b:"fred").

--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 IE 4 ©
<URL:http://www.jibbering.com/faq/> JL/RC: FAQ of news:comp.lang.javascript
<URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.
Oct 1 '05 #20

This discussion thread is closed

Replies have been disabled for this discussion.