468,244 Members | 1,748 Online

# 2Q : Condensing an array; how random works.

Q1 :

Given an array such as might have been generated by

var A = [,,2,,,,4,,,6,,,8,,,]

is there a highly effective way of reducing it to [2,4,6,8] - i.e.
removing the undefineds and shifting the rest down?

A.sort().slice(0,n) // would do it, but sorts; and the number
of active elements must be known.

I know, of course, about explicitly looping through all of A and testing
for undefined.

A.join('#').split(/#+/) // looks inefficient

Q2 :

Is anything exact known about the internal mechanism of Math.random() ?

--
Web <URL:http://www.merlyn.demon.co.uk/> - FAQqish topics, acronyms & links.
PAS EXE TXT ZIP via <URL:http://www.merlyn.demon.co.uk/programs/00index.htm>
My DOS <URL:http://www.merlyn.demon.co.uk/batfiles.htm> - also batprogs.htm.
Jul 20 '05 #1
11 2994
Dr John Stockton <sp**@merlyn.demon.co.uk> writes:
Q1 :

Given an array such as might have been generated by

var A = [,,2,,,,4,,,6,,,8,,,]

is there a highly effective way of reducing it to [2,4,6,8] - i.e.
removing the undefineds and shifting the rest down?
You want something faster than running through the array with a for
loop and testing each element. That means that the iteration (which
must be performed somehow) must happen in the interpreter, not the
interpreted code.

Do you have an idea how long the array is, and how sparse?
Is the active elements all numbers, or can they be any object?
Are the active elements always sorted?

Since we know that we need to use a built-in iteration, we should look
at the methods of the array property. I can't find any that looks
promising. Only sort and join really traverse the array, and both have
serious problems.
A.sort().slice(0,n) // would do it, but sorts; and the number
of active elements must be known.
Sorting is a problem if the elements aren't already sorted and the
order must be maintained. The efficiency will drop violently if you
even guaranteed to be stable, so I don't think there is a workaround.

Apart from that, the extra work needed to find the number of active
elements would only be proportional ot the number of active elements,
not the length of the array, so there might be a saving there if
sorting the elements is acceptable.
A.join('#').split(/#+/) // looks inefficient
It might be inneficient, but at least the iteration is performed by
efficient non-interpreted code. I won't wager on how efficient it is
without trying first.

However, it has problems. The first and last elements might become
empty strings (but that is easily checked for). Joining and splitting
turns all the active elements into strings. If one of them was the
empty string already, it will even fail.
If the elements are all numbers, it should work.
Q2 :

Is anything exact known about the internal mechanism of Math.random() ?

I don't know anything about the browsers' implementations. There is
nothing mandatory about it. Neither Javascript 1.5 nor ECMAScript
says anything except the range of the function. Quote from ECMA262:
---
random ( )
Returns a number value with positive sign, greater than or equal to 0
but less than 1, chosen randomly or pseudo randomly with
approximately uniform distribution over that range, using an
implementation-dependent algorithm or strategy. This function takes
no arguments.
---

Microsoft says that their Math.random is automatically seeded when
JScript is first loaded. Apart from that, there is very little
documentation.
<URL:http://msdn.microsoft.com/library/en-us/script56/html/js56jsmthrandom.asp>

/L
--
Lasse Reichstein Nielsen - lr*@hotpop.com
Art D'HTML: <URL:http://www.infimum.dk/HTML/randomArtSplit.html>
'Faith without judgement merely degrades the spirit divine.'
Jul 20 '05 #2
Lasse Reichstein Nielsen wrote:
Dr John Stockton <sp**@merlyn.demon.co.uk> writes:

Q1 :

Given an array such as might have been generated by

var A = [,,2,,,,4,,,6,,,8,,,]

is there a highly effective way of reducing it to [2,4,6,8] - i.e.
removing the undefineds and shifting the rest down?

You want something faster than running through the array with a for
loop and testing each element.

<snip>
Since we know that we need to use a built-in iteration, we should look
at the methods of the array property. I can't find any that looks
promising. Only sort and join really traverse the array, and both have
serious problems.

A.sort().slice(0,n) // would do it, but sorts; and the number
of active elements must be known.

Sorting is a problem if the elements aren't already sorted and the
order must be maintained. The efficiency will drop violently if you

Would have thought so too, but was unable to confirm.

and the sorting algorithm isn't even guaranteed to be stable, so I don't think there is a workaround.

As you say, ECMA 262 allows an unstable sort. IE and Opera seem to use a
stable version, whilst Mozilla/Netscape use an unstable sort. This is in
contrast to claims made in Netscape javascript 1.x documentation but I
suppose the disclaimer "sort does not work on all platforms" was
inserted for a purpose :)

<snip>

FYI, some test results proved interesting...

Sorting undefined elements to the top or the array using a compare function:

function sortUndefined(a,b)
{
var u,a=a===u,b=b===u;
return a==b?0:a?1:-1;
}

and a binary search to find and set the length of defined elements:

function definedLength(A)
{
var i,n,m,u,delta;

for(n=0,m=A.length;;)
{
delta=(m-n)>>1;
if(!delta)
break;
i=n+delta;
if(A[i]===u)
m=i;
else
n=i;
}
return (A.length = A[n]===u?n:m);
}

is not fast due to the sort, and speed did not improve by using the
in-built lexical compare function without calling sortUndefined above.
In comparison, the (hand) optimised code to remove undefined entries in
place:

function scanReduce(A)
{
var u,i=-1,j=-1,k=A.length;
while(++j<k)if(!(A[j]===u))A[++i]=A[j];
A.length=++i;
return A;
}

was roughly 10 to 100 times *faster* than sorting, depending on the
density and size of the sparse array.

Cheers,
Dom

Jul 20 '05 #3
In article <qG**************@merlyn.demon.co.uk>, Dr John Stockton
<sp**@merlyn.demon.co.uk> writes

Q1 :

Given an array such as might have been generated by

var A = [,,2,,,,4,,,6,,,8,,,]

is there a highly effective way of reducing it to [2,4,6,8] - i.e.
removing the undefineds and shifting the rest down?

<snip>

1 Is optimisation really needed?
Is the number of array elements large enough to make the processing
time actually visible to your end users? If not then Keep It Simple is
the golden rule.

2 Is the problem really needed?
Is the array machine-generated? If it is then can the generation be
altered to stop it being a problem array? (If human-generated then see
A).

John
--
John Harris
mailto:jo**@jgharris.demon.co.uk
Jul 20 '05 #4
JRS: In article <bf*********@drn.newsguy.com>, seen in
news:comp.lang.javascript, Lee <RE**************@cox.net> posted at Tue,
15 Jul 2003 13:07:02 :-
Dr Stockton said:

JRS: In article <7k**********@hotpop.com>, seen in
news:comp.lang.javascript, Lasse Reichstein Nielsen <lr*@hotpop.com>
posted at Mon, 14 Jul 2003 01:22:55 :-
Dr John Stockton <sp**@merlyn.demon.co.uk> writes:

Q1 :

Given an array such as might have been generated by

var A = [,,2,,,,4,,,6,,,8,,,]

is there a highly effective way of reducing it to [2,4,6,8] - i.e.
removing the undefineds and shifting the rest down?
... I feel that there should be a way of iterating through the
above array in four moves.

I doubt it's that easy.

Yes, should be != is
I suspect the line above allocates
all 16 elements of the Array, rather than using any sort of
"sparse array" data structure that remembers which elements
have been assigned.
The following executes "instantly"; too fast to create a thousand
million individual elements on a 64MB PII/300 :

B = []
B[1000000000] = 3
X = B[1000000000]

Elements numbered above and below #1000000000 appear identical.

Of course, there could be some difference between the results of that
and B = [,,,,,,,,,,,,,,,,,,,,,, ... ... ... ,,,,,,,,,,,,,,,,,,3]; but A
is in fact created by assigning a few sparse elements to an empty array.

I would look for a way to optimize whatever is creating the
Array in the first place.

That optimises the previous work!

--
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
Proper <= 4-line sig. separator as above, a line exactly "-- " (SonOfRFC1036)
Do not Mail News to me. Before a reply, quote with ">" or "> " (SonOfRFC1036)
Jul 20 '05 #5
JRS: In article <zn**********@hotpop.com>, seen in
news:comp.lang.javascript, Lasse Reichstein Nielsen <lr*@hotpop.com>
posted at Wed, 16 Jul 2003 20:34:32 :-
Lasse Reichstein Nielsen <lr*@hotpop.com> writes:
You could then start by collecting the indices and then sort
them numerically:

...
for (var i in B) {
indices.push(+i);
}
indices.sort();

Silly me, this doesn't sort the indices numerically, but lexically.
To sort numerically, one must use a custom comparison function, e.g.,

indices.sort(function(a,b){return a-b;});

This costs, but in a very sparse array, its complexity is still
only O(n(log(n)) where n is the number of real elements in the sparse
array.

Still absorbing your full answer; but it gives the sort of solution that
I felt should exist.

It is implied above that .sort() is much faster than .sort(F). For an
integer sort, provided that there is a known upper bound to the integers
and that the bound is less than o(1e15), one can add a large constant to
the numbers, use .sort(), and remove the bias; the extra arithmetic
takes only o(n).

It seems that in MSIE4 the elements are enumerated in the order of
creation, independently of subsequent changes to their contents :-

B = []
B[7] = 7 ; B[5] = 5 ; // B[7] = 99
S = '' ; for (j in B) S +=j
S

gives 75, both with and without the // .

--
<URL:http://jibbering.com/faq/> Jim Ley's FAQ for news:comp.lang.javascript
<URL:http://www.merlyn.demon.co.uk/js-index.htm> JS maths, dates, sources.
Jul 20 '05 #6
JRS: In article <u8***************@nnrp1.ozemail.com.au>, seen in
news:comp.lang.javascript, Dom Leonard <do*************@senet.andthis.co
m.au> posted at Tue, 15 Jul 2003 11:23:23 :-

DL posted the interesting function scanReduce herein :-

<script type="text/javascript"><!--

function scanReduce(A)
{
var u,i=-1,j=-1,k=A.length;
while(++j<k)if(!(A[j]===u))A[++i]=A[j];
A.length=++i;
return A;
}

//--></script>

That innocent function does show a noteworthy effect in Web page
validation with W3's TIDY - it contains the character pair '<k'.

That caused me no problem with W3's TIDY dated 2000. With 1 Feb 2003
TIDY, I get
line 20 column 3 - Warning: '<' + '/' + letter not allowed here
line 20 column 6 - Warning: discarding unexpected </script>
(line 20 starts //) - but only when the pair <!-- & //--> are used.

Now I know that <!-- & //--> are considered unnecessary with current
browsers; but they are still quite commonly present. It seems worth
mentioning as a warning, since it took me a while to spot the cause.

The effect does not occur if a space is added '< k', nor for '<2'.

--
<URL:http://jibbering.com/faq/> Jim Ley's FAQ for news:comp.lang.javascript
<URL:http://www.merlyn.demon.co.uk/js-index.htm> JS maths, dates, sources.
Jul 20 '05 #7
Dr John Stockton wrote:

That caused me no problem with W3's TIDY dated 2000. With 1 Feb 2003
TIDY, I get
line 20 column 3 - Warning: '<' + '/' + letter not allowed
here line 20 column 6 - Warning: discarding unexpected
</script> (line 20 starts //) - but only when the pair <!-- & //-->
are used.

You try to validate with XHTML doctype I think. As mentioned in
XHTML 1.0 spec., the script tag has a PCDATA content where <
means an opening tag. The spec says to declare the content of the
script as CDATA i.e. enclosing them in <![CDATA[ ... ]]>. Unhappily,
the JS engine throws a syntax error... The workaround consists in
Combined with the ancient way for non JS browsers (is it useful?):
<script type="text/javascript">
<!--
//<!{CDATA{
....
....
//]]>
-->
</script>

and TIDY is happy!

Another workaround is to write "while(k>++j)" instead...

--
Y.D.
Jul 20 '05 #8
In article <EV**************@merlyn.demon.co.uk>, Dr John Stockton
<sp**@merlyn.demon.co.uk> writes
<snip>
Perhaps there should be a FAQ, with links from the clj & unwa FAQs, on
the validation of Web pages containing script, including how to placate
TIDY when it does not really understand

<snip>

If there is there should be a link to the SourceForge site at
<URL:http://tidy.sourceforge.net/>

The Tidy code is all up in the air at the moment. We won't know for some
time if bugs introduced in the last of the old version will reappear in
the new version.

John
--
John Harris
mailto:jo**@jgharris.demon.co.uk
Jul 20 '05 #9
In article <3f**********************@news.free.fr>, YD <no*****@web.com>
writes
<snip>
As mentioned in
XHTML 1.0 spec., the script tag has a PCDATA content where <
means an opening tag. The spec says to declare the content of the
script as CDATA i.e. enclosing them in <![CDATA[ ... ]]>.

<snip>

And beware of any code like this :-)

if (a[b[c]]>d)

John
--
John Harris
mailto:jo**@jgharris.demon.co.uk
Jul 20 '05 #10
Dr John Stockton wrote:
[...] (I've been putting
'action=.' into '<form...>' for that; grammar requires forms, but
purpose requires no action).
Once I got the same error in a validation. After a sight on the
HTML4.01 DTD, I found the action attribute is required in a form tag
(just like alt for images, even if those attributes are empty strings,
which is recommended for alt with meaningless images as you
probably know).
Another trap was, in script, // that won't ...
which had to become // that will not ...

I try to reproduce this error I never noticed, and was not able! No error
found by both TIDY (2003/02) and HTMLTidy 1.15.

--
Y.D.

Jul 20 '05 #11
JRS: In article <JK**************@jgharris.demon.co.uk>, seen in
news:comp.lang.javascript, John G Harris <jo**@nospam.demon.co.uk>
posted at Mon, 21 Jul 2003 20:42:10 :-
In article <EV**************@merlyn.demon.co.uk>, Dr John Stockton
<sp**@merlyn.demon.co.uk> writes
<snip>
Perhaps there should be a FAQ, with links from the clj & unwa FAQs, on
the validation of Web pages containing script, including how to placate
TIDY when it does not really understand

<snip>

If there is there should be a link to the SourceForge site at
<URL:http://tidy.sourceforge.net/>

The Tidy code is all up in the air at the moment. We won't know for some
time if bugs introduced in the last of the old version will reappear in
the new version.

Let us hope they fix the error in line 2 of TIDY -h (2003 Feb 1) ,
see http://tidy.sourgeforge.net/

No doubt the "don't" problem was dependent on earlier code. TIDY was
deeply upset by my ws-dates.htm, which contained commented VBscript -
the comment character is "'". I doubled them all <G>.

Let us also hope that the DOS-32 version is retained; I drive it from a
batch file, to test all *.htm in the current directory that are dated
since the date of ws_ftp.log.

--