473,322 Members | 1,493 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,322 software developers and data experts.

Re: random sorting a collection

On Wed, 28 May 2008 06:23:35 -0700, Adam Sandler <co****@excite.comwrote:
[...]
The problem is, the sorted array can now contain duplicates!
No, it can't. Not unless it had duplicates to start with.

Personally, I wouldn't bother with such a contrived loop. Something
simpler is fine:

for (int i = 0; i < a.Length; i++)
{
int j = r.Next(a.Length);
String temp = a[i];

a[i] = a[j];
a[j] = temp;
}

Then you avoid all the -1 and +1 and off-by-one. For the purposes of
shuffling the input, it doesn't matter that you might shuffle an item that
you'd already shuffled. It's still random.

That whole thread you referenced is a bit sketchy, if you ask me.

Pete
Jun 27 '08 #1
2 2210
On May 28, 3:39 pm, "Peter Duniho" <NpOeStPe...@nnowslpianmk.com>
wrote:

<snip>
Then you avoid all the -1 and +1 and off-by-one. For the purposes of
shuffling the input, it doesn't matter that you might shuffle an item that
you'd already shuffled. It's still random.
It ends up being less random though. It's less likely that something
will end up in its initial position than with a true shuffle, IIRC. It
takes a *little* more work to get rid of the off-by-one errors, but
it's not that bad.

I'd personally write it as:

// i=position we're going to replace (possibly with same element)
// Everything less than i is already shuffled
for (int i=0; i < a.Length; i++)
{
// target is in range [i, a.Length)
int target = i+rng.Next(a.Length-i);

String temp = a[i];
a[i] = a[target];
a[target] = temp;
}

Jon
Jun 27 '08 #2
On Wed, 28 May 2008 08:30:18 -0700, Jon Skeet [C# MVP] <sk***@pobox.com>
wrote:
On May 28, 3:39 pm, "Peter Duniho" <NpOeStPe...@nnowslpianmk.com>
wrote:

<snip>
>Then you avoid all the -1 and +1 and off-by-one. For the purposes of
shuffling the input, it doesn't matter that you might shuffle an item
that
you'd already shuffled. It's still random.

It ends up being less random though. It's less likely that something
will end up in its initial position than with a true shuffle, IIRC.
By the way, I did finally get a chance to try to look up a reference that
would refute or verify this. I found in Wikipedia an argument that points
out that for a shuffle, there are N! permutations, but that selecting a
random swap from every possible index on each iteration produces N^N
possible outcomes for the random selections. It then points out that N^N
is never divisible by N! and uses that to claim that some permutations
must occur more often (basically, there are duplicates in the possible
random outcomes, but only for some permutations).

Anyway, that's a long way of saying I think you're right.

As long as I'm commenting, I'll point out though that if we're going to
fix the algorithm so that it produces correct uniform distribution, we
might as well take advantage of the chance and reduce our iteration count
by one. So instead of:
for (int i = 0; i < a.Length; i++)
we can have:
for (int i = 0; i < a.Length - 1; i++)
Not nearly as good a nitpick as the one you came up with, but it's all
I've got. :)

There's another optimization with respect to not swapping with something
winds up in the same place, but that involves a much more (relatively
speaking) complex change to the basic algorithm, and so more easily
qualifies as a premature optimization. The above I think makes sense due
to its simplicity, even as I would ignore the potentially more beneficial
optimization. I find this whole thought process somewhat
counter-intuitive and ironic. :)

Pete
Jun 27 '08 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
by: Jean-Francois.Doyon | last post by:
Hello, I'm trying to retrieve a limited number of random rows, and order them by a column, and am not having any luck with that last part: SELECT * FROM tablename ORDER BY random(), id LIMIT...
22
by: Nhmiller | last post by:
Is there a way to do this? Thanks. Neil Cat Paintings At Carol Wilson Gallery http://www.carolwilsongallery.com
18
by: Scott | last post by:
I have a collection where the items in the collection are dates. I want to iterate over the collection and build a value list string for the rowsource of a listbox. The dates in the collection are...
7
by: Pete Davis | last post by:
A different question this time. I have a DataGrid bound to a collection. Is there any way for me to allow sorting? The DataGrid.AllowSorting=true doesn't work, but that's probably because it can't...
1
by: Patrick | last post by:
Hi, This post is the 'sequel' ;) of the "Data Oriented vs Object Oriented Design" post, but it can be read and treated apart from that one. I will just quote the beginning of my previous message...
19
by: Boris Borcic | last post by:
does x.sort(cmp = lambda x,y : cmp(random.random(),0.5)) pick a random shuffle of x with uniform distribution ? Intuitively, assuming list.sort() does a minimal number of comparisons to ...
16
by: Claudio Grondi | last post by:
I have a 250 Gbyte file (occupies the whole hard drive space) and want to change only eight bytes in this file at a given offset of appr. 200 Gbyte (all other data in that file should remain...
10
by: Sjaakie | last post by:
Hi, I'm, what it turns out to be, fooling around with 3-tier design. At several websites people get really enthusiastic about using custom dataobjects instead of datasets/-tables. While trying to...
13
by: Bruza | last post by:
I need to implement a "random selection" algorithm which takes a list of as input. Each of the (obj, prob) represents how likely an object, "obj", should be selected based on its probability of...
0
by: =?ISO-8859-1?Q?Arne_Vajh=F8j?= | last post by:
Adam Sandler wrote: The following code does both array and collection. It is not the most efficient way of doing it, but the algorithm is easy to understand. public class Util<T> {
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.