473,383 Members | 1,737 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,383 software developers and data experts.

Re: random sorting a collection

On Wed, 28 May 2008 06:23:35 -0700, Adam Sandler <co****@excite.comwrote:
[...]
The problem is, the sorted array can now contain duplicates!
No, it can't. Not unless it had duplicates to start with.

Personally, I wouldn't bother with such a contrived loop. Something
simpler is fine:

for (int i = 0; i < a.Length; i++)
{
int j = r.Next(a.Length);
String temp = a[i];

a[i] = a[j];
a[j] = temp;
}

Then you avoid all the -1 and +1 and off-by-one. For the purposes of
shuffling the input, it doesn't matter that you might shuffle an item that
you'd already shuffled. It's still random.

That whole thread you referenced is a bit sketchy, if you ask me.

Pete
Jun 27 '08 #1
2 2216
On May 28, 3:39 pm, "Peter Duniho" <NpOeStPe...@nnowslpianmk.com>
wrote:

<snip>
Then you avoid all the -1 and +1 and off-by-one. For the purposes of
shuffling the input, it doesn't matter that you might shuffle an item that
you'd already shuffled. It's still random.
It ends up being less random though. It's less likely that something
will end up in its initial position than with a true shuffle, IIRC. It
takes a *little* more work to get rid of the off-by-one errors, but
it's not that bad.

I'd personally write it as:

// i=position we're going to replace (possibly with same element)
// Everything less than i is already shuffled
for (int i=0; i < a.Length; i++)
{
// target is in range [i, a.Length)
int target = i+rng.Next(a.Length-i);

String temp = a[i];
a[i] = a[target];
a[target] = temp;
}

Jon
Jun 27 '08 #2
On Wed, 28 May 2008 08:30:18 -0700, Jon Skeet [C# MVP] <sk***@pobox.com>
wrote:
On May 28, 3:39 pm, "Peter Duniho" <NpOeStPe...@nnowslpianmk.com>
wrote:

<snip>
>Then you avoid all the -1 and +1 and off-by-one. For the purposes of
shuffling the input, it doesn't matter that you might shuffle an item
that
you'd already shuffled. It's still random.

It ends up being less random though. It's less likely that something
will end up in its initial position than with a true shuffle, IIRC.
By the way, I did finally get a chance to try to look up a reference that
would refute or verify this. I found in Wikipedia an argument that points
out that for a shuffle, there are N! permutations, but that selecting a
random swap from every possible index on each iteration produces N^N
possible outcomes for the random selections. It then points out that N^N
is never divisible by N! and uses that to claim that some permutations
must occur more often (basically, there are duplicates in the possible
random outcomes, but only for some permutations).

Anyway, that's a long way of saying I think you're right.

As long as I'm commenting, I'll point out though that if we're going to
fix the algorithm so that it produces correct uniform distribution, we
might as well take advantage of the chance and reduce our iteration count
by one. So instead of:
for (int i = 0; i < a.Length; i++)
we can have:
for (int i = 0; i < a.Length - 1; i++)
Not nearly as good a nitpick as the one you came up with, but it's all
I've got. :)

There's another optimization with respect to not swapping with something
winds up in the same place, but that involves a much more (relatively
speaking) complex change to the basic algorithm, and so more easily
qualifies as a premature optimization. The above I think makes sense due
to its simplicity, even as I would ignore the potentially more beneficial
optimization. I find this whole thought process somewhat
counter-intuitive and ironic. :)

Pete
Jun 27 '08 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
by: Jean-Francois.Doyon | last post by:
Hello, I'm trying to retrieve a limited number of random rows, and order them by a column, and am not having any luck with that last part: SELECT * FROM tablename ORDER BY random(), id LIMIT...
22
by: Nhmiller | last post by:
Is there a way to do this? Thanks. Neil Cat Paintings At Carol Wilson Gallery http://www.carolwilsongallery.com
18
by: Scott | last post by:
I have a collection where the items in the collection are dates. I want to iterate over the collection and build a value list string for the rowsource of a listbox. The dates in the collection are...
7
by: Pete Davis | last post by:
A different question this time. I have a DataGrid bound to a collection. Is there any way for me to allow sorting? The DataGrid.AllowSorting=true doesn't work, but that's probably because it can't...
1
by: Patrick | last post by:
Hi, This post is the 'sequel' ;) of the "Data Oriented vs Object Oriented Design" post, but it can be read and treated apart from that one. I will just quote the beginning of my previous message...
19
by: Boris Borcic | last post by:
does x.sort(cmp = lambda x,y : cmp(random.random(),0.5)) pick a random shuffle of x with uniform distribution ? Intuitively, assuming list.sort() does a minimal number of comparisons to ...
16
by: Claudio Grondi | last post by:
I have a 250 Gbyte file (occupies the whole hard drive space) and want to change only eight bytes in this file at a given offset of appr. 200 Gbyte (all other data in that file should remain...
10
by: Sjaakie | last post by:
Hi, I'm, what it turns out to be, fooling around with 3-tier design. At several websites people get really enthusiastic about using custom dataobjects instead of datasets/-tables. While trying to...
13
by: Bruza | last post by:
I need to implement a "random selection" algorithm which takes a list of as input. Each of the (obj, prob) represents how likely an object, "obj", should be selected based on its probability of...
0
by: =?ISO-8859-1?Q?Arne_Vajh=F8j?= | last post by:
Adam Sandler wrote: The following code does both array and collection. It is not the most efficient way of doing it, but the algorithm is easy to understand. public class Util<T> {
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.