On 7 Jul 2005 06:49:29 -0700,

er*********@lot o-quebec.com wrote:

In a program randomly generating 10 000 000 alphanumeric codes of 16

characters in length (Ex.: "ZAZAZAZAZAZAZ1 56"), what would be an

efficient way to ensure that I do not generate duplicates?

STL set, map?

Could you give me a little code example?

Thank you.

An alternative:

1 Generate a random string.

2 Hash the string to a value in a range > 0 - 9 999 999.

3 If this hash value has been used before then goto 1.

4 If the hash value is a new one then the string is also new.

Keep track of which hash values have appeared by using a bit array,

one bit per value in the hash range.

If you pick the hash range to be 25% larger than the number of strings

then on average you will only have to generate 4 strings to find a new

one at the end of the run.

Another alternative:

1 Generate a file of N > 10 000 000 unique strings in advance.

2 Pick a random number R in the range 1 .. N, both inclusive..

3 If R is already picked then pick another.

4 If R is not already picked then output the Rth string.

Again a bit array will keep track of the numbers that you have already

picked.

By generating the strings in advance you can take as long as you want

to get them right. You can load the file from disk for the actual run.

Yet another alternative:

1 Generate a file of N >= 10 000 000 unique strings in advance.

2 For IX = N downto 1

3 Pick a random number R in the range 1 .. IX, both inclusive.

4 Output string R.

5 Copy string IX over string R

6 Endfor

rossum

The ultimate truth is that there is no ultimate truth