472,103 Members | 1,067 Online
Bytes | Software Development & Data Engineering Community
Post +

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,103 software developers and data experts.

random iteration of a sequence using generators

Hello everybody. Anybody has any ideas with the following problem?

We start with a list, and we wish to construct a generator that yields each element of the list exactly once, but in random order.

In other words, an iterator version of random.shuffle(list)

The motivation is the following: suppose you have a huge list and wish to extract one random element that satisfies a certain property. Typically, you expect many elements to have the property, so this approach:

-Select a random element, check if it has the property. If that is the case, yield the element, otherwise, loop

is more efficient that this other one:

-Filter the list and select one element.

However the first approach will have find trouble if none, or few elements have the property if your iteration of the list does not guarantee that every element is yielded at least once, and will not be efficient if the iteration does not guarantee that every element is yielded at most once.

Thanks everybody
Aug 18 '07 #1
2 1711
bartonc
6,596 Expert 4TB
Not sure if this satisfies all your requirements, but distribution looks good:
Expand|Select|Wrap|Line Numbers
  1. >>> from random import randint
  2. >>> def YieldRandomItem(aList):
  3. ...     beenYielded = []
  4. ...     size = len(aList) - 1
  5. ...     for item in aList:
  6. ...         i = randint(0, size)
  7. ...         while i in beenYielded:
  8. ...             i = randint(0, size)
  9. ...         beenYielded.append(i)
  10. ...         yield aList[i]
  11. ...         
  12. >>> myList = range(20)
  13. >>> for item in YieldRandomItem(myList):
  14. ...     print item
  15. ...     
  16. 9
  17. 10
  18. 19
  19. 11
  20. 12
  21. 1
  22. 14
  23. 18
  24. 16
  25. 7
  26. 5
  27. 2
  28. 3
  29. 6
  30. 8
  31. 17
  32. 13
  33. 4
  34. 15
  35.  
  36. >>> for item in YieldRandomItem(myList):
  37. ...     print item
  38. ...     
  39. 5
  40. 15
  41. 7
  42. 19
  43. 2
  44. 12
  45. 4
  46. 14
  47. 6
  48. 11
  49. 13
  50.  
  51. 10
  52. 1
  53. 17
  54. 9
  55. 3
  56. 8
  57. 18
  58. 16
  59. >>> 
Aug 18 '07 #2
Thanks for your answer, bartonc.
Well, this certainly works as a general method, though for the particular problem that motivated my question I would rather generate random numbers and check if they have the required property:

Expand|Select|Wrap|Line Numbers
  1. x=choice(aList)
  2. while not property(x):
  3.      x=choice(aList)
  4. return x
  5.  
However, I'd bet this can be done even better. I would like to do without having to check the property several times on the same item. Your proposal does this, but at the cost of having to make the check 'i in beenYielded' several times on the same item.

------

As a solution to the general problem, your solution won't do bad, specially if 'YieldRandomItem' is called for only a few items of the list (which was the motivation), but I believe it's possible to do even better, may be at the cost of having random numbers of poorer quality.

One comment about your solution: the number of iterations of the 'while' loop required to yield the i'th element follows a geometric distribution, with expected time: n/(n-i)
The total expected time if 'YieldRandomItem' has to run through the whole list is thus:
n/n+n/(n-1)+n/(n-2)+...+n
which is aproximately
n log(n)

So it takes generation of n log(n) random numbers and performing the check 'i in beenYielded' n log(n) times.

This last part takes 'n' operations for each check, which increases the total cost greatly, but the cost of a search operation 'x in aList' can be reduced to 'log(n)' if the list is replaced by a set.

-------

May be a method in the random module that pops a random element from a set would be helpful....
Aug 19 '07 #3

Post your reply

Sign in to post your reply or Sign up for a free account.

Similar topics

1 post views Thread by Brandon Michael Moore | last post: by
59 posts views Thread by Raymond Hettinger | last post: by
3 posts views Thread by Joe | last post: by
23 posts views Thread by MConly | last post: by
5 posts views Thread by Peteroid | last post: by
104 posts views Thread by fieldfallow | last post: by
40 posts views Thread by RadiationX | last post: by
10 posts views Thread by rplobue | last post: by
16 posts views Thread by jason.cipriani | last post: by
reply views Thread by leo001 | last post: by

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.