473,403 Members | 2,293 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,403 software developers and data experts.

Stratified random sampling with random_shuffle ?

Dear cpp-ians,

I want to apply a stratified sampling on an image. Following simplified
example will explain my problem.

The original image I with nrows and ncols is now a vector V of length
(nrow x ncol) and every element of the vector contians its pixel value.

vector float V [nrow * ncol] ;

This means that:
- the first ncol elements = first row of my image
- the elements between ncol and 2*ncol = the second row of my image
and so on ...

Now I want to select a stratified pixel of this vector and sample every
pixel till all pixels are sampled once. In the image it would
correspond to divide my image into small blocks and take a pixel of
each block iteratively till all pixels are done.

If I random_shuffle my vector and start at the beginning V till the end
I perform a random sampling, but I now want to add a certain order in
it so it corresponds to stratified random sampling. Is there anyone
with ideas on how I should do this?

Thanx in advance and kind regards,
Stef

Jul 23 '05 #1
1 4696
On 14 Apr 2005 02:55:06 -0700, "steflhermitte"
<st***************@agr.kuleuven.ac.be> wrote:
Dear cpp-ians,
[...]The original image I with nrows and ncols is now a vector V of length
(nrow x ncol) and every element of the vector contians its pixel value.

vector float V [nrow * ncol] ;
[...]Now I want to select a stratified pixel of this vector and sample every
pixel till all pixels are sampled once. In the image it would
correspond to divide my image into small blocks and take a pixel of
each block iteratively till all pixels are done.

If I random_shuffle my vector and start at the beginning V till the end
I perform a random sampling, but I now want to add a certain order in
it so it corresponds to stratified random sampling. Is there anyone
with ideas on how I should do this?

As random_shuffle will completely re-order the vector, the original
neighbor relation will be completely lost. I suggest creating a
"StratifiedVector", whose indexing operator returns a
StratifiedVector::Block, which could represent the sequence of
elements in a block from the original vector (like a gslice_array).

template <typename _Scalar,
typename _Container=std::vector<_Scalar> >
class StratifiedVector {
public:
StratifiedVector(_Container&,
size_t block_width, size_t block_height);
//define class Block, iterator &c.
...
};

You could then apply random_shuffle to each block. Alternatively,
define Block::sample() which returns a random element of a Block; if
you don't want repeats, Block::sample could internally use a random
sequence of indices. Depending on how you implemented
StratifiedVector, every block could use have their own sequence for
Block::sample or share a common one. Shuffling within each block will
produce a random sequence for each block but preserve the
stratification. A random shuffling of a StratifiedVector will produce
a random sequence of blocks and thus also preserve stratification.

You could then have two loops, the outer over block indexes and the
inner over the (random) sequence of blocks. Within the inner loop,
take the pixel within the current block. Note that with this
approach, having a different random sequence of blocks for each inner
loop is as expensive as shuffling V (which is what would be
happening). Example:

StratifiedVector<float> StratV(V, block_width, block_height);
//following shuffles could be done in StratifiedVector's
//constructor. If you provide a Block::sample, shuffling within
//blocks is unnecessary.
for (size_t i=0; i < StratV.size(); ++i) {
random_shuffle(StratV[i].begin(),
StratV[i].end());
}
random_shuffle(StratV.begin(), StratV.end());

size_t block_size=block_width*block_height;
for (size_t pixel_i=0; pixel_i < block_size; ++pixel_i) {
/* if you want a different sequence of blocks for each inner loop,
perform random_shuffle here rather than before the outer loop.
Expensive. */
//random_shuffle(StratV.begin(), StratV.end());
for (size_t block_i=0; block_i < StratV.size(); ++block_i)
{
/*Test is for blocks in the last row or column. If block_height
divides nrow and block_width divides ncol, the test can be
skipped.*/
if (pixel_i < StratV[block_i].size()) {
foo(StratV[block_i][pixel_i]);

/*self-sampling version; Block::sample() returns a random
element. If you use Block::sample, remove the above loop
which shuffles each block */
//foo(StratV[block_i].sample());
}
}
}

Alternatively, you could shuffle sequences of indices rather than
blocks or pixels. Producing different random sequences of blocks is
less expensive with this approach, but the extra index vectors are
more space expensive and indirection through them is more time
expensive. Also, each block must use the same random index sequence.
Example:

StratifiedVector<float> StratV(V, block_width, block_height);
vector<size_t> SV_idxes(StratV.size());
for (sizt_t i=0; i<SV_idxes.size(); ++i) {
SV_idxes[i]=i;
}
/* could also shuffle SV_idxes inside the outer 'for' loop */
random_shuffle(SV_idxes.begin(), SV_idxes.end());
vector<size_t> Pixel_idxes(block_width * block_height);
for (sizt_t i=0; i<Pixel_idxes.size(); ++i) {
Pixel_idxes[i]=i;
}
random_shuffle(Pixel_idxes.begin(), Pixel_idxes.end());
for (size_t pixel_i=0; pixel_i < block_size; ++pixel_i) {
/* if you want a different sequence of blocks for each inner loop,
perform random_shuffle here rather than after creating
SV_idxes. */
//random_shuffle(SV_idxes.begin(), SV_idxes.end());
for (size_t block_i=0; block_i < SV_idxes.size(); ++block_i)
{
/*Test is for blocks in the last row or column. If block_height
divides nrow and block_width divides ncol, the test can be
skipped.*/
if (Pixel_idxes[pixel_i] < StratV[SV_idxes[block_i]].size()) {
foo(StratV[SV_idxes[block_i]][Pixel_idxes[pixel_i]]);

/* Self-sampling version. If you use Block::sample(), remove
Pixel_idexes and loop which shuffles it. */
//foo(StratV[block_i].sample());
}
}
}
You could also combine the two approaches by shuffling within each
block and shuffling indices of StratifiedVector. This way, each block
will have a different random sequence and shuffling the sequence of
blocks isn't too expensive.

Note StratV.size() equals:
ceil(static_cast<float>(nrow)/block_height)
*ceil(static_cast<float>(ncol)/block_width);
If you know block_width and block_height will always be divisors of
nrow and ncol, you can use solely integer arithmetic in
StratifiedVector::size().

Kanenas
Jul 23 '05 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

9
by: Bart Nessux | last post by:
I am using method 'a' below to pick 25 names from a pool of 225. A co-worker is using method 'b' by running it 25 times and throwing out the winning name (names are associated with numbers) after...
21
by: Andreas Lobinger | last post by:
Aloha, i wanted to ask another problem, but as i started to build an example... How to generate (memory and time)-efficient a string containing random characters? I have never worked with...
5
by: Ross MacGregor | last post by:
I have a very simple yet complicated problem. I want to generate a random list of indices (int's) for a container. Let's say I have a container with 10 items and I want a list of 3 random...
5
by: Artyom | last post by:
Hi everybody! Here's the problem: I have an array of 4 integer elements and they must be initialized by random numbers from 0 to 9 that shouldn't repeat. I tried many things, but some elements...
16
by: glowfire | last post by:
Please, somebody help me with this program! We have a deck of 52 cards and we shuffle the deck by choosing a random card out of the deck placing it back in the deck at some random place. Repeat...
2
by: Nelis Franken | last post by:
Good day. Thanks for the previous help on binding member functions to use as predicates for STL functions (original example applied to sort()). The technique to use Boost's bind() works well,...
1
by: Generic Usenet Account | last post by:
I had a need to create my own RandomNumberGenerator class, for use with random_shuffle (I did not want the same sequence generated each time). I looked up an example provided by Nicolai M....
7
by: bipi | last post by:
Dear all, I found function rand(), it can create random number but this function can not define the range of number which I want to get it, such as, I want to get random number in the range from...
1
by: helezem | last post by:
pls cud any one help me with a program that generates samples randomly using simple random sampling
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.