473,385 Members | 1,341 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

Median of values in a std::map

Hi,
I have a std::map<id, timestampcollection.

The id is the normal reference so the key-value ordering is correct.
However, once in a blue moon, I want to prune this collection. So
that it doesn't happens to often, I prune the 50% oldest entries and
go back to normal usage of the collection. In order to do so, I need
to find the median timestamp then delete all entries older than the
median timestamp. The second part is easy, just iterate through the
map and delete items. My problem is finding the median. Since it
shouldn't be executed often, I used the dumb way: create a vector,
iterate through the map and insert the timestamp associated to each
element in the vector, sort the vector, item[size()/2] is the median.

i.e.

typedef std::map<id, timestampcollection_t;
collection_t collection;
//...
std::vector<timestamp_tallTimes;
allTimes.reserve(collection.size());
for(collection_t::const_iterator iter = collection.begin();
iter != collection.end() ; ++iter)
{
allTimes.push_back(iter->second);
}
std::sort(allTimes.begin(), allTimes.end());
const timestamp_t timeToPrune = allTimes.at(allTimes.size()/2);
It works, it does what I want, it's easy to understand, performance
isn't needed. It does the job. I should leave it there. But,
but... it just seems that there must be a more elegant way to do it
:-(

Any suggestions?

Thanks

Yan
Aug 30 '06 #1
5 3550
Yannick Tremblay wrote:
Hi,
I have a std::map<id, timestampcollection.

The id is the normal reference so the key-value ordering is correct.
However, once in a blue moon, I want to prune this collection. So
that it doesn't happens to often, I prune the 50% oldest entries and
go back to normal usage of the collection. In order to do so, I need
to find the median timestamp then delete all entries older than the
median timestamp. The second part is easy, just iterate through the
map and delete items. My problem is finding the median. Since it
shouldn't be executed often, I used the dumb way: create a vector,
iterate through the map and insert the timestamp associated to each
element in the vector, sort the vector, item[size()/2] is the median.

i.e.

typedef std::map<id, timestampcollection_t;
collection_t collection;
//...
std::vector<timestamp_tallTimes;
allTimes.reserve(collection.size());
for(collection_t::const_iterator iter = collection.begin();
iter != collection.end() ; ++iter)
{
allTimes.push_back(iter->second);
}
std::sort(allTimes.begin(), allTimes.end());
const timestamp_t timeToPrune = allTimes.at(allTimes.size()/2);

It works, it does what I want, it's easy to understand, performance
isn't needed. It does the job. I should leave it there. But,
but... it just seems that there must be a more elegant way to do it
:-(

Any suggestions?
To convey intent more clearly, you could use nth_element() instead of sort.
That would be all, I'd change. [It could also be faster since nth_element
is linear on average.]
Best

Kai-Uwe Bux
Aug 30 '06 #2

Yannick Tremblay wrote:
typedef std::map<id, timestampcollection_t;
collection_t collection;
//...
std::vector<timestamp_tallTimes;
allTimes.reserve(collection.size());
for(collection_t::const_iterator iter = collection.begin();
iter != collection.end() ; ++iter)
{
allTimes.push_back(iter->second);
}
std::sort(allTimes.begin(), allTimes.end());
const timestamp_t timeToPrune = allTimes.at(allTimes.size()/2);
It works, it does what I want, it's easy to understand, performance
isn't needed. It does the job. I should leave it there. But,
but... it just seems that there must be a more elegant way to do it
:-(
Nothing C++ specific..but finding the median in a vector in linear time
(in average case) is possible using the "partitioning" technique. Its a
step in the QuickSort method whose documentation should be on the net.
For worst-case linear time, you can google for "median of medians"
technique. The idea is to deal with groups (of 5 elements each
typically), find their median and use it partition.
But frankly I dont think it will make your code more elegant!

Aug 30 '06 #3
Yannick Tremblay wrote:
>
I have a std::map<id, timestampcollection.

The id is the normal reference so the key-value ordering is correct.
However, once in a blue moon, I want to prune this collection. So
that it doesn't happens to often, I prune the 50% oldest entries and
go back to normal usage of the collection. In order to do so, I need
to find the median timestamp then delete all entries older than the
median timestamp. The second part is easy, just iterate through the
map and delete items. My problem is finding the median. Since it
shouldn't be executed often, I used the dumb way: create a vector,
iterate through the map and insert the timestamp associated to each
element in the vector, sort the vector, item[size()/2] is the median.
You could avoid the space and time overhead by using the mean
instead of the median. The mean will be quite close to the median,
unless there is something strange about your distribution.

Of course, the code to do this would probably be longer :)

Aug 31 '06 #4

Old Wolf wrote:
Yannick Tremblay wrote:

I have a std::map<id, timestampcollection.

The id is the normal reference so the key-value ordering is correct.
However, once in a blue moon, I want to prune this collection. So
that it doesn't happens to often, I prune the 50% oldest entries and
go back to normal usage of the collection. In order to do so, I need
to find the median timestamp then delete all entries older than the
median timestamp. The second part is easy, just iterate through the
map and delete items. My problem is finding the median. Since it
shouldn't be executed often, I used the dumb way: create a vector,
iterate through the map and insert the timestamp associated to each
element in the vector, sort the vector, item[size()/2] is the median.

You could avoid the space and time overhead by using the mean
instead of the median. The mean will be quite close to the median,
unless there is something strange about your distribution.

Of course, the code to do this would probably be longer :)
Also there might be some problem of defining the mean of a timestamp.
Come to think about, I do not believe that the timestamps I've created
in my professional career would allow this without some tweaking.

/Peter

Aug 31 '06 #5
In article <11***************@irys.nyx.net>,
Yannick Tremblay <yt******@nyx.netwrote:
>Hi,
[ cut cut ]

Thanks to all thoses that replied. Confirms my "sensible head"
feeling that the code works and is clear and I shouldn't worry about
it.

Yan

Sep 6 '06 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
by: Woodster | last post by:
I have declared the following std::map<std::string, std::string> myMap; to pass myMap to functions should I be declaring functions as: void function(std::map<std::string, std::string>); ...
2
by: Serengeti | last post by:
Hello, in my class I have a map that translates strings to pointers to some member functions. The code goes like this: class F { typedef void (Function::*MathFuncPtr)(); std::map<std::string,...
1
by: Saeed Amrollahi | last post by:
Dear All C++ Programmers Hello I am Saeed Amrollahi. I am a software engineer in Tehran Sewerage Company. I try to use std::map and map::find member function. I use Visual Studio .NET. my...
19
by: Erik Wikström | last post by:
First of all, forgive me if this is the wrong place to ask this question, if it's a stupid question (it's my second week with C++), or if this is answered some place else (I've searched but not...
3
by: Dan Trowbridge | last post by:
Hi everyone, In my attempt to port code from VS 6.0 to VS.NET I had some code break along the way, mostly due to not adhereing closely to the C++ standard. This may be another instance but I...
1
by: Avery Fong | last post by:
The following program will result in a compile error when building under Debug but will compile under Release. Why does is work under Release mode but not under Debug This program is developed...
13
by: kamaraj80 | last post by:
Hi I am using the std:: map as following. typedef struct _SeatRowCols { long nSeatRow; unsigned char ucSeatLetter; }SeatRowCols; typedef struct _NetData
2
by: digz | last post by:
Hi, I am trying to write a program which has two threads one of them write to a map , and the other one deletes entries based on a certain criterion.. first I cannot get the delete portion to...
2
by: aiooua | last post by:
hello, I had a doubt regarding the handling of enums in std::map. Consider the following piece of code: ------ #include<iostream> #include<map> using namespace std; typedef enum {
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.