473,320 Members | 2,164 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,320 software developers and data experts.

Better data structure?

I am in the process of converting some legacy code (written in C) to
C++.

The original data structure is used to hold a table of data (Gaussian
distribution, some of you call it a "normal distribution"). There are
two values of relevance: the "z" value (telling how many standard
deviations away from the mean), and the "phi" value (cumulative
probability).

The data we are reading lists "z" values from -4.00 to +4.00, in 0.02
increments. As a result, the old data structure was an array declared
as such:

double probs[401][2];

where probs[i][0] is the "z" value and probs[i][1] is the "phi" value.

Later, we want to look up values in the table, and interpolate if the
exact values are not there, as follows:
double z, phi, tmp, ratio;
int indexL, indexH;

/* z is calculated here */

if (z <= -4.0)
phi = probs[0][1];
else if (z >= 4.0)
phi = probs[400][1];
else {
tmp = (z + 4.0) / 0.02;
indexL = floor(tmp);
indexH = ceil(tmp);
if (indexL == indexH)
phi = probs[indexL][1];
else {
ratio = (z - probs[indexL][0]) / 0.02;
phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);
}
}
Pretty ugly! So, I am trying to use better data structures, but it
doesn't seem to be a huge improvement:
std::vector< std::pair<double, double> > probs;

// z is calculated here

if (z <= -4.0) {
phi = probs.front().second;
}
else if (z >= 4.0) {
phi = probs.back().second;
}
else {
double tmp = (z + 4.0) / 0.02;
int indexL = static_cast<int>(std::floor(tmp));
int indexH = static_cast<int>(std::ceil(tmp));
if (indexL == indexH) {
phi = probs[indexL].second;
}
else {
double ratio = (z - probs[indexL].first) / 0.02;
phi = probs[indexL].second + ratio * (probs[indexH].second - probs[indexL].second);
}
}
Is there a better way to do this, maybe involving a std::map<double, double>?
However, my issue with the std::map<> is the floating-point inaccuracies
when comparing the keys to see if the exact key is in the table.

--
Marcus Kwok
Oct 10 '05 #1
6 1313
On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
double probs[401][2];
struct Prob
{
double z;
double phi;
};

Prob probs[401];

exact values are not there, as follows:
double z, phi, tmp, ratio;
int indexL, indexH;

/* z is calculated here */

if (z <= -4.0)
phi = probs[0][1];
phi = probs[0].phi;
else if (z >= 4.0)
phi = probs[400][1];
phi = probs[400].phi;
else {
tmp = (z + 4.0) / 0.02;
indexL = floor(tmp);
indexH = ceil(tmp);
if (indexL == indexH)
phi = probs[indexL][1];
phi = probs[indexL].phi;
else {
ratio = (z - probs[indexL][0]) / 0.02;
ratio = (z - probs[indexL].z) / 0.02;
phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);
phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
}
}
}
}


- Jay

Oct 10 '05 #2
Jay Nabonne <ja*********@sonicnospam.com> wrote:
On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
double probs[401][2];


struct Prob
{
double z;
double phi;
};

Prob probs[401];

exact values are not there, as follows:
double z, phi, tmp, ratio;
int indexL, indexH;

/* z is calculated here */

if (z <= -4.0)
phi = probs[0][1];


phi = probs[0].phi;
else if (z >= 4.0)
phi = probs[400][1];


phi = probs[400].phi;
else {
tmp = (z + 4.0) / 0.02;
indexL = floor(tmp);
indexH = ceil(tmp);
if (indexL == indexH)
phi = probs[indexL][1];


phi = probs[indexL].phi;
else {
ratio = (z - probs[indexL][0]) / 0.02;


ratio = (z - probs[indexL].z) / 0.02;
phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);


phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
}
}
}
}


Thanks. However, I was trying to avoid basic arrays and instead use a
std::vector<>, though maybe it will be clearer to create a struct (as
you did) instead of using a std::pair<>.

The main thing I am trying to do is clean up the calculation of the
index, and seeing if the exact value is in the table.

--
Marcus Kwok
Oct 10 '05 #3
Marcus Kwok wrote:

Jay Nabonne <ja*********@sonicnospam.com> wrote:
On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
double probs[401][2];


struct Prob
{
double z;
double phi;
};

Prob probs[401];

exact values are not there, as follows:
double z, phi, tmp, ratio;
int indexL, indexH;

/* z is calculated here */

if (z <= -4.0)
phi = probs[0][1];


phi = probs[0].phi;
else if (z >= 4.0)
phi = probs[400][1];


phi = probs[400].phi;
else {
tmp = (z + 4.0) / 0.02;
indexL = floor(tmp);
indexH = ceil(tmp);
if (indexL == indexH)
phi = probs[indexL][1];


phi = probs[indexL].phi;
else {
ratio = (z - probs[indexL][0]) / 0.02;


ratio = (z - probs[indexL].z) / 0.02;
phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);


phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
}
}
}
}


Thanks. However, I was trying to avoid basic arrays and instead use a
std::vector<>, though maybe it will be clearer to create a struct (as
you did) instead of using a std::pair<>.

The main thing I am trying to do is clean up the calculation of the
index, and seeing if the exact value is in the table.


I don't think there is much you can do. It already is short, is easy to understand.
I really don't see a need to clean it up.
The only thing I would do is: I would check if it ever happens that IndexL equals
indexH. If it happens, how often does it happen? Due to round-off errors I don't think
it is likely that the calculation of ( z + 4.0 ) / 0.02 equals a whole number when
using double arithmetic. So there is no point in having an 'if' that is almost never
taken. But I may be wrong, only a test could show that.

--
Karl Heinz Buchegger
kb******@gascad.at
Oct 11 '05 #4

Marcus Kwok wrote:
I am in the process of converting some legacy code (written in C) to
C++.

The original data structure is used to hold a table of data [is] Pretty ugly!
So, I am trying to use better data structures...


In a process of converting legacy code it is nod a good idea to change
anything. There's russian saying, the "better" is an enemy of the
"good".

Oct 11 '05 #5
Marcus Kwok wrote:
I am in the process of converting some legacy code (written in C) to
C++.

The original data structure is used to hold a table of data [is] Pretty ugly!
So, I am trying to use better data structures...

ma************@gmail.com wrote: In a process of converting legacy code it is nod a good idea to change
anything. There's russian saying, the "better" is an enemy of the
"good".


I am reimplementing the entire program (it's not too terribly big) so I
have full control over the entire code base, so it's not too big of a
deal. Converting it to C++ has allowed me to not worry as much about
resource management (since I use RAII a lot), leaving me more time to
spot inefficiencies in the algorithms we're using.

--
Marcus Kwok
Oct 11 '05 #6
>> Jay Nabonne <ja*********@sonicnospam.com> wrote:
> On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
>
>> double probs[401][2];
>
> struct Prob
> {
> double z;
> double phi;
> };
>
> Prob probs[401];
>
>>
>> exact values are not there, as follows:
>>
>>
>> double z, phi, tmp, ratio;
>> int indexL, indexH;
>>
>> /* z is calculated here */
>>
>> if (z <= -4.0)
>> phi = probs[0][1];
>
> phi = probs[0].phi;
>
>> else if (z >= 4.0)
>> phi = probs[400][1];
>
> phi = probs[400].phi;
>
>> else {
>> tmp = (z + 4.0) / 0.02;
>> indexL = floor(tmp);
>> indexH = ceil(tmp);
>> if (indexL == indexH)
>> phi = probs[indexL][1];
>
> phi = probs[indexL].phi;
>
>> else {
>> ratio = (z - probs[indexL][0]) / 0.02;
>
> ratio = (z - probs[indexL].z) / 0.02;
>
>> phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);
>
> phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
>
>> }
>> }
>> }
>> }

Marcus Kwok wrote: Thanks. However, I was trying to avoid basic arrays and instead use a
std::vector<>, though maybe it will be clearer to create a struct (as
you did) instead of using a std::pair<>.

The main thing I am trying to do is clean up the calculation of the
index, and seeing if the exact value is in the table.

Karl Heinz Buchegger <kb******@gascad.at> wrote: I don't think there is much you can do. It already is short, is easy
to understand. I really don't see a need to clean it up. The only
thing I would do is: I would check if it ever happens that IndexL
equals indexH. If it happens, how often does it happen? Due to
round-off errors I don't think it is likely that the calculation of (
z + 4.0 ) / 0.02 equals a whole number when using double arithmetic.
So there is no point in having an 'if' that is almost never taken. But
I may be wrong, only a test could show that.


Thanks for your response. I may go back and look at it later, but a
separate, more urgent issue (completely unrelated to this) has come up
in a different area of the code. Since it works now, I may leave it as
is.

--
Marcus Kwok
Oct 11 '05 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: Roberto A. F. De Almeida | last post by:
Hi, I'm interested in parsing a file containing this "structure": """dataset { int catalog_number; sequence { string experimenter; int32 time; structure {
6
by: Darren | last post by:
X-No-Archive Hi all, Can anyone help me with structuring the data in a small tool I have to build? I'm trying to work out if there's a design pattern or data structure that would remove some...
3
by: mrhicks | last post by:
Hello all, I have a question regarding efficeny and how to find the best approach when trying to find flag with in a structure of bit fields. I have several structures which look similar to ...
12
by: Jonathan Bartlett | last post by:
Just finished a new IBM DeveloperWorks article on linked lists, and thought you all might be interested. It's not an introduction -- it instead covers some of the more interesting aspects of...
39
by: windandwaves | last post by:
Hi Folk I have to store up to eight boolean bits of information about an item in my database. e.g. with restaurant drive-through facility yellow windows
15
by: Rob Meade | last post by:
Hi all, I have a databse which I'm pulling the data from for my ASP page. I have 4 tables, Course, Feature, Objective, and PreRequisite. The last three all contain a course product code and a...
19
by: Mary Pegg | last post by:
There's got to be a better way: if (isset($c)) echo $c."<br>"; if (isset($c)) echo $c."<br>"; if (isset($c)) echo $c."<br>"; if (isset($c)) echo $c."<br>"; if (isset($c)) echo $c."<br>"; but...
1
by: qbob | last post by:
I have data that is essentially tree-like, and to store it/iterate over it I have been using nested dictionaries, eg: Dictionary<int, Dictionary<string, Dictionary<double>>> data = new ... to...
36
by: istillshine | last post by:
I personally like C, and do not like any OO languages. The reference books for OO languages are too heavy for me. They just made things complicated. Someone laughed at my opinion, saying Google...
0
by: DolphinDB | last post by:
The formulas of 101 quantitative trading alphas used by WorldQuant were presented in the paper 101 Formulaic Alphas. However, some formulas are complex, leading to challenges in calculation. Take...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
0
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.