Connecting Tech Pros Worldwide Forums | Help | Site Map

Better data structure?

Marcus Kwok
Guest
 
Posts: n/a
#1: Oct 10 '05
I am in the process of converting some legacy code (written in C) to
C++.

The original data structure is used to hold a table of data (Gaussian
distribution, some of you call it a "normal distribution"). There are
two values of relevance: the "z" value (telling how many standard
deviations away from the mean), and the "phi" value (cumulative
probability).

The data we are reading lists "z" values from -4.00 to +4.00, in 0.02
increments. As a result, the old data structure was an array declared
as such:

double probs[401][2];

where probs[i][0] is the "z" value and probs[i][1] is the "phi" value.

Later, we want to look up values in the table, and interpolate if the
exact values are not there, as follows:


double z, phi, tmp, ratio;
int indexL, indexH;

/* z is calculated here */

if (z <= -4.0)
phi = probs[0][1];
else if (z >= 4.0)
phi = probs[400][1];
else {
tmp = (z + 4.0) / 0.02;
indexL = floor(tmp);
indexH = ceil(tmp);
if (indexL == indexH)
phi = probs[indexL][1];
else {
ratio = (z - probs[indexL][0]) / 0.02;
phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);
}
}


Pretty ugly! So, I am trying to use better data structures, but it
doesn't seem to be a huge improvement:


std::vector< std::pair<double, double> > probs;

// z is calculated here

if (z <= -4.0) {
phi = probs.front().second;
}
else if (z >= 4.0) {
phi = probs.back().second;
}
else {
double tmp = (z + 4.0) / 0.02;
int indexL = static_cast<int>(std::floor(tmp));
int indexH = static_cast<int>(std::ceil(tmp));
if (indexL == indexH) {
phi = probs[indexL].second;
}
else {
double ratio = (z - probs[indexL].first) / 0.02;
phi = probs[indexL].second + ratio * (probs[indexH].second - probs[indexL].second);
}
}


Is there a better way to do this, maybe involving a std::map<double, double>?
However, my issue with the std::map<> is the floating-point inaccuracies
when comparing the keys to see if the exact key is in the table.

--
Marcus Kwok

Jay Nabonne
Guest
 
Posts: n/a
#2: Oct 10 '05

re: Better data structure?


On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
[color=blue]
> double probs[401][2];[/color]

struct Prob
{
double z;
double phi;
};

Prob probs[401];
[color=blue]
>
> exact values are not there, as follows:
>
>
> double z, phi, tmp, ratio;
> int indexL, indexH;
>
> /* z is calculated here */
>
> if (z <= -4.0)
> phi = probs[0][1];[/color]

phi = probs[0].phi;
[color=blue]
> else if (z >= 4.0)
> phi = probs[400][1];[/color]

phi = probs[400].phi;
[color=blue]
> else {
> tmp = (z + 4.0) / 0.02;
> indexL = floor(tmp);
> indexH = ceil(tmp);
> if (indexL == indexH)
> phi = probs[indexL][1];[/color]

phi = probs[indexL].phi;
[color=blue]
> else {
> ratio = (z - probs[indexL][0]) / 0.02;[/color]

ratio = (z - probs[indexL].z) / 0.02;
[color=blue]
> phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);[/color]

phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
[color=blue]
> }
> }
> }
> }[/color]

- Jay

Marcus Kwok
Guest
 
Posts: n/a
#3: Oct 10 '05

re: Better data structure?


Jay Nabonne <jay_nabonne@sonicnospam.com> wrote:[color=blue]
> On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
>[color=green]
>> double probs[401][2];[/color]
>
> struct Prob
> {
> double z;
> double phi;
> };
>
> Prob probs[401];
>[color=green]
>>
>> exact values are not there, as follows:
>>
>>
>> double z, phi, tmp, ratio;
>> int indexL, indexH;
>>
>> /* z is calculated here */
>>
>> if (z <= -4.0)
>> phi = probs[0][1];[/color]
>
> phi = probs[0].phi;
>[color=green]
>> else if (z >= 4.0)
>> phi = probs[400][1];[/color]
>
> phi = probs[400].phi;
>[color=green]
>> else {
>> tmp = (z + 4.0) / 0.02;
>> indexL = floor(tmp);
>> indexH = ceil(tmp);
>> if (indexL == indexH)
>> phi = probs[indexL][1];[/color]
>
> phi = probs[indexL].phi;
>[color=green]
>> else {
>> ratio = (z - probs[indexL][0]) / 0.02;[/color]
>
> ratio = (z - probs[indexL].z) / 0.02;
>[color=green]
>> phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);[/color]
>
> phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
>[color=green]
>> }
>> }
>> }
>> }[/color][/color]

Thanks. However, I was trying to avoid basic arrays and instead use a
std::vector<>, though maybe it will be clearer to create a struct (as
you did) instead of using a std::pair<>.

The main thing I am trying to do is clean up the calculation of the
index, and seeing if the exact value is in the table.

--
Marcus Kwok
Karl Heinz Buchegger
Guest
 
Posts: n/a
#4: Oct 11 '05

re: Better data structure?


Marcus Kwok wrote:[color=blue]
>
> Jay Nabonne <jay_nabonne@sonicnospam.com> wrote:[color=green]
> > On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
> >[color=darkred]
> >> double probs[401][2];[/color]
> >
> > struct Prob
> > {
> > double z;
> > double phi;
> > };
> >
> > Prob probs[401];
> >[color=darkred]
> >>
> >> exact values are not there, as follows:
> >>
> >>
> >> double z, phi, tmp, ratio;
> >> int indexL, indexH;
> >>
> >> /* z is calculated here */
> >>
> >> if (z <= -4.0)
> >> phi = probs[0][1];[/color]
> >
> > phi = probs[0].phi;
> >[color=darkred]
> >> else if (z >= 4.0)
> >> phi = probs[400][1];[/color]
> >
> > phi = probs[400].phi;
> >[color=darkred]
> >> else {
> >> tmp = (z + 4.0) / 0.02;
> >> indexL = floor(tmp);
> >> indexH = ceil(tmp);
> >> if (indexL == indexH)
> >> phi = probs[indexL][1];[/color]
> >
> > phi = probs[indexL].phi;
> >[color=darkred]
> >> else {
> >> ratio = (z - probs[indexL][0]) / 0.02;[/color]
> >
> > ratio = (z - probs[indexL].z) / 0.02;
> >[color=darkred]
> >> phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);[/color]
> >
> > phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
> >[color=darkred]
> >> }
> >> }
> >> }
> >> }[/color][/color]
>
> Thanks. However, I was trying to avoid basic arrays and instead use a
> std::vector<>, though maybe it will be clearer to create a struct (as
> you did) instead of using a std::pair<>.
>
> The main thing I am trying to do is clean up the calculation of the
> index, and seeing if the exact value is in the table.
>[/color]

I don't think there is much you can do. It already is short, is easy to understand.
I really don't see a need to clean it up.
The only thing I would do is: I would check if it ever happens that IndexL equals
indexH. If it happens, how often does it happen? Due to round-off errors I don't think
it is likely that the calculation of ( z + 4.0 ) / 0.02 equals a whole number when
using double arithmetic. So there is no point in having an 'if' that is almost never
taken. But I may be wrong, only a test could show that.

--
Karl Heinz Buchegger
kbuchegg@gascad.at
makc.the.great@gmail.com
Guest
 
Posts: n/a
#5: Oct 11 '05

re: Better data structure?



Marcus Kwok wrote:[color=blue]
> I am in the process of converting some legacy code (written in C) to
> C++.
>
> The original data structure is used to hold a table of data [is] Pretty ugly!
> So, I am trying to use better data structures...[/color]

In a process of converting legacy code it is nod a good idea to change
anything. There's russian saying, the "better" is an enemy of the
"good".

Marcus Kwok
Guest
 
Posts: n/a
#6: Oct 11 '05

re: Better data structure?


Marcus Kwok wrote:[color=blue][color=green]
>> I am in the process of converting some legacy code (written in C) to
>> C++.
>>
>> The original data structure is used to hold a table of data [is] Pretty ugly!
>> So, I am trying to use better data structures...[/color][/color]

makc.the.great@gmail.com wrote:[color=blue]
> In a process of converting legacy code it is nod a good idea to change
> anything. There's russian saying, the "better" is an enemy of the
> "good".[/color]

I am reimplementing the entire program (it's not too terribly big) so I
have full control over the entire code base, so it's not too big of a
deal. Converting it to C++ has allowed me to not worry as much about
resource management (since I use RAII a lot), leaving me more time to
spot inefficiencies in the algorithms we're using.

--
Marcus Kwok
Marcus Kwok
Guest
 
Posts: n/a
#7: Oct 11 '05

re: Better data structure?


>> Jay Nabonne <jay_nabonne@sonicnospam.com> wrote:[color=blue][color=green][color=darkred]
>> > On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
>> >
>> >> double probs[401][2];
>> >
>> > struct Prob
>> > {
>> > double z;
>> > double phi;
>> > };
>> >
>> > Prob probs[401];
>> >
>> >>
>> >> exact values are not there, as follows:
>> >>
>> >>
>> >> double z, phi, tmp, ratio;
>> >> int indexL, indexH;
>> >>
>> >> /* z is calculated here */
>> >>
>> >> if (z <= -4.0)
>> >> phi = probs[0][1];
>> >
>> > phi = probs[0].phi;
>> >
>> >> else if (z >= 4.0)
>> >> phi = probs[400][1];
>> >
>> > phi = probs[400].phi;
>> >
>> >> else {
>> >> tmp = (z + 4.0) / 0.02;
>> >> indexL = floor(tmp);
>> >> indexH = ceil(tmp);
>> >> if (indexL == indexH)
>> >> phi = probs[indexL][1];
>> >
>> > phi = probs[indexL].phi;
>> >
>> >> else {
>> >> ratio = (z - probs[indexL][0]) / 0.02;
>> >
>> > ratio = (z - probs[indexL].z) / 0.02;
>> >
>> >> phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);
>> >
>> > phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
>> >
>> >> }
>> >> }
>> >> }
>> >> }[/color][/color][/color]

Marcus Kwok wrote:[color=blue][color=green]
>> Thanks. However, I was trying to avoid basic arrays and instead use a
>> std::vector<>, though maybe it will be clearer to create a struct (as
>> you did) instead of using a std::pair<>.
>>
>> The main thing I am trying to do is clean up the calculation of the
>> index, and seeing if the exact value is in the table.[/color][/color]

Karl Heinz Buchegger <kbuchegg@gascad.at> wrote:[color=blue]
> I don't think there is much you can do. It already is short, is easy
> to understand. I really don't see a need to clean it up. The only
> thing I would do is: I would check if it ever happens that IndexL
> equals indexH. If it happens, how often does it happen? Due to
> round-off errors I don't think it is likely that the calculation of (
> z + 4.0 ) / 0.02 equals a whole number when using double arithmetic.
> So there is no point in having an 'if' that is almost never taken. But
> I may be wrong, only a test could show that.[/color]

Thanks for your response. I may go back and look at it later, but a
separate, more urgent issue (completely unrelated to this) has come up
in a different area of the code. Since it works now, I may leave it as
is.

--
Marcus Kwok
Closed Thread