Connecting Tech Pros Worldwide Help | Site Map

Better data structure?

 
LinkBack Thread Tools Search this Thread
  #1  
Old October 10th, 2005, 08:15 PM
Marcus Kwok
Guest
 
Posts: n/a
Default Better data structure?

I am in the process of converting some legacy code (written in C) to
C++.

The original data structure is used to hold a table of data (Gaussian
distribution, some of you call it a "normal distribution"). There are
two values of relevance: the "z" value (telling how many standard
deviations away from the mean), and the "phi" value (cumulative
probability).

The data we are reading lists "z" values from -4.00 to +4.00, in 0.02
increments. As a result, the old data structure was an array declared
as such:

double probs[401][2];

where probs[i][0] is the "z" value and probs[i][1] is the "phi" value.

Later, we want to look up values in the table, and interpolate if the
exact values are not there, as follows:


double z, phi, tmp, ratio;
int indexL, indexH;

/* z is calculated here */

if (z <= -4.0)
phi = probs[0][1];
else if (z >= 4.0)
phi = probs[400][1];
else {
tmp = (z + 4.0) / 0.02;
indexL = floor(tmp);
indexH = ceil(tmp);
if (indexL == indexH)
phi = probs[indexL][1];
else {
ratio = (z - probs[indexL][0]) / 0.02;
phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);
}
}


Pretty ugly! So, I am trying to use better data structures, but it
doesn't seem to be a huge improvement:


std::vector< std::pair<double, double> > probs;

// z is calculated here

if (z <= -4.0) {
phi = probs.front().second;
}
else if (z >= 4.0) {
phi = probs.back().second;
}
else {
double tmp = (z + 4.0) / 0.02;
int indexL = static_cast<int>(std::floor(tmp));
int indexH = static_cast<int>(std::ceil(tmp));
if (indexL == indexH) {
phi = probs[indexL].second;
}
else {
double ratio = (z - probs[indexL].first) / 0.02;
phi = probs[indexL].second + ratio * (probs[indexH].second - probs[indexL].second);
}
}


Is there a better way to do this, maybe involving a std::map<double, double>?
However, my issue with the std::map<> is the floating-point inaccuracies
when comparing the keys to see if the exact key is in the table.

--
Marcus Kwok

  #2  
Old October 10th, 2005, 08:25 PM
Jay Nabonne
Guest
 
Posts: n/a
Default Re: Better data structure?

On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
[color=blue]
> double probs[401][2];[/color]

struct Prob
{
double z;
double phi;
};

Prob probs[401];
[color=blue]
>
> exact values are not there, as follows:
>
>
> double z, phi, tmp, ratio;
> int indexL, indexH;
>
> /* z is calculated here */
>
> if (z <= -4.0)
> phi = probs[0][1];[/color]

phi = probs[0].phi;
[color=blue]
> else if (z >= 4.0)
> phi = probs[400][1];[/color]

phi = probs[400].phi;
[color=blue]
> else {
> tmp = (z + 4.0) / 0.02;
> indexL = floor(tmp);
> indexH = ceil(tmp);
> if (indexL == indexH)
> phi = probs[indexL][1];[/color]

phi = probs[indexL].phi;
[color=blue]
> else {
> ratio = (z - probs[indexL][0]) / 0.02;[/color]

ratio = (z - probs[indexL].z) / 0.02;
[color=blue]
> phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);[/color]

phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
[color=blue]
> }
> }
> }
> }[/color]

- Jay

  #3  
Old October 10th, 2005, 08:35 PM
Marcus Kwok
Guest
 
Posts: n/a
Default Re: Better data structure?

Jay Nabonne <jay_nabonne@sonicnospam.com> wrote:[color=blue]
> On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
>[color=green]
>> double probs[401][2];[/color]
>
> struct Prob
> {
> double z;
> double phi;
> };
>
> Prob probs[401];
>[color=green]
>>
>> exact values are not there, as follows:
>>
>>
>> double z, phi, tmp, ratio;
>> int indexL, indexH;
>>
>> /* z is calculated here */
>>
>> if (z <= -4.0)
>> phi = probs[0][1];[/color]
>
> phi = probs[0].phi;
>[color=green]
>> else if (z >= 4.0)
>> phi = probs[400][1];[/color]
>
> phi = probs[400].phi;
>[color=green]
>> else {
>> tmp = (z + 4.0) / 0.02;
>> indexL = floor(tmp);
>> indexH = ceil(tmp);
>> if (indexL == indexH)
>> phi = probs[indexL][1];[/color]
>
> phi = probs[indexL].phi;
>[color=green]
>> else {
>> ratio = (z - probs[indexL][0]) / 0.02;[/color]
>
> ratio = (z - probs[indexL].z) / 0.02;
>[color=green]
>> phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);[/color]
>
> phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
>[color=green]
>> }
>> }
>> }
>> }[/color][/color]

Thanks. However, I was trying to avoid basic arrays and instead use a
std::vector<>, though maybe it will be clearer to create a struct (as
you did) instead of using a std::pair<>.

The main thing I am trying to do is clean up the calculation of the
index, and seeing if the exact value is in the table.

--
Marcus Kwok
  #4  
Old October 11th, 2005, 07:55 AM
Karl Heinz Buchegger
Guest
 
Posts: n/a
Default Re: Better data structure?

Marcus Kwok wrote:[color=blue]
>
> Jay Nabonne <jay_nabonne@sonicnospam.com> wrote:[color=green]
> > On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
> >[color=darkred]
> >> double probs[401][2];[/color]
> >
> > struct Prob
> > {
> > double z;
> > double phi;
> > };
> >
> > Prob probs[401];
> >[color=darkred]
> >>
> >> exact values are not there, as follows:
> >>
> >>
> >> double z, phi, tmp, ratio;
> >> int indexL, indexH;
> >>
> >> /* z is calculated here */
> >>
> >> if (z <= -4.0)
> >> phi = probs[0][1];[/color]
> >
> > phi = probs[0].phi;
> >[color=darkred]
> >> else if (z >= 4.0)
> >> phi = probs[400][1];[/color]
> >
> > phi = probs[400].phi;
> >[color=darkred]
> >> else {
> >> tmp = (z + 4.0) / 0.02;
> >> indexL = floor(tmp);
> >> indexH = ceil(tmp);
> >> if (indexL == indexH)
> >> phi = probs[indexL][1];[/color]
> >
> > phi = probs[indexL].phi;
> >[color=darkred]
> >> else {
> >> ratio = (z - probs[indexL][0]) / 0.02;[/color]
> >
> > ratio = (z - probs[indexL].z) / 0.02;
> >[color=darkred]
> >> phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);[/color]
> >
> > phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
> >[color=darkred]
> >> }
> >> }
> >> }
> >> }[/color][/color]
>
> Thanks. However, I was trying to avoid basic arrays and instead use a
> std::vector<>, though maybe it will be clearer to create a struct (as
> you did) instead of using a std::pair<>.
>
> The main thing I am trying to do is clean up the calculation of the
> index, and seeing if the exact value is in the table.
>[/color]

I don't think there is much you can do. It already is short, is easy to understand.
I really don't see a need to clean it up.
The only thing I would do is: I would check if it ever happens that IndexL equals
indexH. If it happens, how often does it happen? Due to round-off errors I don't think
it is likely that the calculation of ( z + 4.0 ) / 0.02 equals a whole number when
using double arithmetic. So there is no point in having an 'if' that is almost never
taken. But I may be wrong, only a test could show that.

--
Karl Heinz Buchegger
kbuchegg@gascad.at
  #5  
Old October 11th, 2005, 10:15 AM
makc.the.great@gmail.com
Guest
 
Posts: n/a
Default Re: Better data structure?


Marcus Kwok wrote:[color=blue]
> I am in the process of converting some legacy code (written in C) to
> C++.
>
> The original data structure is used to hold a table of data [is] Pretty ugly!
> So, I am trying to use better data structures...[/color]

In a process of converting legacy code it is nod a good idea to change
anything. There's russian saying, the "better" is an enemy of the
"good".

  #6  
Old October 11th, 2005, 12:55 PM
Marcus Kwok
Guest
 
Posts: n/a
Default Re: Better data structure?

Marcus Kwok wrote:[color=blue][color=green]
>> I am in the process of converting some legacy code (written in C) to
>> C++.
>>
>> The original data structure is used to hold a table of data [is] Pretty ugly!
>> So, I am trying to use better data structures...[/color][/color]

makc.the.great@gmail.com wrote:[color=blue]
> In a process of converting legacy code it is nod a good idea to change
> anything. There's russian saying, the "better" is an enemy of the
> "good".[/color]

I am reimplementing the entire program (it's not too terribly big) so I
have full control over the entire code base, so it's not too big of a
deal. Converting it to C++ has allowed me to not worry as much about
resource management (since I use RAII a lot), leaving me more time to
spot inefficiencies in the algorithms we're using.

--
Marcus Kwok
  #7  
Old October 11th, 2005, 09:55 PM
Marcus Kwok
Guest
 
Posts: n/a
Default Re: Better data structure?

>> Jay Nabonne <jay_nabonne@sonicnospam.com> wrote:[color=blue][color=green][color=darkred]
>> > On Mon, 10 Oct 2005 20:02:10 +0000, Marcus Kwok wrote:
>> >
>> >> double probs[401][2];
>> >
>> > struct Prob
>> > {
>> > double z;
>> > double phi;
>> > };
>> >
>> > Prob probs[401];
>> >
>> >>
>> >> exact values are not there, as follows:
>> >>
>> >>
>> >> double z, phi, tmp, ratio;
>> >> int indexL, indexH;
>> >>
>> >> /* z is calculated here */
>> >>
>> >> if (z <= -4.0)
>> >> phi = probs[0][1];
>> >
>> > phi = probs[0].phi;
>> >
>> >> else if (z >= 4.0)
>> >> phi = probs[400][1];
>> >
>> > phi = probs[400].phi;
>> >
>> >> else {
>> >> tmp = (z + 4.0) / 0.02;
>> >> indexL = floor(tmp);
>> >> indexH = ceil(tmp);
>> >> if (indexL == indexH)
>> >> phi = probs[indexL][1];
>> >
>> > phi = probs[indexL].phi;
>> >
>> >> else {
>> >> ratio = (z - probs[indexL][0]) / 0.02;
>> >
>> > ratio = (z - probs[indexL].z) / 0.02;
>> >
>> >> phi = probs[indexL][1] + ratio * (probs[indexH][1] - probs[indexL][1]);
>> >
>> > phi = probs[indexL].phi + ratio * (probs[indexH].phi - probs[indexL].phi);
>> >
>> >> }
>> >> }
>> >> }
>> >> }[/color][/color][/color]

Marcus Kwok wrote:[color=blue][color=green]
>> Thanks. However, I was trying to avoid basic arrays and instead use a
>> std::vector<>, though maybe it will be clearer to create a struct (as
>> you did) instead of using a std::pair<>.
>>
>> The main thing I am trying to do is clean up the calculation of the
>> index, and seeing if the exact value is in the table.[/color][/color]

Karl Heinz Buchegger <kbuchegg@gascad.at> wrote:[color=blue]
> I don't think there is much you can do. It already is short, is easy
> to understand. I really don't see a need to clean it up. The only
> thing I would do is: I would check if it ever happens that IndexL
> equals indexH. If it happens, how often does it happen? Due to
> round-off errors I don't think it is likely that the calculation of (
> z + 4.0 ) / 0.02 equals a whole number when using double arithmetic.
> So there is no point in having an 'if' that is almost never taken. But
> I may be wrong, only a test could show that.[/color]

Thanks for your response. I may go back and look at it later, but a
separate, more urgent issue (completely unrelated to this) has come up
in a different area of the code. Since it works now, I may leave it as
is.

--
Marcus Kwok
 

Bookmarks

Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On

Popular Articles

What is Bytes?

We are a network of experts and professionals in IT and software development that help one another with answers to tough questions and share insights. Get the best answers to your questions from over 220,662 network members.