473,387 Members | 1,687 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,387 software developers and data experts.

Array class and pointer aliasing problems

(I'm having trouble to post this, sorry if this mail comes
several times)

I'm trying to optimize the use of an integer array class,
and I found that one major problem is pointer aliasing.

I consider the following example:

--------------------------------------------------------
class IntTab
{
public:
IntTab(int n, int m) :
data(new int[n*m]), dim0(n), dim1(m) {
}
~IntTab() {
delete data;
}
int ni() const {
return dim0;
}
int nj() const {
return dim1;
}
int & operator()(int i, int j) {
return data[i*dim1+j];
}
const int & operator()(int i, int j) const {
return data[i*dim1+j];
}
void myjob(IntTab & a) const;
private:
int * const data;
const int dim0;
const int dim1;
};

void myjob(const IntTab & a, IntTab & b)
{
int i, j, k;
int ni = b.ni();
int nj = b.nj();
for (i=0, k=0; i<ni; i++)
for (j=0; j<nj; j++, k++)
b(k,0) = a(i,j);
}

void IntTab::myjob(IntTab & a) const
{
int a_j = a.dim1;
int ni = dim0;
int nj = dim1;
int i, j, k;
int * a_data = a.data;
int * my_data = data;
for (i=0, k=0; i<ni; i++)
for (j=0; j<nj; j++, k++)
a_data[k*a_j] = my_data[i*nj+j];
}
---------------------------------------------------------

"myjob" is poorly optimized by the compiler (g++ 3.2)
because it assumes that in the process of writing b, I could
silently update the pointers (a.data and b.data) and the
index counters (a.dim1 and b.dim1) which are used in the
loop. Indeed, making these variables local as in
IntTab::myjob is much better but of course I prefer to write
my algorithms with the operator() which allows to easily put
some asserts for debugging.

Strangely enough, changing my array for an array of double
leads to the same problem, even with "-fstrict-aliasing".

Moreover, the "restrict" keyword might help here but it is
not supported by g++. Is there another way to tell the
compiler that caching the pointers is safe while preserving
readability ?

Is there something special with g++3.2 ?

Thanks,
Benoit MATHIEU

Jul 22 '05 #1
5 2350

"Mathieu Benoit" <be*************@free.fr> wrote in message
news:40***********************@news.free.fr...
(I'm having trouble to post this, sorry if this mail comes
several times)

I'm trying to optimize the use of an integer array class,
and I found that one major problem is pointer aliasing.

I consider the following example:

--------------------------------------------------------
class IntTab
{
public:
IntTab(int n, int m) :
data(new int[n*m]), dim0(n), dim1(m) {
}
~IntTab() {
delete data;
}
int ni() const {
return dim0;
}
int nj() const {
return dim1;
}
int & operator()(int i, int j) {
return data[i*dim1+j];
}
const int & operator()(int i, int j) const {
return data[i*dim1+j];
}
void myjob(IntTab & a) const;
private:
int * const data;
const int dim0;
const int dim1;
};

void myjob(const IntTab & a, IntTab & b)
{
int i, j, k;
int ni = b.ni();
int nj = b.nj();
for (i=0, k=0; i<ni; i++)
for (j=0; j<nj; j++, k++)
b(k,0) = a(i,j);
}

void IntTab::myjob(IntTab & a) const
{
int a_j = a.dim1;
int ni = dim0;
int nj = dim1;
int i, j, k;
int * a_data = a.data;
int * my_data = data;
for (i=0, k=0; i<ni; i++)
for (j=0; j<nj; j++, k++)
a_data[k*a_j] = my_data[i*nj+j];
}
---------------------------------------------------------

"myjob" is poorly optimized by the compiler (g++ 3.2)
because it assumes that in the process of writing b, I could
silently update the pointers (a.data and b.data) and the
index counters (a.dim1 and b.dim1) which are used in the
loop. Indeed, making these variables local as in
IntTab::myjob is much better but of course I prefer to write
my algorithms with the operator() which allows to easily put
some asserts for debugging.

Strangely enough, changing my array for an array of double
leads to the same problem, even with "-fstrict-aliasing".

Moreover, the "restrict" keyword might help here but it is
not supported by g++. Is there another way to tell the
compiler that caching the pointers is safe while preserving
readability ?

Is there something special with g++3.2 ?

Thanks,
Benoit MATHIEU

Why don't you pass the dimensions into the function and call it like this:
myjob(IntTab1, IntTab2, IntTab2.ni(), IntTab2.nj())
Jul 22 '05 #2
>>I'm trying to optimize an integer array class,
and I found that one major problem is pointer aliasing.
--------------------------------------------------------
class IntTab
{
public:
IntTab(int n, int m) :
data(new int[n*m]), dim0(n), dim1(m) {
}
~IntTab() {
delete data;
}
int ni() const {
return dim0;
}
int nj() const {
return dim1;
}
int & operator()(int i, int j) {
return data[i*dim1+j];
}
const int & operator()(int i, int j) const {
return data[i*dim1+j];
}
void myjob(IntTab & a) const;
private:
int * const data;
const int dim0;
const int dim1;
};

void myjob(const IntTab & a, IntTab & b)
{
int i, j, k;
int ni = b.ni();
int nj = b.nj();
for (i=0, k=0; i<ni; i++)
for (j=0; j<nj; j++, k++)
b(k,0) = a(i,j); <snip>"myjob" is poorly optimized by the compiler (g++ 3.2)
because it assumes that in the process of writing b, I could
silently update the pointers (a.data and b.data) and the
index counters (a.dim1 and b.dim1) which are used in the
loop.
<snip>
Why don't you pass the dimensions into the function and call it like this:
myjob(IntTab1, IntTab2, IntTab2.ni(), IntTab2.nj())


Would this really help ? I already made a local copy of ni
and nj... I can see in the assembly output that ni and nj
and handled efficiently, only data and dim1 are reloaded
every time...

Maybe this issue is strongly related to the implementation
details in g++ (which might be particularly conservative
about aliasing) and I should ask in another group. But if
this is a g++ issue and the answer is g++ specific, then I
will probably not use it...

I would be more interested if somebody knows about a C++
construct which clearly tells the compiler that the actual
arrays will never alias these indexes and pointers... Does
the standard say something about that ?

Jul 22 '05 #3

"Benoit Mathieu" <be*************@free.fr> wrote in message
news:40**********************@news.free.fr...
I'm trying to optimize an integer array class,
and I found that one major problem is pointer aliasing.
--------------------------------------------------------
class IntTab
{
public:
IntTab(int n, int m) :
data(new int[n*m]), dim0(n), dim1(m) {
}
~IntTab() {
delete data;
}
int ni() const {
return dim0;
}
int nj() const {
return dim1;
}
int & operator()(int i, int j) {
return data[i*dim1+j];
}
const int & operator()(int i, int j) const {
return data[i*dim1+j];
}
void myjob(IntTab & a) const;
private:
int * const data;
const int dim0;
const int dim1;
};

void myjob(const IntTab & a, IntTab & b)
{
int i, j, k;
int ni = b.ni();
int nj = b.nj();
for (i=0, k=0; i<ni; i++)
for (j=0; j<nj; j++, k++)
b(k,0) = a(i,j); <snip>"myjob" is poorly optimized by the compiler (g++ 3.2)
because it assumes that in the process of writing b, I could
silently update the pointers (a.data and b.data) and the
index counters (a.dim1 and b.dim1) which are used in the
loop. <snip>
Why don't you pass the dimensions into the function and call it like

this: myjob(IntTab1, IntTab2, IntTab2.ni(), IntTab2.nj())


Would this really help ? I already made a local copy of ni
and nj... I can see in the assembly output that ni and nj
and handled efficiently, only data and dim1 are reloaded
every time...

Maybe this issue is strongly related to the implementation
details in g++ (which might be particularly conservative
about aliasing) and I should ask in another group. But if
this is a g++ issue and the answer is g++ specific, then I
will probably not use it...

I would be more interested if somebody knows about a C++
construct which clearly tells the compiler that the actual
arrays will never alias these indexes and pointers... Does
the standard say something about that ?

Sorry I don't know.
What about doing it an an asm block, or something like that?
Jul 22 '05 #4
Benoit Mathieu wrote:

Maybe this issue is strongly related to the implementation
details in g++ (which might be particularly conservative
about aliasing) and I should ask in another group. But if
this is a g++ issue and the answer is g++ specific, then I
will probably not use it...

I would be more interested if somebody knows about a C++
construct which clearly tells the compiler that the actual
arrays will never alias these indexes and pointers... Does
the standard say something about that ?


What do you want to achieve?
As I see it, you want the compiler to replace the index
calculation in

for (j=0; j<nj; j++, k++)
b(k,0) = a(i,j);

with some sort of pointer magic. The same thing it can
do when you write

for( i = 0; i < size; ++i )
m[i] = ...

where the compiler can replace the whole thing with

basePtr = m;
for( i = 0; i < size; ++i, basePtr++ )
*basePtr = ...

If your compiler is unable to do this transformation in
your current case, then clearly it is a quality of implementation
issue (meaning: g++ specific).

But what can you do?
One thing you can do in any case is to look how others solve this
(or a similar) problem. What immediatly comes to my mind is the
STL. How do they do it? The STL has introduced a concept called
iterators. There are functions for getting the first iterator
and incrementing such an iterator. Well, in the above, transformed
example, basePtr acts as an iterator! It is assigned a starting value
and after that the pointer (aehh: iterator) is incremented and evaluated.
Hmm.

So you could add functions to your class IntTab:

class intTab
{
int* GetFirstColumn( int Row ) { return &data[ Row*dim1 ]; }
int GetNextColumnValue( int*& Pos ) { return *(Pos++); }

...
};

And your looping construct modifies to:

for( i = 0, k = 0; i < n1; ++i ) {
int* PosA = a.GetFirstColumn( i );

for( j = 0; j < nj; j++, k++ )
b(k,0) = a.GetNextColumnValue( PosA );
}

Try it and see, if the compiler is able to optimize this in a better
way.

Of course one could polish up the above by introducing a seperate
iterator class etc., but for a quick test to see if your compiler can
do a better job the above should be sufficient.

Yes I am aware, that this could be seen as premature oprimization
and as we all know this is the root of all evil :-)

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #5
Karl Heinz Buchegger wrote:
<snip>
So you could add functions to your class IntTab:

class intTab
{
int* GetFirstColumn( int Row ) { return &data[ Row*dim1 ]; }
int GetNextColumnValue( int*& Pos ) { return *(Pos++); }

...
};

And your looping construct modifies to:

for( i = 0, k = 0; i < n1; ++i ) {
int* PosA = a.GetFirstColumn( i );

for( j = 0; j < nj; j++, k++ )
b(k,0) = a.GetNextColumnValue( PosA );
}

Try it and see, if the compiler is able to optimize this in a better
way.

Of course one could polish up the above by introducing a seperate
iterator class etc., but for a quick test to see if your compiler can
do a better job the above should be sufficient.

Yes I am aware, that this could be seen as premature oprimization
and as we all know this is the root of all evil :-)


(why is it that I so often look at the assembly output,
that's certainly evil... :) )

Using an iterator is cool in the example that I showed :
seen from the compiler, it "is" a pointer, and from my point
of view it is also quite interesting since I can hide some
sanity checking in the iterator to prevent the user from
going out of bounds. Things get worse when I need to access
the data randomly (and this is in fact the general case, and
here it is much more intersting to have some bound
checking). An iterator won't do the job in this case... I
also wonder if preserving logical constness is easy with
iterators (I'll have a look in the STL, the answer probably
sits there).

I feel that there is no easy solution to my
performance/clarity/safety tradeoff problem (I mean, other
than "write the code in fortran" (uh!)). I will take it this
way : code every thing as clearly as possible with all the
safe bound checking, run the code and eventually perhaps
concentrate on the few pieces of code where I really spend a
lot of time (creating some "friend" functions where I can
hack with pointers and other evil things...)

Anyway this is interesting: I should have a look at the STL
more often. Thank you ...

Benoît Mathieu
Jul 22 '05 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

44
by: Carl | last post by:
"Nine Language Performance Round-up: Benchmarking Math & File I/O" http://www.osnews.com/story.php?news_id=5602 I think this is an unfair comparison! I wouldn't dream of developing a numerical...
204
by: Alexei A. Frounze | last post by:
Hi all, I have a question regarding the gcc behavior (gcc version 3.3.4). On the following test program it emits a warning: #include <stdio.h> int aInt2 = {0,1,2,4,9,16}; int aInt3 =...
3
by: Keith Rebello | last post by:
How do I pass an array of values as a property of a class? i.e. If Class1 has a property called XVals how do I assign an array of x-values (X()) to the XVals property of Class 1? I need an...
14
by: Peter Hallett | last post by:
I would like to set up a string array as a class member, or field, and then populate this array by reading in from a text file, but I cannot find the appropriate syntax. The getter and setter are...
23
by: sandy | last post by:
I need (okay, I want) to make a dynamic array of my class 'Directory', within my class Directory (Can you already smell disaster?) Each Directory can have subdirectories so I thought to put these...
3
by: David Mathog | last post by:
I have a program for which this line: if(! lstrtol(&atoken,length-2,(long *) &(lclparams->pad)) || (lclparams->pad< 0)){ generates the warning below, but ONLY if the gcc compiler is at -O2 or...
5
by: Immortal Nephi | last post by:
I would like to design an object using class. How can this class contain 10 member functions. Put 10 member functions into member function pointer array. One member function uses switch to call...
152
by: vippstar | last post by:
The subject might be misleading. Regardless, is this code valid: #include <stdio.h> void f(double *p, size_t size) { while(size--) printf("%f\n", *p++); } int main(void) { double array = { {...
18
by: Stephan Beal | last post by:
Hi, all! Before i ask my question, i want to clarify that my question is not about the code i will show, but about what the C Standard says should happen. A week or so ago it occurred to me...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.