valarray<valarray<T> > efficiency

kwikius wrote:

>
My 2p's worth.. FWIW

Because a valarray is not a compile time fixed size it must? use heap
allocation, hence a matrix as view of list approach will likely only
require one allocation, whereas matrix as sequence of sequence will
require multiple allocations.

I believe even now that compilers find it difficult to do much
optimisation on heap pointers( even wary of stack pointers and
anything complicated in references), so it probably makes sense to use
as few pointers as possible, then the compiler only needs to deal with
a single offset.

Anyway its an interesting issue :-)

Given that TC++PL3 was written before the change in the standard that
*requires* that vectors, valarrays, etc. are implemented as contiguous
sequences, wouldn't the following be at least as efficient as using
valarrays and slices/slice_arrays to "simulate" a 2-dimensional matrix?
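#include <cstddef>
#include <iostream>
#include <valarray>

int main()
{
    using namespace std;

    // ten ints, each initialised to 1 (value first, then count)
    valarray<int> vi(1, 10);

    // view the valarray's storage as rows of 5 ints
    int (*p)[5]= reinterpret_cast<int (*)[5]>(&vi[0]);

    for (size_t i= 0; i< 2; ++i)
        for(size_t j=0; j<5; ++j)
            cout<< p[i][j]<<" ";

    cout<< endl;
}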
Here p behaves as a 2-dimensional matrix, that is, a 2x5 matrix.

Jan 3 '08 #3

Ioannis Vranos wrote:

kwikius wrote:
>>
My 2p's worth.. FWIW

Because a valarray is not a compile time fixed size it must? use heap
allocation, hence a matrix as view of list approach will likely only
require one allocation, whereas matrix as sequence of sequence will
require multiple allocations.

I believe even now that compilers find it difficult to do much
optimisation on heap pointers( even wary of stack pointers and
anything complicated in references), so it probably makes sense to
use as few pointers as possible, then the compiler only needs to
deal with a single offset.

Anyway its an interesting issue :-)

Given that TC++PL3 was written before the change in the standard that
*requires* that vectors, valarrays, etc are implemented as continuous
sequences,wouldn't the following be more efficient or at least the
same efficient as the usage of valarrays and slices/slice_arrays to
"simulate" a 2-dimensional matrix?

Given that 'reinterpret_cast' is not supposed to be used that way,
the following has undefined behaviour. Aside from that...

>

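#include <cstddef>
#include <iostream>
#include <valarray>

int main()
{
    using namespace std;

    // ten ints, each initialised to 1 (value first, then count)
    valarray<int> vi(1, 10);

    // view the valarray's storage as rows of 5 ints
    int (*p)[5]= reinterpret_cast<int (*)[5]>(&vi[0]);

    for (size_t i= 0; i< 2; ++i)
        for(size_t j=0; j<5; ++j)
            cout<< p[i][j]<<" ";

    cout<< endl;
}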
Here p behaves as a 2-dimensional matrix, that is, a 2x5 matrix.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jan 3 '08 #4

Victor Bazarov wrote:

>
Given that 'reinterpret_cast' is not supposed to be used that way,

Apart from the approach:

static_cast<int (*)[5]>( static_cast<void *>(&vi[0]) );
why is reinterpret_cast "not supposed to be used that way"?

Jan 3 '08 #5

Ioannis Vranos wrote:

Victor Bazarov wrote:
>>
Given that 'reinterpret_cast' is not supposed to be used that way,

Apart from the approach:

static_cast<int (*)[5]>( static_cast<void *>(&vi[0]) );
why is reinterpret_cast "not supposed to be used that way"?

If you reinterpret_cast one object pointer into another object
pointer, the Standard only specifies that the resulting pointer
can be converted back to the original type. Please see the Standard,
[expr.reinterpret.cast]/7. Whatever else you think you can do with
the pointer is, literally, not in the language specification.

As to the two consecutive static_cast operations, the result
of those is also unspecified, since you're not converting the
result of static_cast<void*> back to the original type. See
[expr.static.cast].

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jan 3 '08 #6

kwikius

On Jan 3, 4:49 pm, Ioannis Vranos <j...@no.spam> wrote:

kwikius wrote:

My 2p's worth.. FWIW

Because a valarray is not a compile time fixed size it must? use heap
allocation, hence a matrix as view of list approach will likely only
require one allocation, whereas matrix as sequence of sequence will
require multiple allocations.

I believe even now that compilers find it difficult to do much
optimisation on heap pointers( even wary of stack pointers and
anything complicated in references), so it probably makes sense to use
as few pointers as possible, then the compiler only needs to deal with
a single offset.

Anyway its an interesting issue :-)

Given that TC++PL3 was written before the change in the standard that
*requires* that vectors, valarrays, etc are implemented as continuous
sequences,wouldn't the following be more efficient or at least the same
efficient as the usage of valarrays and slices/slice_arrays to
"simulate" a 2-dimensional matrix?

The more interesting question to me is how necessary runtime
resizability of matrices is. If they are not resizeable, then
there is no need to allocate from the heap and valarray is unnecessary
(you can simply use a wrapped C-style array), in which case the
compiler has much better opportunities for optimisation.
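
As a rough illustration of the "wrapped C-style array" idea, here is a
minimal sketch, assuming the dimensions are known at compile time.
FixedMatrix and its members are hypothetical names made up for this
example; they do not come from the thread or from any library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size matrix: the dimensions are template parameters,
// so the storage is a plain C-style array inside the object (no heap use).
template <typename T, std::size_t Rows, std::size_t Cols>
class FixedMatrix {
    T data_[Rows][Cols];
public:
    FixedMatrix() : data_() {}   // value-initialise every element

    T& operator()(std::size_t r, std::size_t c) {
        assert(r < Rows && c < Cols);
        return data_[r][c];
    }
    const T& operator()(std::size_t r, std::size_t c) const {
        assert(r < Rows && c < Cols);
        return data_[r][c];
    }
    std::size_t rows() const { return Rows; }
    std::size_t cols() const { return Cols; }
};
```

A FixedMatrix<double, 3, 3> declared as a local variable lives entirely
on the stack, and the compiler sees both dimensions as constants.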

And N.B. some might call, e.g., an array of 3D points a matrix, whereas I
would see it as an array of matrices in which one dimension is equal to 1,
depending on the local convention.

regards
Andy Little

Jan 3 '08 #7

On 2008-01-03 16:03, john wrote:

Hi, in TC++PL 3 on pages 674-675 it is mentioned:
"Maybe your first idea for a two-dimensional vector was something like this:

class Matrix {
    valarray< valarray<double> > v;
public:
// ...
};

This would also work (22.9[10]). However, it is not easy to match
efficiency and compatibility required by high performance computations
without dropping to the lower and more conventional level represented by
valarray plus slices".
However since 1998 much time has passed, and I wonder if the current
compiler implementations allow valarray<valarray<T> > to be equally
efficient (or more) than using a valarray with slices/slice_arrays.

For various reasons valarray never became what it was meant to be, and
few people use it. Since few use it the library writers have not
bothered much with it so I would not be surprised if the code is more or
less the same as it was back in 1998. And even if they had it would
still not be able to match the efficiency of slices since a valarray of
valarrays adds an additional layer of indirection.
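
For comparison, the "valarray plus slices" layout can be sketched
roughly as follows. This is only an illustration under the assumption
of row-major storage in a single valarray; SliceMatrix, row() and
column() are hypothetical names, not the Matrix class from TC++PL.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Hypothetical 2-D view over one flat valarray, stored row by row.
class SliceMatrix {
    std::valarray<double> v_;
    std::size_t rows_, cols_;
public:
    SliceMatrix(std::size_t rows, std::size_t cols)
        : v_(0.0, rows * cols), rows_(rows), cols_(cols) {}  // value, then count

    double& operator()(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return v_[r * cols_ + c];   // a single offset into one allocation
    }

    // Row r: cols_ consecutive elements starting at r * cols_.
    std::valarray<double> row(std::size_t r) const {
        assert(r < rows_);
        return v_[std::slice(r * cols_, cols_, 1)];
    }
    // Column c: rows_ elements, cols_ apart, starting at c.
    std::valarray<double> column(std::size_t c) const {
        assert(c < cols_);
        return v_[std::slice(c, rows_, cols_)];
    }
};
```

Element access here costs one multiplication and one addition on a
single buffer, whereas valarray<valarray<double> > has to follow a
pointer to the inner valarray first.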

--
Erik Wikström

Jan 3 '08 #8

Victor Bazarov wrote:

Ioannis Vranos wrote:
>Victor Bazarov wrote:
>>Given that 'reinterpret_cast' is not supposed to be used that way,

Apart from the approach:

static_cast<int (*)[5]>( static_cast<void *>(&vi[0]) );
why is reinterpret_cast "not supposed to be used that way"?

If you reinterpret_cast one object pointer into another object
pointer, the Standard only specifies that the resulting pointer
can be converted back to the original type. Please see the Standard,
[expr.reinterpret.cast]/7. Whatever else you think you can do with
the pointer is, literally, not in the language specification.

There, it is mentioned:

"A pointer to an object can be explicitly converted to a pointer to an
object of different type. 65)".

"65) The types may have different cv-qualifiers, subject to the overall
restriction that a reinterpret_cast cannot cast away constness".

>
As to the two consecutive static_cast operations, the result
of those is also unspecified, since you're not converting the
result of static_cast<void*> back to the original type. See
[expr.static.cast].

Yes, I think in the standard, conversions from "pointer to object" to
"pointer to an array of n objects" is not explicitly mentioned, but I
think there isn't any decent implementation where it does not work
whenever the pointed object is member of an array (sequence).

Actually I have the feeling the guarantee exists in the standard and is
probably "buried" in some clause. :-)
So we have

int main()
{
    int array[5]= {0};

    int *p= &array[0];

    int (*q)[5]= reinterpret_cast<int (*)[5]>(p);
}

Shouldn't the last always work?

Jan 3 '08 #9

kwikius wrote:

>
>Given that TC++PL3 was written before the change in the standard that
*requires* that vectors, valarrays, etc are implemented as continuous
sequences,wouldn't the following be more efficient or at least the same
efficient as the usage of valarrays and slices/slice_arrays to
"simulate" a 2-dimensional matrix?

The more interesting question to me is how necessary runtime
resizability of matrices is. Because if they are not resizeable then
there is no need to allocate from the heap and valarray is unnecessary
(you can simple use a wrapped c-style array), in which case the
compiler has much better opportunities for optimisation.

valarray is not a typical container in terms of implementation. It is
intended to be heavily optimised, and it can use raw storage from
anywhere or something else AFAIK.

Jan 3 '08 #10

Erik Wikström wrote:

On 2008-01-03 16:03, john wrote:
>Hi, in TC++PL 3 on pages 674-675 it is mentioned:
"Maybe your first idea for a two-dimensional vector was something like this:

class Matrix {
    valarray< valarray<double> > v;
public:
// ...
};

This would also work (22.9[10]). However, it is not easy to match
efficiency and compatibility required by high performance computations
without dropping to the lower and more conventional level represented by
valarray plus slices".
However since 1998 much time has passed, and I wonder if the current
compiler implementations allow valarray<valarray<T> > to be equally
efficient (or more) than using a valarray with slices/slice_arrays.

For various reasons valarray never became what it was meant to be, and
few people use it. Since few use it the library writers have not
bothered much with it so I would not be surprised if the code is more or
less the same as it was back in 1998. And even if they had it would
still not be able to match the efficiency of slices since a valarray of
valarrays adds an additional layer of indirection.

How about using "pointer to array" pointing to a member of a valarray as
I mentioned in another post in the thread instead of slices, to
simulate a matrix?

Jan 3 '08 #11

Ioannis Vranos wrote:

Victor Bazarov wrote:
>Ioannis Vranos wrote:
>>Victor Bazarov wrote:
Given that 'reinterpret_cast' is not supposed to be used that way,

Apart from the approach:

static_cast<int (*)[5]>( static_cast<void *>(&vi[0]) );
why is reinterpret_cast "not supposed to be used that way"?

If you reinterpret_cast one object pointer into another object
pointer, the Standard only specifies that the resulting pointer
can be converted back to the original type. Please see the Standard,
[expr.reinterpret.cast]/7. Whatever else you think you can do with
the pointer is, literally, not in the language specification.

There, it is mentioned:

"A pointer to an object can be explicitly converted to a pointer to an
object of different type. 65)".

"65) The types may have different cv-qualifiers, subject to the
overall restriction that a reinterpret_cast cannot cast away constness".

>>
As to the two consecutive static_cast operations, the result
of those is also unspecified, since you're not converting the
result of static_cast<void*> back to the original type. See
[expr.static.cast].

Yes, I think in the standard, conversions from "pointer to object" to
"pointer to an array of n objects" is not explicitly mentioned, but I
think there isn't any decent implementation where it does not work
whenever the pointed object is member of an array (sequence).

Actually I have the feeling the guarantee exists in the standard and
is probably "buried" in some clause. :-)

Find it; everybody will be grateful.

[..]

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jan 3 '08 #12

On 2008-01-03 22:39, Ioannis Vranos wrote:

Erik Wikström wrote:
>On 2008-01-03 16:03, john wrote:
>>Hi, in TC++PL 3 on pages 674-675 it is mentioned:
"Maybe your first idea for a two-dimensional vector was something like this:

class Matrix {
    valarray< valarray<double> > v;
public:
// ...
};

This would also work (22.9[10]). However, it is not easy to match
efficiency and compatibility required by high performance computations
without dropping to the lower and more conventional level represented by
valarray plus slices".
However since 1998 much time has passed, and I wonder if the current
compiler implementations allow valarray<valarray<T> > to be equally
efficient (or more) than using a valarray with slices/slice_arrays.

For various reasons valarray never became what it was meant to be, and
few people use it. Since few use it the library writers have not
bothered much with it so I would not be surprised if the code is more or
less the same as it was back in 1998. And even if they had it would
still not be able to match the efficiency of slices since a valarray of
valarrays adds an additional layer of indirection.

How about using "pointer to array" pointing to a member of a valarray as
I mentioned in another post in the thread instead of slices, to
simulate a matrix?

If you want a matrix there are a number of libraries out there, or you
can write your own. In the end it usually comes down to a contiguous
piece of memory that is somehow divided into rows/columns, with some
simple arithmetic used to get the correct element. Besides, these
kinds of optimisations usually have a smaller impact than expected in
real-world applications, where operations are usually performed on the
whole matrix and where the real cost comes from the creation of
temporaries. That is why expression templates are used by all libraries
aiming for high performance.

--
Erik Wikström

Jan 4 '08 #13

On 2008-01-03 22:36, Ioannis Vranos wrote:

kwikius wrote:
>>
>>Given that TC++PL3 was written before the change in the standard that
*requires* that vectors, valarrays, etc are implemented as continuous
sequences,wouldn't the following be more efficient or at least the same
efficient as the usage of valarrays and slices/slice_arrays to
"simulate" a 2-dimensional matrix?

The more interesting question to me is how necessary runtime
resizability of matrices is. Because if they are not resizeable then
there is no need to allocate from the heap and valarray is unnecessary
(you can simple use a wrapped c-style array), in which case the
compiler has much better opportunities for optimisation.

valarray is not a typical container in terms of implementation. It is
intended to be heavily optimised, and it can use raw storage from
anywhere or something else AFAIK.

That was the intention; however, as I mentioned elsethread, that is not
the case. I do not know of any high-performance applications (and I
doubt that they exist) where valarrays are used. My understanding is
that valarray is designed to allow for optimisations (such as utilising
SIMD instructions), but no implementation has made those optimisations.

--
Erik Wikström

Jan 4 '08 #14

Erik Wikström wrote:

>
>valarray is not a typical container in terms of implementation. It is
intended to be heavily optimised, and it can use raw storage from
anywhere or something else AFAIK.

That was the intention, however, as I mentioned elsethread, that is not
the case. I do not know of any high-performance applications (and I
doubt that they exist) where valarrays are used. My understanding is
that valarray is designed to allow for optimisations (such as utilising
SIMD instructions), but no implementation have made those optimisations.

So, what kind of container is used for high performance C++ computing
instead? Those template math libraries on the web?

Jan 4 '08 #15

kwikius

On Jan 3, 9:36 pm, Ioannis Vranos <j...@no.spam> wrote:

kwikius wrote:

Given that TC++PL3 was written before the change in the standard that
*requires* that vectors, valarrays, etc are implemented as continuous
sequences,wouldn't the following be more efficient or at least the same
efficient as the usage of valarrays and slices/slice_arrays to
"simulate" a 2-dimensional matrix?

The more interesting question to me is how necessary runtime
resizability of matrices is. Because if they are not resizeable then
there is no need to allocate from the heap and valarray is unnecessary
(you can simple use a wrapped c-style array), in which case the
compiler has much better opportunities for optimisation.

valarray is not a typical container in terms of implementation. It is
intended to be heavily optimised, and it can use raw storage from
anywhere or something else AFAIK.

In practice I reckon you will find that implementations of valarray
use "new" to allocate; after all, if this cool allocation mechanism
exists, you might as well use it for new too.

regards
Andy Little

Jan 4 '08 #16

On 2008-01-04 01:16, Ioannis Vranos wrote:

Erik Wikström wrote:
>>
>>valarray is not a typical container in terms of implementation. It is
intended to be heavily optimised, and it can use raw storage from
anywhere or something else AFAIK.

That was the intention, however, as I mentioned elsethread, that is not
the case. I do not know of any high-performance applications (and I
doubt that they exist) where valarrays are used. My understanding is
that valarray is designed to allow for optimisations (such as utilising
SIMD instructions), but no implementation have made those optimisations.

So, what kind of container is used for high performance C++ computing
instead? Those template math libraries on the web?

For a non-sparse NxM matrix of type double I would assume that they
simply allocate an array of N*M elements. But I have not looked under
the hood on any of them.

--
Erik Wikström

Jan 4 '08 #17

Victor Bazarov wrote:

>
Ioannis Vranos wrote:

>There, it is mentioned:

"A pointer to an object can be explicitly converted to a pointer to an
object of different type. 65)".

"65) The types may have different cv-qualifiers, subject to the
overall restriction that a reinterpret_cast cannot cast away constness".
[...]
>Actually I have the feeling the guarantee exists in the standard and
is probably "buried" in some clause. :-)

Find it; everybody will be grateful.

I think the above quotation from the standard is sufficient.

Jan 4 '08 #18

Ioannis Vranos wrote:

Victor Bazarov wrote:
>>
Ioannis Vranos wrote:

>>There, it is mentioned:

"A pointer to an object can be explicitly converted to a pointer to
an object of different type. 65)".

"65) The types may have different cv-qualifiers, subject to the
overall restriction that a reinterpret_cast cannot cast away constness".
[...]
>>Actually I have the feeling the guarantee exists in the standard and
is probably "buried" in some clause. :-)

Find it; everybody will be grateful.

I think the above quotation from the standard is sufficient.

No, it's not. You conveniently omitted the rest of the paragraph
7, following the "type. 65)". Read it. The result of such conversion
is unspecified. That means no guarantees are given. Where you get
your "feeling the guarantee exists", I cannot imagine. If you find it
elsewhere, do post about it. Until you do, the guarantee that you
speak of does NOT exist, and that is expressed in the second sentence
of the paragraph 7 of [expr.reinterpret.cast].

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jan 4 '08 #19

Victor Bazarov wrote:

>
No, it's not. You conveniently omitted the rest of the paragraph
7, following the "type. 65)". Read it. The result of such conversion
is unspecified. That means no guarantees are given. Where you get
your "feeling the guarantee exists", I cannot imagine. If you find it
elsewhere, do post about it. Until you do, the guarantee that you
speak of does NOT exist, and that is expressed in the second sentence
of the paragraph 7 of [expr.reinterpret.cast].

Doesn't the standard mention that all kinds of automatic storage/heap
arrays are stored in a sequence?

Jan 4 '08 #20

Ioannis Vranos wrote:

Victor Bazarov wrote:
>>
No, it's not. You conveniently omitted the rest of the paragraph
7, following the "type. 65)". Read it. The result of such
conversion is unspecified. That means no guarantees are given. Where you
get your "feeling the guarantee exists", I cannot imagine.
If you find it elsewhere, do post about it. Until you do, the
guarantee that you speak of does NOT exist, and that is expressed in
the second sentence of the paragraph 7 of [expr.reinterpret.cast].

Doesn't the standard mention that all kinds of automatic storage/heap
arrays are stored in a sequence?

What does this have to do with 'reinterpret_cast'?

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jan 4 '08 #21

Victor Bazarov wrote:

>
>Doesn't the standard mention that all kinds of automatic storage/heap
arrays are stored in a sequence?

What does this have to do with 'reinterpret_cast'?

It has to do with converting an int * to int (*)[5] and vice versa. If a
C-style cast works for this, I think reinterpret_cast should also work, along
with static_cast<int (*)[5]>(static_cast<void *>(&i)); where i is an
int and a member of a sequence.

Jan 4 '08 #22

kwikius

On Jan 4, 7:08 pm, "Victor Bazarov" <v.Abaza...@comAcast.net> wrote:

Again, should also work for what? Heating your house by lighting
up grandpa's farts works, but you have to feed him too many beans
and open flame is dangerous indoors.

Victor .

Hasnt your long suffering grandpa suffered enough now from your
misguided experiments to save on heating bills?
;_)
regards
Andy Little

Jan 4 '08 #23

jk********@gmx.net wrote:

>
>Let's take a step at a time. Is the following guaranteed to work (to
output 50 zeros) always?

#include <cstddef>
#include <iostream>

inline void some_func(int *p, const std::size_t SIZE)
{
    using namespace std;

    for(size_t i=0; i<SIZE; ++i)
        cout<< p[i]<< " ";
}

int main()
{
    int array[10][5]= {0};

    some_func(array[0], sizeof(array)/sizeof(**array));

    std::cout<< std::endl;
}

I think, the code has undefined behavior. See the following thread for a
similar case:

groups.google.com/group/comp.lang.c++/browse_frm/thread/9c501bae821bd406

AFAIK it does not have undefined behavior. In every kind of array, the
members are in a sequence. Also, all POD objects can be considered as
sequences of chars or unsigned chars (bytes), and we can read them as such.

Jan 5 '08 #24

Lionel B

On Thu, 03 Jan 2008 17:03:04 +0200, john wrote:

Hi, in TC++PL 3 on pages 674-675 it is mentioned:
"Maybe your first idea for a two-dimensional vector was something like
this:

class Matrix {
    valarray< valarray<double> > v;
public:
// ...
};

This would also work (22.9[10]). However, it is not easy to match
efficiency and compatibility required by high performance computations
without dropping to the lower and more conventional level represented by
valarray plus slices".
However since 1998 much time has passed, and I wonder if the current
compiler implementations allow valarray<valarray<T> > to be equally
efficient (or more) than using a valarray with slices/slice_arrays.

I have been following this thread with some interest, as I actually use
valarray quite extensively and have used it to design (2 dim) matrix
classes.

The logic behind my using valarray is roughly as follows: my matrix
classes frequently interface to BLAS and LAPACK libraries which, being
Fortran, expect contiguous (C array-style) storage - this rules out the
"array-of-arrays" approach. So I could still use, say std::valarray,
std::vector or simply allocate basic arrays via new or malloc. I have
used all of those without problems (and not, to be honest, with much by
way of discernible performance difference - but see below) but I have
generally plumped for std::valarray on the grounds that - as I understand
it - it is supposed to guarantee non-aliasing of its (internal) arrays
and that this potentially allows compilers to optimise more efficiently.
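
To make the contiguity point concrete, here is a small sketch of
handing a valarray's storage to a routine that only understands a raw
pointer and a length. scale_in_place is a hypothetical stand-in written
for this example; it is not a real BLAS or LAPACK entry point.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Hypothetical stand-in for a Fortran/C-style numerical routine: it sees
// only a raw pointer and a length, and scales the n elements in place.
inline void scale_in_place(double* x, std::size_t n, double alpha) {
    assert(x != 0 || n == 0);
    for (std::size_t i = 0; i < n; ++i)
        x[i] *= alpha;
}

// Hand a valarray's storage to the routine. The standard guarantees that
// &v[i + j] == &v[i] + j, so &v[0] addresses all v.size() elements.
inline void scale(std::valarray<double>& v, double alpha) {
    if (v.size() != 0)
        scale_in_place(&v[0], v.size(), alpha);
}
```

The same call would not be possible with an "array-of-arrays" layout,
because each inner array has its own separate block of storage.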

Whether any modern compilers actually *do* take advantage of this
potential is, as discussed in this thread, a moot point. My suspicion is
that compilers which can vectorise array operations (via some hardware
facility) are starting to do so; in particular, I *think* I can point to
noticeable performance improvements for std::valarray (over std::vector,
etc.) with the Intel compiler ICC on modern Intel architectures and, to a
lesser extent, with later versions of GCC on modern Intel and AMD
architectures (I don't work on Windows platforms, so I can't speak to
Microsoft compilers).

In any case, I don't see any *harm* in preferring std::valarray over
std::vector or simply array allocation (apart from its flaky syntax -
those reversed ctor arguments catch me every time :() so if there is some
chance that compiler writers may be starting to implement the potential
optimisations then it seems reasonable to go with it.

BTW, I take the point someone raised about the dominant performance
overhead of temporary copying in general matrix manipulation, so should
point out that this is not currently an issue for me as I tend to manage
temporary copies (tediously!) "by hand".

I'd be interested in comment,

Regards,

--
Lionel B

Jan 5 '08 #25

James Kanze

On Jan 5, 3:01 pm, Ioannis Vranos <j...@no.spam> wrote:

jkherci...@gmx.net wrote:

Let's take a step at a time. Is the following guaranteed to
work (to output 50 zeros) always?

#include <cstddef>
#include <iostream>

inline void some_func(int *p, const std::size_t SIZE)
{
    using namespace std;

    for(size_t i=0; i<SIZE; ++i)
        cout<< p[i]<< " ";
}

int main()
{
    int array[10][5]= {0};

    some_func(array[0], sizeof(array)/sizeof(**array));

    std::cout<< std::endl;
}

I think, the code has undefined behavior. See the following
thread for a similar case:

groups.google.com/group/comp.lang.c++/browse_frm/thread/9c501bae821bd406

AFAIK it has not undefined behavior.

According to both the C and the C++ standard, it is undefined
behavior.

In all kinds of array, its members are in a sequence. Also all
POD objects can be considered as sequences of chars or
unsigned chars (bytes), and we can read them as such.

Reading an object as a sequence of bytes is a special case, and
I'm not sure how it applies here. In the above code, however,
array[0] is an array of 5 ints. The conversion to int* results
in a pointer to the first element in an array of 5 ints. The C
standard was very carefully worded to allow an implementation to
check this, and there have been implementations (and maybe still
are) which actually checked this. (Checking is not widespread,
because it requires fat pointers---the pointer must include not
only the address, but the legal bounds---, which have a very
definite negative impact on performance.)
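
For contrast, a traversal that stays inside each five-element row does
not raise the issue described above. The following is a minimal sketch;
ROWS, COLS and sum_all are names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t ROWS = 10;
const std::size_t COLS = 5;

// Sum every element without indexing past the end of a row: each inner
// loop stays inside one int[COLS] sub-array.
inline int sum_all(const int (&a)[ROWS][COLS]) {
    int total = 0;
    for (std::size_t i = 0; i < ROWS; ++i)
        for (std::size_t j = 0; j < COLS; ++j)
            total += a[i][j];
    return total;
}
```

Taking the array by reference keeps both dimensions in the type, so no
pointer derived from one row is ever used to reach into another.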

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jan 5 '08 #26

Lionel B wrote:

>
I have been following this thread with some interest, as I actually use
valarray quite extensively and have used it to design (2 dim) matrix
classes.

The logic behind my using valarray is roughly as follows: my matrix
classes frequently interface to BLAS and LAPACK libraries which, being
Fortran, expect contiguous (C array-style) storage - this rules out the
"array-of-arrays" approach. So I could still use, say std::valarray,
std::vector or simply allocate basic arrays via new or malloc. I have
used all of those without problems (and not, to be honest, with much by
way of discernible performance difference - but see below) but have I
generally plumped for std::valarray on the grounds that - as I understand
it - it is supposed to guarantee non-aliasing of its (internal) arrays
and that this potentially allows compilers to optimise more efficiently.

Whether any modern compilers actually *do* take advantage of this
potential is, as discussed in this thread, a moot point. My suspicion is
that compilers which can vectorise array operations (via some hardware
facility) are starting to do so; in particular, I *think* I can point to
noticeable performance improvements for std::valarray (over std::vector,
etc.) with the Intel compiler ICC on modern Intel architectures and, to a
lesser extent, with later versions of GCC on modern Intel and AMD
architectures (I don't work on Windows platforms, so I can't speak to
Microsoft compilers).

In any case, I don't see any *harm* in preferring std::valarray over
std::vector or simply array allocation (apart from its flaky syntax -
those reversed ctor arguments catch me every time :() so if there is some
chance that compiler writers may be starting to implement the potential
optimisations then it seems reasonable to go with it.

BTW, I take the point someone raised about the dominant performance
overhead of temporary copying in general matrix manipulation, so should
point out that this is not currently an issue for me as I tend to manage
temporary copies (tediously!) "by hand".

I'd be interested in comment,

I am currently reading Chapter 22 of TC++PL 3, which covers
valarrays, and from what I have read so far, valarrays are intended to
be heavily optimised (even by using parallel operations on multi-CPU
systems), so there are fewer assumptions we can make than with
other containers; for example, "valarrays are assumed to be alias free,
and the introduction of auxiliary types and ==>the elimination of
temporaries is allowed<== as long as the basic semantics are maintained".

So it is the only container for which I am not sure we can use pointers
and pointer arithmetic to access and manipulate its data, in addition to
accessing them via the subscript operator[].

Also "22.4.7 Temporaries, Copying and Loops" of TC++PL 3 may be useful
to you, since it describes a method of deferring the various
sub-calculations, until all data for a given calculation are available
to be used in a final_calculation_function, while inlining the
"sub-calculations".

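The deferral idea mentioned above can be sketched in a few lines. This
is a simplified illustration, not the code from TC++PL 22.4.7: Vec and
VecSum are hypothetical types, and only a single a + b is deferred.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Vec;

// Closure object: records the operands of a + b without computing anything.
struct VecSum {
    const Vec& lhs;
    const Vec& rhs;
    VecSum(const Vec& l, const Vec& r) : lhs(l), rhs(r) {}
};

class Vec {
    std::vector<double> d_;
public:
    explicit Vec(std::size_t n, double init = 0.0) : d_(n, init) {}
    std::size_t size() const { return d_.size(); }
    double& operator[](std::size_t i) { return d_[i]; }
    double operator[](std::size_t i) const { return d_[i]; }

    // The whole calculation happens here, in one loop, writing straight
    // into the destination; no temporary Vec is created.
    Vec& operator=(const VecSum& s) {
        assert(s.lhs.size() == size() && s.rhs.size() == size());
        for (std::size_t i = 0; i < size(); ++i)
            d_[i] = s.lhs[i] + s.rhs[i];
        return *this;
    }
};

// operator+ only builds the closure; evaluation is deferred to operator=.
inline VecSum operator+(const Vec& a, const Vec& b) { return VecSum(a, b); }
```

Writing c = a + b; with these types runs a single loop over the
elements, which is the effect the hand-managed temporaries aim for.
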
Jan 5 '08 #27

jkherciueh

Ioannis Vranos wrote:

jk********@gmx.net wrote:
>>
>>Let's take a step at a time. Is the following guaranteed to work (to
output 50 zeros) always?

#include <cstddef>
#include <iostream>

inline void some_func(int *p, const std::size_t SIZE)
{
    using namespace std;

    for(size_t i=0; i<SIZE; ++i)
        cout<< p[i]<< " ";
}

int main()
{
    int array[10][5]= {0};

    some_func(array[0], sizeof(array)/sizeof(**array));

    std::cout<< std::endl;
}

I think, the code has undefined behavior. See the following thread for a
similar case:

groups.google.com/group/comp.lang.c++/browse_frm/thread/9c501bae821bd406

AFAIK it has not undefined behavior. In all kinds of array, its members
are in a sequence.

True, but irrelevant. The problem is not with memory layout but with pointer
types and conversions.

C++ allows for bounds-checking pointers. The rules for pointer conversions
are crafted so that an implementation could decorate each pointer with the
bounds of the array from which it is obtained and propagate that
information through conversions. Using pointer arithmetic to access objects
outside the stored bounds could then trigger whatever the implementation
sees fit.

Also all POD objects can be considered as sequences
of chars or unsigned chars (bytes), and we can read them as such.

True, but (a) you are not converting pointer to byte and (b) you are not
converting from a pointer to an int[10][5].
Best

Kai-Uwe Bux

Jan 5 '08 #28

On 2008-01-05 20:06, Ioannis Vranos wrote:

Lionel B wrote:
>>
I have been following this thread with some interest, as I actually use
valarray quite extensively and have used it to design (2 dim) matrix
classes.

The logic behind my using valarray is roughly as follows: my matrix
classes frequently interface to BLAS and LAPACK libraries which, being
Fortran, expect contiguous (C array-style) storage - this rules out the
"array-of-arrays" approach. So I could still use, say std::valarray,
std::vector or simply allocate basic arrays via new or malloc. I have
used all of those without problems (and not, to be honest, with much by
way of discernible performance difference - but see below) but have I
generally plumped for std::valarray on the grounds that - as I understand
it - it is supposed to guarantee non-aliasing of its (internal) arrays
and that this potentially allows compilers to optimise more efficiently.

Whether any modern compilers actually *do* take advantage of this
potential is, as discussed in this thread, a moot point. My suspicion is
that compilers which can vectorise array operations (via some hardware
facility) are starting to do so; in particular, I *think* I can point to
noticeable performance improvements for std::valarray (over std::vector,
etc.) with the Intel compiler ICC on modern Intel architectures and, to a
lesser extent, with later versions of GCC on modern Intel and AMD
architectures (I don't work on Windows platforms, so I can't speak to
Microsoft compilers).

In any case, I don't see any *harm* in preferring std::valarray over
std::vector or simply array allocation (apart from its flaky syntax -
those reversed ctor arguments catch me every time :() so if there is some
chance that compiler writers may be starting to implement the potential
optimisations then it seems reasonable to go with it.

BTW, I take the point someone raised about the dominant performance
overhead of temporary copying in general matrix manipulation, so should
point out that this is not currently an issue for me as I tend to manage
temporary copies (tediously!) "by hand".

I'd be interested in comment,

I am currently reading Chapter 22 of TC++PL3, which covers
valarrays, and from what I have read so far, valarrays are intended to
be heavily optimised (even by using parallel operations on multi-cpu
systems) so there are fewer assumptions we can make in comparison to
other containers, for example "valarrays are assumed to be alias free,
and the introduction of auxiliary types and ==>the elimination of
temporaries is allowed<== as long as the basic semantics are maintained".

Might be of interest to the discussion:
http://www.oonumerics.org/oon/oonstd/archive/0018.html

--
Erik Wikström

Jan 5 '08 #29

Lionel B

On Sat, 05 Jan 2008 20:28:40 +0000, Erik Wikström wrote:

On 2008-01-05 20:06, Ioannis Vranos wrote:
>Lionel B wrote:
>>>
I have been following this thread with some interest, as I actually
use valarray quite extensively and have used it to design (2 dim)
matrix classes.

[...]

Might be of interest to the discussion:
http://www.oonumerics.org/oon/oonstd/archive/0018.html

Thanks... perhaps I hadn't realised how "dead" std::valarray really is.

There's also an interesting thread regarding aliasing and the
"restrict" (non-)keyword. Todd Veldhuizen writes:

"Having the NCEG "restrict" keyword would be more useful
than a built-in alias-free array class like valarray<>.
The restrict keyword has apparently been adopted into the C9x
standard, so hopefully it will become part of C++ in the future.
Already several C++ compilers support it (Cray,KAI C++,SGI)."

And this is circa 1988...

GCC at least has a __restrict__ extension.

--
Lionel B
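
A minimal sketch of the __restrict__ extension mentioned above, assuming GCC (Clang accepts the same spelling). It is not standard C++, and the function name scaled_add is hypothetical. The qualifier is a promise from the caller that the two arrays do not overlap, which is the same no-aliasing assumption valarray is meant to provide.

```cpp
#include <cassert>
#include <cstddef>

// Computes y[i] += a * x[i] for i in [0, n).
// __restrict__ tells the compiler that x and y never alias, so it is
// free to reorder or vectorise the loads and stores. Calling this with
// overlapping arrays breaks that promise and is undefined behavior.
void scaled_add(double* __restrict__ y, const double* __restrict__ x,
                double a, std::size_t n)
{
    assert(n == 0 || (x != 0 && y != 0));
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

Whether the qualifier changes the generated code depends on the compiler and optimisation level; the sketch only shows where the promise is written down.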

Jan 6 '08 #30

Lionel B

On Sat, 05 Jan 2008 21:06:00 +0200, Ioannis Vranos wrote:

Lionel B wrote:
>>
I have been following this thread with some interest, as I actually use
valarray quite extensively and have used it to design (2 dim) matrix
classes.

[...]

I am currently reading Chapter 22 of TC++PL3, which covers
valarrays, and from what I have read so far, valarrays are intended to
be heavily optimised (even by using parallel operations on multi-cpu
systems) so there are fewer assumptions we can make in comparison to
other containers, for example "valarrays are assumed to be alias free,
and the introduction of auxiliary types and ==>the elimination of
temporaries is allowed<== as long as the basic semantics are
maintained".

So it is the only container that I am not sure we can use pointers and
pointer arithmetic to access and manipulate its data, in addition to
accessing them via the subscript operator[].

I'd always understood that &v[0] is guaranteed to point to a contiguous
array (I've always blithely passed it as a Fortran array parameter
without any problems)... is that not the case?

Also "22.4.7 Temporaries, Copying and Loops" of TC++PL 3 may be useful
to you, since it describes a method of deferring the various
sub-calculations, until all data for a given calculation are available
to be used in a final_calculation_function, while inlining the
"sub-calculations".

Thanks, I'll check it out.

--
Lionel B

Jan 6 '08 #31

Jerry Coffin

In article <qO*****************@newsfe6-win.ntli.net>, me@privacy.net
says...

[ ... ]

So it is the only container that I am not sure we can use pointers and
pointer arithmetic to access and manipulate its data, in addition to
accessing them via the subscript operator[].

I'd always understood that &v[0] is guaranteed to point to a contiguous
array (I've always blithely passed it as a Fortran array parameter
without any problems)... is that not the case?

Theoretically, no; practically, yes. The C++ 98 standard didn't require
std::vector to use contiguous memory. All of the publicly available
implementations have done so however, and in C++ 0x, it will be
required.

FWIW, the same is true of std::string.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jan 6 '08 #32

Jerry Coffin wrote:

In article <qO*****************@newsfe6-win.ntli.net>, me@privacy.net
says...

[ ... ]

>>So it is the only container that I am not sure we can use pointers
and pointer arithmetic to access and manipulate its data, in
addition to accessing them via the subscript operator[].

I'd always understood that &v[0] is guaranteed to point to a
contiguous array (I've always blithely passed it as a Fortran array
parameter without any problems)... is that not the case?

Theoretically, no; practically, yes. The C++ 98 standard didn't
require std::vector to use contiguous memory. All of the publicly
available implementations have done so however, and in C++ 0x, it
will be required.

It is already required in current [2003] Standard. See [lib.vector],
paragraph 1.

FWIW, the same is true of std::string.

Not sure what "the same" you mean here.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
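
A minimal sketch of what the contiguity guarantee for std::vector buys in practice: handing &v[0] to a routine that only understands raw arrays. The function raw_sum is a hypothetical stand-in for a C or Fortran routine, not a real BLAS call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for an external routine taking a plain array and a length.
double raw_sum(const double* data, std::size_t n)
{
    assert(n == 0 || data != 0);
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// Passes the vector's storage to raw_sum as one contiguous block.
double sum_vector(const std::vector<double>& v)
{
    // v[0] does not exist for an empty vector, so guard that case.
    return v.empty() ? 0.0 : raw_sum(&v[0], v.size());
}
```

The pointer is only good until the vector reallocates or is destroyed, so it should be taken at the call site rather than stored.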

Jan 6 '08 #33

Jerry Coffin

In article <XO******************************@comcast.com>,
v.********@comAcast.net says...

Jerry Coffin wrote:
In article <qO*****************@newsfe6-win.ntli.net>, me@privacy.net
says...

[ ... ]

>So it is the only container that I am not sure we can use pointers
and pointer arithmetic to access and manipulate its data, in
addition to accessing them via the subscript operator[].

I'd always understood that &v[0] is guaranteed to point to a
contiguous array (I've always blithely passed it as a Fortran array
parameter without any problems)... is that not the case?
Theoretically, no; practically, yes. The C++ 98 standard didn't
require std::vector to use contiguous memory. All of the publicly
available implementations have done so however, and in C++ 0x, it
will be required.

It is already required in current [2003] Standard. See [lib.vector],
paragraph 1.

Ah, I hadn't noticed that. Thanks for the heads-up.

FWIW, the same is true of std::string.

Not sure what "the same" you mean here.

That C++ 98 didn't require it to use contiguous storage, but C++ 0x
will.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jan 6 '08 #34

Hi, in TC++PL3 it is mentioned that a slice_array cannot be copied,
however the following code compiled in my system even with "g++ -ansi
-pedantic-errors":

#include <iostream>
#include <valarray>

int main()
{
using namespace std;

int array[]={1,2,3,4,5};

valarray<int> v(array, sizeof(array)/sizeof(*array));

slice_array<int> result= v[slice(0, v.size()/2 + v.size()%2, 2)];

slice_array<int> result2= result;

cout<< endl;
}

Broken implementation?

Jan 9 '08 #35

Ioannis Vranos wrote:

Hi, in TC++PL3 it is mentioned that a slice_array cannot be copied,
however the following code compiled in my system even with "g++ -ansi
-pedantic-errors":

#include <iostream>
#include <valarray>

int main()
{
using namespace std;

int array[]={1,2,3,4,5};

valarray<int> v(array, sizeof(array)/sizeof(*array));

slice_array<int> result= v[slice(0, v.size()/2 + v.size()%2, 2)];

slice_array<int> result2= result;

cout<< endl;
}

Broken implementation?

I am not familiar enough with slice_array. Does the Standard say
that it "shall not be copied" or does it simply leave the copying
undefined (up to the implementation to provide it if it wants to)?

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jan 9 '08 #36

Victor Bazarov wrote:

Ioannis Vranos wrote:
>Hi, in TC++PL3 it is mentioned that a slice_array cannot be copied,
however the following code compiled in my system even with "g++ -ansi
-pedantic-errors":

#include <iostream>
#include <valarray>

int main()
{
using namespace std;

int array[]={1,2,3,4,5};

valarray<int> v(array, sizeof(array)/sizeof(*array));

slice_array<int> result= v[slice(0, v.size()/2 + v.size()%2, 2)];

slice_array<int> result2= result;

cout<< endl;
}

Broken implementation?

I am not familiar enough with slice_array. Does the Standard say
that it "shall not be copied" or does it simply leave the copying
undefined (up to the implementation to provide it if it wants to)?

The standard says:

"26.3.5.1 slice_array constructors
[lib.cons.slice.arr]
slice_array();
slice_array(const slice_array&);
The slice_array template has no public constructors. These constructors
are declared to be private.
These constructors need not be defined.
26.3.5.2 slice_array assignment
[lib.slice.arr.assign]
void operator=(const valarray<T>&) const;
slice_array& operator=(const slice_array&);
The second of these two assignment operators is declared private and
need not be defined. The first has reference semantics, assigning the
values of the argument array elements to selected elements of the
valarray<T> object to which the slice_array object refers.

26.3.5.3 slice_array computed assignment
[lib.slice.arr.comp.assign]
void operator*= (const valarray<T>&) const;
void operator/= (const valarray<T>&) const;
void operator%= (const valarray<T>&) const;
void operator+= (const valarray<T>&) const;
void operator-= (const valarray<T>&) const;
void operator^= (const valarray<T>&) const;
void operator&= (const valarray<T>&) const;
void operator|= (const valarray<T>&) const;
void operator<<=(const valarray<T>&) const;
void operator>>=(const valarray<T>&) const;
These computed assignments have reference semantics, applying the
indicated operation to the elements of the argument array and selected
elements of the valarray<T> object to which the slice_array object refers".
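
A minimal sketch of the reference semantics described in the quoted text: assigning through a slice writes into the selected elements of the underlying valarray, with no named slice_array object involved. The function name set_column and the row-major layout are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Overwrites column `col` of a rows x cols matrix stored row-major in m.
void set_column(std::valarray<int>& m, std::size_t rows, std::size_t cols,
                std::size_t col, const std::valarray<int>& values)
{
    assert(m.size() == rows * cols && col < cols && values.size() == rows);
    // m[slice] yields a temporary slice_array<int>; its operator= copies
    // `values` into elements col, col + cols, col + 2*cols, ... of m.
    m[std::slice(col, rows, cols)] = values;
}
```

The slice arguments are start, length and stride, so for a 2x3 matrix and col == 1 the elements touched are m[1] and m[4].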

Jan 9 '08 #37

Ioannis Vranos wrote:

Victor Bazarov wrote:
>Ioannis Vranos wrote:
>>Hi, in TC++PL3 it is mentioned that a slice_array cannot be copied,
however the following code compiled in my system even with "g++
-ansi -pedantic-errors":

#include <iostream>
#include <valarray>

int main()
{
using namespace std;

int array[]={1,2,3,4,5};

valarray<int> v(array, sizeof(array)/sizeof(*array));

slice_array<int> result= v[slice(0, v.size()/2 + v.size()%2, 2)];

slice_array<int> result2= result;

cout<< endl;
}

Broken implementation?

I am not familiar enough with slice_array. Does the Standard say
that it "shall not be copied" or does it simply leave the copying
undefined (up to the implementation to provide it if it wants to)?

The standard says:

"26.3.5.1 slice_array constructors
[lib.cons.slice.arr]
slice_array();
slice_array(const slice_array&);
The slice_array template has no public constructors. These
constructors are declared to be private.
These constructors need not be defined.
26.3.5.2 slice_array assignment
[lib.slice.arr.assign]
void operator=(const valarray<T>&) const;
slice_array& operator=(const slice_array&);

Neither of those is a constructor.

[..assignment stuff, irrelevant here..]

I looked over the definition of 'slice_array'. Apparently, the
class is auxiliary, not supposed to be instantiated.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jan 9 '08 #38

Victor Bazarov wrote:

Ioannis Vranos wrote:
>Victor Bazarov wrote:
>>Ioannis Vranos wrote:
Hi, in TC++PL3 it is mentioned that a slice_array cannot be copied,
however the following code compiled in my system even with "g++
-ansi -pedantic-errors":

#include <iostream>
#include <valarray>

int main()
{
using namespace std;

int array[]={1,2,3,4,5};

valarray<int> v(array, sizeof(array)/sizeof(*array));

slice_array<int> result= v[slice(0, v.size()/2 + v.size()%2, 2)];

slice_array<int> result2= result;

cout<< endl;
}

Broken implementation?

I looked over the definition of 'slice_array'. Apparently, the
class is auxiliary, not supposed to be instantiated.

So, broken implementation?

Jan 9 '08 #39

Ioannis Vranos wrote:

Victor Bazarov wrote:
>Ioannis Vranos wrote:
>>Victor Bazarov wrote:
Ioannis Vranos wrote:
Hi, in TC++PL3 it is mentioned that a slice_array cannot be
copied, however the following code compiled in my system even
with "g++ -ansi -pedantic-errors":
>
>
>
#include <iostream>
#include <valarray>
>
int main()
{
using namespace std;
>
int array[]={1,2,3,4,5};
>
valarray<int> v(array, sizeof(array)/sizeof(*array));
>
slice_array<int> result= v[slice(0, v.size()/2 + v.size()%2, 2)];
slice_array<int> result2= result;
>
cout<< endl;
}
>
>
>
Broken implementation?

I looked over the definition of 'slice_array'. Apparently, the
class is auxiliary, not supposed to be instantiated.

So, broken implementation?

Well, "broken" to me implies lack of intent. AFAIK, it may have been
intentional because the draft of C++0x lists only the default c-tor
as private, which makes the compiler-generated copy c-tor accessible.
You can call it an "advanced feature".

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jan 9 '08 #40

Victor Bazarov wrote:

>
>So, broken implementation?

Well, "broken" to me implies lack of intent. AFAIK, it may have been
intentional because the draft of C++0x lists only the default c-tor
as private, which makes the compiler-generated copy c-tor accessible.
You can call it an "advanced feature".

The standard states:

"The slice_array template has no public constructors. These constructors
are declared to be private. These constructors need not be defined".

Here,

#include <iostream>
#include <valarray>

int main()
{
using namespace std;

int array[]={1,2,3,4,5};

valarray<int> v(array, sizeof(array)/sizeof(*array));

slice_array<int> result= v[slice(0, v.size()/2 + v.size()%2, 2)];

==> slice_array<int> result2(result);
}
my compiler uses the copy constructor.

[john@localhost src]$ g++ -ansi -pedantic-errors main.cc -o foobar-cpp
[john@localhost src]$

Jan 9 '08 #41

Ioannis Vranos wrote:

Victor Bazarov wrote:
>>
>>So, broken implementation?

Well, "broken" to me implies lack of intent. AFAIK, it may have been
intentional because the draft of C++0x lists only the default c-tor
as private, which makes the compiler-generated copy c-tor accessible.
You can call it an "advanced feature".

The standard states:

The *current* Standard. If you read what I wrote, I was talking
about the draft of the new Standard. GNU folks are of the
experimenting kind...

Get yourself a copy of the most recent draft if you'd like to stay
abreast with the language/library design.

>
"The slice_array template has no public constructors. These
constructors are declared to be private. These constructors need not
be defined".
Here,

#include <iostream>
#include <valarray>

int main()
{
using namespace std;

int array[]={1,2,3,4,5};

valarray<int> v(array, sizeof(array)/sizeof(*array));

slice_array<int> result= v[slice(0, v.size()/2 + v.size()%2, 2)];

==> slice_array<int> result2(result);
}
my compiler uses the copy constructor.

[john@localhost src]$ g++ -ansi -pedantic-errors main.cc -o foobar-cpp
[john@localhost src]$

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jan 9 '08 #42

Bo Persson wrote:

>
But the rule was a mistake. Someone has already noticed that valarray
constructors taking a const reference parameter cannot be called,
because the helper classes' copy constructors are all private.

http://www.open-std.org/jtc1/sc22/wg...n2483.html#253
Obviously, your compiler has implemented the fix!

Strangely enough, the text at that link refers to 14882:1998. If this
was a real issue, why wasn't it fixed in 14882:2003?
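
A minimal sketch of the usage the helper classes appear to be designed for: rather than naming or copying a slice_array, construct a valarray directly from the subscript expression. The function name even_positions is hypothetical; the slice arithmetic is the one from the program discussed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Returns a copy of the elements of v at positions 0, 2, 4, ...
// The parameter is non-const on purpose: only then does v[slice]
// yield a slice_array<int>, the helper type under discussion.
std::valarray<int> even_positions(std::valarray<int>& v)
{
    const std::size_t count = v.size() / 2 + v.size() % 2;
    // valarray has a constructor taking const slice_array<int>&, which
    // copies the selected elements into independent storage.
    std::valarray<int> picked(v[std::slice(0, count, 2)]);
    assert(picked.size() == count);
    return picked;
}
```

The result is an ordinary valarray, so it can be copied and stored freely, and later changes to v do not affect it.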

Jan 10 '08 #43