473,396 Members | 1,972 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

No Macros, No for loops, Pure C++

It blows up on a 9 dimensional space, but I'm pretty sure that's due to the
size of size_t.

--- begin: DataGeneration_Test ---
Tensor<T 3, 3>
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 ,21,22,23,24,25,26,27
TensorIndices<3, 3>
::ORDER : 3
::RANK : 3
::SIZE : 27
::_gslices:
GSlices<3,3>::_slices:
start: 0 sizes: 1,3,3 strides: 9,3,1
start: 9 sizes: 1,3,3 strides: 9,3,1
start: 18 sizes: 1,3,3 strides: 9,3,1
GSlices::_subslices:
GSlices<3,2>::_slices:
start: 0 sizes: 1,3 strides: 3,1
start: 3 sizes: 1,3 strides: 3,1
start: 6 sizes: 1,3 strides: 3,1
GSlices::_subslices:
GSlices<3,1>::_slices:
start: 0 sizes: 1,1,1 strides: 1,1,1
start: 1 sizes: 1,1,1 strides: 1,1,1
start: 2 sizes: 1,1,1 strides: 1,1,1

GSlices<3,1>::_slices:
start: 3 sizes: 1,1,1 strides: 1,1,1
start: 4 sizes: 1,1,1 strides: 1,1,1
start: 5 sizes: 1,1,1 strides: 1,1,1

GSlices<3,1>::_slices:
start: 6 sizes: 1,1,1 strides: 1,1,1
start: 7 sizes: 1,1,1 strides: 1,1,1
start: 8 sizes: 1,1,1 strides: 1,1,1
GSlices<3,2>::_slices:
start: 9 sizes: 1,3 strides: 3,1
start: 12 sizes: 1,3 strides: 3,1
start: 15 sizes: 1,3 strides: 3,1
GSlices::_subslices:
GSlices<3,1>::_slices:
start: 9 sizes: 1,1,1 strides: 1,1,1
start: 10 sizes: 1,1,1 strides: 1,1,1
start: 11 sizes: 1,1,1 strides: 1,1,1

GSlices<3,1>::_slices:
start: 12 sizes: 1,1,1 strides: 1,1,1
start: 13 sizes: 1,1,1 strides: 1,1,1
start: 14 sizes: 1,1,1 strides: 1,1,1

GSlices<3,1>::_slices:
start: 15 sizes: 1,1,1 strides: 1,1,1
start: 16 sizes: 1,1,1 strides: 1,1,1
start: 17 sizes: 1,1,1 strides: 1,1,1
GSlices<3,2>::_slices:
start: 18 sizes: 1,3 strides: 3,1
start: 21 sizes: 1,3 strides: 3,1
start: 24 sizes: 1,3 strides: 3,1
GSlices::_subslices:
GSlices<3,1>::_slices:
start: 18 sizes: 1,1,1 strides: 1,1,1
start: 19 sizes: 1,1,1 strides: 1,1,1
start: 20 sizes: 1,1,1 strides: 1,1,1

GSlices<3,1>::_slices:
start: 21 sizes: 1,1,1 strides: 1,1,1
start: 22 sizes: 1,1,1 strides: 1,1,1
start: 23 sizes: 1,1,1 strides: 1,1,1

GSlices<3,1>::_slices:
start: 24 sizes: 1,1,1 strides: 1,1,1
start: 25 sizes: 1,1,1 strides: 1,1,1
start: 26 sizes: 1,1,1 strides: 1,1,1


--- end: DataGeneration_Test ---

--
"If our hypothesis is about anything and not about some one or more
particular things, then our deductions constitute mathematics. Thus
mathematics may be defined as the subject in which we never know what we
are talking about, nor whether what we are saying is true." - Bertrand
Russell

Jul 22 '05 #1
15 1316

"Steven T. Hatton" <su******@setidava.kushan.aa> wrote in message
news:cc********************@speakeasy.net...
It blows up on a 9 dimensional space, but I'm pretty sure that's due to
the
size of size_t.

--- begin: DataGeneration_Test ---
Tensor<T 3, 3>
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 ,21,22,23,24,25,26,27
TensorIndices<3, 3>


<snip>
Care to enlighten us as to exactly what you're talking about here? :-)

-Howard
Jul 22 '05 #2
Howard wrote:
Tensor<T 3, 3>
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 ,21,22,23,24,25,26,27
TensorIndices<3, 3>
Care to enlighten us as to exactly what you're talking about here? :-)


Loop unrolling with expression metatemplates, without for loops.

The compiler will output essentially a huge brick of solid opcodes.
Electrons go in one end, and the result comes out another, with no moving
parts inside.

--
Phlip
http://industrialxp.org/community/bi...UserInterfaces
Jul 22 '05 #3
"Phlip" <ph*******@yahoo.com> wrote:
Howard wrote:
Tensor<T 3, 3>
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 ,21,22,23,24,25,26,27
TensorIndices<3, 3>

Care to enlighten us as to exactly what you're talking about here? :-)


Loop unrolling with expression metatemplates, without for loops.

The compiler will output essentially a huge brick of solid opcodes.


It doesn't output much for me:
i.cc:2: error: expected constructor, destructor, or type conversion
before '<' token
i.cc:2: error: expected `,' or `;' before '<' token

Is there meant to be some compilable code somewhere?
Jul 22 '05 #4
Phlip wrote:
Howard wrote:
> Tensor<T 3, 3>
> 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 ,21,22,23,24,25,26,27
> TensorIndices<3, 3>

Care to enlighten us as to exactly what you're talking about here? :-)


Loop unrolling with expression metatemplates, without for loops.

The compiler will output essentially a huge brick of solid opcodes.
Electrons go in one end, and the result comes out another, with no moving
parts inside.


In some cases compilers can unroll loops without the templates. For me, one
of the advantages of using templates and compile-time construction of the
data structures is that I can use recursion without paying for it at
run-time. Run-time recursion usually means a lot of dereferencing which can
be expensive (in comparison to loops). Oh, I should add that I have no
pointers either. There are very few conditionals as well.

I have to admit, I don't fully understand the beast I've created. Nor do I
have solid ideas of how best to use it. I started out by creating my own
vector and matrix classes treating them as separate types. That didn't
settle well with me because I know they should ideally be derived from a
common tensor type. One library I looked at has this backwards. In that
library tensor derives from vector.

I believe there is a correct foundation that can be laid, upon which all
subsequent functionality should be easily and naturally derivable. There
should be two fundamental algorithms: inner product, and outer product.
From these I should be able to produce all the operations such as M*V, V*M,
M*M, M^n, V*V, VxV, etc. I'm inclined to believe the correct design can be
expressed in fewer than 500 loc.
--
"If our hypothesis is about anything and not about some one or more
particular things, then our deductions constitute mathematics. Thus
mathematics may be defined as the subject in which we never know what we
are talking about, nor whether what we are saying is true." - Bertrand
Russell

Jul 22 '05 #5

"Steven T. Hatton" <su******@setidava.kushan.aa> wrote in message
news:16********************@speakeasy.net...
Phlip wrote:
[SNIP]
In some cases compilers can unroll loops without the templates. For me, one of the advantages of using templates and compile-time construction of the
data structures is that I can use recursion without paying for it at
run-time. Run-time recursion usually means a lot of dereferencing which can be expensive (in comparison to loops). Oh, I should add that I have no
pointers either. There are very few conditionals as well.

As always with optimization, care is advisable. In some situations this
"manual" loop unrolling might pay off and in others it won´t. This depends
very much on what you´re doing in the loop, if there might be aliasing
effects and so on. In general it´s good practice to profile and decide
whether the compiler´s built in optimization is better or the template
unrolling.
I have to admit, I don't fully understand the beast I've created. Nor do I
have solid ideas of how best to use it. I started out by creating my own
vector and matrix classes treating them as separate types. That didn't
settle well with me because I know they should ideally be derived from a
common tensor type. One library I looked at has this backwards. In that
library tensor derives from vector.


Well, then the design of that library is wrong (if we´re strictly speaking).
In terms of mathematics a vector is a tensor of first order and all
operations can be expressed in tensor notation. Consequently, it makes sense
that a vector is a tensor. However, from a progarmmer´s view the library
creators might have had their reasons for their design.

Cheers
Chris

[SNIP]
Jul 22 '05 #6
Steven T. Hatton wrote:
I have to admit, I don't fully understand the beast I've created. Nor do I
have solid ideas of how best to use it. I started out by creating my own
vector and matrix classes treating them as separate types. That didn't
settle well with me because I know they should ideally be derived from a
common tensor type. One library I looked at has this backwards. In that
library tensor derives from vector.


If nothing can compell you to show Old Wolf your code, you could at least
show us your tests.

--
Phlip
http://industrialxp.org/community/bi...UserInterfaces
Jul 22 '05 #7
Chris Theis wrote:

"Steven T. Hatton" <su******@setidava.kushan.aa> wrote in message
news:16********************@speakeasy.net...
Phlip wrote:

[SNIP]
In some cases compilers can unroll loops without the templates. For me,

one
of the advantages of using templates and compile-time construction of the
data structures is that I can use recursion without paying for it at
run-time. Run-time recursion usually means a lot of dereferencing which

can
be expensive (in comparison to loops). Oh, I should add that I have no
pointers either. There are very few conditionals as well.


As always with optimization, care is advisable. In some situations this
"manual" loop unrolling might pay off and in others it won´t. This depends
very much on what you´re doing in the loop, if there might be aliasing
effects and so on. In general it´s good practice to profile and decide
whether the compiler´s built in optimization is better or the template
unrolling.


Can you provide an example of where using template recursion was not as
efficient as the compilers own optimization? Also, can you explain what you
mean by "aliasing effects", and how it might impact object code generated
from a recursive expression template?
I have to admit, I don't fully understand the beast I've created. Nor do
I have solid ideas of how best to use it. I started out by creating my
own vector and matrix classes treating them as separate types. That
didn't settle well with me because I know they should ideally be derived
from a common tensor type. One library I looked at has this backwards. In
that library tensor derives from vector.


Well, then the design of that library is wrong (if we´re strictly
speaking). In terms of mathematics a vector is a tensor of first order and
all operations can be expressed in tensor notation. Consequently, it makes
sense that a vector is a tensor. However, from a progarmmer´s view the
library creators might have had their reasons for their design.


It may be possible to define tensors in terms nested vectors of vectors and
show them to mathematically equivalent to tensors defined in the
conventional manner. I haven't thought about it for more than the time it
took to write this paragraph, so I can't say for sure.
--
"If our hypothesis is about anything and not about some one or more
particular things, then our deductions constitute mathematics. Thus
mathematics may be defined as the subject in which we never know what we
are talking about, nor whether what we are saying is true." - Bertrand
Russell

Jul 22 '05 #8
Steven T. Hatton wrote:
Can you provide an example of where using template recursion was not as
efficient as the compilers own optimization?
Easy. Imagine function A calls B in seven different ways. Now they both fit
in the CPU cache, so this just discusses A and B to itself, without
involving the system bus.

Now force B to inline, via the keyword, or compiling options, or a macro.
Now A is seven times longer. It no longer fits in the CPU. When A evaluates,
it must constantly flush itself out of the CPU cache to load its other half.
Performance plumets.

Always remember: Inline does not automatically mean better performance.
Also, can you explain what you
mean by "aliasing effects",
int q = 5;
int * p = &q;
*p = 4;
if (q == 5)...

Between those lines, the compiler cannot optimize q into a register. It must
fetch q from memory, because the compiler may or may not be able to track
whether aliases, meaning alternate names, for q have affected its value.
and how it might impact object code generated
from a recursive expression template?


I don't know; the general point to learn is you can never predict this or
that code's effect on performance.

--
Phlip
http://industrialxp.org/community/bi...UserInterfaces
Jul 22 '05 #9
Phlip wrote:
Steven T. Hatton wrote:
Can you provide an example of where using template recursion was not as
efficient as the compilers own optimization?


Easy. Imagine function A calls B in seven different ways. Now they both
fit in the CPU cache, so this just discusses A and B to itself, without
involving the system bus.

Now force B to inline, via the keyword, or compiling options, or a macro.
Now A is seven times longer. It no longer fits in the CPU. When A
evaluates, it must constantly flush itself out of the CPU cache to load
its other half. Performance plumets.

Always remember: Inline does not automatically mean better performance.


I'm not sure how that addresses my question. Can you provide an actual
compilable example of template code and analogos for() loop code that
demonstrates that compiler-produced op-code from the for() loop version is
more efficient than the template version? I don't care if you have to use
a particular compiler to demonstrate.

--
"If our hypothesis is about anything and not about some one or more
particular things, then our deductions constitute mathematics. Thus
mathematics may be defined as the subject in which we never know what we
are talking about, nor whether what we are saying is true." - Bertrand
Russell

Jul 22 '05 #10

"Steven T. Hatton" <su******@setidava.kushan.aa> schrieb im Newsbeitrag
news:Qv********************@speakeasy.net...
Chris Theis wrote:

"Steven T. Hatton" <su******@setidava.kushan.aa> wrote in message
news:16********************@speakeasy.net...
Phlip wrote:
[SNIP]
In some cases compilers can unroll loops without the templates. For me,

one
of the advantages of using templates and compile-time construction of the data structures is that I can use recursion without paying for it at
run-time. Run-time recursion usually means a lot of dereferencing which

can
be expensive (in comparison to loops). Oh, I should add that I have no
pointers either. There are very few conditionals as well.


As always with optimization, care is advisable. In some situations this
"manual" loop unrolling might pay off and in others it won´t. This depends very much on what you´re doing in the loop, if there might be aliasing
effects and so on. In general it´s good practice to profile and decide
whether the compiler´s built in optimization is better or the template
unrolling.


Can you provide an example of where using template recursion was not as
efficient as the compilers own optimization?


Giving general examples regarding optimization is always tricky because it
depends on so many things. What I want to stress is that template loop
unrolling is certainly worth trying but it might result in code-bloat and
you might take away the compiler's possibility to optimize the code with its
own loop unrolling procedure. This "built-in" feature heavily relies on the
internal CPU registers and might or might not be more efficient. This
depends very much on the code.
Also, can you explain what you
mean by "aliasing effects", and how it might impact object code generated
from a recursive expression template?

Let's take a look at the following function which can be used to set the
values of an array:

void Set( float y[], const float* pArray, int n )
{
for( int i = 0; i < n; i++ )
y[i] = 1.0f - *pArray;
}

Under the assumption that pArray is invariant the optimizer could move the
subtraction out of the loop and everybody is happy. But the user could pass
an address of an element of pArray as the second parameter (e.g., y[10] =
1.0; Set( y, &y[10], 20 ); ).
This means that pArray is not invariant anymore and the compiler cannot
safely move the subtraction to the front. However, you might force the
compiler or at least hint to the compiler that there is certainly no
aliasing-effect and then the built-in loop-unrolling can kick in. Naturally
this should be done with care only. If you use the template approach the
compiler will have a hard time to unroll the loop as it's not there anymore.
Don't get me wrong - I'm not against the template solution and I personally
favor it very much, but one has to decide on a case to case basis after
profiling which way to go. There simply is no general rule.

[SNIP]
It may be possible to define tensors in terms nested vectors of vectors and show them to mathematically equivalent to tensors defined in the
conventional manner. I haven't thought about it for more than the time it
took to write this paragraph, so I can't say for sure.


You can implement tensors in terms of vectors but mathematically speaking
its wrong. A vector is by definition a tensor of n=1 order with 3^n elements
and a generic tensor of order n is expressed by 3^n elements. So every
vector is a tensor but not every tensor is a vector and consequently such an
implementation would violate the "is-a inheritance" rule.

Cheers
Chris

Jul 22 '05 #11
Chris Theis wrote:
Don't get me wrong - I'm not against the template solution
and I personally favor it very much, but one has to decide on a case to
case basis after profiling which way to go. There simply is no general
rule.
I don't believe profiling at this stage is really very useful. In the case
of what I'm doing, I really have to take the design approach as an
hypothesis and build a fairly extensive framework before I can evaluate the
value of the approach. Since the design approach using template
metaprogramming is new to me, and fairly new to the field of computer
science (notice I did not say 'industry'), I am hard pressed to evaluate
one design approach in comparison with the other until I have more
experience. That only comes from doing.
[SNIP]

It may be possible to define tensors in terms nested vectors of vectors and
show them to mathematically equivalent to tensors defined in the
conventional manner. I haven't thought about it for more than the time
it took to write this paragraph, so I can't say for sure.


You can implement tensors in terms of vectors but mathematically speaking
its wrong. A vector is by definition a tensor of n=1 order with 3^n
elements and a generic tensor of order n is expressed by 3^n elements.


Actually, a vector in n-space is a rank 1 tensor of order n, having n
components. A general tensor in n-space is a rank m order n object with
n^m components. Hence the need for the previously discussed PowerOf
template to calculate the number of components.
So
every vector is a tensor but not every tensor is a vector and consequently
such an implementation would violate the "is-a inheritance" rule.


As I said, it may be possible to define tensors in terms of nested vectors
of vectors and demonstrate that a tensor thus defined is mathematically
equivalent to the conventional definition of a tensor. This would, I
believe, follow rather naturally from a generalization of the concept of
diadic matrices.
--
"If our hypothesis is about anything and not about some one or more
particular things, then our deductions constitute mathematics. Thus
mathematics may be defined as the subject in which we never know what we
are talking about, nor whether what we are saying is true." - Bertrand
Russell

Jul 22 '05 #12

"Steven T. Hatton" <su******@setidava.kushan.aa> wrote in message
news:Kr********************@speakeasy.net...
Chris Theis wrote:
Don't get me wrong - I'm not against the template solution
and I personally favor it very much, but one has to decide on a case to
case basis after profiling which way to go. There simply is no general
rule.
I don't believe profiling at this stage is really very useful. In the case
of what I'm doing, I really have to take the design approach as an
hypothesis and build a fairly extensive framework before I can evaluate

the value of the approach. Since the design approach using template
metaprogramming is new to me, and fairly new to the field of computer
science (notice I did not say 'industry'), I am hard pressed to evaluate
one design approach in comparison with the other until I have more
experience. That only comes from doing.
You should certainly go ahead and evaluate the designs first and the
optimization will come in the end. BTW metaprogramming is not really "new"
to the field of computer science either. The first time I saw it in an
academic context was in the book "Generative Programming" by Ulrich
Eisenecker published in 2000. There might be some scientific context even
before that time which I´m not aware of.
[SNIP]
[SNIP]
As I said, it may be possible to define tensors in terms of nested vectors
of vectors and demonstrate that a tensor thus defined is mathematically
equivalent to the conventional definition of a tensor. This would, I
believe, follow rather naturally from a generalization of the concept of
diadic matrices.


You might be right on that, although dyadic matrices requires some
conditions (which makes them dyadic) that is not required by the generic
definition of tensors. Anyway, whether the generalization from the special
case is possible is something left for mathematicians and OT here, so I
won´t continue nitpicking ;-)

Cheers
Chris
Jul 22 '05 #13
Steven T. Hatton wrote:
I'm not sure how that addresses my question. Can you provide an actual
compilable example of template code and analogos for() loop code that
demonstrates that compiler-produced op-code from the for() loop version is
more efficient than the template version? I don't care if you have to use
a particular compiler to demonstrate.


I would think a good example would be any sufficently large loop? as the
for loop would only unfold a few iteratons, whereas the template version
would unroll the whole loop.

Chris
Jul 22 '05 #14
Steven T. Hatton wrote:
I'm not sure how that addresses my question. Can you provide an actual
compilable example of template code and analogos for() loop code that
demonstrates that compiler-produced op-code from the for() loop version is
more efficient than the template version? I don't care if you have to use
a particular compiler to demonstrate.


I would think a good example would be any sufficently large loop? as the
for loop would only unfold a few iteratons, whereas the template version
would unroll the whole loop.

Chris
Jul 22 '05 #15
Steven T. Hatton wrote:
I'm not sure how that addresses my question. Can you provide an actual
compilable example of template code and analogos for() loop code that
demonstrates that compiler-produced op-code from the for() loop version is
more efficient than the template version? I don't care if you have to use
a particular compiler to demonstrate.


I would think a good example would be any sufficently large loop? as the
for loop would only unfold a few iteratons, whereas the template version
would unroll the whole loop.

Chris
Jul 22 '05 #16

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

12
by: David Powell | last post by:
Because my work area won't have an Access programmer next year, I've been asked to rebuild their coded application as a set of modular tools. The idea is that a non-programmer will be able to...
1
by: Pavel A. | last post by:
Since new C compilers recognize SAL keywords (__in, __out, __in_opt, etc), should we continue using macros IN, OUT, OPTIONAL? Can OPTIONAL appear by itself, or only combined with IN or OUT? Then,...
77
by: Peter Olcott | last post by:
http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html The above link shows that C# is 450% slower on something as simple as a nested loop....
47
by: Emil | last post by:
Is there any hope that new versions of PHP will support macros similar to C or C++? I've searched manual and didn't find anything except define directive, but it can be used to define constant...
12
by: Sheldon | last post by:
Hi, I have two arrays that are of the same dimension but having 3 different values: 255, 1 or 2. I would like to set all the positions in both arrays having 255 to be equal, i.e., where one...
6
by: P-Rage | last post by:
Hello everyone, I was wondering what could possibly cause Internet Explorer 6 to loop a page usging a header(Location:) statement. For example, a page called 'page1.php': if((isset($_POST))...
7
by: aaragon | last post by:
Hi everyone, I have a simple question. I'm trying to make a macro in one file so I can use it in main.cpp. The idea is that I the user of my code has simple to type the macro definition to replace...
95
by: hstagni | last post by:
Where can I find a library to created text-based windows applications? Im looking for a library that can make windows and buttons inside console.. Many old apps were make like this, i guess ...
0
by: Malcolm McLean | last post by:
"Malcolm McLean" <regniztar@btinternet.comwrote in message news:... In C, use a macro for a trivial function that can take more than one type. A good example is lerp() - linearly interpolate...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.