Memory access optimization

Hi,

Please take a look at this sample code:

class MyClass {
private:
    static int length;
public:
    static void setLength(int newLength);
    void run();
};

int MyClass::length;

void MyClass::setLength(int newLength) {
    length = newLength;
}

void MyClass::run() {
    for (int i = 0; i < length; i++) {
        // some simple floating point additions and multiplications ...
    }
}

I expected that the compiler (Microsoft EVC 4 for ARM, full
optimization) would optimize the for-loop of the method "run" to

    register int t = length;
    for (int i = 0; i < t; i++) {
        ...
    }

But this does not happen. Instead, the generated machine code
refetches "length" in each iteration of the for-loop. Why?
Does the compiler "fear" that another thread could call "setLength"
while the for-loop runs? "length" is *not* declared volatile.

Is there a formal specification of how long a compiler is allowed to
cache non-local variables in registers? For example, can code like

    a = length;
    b = length;

always be expected to be optimized to

    register int t = length;
    a = t;
    b = t;

?

rs

Jan 11 '06

"Bo Persson" <bo*@gmb.dk> wrote in message
news:42*************@individual.net...
> <ro*****@gmx.de> wrote in message
> news:11*********************@g44g2000cwa.googlegroups.com...
>>> Yes. The length is globally accessible through the public static
>>> function setLength. To be able to cache the length, the compiler
>>> must first prove that no one is ever calling setLength while the
>>> loop runs.
>>
>> As already discussed above, the specs don't force the compiler
>> to prove anything, do they?
>
> Yes, they do. Using volatile prevents the compiler from keeping the
> value in a register. Not using volatile doesn't automatically allow
> caching. The compiler must first make sure that the value is not
> updated any other way. If the variable is global, or accessible
> through a public function, that is hard.
>
> Consider this code:
>
> int i;
>
> void f(int& j)
> {
>     for (i = 0; i < 10; i++)
>         j = i;
> }

Ok, this was a bad example. Change the line to j = 1; to show the
problem with f(i).

> int main()
> {
>     int k;
>
>     f(k);
>     f(i);
> }
>
> What happens for f(k) ?
> What happens for f(i) ?
>
> Does that affect the optimizations possible for the function f()? It
> sure does!

Bo Persson
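
For readers who want to try the corrected example (with j = 1), here is a minimal sketch. The parameter maxIters, the returned iteration count, and the demo() helper are additions for illustration and are not part of the original post. Without some cap, the call f(i) would never terminate, because every pass resets i to 1 before the increment.

```cpp
#include <cassert>

int i;  // global loop counter, as in the post

// Same loop as in the post, with j = 1 in the body.
// maxIters and the returned iteration count are illustrative additions:
// they let the aliased call terminate and make the difference observable.
int f(int& j, int maxIters) {
    int iterations = 0;
    for (i = 0; i < 10 && iterations < maxIters; i++) {
        j = 1;  // if j refers to the global i, this store resets the counter
        ++iterations;
    }
    return iterations;
}

// Runs both calls from the post and checks what each one does.
void demo() {
    int k = 0;
    assert(f(k, 1000) == 10);    // j is a separate object: ten passes
    assert(f(i, 1000) == 1000);  // j aliases i: only the cap stops the loop
}
```

Since f() cannot know which of the two cases it is in, the compiler has to reload i from memory after each store through j rather than keep it in a register.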

Jan 12 '06 #11
Bo Persson wrote:
> <ro*****@gmx.de> wrote in message
> news:11*********************@g44g2000cwa.googlegroups.com...
>>> Yes. The length is globally accessible through the public static
>>> function setLength. To be able to cache the length, the compiler
>>> must first prove that no one is ever calling setLength while the
>>> loop runs.
>>
>> As already discussed above, the specs don't force the compiler
>> to prove anything, do they?
>
> Yes, they do. Using volatile prevents the compiler from keeping the
> value in a register.


[snip]

Actually, the meaning of volatile is compiler-specific, but in general
it means that the compiler should not do any fancy optimizations to the
data at hand. That could mean not caching a variable in a register even
though the code doesn't appear to change it. By using volatile, the
programmer is indicating that the value could change due to something
outside the program (e.g. another thread or process or a hardware
device). See TC++PL A.7.1.

Cheers! --M

Jan 12 '06 #12
> Ok, this was a bad example. Change the line to j = 1; to show the
> problem with f(i).

Thank you! I have finally understood why the optimization did
not happen.
The for-loop contains code like

    for (int i = 0; i < length; i++) {
        data[i] = ...some floating point math...
    }

As in your example, it is possible (at least from the compiler's point
of view) that the assignment to "data[i]" overwrites "length". One
would need a very potent data flow analysis to prove the opposite.
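
The effect can be reproduced in a small sketch. It rests on assumptions that are not in the thread: data is taken to be an int buffer, length is a plain global standing in for MyClass::length, and the names fillReloading, fillCached and demo are hypothetical. It shows a store through data changing length, and the manual local copy that removes the question for the compiler.

```cpp
#include <cassert>

int length = 4;  // stands in for MyClass::length (assumption: a plain global int)

// Reads length on every pass, like the machine code described above.
// Returns the number of elements written.
int fillReloading(int* data) {
    int written = 0;
    for (int i = 0; i < length; i++) {
        data[i] = 0;  // overwrites length if data happens to point at it
        ++written;
    }
    return written;
}

// Copies length into a local first. The address of n is never taken,
// so no store through data can change it and it can stay in a register.
int fillCached(int* data) {
    const int n = length;
    int written = 0;
    for (int i = 0; i < n; i++) {
        data[i] = 0;
        ++written;
    }
    return written;
}

// Hypothetical driver: a separate buffer versus a pointer to length itself.
void demo() {
    int buf[4] = {1, 2, 3, 4};
    assert(fillReloading(buf) == 4);      // no aliasing: all four elements written
    assert(fillCached(buf) == 4);         // same result with the local copy
    assert(fillReloading(&length) == 1);  // first store sets length to 0, loop stops
}
```

The two functions only agree when data does not overlap length, which is exactly what the compiler would have to prove before doing the transformation itself.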
/*
Please insert some C/C++ bashing here.
Topic: Why does C/C++ constantly force me to do things that
are done by the compiler in other languages?
*/

Ramin

Jan 13 '06 #13
