Memory access optimization

Hi,

please take a look at this sample code:

class MyClass {
private:
    static int length;
public:
    static void setLength(int newLength);
    void run();
};

int MyClass::length;

void MyClass::setLength(int newLength) {
    length = newLength;
}

void MyClass::run() {
    for (int i = 0; i < length; i++) {
        // some simple floating point additions and multiplications ...
    }
}

I expected that the compiler (Microsoft EVC 4 for ARM, full optimization)
would optimize the for-loop of the method "run" to

    register int t = length;
    for (int i = 0; i < t; i++) {
        ...
    }

But this does not happen. Instead, the generated machine code
refetches "length" in each iteration of the for-loop. Why?
Does the compiler "fear" that another thread could call "setLength"
while the for-loop runs? "length" is *not* declared volatile.

Is there a formal specification of how long a compiler is allowed to cache
non-local variables in registers? For example, can code like

    a = length;
    b = length;

always be expected to be optimized to

    register int t = length;
    a = t;
    b = t;

?

rs

Jan 11 '06 #1
ro*****@gmx.de wrote:
[...]

A rule of thumb is to cache any global variable in a local variable to
make optimization easier for the compiler. If a register is available, it
will be allocated for the local variable. If none is available, the local
variable will most likely be allocated on the stack and will therefore most
likely have better cache behaviour than its global counterpart. The
compiler might even be able to eliminate the local variable.

I cannot say what the reason for the missing optimization is in your case,
because you omitted the contents of the for loop. But I recommend
allocating a local variable that is a copy of the object attribute.
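
The rule of thumb above could look roughly like this. It is only a minimal sketch: `Scaler`, `scale` and the multiplication in the loop are hypothetical stand-ins, since the original loop body was not posted.

```cpp
#include <cassert>
#include <vector>

class Scaler {  // hypothetical stand-in for MyClass
public:
    static void setLength(int newLength) { length = newLength; }
    // Multiplies the first `length` elements of `data` by `factor`.
    void scale(std::vector<float>& data, float factor) const;
private:
    static int length;
};

int Scaler::length = 0;

void Scaler::scale(std::vector<float>& data, float factor) const {
    // Read the static member once. The local copy's address is never taken,
    // so no other code can change it and it can stay in a register.
    const int n = length;
    const int size = static_cast<int>(data.size());
    for (int i = 0; i < n && i < size; ++i) {
        data[i] *= factor;
    }
}
```

With `Scaler::setLength(3)` and a four-element vector, only the first three elements are scaled. Note that the copy also changes the meaning slightly: a later call to `setLength` no longer affects a loop that is already running.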

HTH,
Tom
Jan 11 '06 #2
ro*****@gmx.de wrote:
[...]

There is no formal specification for such optimizations in the
Standard. You might be better off posting in a Microsoft-specific group
where more of their experts lurk. Several such groups are listed in
this FAQ:

http://www.parashift.com/c++-faq-lit...t.html#faq-5.9

Cheers! --M

Jan 11 '06 #3
> There is no formal specification for such optimizations in the
> Standard. You might be better off posting in a Microsoft-specific group
> where more of their experts lurk. Several such groups are listed in
> [...]


I know that it may be a MS-specific thing but I wanted to know what
c.l.c++ thinks about it. Do you agree with me that nothing in the specs
prevents the compiler from doing the optimization unless the variable
is declared volatile?

rs

Jan 11 '06 #4
On 11 Jan 2006 06:00:59 -0800 in comp.lang.c++, ro*****@gmx.de
wrote,
> I know that it may be a MS-specific thing but I wanted to know what
> c.l.c++ thinks about it. Do you agree with me that nothing in the specs
> prevents the compiler from doing the optimization unless the variable
> is declared volatile?


I think you are right.
Jan 11 '06 #5

<ro*****@gmx.de> wrote in message
news:11*********************@g14g2000cwa.googlegroups.com...
>> There is no formal specification for such optimizations in the
>> Standard. You might be better off posting in a Microsoft-specific group
>> where more of their experts lurk. Several such groups are listed in
>> [...]
>
> I know that it may be a MS-specific thing but I wanted to know what
> c.l.c++ thinks about it. Do you agree with me that nothing in the specs
> prevents the compiler from doing the optimization unless the variable
> is declared volatile?
>
> rs

Why don't you make the length variable a local and see what MSVC does?
Although "length" is not declared as volatile, in typical MS manner they
might have tried to anticipate that you might be modifying "length" in
another thread, and so fetch the value on each iteration of the loop.
Jan 12 '06 #6

"Dave Townsend" <da********@comcast.net> wrote in message
news:_K******************************@comcast.com...
> [...]
> Why don't you make the length variable a local and see what MSVC does?
> Although "length" is not declared as volatile, in typical MS manner, they
> might have tried to anticipate that you might be modifying "length" in
> another thread and so get the value each iteration of the loop.


Yes. The length is globally accessible through the public static
function setLength. To be able to cache the length, the compiler must
first prove that no one ever calls setLength while the loop runs.

That's hard to do in the general case, so they have obviously not
bothered.
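
Threads aside, there is a single-threaded case where caching the bound would change behaviour: the loop body calls code the compiler cannot see into, and that code calls the setter. The sketch below illustrates this under stated assumptions. The namespace `sketch`, `countIterations`, `harmless` and `shrinkAtThree` are hypothetical names, and since the original loop body was not posted, it is unknown whether this applies to the case in question.

```cpp
#include <cassert>

namespace sketch {

int length = 0;  // stands in for MyClass::length

void setLength(int newLength) { length = newLength; }

// The body is called through a function pointer, so in general the compiler
// cannot know whether the callee changes `length`.
int countIterations(void (*body)(int)) {
    int iterations = 0;
    for (int i = 0; i < length; ++i) {  // `length` is re-read after every call
        body(i);
        ++iterations;
    }
    return iterations;
}

void harmless(int) {}

// Shrinks the bound in the middle of the loop.
void shrinkAtThree(int i) {
    if (i == 3) setLength(0);
}

}  // namespace sketch
```

After `sketch::setLength(10)`, `countIterations(sketch::harmless)` runs ten iterations, while `countIterations(sketch::shrinkAtThree)` stops after four. A compiler that cached `length` in a register across the call would run ten iterations in both cases, which is why it has to rule this out first.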
Bo Persson
Jan 12 '06 #7
> Yes. The length is globally accessible through the public static
> function setLength. To be able to cache the length, the compiler must
> first prove that no one is ever calling setLength while the loop runs.


As already discussed above, the specs don't force the compiler
to prove anything, do they?

Ramin

Jan 12 '06 #8
> Why don't you make the length variable a local and see what MSVC does ?

First thing I tried. As expected, the local variable is kept in a
register (plenty of free registers in the ARM processor).

> Although "length" is not declared as volatile, in typical MS manner, they
> might have tried to anticipate that you might be modifying "length" in
> another thread and so get the value each iteration of the loop.

As an experiment, I removed the setter method and initialized the
variable via

    int MyClass::length = 10;

so that the variable is never written to in the code. No difference :-(

I suppose that optimization on global data is generally very difficult
in languages like C/C++ where the compiler has to assume that you can
modify any variable indirectly with some pointer arithmetic+type casts.

Ramin

Jan 12 '06 #9

<ro*****@gmx.de> wrote in message
news:11*********************@g44g2000cwa.googlegroups.com...
>> Yes. The length is globally accessible through the public static
>> function setLength. To be able to cache the length, the compiler
>> must first prove that no one is ever calling setLength while the
>> loop runs.
>
> As already discussed above, the specs don't force the compiler
> to prove anything, do they?


Yes, they do. Using volatile prevents the compiler from keeping the
value in a register. Not using volatile doesn't automatically allow
caching. The compiler must first make sure that the value is not
updated any other way. If the variable is global, or accessible
through a public function, that is hard.

Consider this code:

int i;

void f(int& j)
{
    for (i = 0; i < 10; i++)
        j = i;
}

int main()
{
    int k;

    f(k);
    f(i);
}

What happens for f(k) ?
What happens for f(i) ?

Does that affect the optimizations possible for the function f()? It
sure does!
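
To make the aliasing effect observable, here is a hedged variant of the example above. In `f()` as posted, `j = i` is a self-assignment when `j` refers to `i`, so the visible result happens to be the same in both calls. The sketch below stores a different value instead. `counter` and `fill` are hypothetical names, not part of the code above.

```cpp
#include <cassert>

int counter;  // global loop variable, playing the role of `i` above

// Hypothetical variant of f(): stores counter * 2 through the reference and
// returns how many times the loop body ran.
int fill(int& out) {
    int iterations = 0;
    for (counter = 0; counter < 10; counter++) {
        out = counter * 2;  // changes `counter` itself when `out` aliases it
        ++iterations;
    }
    return iterations;
}
```

Called with a separate local, `fill` runs ten iterations. Called as `fill(counter)`, every store also moves the loop variable (0, 2, 6, 14), so the loop ends after four iterations. Inside `fill` the compiler cannot tell which case it is in, so it has to re-read `counter` after each store through `out`.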
Bo Persson


Jan 12 '06 #10
