Optimiser question

Hi All,
I have been given some code to work on. It relies heavily on the
optimiser to run at the correct speed. With this in mind I have been
looking through it to see if I can help it out a bit. I have limited
knowledge of how compilers work, but I understand that some
constructs optimise more easily / better. We are using a variant of gcc
(3.4.1 I think) on Nios II platform, if that makes a difference.
The question I have is are these 2 snippets functionally the same (I
think they are), and would the optimiser prefer the second one:
First snippet, as it is now:

while (count-- > 0)
{
    *pDest++ = nPattern;
}

how I propose to 'improve' the code:

while (count > 0)
{
    *pDest = nPattern;
    ++pDest;
    --count;
}

There are numerous bits a bit like this, so this would probably be a
big help overall if I am correct.

cheers

Dave
Dec 13 '07 #1
Dave S wrote:
> The question I have is are these 2 snippets functionally the same (I
> think they are), and would the optimiser prefer the second one:

Seriously, why don't you just try it out? Compile both pieces of code,
do an objdump -d on them and take a look at the assembly output. Nobody
but you will be able to give you a precise answer considering that
you're using unusual-platform-3.4.1-i-think-gcc as a compiler.
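
One way to try this is to put each loop in its own function, so the two bodies are easy to tell apart in the disassembly. This is only a sketch: the element type `pattern_t` and the function names `fill_postdec` and `fill_explicit` are made up for illustration, and `count` is assumed to be a signed `int`, none of which the original post states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element type; the real code is only described as
   "a set of structures and pointers to structures". */
typedef struct {
    int a;
    int b;
} pattern_t;

/* Original form: post-decrement inside the loop condition. */
void fill_postdec(pattern_t *pDest, int count, pattern_t nPattern)
{
    assert(pDest != NULL || count <= 0);
    while (count-- > 0) {
        *pDest++ = nPattern;
    }
}

/* Proposed form: separate store, increment and decrement. */
void fill_explicit(pattern_t *pDest, int count, pattern_t nPattern)
{
    assert(pDest != NULL || count <= 0);
    while (count > 0) {
        *pDest = nPattern;
        ++pDest;
        --count;
    }
}
```

Saved as, say, fill.c (a hypothetical file name), this can be built on a desktop with `gcc -O2 -c fill.c` and inspected with `objdump -d fill.o`; for the real answer the same two steps would be run with the cross-toolchain's equivalents for the Nios II.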

Greetings,
Johannes

--
"Many of the mathematicians' theories are false and clearly
blasphemous. I suspect that these false theories are so loved for
exactly that reason." -- Prophet and visionary Hans Joss aka
HJP in de.sci.mathematik <47**********************@news.sunrise.ch>
Dec 13 '07 #2
The two ways you could investigate this are
a) measuring
or
b) examining the generated code

I'm no expert, but I doubt that you will find that such simple changes
would reap significant benefits.

In general, I would suspect you'd be better off looking at
a) using a higher optimisation level on the compiler
b) seeing if a recent compiler build has better optimisation
c) looking at your algorithms (rather than your code)
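
For option (a), measuring, a rough harness might look like the sketch below. It assumes a hosted environment where `clock()` from `<time.h>` works; on the embedded target a hardware timer or cycle counter would presumably take its place. The names `time_fill`, `buffer` and `BUF_LEN`, and the use of plain `int` elements, are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define BUF_LEN 4096

/* Scratch buffer the loop under test writes into. */
int buffer[BUF_LEN];

/* Loop under test: the original form from the question, on ints. */
void fill_postdec(int *pDest, int count, int nPattern)
{
    while (count-- > 0) {
        *pDest++ = nPattern;
    }
}

/* Run fn `reps` times over the buffer and return the elapsed CPU time
   in seconds as reported by clock(). */
double time_fill(void (*fn)(int *, int, int), long reps)
{
    volatile int sink = 0;  /* read back so the stores are not discarded */
    clock_t start, stop;
    long r;

    assert(fn != NULL && reps > 0);
    start = clock();
    for (r = 0; r < reps; r++) {
        fn(buffer, BUF_LEN, (int)(r & 0xFF));
        sink = buffer[r % BUF_LEN];
    }
    stop = clock();
    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling `time_fill` once per candidate loop with the same `reps` gives two numbers to compare. The resolution of `clock()` is coarse, so `reps` needs to be large enough for the run to take a measurable amount of time.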
Dec 13 '07 #3
The first snippet is more idiomatically C and so no less likely to
be optimised than the second.

If the program isn't running fast enough, and you wish to make it
go faster, the first thing you /must/ do is /find out where the
time goes/. For the gods' sake don't just trawl through the program
looking for bits you think you can help with: /find out/ which bits
are eating the time using whatever profiling tools the platform
has available. Finding out which are the slow bits and doing an
/algorithmic/ improvement will get much better results. Even just
finding the slow bits and improving the code will help.

You don't say what the program does, so it's hard to guess what
might be cycle-sinks, but if it does any string-hacking, remember
that `strlen` and `strcat` (as examples) are not constant-time
operations ...
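
To illustrate the `strcat` point: each call has to walk the destination from the start to find the terminator, so appending in a loop gets slower as the string grows. The sketch below contrasts that with keeping track of the end yourself. The function names are invented, and the caller is assumed to supply a buffer that is large enough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends `piece` to `buf` n times using strcat. Every call rescans
   buf from the beginning, so the total work grows roughly with n*n. */
void repeat_strcat(char *buf, const char *piece, size_t n)
{
    size_t i;

    assert(buf != NULL && piece != NULL);
    buf[0] = '\0';
    for (i = 0; i < n; i++) {
        strcat(buf, piece);
    }
}

/* Same result, but remembers where the string currently ends. */
void repeat_tracked(char *buf, const char *piece, size_t n)
{
    size_t len, i;
    char *end = buf;

    assert(buf != NULL && piece != NULL);
    len = strlen(piece);        /* measured once, outside the loop */
    for (i = 0; i < n; i++) {
        memcpy(end, piece, len);
        end += len;
    }
    *end = '\0';
}
```

Whether this matters at all depends on whether the program does any string handling in its hot paths, which is exactly what a profile would tell you.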

--
Chris "who knows where the time goes?" Dollin

Hewlett-Packard Limited, registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN

Dec 13 '07 #4
Dave S <da*************@bem.fki-et.com> writes:
> I have been given some code to work on. It relies heavily on the
> optimiser to run at the correct speed. With this in mind I have been
> looking through it to see if I can help it out a bit. I have limited
> knowledge of how compilers work, but I understand that some
> constructs optimise more easily / better. We are using a variant of gcc
> (3.4.1 I think) on Nios II platform, if that makes a difference.
> The question I have is are these 2 snippets functionally the same (I
> think they are),

No. See below.

> and would the optimiser prefer the second one:

I can't see why, but there are real compiler experts here who may know
better.

> First snippet, as it is now:
>
> while (count-- > 0)
> {
>     *pDest++ = nPattern;
> }
>
> how I propose to 'improve' the code:
>
> while (count > 0)
> {
>     *pDest = nPattern;
>     ++pDest;
>     --count;
> }

The original decrements 'count' even when the condition is false so they
terminate with different values in 'count'.
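
A small sketch makes the difference visible. It assumes `count` is a signed `int` and uses `int` elements for brevity; the function names are invented. Each function runs one form of the loop and returns what is left in `count`.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: the failing test still performs the decrement. */
int count_after_postdec(int *pDest, int count, int nPattern)
{
    assert(pDest != NULL || count <= 0);
    while (count-- > 0) {
        *pDest++ = nPattern;
    }
    return count;   /* one below the value that failed the test */
}

/* Proposed loop: count is only decremented inside the body. */
int count_after_explicit(int *pDest, int count, int nPattern)
{
    assert(pDest != NULL || count <= 0);
    while (count > 0) {
        *pDest = nPattern;
        ++pDest;
        --count;
    }
    return count;   /* exactly the value that failed the test */
}
```

Starting from a count of 3, the first function returns -1 and the second returns 0, although both write the same three elements. If `count` were an unsigned type, the original form would instead leave it wrapped round to the largest value of that type.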

If 'sizeof *pDest' == 1 (i.e. if 'pDest' is some form of 'char *')
consider using memset.
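
For the byte-sized case, the replacement would look something like this sketch (function names invented; `unsigned char` elements and a `size_t` count are assumptions made for the example):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The loop from the question, specialised to byte-sized elements. */
void fill_bytes_loop(unsigned char *pDest, size_t count,
                     unsigned char nPattern)
{
    assert(pDest != NULL || count == 0);
    while (count-- > 0) {
        *pDest++ = nPattern;
    }
}

/* The same effect using the standard library routine. */
void fill_bytes_memset(unsigned char *pDest, size_t count,
                       unsigned char nPattern)
{
    assert(pDest != NULL || count == 0);
    if (count > 0) {
        memset(pDest, nPattern, count);
    }
}
```

This only works when every byte of the destination should receive the same value, so it does not carry over to filling an array with copies of a multi-byte structure.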

--
Ben.
Dec 13 '07 #5
Thanks all,
The code is used for embedded real time control. It runs fast enough,
but only with the compiler optimisations turned up to maximum and no
debug info (although I don't think debug info and optimisations mix
anyway?). The algorithms have already been tweaked, and the
hardware (FPGA) is also similarly optimised. I'm at a slight loose end
whilst I wait for the finalised target hardware (actual circuit
board), so I thought I would go and 'tidy up' and see if I could help
the compiler out any. The tidy up was provoked by an article in the
IET Electronics magazine which said that ++i is easier to optimise than
i++ due to the need to store i and then increment it (think I got that
correct). As I don't really know how optimisers work I thought I'd ask
the question.

cheers
Dave
Dec 13 '07 #6
On Dec 13, 3:11 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
> The original decrements 'count' even when the condition is false so they
> terminate with different values in 'count'.

Yes, in this instance it makes no odds, but in another it could.

> If 'sizeof *pDest' == 1 (i.e. if 'pDest' is some form of 'char *')
> consider using memset.

I might normally use a bzero or similar, but this is a set of
structures and pointers to structures.

Dave

Dec 13 '07 #7
Dave S wrote:
> Thanks all,
> The code is used for embedded real time control. It runs fast enough,
> but only with the compiler optimisations turned up to maximum and no
> debug info (although I don't think debug info and optimisations mix
> anyway?). The algorithms have already been tweaked, and the
> hardware (FPGA) is also similarly optimised.

You need measurements (and, perhaps, well-grounded estimates) to find
out what's taking the time. Really. Otherwise you're just whistling
in the dark. If performance is important to you, some investment in
measurement seems worthwhile.

If you can't instrument the embedded application, is it /possible/ to
compile on say an x86 desktop and use that platform's profiling tools
to get an idea of where to look? Clearly you'd have to fake out the
FPGA stuff, and that might make a nonsense of the results; I don't do
embedded, so I don't have a feel for it, but some here do and might
be able to advise.

> The tidy up was provoked by an article in the
> IET Electronics magazine where ++i is easier to optimise than i++ due
> to the need to store i and then increment it.

It may be "easier" to optimise, but that doesn't matter; C compilers
that can't generate the "best" output just because you've used `i++`
rather than `++i` have learned nothing from the past twenty years.

[In C++, where `i` might be some big class instance, I understand that
it can make rather more difference. But in C? Unlikely. Not impossible;
just unlikely, given that you're using a gcc variant.]

--
Chris "i += 1" Dollin

Hewlett-Packard Limited, registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN

Dec 13 '07 #8
Dave S wrote:
> The tidy up was provoked by an article in the
> IET Electronics magazine where ++i is easier to optimise than i++ due
> to the need to store i and then increment it. (think I got that
> correct).

This is often said, but mainly by people who are talking about C++, and
even then only when i is a class type rather than a POD (plain old
data). If i is an int, then i++; and ++i; are equivalent statements, and
any compiler worth its salt knows this.
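
As a concrete case, the two functions below differ only in the form of the increment, and since the value of the increment expression is thrown away in both, they have identical behaviour. This is a sketch with invented names; whether a particular compiler emits identical code for them is something to confirm from its output rather than take on trust.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n ints using a post-increment in the for statement. */
long sum_postinc(const int *values, int n)
{
    long total = 0;
    int i;

    assert(values != NULL || n <= 0);
    for (i = 0; i < n; i++) {   /* result of i++ is discarded */
        total += values[i];
    }
    return total;
}

/* Identical apart from using a pre-increment. */
long sum_preinc(const int *values, int n)
{
    long total = 0;
    int i;

    assert(values != NULL || n <= 0);
    for (i = 0; i < n; ++i) {   /* result of ++i is discarded */
        total += values[i];
    }
    return total;
}
```

The two forms only differ in meaning when the result is used, as in `x = i++;` against `x = ++i;`.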

Your compiler may not know this optimization, but unless you can prove
that it doesn't, I will assume that it does. Similarly for any other
common peephole optimization trick.
Dec 13 '07 #9
Dave S <da*************@bem.fki-et.com> writes:
<snip>
> I might normally use a bzero or similar, but this is a set of
> structures and pointers to structures.

Then you might consider Duff's device:

http://www.lysator.liu.se/c/duffs-device.html

(view on an empty stomach if it is new to you) but, as always, measure
before investing any effort in tricksy code changes.
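
For reference, a fill loop in the style of Duff's device might look like the sketch below. It is an adaptation to this thread's loop rather than the code on the linked page, and `pattern_t` and `fill_duff` are names invented for the example. The loop body is unrolled eight times and the `switch` jumps into the middle of it to deal with the remainder.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element type standing in for the real structures. */
typedef struct {
    int a;
    int b;
} pattern_t;

/* Stores nPattern into count consecutive elements starting at pDest. */
void fill_duff(pattern_t *pDest, int count, pattern_t nPattern)
{
    int n;

    assert(pDest != NULL || count <= 0);
    if (count <= 0) {
        return;                 /* the do-while below runs at least once */
    }
    n = (count + 7) / 8;        /* number of passes through the do-while */
    switch (count % 8) {        /* jump in part-way on the first pass */
    case 0: do { *pDest++ = nPattern;
    case 7:      *pDest++ = nPattern;
    case 6:      *pDest++ = nPattern;
    case 5:      *pDest++ = nPattern;
    case 4:      *pDest++ = nPattern;
    case 3:      *pDest++ = nPattern;
    case 2:      *pDest++ = nPattern;
    case 1:      *pDest++ = nPattern;
            } while (--n > 0);
    }
}
```

An optimising compiler may already unroll the plain loop on its own, so this is worth keeping only if measurement on the target shows a gain.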

--
Ben.
Dec 13 '07 #10
