Optimisation techniques

Hi All,

I am new to optimising C code. I was wondering whether C-level
inlining is in any way comparable to macros. I believe that inlining
is equivalent to writing macros, but I have a nagging feeling that I
should use macros wherever necessary instead of inlining the code.

In my project I have to use basic operations like add and sub many
times. I initially inlined them and saw a considerable improvement.
However, I was wondering whether I could gain more by using macros
instead of inlining in some of the functions, if not all.

Could anyone help me with various optimisation techniques in C? Any
pointers would be appreciated.

Thanks and Regards,
-akd
----------------------------------
Anoop Deoras
Signal Processing Group
Sasken Communication Technologies,
India
----------------------------------
Nov 14 '05 #1
EC-AKD <ak*****@yahoo.com> spoke thus:

> I was wondering if
> C level inlining of code is any way comparable to macros.

No.

> I believe
> that inlining is equivalent to writing macros.

No. Macros are a purely textual substitution performed by the
preprocessor; the only optimization inherent in them is for the reader
and the maintainer. Inlined functions eliminate the overhead of the
function call, so they can improve the efficiency of the code relative
to their non-inlined versions.
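
To make the distinction concrete, here is a minimal sketch. The names SQUARE_MACRO, square_fn and next_value are made up for illustration, and "static inline" assumes a C99 compiler. Because the macro pastes its argument text twice, a side effect in the argument runs twice, whereas the function receives an argument that was evaluated once.

```c
/* Hypothetical names, for illustration only. */
#define SQUARE_MACRO(x) ((x) * (x))   /* textual substitution: x appears twice */

/* C99 'inline' is only a hint; the compiler may still emit a real call. */
static inline int square_fn(int x)
{
    return x * x;                     /* x was evaluated once by the caller */
}

static int calls = 0;                 /* counts how often next_value() runs */

static int next_value(void)
{
    ++calls;
    return 3;
}
```

Passing next_value() to SQUARE_MACRO calls it twice; passing it to square_fn calls it once. Both give 9 here only because next_value always returns the same number.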

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Nov 14 '05 #2
In article <2f*************************@posting.google.com>,
EC-AKD <ak*****@yahoo.com> wrote:

> I am new to the concept of optimising codes in C. I was wondering if
> C level inlining of code is any way comparable to macros. I believe
> that inlining is equivalent to writing macros.

They are certainly related techniques. But the facilities provided
*by C* for them are not of equal power.

C's macro mechanism doesn't provide any way to reliably introduce
variables without name conflicts (though suitably prefixed names are
often adequate). And the lack of full generality in C's block
structuring makes it hard to use constructs like switches and loops in
a macro.
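
A small sketch of the usual workaround for both problems, under the assumption that a prefixed name is good enough. SWAP_INT and the swap_int_tmp_ prefix are hypothetical. The do { ... } while (0) wrapper lets the macro behave as a single statement, and the prefixed temporary makes a clash with the caller's variables unlikely, though not impossible.

```c
/* Hypothetical macro; the 'swap_int_tmp_' prefix is an arbitrary choice
   meant to make a clash with caller variables unlikely, not impossible. */
#define SWAP_INT(a, b)              \
    do {                            \
        int swap_int_tmp_ = (a);    \
        (a) = (b);                  \
        (b) = swap_int_tmp_;        \
    } while (0)
```

Because of the wrapper, "if (x < y) SWAP_INT(x, y); else ..." parses as intended, which a bare { ... } block followed by a semicolon would not.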

On the other hand, C89 does not provide inline functions, and there's
no guarantee that declaring a function inline will do anything even if
it's accepted.

> However I had this strange feeling if I should go for macros
> wherever necessary instead of inlining the codes.

If you want your code to be portable, you will at least have to
conditionalize any inline declarations. So if a macro will do the job
without unreasonable contortions, you might as well use one.
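
One possible way to conditionalize an inline declaration is sketched below. INLINE is a hypothetical project-local macro and add16 is a made-up example function; the test on __STDC_VERSION__ only detects C99 or later and says nothing about compiler-specific extensions.

```c
/* INLINE is a hypothetical project-local macro, not a standard name. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#  define INLINE static inline      /* C99 or later: keyword is available */
#else
#  define INLINE static             /* C89: fall back to a plain static function */
#endif

/* Made-up example function using the macro. */
INLINE int add16(int a, int b)
{
    return a + b;
}
```

On a C89 compiler the function is simply static, so the code still builds; whether the call is actually expanded inline is left to the compiler in both cases.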

-- Richard
Nov 14 '05 #3

"EC-AKD" <ak*****@yahoo.com> wrote in message

> Could any one help me with various optimisation techniques in C. Any
> pointers would be appreciated.

C99 provides "inline", but C99 isn't in common use yet. However, "inline" is
also a C++ keyword, so as long as your code is written so that it also
compiles as C++, you can use "inline" on any platform for which a C++
compiler is available.
Nov 14 '05 #4
On Tue, 3 Aug 2004 12:32:51 +0000 (UTC), Christopher Benson-Manica
<at***@nospam.cyberspace.org> wrote in comp.lang.c:
> EC-AKD <ak*****@yahoo.com> spoke thus:
>> I was wondering if
>> C level inlining of code is any way comparable to macros.
>
> No.
>
>> I believe
>> that inlining is equivalent to writing macros.
>
> No. Macros are a purely textual substitution performed by the
> preprocessor; the only optimization inherent in them is for the reader
> and the maintainer. Inlined functions eliminate the overhead of the
> function call, so they can improve the efficiency of the code relative
> to their non-inlined versions.

And where exactly in the C standard does it state that functions
defined with the "inline" keyword are guaranteed to actually be
inlined?

Macros are always expanded inline. Functions with the "inline"
keyword might or might not be.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #5
Jack Klein <ja*******@spamcop.net> spoke thus:

> And where exactly in the C standard does it state that functions
> defined with the "inline" keyword are guaranteed to actually be
> inlined? Macros are always expanded inline. Functions with the "inline"
> keyword might or might not be.

I was wondering when someone was going to correct what I later
realized was, at best, a decidedly mediocre post. Thanks.

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Nov 14 '05 #6
You should read the documentation of your compiler. The compiler is
not bound to obey the "inline" keyword (as someone already said).
Some compilers will always obey it, some will not. In C++ different
rules apply.

Whether an inline is as fast as a macro depends on how the compiler
expands the inline function.
Nov 14 '05 #7
ak*****@yahoo.com (EC-AKD) writes:

> I am new to the concept of optimising codes in C. I was wondering if
> C level inlining of code is any way comparable to macros. I believe
> that inlining is equivalent to writing macros. However I had this
> strange feeling if I should go for macros wherever necessary instead
> of inlining the codes.
>
> Could any one help me with various optimisation techniques in C. Any
> pointers would be appreciated.

The last three issues of C Vu Magazine (16.1-16.3, http://accu.org/)
have a series of introductory articles on optimisation in C.

--
Brian Gough

Network Theory Ltd,
Publishing "An Introduction to GCC" --- http://www.network-theory.co.uk/
Nov 14 '05 #8
Brian Gough <bj*@network-theory.co.uk> wrote in message news:<87************@network-theory.co.uk>...

> ak*****@yahoo.com (EC-AKD) writes:
>> I am new to the concept of optimising codes in C. I was wondering if
>> C level inlining of code is any way comparable to macros. I believe
>> that inlining is equivalent to writing macros. However I had this
>> strange feeling if I should go for macros wherever necessary instead
>> of inlining the codes.
>>
>> Could any one help me with various optimisation techniques in C. Any
>> pointers would be appreciated.
>
> The last three issue of C Vu Magazine (16.1-16.3, http://accu.org/)
> have a series of introductory articles on optimisation in C.

Thanks to everyone who replied to my message. I have to port my code
to an ARM processor, so I was wondering whether converting the basic
operations (16-bit addition, subtraction, multiplication, shifting,
rounding, etc.) to assembly and calling them would be faster than
inlining them or writing them as macros.

The ARM9 is very close to a DSP and has many good single-cycle
instructions. But if I convert the operations to assembly, there is
the overhead of calling them many times, which is eliminated when we
write macros or inline the code.

So I need your help. How do I decide which is better: inlining, or
converting my code to assembly and calling the functions?
Any pointers please...

Thanks in advance,
Regards,
-akd

-------------------------------------
Anoop Deoras
Signal Processing group,
Sasken Communication Technologies,
India
Email: akdeoras[at]sasken[dot]com
-------------------------------------
Nov 14 '05 #9
You are asking THE optimization question :-)

In other words, the answers to these questions depend enormously on the kind
of hardware and CPU you use. Also, the compiler has something to do with it.

The general rules are:
If you wish to eliminate the (somewhat time-consuming) procedure calls, start
using macros.
However, more recent CPUs have a good opcode prediction system built in,
and their instruction pipelines are designed so that function calls can be
predicted and executed rapidly.

So I think the real optimization comes from using inline asm for your inner
loops. This way you can decide which registers go on the stack and easily
calculate the number of clock cycles you need.
PLEASE NOTE: you'll have to write better/faster asm than the compiler if you
want to see any improvement in performance. This might be difficult with
the latest compilers. (But I do not have any experience with ARM
processors.)
Do as much as possible in one loop and try to avoid unnecessary loops.
Small example:
If you do this:
{
    for (int y = 0; y < 200; y++)
        for (int x = 0; x < 320; x++)
            myPointerToMem[x + y * 320] = 234;
}

then this is awful. Two loops are _bad_ :-)

Better do:
{
    for (int i = 0; i < 64000; i++)
        myPointerToMem[i] = 234;
}

Ok, bad example, there are much faster fillchar routines available, but for
the sake of example it does the trick.
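
For this particular fill, the standard library routine memset is one such ready-made option. The sketch below assumes the buffer holds unsigned char elements, which the post does not state; fill_screen, WIDTH and HEIGHT are names invented for the illustration.

```c
#include <string.h>

#define WIDTH  320
#define HEIGHT 200

/* Fill a WIDTH*HEIGHT byte buffer with one value using the library routine.
   Assumes unsigned char elements, which the original post leaves unstated. */
static void fill_screen(unsigned char *buf, unsigned char value)
{
    memset(buf, value, (size_t)WIDTH * HEIGHT);
}
```

Calling fill_screen(screen, 234) on a 64000-byte array does the same job as either loop above; how much faster it is depends on the library and the target.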

Good luck.
Nov 14 '05 #10
On 16 Aug 2004 03:49:57 -0700
ak*****@yahoo.com (EC-AKD) wrote:

<snip>
> Thanks to everyone who replied to my mail. Well! I have to port my
> code on arm processor and hence I was wondering if converting the
> basic operations (like 16 bit additions, sub, mult , shifting,
> rounding etc )to assembly and calling them would in any way optimise
> my code than inlining the same or writing macros of the same.

If your compiler has a 16 bit type (it probably does if the hardware
supports 16 bit operations) then for the operations that C supports
using that type is likely to be the most efficient method IMO.
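
As an illustration of simply using the type, here is a sketch that assumes a C99 <stdint.h> providing int16_t and int32_t (both are optional in the standard, though commonly available). mul16 is a made-up name. Whether the multiply compiles to a single instruction depends on the compiler and the target.

```c
#include <stdint.h>

/* 16x16 -> 32 bit multiply written in plain C. Widening both operands first
   keeps the product from overflowing a 16-bit int on small targets. */
static int32_t mul16(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}
```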

> Arm9 is very near to DSP and has many good one cycle instructions. But
> if I convert the operations into assembly then there would be an
> overhead of calling them many a times, which gets eliminated when we
> write macros OR inline the codes.
> I hence need your help. How do I decide what is good. Is inlining good
> or is converting my code to assembly and calling the functions good?
> Any pointers please...

Your best bet is probably to start by reading the documentation for the
compiler. After that you should go to a group dedicated either to your
compiler or to the processor where you will find lots of experts who
know about it.
--
Flash Gordon
Sometimes I think shooting would be far too good for some people.
Although my email address says spam, it is real and I read it.
Nov 14 '05 #11
>I hence need your help. How do I decide what is good. Is inlining good
> or is converting my code to assembly and calling the functions good?
> Any pointers please...


Little math:
ARM mode   : subroutine call + return : 2 insns = 8 bytes
Thumb mode : subroutine call + return : 2 insns = 6 bytes

So subroutines should be larger than the call/return pair + some extra
for the branch overhead.

Optimize inner loops to have no subroutine-calls and to fit into the
cache.

Try to align inner-loops on a cache-boundary.

If you are new to ARM: Don't try to be smarter than the compiler :-)

Check whether Thumb might fit better for certain parts of your code.

--
42Bastian
Do not email to ba*******@yahoo.com, it's a spam-only account :-)
Use <same-name>@epost.de instead !
Nov 14 '05 #12
On 16 Aug 2004 03:49:57 -0700, EC-AKD <ak*****@yahoo.com> wrote:
> I hence need your help. How do I decide what is good. Is inlining good
> or is converting my code to assembly and calling the functions good?
> Any pointers please...

Get yourself a copy of ARM Appnote #34, "Writing Efficient C for ARM":
http://www.arm.com/pdfs/DAI0034A_efficient_c.pdf

HTH,

Vadim
Nov 14 '05 #13
The simple 16-bit operations should already be optimized for you, so you
don't want to try replacing them.

If you _can_ speed things up by using assembly it'll be by writing code
that's tighter than the compiler can do, and that will happen at the
level of 5-10 line chunks of code or larger, if it happens at all.

I would approach this by compiling to assembly (read your compiler's
manual for the command line option to use) and examining the result,
then playing with my C code to see if I can make it tighter. This also
tells you if the compiler is doing something consistently stupid that
you can fix by going to assembly.
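
As a sketch of that workflow, a small function like the one below is easy to follow in the compiler's output. The file name dot.c and the function dot are invented for the example. With GCC the option that stops after generating assembly is -S; other compilers use different options, so the manual remains the reference.

```c
/* dot.c - a small function whose generated assembly is easy to read.
   With GCC, for instance, "gcc -O2 -S dot.c" writes the assembly to dot.s. */
long dot(const short *a, const short *b, int n)
{
    long sum = 0;
    int i;

    for (i = 0; i < n; i++)
        sum += (long)a[i] * b[i];   /* candidate for a multiply-accumulate */
    return sum;
}
```

Reading the loop body in the output shows whether the compiler already picked the instruction you had in mind, before any hand-written assembly is attempted.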

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Nov 14 '05 #14

"EC-AKD" <ak*****@yahoo.com> wrote in message
news:2f**************************@posting.google.com...
> Brian Gough <bj*@network-theory.co.uk> wrote in message news:<87************@network-theory.co.uk>...
>> ak*****@yahoo.com (EC-AKD) writes:
>>> I am new to the concept of optimising codes in C. I was wondering if
>>> C level inlining of code is any way comparable to macros. I believe
>>> that inlining is equivalent to writing macros. However I had this
>>> strange feeling if I should go for macros wherever necessary instead
>>> of inlining the codes.

In a good compiler inlining is better than macro substitution. It is
almost always better to leave the inlining decision to the compiler -
it can take the size of the function, the number of calls, the type
of arguments passed (eg. constants) and even whether it is called
from an inner loop into account. The #1 C++ optimization is to
*not* inline functions that are marked inline either implicitly or
explicitly when it would result in a huge code bloat.

> Thanks to everyone who replied to my mail. Well! I have to port my
> code on arm processor and hence I was wondering if converting the
> basic operations (like 16 bit additions, sub, mult , shifting,
> rounding etc )to assembly and calling them would in any way optimise
> my code than inlining the same or writing macros of the same.

I'd suggest leaving built-in operations as they are, and using inline
functions only for more complicated operations. You can use inline
assembler to create basic inlineable intrinsic functions, such as
saturated addition. There is no need to use inline assembler for any
operations that a compiler can easily recognise - eg SMULBB for
a 16x16->32 multiply.
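
For reference, this is roughly what a saturated 16-bit addition looks like as an inlineable function in portable C. It is only a sketch: sat_add16 is a made-up name, the code assumes a C99 compiler with <stdint.h>, and it uses no inline assembler or compiler intrinsics. A version built on the target's own saturating instruction would be compiler-specific.

```c
#include <stdint.h>

/* Saturating 16-bit addition in portable C (sat_add16 is a made-up name).
   The sum is formed in 32 bits and then clamped to the int16_t range. */
static inline int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;   /* cannot overflow in 32 bits */

    if (sum > INT16_MAX)
        return INT16_MAX;
    if (sum < INT16_MIN)
        return INT16_MIN;
    return (int16_t)sum;
}
```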

> I hence need your help. How do I decide what is good. Is inlining good
> or is converting my code to assembly and calling the functions good?

Generally speaking it's best to stay with C, inline functions and some
intrinsics/inline assembler. Once you've got your code working you
can identify where the time is spent (not always where you think it is)
and optimise the code starting from high level optimizations down to
low-level until you reach the desired performance.
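
Where no profiler is at hand, a rough first measurement can be made with the standard clock() function. The sketch below is only that: time_call and workload are hypothetical names, clock() often has coarse resolution, and real measurements usually repeat the work many times and use the target's own profiling tools where they exist.

```c
#include <time.h>

static volatile long sink;            /* keeps the loop from being optimised away */

/* Stand-in for the code under test: sums 0..999 into sink. */
static void workload(void)
{
    long i;

    for (i = 0; i < 1000L; i++)
        sink += i;
}

/* Return the processor time, in seconds, used by one call of fn. */
static double time_call(void (*fn)(void))
{
    clock_t start = clock();

    fn();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A call such as time_call(workload) gives a figure to compare before and after each change, which helps confirm where the time really goes.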

Resist the temptation of using a lot of inline assembler, you'll have
to fight the optimizer! Only consider converting a critical loop or
function to assembler after you've done everything you can in C,
you know how to write optimal assembly code and you found the
inefficiencies in the assembler output.
Wilco

Nov 14 '05 #15

"Rob Vermeulen" <rvermeulen@arbor-audio-antispam-.com> wrote in message
news:10*************@corp.supernews.com...

> {
>     for (int y = 0; y < 200; y++)
>         for (int x = 0; x < 320; x++)
>             myPointerToMem[x + y * 320] = 234;
> }
>
> then this is awful. Two loops are _bad_ :-)

It's awful, not because there are 2 loops, but because it writes a *byte* at
a time. Unroll and write 2-4 words per iteration to get anywhere near
decent performance.
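
A sketch of what unrolling to several words per iteration can look like in C. It assumes the destination is an array of 32-bit words, suitably aligned; fill_words is an invented name and four stores per trip is an arbitrary choice within the "2-4 words" suggested.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill 'count' 32-bit words, four per iteration, then finish the remainder.
   Assumes dst is suitably aligned for uint32_t; fill_words is a made-up name. */
static void fill_words(uint32_t *dst, uint32_t value, size_t count)
{
    size_t i = 0;

    for (; i + 4 <= count; i += 4) {   /* unrolled body: four stores per trip */
        dst[i]     = value;
        dst[i + 1] = value;
        dst[i + 2] = value;
        dst[i + 3] = value;
    }
    for (; i < count; i++)             /* leftover 0..3 words */
        dst[i] = value;
}
```

To reproduce the byte value 234 in every byte, the word passed in would be 0xEAEAEAEA. A good compiler or library memset may already do something similar, so this is worth measuring rather than assuming.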

> Better do:
> {
>     for (int i = 0; i < 64000; i++)
>         myPointerToMem[i] = 234;
> }

This will give the same terrible performance as the first example.
Loop fusion only helps if the inner loop has a low trip count. Note
the multiply is loop invariant so it's for free anyway.

> Ok, bad example, there are a lot faster fillchar routines available, but for
> the sake of exampling it does the trick.

Indeed - it's not uncommon for people to write their own memset
or memcpy and then find out the built-in version is an order of
magnitude faster...

Wilco

Nov 14 '05 #16
Wilco,

Good to see a Dutchman in this discussion forum :-) (comp.dsp that is)

> It's awful, not because there are 2 loops, but because it writes a *byte* at
> a time. Unroll and write 2-4 words per iteration to get anywhere near
> decent performance.

That's a point of discussion. The pointer could also be pointing to 32-bit
memory locations, in which case I'm actually writing *DWORDS*.
But you are absolutely right about the awfulness. I should have come up
with a more decent example, perhaps showing how to eliminate iterations by
setting multiple dwords at a time. I just wanted to give an example.

>> Better do:
>> {
>>     for (int i = 0; i < 64000; i++)
>>         myPointerToMem[i] = 234;
>> }
> This will give the same terrible performance as the first example.
> Loop fusion only helps if the inner loop has a low trip count. Note
> the multiply is loop invariant so it's for free anyway.

That's where you are right. Naughty me should have come up with a better
example.

> Indeed - it's not uncommon for people to write their own memset
> or memcpy and then find out the built-in version is an order of
> magnitude faster...

Hmm, I've also seen the opposite.
The usual implementation of memset in a MS Visual C++ environment isn't that
rapid (understating), it just works.

Best regards,

Rob
Nov 14 '05 #17
Vadim Borshchev <va*************@127.0.0.1> wrote in message news:<opsctuljq7y1ubid@news>...
> On 16 Aug 2004 03:49:57 -0700, EC-AKD <ak*****@yahoo.com> wrote:
>> I hence need your help. How do I decide what is good. Is inlining good
>> or is converting my code to assembly and calling the functions good?
>> Any pointers please...
>
> Get yourself a copy of ARM Appnote #34, "Writing Efficient C for ARM":
> http://www.arm.com/pdfs/DAI0034A_efficient_c.pdf
>
> HTH,
>
> Vadim

You still @ picsel?
Nov 14 '05 #18
