Program performance/optimisation

Hi,
I was exploring the effect of cache on program
performance/optimisation. Is it only the compiler's responsibility to
consider this kind of optimisation, or can the programmer do his bit in
this case?
Reading through the "Expert C Programming" text, it mentions how the
program below can be made efficient by taking the cache details into account.

The program below can be built with each of the two versions of the copy
in turn and run under the time command on Unix to
see the difference. As is obvious, the slowdown happens in DUMBCOPY.

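#include <stdio.h>
#include <string.h>

/* Byte-at-a-time copy; uses i from the enclosing scope. */
#define DUMBCOPY for (i = 0; i < 65536; i++) \
                     destination[i] = source[i]

/* Library copy of the same 65536 bytes. */
#define SMARTCOPY memcpy(destination, source, 65536)

int main(void)
{
    char source[65536], destination[65536];
    int i, j;

    /* Swap the comment between the two lines below to time the other copy. */
    for (j = 0; j < 100; j++)
        SMARTCOPY;
        /* DUMBCOPY; */
    return 0;
}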
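
As a rough sketch of how the two copies could be timed from inside one program instead of rebuilding and using the time command: the names dumb_copy, smart_copy and time_copy below are made up for illustration and are not from the book. Note that an optimising compiler may turn the byte loop into a memcpy call or drop it altogether, so the numbers only mean something when you know what the compiler actually generated.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

#define SIZE 65536

/* File-scope buffers, same size as in the posted program. */
char source[SIZE], destination[SIZE];

/* Byte-at-a-time loop, equivalent to DUMBCOPY. */
void dumb_copy(void)
{
    size_t i;
    for (i = 0; i < SIZE; i++)
        destination[i] = source[i];
}

/* Library copy, equivalent to SMARTCOPY. */
void smart_copy(void)
{
    memcpy(destination, source, SIZE);
}

/* Hypothetical helper: run copy() reps times and return the
   processor time used, in seconds, as reported by clock(). */
double time_copy(void (*copy)(void), int reps)
{
    clock_t start = clock();
    int j;
    for (j = 0; j < reps; j++)
        copy();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy(dumb_copy, 100) and time_copy(smart_copy, 100) from main and printing both results gives the comparison in a single run. Whether any difference shows up at all depends on the machine and its cache.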

Below is the reasoning given:
The slowdown happens because the source and destination are an exact
multiple of the cache size apart. The particular algorithm used happens
to fill the same line for main memory addresses that are exact multiples
of the cache size apart.
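
To make the "same line" point concrete, here is a minimal sketch that assumes a direct-mapped cache, where each address can live in exactly one line. The helper cache_line_index is hypothetical, and the line and cache sizes passed to it are assumptions, not figures for any real machine.

```c
#include <stddef.h>

/* Hypothetical direct-mapped cache: which line does an address land in?
   line_size and cache_size are in bytes and are assumed values. */
size_t cache_line_index(unsigned long addr, size_t line_size, size_t cache_size)
{
    size_t num_lines = cache_size / line_size;

    /* Drop the offset within the line, then wrap around the cache. */
    return (size_t)((addr / line_size) % num_lines);
}
```

With an assumed 65536-byte cache and 32-byte lines, cache_line_index(a, 32, 65536) and cache_line_index(a + 65536, 32, 65536) give the same result for an address a, so source[i] and destination[i] would compete for one line if the two arrays sat exactly 65536 bytes apart.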

In this particular case both the source and destination use the same
cache line, causing every memory reference to miss the cache and stall
the processor while it waited for regular memory to deliver. The library
memcpy() routine is especially tuned for high performance.
It unrolls the loop to read for one cache line and then write, which
avoids the problem. Using the smart copy, we were able to get a huge
performance improvement. This also shows the folly of drawing
conclusions from simple-minded benchmark programs.
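
One way to picture "read for one cache line and then write" is the sketch below. It is only an illustration of the idea, not how any real memcpy() is implemented; block_copy is a made-up name and LINE_SIZE is an assumed line size. A compiler is also free to keep the temporary buffer in registers or rearrange the loops.

```c
#include <stddef.h>

#define LINE_SIZE 32  /* assumed cache line size in bytes; hypothetical */

/* Copy n bytes from src to dst one line-sized chunk at a time:
   first read a whole chunk from src, then write the whole chunk to dst. */
void block_copy(char *dst, const char *src, size_t n)
{
    char line[LINE_SIZE];
    size_t i, k, chunk = 0;

    for (i = 0; i < n; i += chunk) {
        /* The last chunk may be shorter than a full line. */
        chunk = (n - i < LINE_SIZE) ? n - i : LINE_SIZE;

        for (k = 0; k < chunk; k++)   /* read phase */
            line[k] = src[i + k];
        for (k = 0; k < chunk; k++)   /* write phase */
            dst[i + k] = line[k];
    }
}
```

The point of the grouping is that the source and destination are touched in bursts instead of alternating on every byte, so a line fetched for one of them gets fully used before the other can evict it.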

I don't fully understand the above two paragraphs, so I would be grateful
if someone could give a better explanation. I would also appreciate any
helpful pointers.

This might not be something directly related to C, but I thought I would
get better answers in this newsgroup, hence the posting.

-TIA
Sep 12 '06 #1
On Tue, 12 Sep 2006 21:19:34 +0530, grid <pr******@gmail.com> wrote in
comp.lang.c:
> Hi,
> I was exploring the effect of cache on program
> performance/optimisation. Is it only the compiler's responsibility to
> consider this kind of optimisation, or can the programmer do his bit in
> this case?

Any special effort a particular compiler makes to use cache, or any
other hardware feature of the platform, is completely a QOI (Quality
Of Implementation) issue, not a language one. The C language and its
standard define the operation of a correctly written program. They
make no mention of, nor do they place any requirements on, the speed
or efficiency of any program.

> Reading through the "Expert C Programming" text, it mentions how the
> program below can be made efficient by taking the cache details into account.
>
> The program below can be built with each of the two versions of the copy
> in turn and run under the time command on Unix to
> see the difference. As is obvious, the slowdown happens in DUMBCOPY.
>
> #include <stdio.h>
> #include <string.h>
>
> #define DUMBCOPY for (i = 0; i < 65536; i++) \
>                      destination[i] = source[i]
>
> #define SMARTCOPY memcpy(destination, source, 65536)
>
> int main()
> {
>     char source[65536], destination[65536];
>     int i, j;
>     for (j = 0; j < 100; j++)
>         SMARTCOPY;
>         /* DUMBCOPY; */
>     return 0;
> }
>
> Below is the reasoning given:
> The slowdown happens because the source and destination are an exact
> multiple of the cache size apart. The particular algorithm used happens
> to fill the same line for main memory addresses that are exact multiples
> of the cache size apart.
>
> In this particular case both the source and destination use the same
> cache line, causing every memory reference to miss the cache and stall
> the processor while it waited for regular memory to deliver. The library
> memcpy() routine is especially tuned for high performance.
> It unrolls the loop to read for one cache line and then write, which
> avoids the problem. Using the smart copy, we were able to get a huge
> performance improvement. This also shows the folly of drawing
> conclusions from simple-minded benchmark programs.
>
> I don't fully understand the above two paragraphs, so I would be grateful
> if someone could give a better explanation. I would also appreciate any
> helpful pointers.
>
> This might not be something directly related to C, but I thought I would
> get better answers in this newsgroup, hence the posting.

Actually, your question is completely off-topic here. As far as C is
concerned, there is no such thing as a cache, cache line, or processor
stall. This is all quite hardware and architecture dependent.

You need to ask questions about this in some sort of platform specific
newsgroup. The moderated group news:comp.lang.asm.x86 is a good place
to discuss the behavior of such things as cache on x86 processors.
You'll have to look to find an appropriate group for other processor
architectures.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Sep 12 '06 #2
