Performance: VC++ 33% slower than Builder 5 on LineTo() API call??

Greetings!

Getting straight to the point, here are the results
of my experiment. I've included my comments and questions
after them.

The timing:
(The total time is the sum of each line's drawing time.
Time is measured in clock ticks (from the QueryPerformanceCounter() API).
The counter frequency (QueryPerformanceFrequency()) for my
machine is 3579545 ticks per second.)
------------------------------------------
Visual Studio .NET 2003
Total time: 717230
Average: 89.8165625

Borland Builder 5:
Total Time: 482151
Average: 61.0975
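
To put the tick counts above in wall-clock terms, divide them by the counter frequency. The following is only a minimal sketch of that arithmetic; the helper names `ticks_to_microseconds` and `print_averages` are hypothetical and not part of the posted code.

```cpp
#include <cstdio>

// Hypothetical helper: convert a QueryPerformanceCounter tick count to
// microseconds, given the value reported by QueryPerformanceFrequency().
double ticks_to_microseconds(double ticks, double ticks_per_second)
{
    return ticks * 1000000.0 / ticks_per_second;
}

// Print the two per-line averages quoted above in microseconds.
void print_averages()
{
    const double frequency = 3579545.0; // ticks per second, from the post
    std::printf("VS .NET 2003: %.2f us per line\n",
                ticks_to_microseconds(89.8165625, frequency));
    std::printf("Builder 5:    %.2f us per line\n",
                ticks_to_microseconds(61.0975, frequency));
}
```

With the figures above this works out to roughly 25 microseconds per line for the Visual Studio build and roughly 17 microseconds for the Builder build.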

The code (for the DLL):
------------------------------------------
DrawDll.h
#ifdef DRAWDLL_EXPORTS
#define DRAWDLL_API __declspec(dllexport)
#else
#define DRAWDLL_API __declspec(dllimport)
#endif

class DRAWDLL_API CDrawDll {
public:
    CDrawDll(void);
    void MyMethod(HWND handle);
};

DrawDll.cpp
#include "stdafx.h"
#include "DrawDll.h"
#include <stdio.h>

BOOL APIENTRY DllMain( HANDLE hModule,
                       DWORD ul_reason_for_call,
                       LPVOID lpReserved
                     )
{
    return TRUE;
}

void CDrawDll::MyMethod(HWND handle)
{
    HDC hDC = ::GetDC(handle);

    LARGE_INTEGER m_StartCounter; // start time
    LARGE_INTEGER m_EndCounter;   // finish time
    __int64 m_ElapsedTime;
    char buff2[255];

    // For 800 different positions
    for (int x = 0; x < 800; x++)
    {
        // 10 times on each position
        for (int rep = 0; rep < 10; rep++)
        {
            QueryPerformanceCounter(&m_StartCounter);

            ::MoveToEx(hDC, x, 0, NULL);
            ::LineTo(hDC, 50 + x, 50);

            QueryPerformanceCounter(&m_EndCounter);

            // get and store finishing time and calc elapsed time (ticks)
            m_ElapsedTime = (m_EndCounter.QuadPart -
                             m_StartCounter.QuadPart);
            // NOTE: %d does not match __int64; a later reply points this out
            sprintf(buff2, "%d\n", m_ElapsedTime);
            OutputDebugString(buff2);
        }
    }

    ReleaseDC(handle, hDC);
}

CDrawDll::CDrawDll()
{
    return;
}

The explanation
---------------------------------------------------------

While porting a big project to Visual Studio, I started
facing some performance problems. Things were much slower in the VS-compiled
executables. I went to study what exactly was happening and came to some
startling (from my point of view) conclusions.

I made a DLL and compiled it with Builder and Visual C++ .NET, with all
optimizations enabled for both compilers. The DLL has a class with only
one function, which gets a handle to a DC and draws 8,000 lines on it.

I made 2 executables that run the function from the DLL (compiled with
both compilers too).

The results were astonishing, for me, and I'd like an explanation for
what is happening.

I've run the test several times and the results are always of the
same magnitude. How can that be, if the only thing I'm doing is MoveToEx() and
LineTo() API calls?

It's something simple! I'm not touching the disk, loading large
chunks of memory, using managed extensions (I created a 'pure' Win32
project under VS), or doing anything else that could affect performance.
Only 2 simple API calls.

Is Visual C++ really THAT MUCH slower?

I have the complete code and compiled executables here and will be glad
to send them to anyone who wants to replicate the test. As far as this posting
is concerned:

- Are VS-compiled DLLs and/or executables inherently slower than, for
instance, Builder 5 ones?
- Why does a simple API call take that much longer? Isn't it the same API call?
Shouldn't the call be fast and the API function itself take longer?
- Is there anything I can do/try to make the code run faster?

We would like to migrate other big projects to Visual C++, but now we're
having second thoughts!

Waiting for a light,

Gustavo L. Fabro
Jul 21 '05 #1
Could you give me an example of when I would want to call that function 8000
times in a tight loop?

Jonathan
Jul 21 '05 #2
Did you look at the assembly produced by both compilers?

But artificial tests like this rarely mean anything in real applications...

--
Phil Frisbie, Jr.
Hawk Software
http://www.hawksoft.com

Jul 21 '05 #3
Jonathan wrote:
Could you give me an example of when I would want to call that function 8000
times in a tight loop?

In a CAD application, for instance. The function is called only once;
what it does is draw 8000 lines.

A regular CAD drawing needs many more than 8,000 lines to render
completely.
Jul 21 '05 #4
Phil Frisbie, Jr. wrote:
Did you look at the assembly produced by both compilers?

Unfortunately, I don't yet know enough assembly language to draw anything
concrete from the two listings. If it helps, I can disassemble both DLLs and
post the code here!

Phil Frisbie, Jr. wrote:
But artificial tests like this rarely mean anything in real
applications...

I'm afraid that is not the case here. This test just replicates something I
have seen in practice: our CAD application took 5 times longer to draw the
same file on the screen with the VS-compiled version than with our
Builder-compiled one.

The application itself has lots of classes and DLLs, and we mix managed and
unmanaged C++, so I first tried to check whether the API calls themselves
(after all the processing of elements, point positions, etc.) were running
at the same speed. If they were, I would then focus on the
managed/unmanaged approach, DLL interaction and other factors.

But when I saw that even the API drawing calls themselves were taking
longer, I got intrigued... and decided to do this test! Hence the results
shown here and the question: is it *really* like this?

Fabro
Jul 21 '05 #5
Gustavo L. Fabro wrote:
In a CAD application, for instance. The function is called only once;
what it does is draw 8000 lines.

A regular CAD drawing needs many more than 8,000 lines to render
completely.


In a CAD program you wouldn't be calling QueryPerformanceCounter or
OutputDebugString for each line:

for (int rep = 0; rep < 10; rep++)
{
    QueryPerformanceCounter(&m_StartCounter);

    ::MoveToEx(hDC, x, 0, NULL);
    ::LineTo(hDC, 50 + x, 50);

    QueryPerformanceCounter(&m_EndCounter);

    // get and store finishing time and calc elapsed time (ticks)
    m_ElapsedTime = (m_EndCounter.QuadPart -
                     m_StartCounter.QuadPart);
    sprintf(buff2, "%d\n", m_ElapsedTime);
    OutputDebugString(buff2);
}

QPC and ODS both have high overhead: each involves a transition to kernel
mode and back; QPC samples the hardware timer; and when a debugger is
attached, ODS effectively triggers an exception, which causes a full
context switch to the debugger and back.

Move the benchmarking code to the outside of the outer loop -- time the
whole operation -- and then compare results.
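
A minimal sketch of that restructuring follows. It uses std::chrono::steady_clock in place of QueryPerformanceCounter so that it compiles on any platform, which is an assumption made for illustration. The names `time_whole_run` and `draw_line` are hypothetical; in the DLL the callback would wrap the MoveToEx()/LineTo() pair.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: time the complete 800 x 10 drawing run with one pair
// of clock reads, so timer and tracing overhead stays out of the measurement.
// draw_line stands in for the MoveToEx()/LineTo() pair from the original code.
long long time_whole_run(const std::function<void(int)>& draw_line)
{
    using clock = std::chrono::steady_clock;

    const clock::time_point start = clock::now();   // one read before the loops
    for (int x = 0; x < 800; ++x)
        for (int rep = 0; rep < 10; ++rep)
            draw_line(x);
    const clock::time_point end = clock::now();     // one read after the loops

    // Elapsed time for all 8000 calls, in microseconds.
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}
```

The result would be traced once, after the run, so the cost of the trace call is not part of what is measured.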

--
Tim Robinson (MVP, Windows SDK)
http://mobius.sourceforge.net/
Jul 21 '05 #6
Gustavo L. Fabro wrote:
Greetings!

Getting straight to the point, here are the results
of my experiment. I've included my comments and questions
after them.


What command-line options are you using for the VC++ build? If you're
compiling it as managed code (/clr) I wouldn't be surprised to see a 33%
speed reduction since you'd be transitioning in and out of managed code
several times per iteration of your timing loop.

-cd
Jul 21 '05 #7
Gustavo L. Fabro wrote:

I've run the test several times and the results are always of the
same magnitude. How can that be, if the only thing I'm doing is
MoveToEx() and LineTo() API calls?

It's something simple! I'm not touching the disk, loading large
chunks of memory, using managed extensions (I created a 'pure' Win32
project under VS), or doing anything else that could affect performance.
Only 2 simple API calls.

Is Visual C++ really THAT MUCH slower?


Well, first off, as you state yourself, the portion of the code that
your compilers generated is only a fraction of the full overhead.
The work of the API calls is done by the same (OS) code in both cases,
so the results are not comparing VC with Builder 5.

Having said that, if your original application shows the same behaviour
then it is quite reasonable for you to ask for an explanation!

Try comparing the interval "m_ElapsedTime" with a millisecond or so
(about 3580 ticks in your case). If the APIs take more than that, then you've
suffered a context switch and you should ignore that time interval.
If this is the case, the question ceases to be "why is VC slower"
but becomes "why is VC provoking context switches" and the answer
probably lies in the run-time library rather than the compiler's
code generation.
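
One way to apply that filter is sketched below, under the assumption that the per-line tick counts have been collected into a vector. At 3579545 ticks per second, one millisecond is about 3580 ticks. The function name `mean_without_outliers` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: average the per-line tick counts, skipping any sample
// longer than about one millisecond on the assumption that it was inflated
// by a context switch.
double mean_without_outliers(const std::vector<long long>& samples,
                             long long ticks_per_second)
{
    const long long threshold = ticks_per_second / 1000; // ticks in 1 ms
    long long sum = 0;
    std::size_t kept = 0;

    for (long long ticks : samples) {
        if (ticks > threshold)
            continue;            // likely interrupted; ignore it
        sum += ticks;
        ++kept;
    }
    // Return 0.0 when every sample was discarded or none were given.
    return kept ? static_cast<double>(sum) / kept : 0.0;
}
```

If the filtered averages for the two builds come out much closer than the raw ones, that would point at context switching rather than at the generated code.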

A similar test is to use an array of "m_ElapsedTime[10]" and collect
ten iterations of the inner loop between tracing. Yet another test
might be to insert Sleep(0) at the start of the inner loop. If either
of these affects the results, your problem is context switching.
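
The batching idea can be sketched as a small buffer that holds ten samples and produces one trace line per batch. The class name `SampleBatch` and its members are hypothetical; in the DLL, the string returned by `flush()` would be the one passed to OutputDebugString.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: collect ten elapsed-tick samples and format them as a
// single trace line, so tracing happens once per ten measurements.
class SampleBatch {
public:
    // Store one sample (ignored if the batch is already full);
    // returns true when the batch is full.
    bool add(long long ticks)
    {
        if (count_ < kSize)
            samples_[count_++] = ticks;
        return count_ == kSize;
    }

    // Format the buffered samples as one line and reset the batch.
    std::string flush()
    {
        std::string line;
        char buf[32];
        for (int i = 0; i < count_; ++i) {
            std::snprintf(buf, sizeof buf, "%lld ", samples_[i]);
            line += buf;
        }
        line += '\n';
        count_ = 0;
        return line;
    }

private:
    static constexpr int kSize = 10;
    long long samples_[kSize] = {};
    int count_ = 0;
};
```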

Another variation is to use the RDTSC instruction...

__declspec(naked) __int64 Rdtsc()
{
    __asm rdtsc;
    __asm ret;
}

This is a higher resolution timer with much lower calling overheads.

Oh, and lastly, %d isn't the correct format for an __int64 variable.
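
With %d, sprintf expects an int-sized argument, so passing a 64-bit value gives undefined behaviour and typically a wrong number. The Microsoft runtime accepts %I64d for __int64. A portable alternative, assuming a C++11 compiler, is the PRId64 macro from <cinttypes>, sketched below; `format_ticks` is a hypothetical name.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: format a 64-bit tick count with a matching specifier.
// PRId64 expands to the right length modifier for std::int64_t.
std::string format_ticks(std::int64_t ticks)
{
    char buf[32];
    std::snprintf(buf, sizeof buf, "%" PRId64 "\n", ticks);
    return buf;
}
```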
Jul 21 '05 #8
Greetings!

Thanks everybody for the comments. I've run the
tests again, and indeed it was my mistake.

As Tim suggested, with the profiling code outside the
outer loop (eliminating a great deal of overhead) and with
the call made from a better place (I was using menus, but
XP's menu "fading effect" was interfering with the timing),
the results I got matched what I expected in the first place:

Visual Studio:
269996

Borland:
270206

I can now go through the code and try to find what is really
affecting the speed (I had stopped when I saw this).

To answer Carl: I wasn't compiling as managed code. I will
do so later on in my quest to see what is happening in our program.

As for Ken's reply, I appreciate the tips for reducing the
interference of context switches in the measurements.
I will use them next time I find myself in a similar situation!

Fabro
Jul 21 '05 #9
Just out of curiosity: how come you are not using hardware to render your
lines (e.g. DirectX)? If performance is an issue, using DirectX to draw lines
would give you a seemingly infinite boost in performance compared to
rendering your lines in software (even anti-aliased lines).

Just curious.

cheers,
Luis Miguel Huapaya
Jul 21 '05 #10
