
Performance: VC++ 33% slower than Builder 5 on LineTo() API call??

Greetings!

Getting straight to the point, here are the results
of my experiment. I've included my comments and questions
after them.

The timing:
(The total time is the sum of each line's drawing time.
Time is measured in ticks of the QueryPerformanceCounter() API.
The counter frequency reported by QueryPerformanceFrequency() on my
machine is 3579545 ticks per second.)
------------------------------------------
Visual Studio .NET 2003
Total time: 717230
Average: 89.8165625

Borland Builder 5:
Total Time: 482151
Average: 61.0975
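
To put these tick counts into time units, a minimal sketch of the conversion follows. The helper name ticksToMicroseconds is hypothetical and not part of the posted DLL; it only assumes that the frequency is expressed in ticks per second, as QueryPerformanceFrequency() reports it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the original post): converts a
// QueryPerformanceCounter-style tick count to microseconds.
// `frequency` is the counter frequency in ticks per second.
inline double ticksToMicroseconds(std::int64_t ticks, std::int64_t frequency)
{
    assert(frequency > 0);
    return static_cast<double>(ticks) * 1e6 / static_cast<double>(frequency);
}
```

With the frequency quoted above (3579545), the Visual Studio total of 717230 ticks works out to roughly 0.2 seconds, and the 89.8-tick average to roughly 25 microseconds per line.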

The code (for the DLL):
------------------------------------------
DrawDll.h
#ifdef DRAWDLL_EXPORTS
#define DRAWDLL_API __declspec(dllexport)
#else
#define DRAWDLL_API __declspec(dllimport)
#endif

class DRAWDLL_API CDrawDll {
public:
    CDrawDll(void);
    void MyMethod(HWND handle);
};

DrawDll.cpp
#include "stdafx.h"
#include "DrawDll.h"
#include <stdio.h>

BOOL APIENTRY DllMain(HANDLE hModule,
                      DWORD ul_reason_for_call,
                      LPVOID lpReserved)
{
    return TRUE;
}

void CDrawDll::MyMethod(HWND handle)
{
    HDC hDC = ::GetDC(handle);

    LARGE_INTEGER m_StartCounter; // start time
    LARGE_INTEGER m_EndCounter;   // finish time
    __int64 m_ElapsedTime;
    char buff2[255];

    // For 800 different positions
    for (int x = 0; x < 800; x++)
    {
        // 10 times on each position
        for (int rep = 0; rep < 10; rep++)
        {
            QueryPerformanceCounter(&m_StartCounter);

            ::MoveToEx(hDC, x, 0, NULL);
            ::LineTo(hDC, 50 + x, 50);

            QueryPerformanceCounter(&m_EndCounter);

            // get finishing time and calc elapsed time (ticks)
            m_ElapsedTime = (m_EndCounter.QuadPart -
                             m_StartCounter.QuadPart);
            sprintf(buff2, "%d\n", m_ElapsedTime);
            OutputDebugString(buff2);
        }
    }

    ReleaseDC(handle, hDC);
}

CDrawDll::CDrawDll()
{
    return;
}

The explanation
---------------------------------------------------------

While porting a big project to Visual Studio, I started
facing some performance problems. Things were much slower in the VS-compiled
executables. I went to study what exactly was happening and reached some
startling (from my point of view) conclusions.

I made a DLL and compiled it with Builder and Visual C++ .NET, with all
optimizations enabled for both compilers. The DLL has a class with only
one function, which takes a window handle, gets its DC and draws 8,000 lines on it.

I made 2 executables that run the function from the DLL (compiled with
both compilers too).

The results were astonishing, for me, and I'd like an explanation for
what is happening.

I've run the test several times and the results are always of the
same magnitude. How can that be, if the only thing I'm doing is calling the
MoveToEx() and LineTo() APIs?

It's something simple! I'm not touching the disk, loading large
chunks of memory, using managed extensions (I created a 'pure' Win32
project under VS), or doing anything else that could affect performance.
Only 2 simple API calls.

Is Visual C++ really THAT MUCH slower?

I have the complete code and compiled executables here and will be glad
to send them to anyone who wants to replicate the test. As far as this
posting is concerned:

- Are VS-compiled DLLs and/or executables inherently slower than, for
instance, those built with Builder 5?
- Why does a simple API call take that much longer? Isn't it the same API
call? Shouldn't the call itself be fast, with the API function taking
most of the time?
- Is there anything I can do/try to make the code run faster?

We would like to migrate other big projects to Visual C++, but now we're
having second thoughts!

Waiting for a light,

Gustavo L. Fabro
Jul 21 '05 #1
12 Replies


Could you give me an example of when I would want to call that function 8000
times in a tight loop?

Jonathan

Jul 21 '05 #2

Gustavo L. Fabro wrote:
> Greetings!
>
> Getting straight to the point, here are the results
> of my experiment. I've included my comments and questions
> after them.
>
> The timing:
> (The total time means the sum of each line's drawing time.
> Time is measured in clock ticks (from QueryPerformanceCounter() API).
> The processor resolution (QueryPerformanceFrequency()) for my
> machine is 3579545).
> ------------------------------------------
> Visual Studio .NET 2003
> Total time: 717230
> Average: 89.8165625
>
> Borland Builder 5:
> Total Time: 482151
> Average: 61.0975


Did you look at the assembly produced by both compilers?

But artificial tests like this rarely mean anything in real applications...

--
Phil Frisbie, Jr.
Hawk Software
http://www.hawksoft.com

Jul 21 '05 #3

> Could you give me an example of when I would want to call that function
> 8000 times in a tight loop?

In a CAD application, for instance. The function is only called once;
what it does is draw 8,000 lines.

A regular CAD drawing needs many more than 8,000 lines to be drawn
completely.

Jul 21 '05 #4

> Did you look at the assembly produced by both compilers?

At the moment I unfortunately don't know enough assembly language to draw
anything concrete from the two listings. If it helps, I can disassemble
both DLLs and post the code here!

> But artificial tests like this rarely mean anything in real
> applications...

I'm afraid this is not the case here. This test just replicates something
I have seen in practice. Our CAD application took 5 times longer to draw the
same file on the screen with the VS-compiled version than with our
Builder-compiled one.

As the application itself has lots of classes and DLLs, and we used managed
and unmanaged C++ in the middle, I first tried to check whether the API calls
themselves, after all the processing (of elements, point positions, etc.),
were running at the same speed. If that were true, I would then try to
focus on the managed/unmanaged approach, DLL interaction and other factors.

But when I saw that even the API drawing calls themselves were taking
longer, I got intrigued... and decided to do this test! Hence the results
shown here and the question: Is it *really* like this?

Fabro
Jul 21 '05 #5

Gustavo L. Fabro wrote:
>> Could you give me an example of when I would want to call that function
>> 8000 times in a tight loop?
>
> In a CAD application, for instance. The function is only called once;
> what it does is draw 8,000 lines.
>
> A regular CAD drawing needs many more than 8,000 lines to be drawn
> completely.


In a CAD program you wouldn't be calling QueryPerformanceCounter or
OutputDebugString for each line:

    for (int rep = 0; rep < 10; rep++)
    {
        QueryPerformanceCounter(&m_StartCounter);

        ::MoveToEx(hDC, x, 0, NULL);
        ::LineTo(hDC, 50 + x, 50);

        QueryPerformanceCounter(&m_EndCounter);

        // get finishing time and calc elapsed time (ticks)
        m_ElapsedTime = (m_EndCounter.QuadPart -
                         m_StartCounter.QuadPart);
        sprintf(buff2, "%d\n", m_ElapsedTime);
        OutputDebugString(buff2);
    }

QPC and ODS both have high overhead: each involves a transition to kernel
mode and back; QPC samples the hardware timer; and when a debugger is
attached, ODS effectively triggers an exception, which causes a full
context switch to the debugger and back.

Move the benchmarking code to the outside of the outer loop -- time the
whole operation -- and then compare results.
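
A minimal sketch of that restructuring, written portably: std::chrono::steady_clock stands in for QueryPerformanceCounter so the example also compiles outside Windows, and the helper name timeBatchNanoseconds and its drawOne callback are hypothetical. The point is only that the clock is read twice for the whole batch rather than twice per line, and nothing is traced inside the loop.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: runs `drawOne(i)` for i in [0, count) and times the
// whole batch with a single pair of clock reads, so the timer overhead is
// paid once instead of once per drawn line.
template <typename DrawFn>
std::int64_t timeBatchNanoseconds(DrawFn drawOne, int count)
{
    assert(count >= 0);
    using Clock = std::chrono::steady_clock;

    const Clock::time_point start = Clock::now();
    for (int i = 0; i < count; ++i) {
        drawOne(i);   // e.g. one MoveToEx/LineTo pair on Windows
    }
    const Clock::time_point end = Clock::now();

    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}
```

On Windows the callback would wrap the ::MoveToEx/::LineTo pair from the posted code, and any debug output would be written once, after the batch has been timed.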

--
Tim Robinson (MVP, Windows SDK)
http://mobius.sourceforge.net/
Jul 21 '05 #6

Gustavo L. Fabro wrote:
> Greetings!
>
> Getting straight to the point, here are the results
> of my experiment. I've included my comments and questions
> after them.


What command-line options are you using for the VC++ build? If you're
compiling it as managed code (/clr) I wouldn't be surprised to see a 33%
speed reduction since you'd be transitioning in and out of managed code
several times per iteration of your timing loop.

-cd
Jul 21 '05 #7

Gustavo L. Fabro wrote:
>
> I've run the test several times and the results are always of the
> same magnitude. How can that be, if the only thing I'm doing is
> MoveTo() and LineTo() API calls?
>
> It's something simple! I'm not playing with the disk, loading large
> chunks of memory, using managed extensions (I created a 'pure' Win32
> project under VS), anything that could relate to performance.
> Only 2 simple API calls.
>
> Is Visual C++ really THAT MUCH slower?


Well, first off, as you state yourself, the portion of the code that
your compilers generated is only a fraction of the full overhead.
The work of the API calls is done by the same (OS) code in both cases,
so the results are not comparing VC with Builder 5.

Having said that, if your original application shows the same behaviour
then it is quite reasonable for you to ask for an explanation!

Try comparing the interval "m_ElapsedTime" with a millisecond or so
(about 3,580 ticks in your case, given a counter frequency of 3579545
ticks per second). If the APIs take more than that, then you've
suffered a context switch and you should ignore that time interval.
If this is the case, the question ceases to be "why is VC slower"
but becomes "why is VC provoking context switches" and the answer
probably lies in the run-time library rather than the compiler's
code generation.

A similar test is to use an array of "m_ElapsedTime[10]" and collect
ten iterations of the inner loop between tracing. Yet another test
might be to insert Sleep(0) at the start of the inner loop. If either
of these affects the results, your problem is context switching.
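
As a rough illustration of the filtering idea, the sketch below sums a set of per-call tick samples while discarding any sample longer than a chosen threshold, on the assumption that such a sample includes a context switch. The names FilteredTotal and sumWithoutOutliers are hypothetical, and the threshold is a parameter rather than a recommended value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: what was kept and what was thrown away.
struct FilteredTotal {
    std::int64_t totalTicks;  // sum of the samples that were kept
    std::size_t  kept;        // number of samples at or under the threshold
    std::size_t  discarded;   // number of samples over the threshold
};

// Sums tick samples, ignoring any sample that lasted longer than
// `thresholdSeconds` (assumed here to indicate a context switch).
// `frequency` is the counter frequency in ticks per second.
FilteredTotal sumWithoutOutliers(const std::vector<std::int64_t>& samples,
                                 std::int64_t frequency,
                                 double thresholdSeconds)
{
    assert(frequency > 0);
    const double thresholdTicks = static_cast<double>(frequency) * thresholdSeconds;

    FilteredTotal result = {0, 0, 0};
    for (std::int64_t sample : samples) {
        if (static_cast<double>(sample) > thresholdTicks) {
            ++result.discarded;
        } else {
            result.totalTicks += sample;
            ++result.kept;
        }
    }
    return result;
}
```

With a frequency of 3579545 ticks per second, a threshold of 0.001 seconds corresponds to about 3,580 ticks. Collecting the samples in a container first and tracing them afterwards also keeps OutputDebugString out of the timed region.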

Another variation is to use the RDTSC instruction...

__declspec(naked) __int64 Rdtsc()
{
    __asm rdtsc;
    __asm ret;
}

This is a higher resolution timer with much lower calling overheads.

Oh, and lastly, %d isn't the correct format for an __int64 variable.
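
A minimal sketch of one portable way to format a 64-bit count: it uses std::int64_t with the PRId64 macro from <cinttypes> instead of __int64 with %d. The helper name formatTicks is hypothetical. Older Microsoft run-time libraries spell the equivalent size prefix as %I64d.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a 64-bit tick count followed by a newline,
// using a format specifier that matches the argument's width.
std::string formatTicks(std::int64_t ticks)
{
    char buffer[32];  // enough for any 64-bit decimal value plus "\n"
    const int written = std::snprintf(buffer, sizeof buffer, "%" PRId64 "\n", ticks);
    assert(written > 0 && written < static_cast<int>(sizeof buffer));
    (void)written;    // silences the unused-variable warning when NDEBUG is set
    return std::string(buffer);
}
```

Passing a 64-bit value where %d expects an int is undefined behaviour in printf-style functions, so the printed numbers cannot be trusted even when they happen to look plausible.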
Jul 21 '05 #8

Greetings!

Thanks everybody for the comments. I've run the
tests again, and indeed it was my mistake.

As Tim suggested, with the profiling code moved outside the
outer loop (eliminating a great deal of overhead) and the call
triggered from a better place (I was using menus, and XP's menu
"fading effect" was interfering with the timing),
the results I got matched what I expected in the first place:

Visual Studio:
269996

Borland:
270206
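
For comparison with the original figures, a small sketch of the relative-difference arithmetic follows. The helper name percentSlower is hypothetical; it expresses how much longer the slower run took relative to the faster one.

```cpp
#include <cassert>

// Hypothetical helper: how much longer `slower` took than `faster`,
// as a percentage of `faster`.
inline double percentSlower(double slower, double faster)
{
    assert(faster > 0.0);
    return (slower - faster) / faster * 100.0;
}
```

Applied to the two totals above (270206 and 269996 ticks), the gap is under 0.1%. Applied to the first set of totals in this thread (717230 and 482151 ticks), the same formula gives about 49%; the 33% in the thread title matches the difference taken relative to the larger total instead.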

I can now go through the code and try to find what is really
affecting the speed (I had stopped when I saw this).

To answer Carl: I wasn't compiling as managed code. I will
do so later on in my quest to see what is happening in our program.

And regarding Ken's reply, I appreciate the tips for reducing the
interference of context switches in the profiles for better timing.
I will use them next time I find myself in a similar situation!

Fabro
Jul 21 '05 #9

Just out of curiosity: how come you are not using hardware to render your
lines (e.g. DirectX)? If performance is an issue, using DirectX to draw lines
would give you a seemingly infinite boost in performance compared to
rendering your lines in software (even anti-aliased lines).

Just curious.

cheers,
Luis Miguel Huapaya

Jul 21 '05 #10

> Just out of curiosity: how come you are not using hardware to render your
> lines (e.g. DirectX)? If performance is an issue, using DirectX to draw
> lines would give you a seemingly infinite boost in performance compared to
> rendering your lines in software (even anti-aliased lines).

Hmmm... As far as I know (or knew), GDI calls are accelerated by hardware
when available (and when the "Hardware Acceleration" slider in Control Panel,
Video, Configuration, Advanced, Problem Solving is not all the way to the left).

The profiling for the problem in this post, for instance, was done on
a computer with the "Hardware Acceleration" slider a couple of notches
to the left. It took an average of 270206 ticks to draw 8,000 lines. With
hardware acceleration fully enabled, the time dropped to 62123.

Am I wrong? If DirectX could give an infinite boost in performance
I would definitely be interested!

Fabro
Jul 21 '05 #11

Gustavo L. Fabro wrote:
> for(int rep=0;rep<10;rep++)
> {
>     QueryPerformanceCounter (&m_StartCounter);
>
>     ::MoveToEx(hDC, x,0, NULL);
>     ::LineTo(hDC, 50+x,50);
>
>     QueryPerformanceCounter (&m_EndCounter);
>
>     /* snip */
> }
> ------------------------------------------
> Visual Studio .NET 2003
> Total time: 717230
> Average: 89.8165625
>
> Borland Builder 5:
> Total Time: 482151
> Average: 61.0975


I can't explain the speed difference in your experiment, but I can say
that if you are writing or porting an application for which drawing
primitives are a critical bottleneck, such as the CAD applications you
cite in a later post, you should seriously consider using a
performance-oriented graphics library such as DirectX or OpenGL, which
takes advantage of modern hardware. The GDI is, quite frankly, rarely up
to the task of serious graphics work, just simple business graphics such
as bar charts and buttons.
--
Derrick Coetzee, Microsoft Speech Server developer
This posting is provided "AS IS" with no warranties, and confers no
rights. Use of included code samples are subject to the terms
specified at http://www.microsoft.com/info/cpyright.htm
Jul 21 '05 #12

Gustavo L. Fabro wrote:
>> Just out of curiosity: how come you are not using hardware to render your
>> lines (e.g. DirectX)? If performance is an issue, using DirectX to draw
>> lines would give you a seemingly infinite boost in performance compared to
>> rendering your lines in software (even anti-aliased lines).
>
> Hmmm... As far as I know (or knew), GDI calls are accelerated by hardware
> when available (and when the "Hardware Acceleration" slider in Control Panel,
> Video, Configuration, Advanced, Problem Solving is not all the way to the left).
>
> The profiling for the problem in this post, for instance, was done on
> a computer with the "Hardware Acceleration" slider a couple of notches
> to the left. It took an average of 270206 ticks to draw 8,000 lines. With
> hardware acceleration fully enabled, the time dropped to 62123.
>
> Am I wrong? If DirectX could give an infinite boost in performance
> I would definitely be interested!


Yes, such 2D calls generally are accelerated by hardware. However, for
the ultimate in speed, you should perhaps render using 3D hardware,
although this requires a lot of extra programming work. This would be
appropriate for a CAD application though, perhaps.

Tom
Jul 21 '05 #13
