Greetings!
Getting straight to the point, here are the results
of my experiment. I've included my comments and questions
after them.
The timing:
(The total time means the sum of each line's drawing time.
Time is measured in clock ticks (from QueryPerformanceCounter() API).
The processor resolution (QueryPerformanceFrequency()) for my
machine is 3579545).
------------------------------------------
Visual Studio .NET 2003
Total time: 717230
Average: 89.8165625
Borland Builder 5:
Total Time: 482151
Average: 61.0975
The code (for the DLL):
------------------------------------------
DrawDll.h

#ifdef DRAWDLL_EXPORTS
#define DRAWDLL_API __declspec(dllexport)
#else
#define DRAWDLL_API __declspec(dllimport)
#endif

class DRAWDLL_API CDrawDll {
public:
    CDrawDll(void);
    void MyMethod(HWND handle);
};

DrawDll.cpp

#include "stdafx.h"
#include "DrawDll.h"
#include <stdio.h>

BOOL APIENTRY DllMain( HANDLE hModule,
                       DWORD ul_reason_for_call,
                       LPVOID lpReserved
                     )
{
    return TRUE;
}
void CDrawDll::MyMethod(HWND handle)
{
    HDC hDC = ::GetDC(handle);
    LARGE_INTEGER m_StartCounter; // start time
    LARGE_INTEGER m_EndCounter;   // finish time
    __int64 m_ElapsedTime;
    char buff2[255];

    //For 800 different positions
    for(int x=0;x<800;x++)
    {
        //10 times on each position
        for(int rep=0;rep<10;rep++)
        {
            QueryPerformanceCounter (&m_StartCounter);
            ::MoveToEx(hDC, x,0, NULL);
            ::LineTo(hDC, 50+x,50);
            QueryPerformanceCounter (&m_EndCounter);
            //get and store finishing time and calc elapsed time(ticks)
            m_ElapsedTime = (m_EndCounter.QuadPart -
                             m_StartCounter.QuadPart );
            sprintf(buff2, "%d\n", m_ElapsedTime);
            OutputDebugString(buff2);
        }
    }
    ReleaseDC(handle, hDC);
}

CDrawDll::CDrawDll()
{
    return;
}
The explanation
---------------------------------------------------------
While porting a big project to Visual Studio, I started
facing some performance problems. Things were much slower in the VS-compiled
executables. I went to study what exactly was happening and came to some
startling (from my point of view) conclusions.
I made a DLL and compiled it with Builder and Visual C++ .NET, with all
optimizations enabled for both compilers. The DLL has a class with only
one function, which takes a window handle, gets its DC and draws 8,000
lines on it.
I made 2 executables that run the function from the DLL (compiled with
both compilers too).
The results were astonishing to me, and I'd like an explanation for
what is happening.
I've run the test several times and the results are always of the
same magnitude. How can that be, if the only thing I'm doing is MoveToEx() and
LineTo() API calls?
It's something simple! I'm not touching the disk, loading large
chunks of memory, using managed extensions (I created a 'pure' Win32
project under VS), or doing anything else that could affect performance.
Only 2 simple API calls.
Is Visual C++ really THAT MUCH slower?
I have the complete code and compiled executables here and will be glad
to send them to anyone who wants to replicate the test. As far as this
posting is concerned:
- Are VS-compiled DLLs and/or executables inherently slower than, for
instance, those built with Builder 5?
- Why does a simple API call take that much longer? Isn't it the same API call?
Shouldn't the call itself be cheap, with most of the time spent inside the
API function?
- Is there anything I can do/try to make the code run faster?
We would like to migrate other big projects to Visual C++, but now we're
having second thoughts!
Waiting for a light,
Gustavo L. Fabro
Could you give me an example of when I would want to call that function 8000
times in a tight loop?
Jonathan
Gustavo L. Fabro wrote:
> Visual Studio .NET 2003
> Total time: 717230
> Average: 89.8165625
> Borland Builder 5:
> Total Time: 482151
> Average: 61.0975
Did you look at the assembly produced by both compilers?
But artificial tests like this rarely mean anything in real applications...
--
Phil Frisbie, Jr.
Hawk Software http://www.hawksoft.com

> Could you give me an example of when I would want to call that function 8000
> times in a tight loop?
In a CAD application, for instance. The function is called only once; what
it does is draw 8,000 lines.
A regular CAD drawing needs far more than 8,000 lines to be drawn
completely.
> Did you look at the assembly produced by both compilers?
At the moment I unfortunately don't know enough assembly language to
draw anything concrete from the two listings. If it helps, I can
disassemble both DLLs and post the code here!
> But artificial tests like this rarely mean anything in real applications...
I'm afraid this is not the case here. This test is just a replication of
something I have seen in practice. Our CAD application took 5 times longer
to draw the same file on the screen with the VS-compiled version than with
our Builder-compiled one.
As the application itself has lots of classes and DLLs, and we mix managed
and unmanaged C++, I first tried to check whether the API calls
themselves, after all the processing (of elements, point positions, etc.),
were running at the same speed. If that were true, I would then
focus on the managed/unmanaged approach, DLL interaction and other factors.
But when I saw that even the API drawing calls themselves were taking
longer, I got intrigued... and decided to do this test! Hence the results
shown here and the question: Is it *really* like this?
Fabro
Gustavo L. Fabro wrote:
>> Could you give me an example of when I would want to call that function
>> 8000 times in a tight loop?
> In a CAD application, for instance. The function is only called one time,
> what it does is draw 8000 lines.
> On a regular CAD drawing much more than 8,000 lines are needed for the
> complete drawing to take place.
In a CAD program you wouldn't be calling QueryPerformanceCounter or
OutputDebugString for each line:
    for(int rep=0;rep<10;rep++)
    {
        QueryPerformanceCounter (&m_StartCounter);
        ::MoveToEx(hDC, x,0, NULL);
        ::LineTo(hDC, 50+x,50);
        QueryPerformanceCounter (&m_EndCounter);
        //get and store finishing time and calc elapsed time(ticks)
        m_ElapsedTime = (m_EndCounter.QuadPart -
                         m_StartCounter.QuadPart );
        sprintf(buff2, "%d\n", m_ElapsedTime);
        OutputDebugString(buff2);
    }
QPC and ODS both have high overhead: each involves a transition to kernel
mode and back; QPC samples the hardware timer; and when a debugger is
attached, ODS effectively triggers an exception, which causes a full
context switch to the debugger and back.
Move the benchmarking code to the outside of the outer loop -- time the
whole operation -- and then compare results.
--
Tim Robinson (MVP, Windows SDK) http://mobius.sourceforge.net/
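A minimal sketch of what Tim describes, for anyone following along: one pair of clock reads around the whole nested loop, and no logging inside it. To keep the sketch portable, it assumes std::chrono::steady_clock in place of QueryPerformanceCounter and takes the drawing step as a callable; the names RunTiming and time_whole_run are hypothetical and not from the original DLL.

```cpp
#include <chrono>
#include <cstdint>

// Result of one benchmark run: how many draw calls were made and the total
// wall-clock time for all of them, in nanoseconds.
struct RunTiming {
    int calls;
    std::int64_t total_ns;
};

// Runs draw(x) positions * repeats times and reads the clock only twice,
// once before and once after the whole nested loop, so the timer overhead
// is paid once instead of once per line. Nothing is logged inside the loop.
template <typename DrawFn>
RunTiming time_whole_run(int positions, int repeats, DrawFn draw) {
    using clock = std::chrono::steady_clock;
    int calls = 0;
    const auto start = clock::now();
    for (int x = 0; x < positions; ++x) {
        for (int rep = 0; rep < repeats; ++rep) {
            // In the original test this would be:
            //   ::MoveToEx(hDC, x, 0, NULL); ::LineTo(hDC, 50 + x, 50);
            draw(x);
            ++calls;
        }
    }
    const auto stop = clock::now();
    const auto ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
    return RunTiming{calls, static_cast<std::int64_t>(ns)};
}
```

Calling time_whole_run(800, 10, ...) with a callable that wraps the two GDI calls reproduces the 8,000-line test; the total is reported once, after the loop has finished.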
Gustavo L. Fabro wrote:
> Greetings!
> Getting straight to the point, here are the results of my experiment.
> I've included my comments and questions after them.
What command-line options are you using for the VC++ build? If you're
compiling it as managed code (/clr) I wouldn't be surprised to see a 33%
speed reduction since you'd be transitioning in and out of managed code
several times per iteration of your timing loop.
-cd
Gustavo L. Fabro wrote:
> I've run the test several times and the results are always of the same
> magnitude. How can that be, if the only thing I'm doing is MoveTo() and
> LineTo() API calls?
> It's something simple! I'm not playing with the disk, loading large chunks
> of memory, using managed extensions (I created a 'pure' Win32 project
> under VS), anything that could relate with performance. Only 2 simple API
> calls.
> Is Visual C++ really THAT MUCH slower?
Well, first off, as you state yourself, the portion of the code that
your compilers generated is only a fraction of the full overhead.
The work of the API calls is done by the same (OS) code in both cases,
so the results are not comparing VC with Builder 5.
Having said that, if your original application shows the same behaviour
then it is quite reasonable for you to ask for an explanation!
Try comparing the interval "m_ElapsedTime" with a millisecond or so
(about 3580 ticks in your case, given a counter frequency of 3579545
ticks per second). If the APIs take more than that, then you've
suffered a context switch and you should ignore that time interval.
If this is the case, the question ceases to be "why is VC slower"
but becomes "why is VC provoking context switches" and the answer
probably lies in the run-time library rather than the compiler's
code generation.
A similar test is to use an array of "m_ElapsedTime[10]" and collect
ten iterations of the inner loop between tracing. Yet another test
might be to insert Sleep(0) at the start of the inner loop. If either
of these affects the results, your problem is context switching.
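The filtering idea above can be sketched as follows: convert the counter frequency into a one-millisecond threshold, then average only the samples at or below it. This is an illustration under stated assumptions, not code from the thread; ticks_per_millisecond, FilteredStats and average_without_outliers are hypothetical names, and the samples are assumed to have been collected into a vector before any tracing.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of counter ticks in one millisecond, given the value reported by
// QueryPerformanceFrequency() (ticks per second). Integer division drops
// the fractional tick.
std::int64_t ticks_per_millisecond(std::int64_t ticks_per_second) {
    return ticks_per_second / 1000;
}

// Summary of a filtered set of per-call timings.
struct FilteredStats {
    std::size_t kept;       // samples at or below the threshold
    std::size_t discarded;  // samples above it (suspected context switches)
    double average_ticks;   // mean of the kept samples, 0.0 if none were kept
};

// Averages the samples while ignoring any interval longer than
// threshold_ticks, on the assumption that such an interval includes a
// context switch rather than just the measured API calls.
FilteredStats average_without_outliers(const std::vector<std::int64_t>& samples,
                                       std::int64_t threshold_ticks) {
    FilteredStats stats{0, 0, 0.0};
    std::int64_t sum = 0;
    for (std::int64_t ticks : samples) {
        if (ticks > threshold_ticks) {
            ++stats.discarded;
        } else {
            sum += ticks;
            ++stats.kept;
        }
    }
    if (stats.kept > 0) {
        stats.average_ticks =
            static_cast<double>(sum) / static_cast<double>(stats.kept);
    }
    return stats;
}
```

If the discarded count differs noticeably between the two builds, that points at context switching rather than at the compiler's code generation.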
Another variation is to use the RDTSC instruction...
__declspec(naked) __int64 Rdtsc()
{
__asm rdtsc;
__asm ret;
}
This is a higher resolution timer with much lower calling overheads.
Oh, and lastly, %d isn't the correct format for an __int64 variable.
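To illustrate the format-specifier point: with "%d", printf-style functions read an int-sized argument, so passing a 64-bit value is a mismatch. A portable sketch, assuming a compiler that provides <cinttypes>, uses std::int64_t with the PRId64 macro; Microsoft's runtime also documents an I64 size prefix ("%I64d") for __int64. The helper name format_ticks is hypothetical.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 64-bit tick count followed by a newline, using a conversion
// specifier that matches the argument's width. PRId64 expands to the
// correct length modifier for std::int64_t on the current platform.
std::string format_ticks(std::int64_t ticks) {
    char buf[32];  // enough for a sign, 19 digits, newline and terminator
    std::snprintf(buf, sizeof buf, "%" PRId64 "\n", ticks);
    return std::string(buf);
}
```

The resulting string can be passed to OutputDebugString via c_str() in place of the sprintf call in the original loop.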
Greetings!
Thanks, everybody, for the comments. I've run the
tests again, and indeed it was my mistake.
As Tim suggested, with the profiling code moved outside the
outer loop (eliminating a great deal of overhead) and the call
triggered from a better place (I was using menus, but XP's
"fading effect" was interfering with the timing),
the results I got matched what I expected in the first place:
Visual Studio:
269996
Borland:
270206
I can now go through the code and try to find what is really
affecting the speed (I had stopped when I saw this).
In answer to Carl: I wasn't compiling as managed code. I will
do so later on in my quest to see what is happening in our program.
And regarding Ken's reply: I appreciate the tips for reducing
the interference of context switches in the profile, for better timing.
I will use them next time I find myself in a similar situation!
Fabro
Just out of curiosity: how come you are not using hardware to render your
lines (e.g. DirectX)? If performance is an issue, using DirectX to draw lines
would give you a seemingly infinite boost in performance compared to
rendering your lines in software (even anti-aliased lines).
Just curious.
cheers,
Luis Miguel Huapaya
> Just out of curiosity. How come you are not using hardware to render your lines (i.e. DirectX). If performance is an issue, using DirectX to draw lines would give you a seemingly infinite boost in performance compared to rendering your lines in software (even anti-aliased lines).
Hmmm... As far as I know (or knew), GDI calls are accelerated by hardware
when available (and when the "Hardware Acceleration" slider in Control
Panel, Video, Configuration, Advanced, Problem Solving is not all the way
to the left).
The profiling for the problem in this post, for instance, was done on
a computer with the "Hardware Acceleration" slider a couple of notches
to the left. It took an average of 270206 ticks to draw 8,000 lines. With
hardware acceleration fully enabled, the time dropped to 62123.
Am I wrong? If DirectX could give an infinite boost in performance
I would definitely be interested!
Fabro
Gustavo L. Fabro wrote:
> for(int rep=0;rep<10;rep++)
> {
>     QueryPerformanceCounter (&m_StartCounter);
>     ::MoveToEx(hDC, x,0, NULL);
>     ::LineTo(hDC, 50+x,50);
>     QueryPerformanceCounter (&m_EndCounter);
>     /* snip */
> }
> ------------------------------------------
> Visual Studio .NET 2003
> Total time: 717230
> Average: 89.8165625
> Borland Builder 5:
> Total Time: 482151
> Average: 61.0975
I can't explain the speed difference in your experiment, but I can say
that if you are writing or porting an application for which drawing
primitives are a critical bottleneck, such as the CAD applications you
cite in a later post, you should seriously consider using a
performance-oriented graphics library such as DirectX or OpenGL, which
takes advantage of modern hardware. The GDI is, quite frankly, rarely up
to the task of serious graphics work, just simple business graphics such
as bar charts and buttons.
--
Derrick Coetzee, Microsoft Speech Server developer
This posting is provided "AS IS" with no warranties, and confers no
rights. Use of included code samples are subject to the terms
specified at http://www.microsoft.com/info/cpyright.htm
Gustavo L. Fabro wrote:
>> Just out of curiosity. How come you are not using hardware to render your
>> lines (i.e. DirectX). If performance is an issue, using DirectX to draw
>> lines would give you a seemingly infinite boost in performance compared to
>> rendering your lines in software (even anti-aliased lines).
> Hmmm... As far as I know (or knew), GDI calls are accelerated by hardware
> when available (and when the "Hardware Acceleration" slider in Control
> Panel, Video, Configuration, Advanced, Problem Solving is not all to the
> left).
> The profiling for the problem of this post, for instance, was made using a
> computer with the "Hardware Acceleration" slider a couple of notches to
> the left. It took an average of 270206 ticks to draw 8,000 lines. With
> hardware acceleration fully enabled, the time dropped to 62123.
> Am I wrong? If DirectX could give an infinite boost in performance I would
> definitely be interested!
Yes, such 2D calls generally are accelerated by hardware. However, for
the ultimate in speed, you should perhaps render using 3D hardware,
although this requires a lot of extra programming work. This would be
appropriate for a CAD application though, perhaps.
Tom