Hi,
We've found some pretty serious performance hits that we didn't expect in a
mixed mode C++ application. The number crunching bits of our algorithms are
compiled with #pragma unmanaged. They call a number of inline functions
elsewhere, and from the documentation my understanding was that inlined
function calls are supposed to be effectively replaced by their
implementation at compile time. This would mean that inline functions used
in a #pragma unmanaged function would also be compiled unmanaged.
However, having written a short test, this appears not to be the case -
putting #pragmas in the header affects how the inlined functions are
compiled (and affects running time enormously).
So given the code at the bottom of this post, my understanding would be the
following:
- Calling from a managed function, the first function should be as quick as
NumberCrunching_InlineFunction_InNoGCClass_Managed, which is compiled as
managed.
- Calling from an unmanaged function, the first should be as fast as
NumberCrunching_InlineFunction_InNoGCClass_Unmanaged, which is specified to
be unmanaged.
Given that it is apparently possible to influence how inlined functions are
compiled in the header, and that our inlined functions may be called from
both managed and unmanaged functions, what are we supposed to do? And
equally, how does all this apply to templated functions (which should be
compiled as they're used)? We're pretty sure that instead of the inlining
giving us good performance, it's causing masses of transitions between
managed and unmanaged code. Any help would really be appreciated, as
would any links to info.
Thanks!
Steve
### CODE ###
#include <math.h> // sqrt, log, exp

__nogc class UnmanagedClass
{
public:
    // No pragma applies to this function, so how it is compiled is left to the compiler
    __forceinline void NumberCrunching_InlineFunction_InNoGCClass_LeftToItsOwnDevices()
    {
        for ( int i = 0 ; i < 1000000 ; i++ )
        {
            double d = sqrt( 69765.43556 ) * log( 9032425.543535 ) / sqrt( log( exp( 3.65464 ) ) );
        }
    }

#pragma unmanaged
    // Explicitly requested to be compiled as unmanaged (native) code
    __forceinline void NumberCrunching_InlineFunction_InNoGCClass_Unmanaged()
    {
        for ( int i = 0 ; i < 1000000 ; i++ )
        {
            double d = sqrt( 69765.43556 ) * log( 9032425.543535 ) / sqrt( log( exp( 3.65464 ) ) );
        }
    }

#pragma managed
    // Explicitly requested to be compiled as managed code
    __forceinline void NumberCrunching_InlineFunction_InNoGCClass_Managed()
    {
        for ( int i = 0 ; i < 1000000 ; i++ )
        {
            double d = sqrt( 69765.43556 ) * log( 9032425.543535 ) / sqrt( log( exp( 3.65464 ) ) );
        }
    }
};
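The post does not show the harness that calls these functions. One way to time such a function is sketched below in standard C++, assuming a C++11 compiler. The names `crunch` and `time_ms` and the `volatile` sink are hypothetical and not part of the original test. The sink is there because the original loops discard `d`, which leaves an optimizer free to remove the very work being measured.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Hypothetical stand-in for one of the inline functions above. It returns
// the accumulated value so the loop cannot be discarded as dead code.
inline double crunch(int iterations)
{
    assert(iterations >= 0);
    double sum = 0.0;
    for (int i = 0; i < iterations; ++i)
    {
        // Same shape as the original expression; i is folded in so that
        // each pass computes something different.
        sum += std::sqrt(69765.43556 + i) * std::log(9032425.543535)
             / std::sqrt(std::log(std::exp(3.65464)));
    }
    return sum;
}

// Times one call to fn(iterations) and returns the elapsed milliseconds.
template <typename Fn>
double time_ms(Fn fn, int iterations, volatile double& sink)
{
    const auto start = std::chrono::steady_clock::now();
    sink = fn(iterations); // storing to a volatile keeps the call alive
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as `time_ms(crunch, 1000000, sink)` would be made once from a managed function and once from an unmanaged one to compare the two paths. Passing the function by pointer may itself prevent inlining, so this is only a starting point.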
Steve McLellan wrote:
> Given that it is apparently possible to influence how inlined functions are compiled in the header, and that our inlined functions may be called from both managed and unmanaged functions, what are we supposed to do? And equally, how does all this apply to templated functions (which should be compiled as they're used)?
Separate your code into pure-native and pure-managed compilation units and
never use #pragma {un}managed. That should result in inlined IL versions of
your functions in the managed module(s) and inlined native versions in the
native modules. Templates should work out the same.
-cd
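A rough sketch of that layout follows, shown as one listing with comments marking where each file would begin. The file names (`crunch.h`, `native_crunch.cpp`, `managed_caller.cpp`), the function names, and the build settings mentioned in the comments are assumptions for illustration and do not come from the thread. Everything here is standard C++, so the same header could be included both from a file built with /clr and from one built without it.

```cpp
#include <cassert>
#include <cmath>

// ---- crunch.h (hypothetical): shared header, no #pragma managed/unmanaged ----
// Each translation unit that includes this compiles its own copy. Following
// the advice above, that should be IL in a file built with /clr and native
// code in a file built without it.
inline double crunch_once(double x)
{
    return std::sqrt(x) * std::log(9032425.543535);
}

// ---- native_crunch.cpp (hypothetical): assumed to be built WITHOUT /clr ----
// The hot loop lives here, so crunch_once is inlined into native code and
// the loop body involves no managed/unmanaged transition.
double crunch_batch(const double* values, int count)
{
    assert(count >= 0);
    double total = 0.0;
    for (int i = 0; i < count; ++i)
        total += crunch_once(values[i]);
    return total;
}

// ---- managed_caller.cpp (hypothetical): assumed to be built WITH /clr ----
// As a separate file this would need a declaration of crunch_batch.
// The boundary is crossed once per batch, not once per iteration.
double run_from_managed_side()
{
    const double inputs[] = { 1.0, 4.0, 9.0 };
    return crunch_batch(inputs, 3);
}
```

The idea is that the managed side only ever calls a coarse-grained entry point such as `crunch_batch`, while fine-grained helpers like `crunch_once` stay inline within a single kind of module.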
Hi Daniel,
Thanks - that's what we figured. Just what you need to realise a few weeks
before the project's meant to be completed :-)
Is this behaviour what you'd expect - i.e. is it what would happen with any
C++ compiler with regard to inlined functions? Or did the compiler team make
the decision to aim for easy portability at the expense of performance in
this case? IJW - but not fast :-)
Presumably it's still possible to link against a static LIB when building a
.NET assembly if I stick the unmanaged stuff in one?
Thanks again,
Steve