Bytes | Software Development & Data Engineering Community
/CLR floating point performance, inter-assembly function call performance

I have run an experiment to try to learn some things about floating point
performance in managed C++. I am using Visual Studio
2003. I was hoping to get a feel for whether or not it would make sense to
punch out from managed code to native code (I was using
IJW) in order to do some amount of floating point work and, if so, what that
certain amount of floating point work was
approximately.

To attempt to do this I made a program that applys a 3x3 matrix to an array
of 3D points (all doubles here folks). The program
contains a function that applies 10 different matrices to the same test data
set of 5,000,000 3D points. It does this by invoking
another workhorse function that does the actual floating point operations.
That function takes an input array of 3D points, an
output array of 3D points, a point count, and the matrix to use. There are
no __gc types in this program. It's just pointers and
structs and native arrays. The outer test function looks like this:

void test_applyMatrixToDPoints(TestData *tdP, int ptsPerMultiply)
{
    int jIterations = tdP->pointCnt / ptsPerMultiply;
    for (int i = 0; i < tdP->matrixCnt; ++i)
    {
        for (int j = 0; j < jIterations; ++j)
        {
            // managed-to-native transitions happen here in V2
            DMatrix3d_multiplyDPoint3dArray(tdP->matrices + i,
                                            &tdP->outPts[j * ptsPerMultiply],
                                            &tdP->inPts[j * ptsPerMultiply],
                                            ptsPerMultiply);
        }
    }
}

The program calls the above routine 8 times and records the time elapsed
during each call. On the first call the above function
calls the workhorse function only once for each of the 10 matrices. In
other words, it applies a matrix to all of the 5,000,000
points in the test data set with a single call to the other workhorse
function. In the next call to the above function it passes
only 50,000 points per call to the other routine, then 5,000, then 500, et
cetera, until we get all the way down to 5, and then finally 1, where there
is a call to DMatrix3d_multiplyDPoint3dArray() for every one of the
5,000,000 3D points in the test data set.

I was hoping someone could help interpret the results. At first I made 3
versions of this program. In all 3 of these versions
the DMatrix3d_multiplyDPoint3dArray function was in a geometry.dll and the
rest of the code was in my test.exe. The 3 versions
were merely different combinations of native versus IL for the two
executables:

test.exe geometry.dll (contains workhorse function)
-------- ----------------
v1) native native
v2) managed native
v3) managed managed

Here are the results. All numbers are elapsed time in seconds for calls to
the outer function described.

Native->Native:
0.953
0.968
0.968
0.953
0.968
0.952
1.093
1.39
Final run is 146% of first run.
Final run is 127% of previous run

Managed->Native:
0.968
0.968
0.968
0.969
0.968
0.968
1.124
1.952
Final run is 202% of first run.
Final run is 174% of previous run

Managed->Managed:
0.984
1.016
0.985
1
1
1.032
1.516
4.469
Final run is 454% of first run.
Final run is 295% of previous run

This surprised me in two ways. First, I thought that for version 2 the
penalty imposed by managed->native transitions would be
worse. It's there, you can see performance drop off more as the call
granularity becomes very fine toward the end, but it isn't
as much as I might have guessed. More surprising was that the
managed->managed version, which didn't have any managed->native transitions
slowing it down at all, dropped off far worse! The early calls to the test
function compare very closely between versions 2 and 3, suggesting that the
raw floating point performance of the managed versus native workhorse
function is quite similar. So this seemed to point the finger at function
call overhead. For some reason function call overhead is just higher for
managed code than for native? On a hunch I decided to make a fourth version
of the program that was also managed->managed but which eliminated the
inter-assembly call. Instead I just linked everything from geometry.dll
right into test.exe. It made a big difference. The results are below. Is
there some
security/stack-walking stuff going on in the inter-DLL
case maybe? Or does it really make sense that managed, inter-assembly calls
are that much slower than the equivalent
intra-assembly call? Explanations welcomed. The inter-assembly version
takes 217% of the time that the intra-assembly version
takes on the final call when the call granularity is fine. That seems
awfully harsh.

Managed->Managed (one big test.exe)
1
0.999
0.984
1.015
0.984
1.015
1.093
2.061
Final run is 206% of first run.
Final run is 189% of previous run.

Even with the improvement yielded by eliminating the inter-assembly calls,
the relative performance between the version that has
to make managed->native transitions and the all managed version is difficult
for me to comprehend. What is it with
managed->managed function call overhead that seems worse even than
managed->native function call overhead?

I tried to make sure that page faults weren't affecting my test runs and the
results I got were very consistent from run to run.

Bern McCarty
Bentley Systems, Inc.

P.S. For the curious, here is what DMatrix3d_multiplyDPoint3dArray looks
like. There are no function calls made and it is all compiled into IL.

void DMatrix3d_multiplyDPoint3dArray
(
    const DMatrix3d *pMatrix,
    DPoint3d        *pResult,
    const DPoint3d  *pPoint,
    int              numPoint
)
{
    int i;
    double x, y, z;
    DPoint3d *pResultPoint;

    for (i = 0, pResultPoint = pResult;
         i < numPoint;
         i++, pResultPoint++)
    {
        x = pPoint[i].x;
        y = pPoint[i].y;
        z = pPoint[i].z;

        pResultPoint->x = pMatrix->column[0].x * x
                        + pMatrix->column[1].x * y
                        + pMatrix->column[2].x * z;

        pResultPoint->y = pMatrix->column[0].y * x
                        + pMatrix->column[1].y * y
                        + pMatrix->column[2].y * z;

        pResultPoint->z = pMatrix->column[0].z * x
                        + pMatrix->column[1].z * y
                        + pMatrix->column[2].z * z;
    }
}

Nov 17 '05 #1
Hello Bern,

Generally speaking, the v1 JIT does not currently perform all the
FP-specific optimizations that the VC++ backend does, making floating point
operations more expensive for now. That may be why managed->managed is more
expensive than managed->unmanaged in your test.

So for areas which make heavy use of floating point arithmetic, please use
profilers to pick the fragments where the overhead is costing you most, and
keep the whole fragment in unmanaged space.

Also, work to minimize the number of transitions you make. If you have some
unmanaged code or an interop call sitting in a loop, make the entire loop
unmanaged. That way you'll only pay the transition cost twice, rather than
for each iteration of the loop.

By looking into the IL code, we can see that when interoperating there are
some extra IL instructions. So minimizing the number of transitions can
save many IL instructions and improve performance.

For some more information, you can refer to this chapter online:
"Chapter 7 - Improving Interop Performance"
http://msdn.microsoft.com/library/en...pt07.asp?frame=true#scalenetchapt07_topic12

Hope that helps.

Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 17 '05 #2
From reading various things I had already recognized the things that you
state as the current conventional wisdom. I went to the trouble to post my
results in the hopes of getting some feedback on why it might be that my
results run very much against that conventional wisdom. Please consider:

1) Floating point performance of managed code. At least in this little
test scenario floating point performance of managed code doesn't seem to be
a problem at all. In the first call out of the 8 in a test run the
DMatrix3d_multiplyDPoint3dArray function is asked to apply the matrix to a
whopping 5,000,000 3D points per call. So it is just sitting there doing
floating point operations in a 5,000,000 iteration loop and there are no
function calls in that loop at all. The managed version took only 3% longer
in that case than the all native version. It seems logical then to rule out
floating point performance as the culprit when things quickly change for the
worse in the later calls where the call granularity to
DMatrix3d_multiplyDPoint3dArray becomes very fine. It makes more sense to
blame the slowdown observed in the fine-grained call cases on function call
overhead, not on floating point performance.

2) The expense of transitions. What am I doing wrong? The version of my
test program that involves a transition in the call from
test_applyMatrixToDPoints->DMatrix3d_multiplyDPoint3dArray is actually
FASTER than the all-managed version (true for both the intra-assembly and
inter-assembly call cases). Furthermore, the more finely-grained the calls
are, the more the managed->native version outperforms the managed->managed
versions. Since we already established that raw floating point performance
of the loop inside of the DMatrix3d_multiplyDPoint3dArray function is
essentially equivalent between the managed and native versions, and the conventional
is to blame for the poor relative performance of the managed->managed
versions? The managed->managed version is flat-out beaten by the version
that does a transition for each and every call. It would seem that there is
some serious penalty associated with making regular managed->managed
function calls - not managed->native calls. What might be responsible for
it and is it something I have any control over?

3) The surprising difference in cost between inter-assembly and
intra-assembly managed->managed calls. Can someone explain this difference
and is there anything that can be done about it besides making my program
one enormous executable?

4) How can I step through JIT compiled code in assembly language in a
debugger for a release executable so that I can see what is going on? I
want the JIT to produce "non debug" x86 instructions and yet I want to step
through them to see what they do. Tips appreciated. Can I do this with the
VS.NET debugger? Windbg? How?

"Yan-Hong Huang[MSFT]" <yh*****@online .microsoft.com> wrote in message
news:kG******** ******@cpmsftng xa10.phx.gbl...
Hello Bern,

Generally speaking, the v1 JIT does not currently perform all the
FP-specific optimizations that the VC++ backend does, making floating point operations more expensive for now. That may be why managed->managed is more expensive than managed->unmanaged in your test.

So for areas which make heavy use of floating point arithmetic, please use
profilers to pick the fragments where the overhead is costing you most, and Keep the whole fragment in unmanaged space.

Also, work to minimize the number of transitions you make. If you have some unmanaged code or an interop call sitting in a loop, make the entire loop
unmanaged. That way you'll only pay the transition cost twice, rather than
for each iteration of the loop.

By looking into ILCode, we can see that when InterOping, there are some
extra IL instructions. So minimizing the number of transitions can save
many IL instructions and improve performance.

For some more information, you can refer to this chapter online:
"Chapter 7 〞 Improving Interop Performance"
http://msdn.microsoft.com/library/en...pt07.asp?frame =true#scalenetc hapt07 _topic12

Hope that helps.

Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! 每 www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 17 '05 #3
Hi Bern,

By using ildasm.exe, you can look into the IL code of the assembly to see
the difference between inter-assembly and intra-assembly managed->managed
calls.

At the same time, I have forwarded your questions to our product team for
their opinions. I will return here as soon as possible.

Thanks.

Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 17 '05 #4
Bern, you're seeing what looks like a manifestation of the "double thunk"
(aka "double p/invoke") problem. The problem is that when your managed
code calls the managed code in the DLL, it first goes through a native stub
(when using the Win32 DLL mechanism), so you ended up transitioning from
managed to native and then back to managed.

Try #using the DLL which you have compiled managed, rather than the
standard Win32 DLL mechanism. This should help. Let us know if that
helps, or if this makes no sense.
Thanks,

Kang Su Gatlin
Visual C++ Program Manager

--------------------
| From: "Bern McCarty" <be**********@b entley.com>
| Subject: Re: /CLR floating point performance, inter-assembly function call performance
| Date: Thu, 6 May 2004 08:59:11 -0400

Nov 17 '05 #5
Hello Bern,

Are you still monitoring this thread? We just held a discussion between
PSS, SDE and PM.

The listed matrix of tested combination is this:

test.exe geometry.dll (contains workhorse function)
-------- ----------------
v1) native native
v2) managed native
v3) managed managed

The key is that we think the third variation is using exported functions
and an import library to call the function in geometry.dll, as is certainly
the case with the first two. If so, it is a mistake to think there are no
transitions in this scenario. In fact, there are twice as many transitions
in variation 3 as in variation 2. The reason is the import libraries.
Import libraries are a native construct. Any time a function call is made
from managed code to a DLL through a stub in the import lib, a
managed-to-native transition must happen. And then, since the actual
implementation of the function in the DLL is managed, there must be another
transition back to managed. This is very costly, as you found out.

The good news is that there is a way around these transitions for the
managed/managed case. Here is a small example:

Code for DLL:

public __value class Utils {        // Must have a public managed type (__value or __gc)
public:
    static int func(int i, int j) { // Must be static unless you don't mind creating instances
        return i + j;
    }
};

Code for EXE:

#using <testdll.dll>                // Pull in the types defined in assembly testdll.dll

int main() {
    return Utils::func(0, 0);       // Call the function
}

This will eliminate all transitions from the call from the exe into the DLL.

I will email our SDE and let him look into this post also. If you have any
more concerns, please feel free to post here. Or you can contact us by
removing "online" from my email address here. Thanks very much.

Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 17 '05 #6
Yes I'm here. Thanks for the answer. That makes a certain amount of sense.
I'll see if I can verify it.

I gather that in Whidbey the performance of my inter-assembly,
managed->managed version would be much better without my changing anything.
Yes?

-Bern

"Yan-Hong Huang[MSFT]" <yh*****@online .microsoft.com> wrote in message
news:JX******** ******@cpmsftng xa10.phx.gbl...
Hello Bern,

Are you still monitoring this thread? We just hold a discusstion between
PSS, SDE and PM.

The listed matrix of tested combination is this:

test.exe geometry.dll (contains workhorse function)
-------- ----------------
v1) native native
v2) managed native
v3) managed managed

The key is that we think that the third variation is using exported
functions and an import library to call the function in geometry.dll, as is certainly the case with the first two. If this is the case, then it is
mistaken that there are no transitions in this scenario. In fact, there are twice as many transitions in variation 3 as in variation 2. The reason for
this is the import libraries. Import libraries are a native construct. Any
time a function call is made from managed code to a DLL through a stub in
the import lib, a managed-native transition must happen. And then, since
the actual implementation of the function in the DLL is managed, there must be another transition back to managed. This is very costly, as you found
out.

The good news is that there is a way around these transitions for the
managed/managed case. Here is a small example:

Code for DLL:
public __value class Utils { // Must have a public
managed type (__value or __gc)
public:
static int func(int i, int j) { // Must be static
unless you don't mind creating instances
return i + j;
}
};

Code for EXE:
#using <testdll.dll> // Pull in the types
defined in assembly testdll.dll

int main() {
return Utils::func(0, 0); // Call the function
}

This will eliminate all transitions from the call from the exe into the DLL.
I will email our SDE and let him look into this post also. If you have any
more concerns, please feel free to post here. Or you can contact us by
removing online from my email address here. Thanks very much.

Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! 每 www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 17 '05 #7

I tried to take the suggestion of doing a #using <geometry.dll> instead of
including the corresponding header files, but when I did that the result
would not compile:

C:\mycode\geomTest\test.cpp(74) : error C3861:
'DMatrix3d_multiplyDPoint3dArray': identifier not found, even with
argument-dependent lookup

Then I thought, well, maybe I should include the header files AND do a
#using <geometry.dll> but make sure NOT to link with geometry.lib. Then
the problem just moves to link time:

test.obj : error LNK2001: unresolved external symbol "void __cdecl
DMatrix3d_multiplyDPoint3dArray(struct _dMatrix3d const *,struct _dPoint3d
*,struct _dPoint3d const *,int)"
(?bsiDMatrix3d_multiplyDPoint3dArray@@$$J0YAXPBU_dMatrix3d@@PAU_dPoint3d@@PBU2@H@Z)

Here is what I can find on the function in the disassembled geometry.dll (I
omitted the body):

.method /*0600003F*/ public static void
        modopt([mscorlib/* 23000001 */]System.Runtime.CompilerServices.CallConvCdecl/* 01000001 */)
        DMatrix3d_multiplyDPoint3d(
            valuetype _dMatrix3d/* 02000005 */
                modopt([Microsoft.VisualC/* 23000002 */]Microsoft.VisualC.IsConstModifier/* 01000002 */)* pMatrix,
            valuetype _dPoint3d/* 02000006 */* pPoint) cil managed
// SIG: 00 02 20 05 01 0F 20 09 11 14 0F 11 18

Perhaps I am doing something wrong, but it appears to me that you cannot
supply the compiler/linker with the information it needs to call global
functions that were compiled into IL via /CLR simply by referencing the
assembly at compile time. Does that mean that to avoid the inter-assembly
double P/Invoke I have no choice but to wrap all of the functionality in my
geometry library in GC classes? That would be a shame, since I am able to
call it as is just fine - it is just too slow.

Will the double P/Invoke that I am seeing in this case go away as of
Whidbey?

-Bern

"Bern McCarty" <be**********@b entley.com> wrote in message
news:eu******** ******@tk2msftn gp13.phx.gbl...
Yes I'm here. Thanks for the answer. That makes a certain amount of sense. I'll see if I can verify it.

I gather than in Whidbey the performance of my inter-assembly,
managed->managed version would be much better without my changing anything. Yes?

-Bern

"Yan-Hong Huang[MSFT]" <yh*****@online .microsoft.com> wrote in message
news:JX******** ******@cpmsftng xa10.phx.gbl...
Hello Bern,

Are you still monitoring this thread? We just hold a discusstion between
PSS, SDE and PM.

The listed matrix of tested combination is this:

test.exe geometry.dll (contains workhorse function)
-------- ----------------
v1) native native
v2) managed native
v3) managed managed

The key is that we think that the third variation is using exported
functions and an import library to call the function in geometry.dll, as

is
certainly the case with the first two. If this is the case, then it is
mistaken that there are no transitions in this scenario. In fact, there

are
twice as many transitions in variation 3 as in variation 2. The reason for this is the import libraries. Import libraries are a native construct. Any time a function call is made from managed code to a DLL through a stub in the import lib, a managed-native transition must happen. And then, since
the actual implementation of the function in the DLL is managed, there

must
be another transition back to managed. This is very costly, as you found
out.

The good news is that there is a way around these transitions for the
managed/managed case. Here is a small example:

Code for DLL:
public __value class Utils { // Must have a public
managed type (__value or __gc)
public:
static int func(int i, int j) { // Must be static
unless you don't mind creating instances
return i + j;
}
};

Code for EXE:
#using <testdll.dll> // Pull in the types defined in assembly testdll.dll

int main() {
return Utils::func(0, 0); // Call the function }

This will eliminate all transitions from the call from the exe into the

DLL.

I will email our SDE and let him look into this post also. If you have any more concerns, please feel free to post here. Or you can contact us by
removing online from my email address here. Thanks very much.

Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! 每 www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no

rights.


Nov 17 '05 #8
Hi Bern,

Based on my experience, the best way is to verify it by testing on Whidbey.
You can install it from the MSDN subscriber downloads.

For the second issue, I think you need to use a __gc wrapper class to
export it. Please refer to MSDN for the info on it. I think this is the
"MCppWrapper Sample: Demonstrates Wrapping a C++ DLL with Managed
Extensions"
http://msdn.microsoft.com/library/de...us/vcsample/html/vcsammcppwrappersampledemonstrateswrappingcdllwithmanagedextensions.asp

Do you have any more concerns about the performance issue? If yes, please
feel free to post here. I am glad to work with you on it. Thanks very much.

Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 17 '05 #9
Yes I am concerned about performance. I had hoped that IJW could be used to
compile nearly all of our existing C++ application into IL and that that
would eliminate the need for many managed->native transitions and would also
free us to begin using GC types throughout our application over time. Our
application consists of quite a number of .dlls and there are tons of
inter-dll calls. But now what I've learned is that, though the code
compiles, links and runs, every inter-dll call is suffering the double
P/Invoke problem so indeed my code is littered with managed->unmanaged
transitions.

Sure I could wrap every single function/method in my entire application in a
GC class, but then IJW isn't at all suitable for what I thought it was. Like
I said, it compiles, links and runs which is impressive. It's just too slow
and that's too bad. I would still like to know whether the double P/Invoke
problem will really be fixed in the final Whidbey release. I saw someone
from Microsoft hedge on that, saying it might not be. I hope that is not
the case.

As for the Visual Studio 2005 Tech Preview on MSDN, I've already looked at
it. I had so much trouble with it I gave up on it. I found myself editing
delivered headers just to try to get stuff to compile. Then the result
would crash. I haven't seen anyone else posting C++ issues in here that
related to this Whidbey build and I kind of reached the conclusion that the
VC++ team didn't really circle the wagons for this particular build. I can
only assume that they have other better quality builds that people in other
programs have access to. I also found that "search" did not work for the
MSDN library that came with the build and I find that terribly crippling.

Bern McCarty
Bentley Systems, Inc.
"Yan-Hong Huang[MSFT]" <yh*****@online .microsoft.com> wrote in message
news:Kd******** ******@cpmsftng xa10.phx.gbl...
Hi Bern,

Based on my experience, the best way is to verify it by testing on Whidbey. You can install one in MSDN subscriber download.

For the second issue, I think you need to use __GC wrapper class to export
it. Please refer to MSDN for the info of it. I think this is the
"MCppWrappe r Sample: Demonstrates Wrapping a C++ DLL with Managed
Extensions"
http://msdn.microsoft.com/library/de...us/vcsample/ht ml/vcsammcppwrappe rsampledemonstr ateswrappingcdl lwithmanagedext ensions.asp

Do you have any more concerns on the performance issue yet? If yes, please
feel free to post here. I am glad to work with you on it. Thanks very much.
Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! 每 www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 17 '05 #10


0
8039
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
1
7560
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
0
6140
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development projectplanning, coding, testing, and deploymentwithout human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then...
0
5152
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...
0
3545
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
1130
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
838
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.