While we are discussing the new operator, a different question about it...

I am currently working on an EDA app and heavily working on squeezing
the last bits of performance out of it. Going as far as sending
batches of geometry to the video card while still processing geometry
to get some parallelization going. Though at least on my hardware, this
does not buy me much. I luv my two 7800 GTs in SLI =) But for users
with a lower spec video card, this may actually be of help.

I am also going ahead and running two render threads each processing
half the geometry to make use of hyperthreading or dual core if
available. Bought me a few ms rendering speedup!

The next thing I did: instead of using a generic List<Vertex> to store
my created triangles in, I use a static Vertex[] array instead. Once it
fills up, data is committed to the video hardware and the next batch
is processed. I was halfway expecting a speedup here already since the
overhead of calling .Add for the List is now removed but it actually
did not make any significant measurable difference.
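The fill-then-commit scheme described above can be sketched roughly as follows. This is only an illustration under assumptions: the Vertex fields, the VertexBatch class, and the commit callback (a stand-in for the actual upload to the video hardware) are hypothetical names, not taken from the post, and the lambda syntax is newer than the .NET 2.0 compiler of the time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical vertex layout; the post does not show the real fields.
public struct Vertex
{
    public float X, Y;
    public Vertex(float x, float y) { X = x; Y = y; }
}

public sealed class VertexBatch
{
    private readonly Vertex[] _buffer;
    private readonly Action<Vertex[], int> _commit; // stand-in for the upload to the video card
    private int _count;

    public VertexBatch(int capacity, Action<Vertex[], int> commit)
    {
        _buffer = new Vertex[capacity];
        _commit = commit;
    }

    // Append one vertex; commit the batch as soon as the buffer is full.
    public void Add(Vertex v)
    {
        _buffer[_count++] = v;
        if (_count == _buffer.Length) Flush();
    }

    // Commit whatever is pending and start the next batch.
    public void Flush()
    {
        if (_count == 0) return;
        _commit(_buffer, _count);
        _count = 0;
    }
}

public static class BatchDemo
{
    // Returns the size of every batch that was committed.
    public static List<int> Run(int vertexCount, int capacity)
    {
        List<int> sizes = new List<int>();
        VertexBatch batch = new VertexBatch(capacity, (buf, n) => sizes.Add(n));
        for (int i = 0; i < vertexCount; i++) batch.Add(new Vertex(i, i));
        batch.Flush(); // commit the final, partially filled batch
        return sizes;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run(7, 3))); // prints 3,3,1
    }
}
```

The final Flush call matters: without it, a partially filled last batch would never be committed.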

So to finally get to my question of the new operator, the next thing I
am looking at is how I am assigning data to my vertex list.

Currently this looks as follows:

polys[currentPoly++] = new Vertex(...);
polys[currentPoly++] = new Vertex(...);
polys[currentPoly++] = new Vertex(...);

and so on...there are quite a few cases where I have multiple lines of
assignments like that, generally always in sets of 3. Triangles are
just weird that way =)

Now out of all my drawing functions, the one that gets called most
often is my function to render a triangulated line with
round caps. So I took this function apart and did the following:

polys[currentPoly].x = coordinate;
polys[currentPoly].y = coordinate;
polys[currentPoly].remaining parameters = values...;
currentPoly++;

polys[currentPoly].x = coordinate;
polys[currentPoly].y = coordinate;
polys[currentPoly].remaining parameters = values...;
currentPoly++;

polys[currentPoly].x = coordinate;
polys[currentPoly].y = coordinate;
polys[currentPoly].remaining parameters = values...;
currentPoly++;

Repeat as necessary for all the assignments. I was expecting that
eliminating the new operator and subsequent copy of the vertex
structure would give me a speed up if I assign the parameters
directly.
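For readers who want to try the comparison, here is a minimal sketch of the two forms side by side. The Vertex struct is an assumption: the field names x and y come from the post, while Color and the constructor signature are made up for illustration.

```csharp
using System;

// Hypothetical vertex; x and y come from the post, Color is assumed.
public struct Vertex
{
    public float x, y;
    public int Color;

    public Vertex(float x, float y, int color)
    {
        this.x = x;
        this.y = y;
        Color = color;
    }
}

public static class FillDemo
{
    // Form 1: construct each element with new.
    public static int FillWithNew(Vertex[] polys, int currentPoly)
    {
        polys[currentPoly++] = new Vertex(0f, 0f, 1);
        polys[currentPoly++] = new Vertex(1f, 0f, 1);
        polys[currentPoly++] = new Vertex(0f, 1f, 1);
        return currentPoly;
    }

    // Form 2: write the fields directly into the array slot.
    public static int FillInPlace(Vertex[] polys, int currentPoly)
    {
        polys[currentPoly].x = 0f;
        polys[currentPoly].y = 0f;
        polys[currentPoly].Color = 1;
        currentPoly++;

        polys[currentPoly].x = 1f;
        polys[currentPoly].y = 0f;
        polys[currentPoly].Color = 1;
        currentPoly++;

        polys[currentPoly].x = 0f;
        polys[currentPoly].y = 1f;
        polys[currentPoly].Color = 1;
        currentPoly++;
        return currentPoly;
    }

    // Field-by-field comparison of two vertex arrays.
    public static bool SameContents(Vertex[] a, Vertex[] b)
    {
        if (a.Length != b.Length) return false;
        for (int i = 0; i < a.Length; i++)
            if (a[i].x != b[i].x || a[i].y != b[i].y || a[i].Color != b[i].Color) return false;
        return true;
    }

    public static void Main()
    {
        Vertex[] a = new Vertex[3];
        Vertex[] b = new Vertex[3];
        FillWithNew(a, 0);
        FillInPlace(b, 0);
        Console.WriteLine("same contents: " + SameContents(a, b));
    }
}
```

One difference is worth keeping in mind: the constructor form overwrites the whole element, whereas the field-by-field form leaves any field that is not assigned at its previous value when the buffer is reused.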

I was rather surprised, both pleasantly and not, to see it made no
difference. It's nice because the new Vertex() way is more readable
code-wise. But...I would have really liked to have seen a performance
improvement.

So what exactly does the new operator do in this case? Does the
compiler somehow optimize the new operator away and generate code to
assign the values manually like I tried to avoid creating a struct and
copying it? Technically it is a possibility since it is assigning
identical value types to each other...so it does know what's
ultimately going to happen.

Just curious...

And damnit..I now need to find something else to do to get more
speed!! =)

--
Stephan
2003 Yamaha R6

kimi no koto omoidasu hi
nante nai no wa
kimi no koto wasureta toki ga nai kara
Jul 6 '06 #1
Very nice. Now, kindly post me a Mandelbrot vertex shader so I can make a
nice Fractal screensaver with DirectX 9c.
Peter

--
Co-founder, Eggheadcafe.com developer portal:
http://www.eggheadcafe.com
UnBlog:
http://petesbloggerama.blogspot.com
Jul 7 '06 #2
Stephan Rose wrote:
>So what exactly does the new operator do in this case? Does the
>compiler somehow optimize the new operator away
Reflector is your friend - <http://www.aisto.com/roeder/dotnet>. The
smart thing to do is to look at the generated code and see for
yourself, instead of asking questions and waiting for replies.

In this case, yes, calling new on a struct constructs in place. That
is, it does NOT create a temporary struct, initialize it, and then
copy it to the struct you're assigning. Rather, code like

Point P = new Point(1, 2);

allocates space for P, and passes a pointer to P to the Point
constructor, as the this reference. The constructor then sets this.X
and this.Y in the normal way.
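A small sketch of the Point case, for anyone who wants to look at the generated code themselves. The Point struct below is a stand-in written for this example. The IL shape mentioned in the comment is what the C# compiler typically emits for a freshly declared local, but treat it as something to verify with ildasm or Reflector, not as a guarantee. The IL for an array-element target such as polys[i] = new Vertex(...) may look different, so it is worth checking separately.

```csharp
using System;

public struct Point
{
    public int X, Y;

    // In a struct constructor, 'this' refers to the storage being initialized.
    public Point(int x, int y)
    {
        this.X = x;
        this.Y = y;
    }
}

public static class PointDemo
{
    public static Point Make()
    {
        // For a local declared and initialized like this, the compiler typically emits
        //   ldloca.s <P>, ldc.i4.1, ldc.i4.2, call instance void Point::.ctor(int32, int32)
        // that is, the address of P is passed to the constructor as 'this',
        // with no heap allocation involved. Verify against your own build.
        Point P = new Point(1, 2);
        return P;
    }

    public static void Main()
    {
        Point p = Make();
        Console.WriteLine("X={0}, Y={1}, value type: {2}", p.X, p.Y, typeof(Point).IsValueType);
    }
}
```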

--
Be the first to review my new book!

.NET 2.0 for Delphi Programmers www.midnightbeach.com/.net
Delphi skills make .NET easy to learn In print, in stores.
Jul 7 '06 #3
How are you measuring that it made "no difference"?

Cheers,

Greg
Jul 7 '06 #4
On Fri, 7 Jul 2006 00:44:49 -0400, "Greg Young"
<dr*******************@hotmail.com> wrote:
>How are you measuring that it made "no difference"?
The average measured amount of time (using GetPerformanceCounter) it
took to render the identical data was no different than before. =)
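As a rough illustration of this kind of measurement, the sketch below averages a workload over many runs using System.Diagnostics.Stopwatch, which uses the high-resolution performance counter when one is available. The Timing class, the AverageMs helper, and the sample workload are assumptions made for this example, not the poster's code. The extra call before timing starts is there so that JIT compilation of the workload is not counted.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Average wall-clock milliseconds per call of 'work' over 'iterations' timed runs.
    public static double AverageMs(Action work, int iterations)
    {
        if (iterations <= 0) throw new ArgumentOutOfRangeException(nameof(iterations));

        work(); // warm-up run, so JIT compilation is not part of the measurement

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) work();
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds / iterations;
    }

    public static void Main()
    {
        double[] data = new double[100000];
        double ms = AverageMs(() =>
        {
            // Placeholder workload standing in for the render function.
            for (int i = 0; i < data.Length; i++) data[i] = Math.Sqrt(i);
        }, 1000);
        Console.WriteLine("average: {0:F4} ms per run", ms);
    }
}
```

Averaging over a few runs is fine for spotting large changes; small differences need many iterations and a release build run outside the debugger to show up reliably.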

--
Stephan
Jul 7 '06 #5
average measured for 1 run, 5000, 1000?

Have you looked at the optimized JIT output yet?

Cheers,

Greg
Jul 7 '06 #6
On Fri, 7 Jul 2006 04:30:19 -0400, "Greg Young"
<dr*******************@hotmail.com> wrote:
>average measured for 1 run, 5000, 1000?
Just a couple runs. I am not really concerned at this moment about
speed increases in the sub-10% range, where I would need a lot of timing
accuracy to notice any improvement.

I was more concerned with improvements like the one I managed today,
reducing a roughly 120 ms average down to a roughly 70 ms average.
Managed to find a spot to optimize in my polygon triangulator. At that
point in time, I know I made a significant improvement with a gap in
time that large.
>Have you looked at the optimized JIT output yet?
No, I haven't. I had actually been meaning to ask that one of these
days: where can I look at the JIT output?

--
Stephan
Jul 7 '06 #7
the new operator is one of those places. It will not be a huge performance
gain but it should offer a slight performance gain.

Cheers,

Greg
Jul 7 '06 #8
