473,404 Members | 2,170 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,404 software developers and data experts.

Lacking for-loop optimization in C#

I stumbled over an optimization (or lack of one, to be specific) when viewing
IL opcodes generated by the compiler using ms .net 2003. I was testing fast
pixel manipulation using Bitmap.LockBits and unsafe pointers to iterate over
an image's pixels. The inner-loop looked like this:

for (int x = 0; x < _buffer.Width*_buffer.Height; x++)
{
// do manipulations here..
}

I noticed how the for-loop performed extremely slow, and ildasm revealed
that the "x < _buffer.Width*_buffer.Height" was computed at each iteration,
resulting in two virtual calls and one mul. The use of /optimize did not
improve the matter, so the solution was to calculate
_buffer.Width*_buffer.Height outside of the for-loop. I was under the
impression that the compiler performed hoisting automatically, am I missing
something here?
Nov 17 '05 #1
10 4542
If the compiler can't guarantee that the value of
_buffer.Width*_buffer.Height changes during any loop iteration, then it'll
have to be calculated each time. That is the nature of a C/C++/C# for-loop.
It's quite common to have for-loops where you want this to occur. (note that
VB never recalculates the ending condition)

David Anton
www.tangiblesoftwaresolutions.com
Home of the Instant C# VB.NET to C# converter
and the Instant VB C# to VB.NET converter

"MariusI" wrote:
I stumbled over an optimization (or lack of one, to be specific) when viewing
IL opcodes generated by the compiler using ms .net 2003. I was testing fast
pixel manipulation using Bitmap.LockBits and unsafe pointers to iterate over
an image's pixels. The inner-loop looked like this:

for (int x = 0; x < _buffer.Width*_buffer.Height; x++)
{
// do manipulations here..
}

I noticed how the for-loop performed extremely slow, and ildasm revealed
that the "x < _buffer.Width*_buffer.Height" was computed at each iteration,
resulting in two virtual calls and one mul. The use of /optimize did not
improve the matter, so the solution was to calculate
_buffer.Width*_buffer.Height outside of the for-loop. I was under the
impression that the compiler performed hoisting automatically, am I missing
something here?

Nov 17 '05 #2
Hi MariusI,
for (int x = 0; x < _buffer.Width*_buffer.Height; x++)
{
// do manipulations here..
}

I noticed how the for-loop performed extremely slow, and ildasm revealed
that the "x < _buffer.Width*_buffer.Height" was computed at each iteration,


I think that it is working as it should be.
How do you say about that code:

for (int x = 0; x < _buffer.Width*_buffer.Height; x++)
{
// do manipulations here..
if( _buffer.Height>1 ) {
_buffer.Height=_buffer.Height-1;
}
}

Of course there is no sense to shrink a "_buffer" but
the loop expression should work in this case too.
Why the compiler have to guess what you have in mind?

Regards
Marcin
Nov 17 '05 #3
Thanks for the reply. When you say "can't guarantee" do you mean that a
multithreaded system could alter the _buffer size and width? (There is no
code inside the for-loop which refers to the buffer).

"David Anton" wrote:
If the compiler can't guarantee that the value of
_buffer.Width*_buffer.Height changes during any loop iteration, then it'll
have to be calculated each time. That is the nature of a C/C++/C# for-loop.
It's quite common to have for-loops where you want this to occur. (note that
VB never recalculates the ending condition)

David Anton
www.tangiblesoftwaresolutions.com
Home of the Instant C# VB.NET to C# converter
and the Instant VB C# to VB.NET converter

"MariusI" wrote:
I stumbled over an optimization (or lack of one, to be specific) when viewing
IL opcodes generated by the compiler using ms .net 2003. I was testing fast
pixel manipulation using Bitmap.LockBits and unsafe pointers to iterate over
an image's pixels. The inner-loop looked like this:

for (int x = 0; x < _buffer.Width*_buffer.Height; x++)
{
// do manipulations here..
}

I noticed how the for-loop performed extremely slow, and ildasm revealed
that the "x < _buffer.Width*_buffer.Height" was computed at each iteration,
resulting in two virtual calls and one mul. The use of /optimize did not
improve the matter, so the solution was to calculate
_buffer.Width*_buffer.Height outside of the for-loop. I was under the
impression that the compiler performed hoisting automatically, am I missing
something here?

Nov 17 '05 #4
Not knowing the contents of your loop, I assumed that the compiler may have
determined that _buffer may possibly be changed.

If there is no way that _buffer could have been altered during loop
execution, then this is indeed probably a shortcoming of the compiler.

"MariusI" wrote:
Thanks for the reply. When you say "can't guarantee" do you mean that a
multithreaded system could alter the _buffer size and width? (There is no
code inside the for-loop which refers to the buffer).

"David Anton" wrote:
If the compiler can't guarantee that the value of
_buffer.Width*_buffer.Height changes during any loop iteration, then it'll
have to be calculated each time. That is the nature of a C/C++/C# for-loop.
It's quite common to have for-loops where you want this to occur. (note that
VB never recalculates the ending condition)

David Anton
www.tangiblesoftwaresolutions.com
Home of the Instant C# VB.NET to C# converter
and the Instant VB C# to VB.NET converter

"MariusI" wrote:
I stumbled over an optimization (or lack of one, to be specific) when viewing
IL opcodes generated by the compiler using ms .net 2003. I was testing fast
pixel manipulation using Bitmap.LockBits and unsafe pointers to iterate over
an image's pixels. The inner-loop looked like this:

for (int x = 0; x < _buffer.Width*_buffer.Height; x++)
{
// do manipulations here..
}

I noticed how the for-loop performed extremely slow, and ildasm revealed
that the "x < _buffer.Width*_buffer.Height" was computed at each iteration,
resulting in two virtual calls and one mul. The use of /optimize did not
improve the matter, so the solution was to calculate
_buffer.Width*_buffer.Height outside of the for-loop. I was under the
impression that the compiler performed hoisting automatically, am I missing
something here?

Nov 17 '05 #5
Did it actually get any faster when you hoisted the code out of the loop?

I ask, because in general, these kinds of optimizations are done by the JIT
compiler, not the C# compiler. One reason for that is this enables all
languages to take advantage of these kinds of optimization, without every
single compiler having to repeat the same work all the other compilers do.

Most .NET compilers do very little optimization for this reason.

So I'm wondering if this really is slow for the reason you think it is. The
actual compiled code that runs at run time doesn't necessarily have the same
structure as the IL shown by ILDASM. It often doesn't. Indeed there are
certain cases where it definitely will hoist invariants outside of loops at
JIT compile time.

But if the change really did improve things, then it's probably, as others
have suggested, either because the compiler was unable to be absolutely
certain that the values wouldn't change (e.g. because you are calling
virtual accessors), or because the JIT compiler failed to optimize in this
case.

--
Ian Griffiths - http://www.interact-sw.co.uk/iangblog/

"MariusI" <Ma*****@discussions.microsoft.com> wrote in message
news:7E**********************************@microsof t.com...
I stumbled over an optimization (or lack of one, to be specific) when
viewing
IL opcodes generated by the compiler using ms .net 2003. I was testing
fast
pixel manipulation using Bitmap.LockBits and unsafe pointers to iterate
over
an image's pixels. The inner-loop looked like this:

for (int x = 0; x < _buffer.Width*_buffer.Height; x++)
{
// do manipulations here..
}

I noticed how the for-loop performed extremely slow, and ildasm revealed
that the "x < _buffer.Width*_buffer.Height" was computed at each
iteration,
resulting in two virtual calls and one mul. The use of /optimize did not
improve the matter, so the solution was to calculate
_buffer.Width*_buffer.Height outside of the for-loop. I was under the
impression that the compiler performed hoisting automatically, am I
missing
something here?

Nov 17 '05 #6

"MariusI" <Ma*****@discussions.microsoft.com> wrote in message
news:7E**********************************@microsof t.com...
I stumbled over an optimization (or lack of one, to be specific) when
viewing
IL opcodes generated by the compiler using ms .net 2003. I was testing
fast
pixel manipulation using Bitmap.LockBits and unsafe pointers to iterate
over
an image's pixels. The inner-loop looked like this:

for (int x = 0; x < _buffer.Width*_buffer.Height; x++)
{
// do manipulations here..
}

I noticed how the for-loop performed extremely slow, and ildasm revealed
that the "x < _buffer.Width*_buffer.Height" was computed at each
iteration,
resulting in two virtual calls and one mul. The use of /optimize did not
improve the matter, so the solution was to calculate
_buffer.Width*_buffer.Height outside of the for-loop. I was under the
impression that the compiler performed hoisting automatically, am I
missing
something here?


It's not the C# compiler that performs hoisting, it's the JIT compilers that
does a limited form of hoisting but this depends on what's been done in the
loop. Mind to share the code in the loop?

Willy.
Nov 17 '05 #7
Yes, that is a good point, but as long as there is no code within the
for-loop which refers to the buffer, the compiler should optimize away the
buffer.Width*_buffer.Height calculation (move it outside) at compile time. In
my case there was no code inside the for-loop refering to the buffer which
mean that the compiler should have moved the calculation outside.

"Marcin Grzębski" wrote:
Hi MariusI,
for (int x = 0; x < _buffer.Width*_buffer.Height; x++)
{
// do manipulations here..
}

I noticed how the for-loop performed extremely slow, and ildasm revealed
that the "x < _buffer.Width*_buffer.Height" was computed at each iteration,


I think that it is working as it should be.
How do you say about that code:

for (int x = 0; x < _buffer.Width*_buffer.Height; x++)
{
// do manipulations here..
if( _buffer.Height>1 ) {
_buffer.Height=_buffer.Height-1;
}
}

Of course there is no sense to shrink a "_buffer" but
the loop expression should work in this case too.
Why the compiler have to guess what you have in mind?

Regards
Marcin

Nov 17 '05 #8
To be honest there isn't any, but i found an old graphical effect which i had
coded which suffered from the same problems. I hoisted all for-loop variables
and the performance increase was truly impressive. I'll give you the code
whole code for the class, just cut'n paste :-)

// --- Field.cs

using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Collections;
using System.ComponentModel;
using System.Windows.Forms;
using System.Threading;

namespace Field
{
public unsafe sealed class Field : System.Windows.Forms.Form
{
private const int MAXAMP=20, MINAMP=5;

private System.ComponentModel.Container components = null;
private Graphics _hDC;
private Bitmap _buffer;
private int _numPixelsInbuffer;
private int _bufferWidth, _bufferHeight;
private uint [] _palette;
private int [] _hBaseMap, _hRunnerMap;
private Thread _runner;
private bool _done;

public Field()
{
InitializeComponent();
this.SetStyle(ControlStyles.AllPaintingInWmPaint | ControlStyles.UserPaint
| ControlStyles.Opaque, true);
}

/// <summary>
/// Clean up any resources being used.
/// </summary>
protected override void Dispose( bool disposing )
{
if( disposing )
{
if(components != null)
{
components.Dispose();
}
}
base.Dispose( disposing );
}

/// <summary>
/// Generates a heightmap
/// </summary>
/// <param name="map">Map to generate</param>
private void generateHeightMap(ref int [] map, int minamp, int maxamp)
{
Random r = new Random();
int [] xBaseSine = new int[_bufferWidth];

double freqX = r.Next((_bufferWidth-10)/5)+10;
int amplitudeX = r.Next(maxamp-minamp)+minamp;
double freqY = r.Next((_bufferHeight-10)/5)+10;
int amplitudeY = r.Next(maxamp-minamp)+minamp;

fixed (int* pMap = map, pBase = xBaseSine)
{
int* pM = pMap;
int* pB = pBase;

for (int x = 0; x < _bufferWidth; x++ )
{
*pB = (int)(Math.Sin(x/freqX)*amplitudeX);
pB++;
}

for ( int y = 0; y < _bufferHeight; y++ )
{
pB = pBase;
int yBaseSine = (int)(Math.Sin(y/freqY)*amplitudeY);
for ( int x = 0; x < _bufferWidth; x++ )
{
*pM = yBaseSine+*pB;
pM++; pB++;
}
}
}

}

/// <summary>
/// Initializes our palette with shades of blue
/// </summary>
private uint[] initializePalette(int size)
{
uint[] palette = new uint[size];
double blueDecrPrStep = 255f/size;
for (int i = 0; i < size; i++)
palette[i] = 0xFF000000 | (byte)(255f-i*blueDecrPrStep);

return palette;
}

private void run()
{
const int FADE=1, ZEROPOINT = 2*MAXAMP;

bool regenerate = true;
int[] tmpArrayPointer;

_palette = initializePalette(4*MAXAMP);
_hRunnerMap = new int[_numPixelsInbuffer];
_hBaseMap = new int[_numPixelsInbuffer];

generateHeightMap(ref _hRunnerMap, MINAMP, MAXAMP);

while(!_done)
{
if (regenerate)
{
tmpArrayPointer = _hBaseMap;
_hBaseMap = _hRunnerMap;
_hRunnerMap = tmpArrayPointer;
generateHeightMap(ref _hRunnerMap, MINAMP, MAXAMP);
}
regenerate = true;

fixed (int* pBaseMap = _hBaseMap, pRMap = _hRunnerMap)
{
int* pB = pBaseMap; // phMap is read only
int* pR = pRMap; // prHMap is read only

BitmapData bData = _buffer.LockBits(new Rectangle(0, 0, _buffer.Width,
_buffer.Height),
ImageLockMode.WriteOnly, _buffer.PixelFormat);

uint* pPixel = (uint*)bData.Scan0.ToPointer();

for ( int i = 0; i < _numPixelsInbuffer; i++ )
{
*pPixel = _palette[*pB+ZEROPOINT];
++pPixel;

if (*pR > *pB)
{
*pB += FADE;
regenerate = false;
}
else if (*pR < *pB)
{
*pB -= FADE;
regenerate = false;
}
pB++;
pR++;
}

_buffer.UnlockBits(bData);
}

_hDC.DrawImage(_buffer,0,0);
}
}

#region Windows Form Designer generated code
/// <summary>
/// Required method for Designer support - do not modify
/// the contents of this method with the code editor.
/// </summary>
private void InitializeComponent()
{
//
// Field
//
this.AutoScaleBaseSize = new System.Drawing.Size(5, 13);
this.ClientSize = new System.Drawing.Size(492, 371);
this.Name = "Field";
this.Text = "Field";
this.Closing += new
System.ComponentModel.CancelEventHandler(this.Fiel d_Closing);
this.Load += new System.EventHandler(this.Field_Load);

}
#endregion

private void Field_Load(object sender, System.EventArgs e)
{
_hDC = this.CreateGraphics();
_buffer = new Bitmap(this.ClientSize.Width, this.ClientSize.Height, _hDC);
_numPixelsInbuffer = _buffer.Width*_buffer.Height;
_bufferWidth = _buffer.Width;
_bufferHeight = _buffer.Height;
_runner = new Thread(new ThreadStart(this.run));
_runner.Start();
}

private void Field_Closing(object sender,
System.ComponentModel.CancelEventArgs e)
{
_done = true;

// wait for thread to end naturally
while (_runner.IsAlive) {}
}
}
}
"Willy Denoyette [MVP]" wrote:

"MariusI" <Ma*****@discussions.microsoft.com> wrote in message
news:7E**********************************@microsof t.com...
I stumbled over an optimization (or lack of one, to be specific) when
viewing
IL opcodes generated by the compiler using ms .net 2003. I was testing
fast
pixel manipulation using Bitmap.LockBits and unsafe pointers to iterate
over
an image's pixels. The inner-loop looked like this:

for (int x = 0; x < _buffer.Width*_buffer.Height; x++)
{
// do manipulations here..
}

I noticed how the for-loop performed extremely slow, and ildasm revealed
that the "x < _buffer.Width*_buffer.Height" was computed at each
iteration,
resulting in two virtual calls and one mul. The use of /optimize did not
improve the matter, so the solution was to calculate
_buffer.Width*_buffer.Height outside of the for-loop. I was under the
impression that the compiler performed hoisting automatically, am I
missing
something here?


It's not the C# compiler that performs hoisting, it's the JIT compilers that
does a limited form of hoisting but this depends on what's been done in the
loop. Mind to share the code in the loop?

Willy.

Nov 17 '05 #9
MariusI <Ma*****@discussions.microsoft.com> wrote:
To be honest there isn't any, but i found an old graphical effect which i had
coded which suffered from the same problems. I hoisted all for-loop variables
and the performance increase was truly impressive. I'll give you the code
whole code for the class, just cut'n paste :-)


Ah, I see these are member variables. I suspect that the compiler is
indeed concerned (not unreasonably) about the possibility of them being
modified in a
multi-threaded environment.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 17 '05 #10
Or, more likely, the compiler has no way of knowing what might cause
the .Width and .Height properties to change, so it did the conservative
thing and left the property fetches in the loop. If you were to change
it to this:

int wid = _buffer.Width;
int high = _buffer.Height;
for (int i = 0; i < wid * high; i++)
{
}

I would be very surprised if the compiler did not optimize and do the
multiply only once. The compiler should know that the _buffer reference
isn't changing, but what it can't know is whether something inside the
loop might not affect the values that _buffer chooses to return for
Width and Height.

Nov 17 '05 #11

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: Xerxes | last post by:
Hi, is there any script to authenticate an email address entered in a form field? I used the php mail() function, using the following (where my email field on the form is called "email"): ...
4
by: Craig Bailey | last post by:
Anyone recommend a good script editor for Mac OS X? Just finished a 4-day PHP class in front of a Windows machine, and liked the editor we used. Don't recall the name, but it gave line numbers as...
0
by: Verizon | last post by:
Has anybody ever heard of support for the book: "Secure PHP Development" by: Mohammed J. Kabir I'm trying to run one of his PHP solutions called "Web Forms Manager" I haven't been able to...
1
by: Anne Wangnick | last post by:
Dear all, I don't get why the index() method is only defined for mutable sequence types. This is not what I expected. Shouldn't this be added in Python? Is there such a PEP already? Regards,...
2
by: rakesh | last post by:
I want to make my windows form .exe file run on a machine lacking dotnet framework.How can I make a setup project out of it to run on all machines.
2
by: Steve | last post by:
I just realized that I need elements in my List<> to be doubly linked. So I searched for "C# Linked List" and found the LinkedList<> class. I was happy until I looked at the docs for this class...
7
by: cess | last post by:
Hi!!! i would like to know if what is lacking in the codes below to have a greatest common denominator of two given(by the user) numbers?? I'm confused and i need your help! import java.io.*;...
3
by: cess | last post by:
If the user input three numbers, say (1,3,2), how it will become 1,2,3 or in ascending order?? there is something wrong/lacking with my code, help plzzzz!! import java.io.*; public class sign {...
2
by: Cyntalan | last post by:
You know, I have this bad feeling I'm wasting space by asking this, but I'm having a peculiar problem. Most of my VBA experience has been Excel, and giving Access for a go for the first time has...
33
by: castironpi | last post by:
I am concerned by the lack of follow-through on some responses to recent ideas I have described. Do I merely have a wrong understanding of group policy? Is it a good policy (defined with respect...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.