By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
439,993 Members | 1,883 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 439,993 IT Pros & Developers. It's quick & easy.

byte[] to byte*... What is the fastest way?

P: n/a
Hi,
The subject says it all... I want to use a byte[] and use it as byte* so I
can increment the pointer to iterate through it.

What is the fastest way of doing so in C#?

Thanks

ThunderMusic
Aug 23 '07 #1
Share this Question
Share on Google+
24 Replies


P: n/a
ThunderMusic,

Use unsafe code:

byte[] bytes = ...;

unsafe
{
fixed (byte* p = bytes)
{
// Work with pointer here.
}
}

As a matter of fact, that's the only way to do it, as you need to pin
down the location of the array to prevent the reference from moving around.

Is there a reason you need the pointer, or are you just looking for a
faster way to iterate through the array?
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:%2****************@TK2MSFTNGP04.phx.gbl...
Hi,
The subject says it all... I want to use a byte[] and use it as byte* so I
can increment the pointer to iterate through it.

What is the fastest way of doing so in C#?

Thanks

ThunderMusic

Aug 23 '07 #2

P: n/a
>The subject says it all... I want to use a byte[] and use it as byte* so I
>can increment the pointer to iterate through it.

What is the fastest way of doing so in C#?
The fixed statement?
Mattias

--
Mattias Sjögren [C# MVP] mattias @ mvps.org
http://www.msjogren.net/dotnet/ | http://www.dotnetinterop.com
Please reply only to the newsgroup.
Aug 23 '07 #3

P: n/a
Hi,

"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:%2****************@TK2MSFTNGP04.phx.gbl...
Hi,
The subject says it all... I want to use a byte[] and use it as byte* so I
can increment the pointer to iterate through it.

What is the fastest way of doing so in C#?
Out of curiosity, why you want to do that?

What is wrong with using a indexer like
for( int i=0; i< buffer.Lengh; i++
buffer[i] .....

Aug 23 '07 #4

P: n/a
"ThunderMusic" <No*************************@NoSpAm.comwrote:
The subject says it all... I want to use a byte[] and use it as byte* so I
can increment the pointer to iterate through it.
I really hate to be pedantic, but I'm willing to bet that the difference in
how you iterate through your array makes little to no difference in the
overall performance of your code.

People frequently are guilting of over-optimizing things that are already
"Fast Enough". Unless you've verified this section is slow via a Profiler,
you're better off not getting fancy with optimizations.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise, Microsoft C# MVP
http://www.coversant.com/blogs/cmullins
Aug 23 '07 #5

P: n/a
hi,
thanks for the really quick answer... Your solution works, but I can't
do p++ because it's fixed. So I must use another pointer like byte* p2 = p;
so now I can do p2++;...

Actually, I'm looking for a faster way to compute a checksum on a byte
array... For now, I'm using the Adler-32 algorithm, but I'm open to advises
on a performant checksum algorithm. It will be for an error checking
mecanism for tcp and udp communication on a closed network environment. So
it doesn't need to be human-modification resistant, it's just to prevent
modification due to the noise on the line (if it can happen)...

Thanks

ThunderMusic

"Nicholas Paldino [.NET/C# MVP]" <mv*@spam.guard.caspershouse.comwrote in
message news:ue**************@TK2MSFTNGP03.phx.gbl...
ThunderMusic,

Use unsafe code:

byte[] bytes = ...;

unsafe
{
fixed (byte* p = bytes)
{
// Work with pointer here.
}
}

As a matter of fact, that's the only way to do it, as you need to pin
down the location of the array to prevent the reference from moving
around.

Is there a reason you need the pointer, or are you just looking for a
faster way to iterate through the array?
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:%2****************@TK2MSFTNGP04.phx.gbl...
>Hi,
The subject says it all... I want to use a byte[] and use it as byte* so
I can increment the pointer to iterate through it.

What is the fastest way of doing so in C#?

Thanks

ThunderMusic


Aug 23 '07 #6

P: n/a
On Thu, 23 Aug 2007 15:36:18 -0400, "ThunderMusic"
<No*************************@NoSpAm.comwrote:
>Hi,
The subject says it all... I want to use a byte[] and use it as byte* so I
can increment the pointer to iterate through it.

What is the fastest way of doing so in C#?
I have recently started with C#, converting C++ applications that must
communicate with external equipment using protocols with rigidly fixed
byte sequences. Initially I was terribly frustrated abandoning my
precious pointer manipulation and access to byte arrays. But I
developed a bunch of utilities to create managed code byte[] arrays
from pieces and extract appropriate values from them and can now avoid
all unmanaged code. Look into Encoding.Convert for converting between
ASCII byte sequences and Unicode, MemoryStream.Write and ToArray along
with BinaryWriter.Write for creating byte[] data and the BitConverter
set of methods to extract values from byte[].

I keep telling myself that staying entirely within managed code is
worth all that effort, but I am still early in the process. Some old
timers here can probably fill you in on much more detail, including
the virtues of doing so.


Aug 23 '07 #7

P: n/a
ThunderMusic,

If you are looking to tune this algoritm, I would say that the iteration
through the bytes is NOT the way to do it. There are probably a bunch of
other areas that can be improved upon.
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:uk**************@TK2MSFTNGP06.phx.gbl...
hi,
thanks for the really quick answer... Your solution works, but I can't
do p++ because it's fixed. So I must use another pointer like byte* p2 =
p; so now I can do p2++;...

Actually, I'm looking for a faster way to compute a checksum on a byte
array... For now, I'm using the Adler-32 algorithm, but I'm open to
advises on a performant checksum algorithm. It will be for an error
checking mecanism for tcp and udp communication on a closed network
environment. So it doesn't need to be human-modification resistant, it's
just to prevent modification due to the noise on the line (if it can
happen)...

Thanks

ThunderMusic

"Nicholas Paldino [.NET/C# MVP]" <mv*@spam.guard.caspershouse.comwrote
in message news:ue**************@TK2MSFTNGP03.phx.gbl...
>ThunderMusic,

Use unsafe code:

byte[] bytes = ...;

unsafe
{
fixed (byte* p = bytes)
{
// Work with pointer here.
}
}

As a matter of fact, that's the only way to do it, as you need to pin
down the location of the array to prevent the reference from moving
around.

Is there a reason you need the pointer, or are you just looking for a
faster way to iterate through the array?
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:%2****************@TK2MSFTNGP04.phx.gbl...
>>Hi,
The subject says it all... I want to use a byte[] and use it as byte* so
I can increment the pointer to iterate through it.

What is the fastest way of doing so in C#?

Thanks

ThunderMusic



Aug 23 '07 #8

P: n/a

On Aug 23, 10:05 pm, "ThunderMusic"
<NoSpAmdanlatathotmaildot...@NoSpAm.comwrote:
on a performant checksum algorithm. It will be for an error checking
mecanism for tcp and udp communication on a closed network environment. So
it doesn't need to be human-modification resistant, it's just to prevent
modification due to the noise on the line (if it can happen)...
It can't happen.

Now, how's that for optimization?

Aug 23 '07 #9

P: n/a
hi,
I know it may be overoptimizing, but this part of the code will be called
many many many times per seconds, so I'm must make sure it's as optimized as
it can be. It's for a checksum mecanism, so each time a message must be
sent, it is called to compute the checksum to append to the message and will
be computed again when the client receives it so it can verify the validity
of the message.

Thanks for the comment tought...

ThunderMusic

"Chris Mullins [MVP]" <cm******@yahoo.comwrote in message
news:uy*************@TK2MSFTNGP06.phx.gbl...
"ThunderMusic" <No*************************@NoSpAm.comwrote:
>The subject says it all... I want to use a byte[] and use it as byte* so
I can increment the pointer to iterate through it.

I really hate to be pedantic, but I'm willing to bet that the difference
in how you iterate through your array makes little to no difference in the
overall performance of your code.

People frequently are guilting of over-optimizing things that are already
"Fast Enough". Unless you've verified this section is slow via a Profiler,
you're better off not getting fancy with optimizations.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise, Microsoft C# MVP
http://www.coversant.com/blogs/cmullins

Aug 23 '07 #10

P: n/a
hi,
So, here is the code to get my checksum. If you see something that can be
optimized, let me know.

thanks

ThunderMusic

Expand|Select|Wrap|Line Numbers
  1.  
  2. private const UInt16 MOD_ADLER = 65521;
  3. private unsafe uint GetChecksum(byte[] databytes)
  4. {
  5. UInt32 a = 1, b = 0;
  6. fixed (byte* tmpdata = databytes)
  7. {
  8. byte* data = tmpdata;
  9. int len = databytes.Length; /* Length in bytes */
  10. while (len 0)
  11. {
  12. int tlen = len 5550 ? 5550 : len;
  13. len -= tlen;
  14. do
  15. {
  16. a += *(data++);
  17. b += a;
  18. } while ((--tlen) 0);
  19. a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
  20. b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
  21. }
  22. /* It can be shown that a <= 0x1013a here, so a single subtract will
  23. do. */
  24. if (a >= MOD_ADLER)
  25. a -= MOD_ADLER;
  26. /* It can be shown that b can reach 0xffef1 here. */
  27. b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
  28. if (b >= MOD_ADLER)
  29. b -= MOD_ADLER;
  30. }
  31. return ((b << 16) | a);
  32. }
  33.  
  34.  


"Nicholas Paldino [.NET/C# MVP]" <mv*@spam.guard.caspershouse.comwrote in
message news:ew**************@TK2MSFTNGP04.phx.gbl...
ThunderMusic,

If you are looking to tune this algoritm, I would say that the
iteration through the bytes is NOT the way to do it. There are probably a
bunch of other areas that can be improved upon.
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:uk**************@TK2MSFTNGP06.phx.gbl...
>hi,
thanks for the really quick answer... Your solution works, but I can't
do p++ because it's fixed. So I must use another pointer like byte* p2 =
p; so now I can do p2++;...

Actually, I'm looking for a faster way to compute a checksum on a byte
array... For now, I'm using the Adler-32 algorithm, but I'm open to
advises on a performant checksum algorithm. It will be for an error
checking mecanism for tcp and udp communication on a closed network
environment. So it doesn't need to be human-modification resistant, it's
just to prevent modification due to the noise on the line (if it can
happen)...

Thanks

ThunderMusic

"Nicholas Paldino [.NET/C# MVP]" <mv*@spam.guard.caspershouse.comwrote
in message news:ue**************@TK2MSFTNGP03.phx.gbl...
>>ThunderMusic,

Use unsafe code:

byte[] bytes = ...;

unsafe
{
fixed (byte* p = bytes)
{
// Work with pointer here.
}
}

As a matter of fact, that's the only way to do it, as you need to pin
down the location of the array to prevent the reference from moving
around.

Is there a reason you need the pointer, or are you just looking for a
faster way to iterate through the array?
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:%2****************@TK2MSFTNGP04.phx.gbl.. .
Hi,
The subject says it all... I want to use a byte[] and use it as byte*
so I can increment the pointer to iterate through it.

What is the fastest way of doing so in C#?

Thanks

ThunderMusic



Aug 23 '07 #11

P: n/a
Hi,

"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:eZ**************@TK2MSFTNGP06.phx.gbl...
hi,
I know it may be overoptimizing, but this part of the code will be called
many many many times per seconds, so I'm must make sure it's as optimized
as it can be. It's for a checksum mecanism, so each time a message must be
sent, it is called to compute the checksum to append to the message and
will be computed again when the client receives it so it can verify the
validity of the message.

Thanks for the comment tought...
Why don't you do something, try using the "vanila" for statement versus the
pointer arithmetic you want to use, I will be surprised if you find any
noticeable difference.
Aug 23 '07 #12

P: n/a
I understand well the "This code it called alot, therefore it needs to be
optimized." frame of mind. I build big, high-performance, network socket
servers for a living. I know this space really well, and have done lots in
both TCP & UDP land at scales that are pretty signfigant.

Put a profiler on this, and you'll likley find that this chunk of code won't
even show up. Certainly the difference between the two (pointer vs iterator)
is unlikley to show up at all.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise, Microsoft C# MVP
http://www.coversant.com/blogs/cmullins

"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:eZ**************@TK2MSFTNGP06.phx.gbl...
hi,
I know it may be overoptimizing, but this part of the code will be called
many many many times per seconds, so I'm must make sure it's as optimized
as it can be. It's for a checksum mecanism, so each time a message must be
sent, it is called to compute the checksum to append to the message and
will be computed again when the client receives it so it can verify the
validity of the message.

Thanks for the comment tought...

ThunderMusic

"Chris Mullins [MVP]" <cm******@yahoo.comwrote in message
news:uy*************@TK2MSFTNGP06.phx.gbl...
>"ThunderMusic" <No*************************@NoSpAm.comwrote:
>>The subject says it all... I want to use a byte[] and use it as byte* so
I can increment the pointer to iterate through it.

I really hate to be pedantic, but I'm willing to bet that the difference
in how you iterate through your array makes little to no difference in
the overall performance of your code.

People frequently are guilting of over-optimizing things that are already
"Fast Enough". Unless you've verified this section is slow via a
Profiler, you're better off not getting fancy with optimizations.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise, Microsoft C# MVP
http://www.coversant.com/blogs/cmullins


Aug 23 '07 #13

P: n/a
"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:Oh**************@TK2MSFTNGP02.phx.gbl...
hi,
So, here is the code to get my checksum. If you see something that can be
optimized, let me know.

thanks

ThunderMusic

Expand|Select|Wrap|Line Numbers
  1. private const UInt16 MOD_ADLER = 65521;
  2. private unsafe uint GetChecksum(byte[] databytes)
  3. {
  4.    UInt32 a = 1, b = 0;
  5.    fixed (byte* tmpdata = databytes)
  6.    {
  7.        byte* data = tmpdata;
  8.        int len = databytes.Length; /* Length in bytes */
  9.        while (len 0)
  10.        {
  11.            int tlen = len 5550 ? 5550 : len;
  12.            len -= tlen;
  13.            do
  14.            {
  15.                a += *(data++);
  16.                b += a;
  17.            } while ((--tlen) 0);
  18.            a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
  19.            b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
  20.        }
  21.        /* It can be shown that a <= 0x1013a here, so a single subtract
  22. will do. */
  23.        if (a >= MOD_ADLER)
  24.            a -= MOD_ADLER;
  25.        /* It can be shown that b can reach 0xffef1 here. */
  26.        b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
  27.        if (b >= MOD_ADLER)
  28.            b -= MOD_ADLER;
  29.    }
  30.    return ((b << 16) | a);
  31. }
  32.  
You can try to do some partial loop unrolling, this is what a C++ compiler
does and what the C# compiler/JIT unfortunately fails to do.

UInt32 a = 1, b = 0;
fixed (byte* tmpdata = databytes)
{
byte* data = tmpdata;
int len = databytes.Length; /* Length in bytes */
while(len 0)
{
int tlen = len 5550 ? 5550 : len;
len -= tlen;
for(int c = 0; c < tlen / 4; c++)
{
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
}
a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
for(int c = 0; c < tlen % 4; c++)
{
a += *(data++);
b += a;
}
a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
}
if (a >= MOD_ADLER)
a -= MOD_ADLER;
/* It can be shown that b can reach 0xffef1 here. */
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
if (b >= MOD_ADLER)
b -= MOD_ADLER;
}
return ((b << 16) | a);
Willy.

Aug 23 '07 #14

P: n/a
wow!! that's pretty impressive... this little modification made the loop 2
to 3 times faster!!

thanks a lot

ThunderMusic

"Willy Denoyette [MVP]" <wi*************@telenet.bewrote in message
news:uu****************@TK2MSFTNGP03.phx.gbl...
"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:Oh**************@TK2MSFTNGP02.phx.gbl...
>hi,
So, here is the code to get my checksum. If you see something that can be
optimized, let me know.

thanks

ThunderMusic

Expand|Select|Wrap|Line Numbers
  1. private const UInt16 MOD_ADLER = 65521;
  2. private unsafe uint GetChecksum(byte[] databytes)
  3. {
  4.    UInt32 a = 1, b = 0;
  5.    fixed (byte* tmpdata = databytes)
  6.    {
  7.        byte* data = tmpdata;
  8.        int len = databytes.Length; /* Length in bytes */
  9.        while (len 0)
  10.        {
  11.            int tlen = len 5550 ? 5550 : len;
  12.            len -= tlen;
  13.            do
  14.            {
  15.                a += *(data++);
  16.                b += a;
  17.            } while ((--tlen) 0);
  18.            a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
  19.            b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
  20.        }
  21.        /* It can be shown that a <= 0x1013a here, so a single subtract
  22. will do. */
  23.        if (a >= MOD_ADLER)
  24.            a -= MOD_ADLER;
  25.        /* It can be shown that b can reach 0xffef1 here. */
  26.        b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
  27.        if (b >= MOD_ADLER)
  28.            b -= MOD_ADLER;
  29.    }
  30.    return ((b << 16) | a);
  31. }

You can try to do some partial loop unrolling, this is what a C++ compiler
does and what the C# compiler/JIT unfortunately fails to do.

UInt32 a = 1, b = 0;
fixed (byte* tmpdata = databytes)
{
byte* data = tmpdata;
int len = databytes.Length; /* Length in bytes */
while(len 0)
{
int tlen = len 5550 ? 5550 : len;
len -= tlen;
for(int c = 0; c < tlen / 4; c++)
{
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
}
a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
for(int c = 0; c < tlen % 4; c++)
{
a += *(data++);
b += a;
}
a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
}
if (a >= MOD_ADLER)
a -= MOD_ADLER;
/* It can be shown that b can reach 0xffef1 here. */
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
if (b >= MOD_ADLER)
b -= MOD_ADLER;
}
return ((b << 16) | a);
Willy.

Aug 24 '07 #15

P: n/a
"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:u$**************@TK2MSFTNGP03.phx.gbl...
wow!! that's pretty impressive... this little modification made the loop
2 to 3 times faster!!
2 to 3 times faster? It's hard to believe honestly, I would have expected
something like a 20-30% improvement. Are you sure about your time
measurement techniques?

Willy.

Aug 24 '07 #16

P: n/a

"r norman" <r_s_norman@_comcast.netwrote in message
news:dq********************************@4ax.com...
On Thu, 23 Aug 2007 15:36:18 -0400, "ThunderMusic"
<No*************************@NoSpAm.comwrote:
>>Hi,
The subject says it all... I want to use a byte[] and use it as byte* so I
can increment the pointer to iterate through it.

What is the fastest way of doing so in C#?

I have recently started with C#, converting C++ applications that must
communicate with external equipment using protocols with rigidly fixed
byte sequences. Initially I was terribly frustrated abandoning my
precious pointer manipulation and access to byte arrays. But I
developed a bunch of utilities to create managed code byte[] arrays
from pieces and extract appropriate values from them and can now avoid
all unmanaged code. Look into Encoding.Convert for converting between
ASCII byte sequences and Unicode, MemoryStream.Write and ToArray along
with BinaryWriter.Write for creating byte[] data and the BitConverter
set of methods to extract values from byte[].

I keep telling myself that staying entirely within managed code is
worth all that effort, but I am still early in the process. Some old
timers here can probably fill you in on much more detail, including
the virtues of doing so.
Staying in managed code may be worth the effort, converting to C# is not.
The C++ compiler still makes better code than the JIT can.

Aug 25 '07 #17

P: n/a

"Willy Denoyette [MVP]" <wi*************@telenet.bewrote in message
news:uu****************@TK2MSFTNGP03.phx.gbl...
"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:Oh**************@TK2MSFTNGP02.phx.gbl...
>hi,
So, here is the code to get my checksum. If you see something that can be
optimized, let me know.

thanks

ThunderMusic

Expand|Select|Wrap|Line Numbers
  1. private const UInt16 MOD_ADLER = 65521;
  2. private unsafe uint GetChecksum(byte[] databytes)
  3. {
  4.    UInt32 a = 1, b = 0;
  5.    fixed (byte* tmpdata = databytes)
  6.    {
  7.        byte* data = tmpdata;
  8.        int len = databytes.Length; /* Length in bytes */
  9.        while (len 0)
  10.        {
  11.            int tlen = len 5550 ? 5550 : len;
  12.            len -= tlen;
  13.            do
  14.            {
  15.                a += *(data++);
  16.                b += a;
  17.            } while ((--tlen) 0);
  18.            a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
  19.            b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
  20.        }
  21.        /* It can be shown that a <= 0x1013a here, so a single subtract
  22. will do. */
  23.        if (a >= MOD_ADLER)
  24.            a -= MOD_ADLER;
  25.        /* It can be shown that b can reach 0xffef1 here. */
  26.        b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
  27.        if (b >= MOD_ADLER)
  28.            b -= MOD_ADLER;
  29.    }
  30.    return ((b << 16) | a);
  31. }

You can try to do some partial loop unrolling, this is what a C++ compiler
does and what the C# compiler/JIT unfortunately fails to do.
So, why not use .NET's C++ compiler (C++/CLI)? You have no
managed/unmanaged overhead, you get pointers without pinning (interior_ptr),
and you get the best optimizing compiler available for .NET

In this particular case, Duff's Device is ideal for loop unrolling without
overhead, and only C++ can do it.

This particular function is the poster child for C++/CLI if I've ever seen
one.

public ref class Checksum
{
literal unsigned MOD_ADLER = 65521;

public:
static unsigned Adler( array<System::Byte>^ databytes )
{
unsigned a = 1, b = 0;
interior_ptr<System::Bytedata = &databytes[0];
unsigned len = databytes->Length; /* Length in bytes */
while (len 5550)
{
int iters = 555;
do {
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
a += *(data++);
b += a;
} while (--iters);
a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
len -= 5550;
}

int iters = (len + 7) >3;
switch (len & 7)
{
do {
case 0:
a += *(data++);
b += a;
case 7:
a += *(data++);
b += a;
case 6:
a += *(data++);
b += a;
case 5:
a += *(data++);
b += a;
case 4:
a += *(data++);
b += a;
case 3:
a += *(data++);
b += a;
case 2:
a += *(data++);
b += a;
case 1:
a += *(data++);
b += a;
} while (--iters);
}
a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
/* It can be shown that a <= 0x1013a here, so a single subtract will
do. */
if (a >= MOD_ADLER)
a -= MOD_ADLER;
/* It can be shown that b can reach 0xffef1 here. */
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
if (b >= MOD_ADLER)
b -= MOD_ADLER;
return ((b << 16) | a);
}
};

Aug 25 '07 #18

P: n/a
I use a StopWatch before (start) and after (stop) a loop of 100000 calls to
the function... It's not the most effective way, but it sure gives an idea
of the performance we can get...
"Willy Denoyette [MVP]" <wi*************@telenet.bewrote in message
news:OG**************@TK2MSFTNGP04.phx.gbl...
"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:u$**************@TK2MSFTNGP03.phx.gbl...
>wow!! that's pretty impressive... this little modification made the
loop 2 to 3 times faster!!

2 to 3 times faster? It's hard to believe honestly, I would have expected
something like a 20-30% improvement. Are you sure about your time
measurement techniques?

Willy.

Aug 27 '07 #19

P: n/a

"Ben Voigt [C++ MVP]" <rb*@nospam.nospamwrote in message
news:04**********************************@microsof t.com...
>
"r norman" <r_s_norman@_comcast.netwrote in message
news:dq********************************@4ax.com...
>On Thu, 23 Aug 2007 15:36:18 -0400, "ThunderMusic"
<No*************************@NoSpAm.comwrote:
>>>Hi,
The subject says it all... I want to use a byte[] and use it as byte* so
I
can increment the pointer to iterate through it.

What is the fastest way of doing so in C#?

I have recently started with C#, converting C++ applications that must
communicate with external equipment using protocols with rigidly fixed
byte sequences. Initially I was terribly frustrated abandoning my
precious pointer manipulation and access to byte arrays. But I
developed a bunch of utilities to create managed code byte[] arrays
from pieces and extract appropriate values from them and can now avoid
all unmanaged code. Look into Encoding.Convert for converting between
ASCII byte sequences and Unicode, MemoryStream.Write and ToArray along
with BinaryWriter.Write for creating byte[] data and the BitConverter
set of methods to extract values from byte[].

I keep telling myself that staying entirely within managed code is
worth all that effort, but I am still early in the process. Some old
timers here can probably fill you in on much more detail, including
the virtues of doing so.

Staying in managed code may be worth the effort, converting to C# is not.
The C++ compiler still makes better code than the JIT can.
hi,
That's an idea... I'll look into it... probably that some methods should
be more suited to C++ because I'm seeking a bit of performance for key
features in my app.. I never think about managed C++, but it's sure to be a
good idea... I don't want to make a C++ library for only one method
tought...

Thanks a lot

ThunderMusic
Aug 27 '07 #20

P: n/a
"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:Ow**************@TK2MSFTNGP02.phx.gbl...
>
"Ben Voigt [C++ MVP]" <rb*@nospam.nospamwrote in message
news:04**********************************@microsof t.com...
>>
"r norman" <r_s_norman@_comcast.netwrote in message
news:dq********************************@4ax.com.. .
>>On Thu, 23 Aug 2007 15:36:18 -0400, "ThunderMusic"
<No*************************@NoSpAm.comwrote:

Hi,
The subject says it all... I want to use a byte[] and use it as byte* so
I
can increment the pointer to iterate through it.

What is the fastest way of doing so in C#?
I have recently started with C#, converting C++ applications that must
communicate with external equipment using protocols with rigidly fixed
byte sequences. Initially I was terribly frustrated abandoning my
precious pointer manipulation and access to byte arrays. But I
developed a bunch of utilities to create managed code byte[] arrays
from pieces and extract appropriate values from them and can now avoid
all unmanaged code. Look into Encoding.Convert for converting between
ASCII byte sequences and Unicode, MemoryStream.Write and ToArray along
with BinaryWriter.Write for creating byte[] data and the BitConverter
set of methods to extract values from byte[].

I keep telling myself that staying entirely within managed code is
worth all that effort, but I am still early in the process. Some old
timers here can probably fill you in on much more detail, including
the virtues of doing so.

Staying in managed code may be worth the effort, converting to C# is not.
The C++ compiler still makes better code than the JIT can.

hi,
That's an idea... I'll look into it... probably that some methods should
be more suited to C++ because I'm seeking a bit of performance for key
features in my app.. I never think about managed C++, but it's sure to be
a good idea... I don't want to make a C++ library for only one method
tought...

Thanks a lot

ThunderMusic

Managed C# is just as fast as C++/CLI in this case, the C++/CLI compiler
(C++/CLI to IL) doesn't "unroll" the inner loop (as illustrated by Ben), at
present there is no single managed compiler that performs "loop unrolling".
The JIT compiler (common for all managed languages) doesn't care to unroll
either.

Choosing the right algorithm is key for better performance not the (managed)
language choice.
Just to give you an idea, I optimized the algorithm in the following
function using C# with pointers (fixed/unsafe = non verifiable code).

private const int BASE = 65521; /* largest prime smaller than 65536 */
private const int NMAX = 5552;
//using an unroll factor of 16,
// compile with /o+
private static uint Adler(byte[] databytes)
{
uint adler = 1;
uint s1 = adler & 0xffff;
uint s2 = (adler >16) & 0xffff;
uint k;

if (databytes == null) return 1;
uint len = (uint)databytes.Length;
unsafe
{
fixed (byte* fixeddataPtr = &databytes[0])
{
byte* ptr = fixeddataPtr;
while (len 0)
{
k = len < NMAX ? len : NMAX;
len -= k;
while (k >= 16)
{
s1 += ptr[0];
s2 += s1;
s1 += ptr[1];
s2 += s1;
s1 += ptr[2];
s2 += s1;
s1 += ptr[3];
s2 += s1;
s1 += ptr[4];
s2 += s1;
s1 += ptr[5];
s2 += s1;
s1 += ptr[6];
s2 += s1;
s1 += ptr[7];
s2 += s1;
s1 += ptr[8];
s2 += s1;
s1 += ptr[9];
s2 += s1;
s1 += ptr[10];
s2 += s1;
s1 += ptr[11];
s2 += s1;
s1 += ptr[12];
s2 += s1;
s1 += ptr[13];
s2 += s1;
s1 += ptr[14];
s2 += s1;
s1 += ptr[15];
s2 += s1;

ptr += 16;
k -= 16;
}

if (k != 0)
{
int i = -1;
do
{
s1 += ptr[++i];
s2 += s1;
} while (--k 0);
}
s1 %= BASE;
s2 %= BASE;
}
}
}
return (s2 << 16) | s1;
}

Calling this function 4000 times on my box, using a byte array of 27500
random values, takes ~ 63 msecs.
The C++/CLI code (optimized build /O2) posted by Ben takes 95 milliseconds
to run the same test.

The same algorithm used in C# using array indexing (no unsafe constructs)
takes 105 msecs, that is only 10 % slower than the C++/CLI code.

private static uint AdlerSafe(byte[] ba)
{
uint adler = 1;
uint s1 = adler & 0xffff;
uint s2 = (adler >16) & 0xffff;
uint k;

if (ba == null) return 1;
uint len = (uint)ba.Length;
int offset = 0;
ArraySegment<byteptrSegm = new ArraySegment<byte>();
while (len 0)
{
k = len < NMAX ? len : NMAX;
len -= k;

while (k >= 16)
{
ptrSegm = new ArraySegment<byte>( ba, offset, 16);
int i = ptrSegm.Offset - 1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;

offset += 16;
k -= 16;
}
if (k != 0)
{
ptrSegm = new ArraySegment<byte>( ba, offset, (int)k);
do {

s1 += ptrSegm.Array[offset++];
s2 += s1;
} while (--k 0);
}
s1 %= BASE;
s2 %= BASE;
}

return (s2 << 16) | s1;
}
See, unsafe code is the fastest, but verifiable C# may be fast enough.
Either way, I don't like this kind of hand optimizing ( nor do I like
"unsafe" C# code), it's up to the compilers to take care about this,
unfortunately at present they don't.

Willy.
Aug 27 '07 #21

P: n/a
"Willy Denoyette [MVP]" <wi*************@telenet.bewrote in message
news:%2****************@TK2MSFTNGP06.phx.gbl...
"ThunderMusic" <No*************************@NoSpAm.comwrote in message
news:Ow**************@TK2MSFTNGP02.phx.gbl...
>>
"Ben Voigt [C++ MVP]" <rb*@nospam.nospamwrote in message
news:04**********************************@microso ft.com...
>>>
"r norman" <r_s_norman@_comcast.netwrote in message
news:dq********************************@4ax.com. ..
On Thu, 23 Aug 2007 15:36:18 -0400, "ThunderMusic"
<No*************************@NoSpAm.comwrote:

>Hi,
>The subject says it all... I want to use a byte[] and use it as byte*
>so I
>can increment the pointer to iterate through it.
>
>What is the fastest way of doing so in C#?
>

I have recently started with C#, converting C++ applications that must
communicate with external equipment using protocols with rigidly fixed
byte sequences. Initially I was terribly frustrated abandoning my
precious pointer manipulation and access to byte arrays. But I
developed a bunch of utilities to create managed code byte[] arrays
from pieces and extract appropriate values from them and can now avoid
all unmanaged code. Look into Encoding.Convert for converting between
ASCII byte sequences and Unicode, MemoryStream.Write and ToArray along
with BinaryWriter.Write for creating byte[] data and the BitConverter
set of methods to extract values from byte[].

I keep telling myself that staying entirely within managed code is
worth all that effort, but I am still early in the process. Some old
timers here can probably fill you in on much more detail, including
the virtues of doing so.

Staying in managed code may be worth the effort, converting to C# is
not. The C++ compiler still makes better code than the JIT can.

hi,
That's an idea... I'll look into it... probably that some methods
should be more suited to C++ because I'm seeking a bit of performance for
key features in my app.. I never think about managed C++, but it's sure
to be a good idea... I don't want to make a C++ library for only one
method tought...

Thanks a lot

ThunderMusic


Managed C# is just as fast as C++/CLI in this case, the C++/CLI compiler
(C++/CLI to IL) doesn't "unroll" the inner loop (as illustrated by Ben),
at present there is no single managed compiler that performs "loop
unrolling". The JIT compiler (common for all managed languages) doesn't
care to unroll either.

Choosing the right algorithm is key for better performance not the
(managed) language choice.
Just to give you an idea, I optimized the algorithm in the following
function using C# with pointers (fixed/unsafe = non verifiable code).

private const int BASE = 65521; /* largest prime smaller than 65536 */
private const int NMAX = 5552;
//using an unroll factor of 16,
// compile with /o+
private static uint Adler(byte[] databytes)
{
uint adler = 1;
uint s1 = adler & 0xffff;
uint s2 = (adler >16) & 0xffff;
uint k;

if (databytes == null) return 1;
uint len = (uint)databytes.Length;
unsafe
{
fixed (byte* fixeddataPtr = &databytes[0])
{
byte* ptr = fixeddataPtr;
while (len 0)
{
k = len < NMAX ? len : NMAX;
len -= k;
while (k >= 16)
{
s1 += ptr[0];
s2 += s1;
s1 += ptr[1];
s2 += s1;
s1 += ptr[2];
s2 += s1;
s1 += ptr[3];
s2 += s1;
s1 += ptr[4];
s2 += s1;
s1 += ptr[5];
s2 += s1;
s1 += ptr[6];
s2 += s1;
s1 += ptr[7];
s2 += s1;
s1 += ptr[8];
s2 += s1;
s1 += ptr[9];
s2 += s1;
s1 += ptr[10];
s2 += s1;
s1 += ptr[11];
s2 += s1;
s1 += ptr[12];
s2 += s1;
s1 += ptr[13];
s2 += s1;
s1 += ptr[14];
s2 += s1;
s1 += ptr[15];
s2 += s1;

ptr += 16;
k -= 16;
}

if (k != 0)
{
int i = -1;
do
{
s1 += ptr[++i];
s2 += s1;
} while (--k 0);
}
s1 %= BASE;
s2 %= BASE;
}
}
}
return (s2 << 16) | s1;
}

Calling this function 4000 times on my box, using a byte array of 27500
random values, takes ~ 63 msecs.
The C++/CLI code (optimized build /O2) posted by Ben takes 95 milliseconds
to run the same test.

The same algorithm used in C# using array indexing (no unsafe constructs)
takes 105 msecs, that is only 10 % slower than the C++/CLI code.

private static uint AdlerSafe(byte[] ba)
{
uint adler = 1;
uint s1 = adler & 0xffff;
uint s2 = (adler >16) & 0xffff;
uint k;

if (ba == null) return 1;
uint len = (uint)ba.Length;
int offset = 0;
ArraySegment<byteptrSegm = new ArraySegment<byte>();
while (len 0)
{
k = len < NMAX ? len : NMAX;
len -= k;

while (k >= 16)
{
ptrSegm = new ArraySegment<byte>( ba, offset, 16);
int i = ptrSegm.Offset - 1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;
s1 += ptrSegm.Array[++i];
s2 += s1;

offset += 16;
k -= 16;
}
if (k != 0)
{
ptrSegm = new ArraySegment<byte>( ba, offset, (int)k);
do {

s1 += ptrSegm.Array[offset++];
s2 += s1;
} while (--k 0);
}
s1 %= BASE;
s2 %= BASE;
}

return (s2 << 16) | s1;
}
See, unsafe code is the fastest, but verifiable C# may be fast enough.
Either way, I don't like this kind of hand optimizing ( nor do I like
"unsafe" C# code), it's up to the compilers to take care about this,
unfortunately at present they don't.

Willy.


Sorry, above does not contain the correct verifiable C# function, here she
is........

private const int MOD_ADLER = 65521;
private static uint AdlerSafe(byte[] databytes)
{
const int unrollFactor = 16;
uint a = 1, b = 0;
int n = -1;
int len = databytes.Length;
while(len 0)
{
int tlen = len 5552 ? 5552 : len;
len -= tlen;

for(int c = 0; c < tlen / unrollFactor; c++)
{
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
}
a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);

for(int c = 0; c < tlen % unrollFactor; c++)
{
a += databytes[++n];
b += a;
}
a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
}
/* It can be shown that a <= 0x1013a here, so a single subtract will
do. */
if (a >= MOD_ADLER)
a -= MOD_ADLER;
/* It can be shown that b can reach 0xffef1 here. */
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
if (b >= MOD_ADLER)
b -= MOD_ADLER;
return ((b << 16) | a);
}
}

Aug 27 '07 #22

P: n/a
Sorry, above does not contain the correct verifiable C# function, here she
is........

private const int MOD_ADLER = 65521;
private static uint AdlerSafe(byte[] databytes)
{
const int unrollFactor = 16;
uint a = 1, b = 0;
int n = -1;
int len = databytes.Length;
while(len 0)
{
int tlen = len 5552 ? 5552 : len;
Do you get the same result after changing this number? I would guess
probably so, in which case that could make a significant difference as well.
len -= tlen;

for(int c = 0; c < tlen / unrollFactor; c++)
{
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
a += databytes[++n];
b += a;
}
a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);

for(int c = 0; c < tlen % unrollFactor; c++)
{
a += databytes[++n];
b += a;
}
a = (a & 0xffff) + (a >16) * (65536 - MOD_ADLER);
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
}
/* It can be shown that a <= 0x1013a here, so a single subtract will
do. */
if (a >= MOD_ADLER)
a -= MOD_ADLER;
/* It can be shown that b can reach 0xffef1 here. */
b = (b & 0xffff) + (b >16) * (65536 - MOD_ADLER);
if (b >= MOD_ADLER)
b -= MOD_ADLER;
return ((b << 16) | a);
}
}

Aug 28 '07 #23

P: n/a
"Ben Voigt [C++ MVP]" <rb*@nospam.nospamwrote in message
news:eR**************@TK2MSFTNGP05.phx.gbl...
>
>Sorry, above does not contain the correct verifiable C# function, here
she
is........

private const int MOD_ADLER = 65521;
private static uint AdlerSafe(byte[] databytes)
{
const int unrollFactor = 16;
uint a = 1, b = 0;
int n = -1;
int len = databytes.Length;
while(len 0)
{
int tlen = len 5552 ? 5552 : len;

Do you get the same result after changing this number? I would guess
probably so, in which case that could make a significant difference as
well.
Changing into what?
If you mean into 5550, it makes no real (measurable) difference in
performance.
Take care, this is not the true for the "optimized C# (unsafe)" sample
(based on marc adler's adler32 algorithm), here the NMAX length must be a
multiple of the "unroll factor" while NMAX is the largest n such that
255n(n+1)/2 + (n+1)(BASE-1) <= (2^32)-1 (as per the adler32 algorithm)

5552 is the largest possible value for n, while this value is perfectly
suitable for an "unroll factor" of 16.
Changing NMAX into 5550 (which is a valid number), requires you to change
the "unroll factor" into 15, else the result of the checksum will be
incorrect!

Willy.

Aug 28 '07 #24

P: n/a

"Willy Denoyette [MVP]" <wi*************@telenet.bewrote in message
news:eu**************@TK2MSFTNGP02.phx.gbl...
"Ben Voigt [C++ MVP]" <rb*@nospam.nospamwrote in message
news:eR**************@TK2MSFTNGP05.phx.gbl...
>>
>>Sorry, above does not contain the correct verifiable C# function, here
she
is........

private const int MOD_ADLER = 65521;
private static uint AdlerSafe(byte[] databytes)
{
const int unrollFactor = 16;
uint a = 1, b = 0;
int n = -1;
int len = databytes.Length;
while(len 0)
{
int tlen = len 5552 ? 5552 : len;

Do you get the same result after changing this number? I would guess
probably so, in which case that could make a significant difference as
well.

Changing into what?
If you mean into 5550, it makes no real (measurable) difference in
performance.
Take care, this is not the true for the "optimized C# (unsafe)" sample
(based on marc adler's adler32 algorithm), here the NMAX length must be a
multiple of the "unroll factor" while NMAX is the largest n such that
255n(n+1)/2 + (n+1)(BASE-1) <= (2^32)-1 (as per the adler32 algorithm)

5552 is the largest possible value for n, while this value is perfectly
suitable for an "unroll factor" of 16.
Changing NMAX into 5550 (which is a valid number), requires you to change
the "unroll factor" into 15, else the result of the checksum will be
incorrect!
No fair, you're actually familiar with the algorithm in question!
>
Willy.

Aug 29 '07 #25

This discussion thread is closed

Replies have been disabled for this discussion.