Access individual bytes of a 4 byte long (optimization)

anon.asdf

Hi!

On a machine of *given architecture* (in terms of endianness etc.), I
want to access the individual bytes of a long (*once-off*) as fast as
possible.

Is version A, version B, or version C better? Are there other
alternatives?

/**** Version A ******/
{
long mylong = -1;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n", \
(unsigned char) mylong , \
(unsigned char) (mylong >> 8), \
(unsigned char) (mylong >>16), \
(unsigned char) (mylong >>24));
}

/**** Version B ******/
{
long mylong = -1;
unsigned char f_b[4];

*((long *)&f_b) = mylong;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n", f_b[0], f_b[1], f_b[2],
f_b[3]);
}

/**** Version C ******/
{
union align_array_and_long {
unsigned char four_b[4];
long dummy;
};

long mylong = -1;
union align_array_and_long four;

four = (union align_array_and_long) mylong;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n", \
four.four_b[0], \
four.four_b[1], \
four.four_b[2], \
four.four_b[3]);
}
My feeling is that Version C is best.

What can be said about the alignment of array f_b and mylong in
Version B?
(I think that in Version B the array f_b might not be suitably aligned
for a long, in which case it is slower than C. If f_b and mylong are
aligned in Version B, is Version B then identical to Version C?)

..
..
..

Now what if one needs to access the individual bytes the *whole time*?
Is A2, B2, C2 or D2 faster?

/**** Version A2 ******/
{
long mylong = -1;
unsigned char b0, b1, b2, b3;

b0 = (unsigned char) mylong;
b1 = (unsigned char) (mylong >> 8);
b2 = (unsigned char) (mylong >>16);
b3 = (unsigned char) (mylong >>24);

// access: b0, b1, b2, b3
}

/**** Version B2 ******/
{
long mylong = -1;
unsigned char f_b[4];

*((long *)&f_b) = mylong;

// access: f_b[0], f_b[1], f_b[2], f_b[3]
}

/**** Version C2 ******/
{
union align_array_and_long {
unsigned char four_b[4];
long dummy;
};

long mylong = -1;
union align_array_and_long four;

four = (union align_array_and_long) mylong;

// access: four.four_b[0], four.four_b[1], four.four_b[2],
//         four.four_b[3]
}

/**** Version D2 ******/
{
struct four_struct {
unsigned char byte0;
unsigned char byte1;
unsigned char byte2;
unsigned char byte3;
};

union align_array_and_long {
struct four_struct four_s;
long dummy;
};

long mylong = -1;
union align_array_and_long four;

four = (union align_array_and_long) mylong;

// access: four.four_s.byte0, four.four_s.byte1,
//         four.four_s.byte2, four.four_s.byte3

}

My feeling is that Version D2 is best: mylong is loaded into four in
one shot (no shifts etc. as in A2).

And in D2 the compiler always knows that we specify exactly which byte
we want:
four.four_s.byte0
This is different in C2: four.four_b[which_byte]
Or is it really different? :
are these 2 equivalent: four.four_s.byte0 <--> four.four_b[0] ???

..
..
..

Version A and A2 are portable in terms of endianness, but the question
is not about portability - it's about optimization for a given
platform.

Thanks.

anon.asdf

Aug 10 '07 #1


Chris Dollin

an*******@gmail.com wrote:

On a machine of *given architecture* (in terms of endianness etc.), I
want to access the individual bytes of a long (*once-off*) as fast as
possible.

Is version A, version B, or version C better?

Measure them and find out.

Now what if one needs to access the individual bytes the *whole time*?
Is A2, B2, C2 or D2 faster?

Measure them and find out.

--
Chris "performance is nothing without measurement" Dollin

Hewlett-Packard Limited registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN

Aug 10 '07 #2

Richard Bos

an*******@gmail.com wrote:

On a machine of *given architecture* (in terms of endianness etc.), I
want to access the individual bytes of a long (*once-off*) as fast as
possible.

Is version A, version B, or version C better?

Mu.

Rule one of micro-optimisation:
Don't Do It.
Rule two of micro-optimisation (for experts only!):
Don't Do It Yet.
Rule three of micro-optimisation (only under duress):
Measure, Measure, Measure.

Unless you _know_ that it matters, assume that it doesn't, and write the
clearest code. If you think you do know that it matters, first gather
evidence. Only by measuring which is the fastest will you know which is
the fastest - on your machine, using your implementation, in your
project, under your optimisation settings. And don't be surprised to
find out that you were wrong, and the difference is no more than 0.5%,
with an error of 1%.
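
To make "measure" concrete, here is a rough timing sketch, not a rigorous benchmark. The function names (sum_bytes_shift, sum_bytes_memcpy, seconds_for) are made up for illustration, and clock() only reports CPU time at coarse resolution, so treat any numbers as indicative at best.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <time.h>

/* Candidate 1: sum the bytes of x using shifts on an unsigned copy. */
unsigned sum_bytes_shift(long x)
{
    unsigned long u = (unsigned long)x;
    unsigned sum = 0;
    size_t i;

    for (i = 0; i < sizeof u; i++)
        sum += (unsigned)((u >> (i * CHAR_BIT)) & UCHAR_MAX);
    return sum;
}

/* Candidate 2: sum the bytes of x by copying them out with memcpy. */
unsigned sum_bytes_memcpy(long x)
{
    unsigned char b[sizeof x];
    unsigned sum = 0;
    size_t i;

    memcpy(b, &x, sizeof x);
    for (i = 0; i < sizeof b; i++)
        sum += b[i];
    return sum;
}

/* Returns the CPU seconds spent calling fn on `iterations` inputs.
   The volatile sink keeps the calls from being optimised away. */
double seconds_for(unsigned (*fn)(long), long iterations)
{
    volatile unsigned sink = 0;
    clock_t start, stop;
    long i;

    assert(iterations >= 0);
    start = clock();
    for (i = 0; i < iterations; i++)
        sink += fn(i);
    stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Comparing seconds_for(sum_bytes_shift, 100000000L) with seconds_for(sum_bytes_memcpy, 100000000L), built with your real optimisation settings, is the kind of evidence meant above. An optimising compiler may well make the two indistinguishable.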

Richard

Aug 10 '07 #3

Ben Bacarisse

an*******@gmail.com writes:

Hi!

On a machine of *given architecture* (in terms of endianness etc.), I
want to access the individual bytes of a long (*once-off*) as fast as
possible.

Is version A, version B, or version C better? Are there other
alternatives?

/**** Version A ******/
{
long mylong = -1;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n", \
(unsigned char) mylong , \
(unsigned char) (mylong >> 8), \
(unsigned char) (mylong >>16), \
(unsigned char) (mylong >>24));
}

/**** Version B ******/
{
long mylong = -1;
unsigned char f_b[4];

*((long *)&f_b) = mylong;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n", f_b[0], f_b[1], f_b[2],
f_b[3]);
}

/**** Version C ******/
{
union align_array_and_long {
unsigned char four_b[4];
long dummy;
};

long mylong = -1;
union align_array_and_long four;

four = (union align_array_and_long) mylong;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n", \
four.four_b[0], \
four.four_b[1], \
four.four_b[2], \
four.four_b[3]);
}
My feeling is the Version C is best.

For the fastest, try:

printf("0x%08lx\n", mylong); /* :-) */

Versions B and C invoke undefined behaviour. The defined way to do
version B is:

void *vp = &mylong;
unsigned char *cp = vp;
/* now do what you want with cp[0] to cp[sizeof long] */

There is no need to lie about having an array. Version C is very
likely to work, but the standard does not guarantee accesses to any
union member other than the last one assigned to (barring the special
exception for "common initial members").

Similar comments apply to your other code fragments.
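
A minimal sketch of the pointer approach, with the upper index bound spelled out as sizeof(long) - 1. The helper name long_byte_at is hypothetical. The bytes come back in memory order, so which index holds the least significant byte depends on the platform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns byte number `index` of the object
   representation of *value, in memory order (so the result depends
   on the platform's endianness). */
unsigned char long_byte_at(const long *value, size_t index)
{
    /* Any object may be inspected through an unsigned char pointer. */
    const unsigned char *cp = (const unsigned char *)value;

    assert(index < sizeof *value);   /* valid indices: 0 .. sizeof(long) - 1 */
    return cp[index];
}
```

Because no array is declared and nothing is copied, there is no alignment question here: the pointer simply walks over the long where it already lives.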

--
Ben.

Aug 10 '07 #4

Eric Sosman

an*******@gmail.com wrote:

Hi!

On a machine of *given architecture* (in terms of endianness etc.), I
want to access the individual bytes of a long (*once-off*) as fast as
possible.

1) Your question is about your environment -- machine(s),
compiler(s), O/S(es), etc. -- and not about C. Seek a forum
where the experts on your environment hang out.

2) If "as fast as possible" is really your goal, you
should not be using C, nor even assembly. Custom-built
hardware is the way to go. Seek a forum where chip designers
hang out.

3) This is the second time in recent days that you've
given "I want" as the only reason for doing something. You
may not understand it yet, but the context of the "I want"
can often have a huge influence on the speed of whatever code
you wind up with. For example: Is this long just sitting
around in memory, or is it the result of a recent computation
and perhaps still available in a register? Seek a forum where
compiler experts hang out.

--
Eric Sosman
es*****@ieee-dot-org.invalid

Aug 10 '07 #5

anon.asdf

For the fastest, try:

printf("0x%08lx\n", mylong); /* :-) */

Versions B and C, invoke undefined behaviour. The defined way to do
version B is:

void *vp = &mylong;
unsigned char *cp = vp;
/* now do what you want with cp[0] to cp[sizeof long] */

Very good comment, about using a pointer that way!!
Thanks!
anon.asdf

Aug 10 '07 #6

Army1987

On Fri, 10 Aug 2007 03:35:38 -0700, anon.asdf wrote:

Hi!

On a machine of *given architecture* (in terms of endianness etc.), I
want to access the individual bytes of a long (*once-off*) as fast as
possible.

Is version A, version B, or version C better? Are there other
alternatives?

/**** Version A ******/
{
long mylong = -1;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n", \
(unsigned char) mylong , \
(unsigned char) (mylong >> 8), \
(unsigned char) (mylong >>16), \
(unsigned char) (mylong >>24));

Bitwise shifts on negative integers are implementation-defined,
and that needn't have anything to do with endianness.

}

/**** Version B ******/
{
long mylong = -1;
unsigned char f_b[4];

*((long *)&f_b) = mylong;

#include <string.h>
memcpy(f_b, &mylong, 4);
This does the same thing you were trying to do, without the risk
of disasters if f_b doesn't happen to be correctly aligned for a
long.

printf("0x%02x 0x%02x 0x%02x 0x%02x\n", f_b[0], f_b[1], f_b[2],
f_b[3]);
}

/**** Version C ******/
{
union align_array_and_long {
unsigned char four_b[4];
long dummy;
};

long mylong = -1;
union align_array_and_long four;

four = (union align_array_and_long) mylong;

You meant four.dummy = -1? You can only cast into a scalar type,
which a union is not.
Also, accessing a member of a union other than the last one
written in is UB, so I think the compiler is allowed to optimize
away an assignment to four.dummy if its value is not used.

printf("0x%02x 0x%02x 0x%02x 0x%02x\n", \
four.four_b[0], \
four.four_b[1], \
four.four_b[2], \
four.four_b[3]);
}

[snip]

Version A and A2 are portable in terms of endianness, but the question
is not about portability - it's about optimization for a given platform.

Whoever implemented memcpy on your platform is likely to know what
is more efficient on that specific platform better than you do.
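
As a sketch of the memcpy route, assuming a hypothetical helper named long_to_bytes and sizing the buffer with sizeof(long) instead of a hard-coded 4:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies the object representation of `value`
   into `out`, which must have room for sizeof(long) bytes.
   memcpy has no alignment requirement on either argument. */
void long_to_bytes(long value, unsigned char out[sizeof(long)])
{
    assert(out != NULL);
    memcpy(out, &value, sizeof value);
}
```

The bytes land in memory order, so out[0] is the least significant byte only on a little-endian machine. Many compilers turn a small fixed-size memcpy like this into a plain load and store, but that is worth checking on your own implementation.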
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 10 '07 #7

Walter Roberson

In article <87************@bsb.me.uk>,
Ben Bacarisse <be********@bsb.me.uk> wrote:

>The defined way to do
version B is:

void *vp = &mylong;
unsigned char *cp = vp;
/* now do what you want with cp[0] to cp[sizeof long] */

Make that cp[sizeof long - 1]
--
Programming is what happens while you're busy making other plans.

Aug 10 '07 #8

Ben Bacarisse

ro******@ibd.nrc-cnrc.gc.ca (Walter Roberson) writes:

In article <87************@bsb.me.uk>,
Ben Bacarisse <be********@bsb.me.uk> wrote:
>>The defined way to do
version B is:

void *vp = &mylong;
unsigned char *cp = vp;
/* now do what you want with cp[0] to cp[sizeof long] */

Make that cp[sizeof long - 1]

Of course, thanks.

--
Ben.

Aug 10 '07 #9

Chris Torek

In article <11*********************@j4g2000prf.googlegroups.com>
<an*******@gmail.com> wrote:

>On a machine of *given architecture* ...

OK, I give you "MIPS" as the architecture (using the MIPS compilers).

>... I want to access the individual bytes of a long (*once-off*)
as fast as possible.

Oops, now you have to decide whether this is a 32-bit MIPS (ILP32
model) or a 64-bit MIPS (I32LP64 model -- i.e., long is eight 8-bit
bytes long).

>Is version A, version B, or version C better?

[where A is shift-and-mask, and B and C go through RAM]

On most compilers, version A will be *far* faster than almost
anything else. In fact, since your original code fragment had the
variable set to a constant, if you compile with optimization, the
four or eight extracted sub-parts will also be constants.
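
A sketch of shift-and-mask that also sidesteps the implementation-defined right shift of a negative value by working on an unsigned copy. The helper name long_value_byte is hypothetical, and the sketch assumes unsigned long has no padding bits.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: extracts byte `n` by value, where n == 0 is the
   least significant byte regardless of how the long is stored in memory. */
unsigned char long_value_byte(long value, unsigned n)
{
    /* Converting to unsigned long is well defined (reduction modulo
       ULONG_MAX + 1), so no negative value is ever shifted. */
    unsigned long u = (unsigned long)value;

    assert(n < sizeof u);
    return (unsigned char)((u >> (n * CHAR_BIT)) & UCHAR_MAX);
}
```

With a constant argument, as in the original fragments, an optimising compiler can typically fold each call down to a constant, and the same code works whether long is four or eight bytes.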

Interesting side note: if the architecture is changed to the original
DEC (now Compaq) Alpha, "byte" accesses to RAM are handled in the
compiler by doing full 8-byte machine-word accesses and then using
shift-and-mask instructions, because that is how the machine *has*
to do it. (There are special instructions like "zap" for working
with the eight 8-bit "byte fields" of a register, but loads and
stores are always full 64-bit operations.)

(The MIPS architecture is a lot more common though, as it is found
in various home gaming systems.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Aug 10 '07 #10

pete

an*******@gmail.com wrote:

>
Hi!

On a machine of *given architecture* (in terms of endianness etc.), I
want to access the individual bytes of a long (*once-off*) as fast as
possible.

/* BEGIN new.c */

#include <stdio.h>

int main (void)
{
long mylong = 0x12345678;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n",
((unsigned char *)&mylong)[0],
((unsigned char *)&mylong)[1],
((unsigned char *)&mylong)[2],
((unsigned char *)&mylong)[3]);
return 0;
}

/* END new.c */
--
pete

Aug 10 '07 #11

Walter Roberson

In article <46**********@mindspring.com>, pete <pf*****@mindspring.com> wrote:

>an*******@gmail.com wrote:

>#include <stdio.h>

>int main (void)
{
long mylong = 0x12345678;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n",
((unsigned char *)&mylong)[0],
((unsigned char *)&mylong)[1],
((unsigned char *)&mylong)[2],
((unsigned char *)&mylong)[3]);
return 0;
}

What if sizeof(long) > 4 ?
--
"It is important to remember that when it comes to law, computers
never make copies, only human beings make copies. Computers are given
commands, not permission. Only people can be given permission."
-- Brad Templeton

Aug 10 '07 #12

pete

Walter Roberson wrote:

>
In article <46**********@mindspring.com>, pete <pf*****@mindspring.com> wrote:
an*******@gmail.com wrote:

#include <stdio.h>

int main (void)
{
long mylong = 0x12345678;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n",
((unsigned char *)&mylong)[0],
((unsigned char *)&mylong)[1],
((unsigned char *)&mylong)[2],
((unsigned char *)&mylong)[3]);
return 0;
}

What if sizeof(long) > 4 ?

/* BEGIN new.c */

#include <stdio.h>
#include <assert.h>

int main (void)
{
long mylong = 0x12345678;

assert(sizeof(long) == 4);
printf("0x%02x 0x%02x 0x%02x 0x%02x\n",
((unsigned char *)&mylong)[0],
((unsigned char *)&mylong)[1],
((unsigned char *)&mylong)[2],
((unsigned char *)&mylong)[3]);
return 0;
}

/* END new.c */

--
pete

Aug 10 '07 #13

¬a\\/b

On Fri, 10 Aug 2007 22:43:12 +0000 (UTC), (Walter Roberson) wrote:

>In article, pete wrote:
>>#include <stdio.h>

>>int main (void)
{
long mylong = 0x12345678;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n",
((unsigned char *)&mylong)[0],
((unsigned char *)&mylong)[1],
((unsigned char *)&mylong)[2],
((unsigned char *)&mylong)[3]);
return 0;
}

What if sizeof(long) > 4 ?

what if this?

#include <stdio.h>
#include <limits.h>

int main (void)
{long mylong=0x123456789abcdef, prova=0xFF12;
unsigned char *a; int i;

if(CHAR_BIT!=8) return 0;

a= (char*) &mylong;
printf("Valore 0X%x\n",
(unsigned) ((unsigned char*)&prova)[sizeof(long)-1]);

if( ((unsigned char*)&prova)[sizeof(long)-1] == 0x12)
{for(i=0; i<sizeof(long); ++i)
printf("0x%02x ", (unsigned) a[i]);
}
else {for(i=sizeof(long)-1; i>=0; --i)
printf("0x%02x ", (unsigned) a[i]);
}

printf("\n");
return 0;
}

Or this? How many instances of UB do you find?
I find one in the first example and none in the one below.

#include <stdio.h>
#include <limits.h>

int main (void)
{long mylong=0x123456789abcdef;
unsigned long prova, r;
unsigned char *a;
int i;

if(CHAR_BIT!=8) return 0;
prova=0xFF;
for(i=sizeof(long)-1, prova<<=i*8; i>=0 ; prova>>=8, --i)
{r=((unsigned long)mylong & prova)>>(i*8);
printf("0x%02x ", r);
}

printf("\n");
return 0;
}

Aug 11 '07 #14

¬a\\/b

On Sat, 11 Aug 2007 09:36:17 +0200, "¬a\\/b" <al@f.g> wrote:

>Or this? How many instances of UB do you find?
I find one in the first example and none in the one below.

UB in the sense that the implementation gives either the correct result
or nothing (CHAR_BIT != 8).

>#include <stdio.h>
#include <limits.h>

int main (void)
{long mylong=0x123456789abcdef;
unsigned long prova, r;
unsigned char *a;
int i;

if(CHAR_BIT!=8) return 0;
prova=0xFF;
for(i=sizeof(long)-1, prova<<=i*8; i>=0 ; prova>>=8, --i)
{r=((unsigned long)mylong & prova)>>(i*8);
printf("0x%02x ", r);

okok printf("0x%02x ", (unsigned) r);

}

printf("\n");
return 0;
}

Don't take it all too seriously; it's summertime and I have to say
something :)

Aug 11 '07 #15

Army1987

On Fri, 10 Aug 2007 22:43:12 +0000, Walter Roberson wrote:

In article <46**********@mindspring.com>, pete <pf*****@mindspring.com> wrote:
>>an*******@gmail.com wrote:

>>#include <stdio.h>

>>int main (void)
{
long mylong = 0x12345678;

printf("0x%02x 0x%02x 0x%02x 0x%02x\n",
((unsigned char *)&mylong)[0],
((unsigned char *)&mylong)[1],
((unsigned char *)&mylong)[2],
((unsigned char *)&mylong)[3]);
return 0;
}

What if sizeof(long) > 4 ?

#include <stdio.h>
int main(void)
{
long mylong = 0x12345678;
unsigned char *ptr;
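/* casts added: long * does not convert implicitly to unsigned char * */
for (ptr = (unsigned char *)&mylong; ptr < (unsigned char *)(&mylong + 1); ptr++)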
printf("0x%02x ", *ptr);
putchar('\n');
return 0;
}
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 11 '07 #16

pete

¬a\/b wrote:

>
On Sat, 11 Aug 2007 09:36:17 +0200, "¬a\\/b" <al@f.g> wrote:

Or this? How many instances of UB do you find?
I find one in the first example and none in the one below.

UB in the sense that the implementation gives either the correct result
or nothing (CHAR_BIT != 8).

That's implementation defined behavior.

>
#include <stdio.h>
#include <limits.h>

int main (void)
{long mylong=0x123456789abcdef;

The result of assigning a 57-bit integer value to a long too narrow
to hold it is also implementation-defined.

--
pete

Aug 11 '07 #17
