Use of memcpy() to transfer from memory to a variable

For reasons I won't go into, I need to transfer from 1 to 3 bytes to a
variable that I know is 4 bytes long. Bytes not written to in the 4-byte
target variable must be zero. Is the following use of memcpy() a
well-defined way of so doing? The code is written knowing that
sizeof(unsigned long) == 4 in this instance. The code is somewhat contrived
in order to provide a self-contained program that will compile and show the
use of memcpy() I am asking about.

The following code clean compiles using

gcc -Wall -ansi -pedantic

where

gcc -dumpversion displays

4.10.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    unsigned long value;
    unsigned char *pAddress;
    unsigned char rem;

    pAddress = malloc(3);
    if ( pAddress == NULL )
    {
        puts("malloc failed");
        exit(EXIT_FAILURE);
    }

    /* load in some arbitrary values */
    pAddress[0] = 1;
    pAddress[1] = 2;
    pAddress[2] = 4;
    rem = 3; /* hard-wired for demo - normally calculated */

    if ( rem )
    {
        value = 0;
        memcpy( &value, pAddress, rem );
    }

    /* for demo - shows values in pAddress have been transferred */
    printf("value = 0x%0lX\n", value);

    return 0;
}

--
Martin

May 18 '07 #1
[In a model post, much of which I've snipped for brevity but please do
see the original rather than deducing fault on his part from omissions
on mine]...

Martin said:
For reasons I won't go into, I need to transfer from 1 to 3 bytes to a
variable that I know is 4 bytes long. Bytes not written to in the
4-byte target variable must be zero. Is the following use of memcpy()
a well-defined way of so doing?
No, but it's not exactly undefined either.
unsigned long value;
unsigned char *pAddress;
unsigned char rem;

pAddress = malloc(3);
if ( pAddress == NULL )
{
So far so good (and <snip>)
/* load in some arbitrary values */
pAddress[0] = 1;
pAddress[1] = 2;
pAddress[2] = 4;
rem = 3; /* hard-wired for demo - normally calculated */

if ( rem )
{
value = 0;
memcpy( &value, pAddress, rem );
Okay, this is certainly legal, given that sizeof(unsigned long) is 4, as
stated in your article.

But what do you actually get? Answer: it depends on the byte ordering
that pertains to your implementation. If you have little endian
integers, you'll get one result, and if you have big-endian, you'll get
another. And of course there are various flavours of middle-endian.

So if you're not too fussy about what value 'value' contains, or if
you're happy that it's correct on your implementation and you're not
worried about porting, you're fine.
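
A minimal sketch of the point above, not taken from the original post: the helper names load_partial and is_little_endian are made up for illustration. It copies n bytes into a zeroed unsigned long; the stored bytes are the same everywhere, but the integer value they represent depends on byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the first n bytes of src into a zeroed
   unsigned long, so that bytes not written remain zero. */
unsigned long load_partial(const unsigned char *src, size_t n)
{
    unsigned long value = 0;

    assert(n <= sizeof value);   /* never write past the object */
    memcpy(&value, src, n);
    return value;
}

/* Hypothetical helper: returns 1 if the least significant byte of an
   unsigned long is stored at the lowest address. */
int is_little_endian(void)
{
    unsigned long one = 1;
    unsigned char first;

    memcpy(&first, &one, 1);
    return first == 1;
}
```

With the bytes 1, 2, 4 and n == 3, a little-endian implementation should give 0x040201, while a big-endian one with a 4-byte unsigned long should give 0x01020400.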

HTH, HAND.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
May 18 '07 #2

"Martin" <martin.o_brien@[no-spam]which.net> wrote in message
news:Zv*****************@fe168.usenetserver.com...
For reasons I won't go into, I need to transfer from 1 to 3 bytes to a
variable that I know is 4 bytes long. Bytes not written to in the 4-byte
target variable must be zero. Is the following use of memcpy() a
well-defined way of so doing?
memcpy() copies bytes so it will always produce the same bit pattern in the
result. As others have said, depending on the "endianness" this may yield
different integer values. I just wonder if using a struct with a union may
produce more understandable code?
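
One way the union idea might look, as a sketch only: the type name packed_value and the function load_via_union are invented here. Note that C89 makes the result of reading a union member other than the one last stored implementation-defined, so this is no more portable than the memcpy() version and has the same byte-order dependence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: the same storage viewed as one unsigned long or
   as its individual bytes. */
union packed_value {
    unsigned long value;
    unsigned char bytes[sizeof(unsigned long)];
};

/* Hypothetical helper: zero the whole object, then fill in the first
   n bytes one at a time. */
unsigned long load_via_union(const unsigned char *src, size_t n)
{
    union packed_value u;
    size_t i;

    assert(n <= sizeof u.bytes);
    u.value = 0;                 /* bytes not written stay zero */
    for (i = 0; i < n; i++)
        u.bytes[i] = src[i];
    return u.value;              /* value depends on byte order */
}
```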

May 19 '07 #3
Martin wrote:
For reasons I won't go into, I need to transfer from 1 to 3 bytes to a
variable that I know is 4 bytes long. Bytes not written to in the 4-byte
target variable must be zero. Is the following use of memcpy() a
well-defined way of so doing? The code is written knowing that
sizeof(unsigned long) == 4 in this instance. The code is somewhat
contrived in order to provide a self-contained program that will compile
and show the use of memcpy() I am asking about.
So what's wrong with something like

unsigned long value =
(unsigned long) byteA
| ((unsigned long) byteB << 8)
| ((unsigned long) byteC << 16)
;

where bytes A, B, and C are the bytes you want to transfer to
the low, low-middle, and high-middle parts of `value`?

[This assumes 8-bit bytes.]

Advantages (a) bypasses endianness issues (b) no need to muck
around with `memcpy`.

--
Signed Hedgehog
"It took a very long time, much longer than the most generous estimates."
- James White, /Sector General/

May 19 '07 #4

"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news:ja*********************@bt.com...
Okay, this is certainly legal, given that sizeof(unsigned long) is 4, as
stated in your article.

But what do you actually get? Answer: it depends on the byte ordering
that pertains to your implementation. If you have little endian
integers, you'll get one result, and if you have big-endian, you'll get
another. And of course there are various flavours of middle-endian.

So if you're not too fussy about what value 'value' contains, or if
you're happy that it's correct on your implementation and you're not
worried about porting, you're fine.
Indeed, I'm not fussy about the actual value; it is correct on my
implementation and portability is not an issue. I really want to transfer a
sequence of bytes from memory into a "container" which holds four bytes. So
you have confirmed for me my use of memcpy() does not invoke undefined
behaviour.

Many thanks.

--
Martin

May 21 '07 #5
Subject: Re: Use of memcpy() to transfer from memory to a variable
From: Peter 'Shaggy' Haywood <ph******@alphalink.com.au>
Newsgroups: comp.lang.c
Date: Mon, 21 May 2007 13:32:23 +1000

Groovy hepcat Chris Dollin <eh@electrichedgehog.net> was jivin' on Sat, 19
May 2007 19:29:24 +0000. It's a cool scene! Dig it.
Martin wrote:
>For reasons I won't go into, I need to transfer from 1 to 3 bytes to a
variable that I know is 4 bytes long. Bytes not written to in the
4-byte target variable must be zero. Is the following use of memcpy() a
well-defined way of so doing? The code is written knowing that
sizeof(unsigned long) == 4 in this instance. The code is somewhat
contrived in order to provide a self-contained program that will
compile and show the use of memcpy() I am asking about.

So what's wrong with something like

unsigned long value =
(unsigned long) byteA
| ((unsigned long) byteB << 8)
| ((unsigned long) byteC << 16)
;

where bytes A, B, and C are the bytes you want to transfer to the low,
low-middle, and high-middle parts of `value`?

[This assumes 8-bit bytes.]

Advantages (a) bypasses endianness issues (b) no need to muck around
with `memcpy`.
Disadvantage: relies on 8 bit bytes.
A better solution would use CHAR_BIT from limits.h instead of hard
coding magic numbers.

#include <limits.h>
....
unsigned long value = (unsigned long)byteA |
(unsigned long)byteB << CHAR_BIT |
(unsigned long)byteC << 2 * CHAR_BIT;

--
Dig the sig!

----------- Peter 'Shaggy' Haywood ------------
Ain't I'm a dawg!!

May 25 '07 #6
Richard Heathfield writes:
Martin said:
>For reasons I won't go into, I need to transfer from 1 to 3 bytes to a
variable that I know is 4 bytes long. Bytes not written to in the
4-byte target variable must be zero. Is the following use of memcpy()
a well-defined way of so doing?

No, but it's not exactly undefined either.
It is undefined if it produces a trap representation in the 'long'
variable, owing either to padding bits in the long type or, on a
sign/magnitude host, to negative zero, which can be a trap representation.

Not exactly common, but it's possible. And if you use unsigned long
instead, it's possible to trap it at compile time by checking that
ULONG_MAX uses all the bits in a long.
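
A run-time sketch of that check, assuming nothing beyond <limits.h>; the function names are invented for illustration. It counts the value bits in ULONG_MAX and compares them with the size of the object in bits. A true compile-time version would need the bit count as a constant expression, which is left out here.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: number of value bits in unsigned long,
   counted by shifting ULONG_MAX down to zero. */
unsigned ulong_value_bits(void)
{
    unsigned long max = ULONG_MAX;
    unsigned bits = 0;

    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* Returns nonzero if the object has more bits than ULONG_MAX uses,
   i.e. if unsigned long has padding bits. */
int ulong_has_padding(void)
{
    return ulong_value_bits() != sizeof(unsigned long) * CHAR_BIT;
}

/* Abort early if filling the object byte by byte could be unsafe. */
void require_no_padding(void)
{
    assert(!ulong_has_padding());
}
```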
> if ( rem )
{
value = 0;
memcpy( &value, pAddress, rem );
}

/* for demo - shows values in pAddress have been transferred */
printf("value = 0x%0lX\n", value);
Note that you need to look out for C's aliasing rules in code like that.
I _think_ it's safe when pAddress holds a malloced char array, but
otherwise the compiler would be allowed to "know" that pAddress does not
hold a long and 'value' thus is not set to a long after 'value = 0;'.
Then it could optimize away the memcpy, since it can tell the
destination value is not used (validly).

Compilers get smarter all the time, I remember a recent comment from the
gcc guys about some other hack: "We are working on a feature which will
break your hack."

Sigh. I guess we'll have to throw away hash functions which read
aligned data in larger chunks than byte by byte soon... Or maybe it
would help to declare the input values volatile.

--
Regards,
Hallvard
May 25 '07 #7
Hallvard B Furuseth said:
Richard Heathfield writes:
>Martin said:
>>For reasons I won't go into, I need to transfer from 1 to 3 bytes to
a variable that I know is 4 bytes long. Bytes not written to in the
4-byte target variable must be zero. Is the following use of
memcpy() a well-defined way of so doing?

No, but it's not exactly undefined either.

It is if it produces a trap representation in the 'long' variable.
That isn't possible in C89, of course, since there's no such thing as a
trap representation in C89.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
May 25 '07 #8
"Chris Dollin" <eh@electrichedgehog.net> wrote in message
news:oe******************@fe3.news.blueyonder.co.uk...
So what's wrong with something like

unsigned long value =
(unsigned long) byteA
| ((unsigned long) byteB << 8)
| ((unsigned long) byteC << 16)
;

where bytes A, B, and C are the bytes you want to transfer to
the low, low-middle, and high-middle parts of `value`?

[This assumes 8-bit bytes.]

Advantages (a) bypasses endianness issues (b) no need to muck
around with `memcpy`.

Thanks for that Chris. I can see the byte order issue solved by it, but it
doesn't seem any easier (or worse) than using memcpy() - plus memcpy()'s
third argument specifies how many bytes I want to transfer whereas I'd have
to do some extra coding to accommodate that with your method.
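
For what it's worth, the extra coding for a variable byte count can be a short loop. This is only a sketch: assemble_bytes is a made-up name, and it assumes unsigned long has no padding bits so that every shift count stays below the width of the type.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: build a value from n bytes, with src[0] as the
   least significant byte, independent of the machine's byte order. */
unsigned long assemble_bytes(const unsigned char *src, size_t n)
{
    unsigned long value = 0;
    size_t i;

    assert(n <= sizeof value);   /* keeps each shift count in range */
    for (i = 0; i < n; i++)
        value |= (unsigned long)src[i] << (i * CHAR_BIT);
    return value;
}
```

Unlike memcpy(), this gives the same value on every implementation with 8-bit bytes: the bytes 1, 2, 4 with n == 3 produce 0x040201.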

--
Martin

May 25 '07 #9
Hallvard wrote:
>It is if it produces a trap representation in the 'long' variable.
Richard Heathfield replied:
That isn't possible in C89, of course, since there's no such thing as a
trap representation in C89.

I am using a C89 compiler so it seems well-defined to do this then,
excellent.

--
Martin

May 25 '07 #10
"Martin" <martin.o_brien@[no-spam]which.net> writes:
Hallvard wrote:
>>It is if it produces a trap representation in the 'long' variable.

Richard Heathfield replied:
>That isn't possible in C89, of course, since there's no such thing as a
trap representation in C89.

I am using a C89 compiler so it seems well-defined to do this then,
excellent.
I don't think it's well-defined in C89, though it may happen to be
safe under some particular implementation.

C89/C90 doesn't use the concept of "trap representation", but it does
have "indeterminately valued objects". Here's (part of) the C90
definition of "undefined behavior":

3.16 undefined behavior: Behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of indeterminately
valued objects, for which this International Standard imposes no
requirements.
[...]

It seems to me that a conforming C90 implementation can have the
equivalent of "trap representations", even though the C90 standard
doesn't use that term.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
May 26 '07 #11
Richard Heathfield wrote:
Hallvard B Furuseth said:
>Richard Heathfield writes:
>>Martin said:

For reasons I won't go into, I need to transfer from 1 to 3
bytes to a variable that I know is 4 bytes long. Bytes not
written to in the 4-byte target variable must be zero. Is the
following use of memcpy() a well-defined way of so doing?

No, but it's not exactly undefined either.

It is if it produces a trap representation in the 'long' variable.

That isn't possible in C89, of course, since there's no such thing
as a trap representation in C89.
How can you say that? A C89/C90 int can have trap values,
independent of the actual code employed. They normally won't be
created without such things as overflow, division by zero,
arbitrary copies from byte values etc. Something is functioning
quite differently in our two minds.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>
<http://www.aaxnet.com/editor/edit043.html>
<http://kadaitcha.cx/vista/dogsbreakfast/index.html>
cbfalconer at maineline dot net

--
Posted via a free Usenet account from http://www.teranews.com

May 26 '07 #12
Martin said:
Hallvard wrote:
>>It is if it produces a trap representation in the 'long' variable.

Richard Heathfield replied:
>That isn't possible in C89, of course, since there's no such thing as
a trap representation in C89.


I am using a C89 compiler so it seems well-defined to do this then,
excellent.
Well, I refer you to my earlier reply, in which I said that it is /not/
well-defined. See <ja*********************@bt.com> for more details.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
May 26 '07 #13
CBFalconer said:
Richard Heathfield wrote:
>Hallvard B Furuseth said:
<snip>
>>>
It is [undefined] if it produces a trap representation in the
'long' variable.

That isn't possible in C89, of course, since there's no such thing
as a trap representation in C89.

How can you say that? A C89/C90 int can have trap values,
independent of the actual code employed.
Chapter and verse, please.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
May 26 '07 #14
Richard Heathfield <rj*@see.sig.invalid> writes:
CBFalconer said:
>Richard Heathfield wrote:
>>Hallvard B Furuseth said:
<snip>
>>>>
It is [undefined] if it produces a trap representation in the
'long' variable.

That isn't possible in C89, of course, since there's no such thing
as a trap representation in C89.

How can you say that? A C89/C90 int can have trap values,
independent of the actual code employed.

Chapter and verse, please.
There is no direct C&V in C89/C90, since that standard doesn't define
the term "trap value". (Then again, neither does C99, but C99 does
define "trap representation", which is what we're really talking
about.)

But the concept is there implicitly, I think.

C90 6.5.7:

If an object that has automatic storage duration is not
initialized explicitly, its value is indeterminate.

C90 3.16:

Undefined behavior: behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of
indeterminately-valued objects, for which the Standard imposes no
requirements.
[...]

In C99, a trap representation is one that causes undefined behavior if
the program attempts to access it. C90 had the same concept, but not
the same term. C99 didn't really change the semantics, it just made
it more explicit and nailed down the terminology.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
May 26 '07 #15
Keith Thompson wrote:
Richard Heathfield <rj*@see.sig.invalid> writes:
>CBFalconer said:
>>Richard Heathfield wrote:
Hallvard B Furuseth said:
<snip>
>>>>>
It is [undefined] if it produces a trap representation in the
'long' variable.

That isn't possible in C89, of course, since there's no such thing
as a trap representation in C89.

How can you say that? A C89/C90 int can have trap values,
independent of the actual code employed.

Chapter and verse, please.

There is no direct C&V in C89/C90, since that standard doesn't define
the term "trap value". (Then again, neither does C99, but C99 does
define "trap representation", which is what we're really talking
about.)

But the concept is there implicitly, I think.

C90 6.5.7:

If an object that has automatic storage duration is not
initialized explicitly, its value is indeterminate.

C90 3.16:

Undefined behavior: behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of
indeterminately-valued objects, for which the Standard imposes no
requirements.
[...]

In C99, a trap representation is one that causes undefined behavior if
the program attempts to access it. C90 had the same concept, but not
the same term. C99 didn't really change the semantics, it just made
it more explicit and nailed down the terminology.
If you set the bytes that make up an object to specific values, the object
is initialised, so in C90, it is then allowed to be read, regardless of
what values you used for the representation, right?
May 26 '07 #16
Harald van Dijk <tr*****@gmail.com> writes:
Keith Thompson wrote:
[...]
>In C99, a trap representation is one that causes undefined behavior if
the program attempts to access it. C90 had the same concept, but not
the same term. C99 didn't really change the semantics, it just made
it more explicit and nailed down the terminology.

If you set the bytes that make up an object to specific values, the object
is initialised, so in C90, it is then allowed to be read, regardless of
what values you used for the representation, right?
As far as the wording of the C90 standard is concerned, I'm not sure.
As far as actual implementations are concerned, a floating-point
object could contain a signalling NaN representation, and accessing it
could cause Bad Things to happen. Similar things could happen for
some pointer representations on some systems, or even integers.

I'd suggest that this needs to be clarified, but it already was when
the C99 standard came out, and they're not doing DRs for C90.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
May 26 '07 #17
Keith Thompson said:
Richard Heathfield <rj*@see.sig.invalid> writes:
>CBFalconer said:
>>Richard Heathfield wrote:
Hallvard B Furuseth said:
<snip>
>>>>>
It is [undefined] if it produces a trap representation in the
'long' variable.

That isn't possible in C89, of course, since there's no such thing
as a trap representation in C89.

How can you say that? A C89/C90 int can have trap values,
independent of the actual code employed.

Chapter and verse, please.

There is no direct C&V in C89/C90, since that standard doesn't define
the term "trap value".
Quite so. In fact, it doesn't even mention the word "trap".
(Then again, neither does C99, but C99 does
define "trap representation", which is what we're really talking
about.)
Which is why I referred specifically to C89 rather than C99.
But the concept is there implicitly, I think.
The concept of "indeterminate value" exists.
>
C90 6.5.7:

If an object that has automatic storage duration is not
initialized explicitly, its value is indeterminate.
Look at the OP's code again. None of the objects concerned had its value
read without that value first being set. Therefore, it's hard to argue
that any of the values in the OP's code are indeterminate.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
May 26 '07 #18
Harald van Dijk wrote:
Keith Thompson wrote:
.... snip ...
>
>In C99, a trap representation is one that causes undefined
behavior if the program attempts to access it. C90 had the same
concept, but not the same term. C99 didn't really change the
semantics, it just made it more explicit and nailed down the
terminology.

If you set the bytes that make up an object to specific values,
the object is initialised, so in C90, it is then allowed to be
read, regardless of what values you used for the representation,
right?
No. The only type that is immune from trap representation is the
unsigned char.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>
<http://www.aaxnet.com/editor/edit043.html>
<http://kadaitcha.cx/vista/dogsbreakfast/index.html>
cbfalconer at maineline dot net

--
Posted via a free Usenet account from http://www.teranews.com

May 26 '07 #19
Richard Heathfield wrote:
CBFalconer said:
>Richard Heathfield wrote:
>>Hallvard B Furuseth said:

<snip>
>>>>
It is [undefined] if it produces a trap representation in the
'long' variable.

That isn't possible in C89, of course, since there's no such
thing as a trap representation in C89.

How can you say that? A C89/C90 int can have trap values,
independent of the actual code employed.

Chapter and verse, please.
I don't have a C90 std, but the following (para. 5) is from N869:

6.2.6.1 General

[#1] The representations of all types are unspecified except
as stated in this subclause.

[#2] Except for bit-fields, objects are composed of
contiguous sequences of one or more bytes, the number,
order, and encoding of which are either explicitly specified
or implementation-defined.

[#3] Values stored in objects of type unsigned char shall be
represented using a pure binary notation.36)

[#4] Values stored in objects of any other object type
consist of n × CHAR_BIT bits, where n is the size of an object
of that type, in bytes. The value may be copied into an
object of type unsigned char [n] (e.g., by memcpy); the
resulting set of bytes is called the object representation
of the value. Two values (other than NaNs) with the same
object representation compare equal, but values that compare
equal may have different object representations.

[#5] Certain object representations need not represent a
value of the object type. If the stored value of an object
has such a representation and is accessed by an lvalue
expression that does not have character type, the behavior
is undefined. If such a representation is produced by a
side effect that modifies all or any part of the object by
an lvalue expression that does not have character type, the
behavior is undefined.37) Such a representation is called a
trap representation.
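
To illustrate paragraph 4 of the quoted text, here is a small sketch of copying a value's object representation into an array of unsigned char with memcpy. The function name object_representation is invented for this example, and unsigned long is used only because it is the type under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the object representation of value into
   out and return the number of bytes written. Inspecting the bytes
   through unsigned char is always permitted. */
size_t object_representation(unsigned long value,
                             unsigned char *out, size_t out_size)
{
    assert(out_size >= sizeof value);
    memcpy(out, &value, sizeof value);
    return sizeof value;
}
```

Going the other way, from arbitrary bytes to a value of a non-character type, is where paragraph 5 comes in.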

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>
<http://www.aaxnet.com/editor/edit043.html>
<http://kadaitcha.cx/vista/dogsbreakfast/index.html>
cbfalconer at maineline dot net

--
Posted via a free Usenet account from http://www.teranews.com

May 26 '07 #20
CBFalconer wrote:
Harald van Dijk wrote:
>Keith Thompson wrote:
... snip ...
>>
>>In C99, a trap representation is one that causes undefined
behavior if the program attempts to access it. C90 had the same
concept, but not the same term. C99 didn't really change the
semantics, it just made it more explicit and nailed down the
terminology.

If you set the bytes that make up an object to specific values,
the object is initialised, so in C90, it is then allowed to be
read, regardless of what values you used for the representation,
right?

No. The only type that is immune from trap representation is the
unsigned char.
Please keep in mind that I was asking about C90, not C99. As Keith Thompson
pointed out, real-world implementations aiming to conform to C90 do have
trap representations, but where does C90 allow them to?
May 26 '07 #21
Keith Thompson writes:
There is no direct C&V in C89/C90, since that standard doesn't define
the term "trap value". (Then again, neither does C99, but C99 does
define "trap representation", which is what we're really talking
about.)

But the concept is there implicitly, I think.
(...)
A real-world example I vaguely remember from earlier discussions, I
think before C99, which the OP's example can produce: Sign bit 1, all
other bits 0, when LONG_MIN == -LONG_MAX, on a two's complement machine.

--
Regards,
Hallvard
May 26 '07 #22
CBFalconer said:
Richard Heathfield wrote:
>CBFalconer said:
>>Richard Heathfield wrote:
Hallvard B Furuseth said:

<snip>
>>>>>
It is [undefined] if it produces a trap representation in the
'long' variable.

That isn't possible in C89, of course, since there's no such
thing as a trap representation in C89.

How can you say that? A C89/C90 int can have trap values,
independent of the actual code employed.

Chapter and verse, please.

I don't have a C90 std, but the following (para. 5) is from N869:
....and it is irrelevant to C89, which *pre-dates* N869 by a substantial
number of years.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
May 26 '07 #23

"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news:Tr*********************@bt.com...
Well, I refer you to my earlier reply, in which I said that it is /not/
well-defined. See <ja*********************@bt.com> for more details.
I can't see where you said that. In reply to my original question, viz. "Is
the following use of memcpy()
a well-defined way of so doing?" you replied "No, but it's not exactly
undefined either." I took that to mean it is not undefined.

--
Martin

May 26 '07 #24
Martin said:
>
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news:Tr*********************@bt.com...
>Well, I refer you to my earlier reply, in which I said that it is
/not/ well-defined. See <ja*********************@bt.com> for more
details.

I can't see where you said that. In reply to my original question,
viz. "Is the following use of memcpy()
a well-defined way of so doing?" you replied "No, but it's not exactly
undefined either." I took that to mean it is not undefined.
There are several different kinds of behaviour described in the
Standard. They are:

* Unspecified behavior --- behavior, for a correct program construct
and correct data, for which the Standard imposes no requirements.

* Undefined behavior --- behavior, upon use of a nonportable or
erroneous program construct, of erroneous data, or of
indeterminately-valued objects, for which the Standard imposes no
requirements. [...]

* Implementation-defined behavior --- behavior, for a correct program
construct and correct data, that depends on the characteristics of
the implementation and that each implementation shall document.

* Locale-specific behavior --- behavior that depends on local
conventions of nationality, culture, and language that each
implementation shall document.

(C99 may add more - I haven't checked.)

If the Standard *fully* describes the behaviour of the construct, it is
considered to be "well-defined". Otherwise, the behaviour is one of the
others described above. It isn't locale-specific, for reasons that I
hope are obvious. It isn't implementation-defined, because the
implementation is not required to document which value your object will
take. The remaining possibilities are: undefined, unspecified, and
well-defined. It isn't well-defined because the value your object will
have after your memcpy cannot be determined solely by reference to the
Standard. So it's either unspecified or undefined.

To judge it 'undefined' would, I think, be harsh. After all, you
specified that sizeof(long) was 4, and I think it's fair to assume that
you would be happy to specify that CHAR_BIT is 8, so we're talking
about an object exactly 32 bits wide, and long ints must be at least
that wide anyway. You are giving precise values to three of the four
bytes in the long int, and the fourth is known to contain 0-bits. In no
case are you setting any byte in such a way that the sign bit will be
set, so that prospective complication is ruled out. So you're going to
end up with some legal value or other in your long int. The exact value
you get will depend, basically, on endianness (byte ordering), for
which the Standard doesn't impose any requirements on the
implementation.

So in /my/ judgement, the behaviour is not well-defined,
implementation-defined, undefined, or locale-specific. That leaves
'unspecified' as the only remaining option.

Others here may disagree, of course!

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
May 26 '07 #25
Richard Heathfield writes:
To judge it 'undefined' would, I think, be harsh. After all, you
specified that sizeof(long) was 4, and I think it's fair to assume that
you would be happy to specify that CHAR_BIT is 8, so we're talking
about an object exactly 32 bits wide, and long ints must be at least
that wide anyway.
Right, I hadn't noticed that.
You are giving precise values to three of the four
bytes in the long int, and the fourth is known to contain 0-bits. In no
case are you setting any byte in such a way that the sign bit will be
set, so that prospective complication is ruled out.
Not in the example values, but those were just an example.
Another way to avoid sign bit complications is to use unsigned long.
So you're going to
end up with some legal value or other in your long int.
Well, data which represents a legal value if the compiler deigns to
notice it.

You missed one point: He stored that in a way - via a char type - which is
a valid way to access that location, which tells the compiler that there
is a new value in the variable's location. I think. Compare with:
long foo()
{
    long v = 0;
    *((short *)&v) = 123;
    return v;
}
which may return 0 - and it does, with gcc-4 -O2.
Replace short with char and it returns nonzero.
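
For comparison, a sketch of two stores that the aliasing rules do allow, on the assumption that the resulting bit pattern is not a trap representation for long. The function names are made up. Both write through character-type accesses, either directly or inside memcpy, so the compiler must treat v as modified.

```c
#include <assert.h>
#include <string.h>

/* Store one byte through an unsigned char lvalue, which may alias
   any object. */
long store_via_char(void)
{
    long v = 0;

    *((unsigned char *)&v) = 123;
    return v;
}

/* Copy the bytes of a short into the start of a long with memcpy
   instead of writing through a short pointer. */
long store_via_memcpy(void)
{
    long v = 0;
    short s = 123;

    assert(sizeof s <= sizeof v);
    memcpy(&v, &s, sizeof s);
    return v;
}
```

Both should return a nonzero value; exactly which value depends on byte order.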
Personally I've begun to run scared and stuff "volatile" into code like
that, even though that doesn't help definedness. I've lost track of
things like whether e.g. memcpy vs. just storing directly into the long
makes a difference. Or if the compiler can get anal and decide that a
long can't have been stored because only 3 bytes were stored. I'm
fairly sure it's bad to store partial values, but maybe I've got that
from another context. Unions, maybe.
BTW, I wouldn't trust a modern "C89" compiler to still be using C89
semantics with stuff like this, if there really is a difference other
than different terminology than C99. And assuming the semantics is
different, I suspect more aggressive optimizations (valid in C89) are a
more noticeable difference anyway, with regards to bit/byte-fiddling.

--
Regards,
Hallvard
May 26 '07 #26
Hallvard B Furuseth wrote:
A real-world example I vaguely remember from earlier discussions, I
think before C99, which the OP's example can produce: Sign bit 1, all
other bits 0, when LONG_MIN == -LONG_MAX, on a two's complement machine.
All the variables defined in the program I posted here are unsigned, so can I
assume there is no danger of a trap representation?

--
Martin

May 26 '07 #27
"Martin" <martin.o_brien@[no-spam]which.net> writes:
Hallvard B Furuseth wrote:
>A real-world example I vaguely remember from earlier discussions, I
think before C99, which the OP's example can produce: Sign bit 1, all
other bits 0, when LONG_MIN == -LONG_MAX, on a two's complement machine.

All the variables defined in the program I posted here are unsigned, so can I
assume there is no danger of a trap representation?
No, unsigned types bigger than unsigned char can have padding bits and
trap representations (though I don't know of any implementations where
they actually do).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
May 27 '07 #28
Martin writes:
Hallvard B Furuseth wrote:
>A real-world example I vaguely remember from earlier discussions, I
think before C99, which the OP's example can produce: Sign bit 1, all
other bits 0, when LONG_MIN == -LONG_MAX, on a two's complement machine.

All the variables defined in the program I posted here are unsigned, so can I
assume there is no danger of a trap representation?
Not when it also has the minimum size allowed for an unsigned long. I
seem to have been far too tired when I originally read this thread, I
knew your variable was long and not unsigned long. Maybe someone said
long elsewhere.

--
Regards,
Hallvard
May 27 '07 #29
Martin wrote:
Hallvard B Furuseth wrote:
>A real-world example I vaguely remember from earlier discussions,
I think before C99, which the OP's example can produce: Sign bit
1, all other bits 0, when LONG_MIN == -LONG_MAX, on a two's
complement machine.

All the variables defined in the program I posted here are unsigned,
so can I assume there is no danger of a trap representation?
Your description specifically isolates 0x80...0 and thus has room
for an (ex) non-initialized value trap.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>
<http://www.aaxnet.com/editor/edit043.html>
<http://kadaitcha.cx/vista/dogsbreakfast/index.html>
cbfalconer at maineline dot net

--
Posted via a free Usenet account from http://www.teranews.com

May 27 '07 #30
