C style casting

Hi,

Does anyone know what's wrong with the following code? I was told that it
will compile and run, but that it will crash for some values.

Assume that the variables are initialized.

char* c;
long* lg;

c = (char*) lg;
lg = (long*) c;

Thanks,
Ramesh
Nov 14 '05 #1
Ramesh Tharma <ra********@yahoo.com> wrote:
Does anyone know what's wrong with the following code? I was told that it
will compile and run, but that it will crash for some values.

Assume that the variables are initialized.


How?
char* c;
long* lg;

c = (char*) lg;
lg = (long*) c;


There's nothing generally wrong, except that in the last line
if `c' is not aligned for type `long', then UB is invoked and
anything may happen. Everything depends on where `c' points to,
and on the implementation's alignment requirements.
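
A minimal sketch of the alignment concern, assuming a C11 compiler (for `_Alignof`) and an implementation where converting a pointer to `uintptr_t` yields a numeric address. The helper names are hypothetical and not from the thread. The code only inspects addresses; it never performs the risky conversion to `long *`.

```c
#include <stdint.h>

/* Hypothetical helper, not from the thread: reports whether the address
   in p is a multiple of the alignment of long. Assumes uintptr_t exists
   and that the converted value reflects the address numerically, which
   is common but not promised by the standard. */
int is_aligned_for_long(const void *p)
{
    return (uintptr_t)p % _Alignof(long) == 0;
}

/* The start of a long object is always suitably aligned for long. */
int start_of_long_is_aligned(void)
{
    long value = 0;
    return is_aligned_for_long(&value);
}

/* One byte past the start is misaligned whenever long needs more than
   byte alignment; converting that address to long * would be undefined. */
int second_byte_is_aligned(void)
{
    long value = 0;
    const char *c = (const char *)&value + 1;
    return is_aligned_for_long(c);
}
```

On an implementation where `long` has stricter than byte alignment, `second_byte_is_aligned()` returns 0, which is exactly the situation in which `lg = (long*) c;` goes wrong.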

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #2


S.Tobias wrote:
Ramesh Tharma <ra********@yahoo.com> wrote:
Does anyone know what's wrong with the following code? I was told that it
will compile and run, but that it will crash for some values.

Assume that the variables are initialized.


How?
char* c;
long* lg;

c = (char*) lg;
lg = (long*) c;


There's nothing generally wrong, except that in the last line
if `c' is not aligned for type `long', then UB is invoked and
anything may happen. Everything depends on where `c' points to,
and on the implementation's alignment requirements.


Also, dereferencing "c" may give different results on implementations
that have different endianness.
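
A small sketch of the endianness point, assuming `sizeof(long) > 1` and no padding bits in `long` (typical, but not guaranteed). It reads through `unsigned char` rather than plain `char` to stay clear of any signedness questions; `first_byte_of_one` is a hypothetical name.

```c
/* Returns the lowest-addressed byte of a long that holds the value 1.
   Under the assumptions stated above this is 1 on a little-endian
   layout and 0 on a big-endian one. */
unsigned first_byte_of_one(void)
{
    long value = 1;
    const unsigned char *bytes = (const unsigned char *)&value;
    return bytes[0];
}
```

The same source therefore observes different byte values on different implementations, even though the pointer conversion itself is well defined.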

Nov 14 '05 #3
Ramesh Tharma wrote:
Does anyone know what's wrong with the following code? I was told
that it will compile and run, but that it will crash for some values.

Assume that the variables are initialized.

char* c;
long* lg;

c = (char*) lg;
lg = (long*) c;


I remember there are systems which allow char pointers to point to any
kind of address but restrict long pointers to only even addresses.
(Wasn't good old Amiga one of these?) So if you force a long pointer to
an odd address it will lead to an exception.

Best regards
Steffen

Nov 14 '05 #4
Steffen Buehler <st*************@mailinator.com> wrote:
Ramesh Tharma wrote:
Does anyone know what's wrong with the following code? I was told
that it will compile and run, but that it will crash for some values.

Assume that the variables are initialized.

char* c;
long* lg;

c = (char*) lg;
lg = (long*) c;

I remember there are systems which allow char pointers to point to any
kind of address


They must; char * has the same representation and alignment as void *.

but restrict long pointers to only even addresses.
(Wasn't good old Amiga one of these?)


Such things are quite common, I gather. But if you stick with what's
defined by C, you'll never have to worry about it.

So if you force a long pointer to an odd address it will lead to an exception.


Mind you, the code above is portable. The conversion from long * to char
* is allowed; and the conversion back, given that the first one is
correct, must result in the same long * that was originally converted to
char *.
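
The round trip described here can be sketched as follows, reusing the variable names `c` and `lg` from the original question; the function name is hypothetical.

```c
/* Returns 1 if long * -> char * -> long * yields the original pointer
   and the object is still readable through it. */
int round_trip_preserves_pointer(void)
{
    long value = 42;
    long *lg = &value;
    char *c = (char *)lg;     /* points at the first byte of value */
    long *back = (long *)c;   /* c came from a long *, so it is aligned */
    return back == lg && *back == 42;
}
```

The conversion back is safe here only because `c` was obtained from a valid `long *` in the first place.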

Richard
Nov 14 '05 #5
The first conversion, from long * to char *, is just fine. However, there
are two problems with the second conversion.

1) If the char is not word aligned, i.e. if you have a 32-bit machine
and the address of the char is not a multiple of four, then
dereferencing the long will be an unaligned access. If your compiler
does not support unaligned access, it's a problem.

2) You're accessing more (4 bytes) than what you've defined by char (1
byte). So although it is unlikely, the 3 extra bytes you access may not
be accessible at all.
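
One common way to sidestep the alignment problem is to copy the bytes instead of converting the pointer. This is a minimal sketch, assuming the bytes at that position really do hold the representation of a long; `load_long` and `store_and_reload` are hypothetical names.

```c
#include <string.h>

/* Reads a long from an arbitrary byte position without ever forming a
   possibly misaligned long *. The caller must guarantee that
   sizeof(long) bytes are readable at src. */
long load_long(const unsigned char *src)
{
    long out;
    memcpy(&out, src, sizeof out);
    return out;
}

/* Stores v at offset 1 of a byte buffer (likely misaligned for long)
   and reads it back through load_long. */
long store_and_reload(long v)
{
    unsigned char buf[sizeof(long) + 1];
    memcpy(buf + 1, &v, sizeof v);
    return load_long(buf + 1);
}
```

Using `sizeof(long)` rather than a literal 4 also keeps the code correct on platforms where long has a different size.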

Thanks,
Bahadir

Nov 14 '05 #6
ju**********@yahoo.co.in wrote:
S.Tobias wrote:
Ramesh Tharma <ra********@yahoo.com> wrote:
Does anyone know what's wrong with the following code? I was told that it
will compile and run, but that it will crash for some values.
Assume that the variables are initialized.


How?
char* c;
long* lg;

c = (char*) lg;
lg = (long*) c;


There's nothing generally wrong, except that in the last line
if `c' is not aligned for type `long', then UB is invoked and
anything may happen. Everything depends on where `c' points to,
and on the implementation's alignment requirements.

Also, dereferencing "c" may give different results on implementations
that have different endianness.


The question was only about pointer conversions, and only that is what
I gave my answer to. Dereferencing above pointers (which was not
asked about) is another issue, and byte-sex is not the biggest headache.
For example, plain `char' might be signed and have a trap representation;
dereferencing `c' in such situation might cause UB, too. (It is valid
to access any object as `unsigned char' though.)

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #7
S.Tobias wrote:
Ramesh Tharma <ra********@yahoo.com> wrote:
Does anyone know what's wrong with the following code? I was told that it
will compile and run, but that it will crash for some values.

Assume that the variables are initialized.


How?
char* c;
long* lg;

c = (char*) lg;
lg = (long*) c;


There's nothing generally wrong, except that in the last line
if `c' is not aligned for type `long', then UB is invoked and
anything may happen. Everything depends on where `c' points to,
and on the implementation's alignment requirements.


Actually, it is guaranteed that you can safely convert a pointer to
char* and back again, lg will have the same value as it did before the
conversions. If the original value was properly aligned, it will be
after the conversions as well.

Robert Gamble

Nov 14 '05 #8
Robert Gamble <rg*******@gmail.com> wrote:
S.Tobias wrote:
Ramesh Tharma <ra********@yahoo.com> wrote:
Does anyone know what's wrong with the following code? I was told that it
will compile and run, but that it will crash for some values.
Assume that the variables are initialized.


How?
char* c;
long* lg;

c = (char*) lg;
lg = (long*) c;


There's nothing generally wrong, except that in the last line
if `c' is not aligned for type `long', then UB is invoked and
anything may happen. Everything depends on where `c' points to,
and on the implementation's alignment requirements.

Actually, it is guaranteed that you can safely convert a pointer to
char* and back again, lg will have the same value as it did before the
conversions. If the original value was properly aligned, it will be
after the conversions as well.


Yes, exactly so. But since the whole thing was not compilable,
I assumed those lines were unrelated snippets. I agree, if they
were - as written - part of a block, there would be absolutely
nothing wrong with them.

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #9
S.Tobias wrote:
For example, plain `char' might be signed and have a trap representation;
dereferencing `c' in such situation might cause UB, too. (It is valid
to access any object as `unsigned char' though.)


6.2.6.1p5 leaves character types exempt from trap representations.
There are potentially signed and plain character representations for
which the value is merely unspecified.

--
Peter

Nov 14 '05 #10
Who are you replying to? Why did you snip the original question?

ba************@gmail.com wrote:
The first conversion, from long * to char *, is just fine. However, there
are two problems with the second conversion.

1) If the char is not word aligned, i.e. if you have a 32-bit machine
and the address of the char is not a multiple of four, then
dereferencing the long will be an unaligned access. If your compiler
does not support unaligned access, it's a problem.

2) You're accessing more (4 bytes) than what you've defined by char (1
byte). So although it is unlikely, the 3 extra bytes you access may not
be accessible at all.


I have several platforms where the numbers you give are incorrect.

On the first platform, long is 64 bits wide.
On the second platform, long is 1 byte.
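
A short sketch for checking these numbers on whatever implementation is at hand; `long_width_bits` is a hypothetical name. A one-byte long is only possible where `CHAR_BIT` is at least 32, since long must be able to represent at least 32 bits' worth of values.

```c
#include <limits.h>
#include <stddef.h>

/* Storage size of long in bits on the current implementation:
   bytes per long times bits per byte. */
size_t long_width_bits(void)
{
    return sizeof(long) * CHAR_BIT;
}
```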
Nov 14 '05 #11
Peter Nilsson <ai***@acay.com.au> wrote:
S.Tobias wrote:
For example, plain `char' might be signed and have a trap representation;
dereferencing `c' in such situation might cause UB, too. (It is valid
to access any object as `unsigned char' though.)
6.2.6.1p5 leaves character types exempt from trap representations.
There are potentially signed and plain character representations for
which the value is merely unspecified.


I'm confused by that part. Now when I read it again it seems you're
right. I'd welcome others' comments on this, too.

I had a short discussion on this issue in c.s.c, here're excerpts
from my "posted" file:

# Jack Klein <ja*******@spamcop.net> wrote:
# > On 28 Nov 2004 01:49:57 GMT, "S.Tobias"
# > <si***@FamOuS.BedBuG.pAlS.INVALID> wrote in comp.std.c:
#
# > > Some people seem to believe that access of non-character objects with
# > > a character type other than `unsigned char' automatically invokes UB.
# > No, not automatically. Only if signed char, and plain char if signed,
# > have trap representations and a byte accessed via one of these lvalues
# > contains such a trap representation.

[...]

# > No, it does not. The key wording is in paragraph 5 of 6.2.6.1, where
# > the term 'trap representation' is defined. Here are the first two
# > sentences:
# [snip]
# > Now here is what I was told when I raised the issue before. The fact
# > that access with a non-character type is specifically undefined does
# > not guarantee that no accesses with a character type (other than
# > unsigned char) might not also cause undefined behavior.

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #12
On 9 Jun 2005 18:18:27 -0700, "Peter Nilsson" <ai***@acay.com.au>
wrote in comp.lang.c:
S.Tobias wrote:
For example, plain `char' might be signed and have a trap representation;
dereferencing `c' in such situation might cause UB, too. (It is valid
to access any object as `unsigned char' though.)
6.2.6.1p5 leaves character types exempt from trap representations.


No, strangely enough, it does not. Even though it appears to.

There are potentially signed and plain character representations for
which the value is merely unspecified.


Read 6.2.6.2 p5. Any signed integer type may have a trap
representation even if it contains only sign and value bits. All bits
are off if signed char contains padding bits, which 6.2.6.2 p2
SPECIFICALLY ALLOWS.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #13
On 10 Jun 2005 09:33:48 GMT, "S.Tobias"
<si***@FamOuS.BedBuG.pAlS.INVALID> wrote in comp.lang.c:
Peter Nilsson <ai***@acay.com.au> wrote:
S.Tobias wrote:
For example, plain `char' might be signed and have a trap representation;
dereferencing `c' in such situation might cause UB, too. (It is valid
to access any object as `unsigned char' though.)
6.2.6.1p5 leaves character types exempt from trap representations.
There are potentially signed and plain character representations for
which the value is merely unspecified.


I'm confused by that part. Now when I read it again it seems you're
right. I'd welcome others' comments on this, too.

I had a short discussion on this issue in c.s.c, here're excerpts
from my "posted" file:

# Jack Klein <ja*******@spamcop.net> wrote:
# > On 28 Nov 2004 01:49:57 GMT, "S.Tobias"
# > <si***@FamOuS.BedBuG.pAlS.INVALID> wrote in comp.std.c:
#
# > > Some people seem to believe that access of non-character objects with
# > > a character type other than `unsigned char' automatically invokes UB.
# > No, not automatically. Only if signed char, and plain char if signed,
# > have trap representations and a byte accessed via one of these lvalues
# > contains such a trap representation.

[...]

# > No, it does not. The key wording is in paragraph 5 of 6.2.6.1, where
# > the term 'trap representation' is defined. Here are the first two
# > sentences:
# [snip]
# > Now here is what I was told when I raised the issue before. The fact
# > that access with a non-character type is specifically undefined does
# > not guarantee that no accesses with a character type (other than
# > unsigned char) might not also cause undefined behavior.


Here is what I just posted as a reply to Peter:
S.Tobias wrote:
For example, plain `char' might be signed and have a trap representation;
dereferencing `c' in such situation might cause UB, too. (It is valid
to access any object as `unsigned char' though.)
6.2.6.1p5 leaves character types exempt from trap representations.


No, strangely enough, it does not. Even though it appears to.
There are potentially signed and plain character representations for
which the value is merely unspecified.
Read 6.2.6.2 p5. Any signed integer type may have a trap
representation even if it contains only sign and value bits. All bits
are off if signed char contains padding bits, which 6.2.6.2 p2
SPECIFICALLY ALLOWS.

I did raise this issue in comp.std.c, a long, long time ago. Not only
6.2.6.1 p5, but several other places the standard uses phrases like "a
character type" too loosely. In C89/90, there was no mention of 'trap
representations', or indeed of any possible problems with any integer
operations other than overflow or underflow of the signed types, or
division by 0.

Although, at least one member of the committee said that such things
were allowed to exist under the earlier versions of the standard, even
though they weren't mentioned.

And the reply I received was essentially what I posted and you quoted
above, and I will copy and paste here again:
# > Now here is what I was told when I raised the issue before. The fact
# > that access with a non-character type is specifically undefined does
# > not guarantee that no accesses with a character type (other than
# > unsigned char) might not also cause undefined behavior.


If you disbelieve this, see if you can cite a reference anywhere in
the standard that states specifically that this is NOT undefined.

Simple logic does indeed show that:

all but X is Y

....does not prove that:

X is NOT Y

If a byte in memory contains what is, for a given implementation, a
trap representation for signed char, and if that byte is read by an
lvalue of signed character type, or plain character type if plain char
is signed, then the behavior is undefined.

Should the wording of the standard in the several places where it uses
the phrase "a character type", when a signed character type might
invoke UB, be changed? I thought so.

BTW, our friends down the hall went us one better on this. The ISO
C++ standard specifically disallows padding bits in signed char, yet
ISO C allows them.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #14
Jack Klein wrote:
"Peter Nilsson" <ai***@acay.com.au> wrote in comp.lang.c:
S.Tobias wrote:
For example, plain `char' might be signed and have a trap representation;
dereferencing `c' in such situation might cause UB, too. (It is valid
to access any object as `unsigned char' though.)
6.2.6.1p5 leaves character types exempt from trap representations.


No, strangely enough, it does not. Even though it appears to.


Okay, character types have trap representations; however, those
representations are exempt from invoking undefined behaviour.

There are potentially signed and plain character representations for
which the value is merely unspecified.


Read 6.2.6.2 p5.


"The values of any padding bits are unspecified.45) A valid (non-trap)
object representation of a signed integer type where the sign bit is
zero is a valid object representation of the corresponding unsigned
type, and shall represent the same value."

Any signed integer type may have a trap representation even if it
contains only sign and value bits.


But it is 6.2.6.1p5 which specifies the consequences of accessing
a trap representation, and character lvalues are left _off_ the
undefined behaviour list.

All bits are off if signed char contains padding bits, which 6.2.6.2p2
SPECIFICALLY ALLOWS.


Do you mean to imply all (padding?) bits are zero (off?), or did you
make a typo in saying all 'bets' are off?!

If the former, there's no guarantee of that. Simply consider any
arbitrary object byte. Even on an 8-bit implementation, signed char
may have the range -127..127. But if I read a byte with the value
128 on such an implementation, the behaviour is not undefined.
The only thing you can say is the value is unspecified.

In other words, an unsigned char value of 128 may be a trap
representation as a signed char; however, the implementation must
produce _some_ value for that representation as a signed char.
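
A small sketch of the uncontroversial half of this argument: reading the byte through `unsigned char` always yields 128. The function names are hypothetical. The code deliberately does not read the byte through `signed char`, since what that yields on a -127..127 implementation is the very point being debated in this thread.

```c
#include <limits.h>

/* Reads a byte holding the bit pattern 0x80 through an unsigned char
   lvalue, which is defined on every implementation. */
unsigned read_as_unsigned(void)
{
    unsigned char byte = 0x80;
    const unsigned char *p = &byte;
    return *p;
}

/* Number of distinct values signed char can hold here: 256 on an 8-bit
   two's complement implementation, 255 where the range is -127..127. */
int signed_char_value_count(void)
{
    return SCHAR_MAX - SCHAR_MIN + 1;
}
```

On an 8-bit implementation, a count of 255 would mean one bit pattern has no corresponding signed char value, which is the case the posts above disagree about.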

--
Peter

Nov 14 '05 #15
On Fri, 10 Jun 2005 23:01:38 -0500, Jack Klein wrote:
On 9 Jun 2005 18:18:27 -0700, "Peter Nilsson" <ai***@acay.com.au>
wrote in comp.lang.c:
S.Tobias wrote:
> For example, plain `char' might be signed and have a trap representation;
> dereferencing `c' in such situation might cause UB, too. (It is valid
> to access any object as `unsigned char' though.)


6.2.6.1p5 leaves character types exempt from trap representations.


No, strangely enough, it does not. Even though it appears to.


Right, it effectively says that trap representations in character types
don't cause undefined behaviour.

Lawrence
Nov 14 '05 #16
Peter Nilsson <ai***@acay.com.au> wrote:
Jack Klein wrote:
"Peter Nilsson" <ai***@acay.com.au> wrote in comp.lang.c:
S.Tobias wrote:
> For example, plain `char' might be signed and have a trap representation;
> dereferencing `c' in such situation might cause UB, too. (It is valid
> to access any object as `unsigned char' though.)

6.2.6.1p5 leaves character types exempt from trap representations.

No, strangely enough, it does not. Even though it appears to.


/Trap representation/ is an object representation that does not represent
any value of the object's type.

Okay, character types have trap representations; however, those
representations are exempt from invoking undefined behaviour.


That's my understanding too, after carefully reading that paragraph.
It seems to say that accessing an object itself through a character
type does not produce UB.

(Note: the Std talks about two kinds of values: object value (which is
actually its byte representation), and a value of a given type.)

There are potentially signed and plain character representations for
which the value is merely unspecified.


Hmm... if an object does not have a value (has a trap representation),
then its value cannot be merely unspecified. The object simply
does not have a value, period.

[ All bits are off if signed char contains padding bits, which 6.2.6.2p2
SPECIFICALLY ALLOWS.

Do you mean to imply all (padding?) bits are zero (off?), or did you
make a typo in saying all 'bets' are off?!

I think he meant "bets".
]
Jack Klein wrote:
Although, at least one member of the committee said that such things
were allowed to exist under the earlier versions of the standard, even
though they weren't mentioned.


I agree. The current Std didn't even have to define "trap representation";
just mentioning that an object may not have a value is enough.

And the reply I received was essentially what I posted and you quoted
above, and I will copy and paste here again:
# > Now here is what I was told when I raised the issue before. The fact
# > that access with a non-character type is specifically undefined does
# > not guarantee that no accesses with a character type (other than
# > unsigned char) might not also cause undefined behavior.

I don't quite agree with the word "access".

If you disbelieve this, see if you can cite a reference anywhere in
the standard that states specifically that this is NOT undefined.


I didn't bother to look, but I'm sure I wouldn't find anything.
The Std is mostly expressed in terms of values, i.e. it assumes
that an operand (argument, whatever...) has a value; representation
is somewhat a secondary idea (it sums up to the fact that objects
can be accessed through unsigned char).

Simple logic does indeed show that:

all but X is Y

...does not prove that:

X is NOT Y


Yes. But if I said: "The people living in the far North, hunting
fish and seals, are called Eskimos", this does not technically
mean that Australians aren't called that, too. However, I'm
sure you won't find any native Australian Eskimos hunting
kangaroos near Sydney.

We're dealing with human language here, not Mathematics. If the Std
explicitly excluded some types from a behaviour, presumably it intended
that the behaviour does not engage those types.

If a byte in memory contains what is, for a given implementation, a
trap representation for signed char, and if that byte is read by an
lvalue of signed character type, or plain character type if plain char
is signed, then the behavior is undefined.

Should the wording of the standard in the several places where it uses
the phrase "a character type", when a signed character type might
invoke UB, be changed? I thought so.


I think those words should be removed, or replaced with "unsigned char
type" where relevant. I think something along these lines happened
in TC2 wrt memcpy() and friends.

+++

Now I'll try to take Jack's side.

Although accessing trap representation of a signed char does not
seem to raise UB by itself, I think UB will be invoked anyway
at some point.

For an example, let's take the simplest, primary expression:
signed char c; /* assume trap representation */
c;
The expression "is converted to the value stored in the designated object"
(6.3.2.1p2). Since `c' does not have a value (presumably meaning "value of
a given type"), the behaviour is undefined because the Std fails
to define such situation.

I think this explanation should be valid for "++" and "--" operators
as well ("sizeof" and "&" obviously aren't problematic), and extensible
to arrays and other expressions.

+++

I have one more question: in the paragraph under discussion 6.2.6.1p5
it says trap representation can be produced by "a side effect that
modifies [...] the object by an lvalue expression that does
not have character type". So, for an example, how can you produce
a trap representation (in a valid way) in a `long' object with
a `long' lvalue?

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 14 '05 #17
S.Tobias wrote:
Now I'll try to take Jack's side.

Although accessing trap representation of a signed char does not
seem to raise UB by itself, I think UB will be invoked anyway
at some point.

For an example, let's take the simplest, primary expression:
signed char c; /* assume trap representation */
c;
The expression
"is converted to the value stored in the designated object"
(6.3.2.1p2).
Since `c' does not have a value (presumably meaning "value of
a given type"), the behaviour is undefined because the Std fails
to define such situation.


That's what I think.

I think this explanation should be valid for "++" and "--" operators
as well ("sizeof" and "&" obviously aren't problematic),
and extensible to arrays and other expressions.

+++

I have one more question: in the paragraph under discussion 6.2.6.1p5
it says trap representation can be produced by "a side effect that
modifies [...] the object by an lvalue expression that does
not have character type". So, for an example, how can you produce
a trap representation (in a valid way) in a `long' object with
a `long' lvalue?


This could do it on a system that traps negative zero.

long a = rand();

a = a ^ -a;

--
pete
Nov 14 '05 #18
"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:
Peter Nilsson <ai***@acay.com.au> wrote:
Jack Klein wrote:
"Peter Nilsson" <ai***@acay.com.au> wrote in comp.lang.c:
> S.Tobias wrote:
> > For example, plain `char' might be signed and have a trap representation;
> > dereferencing `c' in such situation might cause UB, too. (It is valid
> > to access any object as `unsigned char' though.)
>
> 6.2.6.1p5 leaves character types exempt from trap representations.

No, strangely enough, it does not. Even though it appears to.

/Trap representation/ is an object representation that does not represent
any value of the object's type.
Okay, character types have trap representations; however, those
representations are exempt from invoking undefined behaviour.


That's my understanding too, after carefully reading that paragraph.
It seems to say that accessing an object itself through a character
type, does not produce UB.


IMO this conclusion isn't quite right... (read on)

Jack Klein wrote:
Although, at least one member of the committee said that such things
were allowed to exist under the earlier versions of the standard, even
though they weren't mentioned.
I agree. The current Std didn't even have to define "trap representation";
just mentioning that an object may not have a value is enough.
And the reply I received was essentially what I posted and you quoted
above, and I will copy and paste here again:
# > Now here is what I was told when I raised the issue before. The fact
# > that access with a non-character type is specifically undefined does
# > not guarantee that no accesses with a character type (other than
# > unsigned char) might not also cause undefined behavior.


[snip]

We're dealing with human language here, not Mathematics. If the Std
explicitly excluded some types from a behaviour, presumably it intended
that the behaviour does not engage those types.


This comment is a misreading. The statements in 6.2.6.1 p5 require
something of all types that aren't character types; the character
types aren't stated as excluded from the requirement, they just aren't
included in the statement.

Now I'll try to take Jack's side.

Although accessing trap representation of a signed char does not
seem to raise UB by itself, I think UB will be invoked anyway
at some point.


Here is my reading. See if this strikes your fancy:

1. Accessing an object with some type not a character type, where the
object holds a trap representation of the type of the object, must
produce undefined behavior.

2. Accessing an object with some type not a character type, where the
object holds a trap representation of the type of the object, but
the access is done through a signed character type or plain character
type if plain char has the same representation as signed char, might
or might not produce undefined behavior, depending on whether the
byte(s) accessed are trap representation for the signed character
type.

3. Accessing an object with some type that is a character type, where
the object holds a trap representation for the type of signed char,
and where the access is done through a signed character type or plain
character type if plain char has the same representation as signed
char, does produce undefined behavior.

4. Accessing an object of any type with any representation whether
trap representation or not, with the access being done through
an unsigned character type, is always defined behavior because
there are never trap representations for unsigned character.

I believe that's the reading most consistent with everything else
said about trap representations (at least that I've found).

I have one more question: in the paragraph under discussion 6.2.6.1p5
it says trap representation can be produced by "a side effect that
modifies [...] the object by an lvalue expression that does
not have character type". So, for an example, how can you produce
a trap representation (in a valid way) in a `long' object with
a `long' lvalue?


Three ways for trap representations to come into existence:

1. Uninitialized variables;

2. Trap representations for other types can be stored byte-by-byte
using 'unsigned char' access; and

3. Trap representations can be generated by exceptional conditions.
See notes 44 and 45, and section 6.5 p5. Notes 44 and 45 are
interesting, because they say "no arithmetic operation on valid
values can generate a trap representation other than as part of
an exceptional condition such as an overflow". Since exceptional
conditions *already* mean undefined behavior, saying they can
produce trap representations which can cause further undefined
behavior really isn't much cause for concern.
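
Way 2 in the list above can be sketched as follows. The function names are hypothetical. Here the bytes being stored come from a valid long, so the result is a valid long; storing arbitrary bytes the same way is how a trap representation could be produced on an implementation that has them (as far as I know, mainstream implementations have none for long).

```c
#include <stddef.h>

/* Copies an object representation one byte at a time through unsigned
   char lvalues, which is defined for any object type. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}

/* Rebuilds a long from the bytes of another long. */
long copy_long_bytewise(long v)
{
    long out;
    copy_bytes(&out, &v, sizeof out);
    return out;
}
```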

Presumably trap representations could also be produced by, eg,
computing a (legal) value that's an 'unsigned long' and casting it to
'long'. The cast could be done indirectly, eg, casting the address of
the 'unsigned long' variable to '(long *)' and then dereferencing. In
either case, producing the value already required undefined behavior.
I believe there's no way to defined-ly produce a trap representation
for a 'long' object other than (1) or (2) above. (The wording in
6.2.6.1 p5 doesn't say that the operation that caused the store was a
defined operation.)

I guess it's also possible that only part of the object in question
could be modified, making the object as a whole a trap representation.
How this might happen without undefined behavior having already
happened I can't say....

Nov 14 '05 #19
Tim Rentsch <tx*@alumnus.caltech.edu> wrote:
"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:
Peter Nilsson <ai***@acay.com.au> wrote:
Okay, character types have trap representations; however, those
representations are exempt from invoking undefined behaviour.
That's my understanding too, after carefully reading that paragraph.
It seems to say that accessing an object itself through a character
type, does not produce UB.

IMO this conclusion isn't quite right... (read on)

Jack Klein wrote:
Although, at least one member of the committee said that such things
were allowed to exist under the earlier versions of the standard, even
though they weren't mentioned.


I agree. The current Std didn't even have to define "trap representation";
just mentioning that an object may not have a value is enough.
And the reply I received was essentially what I posted and you quoted
above, and I will copy and paste here again:

> # > Now here is what I was told when I raised the issue before. The fact
> # > that access with a non-character type is specifically undefined does
> # > not guarantee that no accesses with a character type (other than
> # > unsigned char) might not also cause undefined behavior.


[snip]

We're dealing with human language here, not Mathematics. If the Std
explicitly excluded some types from a behaviour, presumably it intended
that the behaviour does not engage those types.

This comment is a misreading. The statements in 6.2.6.1 p5 require
something of all types that aren't character types; the character
types aren't stated as excluded from the requirement, they just aren't
included in the statement.


Yes, and that was Jack Klein's POV, too. His argument was that even if
they were not included, it didn't mean they couldn't cause UB.
My argument was that if the Standard made some steps to "un-include"
them, then it probably meant to exclude them. Otherwise why does
the Standard mention character types at all?

Now I'll try to take Jack's side.

Although accessing trap representation of a signed char does not
seem to raise UB by itself, I think UB will be invoked anyway
at some point.

Here is my reading. See if this strikes your fancy:

[ I wasn't sure what you meant by "Accessing an object with some type",
whether it was "object with some type" (ie. having some type), or
"accessing with some type". From the context it seems to follow that
it is the former, so I'll understand "an object having some type"
in each case. ]
1. Accessing an object with some type not a character type, where the
object holds a trap representation of the type of the object, must
produce undefined behavior.
You can access the object with with a different type. A `long' object
may have a trap representation (for `long' type), but you could access
it with an `unsigned long' lvalue (which on the implementation doesn't
have a trap representation).
2. Accessing an object with some type not a character type, where the
object holds a trap representation of the type of the object, but
the access is done through a signed character type or plain character
type if plain char has the same representation as signed char, might
or might not produce undefined behavior, depending on whether the
byte(s) accessed are trap representation for the signed character
type. 3. Accessing an object with some type that is a character type, where
the object holds a trap representation for the type of signed char,
and where the access is done through a signed character type or plain
character type if plain char has the same representation as signed
char, does produce undefined behavior. 4. Accessing an object of any type with any representation whether
trap representation or not, with the access being done through
an unsigned character type, is always defined behavior because
there are never trap representations for unsigned character. I believe that's the reading most consistent with everything else
said about trap representations (at least that I've found).
I don't quite understand what's the point in splitting the rule
into four cases, or more (eg. I don't see a case when the object
having a non-character type and having a valid representation
for its type is accessed with `signed char' type which may have
a trap representation, but now I'm not sure if you inteded include it).

How is it different from the following:
1. All that matters is the type of the lvalue an object is accessed with.
If an object's representation is a trap representation for that
type (regardless of the object type), the behaviour is undefined.
2. `unsigned char' type does not have a trap representation.

(Above I actually assumed that `signed char' may cause UB
(to fit your description), which is under discussion here.)
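
To make rule 2 of the short version concrete, here is a minimal sketch in C.
The helper name `copy_representation' is made up for illustration and does
not come from the thread. It reads the bytes of an arbitrary object only
through an `unsigned char' lvalue, which is the one access that both
readings above agree is always defined, whatever the object holds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): copy the object
   representation of *obj into out[], reading every byte through an
   unsigned char lvalue only.  unsigned char has no trap
   representations, so these reads are defined for any object. */
static void copy_representation(const void *obj, size_t n, unsigned char *out)
{
    const unsigned char *p = obj;
    size_t i;

    assert(obj != NULL && out != NULL);
    for (i = 0; i < n; i++)
        out[i] = p[i];
}
```

For example, with `long l = 1; unsigned char b[sizeof l];', the call
`copy_representation(&l, sizeof l, b)' fills `b' with the bytes of `l'.
Which element ends up holding the 1 depends on the implementation's
byte order.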

I have one more question: in the paragraph under discussion 6.2.6.1p5
it says a trap representation can be produced by "a side effect that
modifies [...] the object by an lvalue expression that does
not have character type". So, for example, how can you produce
a trap representation (in a valid way) in a `long' object with
a `long' lvalue?


[snip]

Presumably trap representations could also be produced by, eg,
computing a (legal) value that's an 'unsigned long' and casting it to
'long'. The cast could be done indirectly, eg, casting the address of
the 'unsigned long' variable to '(long *)' and then dereferencing. In
either case, producing the value already required undefined behavior.


I'm thinking of something similar, but inverse:

long l;
unsigned long *p = (void*)&l;
*p = some_value;

I think this is conforming (we may access `l' with `unsigned long'
lvalue), and it seems to fit the description. If `long' has a trap
representation, then such a write might generate it (no other UB
is invoked here).
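
The three lines above can be wrapped in a function so they compile as
they stand. This is only a sketch: `store_via_unsigned' is a made-up
name, and it assumes nothing beyond what the snippet already does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): store `value' into the
   long object *dst through an unsigned long lvalue.  long and
   unsigned long are corresponding signed/unsigned types, so the
   aliasing rules (C99 6.5p7) permit this access. */
static void store_via_unsigned(long *dst, unsigned long value)
{
    unsigned long *p = (void *)dst;

    assert(dst != NULL);
    *p = value;   /* the write itself is fine for any value */
}
```

Reading `*dst' back as a `long' is only safe to demonstrate for a value
both types can represent, because such values have the same
representation in both (C99 6.2.5p9). After `long l = 0;
store_via_unsigned(&l, 42UL);', `l' compares equal to 42. For values above
LONG_MAX the later read through `long' is exactly the case under
discussion.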

I think the meaning of this is that a compiler is allowed to
pre-fetch the value of `l' at any time after the store operation (for
optimization reasons), and the programmer is responsible not to put
anything wrong in it ("pre-fetch" means the whole operation might be
performed entirely in registers, even before physically storing
the results in memory). If the same value was stored via a character
type (memcpy), then the compiler could not do such an optimization.
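
For comparison, a sketch of the character-type route mentioned in the
last sentence. Again the name `store_via_memcpy' is made up. The bytes of
the `unsigned long' are copied into the `long' object with memcpy, so no
non-character lvalue ever modifies `*dst'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy the bytes of
   `value' into the long object *dst.  memcpy copies characters, so
   the store is not done through a non-character lvalue.
   long and unsigned long occupy the same amount of storage
   (C99 6.2.5p6), so sizeof *dst bytes is the whole of `value'. */
static void store_via_memcpy(long *dst, unsigned long value)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof *dst);
}
```

As before, `long l = 0; store_via_memcpy(&l, 7UL);' leaves `l' equal to 7,
since 7 has the same representation in both types. Whether a later read
of `l' as `long' is defined for other bit patterns is still the open
question; the sketch only shows the two ways of storing.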

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Nov 15 '05 #20


