
# Assigning values to char arrays

 P: n/a Hi all, here's an elementary question. Assume I have declared two variables, char *a, **b; I can then give a value to a like a="hello world"; The question is, how should I assign values to b? A simple b[0]="string"; results in a segmentation fault. Answers greatly appreciated. Regards, Emyl. Nov 2 '07 #1
43 Replies

 P: n/a emyl I can then give a value to a like a="hello world"; The question is, how should I assign values to b? A simple b[0]="string"; *b = "s"; > results in a segmentation fault. Answers greatly appreciated. Regards, Emyl. Nov 2 '07 #2

 P: n/a emyl said:

> Hi all, here's an elementary question. Assume I have declared two variables, char *a, **b; I can then give a value to a like a="hello world"; The question is, how should I assign values to b? A simple b[0]="string"; results in a segmentation fault. Answers greatly appreciated.

Your definition:

    char *a, **b;

reserves sufficient storage for a pointer-to-char named a, and a pointer-to-pointer-to-char named b.

    a="hello world";

assigns a value to this pointer-to-char, the value in question being the address of the first character in the given string literal. But

    b[0]="string";

is a problem, not because there's anything wrong with the syntax, but because you've made an incorrect assumption. b is a pointer-to-pointer-to-char, but you haven't pointed it to any pointers-to-char, so it is currently indeterminate.

b[0]="string"; is *not* an attempt to give a value to b. It is an attempt to give a value to b[0]. But b[0] is meaningless unless b has a meaningful value. You can give b a meaningful value in any of several ways, but the most obvious is to allocate some fresh memory for it:

    #include <stdlib.h>

    /* allocate memory for some pointers-to-char */
    char **cpalloc(size_t n)
    {
      char **ptr = malloc(n * sizeof *ptr);
      if(ptr != NULL)
      {
        /* set every pointer to NULL so none is left indeterminate */
        while(n--)
        {
          ptr[n] = NULL;
        }
      }
      return ptr;
    }

    #include <stdio.h>

    int main(void)
    {
      char *a, **b;
      a = "what has it got in its pocketses?";
      b = cpalloc(2);
      if(b != NULL)
      {
        b[0] = "string";
        b[1] = "nothing";
        printf("%s\n", a);
        printf("%s or %s\n", b[0], b[1]);
        free(b);
      }
      return 0;
    }

Be careful. The cpalloc function written above does not allocate storage for strings, only for a collection of pointers to char. A pointer to char is sufficient for pointing at a string, but not for storing it.

-- Richard Heathfield Email: -http://www. +rjh@ Google users: "Usenet is a strange place" - dmr 29 July 1999 Nov 2 '07 #3
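The closing warning in the post above (pointers to char can point at a string but cannot store one) may be easier to see with a small sketch. This is only an illustration, not part of the original reply: the helper name copy_string is made up here, and the idea is simply to give each pointer its own writable storage instead of pointing it at a string literal.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the thread): returns a freshly
 * allocated, writable copy of s, or NULL if allocation fails.
 * The caller owns the copy and must free() it. */
char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;      /* +1 for the terminating '\0' */
    char *p = malloc(len);
    if (p != NULL) {
        memcpy(p, s, len);
    }
    return p;
}
```

With this, something like b[0] = copy_string("string"); leaves b[0] pointing at storage that may be modified (for example b[0][0] = 'S';), and each copy has to be released with free(b[0]) before free(b).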

 P: n/a emyl wrote: > here's an elementary question. Assume I have declared two variables, char *a, **b; I can then give a value to a like a="hello world"; The question is, how should I assign values to b? A simple b[0]="string"; results in a segmentation fault. "char **b;" declares a pointer to a pointer to char. You could initialize it with "b = &a;" (provided the a declaration is present). Then **b is a[0]. However, note that your initialization of a leaves a pointing to an unmodifiable string. -- Chuck F (cbfalconer at maineline dot net) Available for consulting/temporary embedded and systems. -- Posted via a free Usenet account from http://www.teranews.com Nov 2 '07 #4
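A minimal sketch of the "b = &a" suggestion, for readers who want to see it run. The function name and the use of a writable array instead of a string literal are choices made for this illustration, so that writing through **b is allowed.

```c
/* Illustrative only: returns 1 if the pointer-to-pointer accesses
 * behave as described in the reply above. */
int demo_pointer_to_pointer(void)
{
    char text[] = "hello world";  /* a writable array, not a literal */
    char *a = text;
    char **b = &a;                /* b now points at a real pointer-to-char */

    **b = 'H';                    /* same object as a[0], i.e. text[0] */
    *b = text + 6;                /* same as a = text + 6; a points at "world" */

    return text[0] == 'H' && a[0] == 'w' && b[0] == a;
}
```

Note that b[0] and *b are the same thing, so b[0] = "string"; would also be fine once b points at a.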

 P: n/a Richard Hi all, here's an elementary question. Assume I have declared two variables, char *a, **b; b is a pointer to a char pointer. >>I can then give a value to a like a="hello world"; The question is, how should I assign values to b? A simple b[0]="string"; *b = "s"; *slaps cold water on face* Sorry. That was bullshit. I thought I had included the initialisation line. See other replies. Scary. >>results in a segmentation fault. Answers greatly appreciated. Regards, Emyl. Nov 2 '07 #5

 P: n/a On Nov 2, 5:32 am, Richard Heathfield /* allocate memory for some pointers-to-char */ char **cpalloc(size_t n) { char **ptr = malloc(n * sizeof *ptr); if(ptr != NULL) { while(n--) { ptr[n] = NULL; } } I have one question . Can memset be used as memset(ptr,0,n); Instead of the while loop ? return ptr; Nov 2 '07 #7

 P: n/a somenath said: On Nov 2, 5:32 am, Richard Heathfield > char **ptr = malloc(n * sizeof *ptr); > if(ptr != NULL) { while(n--) { ptr[n] = NULL; } } I have one question . Can memset be used as memset(ptr,0,n); Instead of the while loop ? Not unless you can guarantee that the representation of null pointers on all target platforms is all-bits-zero. I don't recall that the OP mentioned any platforms. The code I supplied was portable to any hosted implementation. In situations where you /can/ use memset, don't bother - just calloc it instead. -- Richard Heathfield Email: -http://www. +rjh@ Google users: "Usenet is a strange place" - dmr 29 July 1999 Nov 2 '07 #8

 P: n/a On Nov 2, 11:07 am, Richard Heathfield char **ptr = malloc(n * sizeof *ptr); if(ptr != NULL) { while(n--) { ptr[n] = NULL; } } I have one question . Can memset be used as memset(ptr,0,n); Instead of the while loop ? Not unless you can guarantee that the representation of null pointers on all target platforms is all-bits-zero. I don't recall that the OP mentioned any platforms. The code I supplied was portable to any hosted implementation. In situations where you /can/ use memset, don't bother - just calloc it instead. Many thanks for the response. But my understanding was in pointer context 0 and NULL is converted to null pointer. And converting to null pointer is compiler responsibility. So I thought 0 in memset will be converted to null pointer (which is system specific). I would request you to correct me as I am feeling I may be misunderstood some concept. Nov 2 '07 #9
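The distinction being asked about can be sketched in code. Writing 0 or NULL in a pointer context is converted to a null pointer by the compiler, whereas memset only stores byte values at run time and knows nothing about pointers. Separately, memset(ptr, 0, n) as written in the question would clear only n bytes, not n * sizeof *ptr. The two function names below are invented for this sketch.

```c
#include <stdlib.h>

/* Portable: each element is assigned a null pointer, whatever bit
 * pattern the implementation uses for it. */
char **alloc_null_pointers(size_t n)
{
    char **ptr = malloc(n * sizeof *ptr);
    if (ptr != NULL) {
        size_t i;
        for (i = 0; i < n; i++) {
            ptr[i] = NULL;
        }
    }
    return ptr;
}

/* Assumption: only correct on platforms where a null pointer is
 * represented as all-bits-zero. calloc zeroes bytes, like memset. */
char **alloc_zeroed_pointers(size_t n)
{
    return calloc(n, sizeof(char *));
}
```

On most current platforms both functions give the same result, but only the first is guaranteed by the standard.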

 P: n/a Ark Khasin wrote: Richard wrote: I am not to argue who of us two is more of a newbie, but your post sheds no light on the question asked. Ego bubbling? This is most hilarious sentence I've read in c.l.c. this year. Nov 3 '07 #10

 P: n/a Ark Khasin wrote: Ben Bacarisse wrote: >No. unsigned char may not have padding bits. All the bits must be value bits. Why? 6.2.6.2 says "For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter)." But I couldn't find anything saying that unsigned char *may not* have padding bits. Well, the above quote says that unsigned char may not have _both_ padding and value bits. Obviously the bit type left out has to be padding bits - otherwise one would not be able to portably use unsigned char objects. Nov 3 '07 #11

 P: n/a santosh wrote: Ark Khasin wrote: >Ben Bacarisse wrote: >>No. unsigned char may not have padding bits. All the bits must be value bits. >Why? 6.2.6.2 says "For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter)." But I couldn't find anything saying that unsigned char *may not* have padding bits. Well, the above quote says that unsigned char may not have _both_ padding and value bits. Obviously the bit type left out has to be padding bits - otherwise one would not be able to portably use unsigned char objects. Is this "just a theory"? IMHO, 6.2.6.2 says *exactly nothing* about unsigned char. Nov 3 '07 #12

 P: n/a santosh wrote: Ark Khasin wrote: >Richard wrote: >I am not to argue who of us two is more of a newbie, but your postsheds no light on the question asked. Ego bubbling? This is most hilarious sentence I've read in c.l.c. this year. Ty. But Richard offered a satisfactory explanation. -- Ark Nov 3 '07 #13

 P: n/a Ark Khasin wrote: > santosh wrote: Ark Khasin wrote: Ben Bacarisse wrote: >No. unsigned char may not have padding bits. Is this "just a theory"? No. N869 5.2.4.2.1 Sizes of integer types [#2] The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT. -- pete Nov 3 '07 #15

 P: n/a pete wrote: Ark Khasin wrote: >santosh wrote: >>Ark Khasin wrote:Ben Bacarisse wrote:No. unsigned char may not have padding bits. >Is this "just a theory"? No. N869 5.2.4.2.1 Sizes of integer types [#2] The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT. Thanks to adding to my confusion :) So I have an 11-bit machine bytes and UCHAR_MAX==8 and 3 padding most significant bits. Anything wrong? BTW, if I am not mistaken, in other integer types padding bits don't have to be contiguous. -- Ark Nov 3 '07 #16

 P: n/a Ark Khasin wrote: pete wrote: >Ark Khasin wrote: >>santosh wrote:Ark Khasin wrote:Ben Bacarisse wrote:>No. unsigned char may not have padding bits. >>Is this "just a theory"? No.N869 5.2.4.2.1 Sizes of integer types [#2] The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT. Thanks to adding to my confusion :) So I have an 11-bit machine bytes and UCHAR_MAX==8 and 3 padding most significant bits. Anything wrong? What do you mean by UCHAR_MAX==8? Do you mean CHAR_BIT==8? As far as the Standard is concerned a char i.e., a byte (as defined by C) contains CHAR_BIT bits. Additionally unsigned char may not contain padding bits. I don't know what you mean by "machine bytes" above. Are they supposed to be different from C bytes? BTW, if I am not mistaken, in other integer types padding bits don't have to be contiguous. Yes. Padding bits need not be contiguous. Nov 3 '07 #17

 P: n/a Ark Khasin wrote, On 03/11/07 20:05: pete wrote: >Ark Khasin wrote: >>santosh wrote:Ark Khasin wrote:Ben Bacarisse wrote:>No. unsigned char may not have padding bits. >>Is this "just a theory"? No.N869 5.2.4.2.1 Sizes of integer types [#2] The value UCHAR_MAX+1 shall equal 2 raised to thepower CHAR_BIT. Thanks to adding to my confusion :) So I have an 11-bit machine bytes and UCHAR_MAX==8 and 3 padding most significant bits. Anything wrong? CHAR_BIT is the number of bits in a signed, unsigned and plain char. Note, the number of bits, NOT the number of value bits. Therefore, as UCHAR_MAX is 2 raised to the power of CHAR_BIT all of the bits must be value bits. BTW, if I am not mistaken, in other integer types padding bits don't have to be contiguous. The padding bits can be anywhere, but short of using an unsigned char pointer to look at the representation they are hard to get at since the bitwise operations are defined as operating on values. -- Flash Gordon Nov 3 '07 #18

 P: n/a Ark Khasin wrote: Ben Bacarisse wrote: >Ark Khasin Ben Bacarisse wrote:] .... >RH's point was something else altogether -- that all bits zero is notguaranteed to produce a null pointer (to be scrupulously correct, itis not guaranteed to produce a value that compares equal to a nullpointer constant). The parenthesized comment was not actually needed to make the statement "scrupulously correct"; it would have been just as correct, and less confusing, without it. That's where I am lost and reading the standard doesn't help: What's the difference between a value of an object and how it compares equal? I mean, if a==b, whatever their representations, in what context(s) does it make sense to say they may have different values? There is no difference. Don't let the unnecessary "clarification" confuse you. The issue isn't having different values with the same representation in a single type - that can't happen. The issue is that there can be multiple different representations of the same value in a given type. However, the values of objects of that type containing those different representations must compare equal. You're tripping over a minor issue; the fact that there can be multiple representations of a null pointer. However, you've lost track of the key issue: that a pointer object with all of its bits set to 0 doesn't have to be one of those representations. In fact, it doesn't have to represent a valid pointer value of any kind. [NEGATIVE_ZERO comes to mind - and goes away. BTW, is it fair to say that bitwise logic is a magic performed on representations, and not on values?] No. In general, the bitwise operations are defined in terms of their actions on the values, not the representations. For instance, E>>1 is defined as dividing the value of E by 2. 
The complicated exceptions all involve sign bits, and most result in undefined behavior, which is why it's strongly recommended that bitwise operations be restricted to unsigned types, or at least restricted to values which are guaranteed to be positive both before and after the operation. > void *a;; memset_as_above(&a, 0, sizeof a); There is, at this point, no guarantee that 'a' contains a valid pointer representation. Therefore, the next line renders the behavior of your entire program undefined: > if (a == 0) { /* not guaranteed */ //Which is correct but implies { void **pNULL = 0; if(a==*pNULL) { /* not guaranteed */ I'm not sure what your point was; but you've just attempted to dereference a null pointer, again making the behavior undefined. Nov 3 '07 #19

 P: n/a Ark Khasin wrote: > pete wrote: Ark Khasin wrote: santosh wrote:Ark Khasin wrote:Ben Bacarisse wrote:No. unsigned char may not have padding bits. Is this "just a theory"? No. N869 5.2.4.2.1 Sizes of integer types [#2] The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT. Thanks to adding to my confusion :) So I have an 11-bit machine bytes That's what "CHAR_BIT equals eleven" means. -- pete Nov 3 '07 #20

 P: n/a James Kuyper Ben Bacarisse wrote: >>Ark Khasin Ben Bacarisse wrote:] ... >>RH's point was something else altogether -- that all bits zero is notguaranteed to produce a null pointer (to be scrupulously correct, itis not guaranteed to produce a value that compares equal to a nullpointer constant). The parenthesized comment was not actually needed to make the statement "scrupulously correct"; it would have been just as correct, and less confusing, without it. Sorry if I've confused the issue. I was worried about suggesting that there was only one such thing (one null pointer) but I can see that I clearly don't. Maybe I did at some point as I was editing the text. >[NEGATIVE_ZERO comes to mind - and goes away. BTW, is it fair to saythat bitwise logic is a magic performed on representations, and noton values?] No. In general, the bitwise operations are defined in terms of their actions on the values, not the representations. Is that true for &, |, ^ and ~? The definitions are very bland, but they suggest (simply by saying so little) that the interpretation is to be based on the representation. This is backed up by section 6.5p4. -- Ben. Nov 4 '07 #21

 P: n/a pete wrote: >>>>>>>No. unsigned char may not have padding bits.Is this "just a theory"?No.N869 5.2.4.2.1 Sizes of integer types [#2] The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT. Thanks to adding to my confusion :)So I have an 11-bit machine bytes That's what "CHAR_BIT equals eleven" means. Sorry for being that stubborn, but: Why? Why can't I have CHAR_BIT==8 on a 11-bit machine? E.g. my int would be something like say 11(lower)+8(upper)=19 bits. Is it postulated somewhere that UINT_MAX+1==(UCHAR_MAX+1)*sizeof(unsigned) ? I don't think so. Nov 4 '07 #22

 P: n/a Ben Bacarisse wrote: James Kuyper >>[NEGATIVE_ZERO comes to mind - and goes away. BTW, is it fair to saythat bitwise logic is a magic performed on representations, and noton values?] No. In general, the bitwise operations are defined in terms of theiractions on the values, not the representations. Is that true for &, |, ^ and ~? The definitions are very bland, but they suggest (simply by saying so little) that the interpretation is to be based on the representation. This is backed up by section 6.5p4. Yes, I took a beating in this ng recently for proposing, as an academic exercise, int cmpneq(int a, int b){ return a^b; } At the time, I agreed that the beating was well deserved. But as far as I can tell, it depended on ^ operating on representations. An authoritative and well-substantiated clarification would be more than welcome! -- Ark Nov 4 '07 #23

 P: n/a Ark Khasin wrote: pete wrote: >>>>>>>>>>No. unsigned char may not have padding bits.Is this "just a theory"?No.N869 5.2.4.2.1 Sizes of integer types [#2] The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT.Thanks to adding to my confusion :)So I have an 11-bit machine bytes That's what "CHAR_BIT equals eleven" means. Sorry for being that stubborn, but: Why? Why can't I have CHAR_BIT==8 on a 11-bit machine? E.g. my int would be something like say 11(lower)+8(upper)=19 bits. Given sizeof(char) == 1 by definition, how would you express sizeof(int)? 2.38 doesn't fit into size_t very well.... -- Ian Collins. Nov 4 '07 #24

 P: n/a Ark Khasin wrote: pete wrote: >>>>>>>>>>No. unsigned char may not have padding bits.Is this "just a theory"?No.N869 5.2.4.2.1 Sizes of integer types [#2] The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT.Thanks to adding to my confusion :)So I have an 11-bit machine bytes That's what "CHAR_BIT equals eleven" means. Sorry for being that stubborn, but: Why? Why can't I have CHAR_BIT==8 on a 11-bit machine? E.g. my int would be something like say 11(lower)+8(upper)=19 bits. Is it postulated somewhere that UINT_MAX+1==(UCHAR_MAX+1)*sizeof(unsigned) ? I don't think so. Sorry for posting nonsense contradicting 6.2.6.1 #4. It appears indeed that I cannot have 11+9-bit int. While I can have 8+8=16-bit int etc, such a C machine would simply ignore the 3 of 11 bits. Or it can use for padding, which demonstrates that padding of unsigned char is possible e.g. for trap values (for instance, uninitialized or truncated on assignment or whatever). Would it be a legit implementation? -- Ark Nov 4 '07 #25


 P: n/a Ark Khasin James Kuyper >>>[NEGATIVE_ZERO comes to mind - and goes away. BTW, is it fair to saythat bitwise logic is a magic performed on representations, and noton values?]No. In general, the bitwise operations are defined in terms of theiractions on the values, not the representations. Is that true for &, |, ^ and ~? The definitions are very bland, butthey suggest (simply by saying so little) that the interpretation isto be based on the representation. This is backed up by section6.5p4. Yes, I took a beating in this ng recently for proposing, as an academic exercise, int cmpneq(int a, int b){ return a^b; } At the time, I agreed that the beating was well deserved. But as far as I can tell, it depended on ^ operating on representations. An authoritative and well-substantiated clarification would be more than welcome! If you think about it, you can *always* define the meaning in terms of values even if it is more natural to think of it in terms of representations. However, that would be stretching a point. An expression like '-1 | -2' does not invoke undefined behaviour and the result is most easily explained in terms of the representation of the operands. (Of course it is daft, but that is not really the point.) -- Ben. Nov 4 '07 #27


 P: n/a Ark Khasin pete wrote: >No. unsigned char may not have padding bits. >Is this "just a theory"? >No. N869 5.2.4.2.1 Sizes of integer types [#2] The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT. >Thanks to adding to my confusion :) So I have an 11-bit machine bytes >That's what "CHAR_BIT equals eleven" means. >Sorry for being that stubborn, but: Why? Why can't I have CHAR_BIT==8 on a 11-bit machine? E.g. my int would be something like say 11(lower)+8(upper)=19 bits. Is it postulated somewhere that UINT_MAX+1==(UCHAR_MAX+1)*sizeof(unsigned)? I don't think so. Sorry for posting nonsense contradicting 6.2.6.1 #4. It appears indeed that I cannot have 11+9-bit int. While I can have 8+8=16-bit int etc, such a C machine would simply ignore the 3 of 11 bits. Or it can use for padding, which demonstrates that padding of unsigned char is possible e.g. for trap values (for instance, uninitialized or truncated on assignment or whatever). Would it be a legit implementation? No. Unsigned char can't have padding bits. It is not permitted. Neither are trap representations. If you choose to fake an 8-bit char on your 11-bit hardware you must do so in such a way as to hide all evidence of the extra bits. Padding bits are visible. You can tell they are there because the number of representable values in type T is less than or equal to 2**(CHAR_BIT * sizeof(T) - 1). I.e. at least one bit does not contribute to the set of values. -- Ben. Nov 4 '07 #29
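The remark that padding bits are visible can be checked from within a program by comparing the number of value bits implied by a type's maximum with the number of bits in its object representation. A minimal sketch follows; the function names are made up for the illustration, and it only looks at unsigned char and unsigned int.

```c
#include <limits.h>
#include <stddef.h>

/* Number of value bits needed to hold max_value, which is expected to
 * be of the form 2^k - 1 (as UCHAR_MAX and UINT_MAX are). */
unsigned count_value_bits(unsigned long max_value)
{
    unsigned bits = 0;
    while (max_value != 0) {
        bits++;
        max_value >>= 1;
    }
    return bits;
}

/* Bits of the object representation that are not value bits. */
unsigned padding_bits_unsigned_char(void)
{
    return (unsigned)CHAR_BIT - count_value_bits(UCHAR_MAX);
}

unsigned padding_bits_unsigned_int(void)
{
    return (unsigned)(CHAR_BIT * sizeof(unsigned int))
           - count_value_bits(UINT_MAX);
}
```

Because UCHAR_MAX+1 must equal 2 raised to the power CHAR_BIT, the first function has to return 0 on every conforming implementation. The second returns 0 on most current platforms, but the standard allows a nonzero result.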

 P: n/a Ben Bacarisse wrote: Ark Khasin Ben Bacarisse wrote: >>James Kuyper [NEGATIVE_ZERO comes to mind - and goes away. BTW, is it fair to saythat bitwise logic is a magic performed on representations, and noton values?]No. In general, the bitwise operations are defined in terms of theiractions on the values, not the representations.Is that true for &, |, ^ and ~? The definitions are very bland, butthey suggest (simply by saying so little) that the interpretation isto be based on the representation. This is backed up by section6.5p4. Yes, I took a beating in this ng recently for proposing, as anacademic exercise,int cmpneq(int a, int b){ return a^b; }At the time, I agreed that the beating was well deserved. But as faras I can tell, it depended on ^ operating on representations.An authoritative and well-substantiated clarification would be morethan welcome! If you think about it, you can *always* define the meaning in terms of values even if it is more natural to think of it in terms of representations. However, that would be stretching a point. An expression like '-1 | -2' does not invoke undefined behaviour and the result is most easily explained in terms of the representation of the operands. (Of course it is daft, but that is not really the point.) So, -1 | -2 (or better yet, (-1)^1) is... what? Does it not depend on one of the 3 models of negatives C recognizes - 2's complement, 1's complement and sign+magnitude? -- Ark Nov 4 '07 #30

 P: n/a James Kuyper wrote: Ark Khasin wrote: >Ben Bacarisse wrote: >>Ark Khasin Ben Bacarisse wrote:] ... >>RH's point was something else altogether -- that all bits zero is notguaranteed to produce a null pointer (to be scrupulously correct, itis not guaranteed to produce a value that compares equal to a nullpointer constant). The parenthesized comment was not actually needed to make the statement "scrupulously correct"; it would have been just as correct, and less confusing, without it. >That's where I am lost and reading the standard doesn't help:What's the difference between a value of an object and how it comparesequal? I mean, if a==b, whatever their representations, in whatcontext(s) does it make sense to say they may have different values? There is no difference. Don't let the unnecessary "clarification" confuse you. The issue isn't having different values with the same representation in a single type - that can't happen. The issue is that there can be multiple different representations of the same value in a given type. However, the values of objects of that type containing those different representations must compare equal. You're tripping over a minor issue; the fact that there can be multiple representations of a null pointer. I don't think I am However, you've lost track of the key issue: that a pointer object with all of its bits set to 0 doesn't have to be one of those representations. In fact, it doesn't have to represent a valid pointer value of any kind. I don't think I have. But I find it while correct, grotesque. > >[NEGATIVE_ZERO comes to mind - and goes away. BTW, is it fair to saythat bitwise logic is a magic performed on representations, and not onvalues?] No. In general, the bitwise operations are defined in terms of their actions on the values, not the representations. For instance, E>>1 is defined as dividing the value of E by 2. No. E.g. 
with 2's complement machine and C99 (and perhaps 90% of C90) a/2 is (a>=0)?(a>>1):((a+1)>>1) The complicated exceptions all involve sign bits, and most result in undefined behavior, which is why it's strongly recommended that bitwise operations be restricted to unsigned types, or at least restricted to values which are guaranteed to be positive both before and after the operation. >> void *a;; memset_as_above(&a, 0, sizeof a); There is, at this point, no guarantee that 'a' contains a valid pointer representation. Sure. But I find it while correct, grotesque. Therefore, the next line renders the behavior of your entire program undefined: >> if (a == 0) { /* not guaranteed */ //Which is correct but implies { void **pNULL = 0; if(a==*pNULL) { /* not guaranteed */ I'm not sure what your point was; but you've just attempted to dereference a null pointer, again making the behavior undefined. Oops. I meant void *pNULL = 0; //pNULL == NULL a == pNULL not guaranteed. Not that I don't grasp it; it just seems grotesque -- Ark Nov 4 '07 #31

 P: n/a Ark Khasin So, -1 | -2 (or better yet, (-1)^1) is... what? Does it not depend on one of the 3 models of negatives C recognizes - 2's complement, 1's complement and sign+magnitude? Yes. My reading of the standard is that whatever bits are set to represent -1 and -2 (and that depends on the kind of negative number system used) are OR'd to get the result.

               1's comp     s+mag        2's comp
    -1         1..1111110   1..0000001   1..1111111
    -2         1..1111101   1..0000010   1..1111110
    -1 | -2    1..1111111   1..0000011   1..1111111
    value      -0           -3           -1

    -1         1..1111110   1..0000001   1..1111111
    1          0..0000001   0..0000001   0..0000001
    -1 ^ 1     1..1111111   1..0000000   1..1111110
    value      -0           -0           -2

I prefer my example since it can result in three values (or two values and a trap representation). -- Ben. Nov 4 '07 #32

 P: n/a Ark Khasin wrote: So, -1 | -2 (or better yet, (-1)^1) is... what? Does it not depend on one of the 3 models of negatives C recognizes - 2's complement, 1's complement and sign+magnitude?

    (-1 | -2) == {-1,-0,-3}
    (-1 ^  1) == {-2,-0,-0}

    (-1) | (-2)
    1111.1111 | 1111.1110 == (-1)
    1111.1110 | 1111.1101 == (-0)
    1000.0001 | 1000.0010 == (-3)

    (-1) ^ (1)
    1111.1111 ^ 0000.0001 == (-2)
    1111.1110 ^ 0000.0001 == (-0)
    1000.0001 ^ 0000.0001 == (-0)

-- pete Nov 4 '07 #33
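The three sets of results listed above can be reproduced on any machine by simulating the three sign models on 8-bit patterns held in unsigned values. This is a sketch for illustration only: the enum, the function names and the 8-bit width are all assumptions of the example, and negative zero is reported as plain 0 because an int on the host may not be able to express it.

```c
enum sign_model { TWOS_COMPLEMENT, ONES_COMPLEMENT, SIGN_MAGNITUDE };

/* Encode v (in the range -127..127) as an 8-bit pattern. */
unsigned encode8(int v, enum sign_model m)
{
    unsigned mag = (unsigned)(v < 0 ? -v : v);
    if (v >= 0) return mag;
    switch (m) {
    case TWOS_COMPLEMENT: return (0x100u - mag) & 0xFFu;
    case ONES_COMPLEMENT: return ~mag & 0xFFu;
    default:              return 0x80u | mag;   /* sign and magnitude */
    }
}

/* Decode an 8-bit pattern; negative zero decodes to 0. */
int decode8(unsigned bits, enum sign_model m)
{
    bits &= 0xFFu;
    if ((bits & 0x80u) == 0) return (int)bits;
    switch (m) {
    case TWOS_COMPLEMENT: return -(int)(0x100u - bits);
    case ONES_COMPLEMENT: return -(int)(~bits & 0xFFu);
    default:              return -(int)(bits & 0x7Fu);
    }
}

/* Apply | and ^ to the simulated representations. */
int or8(int a, int b, enum sign_model m)
{
    return decode8(encode8(a, m) | encode8(b, m), m);
}

int xor8(int a, int b, enum sign_model m)
{
    return decode8(encode8(a, m) ^ encode8(b, m), m);
}
```

For example, or8(-1, -2, SIGN_MAGNITUDE) gives -3 and xor8(-1, 1, TWOS_COMPLEMENT) gives -2, matching the listings in the thread.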

 P: n/a Ark Khasin Ark Khasin wrote: >>[NEGATIVE_ZERO comes to mind - and goes away. BTW, is it fair to say that bitwise logic is a magic performed on representations, and not on values?] No. In general, the bitwise operations are defined in terms of their actions on the values, not the representations. For instance, E>>1 is defined as dividing the value of E by 2. No. E.g. with 2's complement machine and C99 (and perhaps 90% of C90) a/2 is (a>=0)?(a>>1):((a+1)>>1) That is rather unfair. James Kuyper goes on, immediately, to say that the exceptions involve sign bits. He says that "most" of these cases are undefined (which is correct) but you are also largely correct in that shifting a signed and negative value one place right is implementation defined (it is not connected to 2's complement, it is simply an implementation defined operation -- which may be undefined, of course). >The complicated exceptions all involve sign bits, and most result in undefined behavior, which is why it's strongly recommended that bitwise operations be restricted to unsigned types, or at least restricted to values which are guaranteed to be positive both before and after the operation. -- Ben. Nov 4 '07 #34

 P: n/a Ark Khasin wrote: James Kuyper wrote: >Ark Khasin wrote: >>Ben Bacarisse wrote: .... >issue: that a pointer object with all of its bits set to 0 doesn'thave to be one of those representations. In fact, it doesn't have torepresent a valid pointer value of any kind. I don't think I have. But I find it while correct, grotesque. .... >No. In general, the bitwise operations are defined in terms of theiractions on the values, not the representations. For instance, E>>1 isdefined as dividing the value of E by 2. No. E.g. with 2's complement machine and C99 (and perhaps 90% of C90) a/2 is (a>=0)?(a>>1):((a+1)>>1) >The complicated exceptions allinvolve sign bits, and most result in undefined behavior, which is why As I said, the exceptions involve sign bits, and the behavior when the sign bit is set is the key difference between what I wrote and what you wrote. Please note that the behavior of (a+1)>>1 produces an implementation-defined result when a+1 is negative; while the most plausible behavior is to handle it in the fashion you expect, the standard does not require it. >>> void *a;; memset_as_above(&a, 0, sizeof a); There is, at this point, no guarantee that 'a' contains a validpointer representation. Sure. But I find it while correct, grotesque. I can't help you with the aesthetic judgments. As a practical matter, I believe that the standard lets these things depend upon the implementation, precisely because there are real platforms where the rules you'd like to see would make it unacceptably difficult to create an efficient implementation of C. It wasn't just invented to make things complicated for programmers. Therefore, the next line renders the behavior of your >entire program undefined: >>> if (a == 0) { /* not guaranteed */ //Which is correct but implies { void **pNULL = 0; if(a==*pNULL) { /* not guaranteed */ I'm not sure what your point was; but you've just attempted todereference a null pointer, again making the behavior undefined. Oops. 
I meant void *pNULL = 0; //pNULL == NULL a == pNULL not guaranteed. With those changes, if it weren't for the fact that a had been filled in by a call to memset that renders any use of 'a' dangerous, the rest of this code would have been fine. If 'a' compared equal to 0, that would normally have been sufficient to guarantee that it would also compare equal to pNULL. All null pointers must compare equal, regardless of representation, and no non-null pointer is allowed to compare equal to a null pointer. Nov 4 '07 #35
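On the shift question discussed in this post: for a non-negative a, a >> 1 is defined as the integral part of a / 2, while for a negative a the result is implementation-defined. One way to sidestep the issue is to shift only non-negative values. The sketch below computes the floor of a / 2 (what an arithmetic right shift typically yields) without ever shifting a negative operand; the function name is invented for the example.

```c
/* Floor of a / 2, written so that >> is only ever applied to
 * non-negative values, where its result is fully defined. */
int floor_half(int a)
{
    if (a >= 0) {
        return a >> 1;
    }
    /* a < 0: -(a + 1) is non-negative and cannot overflow,
     * even when a is INT_MIN. */
    return -((-(a + 1)) >> 1) - 1;
}
```

This differs from a / 2 for odd negative values, since C99 division truncates toward zero: floor_half(-3) is -2 whereas -3 / 2 is -1.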

 P: n/a pete So, -1 | -2 (or better yet, (-1)^1) is... what? Does it not depend onone of the 3 models of negatives C recognizes - 2's complement, 1'scomplement and sign+magnitude? (-1 | -2) == {-1,-0,-3} (-1 ^ 1) == {-2,-0,-0} snap! -- Ben. Nov 4 '07 #36

 P: n/a James Kuyper wrote: Ark Khasin wrote: >>No. In general, the bitwise operations are defined in terms of theiractions on the values, not the representations. For instance, E>>1 isdefined as dividing the value of E by 2. No. E.g. with 2's complement machine and C99 (and perhaps 90% of C90)a/2 is (a>=0)?(a>>1):((a+1)>>1) >>The complicated exceptions allinvolve sign bits, and most result in undefined behavior, which is why As I said, the exceptions involve sign bits, and the behavior when the sign bit is set is the key difference between what I wrote and what you wrote. Please note that the behavior of (a+1)>>1 produces an implementation-defined result when a+1 is negative; while the most plausible behavior is to handle it in the fashion you expect, the standard does not require it. I have to offer my apologies. It was unfair indeed to object to the first 40% of the statement without parsing the meaning of the remaning 60%. Sorry. -- Ark Nov 4 '07 #37

 P: n/a Ark Khasin wrote: > pete wrote: >>>>No. unsigned char may not have padding bits.Is this "just a theory"?No.N869 5.2.4.2.1 Sizes of integer types [#2] The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT. Thanks to adding to my confusion :) So I have an 11-bit machine bytes That's what "CHAR_BIT equals eleven" means. Sorry for being that stubborn, but: Why? Because that's what "CHAR_BIT" means. N869 5.2.4.2 Numerical limits [#1] -- number of bits for smallest object that is not a bit- field (byte) CHAR_BIT -- pete Nov 4 '07 #38

 P: n/a Ark Khasin wrote: .... Is it so that a consensus emerges that ^ | & on negative numbers depend I can't speak for the consensus; this is just my understanding of what the standard says. In many cases my understanding differs distinctly from the consensus understanding. on representation (or, to tell the truth, act on representations) and so are implementation-defined (although only 3 ways to implement are recognized)? Those operators don't act on the representations of objects; otherwise cases like x += (a+1) | (b-2); would be undefined. The way that they operate on or generate negative values must, however, be consistent with whichever of 3 permitted ways of handling the sign bits that the implementation chose for the relevant type. I see no way for those operators to operate on the padding bits in any sense that is meaningful within the context of the standard. I would expect that what happens is that the padding bits would be handled just like the value bits, but that an implementation is not required to do so. If you want to ensure that padding bits are handled in the same fashion, you need to access the object as an array of unsigned char, and perform the operations on a byte-by-byte basis. Nov 4 '07 #39
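The last suggestion in this post, accessing the object as an array of unsigned char and operating byte by byte, might look like the sketch below. The function name is hypothetical. It assumes that the combined bytes form a valid int representation, which holds on implementations where int has no padding bits; elsewhere, reading the result could hit a trap representation.

```c
#include <string.h>

/* OR the object representations of two ints one byte at a time.
 * Every bit of the representation is combined, including any
 * padding bits, since the bytes are handled as unsigned char. */
int or_object_representations(int a, int b)
{
    unsigned char ra[sizeof(int)];
    unsigned char rb[sizeof(int)];
    int result;
    size_t i;

    memcpy(ra, &a, sizeof a);        /* copy out the raw bytes */
    memcpy(rb, &b, sizeof b);
    for (i = 0; i < sizeof(int); i++) {
        ra[i] = (unsigned char)(ra[i] | rb[i]);
    }
    memcpy(&result, ra, sizeof result);
    return result;
}
```

On an implementation without padding bits this should agree with a | b for non-negative operands, for example or_object_representations(5, 8) giving 13.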

 P: n/a Ark Khasin wrote: >Is it so that a consensus emerges that ^ | & on negative numbers depend on representation (or, to tell the truth, act on representations) and so are implementation-defined (although only 3 ways to implement are recognized)? Well, not many people have expressed a view, and by this time in the thread I don't think anyone is in any doubt about what the various operations *do*, so the discussion comes down to how the term "representation" is used. James Kuyper takes the view that representations only exist in objects (i.e. when values are stored) but, although that is a reasonable meaning of the term, I don't think that is supported by the wording of the standard (see my other answer to him in this thread). In any case, the meaning of ^, |, and & on signed values is most definitely implementation defined because it depends on the way negative numbers are represented by the implementation. To me, that means they are defined in terms of the representation, even though no storage is involved in an expression like '-1 | -2'. You decide if you like this meaning of the term. Is that a consensus? No, but to some extent one can interpret silence as consent here. Alternatively, since such things are inherently not portable, the silence might just mean that no one cares. I certainly care about this much less than the volume of words I have written about it suggests -- I'd never use such a construct. -- Ben. Nov 5 '07 #40

 P: n/a James Kuyper wrote: Is it so that a consensus emerges that ^ | & on negative numbers depend I can't speak for the consensus; this is just my understanding of what the standard says. In many cases my understanding differs distinctly from the consensus understanding. >on representation (or, to tell the truth, act on representations) and so are implementation-defined (although only 3 ways to implement are recognized)? Those operators don't act on the representations of objects; otherwise cases like x += (a+1) | (b-2); You have written a correct statement because, of course, a+1 is not an object. The question is, does '(a+1) | (b-2)' depend (or act) on the representation of 'a+1' and 'b-2' and I'd say it does. Section 6.2.6 is called "Representations of types" (not objects) and 6.2.6.1 p4 takes some pains to define a new term -- the "object representation" -- which is not quite the same thing as the representation of the type. I don't want to suggest that | operates on padding bits. I don't think one can determine that either way, but it seems to be pushing the meaning of representation to its limits (and beyond that used in the standard) to say that |, &, and ^ act on the value not the representation. would be undefined. The way that they operate on or generate negative values must, however, be consistent with whichever of 3 permitted ways of handling the sign bits that the implementation chose for the relevant type. These "permitted ways" are described in the section called "Representation of types". Your sentence would be simpler if you had said that the bits they operate on (and the meaning of the bits that are produced) are determined by the representation used for signed integer types. 
It is possible to define the result of | on signed types by talking only about the value of the operands, but it is complicated to do so and the standard clearly distinguishes between the terms "representation" and "object representation" so that there is a simple way to understand the operation. The standard uses only a few words to define | (and the others) because the term "corresponding bits" is clearly intended to refer back to the bits used to represent the value as described in 6.2.6. I see no way for those operators to operate on the padding bits in any sense that is meaningful within the context of the standard. I agree, but padding bits are only one part of the representation of the type. -- Ben. Nov 5 '07 #41

 P: n/a Ben Bacarisse wrote: James Kuyper Those operators don't act on the representations of objects; otherwise cases like x += (a+1) | (b-2); You have written a correct statement because, of course, a+1 is not an object. The question is, does '(a+1) | (b-2)' depend (or act) on the representation of 'a+1' and 'b-2' and I'd say it does. Section 6.2.6 is called "Representations of types" (not objects) and 6.2.6.1 p4 takes some pains to define a new term -- the "object representation" -- which is not quite the same thing as the representation of the type. That is not clear to me. The standard defines what "object representation" means, but it never defines what "representation" means. As far as I can tell, the term "representation" is almost always used in the standard as a short form for "object representation". The standard uses the term "value representation" exactly once, in 6.2.6.2p2, and oddly enough, while it is worded as a definition, it is not italicized as a definition should be. If it were an official definition, from context it would appear to be defined only for unsigned types. When the word "representation" is used without either "object" or "value" preceding it, I found only a few cases where it was clear that it was not a reference to the object representation; in every case, it was also clear that it did not refer to the value representation. Examples include such things as the representation of characters on the display screen, or the representation of data in the output file. In both of those cases, the only thing the standard says is that those are details outside of its scope. I'm not saying there's no instances where a lone "representation" clearly refers to the value representation, where inserting the word "object" before "representation" would clearly change the intended meaning. There's too many instances for me to check reliably. However, I didn't find any - do you know of any? >would be undefined. 
The way that they operate on or generate negative values must, however, be consistent with whichever of 3 permitted ways of handling the sign bits that the implementation chose for the relevant type. These "permitted ways" are described in the section called "Representation of types". Your sentence would be simpler if you had said that the bits they operate on (and the meaning of the bits that are produced) are determined by the representation used for signed integer types. Which is the way that the standard says the same thing. I don't think there's a difference in meaning, just a difference in clarity. .... and the standard clearly distinguishes between the terms "representation" and "object representation" so that there is a simple Citation, please? The definition of "object representation" does not clearly distinguish them. Nov 6 '07 #42

 P: n/a Ben Bacarisse wrote: .... If section 6.2.6 was called "Object representation of types" so that it was unambiguously about how values are stored and nothing else, then I believe '-1 | -2' would be undefined. If values, not stored in objects, do not have a representation, then what "bits" are there to or together? It is because values may be represented as collections of bits (as it happens in three different ways) that such expressions can have a meaning defined by combining "corresponding bits". 6.5p4 says "Some operators (the unary operator ~, and the binary operators <<, >>, &, ^, and |, collectively described as bitwise operators) are required to have operands that have integer type. These operators yield values that depend on the internal representations of integers, and have implementation-defined and undefined aspects for signed types". As I understand it, that statement about the "internal representations" refers to the object representation, and does not imply that the standard attaches any meaning to the representation of a value that is not currently stored in any C object. It merely requires that the bitwise operators act on values in such a way that if the resulting value were saved in an object of that value's type, it would have the correct bit pattern. .... I have no other citation than the definition I already cited. I agree the distinction is not crystal clear, but if your reading of the words is that there is essentially no difference between "representation" and "object representation" then how do you give '-1 | -2' a meaning? You could, of course, say that | (and friends) behaves as if its operands were stored in objects and the corresponding bits are combined together, but you'd *still* be referencing the allowed representations. That's exactly how I understand it. 
I never denied that the allowed representations were referenced, only that the concept of a representation of a value only acquires a meaning in the event that it is stored in an object. Nov 7 '07 #43

 P: n/a James Kuyper wrote: If section 6.2.6 was called "Object representation of types" so that it was unambiguously about how values are stored and nothing else, then I believe '-1 | -2' would be undefined. If values, not stored in objects, do not have a representation, then what "bits" are there to or together? It is because values may be represented as collections of bits (as it happens in three different ways) that such expressions can have a meaning defined by combining "corresponding bits". 6.5p4 says "Some operators (the unary operator ~, and the binary operators <<, >>, &, ^, and |, collectively described as bitwise operators) are required to have operands that have integer type. These operators yield values that depend on the internal representations of integers, and have implementation-defined and undefined aspects for signed types". As I understand it, that statement about the "internal representations" refers to the object representation, and does not imply that the standard attaches any meaning to the representation of a value that is not currently stored in any C object. I can accept that this is the intent, but I do not think it is clear and unambiguous. Since it can't be detected by a C program, I don't really care if there is one representation (when a value is stored) or another, transient one, as well. The latter just seemed to me a convenient expository device for exactly these cases, but if it is a fiction of my imagination, so be it! It merely requires that the bitwise operators act on values in such a way that if the resulting value were saved in an object of that value's type, it would have the correct bit pattern. ... >I have no other citation than the definition I already cited. 
I agree the distinction is not crystal clear, but if your reading of the words is that there is essentially no difference between "representation" and "object representation" then how do you give '-1 | -2' a meaning? You could, of course, say that | (and friends) behaves as if its operands were stored in objects and the corresponding bits are combined together, but you'd *still* be referencing the allowed representations. That's exactly how I understand it. I never denied that the allowed representations were referenced, Well that was exactly what I thought you were doing when you said: "In general, the bitwise operations are defined in terms of their actions on the values, not the representations." I just can't square this with your last remark above. For four of the six, the representation is a required part of the definition. only that the concept of a representation of a value only acquires a meaning in the event that it is stored in an object. -- Ben. Nov 7 '07 #44
