In article <ea**********@news.ryerson.ca> <gt*****@ee.ryerson.ca> wrote:
>I have a question about string constants.
There are a number of tricks you need to "get straight in your head"
in order to deal with this.
First, a C string is actually a data structure, namely, an array
of "char"s in which the first zero-byte is considered the end of the
string.
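To make the array-versus-string distinction concrete, here is a small
sketch. The names buf, buf_size and buf_strlen are made up for
illustration; only sizeof and strlen() are real.

```c
#include <stddef.h>
#include <string.h>

/* Six chars of storage: 'a', 'b', '\0', 'c', 'd', '\0'. */
static const char buf[] = { 'a', 'b', '\0', 'c', 'd', '\0' };

/* Size of the array in chars; fixed by the declaration. */
size_t buf_size(void)
{
    return sizeof buf;
}

/* Length of the string held in buf: strlen() stops counting at the
   first zero byte, so the "cd" after it is not part of the string. */
size_t buf_strlen(void)
{
    return strlen(buf);
}
```

The array holds six chars, but the string stored in it is only two
chars long.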
Second, escapes like '\007' are interpreted by the compiler, and
the lexical rules for the octal version are:
    From the backslash, consume up to (but no more than) three
    octal digits, stopping when you run out of digits or when
    the first "invalid" character occurs.
Hence, if you encounter
    \1\29\00345
this "means" \1, then \2, then 9, then \003, then 4, then 5.
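The rule can be checked by putting that sequence into a literal and
looking at the resulting array. This is only a sketch; demo, demo_size
and demo_byte are hypothetical names.

```c
#include <stddef.h>

/* \1, \2, '9', \003, '4', '5', plus the terminating \0 that the
   compiler appends to the literal. */
static const char demo[] = "\1\29\00345";

/* Number of chars in the array, terminator included. */
size_t demo_size(void)
{
    return sizeof demo;
}

/* Value of the i'th char, as a non-negative int. */
int demo_byte(size_t i)
{
    return (unsigned char)demo[i];
}
```

Note that '9' ends the \2 escape because 9 is not an octal digit, and
'4' ends the \003 escape because three digits have already been used.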
Third, string literals usually -- but not always[%] -- mean "generate
an anonymous array containing the characters given in the literal,
with a \0 character appended".
Last, adjacent string literals are concatenated after escape sequence
interpretation, but before adding the final \0.
    char str1[] = "\007";
This string literal has one \7 character inside, so generates an
array containing two characters, namely \7 and \0.
    char str2[] = "\0" "07";
Here there are two adjacent string literals. The first has one
\0 character inside, and the second has two characters inside,
'0' and '7'. These are concatenated -- giving '\0' '0' '7'
in that order -- and a final \0 is added. The result is the same
as if you wrote either:
    char str2[] = "\00007";
or the initializer you gave for str3:
    char str3[] = { '\0', '0', '7', '\0' };
Both of these create an array of size 4, containing the four
specified "char"s. Since str2 and str3 both begin with a zero
byte, their strlen()s are zero, even though both arrays continue
(always) to hold four "char"s.
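The three declarations can be compared side by side. The arrays are
the ones from the question; the helper functions are hypothetical and
exist only to expose the sizes and lengths.

```c
#include <stddef.h>
#include <string.h>

static const char str1[] = "\007";                    /* \7, \0 */
static const char str2[] = "\0" "07";                 /* \0, '0', '7', \0 */
static const char str3[] = { '\0', '0', '7', '\0' };  /* same four chars */

size_t str1_size(void) { return sizeof str1; }
size_t str2_size(void) { return sizeof str2; }
size_t str1_len(void)  { return strlen(str1); }
size_t str2_len(void)  { return strlen(str2); }

/* Nonzero if str2 and str3 have the same size and the same contents. */
int str2_same_as_str3(void)
{
    return sizeof str2 == sizeof str3
        && memcmp(str2, str3, sizeof str2) == 0;
}
```

Here str1 is a two-char array holding a string of length 1, while
str2 and str3 are identical four-char arrays holding a string of
length 0.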
>I understand that yet another obscure C feature is the octal character
>specification so that \ddd is one character.
Right -- but only if the digits are uninterrupted, and all octal.
(The situation is quite different for \x escapes, as someone else
noted elsethread.)
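As a rough illustration of the difference: a \x escape has no
three-digit limit and keeps consuming hex digits, so the usual way to
stop one is to split the literal. The names below are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Octal: \101 stops after three digits by itself, so '4' and '2'
   are ordinary characters. */
static const char oct_run[] = "\10142";

/* Hex: the escape is ended by closing the literal; concatenation
   happens afterwards, so '4' and '2' are again ordinary characters.
   Written as one literal, "\x4142" would instead be a single escape
   whose value does not fit in a char. */
static const char hex_split[] = "\x41" "42";

size_t oct_run_size(void)   { return sizeof oct_run; }
size_t hex_split_size(void) { return sizeof hex_split; }

/* Nonzero if both arrays hold the same chars (octal 101 == hex 41). */
int oct_and_hex_match(void)
{
    return sizeof oct_run == sizeof hex_split
        && memcmp(oct_run, hex_split, sizeof oct_run) == 0;
}
```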
>However, should not str1 and str2 be the same?
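No; the order in which escape interpretation and string-literal
concatenation occur forbids this. Each literal has its escapes
interpreted separately, so "\0" is already one complete character
by the time it is joined to "07".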
[% The two exceptions are: when the literal is not the last in an
adjacent sequence, so that concatenation occurs before adding the
\0, or when the literal is used as an initializer for an array
whose size was specified, and whose specified size is exactly large
enough to hold the characters in the literal without adding the
\0. Making use of this second exception is particularly annoying;
it reminds me of the Bad Old Days of Hollerith constants in Fortran.]
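A sketch of the second exception, for the curious. This is valid C
but not valid C++, and some compilers may warn about it; the names
exact, roomy and the helpers are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Exactly large enough for 'a', 'b', 'c': no \0 is stored, so this
   array does not hold a string and must not be passed to strlen(). */
static const char exact[3] = "abc";

/* One char larger: the usual \0 is stored and this is a string. */
static const char roomy[4] = "abc";

size_t exact_size(void) { return sizeof exact; }
size_t roomy_len(void)  { return strlen(roomy); }

/* Nonzero if exact holds 'a', 'b', 'c'; compares only three chars. */
int exact_is_abc(void)
{
    return memcmp(exact, roomy, sizeof exact) == 0;
}
```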
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it
http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.