
Re: C and NULL character

"mkeles84" <mk******@hotmail.com> writes:
> I have a problem about NULL's.
> for example, a variable is name :
> --------------
> char name[6]="MEHMET"; /* 6 characters */
> if (strcmp(name,"MEHMET") == 0){
> printf("true");
> }else{
> printf("false");
> }
> I think result must be "true" but I saw "false" on screen.
> how can I compare it. I don't want to use NULL character after variable.
comp.std.c deals with the C standard -- the document, how it's
developed, and so forth. comp.lang.c deals with the C language, and
that's where your question belongs. I've cross-posted there and set
followups.

NULL (all-caps) is a macro that expands to a null pointer constant.
What you're asking about is the null character, '\0', sometimes
referred to as NUL (one L).

Your declaration
char name[6]="MEHMET";
doesn't create a string. It creates an array of 6 characters, *not*
terminated by a null character. By passing a pointer to this array to
strcmp(), you invoke undefined behavior; literally anything could
happen. What's most likely to happen is that strcmp will scan the
first 6 characters of the arrays pointed to by its two arguments;
finding them all to be equal, it will continue to the 7th character.
If the byte in memory following your variable ``name'' happens to be
'\0', strcmp() will return 0. If it happens not to be '\0', strcmp()
will return a non-zero value. If it's outside the range of memory
that your program is allowed to access, it could crash your program.
> Message posted using http://www.talkaboutcomputing.com/group/comp.std.c/
> More information at http://www.talkaboutcomputing.com/faq.html
You'll do better posting through a real Usenet server. From what I've
seen, even Google Groups is likely to be better than
talkaboutcomputing.com.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Jul 21 '08 #1
7 Replies


On Mon, 21 Jul 2008 01:25:56 -0700, Keith Thompson
<ks***@mib.org> wrote:
> "mkeles84" <mk******@hotmail.com> writes:
[snip]
>
> Your declaration
> char name[6]="MEHMET";
> doesn't create a string. It creates an array of 6 characters, *not*
> terminated by a null character.
[snip]

This is one of those irritating idiosyncrasies of C. The rule is
that excess initializers are discarded. GCC, and I suppose most
compilers, will issue a warning if there are excess initializers
except in the one special case where the excess initializer is
the trailing \0. (There may be some combination of GCC flags
that will produce a warning; that is not to the point.)
Richard Harter, cr*@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Save the Earth now!!
It's the only planet with chocolate.
Jul 21 '08 #2

cr*@tiac.net (Richard Harter) writes:
> On Mon, 21 Jul 2008 01:25:56 -0700, Keith Thompson
> <ks***@mib.org> wrote:
>> "mkeles84" <mk******@hotmail.com> writes:
> [snip]
>>
>> Your declaration
>> char name[6]="MEHMET";
>> doesn't create a string. It creates an array of 6 characters, *not*
>> terminated by a null character.
> [snip]
>
> This is one of those irritating idiosyncrasies of C. The rule is
> that excess initializers are discarded. GCC, and I suppose most
> compilers, will issue a warning if there are excess initializers
> except in the one special case where the excess initializer is
> the trailing \0. (There may be some combination of GCC flags
> that will produce a warning; that is not to the point.)
No, there's no such rule. In fact, C99 6.7.8p2 says:

No initializer shall attempt to provide a value for an object not
contained within the entity being initialized.

Excess initializers aren't discarded; they're a constraint violation.

There's only one special case for string literals, in C99 6.7.8p14:

An array of character type may be initialized by a character
string literal, optionally enclosed in braces. Successive
characters of the character string literal (including the
terminating null character if there is room or if the array is of
unknown size) initialize the elements of the array.

It's the "if there's room" clause that allows the null character to be
ignored, but only if the declared size of the array is exactly the
same as the length of the string excluding the trailing '\0'.

IMHO a better way to handle this would have been to drop the "if
there's room" wording, and perhaps add a special flavor of string
literal that doesn't include the trailing '\0', so that this:
char name[6] = "MEHMET";
would be a constraint violation (<OT>as it is in C++</OT>), and this:
char name[] = N"MEHMET";
would be equivalent to:
char name[6] = { 'M', 'E', 'H', 'M', 'E', 'T' };

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Jul 21 '08 #3

Keith Thompson wrote:
> There's only one special case for string literals, in C99 6.7.8p14:
>
> An array of character type may be initialized by a character
> string literal, optionally enclosed in braces. Successive
> characters of the character string literal (including the
> terminating null character if there is room or if the array is of
> unknown size) initialize the elements of the array.
>
> It's the "if there's room" clause that allows the null character to be
> ignored, but only if the declared size of the array is exactly the
> same as the length of the string excluding the trailing '\0'.
>
> IMHO a better way to handle this would have been to drop the "if
> there's room" wording, and perhaps add a special flavor of string
> literal that doesn't include the trailing '\0', so that this:
> char name[6] = "MEHMET";
> would be a constraint violation (<OT>as it is in C++</OT>), and this:
> char name[] = N"MEHMET";
> would be equivalent to:
> char name[6] = { 'M', 'E', 'H', 'M', 'E', 'T' };
I see no reason to provide a special syntax. If people want to use the
syntax for a string, they should be required to provide enough space to
store an entire string (including the terminator). If they want a mere
character array (without the terminator), let them use the syntax for a
normal array initializer.

IMHO, this is a flaw in the standard, but as we all know, ANSI was
chartered primarily to document what existed already, not create a
perfect language.

S
Jul 21 '08 #4

Stephen Sprunk <st*****@sprunk.org> writes:
> Keith Thompson wrote:
>> There's only one special case for string literals, in C99 6.7.8p14:
>>
>> An array of character type may be initialized by a character
>> string literal, optionally enclosed in braces. Successive
>> characters of the character string literal (including the
>> terminating null character if there is room or if the array is of
>> unknown size) initialize the elements of the array.
>>
>> It's the "if there's room" clause that allows the null character to be
>> ignored, but only if the declared size of the array is exactly the
>> same as the length of the string excluding the trailing '\0'.
>>
>> IMHO a better way to handle this would have been to drop the "if
>> there's room" wording, and perhaps add a special flavor of string
>> literal that doesn't include the trailing '\0', so that this:
>> char name[6] = "MEHMET";
>> would be a constraint violation (<OT>as it is in C++</OT>), and this:
>> char name[] = N"MEHMET";
>> would be equivalent to:
>> char name[6] = { 'M', 'E', 'H', 'M', 'E', 'T' };
>
> I see no reason to provide a special syntax. If people want to use
> the syntax for a string, they should be required to provide enough
> space to store an entire string (including the terminator). If they
> want a mere character array (without the terminator), let them use the
> syntax for a normal array initializer.
It would be useful for the rare cases where you actually want a
sequence of characters that's not terminated by '\0'. In particular,
it might make it marginally easier to work with alternative
representations for strings (here I'm using "strings" in a generic
sense, not the C sense).

Using a normal array initializer is tedious; a new string literal
syntax would make the code more readable, and the leading 'N' (or
whatever) would stand out enough to make the meaning obvious.

But I'm not at all convinced that the need is great enough to justify
the change to the language (thus the word "perhaps" above).
> IMHO, this is a flaw in the standard, but as we all know, ANSI was
> chartered primarily to document what existed already, not create a
> perfect language.
Agreed.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Jul 21 '08 #5

On Mon, 21 Jul 2008 09:30:11 -0700, Keith Thompson
<ks***@mib.org> wrote:
> cr*@tiac.net (Richard Harter) writes:
>> On Mon, 21 Jul 2008 01:25:56 -0700, Keith Thompson
>> <ks***@mib.org> wrote:
>>> "mkeles84" <mk******@hotmail.com> writes:
>> [snip]
>>>
>>> Your declaration
>>> char name[6]="MEHMET";
>>> doesn't create a string. It creates an array of 6 characters, *not*
>>> terminated by a null character.
>> [snip]
>>
>> This is one of those irritating idiosyncrasies of C. The rule is
>> that excess initializers are discarded. GCC, and I suppose most
>> compilers, will issue a warning if there are excess initializers
>> except in the one special case where the excess initializer is
>> the trailing \0. (There may be some combination of GCC flags
>> that will produce a warning; that is not to the point.)
>
> No, there's no such rule. In fact, C99 6.7.8p2 says:
>
> No initializer shall attempt to provide a value for an object not
> contained within the entity being initialized.
>
> Excess initializers aren't discarded; they're a constraint violation.
>
> There's only one special case for string literals, in C99 6.7.8p14:
>
> An array of character type may be initialized by a character
> string literal, optionally enclosed in braces. Successive
> characters of the character string literal (including the
> terminating null character if there is room or if the array is of
> unknown size) initialize the elements of the array.

My bad, I misread the section in question.

C99? Isn't that a fantasy?
> It's the "if there's room" clause that allows the null character to be
> ignored, but only if the declared size of the array is exactly the
> same as the length of the string excluding the trailing '\0'.
>
> IMHO a better way to handle this would have been to drop the "if
> there's room" wording, and perhaps add a special flavor of string
> literal that doesn't include the trailing '\0', so that this:
> char name[6] = "MEHMET";
> would be a constraint violation (<OT>as it is in C++</OT>), and this:
> char name[] = N"MEHMET";
> would be equivalent to:
> char name[6] = { 'M', 'E', 'H', 'M', 'E', 'T' };
The cure is worse than the disease.
Richard Harter, cr*@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Jul 22 '08 #6

Stephen Sprunk wrote:
> Keith Thompson wrote:
>> ...IMHO a better way to handle this would have been to drop
>> the "if there's room" wording, and perhaps add a special
>> flavor of string literal that doesn't include the trailing '\0',
>> so that this:
>> char name[6] = "MEHMET";
>> would be a constraint violation (<OT>as it is in C++</OT>), and this:
>> char name[] = N"MEHMET";
>> would be equivalent to:
>> char name[6] = { 'M', 'E', 'H', 'M', 'E', 'T' };
>
> I see no reason to provide a special syntax.
Neither do I.
> If people want to use the syntax for a string, they should be required
> to provide enough space to store an entire string (including the
> terminator). If they want a mere character array (without the
> terminator), let them use the syntax for a normal array initializer.
>
> IMHO, this is a flaw in the standard, but as we all know, ANSI was
> chartered primarily to document what existed already, not create
> a perfect language.
Fixed width fields still exist; many databases are still filled with
CHAR(N) columns.

You may not appreciate the rule, but that's no reason to deprive those
that do, even if C++ has already done so.

--
Peter
Jul 22 '08 #7

Peter Nilsson <ai***@acay.com.au> writes:
> Stephen Sprunk wrote:
>> Keith Thompson wrote:
>>> ...IMHO a better way to handle this would have been to drop
>>> the "if there's room" wording, and perhaps add a special
>>> flavor of string literal that doesn't include the trailing '\0',
>>> so that this:
>>> char name[6] = "MEHMET";
>>> would be a constraint violation (<OT>as it is in C++</OT>), and this:
>>> char name[] = N"MEHMET";
>>> would be equivalent to:
>>> char name[6] = { 'M', 'E', 'H', 'M', 'E', 'T' };
>>
>> I see no reason to provide a special syntax.
>
> Neither do I.
>
>> If people want to use the syntax for a string, they should be required
>> to provide enough space to store an entire string (including the
>> terminator). If they want a mere character array (without the
>> terminator), let them use the syntax for a normal array initializer.
>>
>> IMHO, this is a flaw in the standard, but as we all know, ANSI was
>> chartered primarily to document what existed already, not create
>> a perfect language.
>
> Fixed width fields still exist; many databases are still filled with
> CHAR(N) columns.
>
> You may not appreciate the rule, but that's no reason to deprive those
> that do, even if C++ has already done so.
Ok, so my idea for a special literal syntax for non-null-terminated
character arrays doesn't seem to be very popular.

I'll just point out a problem with the current special-case rule. If
I want to store "hello, world" with no '\0' in an array, I have two
choices. I can either write out a verbose array initializer for
something that consists entirely of printable characters:

char message[] = { 'h', 'e', 'l', 'l', 'o', ',', ' ',
'w', 'o', 'r', 'l', 'd' };

or I can declare message with an explicit length:

char message[12] = "hello world";

The problem with the latter is that I have to *count characters*,
something that the compiler could and should have done for me.

And when I want to change the string, I'm likely to forget to update
the length. Did you notice that I deleted the comma but left the
length at 12, so message will have the trailing '\0' after all, with
no warning from the compiler?

Even Fortran replaced Hollerith constants.

There's no *need* for a new syntax, and I'm not even suggesting that
it would be worthwhile to add one, but there would be *some* benefit.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Jul 22 '08 #8
