By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
428,978 Members | 1,492 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 428,978 IT Pros & Developers. It's quick & easy.

remove entry from text file [NOT A HOMEWORK]

P: n/a
Hello, All!

I implemented simple program to eliminate entry from the file having the
following structure (actually it's config file of 'named' DNS package for
those who care and know):

options {
directory "/var/named";
listen-on { 192.168.11.22; 127.0.0.1; };
forwarders { 168.126.63.1; };
pid-file "/var/run/named.pid";
};

controls {
unix "/var/run/ndc" perm 0600 owner 0 group 0;
};

zone "." {
type hint;
file "root.cache";
};

zone "localhost" {
type master;
file "localhost";
};

zone "my.local" {
type master;
file "my.local";
allow-update { 127.0.0.1; 192.168.11.0/24; };
};
....

I bother only about "zone" entries and the goal of code to remove them only.
Here is the source code I'm done with. I'd like to hear reasonable criticism
and useful advices :-) on optimization, bugs etc.

#include <stdio.h>
#include <string.h>
#include <strings.h>

#define BUFSIZE 100

int main(int argc, char **argv)
{
FILE *f; /* original */
FILE *fn; /* new */
char buf[BUFSIZE] = { 0 };
register char *idx;
char sidx[BUFSIZE];
int found = 0; /* set if zone found */
int i = 0;

if (argc != 3) {
fprintf(stderr, "Usage: %s <zonename> <configfile>\n", *argv);
return (0);
}

printf("%s %s\n", argv[1], argv[2]);

/* open original */
if ( (f = fopen(argv[2], "r")) == NULL )
return -1;

if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )
return -1;

while ( !feof(f) ) {
fgets(buf, BUFSIZE, f);

/* skip leading spaces */
for (idx = buf; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* look for 'zone' token */
if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {
if (!found) {
fprintf(fn, buf);
fflush(fn);
}
continue;
}

idx += sizeof("zone ") - 1;

/* skip spaces inside of token */
for (; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* extract token from commas */
if (*idx++ != '"')
continue;

/* put token into buffer to compare */
while (*idx != '\0' && *idx != '"')
sidx[i++] = *idx++;

sidx[i] = '\0'; i = 0;
if (strcasecmp(sidx, argv[1]) != 0) {
found = 0;
fprintf(fn, buf);
fflush(fn);
}
else
found = 1;
}

fclose(f);
fclose(fn);

return 0;
}

Thank you in advance.

With best regards, Roman Mashak. E-mail: mr*@tusur.ru
Feb 23 '06 #1
Share this Question
Share on Google+
36 Replies


P: n/a

Roman Mashak wrote:
Hello, All!

I implemented simple program to eliminate entry from the file having the
following structure (actually it's config file of 'named' DNS package for
those who care and know):

options {
directory "/var/named";
listen-on { 192.168.11.22; 127.0.0.1; };
forwarders { 168.126.63.1; };
pid-file "/var/run/named.pid";
};

controls {
unix "/var/run/ndc" perm 0600 owner 0 group 0;
};

zone "." {
type hint;
file "root.cache";
};

zone "localhost" {
type master;
file "localhost";
};

zone "my.local" {
type master;
file "my.local";
allow-update { 127.0.0.1; 192.168.11.0/24; };
};
...

I bother only about "zone" entries and the goal of code to remove them only.
Here is the source code I'm done with. I'd like to hear reasonable criticism
and useful advices :-) on optimization, bugs etc.

#include <stdio.h>
#include <string.h>
#include <strings.h>

#define BUFSIZE 100

int main(int argc, char **argv)
{
FILE *f; /* original */
FILE *fn; /* new */
char buf[BUFSIZE] = { 0 };
register char *idx;
char sidx[BUFSIZE];
int found = 0; /* set if zone found */
int i = 0;

if (argc != 3) {
fprintf(stderr, "Usage: %s <zonename> <configfile>\n", *argv);
return (0);
}

printf("%s %s\n", argv[1], argv[2]);

/* open original */
if ( (f = fopen(argv[2], "r")) == NULL )
return -1;

if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )
return -1;

while ( !feof(f) ) {
fgets(buf, BUFSIZE, f);

/* skip leading spaces */
for (idx = buf; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* look for 'zone' token */
if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {
if (!found) {
fprintf(fn, buf);
fflush(fn);
}
continue;
}

idx += sizeof("zone ") - 1;

/* skip spaces inside of token */
for (; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* extract token from commas */
if (*idx++ != '"')
continue;

/* put token into buffer to compare */
while (*idx != '\0' && *idx != '"')
sidx[i++] = *idx++;

sidx[i] = '\0'; i = 0;
if (strcasecmp(sidx, argv[1]) != 0) {
found = 0;
fprintf(fn, buf);
fflush(fn);
}
else
found = 1;
}

fclose(f);
fclose(fn);

return 0;
}

Thank you in advance.

With best regards, Roman Mashak. E-mail: mr*@tusur.ru


Look for references about "lex" C source generator (or even "yacc" if
you have more complex requirements like this one)

Feb 23 '06 #2

P: n/a
Roman Mashak wrote:
Hello, All!

I implemented simple program to eliminate entry from the file having the
following structure (actually it's config file of 'named' DNS package for
those who care and know):
<snip>
if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )
return -1;


Here, because of opening the file for writing in present directory, it
will restrict the directory from which a user can run your program. If a
user does n't have write permission for the present directory, then he
will have to change the directory and run your program.

Probably you should have explained algorithm used in your program in
short paragraph.
Feb 23 '06 #3

P: n/a
boa
Roman Mashak wrote:
Hello, All!

I implemented simple program to eliminate entry from the file having the
following structure (actually it's config file of 'named' DNS package for
those who care and know):

options {
directory "/var/named";
listen-on { 192.168.11.22; 127.0.0.1; };
forwarders { 168.126.63.1; };
pid-file "/var/run/named.pid";
};

controls {
unix "/var/run/ndc" perm 0600 owner 0 group 0;
};

zone "." {
type hint;
file "root.cache";
};

zone "localhost" {
type master;
file "localhost";
};

zone "my.local" {
type master;
file "my.local";
allow-update { 127.0.0.1; 192.168.11.0/24; };
};
...

I bother only about "zone" entries and the goal of code to remove them only.
Here is the source code I'm done with. I'd like to hear reasonable criticism
and useful advices :-) on optimization, bugs etc.

#include <stdio.h>
#include <string.h>
#include <strings.h>
strings.h?

#define BUFSIZE 100

int main(int argc, char **argv)
{
FILE *f; /* original */
FILE *fn; /* new */
char buf[BUFSIZE] = { 0 };
register char *idx;
char sidx[BUFSIZE];
int found = 0; /* set if zone found */
int i = 0;

if (argc != 3) {
fprintf(stderr, "Usage: %s <zonename> <configfile>\n", *argv);
return (0);
Maybe return EXIT_FAILURE to indicate an error?
}

printf("%s %s\n", argv[1], argv[2]);

/* open original */
if ( (f = fopen(argv[2], "r")) == NULL )
return -1;
Again, EXIT_FAILURE.

if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )
How about using tmpnam() instead?
return -1;

while ( !feof(f) ) {
fgets(buf, BUFSIZE, f);
See http://c-faq.com/stdio/feof.html

/* skip leading spaces */
for (idx = buf; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* look for 'zone' token */
if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {
strlen("zone"), not sizeof().
if (!found) {
fprintf(fn, buf);
fflush(fn);
Why do you need to call fflush()?

}
continue;
}

idx += sizeof("zone ") - 1;

/* skip spaces inside of token */
for (; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;
How about a function that strips the white space?


/* extract token from commas */
if (*idx++ != '"')
continue;

/* put token into buffer to compare */
while (*idx != '\0' && *idx != '"')
sidx[i++] = *idx++;

sidx[i] = '\0'; i = 0;
if (strcasecmp(sidx, argv[1]) != 0) {
found = 0;
fprintf(fn, buf);
fflush(fn);
}
else
found = 1;
}

fclose(f);
fclose(fn);

return 0;
}


That's a start, HTH.

Boa
Feb 23 '06 #4

P: n/a
"Roman Mashak" <mr*@tusur.ru> writes:
Hello, All!

I implemented simple program to eliminate entry from the file having the
following structure (actually it's config file of 'named' DNS package for
those who care and know):
<snip>
zone "." {
type hint;
file "root.cache";
};

zone "localhost" {
type master;
file "localhost";
};

zone "my.local" {
type master;
file "my.local";
allow-update { 127.0.0.1; 192.168.11.0/24; };
};
<snip> (here's the code:)
#include <stdio.h>
#include <string.h>
#include <strings.h>
The last of these is not Standard C; see my discussion later.

#define BUFSIZE 100

int main(int argc, char **argv)
{
FILE *f; /* original */
FILE *fn; /* new */
char buf[BUFSIZE] = { 0 };
register char *idx;
char sidx[BUFSIZE];
int found = 0; /* set if zone found */
int i = 0;

if (argc != 3) {
fprintf(stderr, "Usage: %s <zonename> <configfile>\n", *argv);
return (0);
}

printf("%s %s\n", argv[1], argv[2]);

/* open original */
if ( (f = fopen(argv[2], "r")) == NULL )
return -1;
A return of -1 will have an implementation-defined meaning... slightly
better would be to #include <stdlib.h> and return EXIT_FAILURE; which
is guaranteed to have the obvious meaning on any current or future
implementation.

Also, it would really pay to emit some sort of message here. You'll
really thank yourself when you accidentally open a non-existant file,
or non-readable.... etc.
if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )
return -1;

while ( !feof(f) ) {
fgets(buf, BUFSIZE, f);

/* skip leading spaces */
for (idx = buf; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* look for 'zone' token */
if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {
Whenever I find myself having to use the exact same string literal
twice like this, it's time to farm it out. Using something like:

const char zone_token[] = "zone ";
...
if (strncasecmp(idx, zone_token, (sizeof zone_token)-1) != 0)

saves you from the problem of accidental typos. If the number of
characters between the two of those were to accidentally get out of
sync, it'd be a bug. And even though you have them exactly right now,
you might later need to change the code to use a different literal,
and accidentally change only one.

BTW, strncasecmp() is not a Standard C function (it's POSIX, which is
not on-topic here). The residents on this newsgroup insist that you do
not post code containing functions that are (a) not Standard C
functions and (b) not defined in the provided code. For the sake of
example code on this newsgroup, the easiest way for you to avoid
difficulties is probably to use strncmp() instead. Or, define your own
implementation of strncasecmp()...
if (!found) {
fprintf(fn, buf);
fflush(fn);
}
continue;
}

idx += sizeof("zone ") - 1;
If you decide to follow my advice, this would become:

idx += (sizeof zone_array) - 1;

Actually, my own preference has been to define a macro such as:

#define ARY_STRLEN(a) (sizeof(a)-1)

which leads to a slightly better expression, in terms of
self-documentation:

idx += ARY_STRLEN(zone_array) - 1;
/* skip spaces inside of token */
for (; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;
This is the same thing as:

idx += strspn(idx, " \t");

/* extract token from commas */
if (*idx++ != '"')
continue;

/* put token into buffer to compare */
while (*idx != '\0' && *idx != '"')
sidx[i++] = *idx++;

sidx[i] = '\0'; i = 0;
The second statement on the line above is easy to miss. I'd recommend
giving each of these a separate line.
if (strcasecmp(sidx, argv[1]) != 0) {
found = 0;
fprintf(fn, buf);
fflush(fn);
}
else
found = 1;
}

fclose(f);
fclose(fn);

return 0;
}


The last pair of fclose()s are not strictly necessary, if you're not
going to check their return value. However, I'd recommend that you do
check their return value, so you can emit an error message if fn
didn't flush properly... actually, you should check the return code of
your fflush() calls, too.

I'll point out that the code above won't work properly if your config
file contains lines longer than 100 characters.

-Micah
Feb 23 '06 #5

P: n/a
boa <bo*****@gmail.com> writes:
Roman Mashak wrote:
/* skip leading spaces */
for (idx = buf; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;
/* look for 'zone' token */
if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {


strlen("zone"), not sizeof().


You mean, strlen("zone ") (with the space); and there's nothing wrong
with sizeof: it's much more efficient.
if (!found) {
fprintf(fn, buf);
fflush(fn);


Why do you need to call fflush()?


Could be useful if he's using another program to view the temporary
file while it's being written (UNIX tail...).

Slightly better might be to set line buffering on fn with setvbuf().

-Micah
Feb 23 '06 #6

P: n/a
Micah Cowan wrote:
boa <bo*****@gmail.com> writes:
Roman Mashak wrote:
> /* look for 'zone' token */
> if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {


strlen("zone"), not sizeof().


You mean, strlen("zone ") (with the space); and there's nothing wrong
with sizeof: it's much more efficient.


Hmmmm ???

What's the type of a literal string in code?
I thought it was `const char *'

#include <stdio.h>
#include <string.h>

int main(void) {
int a, b, c, d, e, f;
char * z1 = "forty two";
char z2[] = "forty two";

a = strlen("forty two");
b = (int)sizeof("forty two");
c = (int)sizeof(char*);
d = (int)sizeof(const char*);
e = (int)sizeof z1;
f = (int)sizeof z2; /* Ah! A match */
printf("%d, %d, %d, %d, %d, %d\n",
a, b, c, d, e, f);
return 0;
}
No Micah, I wasn't doubting you (... well, maybe just a little)
I stand corrected.

--
If you're posting through Google read <http://cfaj.freeshell.org/google>
Feb 24 '06 #7

P: n/a
Pedro Graca wrote:
What's the type of a literal string in code?


An array of some number of char.
No implict const qualifier,
just trouble if you attempt to modify the object.

--
pete
Feb 24 '06 #8

P: n/a
pete <pf*****@mindspring.com> writes:
Pedro Graca wrote:
What's the type of a literal string in code?


An array of some number of char.
No implict const qualifier,
just trouble if you attempt to modify the object.


"trouble" being undefined behavior.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Feb 24 '06 #9

P: n/a
Hello, boa!
You wrote on Thu, 23 Feb 2006 20:16:14 +0100:

??>> #include <strings.h>

b> strings.h?
I should've apologized for using of non-standard stuffs :)
??>> #define BUFSIZE 100
??>>
??>> int main(int argc, char **argv)
??>> if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )

b> How about using tmpnam() instead?
Manual of my system strongly recommends never to use this function cause of
being unsafe.
??>> return -1;
??>>
??>> while ( !feof(f) ) {
??>> fgets(buf, BUFSIZE, f);

b> See http://c-faq.com/stdio/feof.html
Yeas, that was the problem I run into: printing '\n' twice, BUT it happened
only if I removed entry from the middle of file, otherwise everything came
right.
Thanks for hint.

??>> if (!found) {
??>> fprintf(fn, buf);
??>> fflush(fn);

b> Why do you need to call fflush()?
I view the result file in the next console via "tail -f" (I'm using linux
presently).
??>> }
??>> continue;
??>> }
??>>

With best regards, Roman Mashak. E-mail: mr*@tusur.ru
Feb 24 '06 #10

P: n/a
On 2006-02-23, boa <bo*****@gmail.com> wrote:
if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )


How about using tmpnam() instead?


<OT>
tmpnam is insecure in Unix; he's right to avoid it.
</OT>

Now, for something on topic...
if (!found) {
fprintf(fn, buf); ^^^
Don't do that. Ever.

If the input file happens to contain a '%' character, undefined
behavior ensues.
/* extract token from commas */
if (*idx++ != '"')
continue;


This branch loses the input line, which probably isn't what was
intended.

--
- David A. Holland
(the above address works if unscrambled but isn't checked often)
Feb 24 '06 #11

P: n/a
pete <pf*****@mindspring.com> wrote:
Pedro Graca wrote:
What's the type of a literal string in code?


An array of some number of char.

....

I beleive so, but what is C&V for that?

--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`
Feb 24 '06 #12

P: n/a
Hello, Micah!
You wrote on Thu, 23 Feb 2006 23:07:14 GMT:

[skip]
thanks for giving me many pieces of advices.

??>> /* extract token from commas */
??>> if (*idx++ != '"')
??>> continue;
??>>
??>> /* put token into buffer to compare */
??>> while (*idx != '\0' && *idx != '"')
??>> sidx[i++] = *idx++;
??>>
??>> sidx[i] = '\0'; i = 0;

MC> The second statement on the line above is easy to miss. I'd recommend
MC> giving each of these a separate line.
Is there any way in your opinion to somehow optimize this piece of code,
make it more elegant :-) may be in less statements...

??>> if (strcasecmp(sidx, argv[1]) != 0) {
??>> found = 0;
??>> fprintf(fn, buf);
??>> fflush(fn);
??>> }
??>> else
??>> found = 1;
??>> }
??>>
??>> fclose(f);
??>> fclose(fn);
??>>
??>> return 0;
??>> }

MC> The last pair of fclose()s are not strictly necessary, if you're not
What do you mean by that? OS should take care of unclosed descriptors by
yourself?
MC> going to check their return value. However, I'd recommend that you do
MC> check their return value, so you can emit an error message if fn
MC> didn't flush properly... actually, you should check the return code of
MC> your fflush() calls, too.

MC> I'll point out that the code above won't work properly if your config
MC> file contains lines longer than 100 characters.
That's correct, but I assume sane config file won't contain strings longer
than 100 characters ;-)

With best regards, Roman Mashak. E-mail: mr*@tusur.ru
Feb 24 '06 #13

P: n/a
Hello, David!
You wrote on Fri, 24 Feb 2006 00:30:00 +0000 (UTC):

??>>> if (!found) {
??>>> fprintf(fn, buf);
DH> ^^^
DH> Don't do that. Ever.

DH> If the input file happens to contain a '%' character, undefined
DH> behavior ensues.
Hm, I didn't find this issue in FAQ. What's the more robust and portable
method to put formatted output into file?
??>>> /* extract token from commas */
??>>> if (*idx++ != '"')
??>>> continue;

DH> This branch loses the input line, which probably isn't what was
DH> intended.
Here it is supposed to put a pointer on to first occurence of ' " ' symbol
in the line.

With best regards, Roman Mashak. E-mail: mr*@tusur.ru
Feb 24 '06 #14

P: n/a
On 2006-02-24, Keith Thompson <ks***@mib.org> wrote:
pete <pf*****@mindspring.com> writes:
Pedro Graca wrote:
What's the type of a literal string in code?


An array of some number of char.
No implict const qualifier,
just trouble if you attempt to modify the object.


"trouble" being undefined behavior.


This seems like an annoying compromise - I think it should have been
const char - any implementation concerned with supporting old code could
have let it been a warning, or have a compiler switch to put it in a
non-conforming mode, perhaps one which would also allow the
modifications themselves.
Feb 24 '06 #15

P: n/a
Hello, Micah!
You wrote on Thu, 23 Feb 2006 23:07:14 GMT:

[skip]
??>> /* skip spaces inside of token */
??>> for (; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
??>> ;

MC> This is the same thing as:

MC> idx += strspn(idx, " \t");
It's not exactly same, the purpose of my code it to keep original buffer and
work only with second pointer.
And result of 'strspn()' comes into 'size_t' variable, while I need to work
with 'char *'

With best regards, Roman Mashak. E-mail: mr*@tusur.ru
Feb 24 '06 #16

P: n/a
"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:
pete <pf*****@mindspring.com> wrote:
Pedro Graca wrote:
What's the type of a literal string in code?


An array of some number of char.

...

I beleive so, but what is C&V for that?


C99 6.4.5p5:
In translation phase 7, a byte or code of value zero is appended
to each multibyte character sequence that results from a string
literal or literals. The multibyte character sequence is then used
to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals,
the array elements have type char, and are initialized with the
individual bytes of the multibyte character sequence;
[discussion of wide string literals deleted]

C99 6.5.1p4:
A string literal is a primary expression. It is an lvalue with
type as detailed in 6.4.5.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Feb 24 '06 #17

P: n/a
Keith Thompson wrote:

"S.Tobias" <si***@FamOuS.BedBuG.pAlS.INVALID> writes:
pete <pf*****@mindspring.com> wrote:
Pedro Graca wrote:

What's the type of a literal string in code?

An array of some number of char.

...

I beleive so, but what is C&V for that?


C99 6.4.5p5:
In translation phase 7, a byte or code of value zero is appended
to each multibyte character sequence that results from a string
literal or literals. The multibyte character sequence is then used
to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals,
the array elements have type char, and are initialized with the
individual bytes of the multibyte character sequence;
[discussion of wide string literals deleted]

C99 6.5.1p4:
A string literal is a primary expression. It is an lvalue with
type as detailed in 6.4.5.


N869
6.3.2 Other operands
6.3.2.1 Lvalues and function designators
[#3] Except when it is the operand of the sizeof operator or
the unary & operator, or is a string literal used to
initialize an array, an expression that has type ``array of
type'' is converted to an expression with type ``pointer to
type'' that points to the initial element of the array
object and is not an lvalue. If the array object has
register storage class, the behavior is undefined.

--
pete
Feb 24 '06 #18

P: n/a
Jordan Abel <ra*******@gmail.com> writes:
On 2006-02-24, Keith Thompson <ks***@mib.org> wrote:
pete <pf*****@mindspring.com> writes:
Pedro Graca wrote:

What's the type of a literal string in code?

An array of some number of char.
No implict const qualifier,
just trouble if you attempt to modify the object.


"trouble" being undefined behavior.


This seems like an annoying compromise - I think it should have been
const char - any implementation concerned with supporting old code could
have let it been a warning, or have a compiler switch to put it in a
non-conforming mode, perhaps one which would also allow the
modifications themselves.


I agree that it's annoying, but I think it was pretty much necessary.
Pre-ANSI C didn't have "const". Making string literals const would
break code like this:

int print_string(char *s)
{
return printf("s = \"%s\"\n", s);
}

print_string("hello");

The problem would occur on passing the string literal to a function
expecting a (non-const) char*, not on any attempt to actually write to
it. There would have been no good way to write this program in a
manner compatible with both K&R and ANSI C, though I suppose something
like

#ifdef __STDC__
#define CONST const
#else
#define CONST
#endif

int print_string(CONST char *s) ...

would have been ok.

gcc's "-Wwrite-strings" option causes string literals to be const (and
causes some misleading error messages).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Feb 24 '06 #19

P: n/a
Pedro Graca <he****@dodgeit.com> writes:
Micah Cowan wrote:
boa <bo*****@gmail.com> writes:
Roman Mashak wrote:
> /* look for 'zone' token */
> if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {

strlen("zone"), not sizeof().
You mean, strlen("zone ") (with the space); and there's nothing wrong
with sizeof: it's much more efficient.


Hmmmm ???

What's the type of a literal string in code?
I thought it was `const char *'


I wish the const were there....

But now. A string literal has type char[].

Since you posted some very helpful test code, I'll explain what you're
seeing...
#include <stdio.h>
#include <string.h>

int main(void) {
int a, b, c, d, e, f;
char * z1 = "forty two";
char z2[] = "forty two";
These two don't involve objects generated, as they're initializers,
and not assignments.
a = strlen("forty two");
Not much need to comment here.
b = (int)sizeof("forty two");
This "forty two" requires that the generated program have at least one
character array with a length of 10, and that is the result given by
the string literal, so sizeof will return 10.
c = (int)sizeof(char*);
d = (int)sizeof(const char*);
e = (int)sizeof z1;
The size of all three of these will be implementation-dependant, but
will certainly be the same, as they all test the same type.
f = (int)sizeof z2; /* Ah! A match */
And, as you've discovered, this should also give 10, for the
explicitly declared array of 10 characters.
printf("%d, %d, %d, %d, %d, %d\n",
a, b, c, d, e, f);
return 0;
}
No Micah, I wasn't doubting you (... well, maybe just a little)
I stand corrected.


Hey, doubt away! I'm certainly human, and have certainly said my share
of stupid things on this list, both formerly and recently. That's why
posting to comp.lang.c is a great way to learn the ins and outs of the
C language: if you say something that reveals a misperception of
reality, you will quickly be corrected. :-)

-Micah
Feb 24 '06 #20

P: n/a
"Roman Mashak" <mr*@tusur.ru> writes:
Hello, David!
You wrote on Fri, 24 Feb 2006 00:30:00 +0000 (UTC):

??>>> if (!found) {
??>>> fprintf(fn, buf);
DH> ^^^
DH> Don't do that. Ever.

DH> If the input file happens to contain a '%' character, undefined
DH> behavior ensues.
Hm, I didn't find this issue in FAQ. What's the more robust and portable
method to put formatted output into file?
It's /not/ formatted output: that's the point. If he must use
fprintf(), he should use "%s", followed by buf. But fputs() or puts()
seem like more reasonable facilities.
??>>> /* extract token from commas */
??>>> if (*idx++ != '"')
??>>> continue;

DH> This branch loses the input line, which probably isn't what was
DH> intended.
Here it is supposed to put a pointer on to first occurence of ' " ' symbol
in the line.


You've missed the point. If the line doesn't specify a " after the
keyword "zone ", then it's "not the line you're looking for, move
along." The trouble is, that if it's not the line he's looking for, he
probably wants to print it out, as he does in other places, but forgot
to do that here.

-Micah
Feb 24 '06 #21

P: n/a
"Roman Mashak" <mr*@tusur.ru> writes:
Hello, Micah!
You wrote on Thu, 23 Feb 2006 23:07:14 GMT:

[skip]
thanks for giving me many pieces of advices.

??>> /* extract token from commas */
??>> if (*idx++ != '"')
??>> continue;
??>>
??>> /* put token into buffer to compare */
??>> while (*idx != '\0' && *idx != '"')
??>> sidx[i++] = *idx++;
??>>
??>> sidx[i] = '\0'; i = 0;

MC> The second statement on the line above is easy to miss. I'd recommend
MC> giving each of these a separate line.
Is there any way in your opinion to somehow optimize this piece of code,
make it more elegant :-) may be in less statements...
Well, personally, I'd probably have avoided the separate and
unnecessary buffer sidx[] altogether, and just stored the start and
end of the interesting text.

Aside from that, though, it might be slightly more efficient to use an
incremented pointer instead. If you had:

char *scur;

Then the loop could be:

for (scur = sidx; *idx != '\0' && *idx != '"'; ++scur, ++idx)
*scur = *idx;

which would at least eliminate the need for i.
??>> if (strcasecmp(sidx, argv[1]) != 0) {
??>> found = 0;
??>> fprintf(fn, buf);
??>> fflush(fn);
??>> }
??>> else
??>> found = 1;
??>> }
??>>
??>> fclose(f);
??>> fclose(fn);
??>>
??>> return 0;
??>> }

MC> The last pair of fclose()s are not strictly necessary, if you're not What do you mean by that? OS should take care of unclosed descriptors by
yourself?


By itself, yes. The C implementation is required to make sure that
exit from the program properly closes any open files.
And they're not descriptors, they're file handles
(descriptors is a UNIX term for something related but different).

-Micah
Feb 24 '06 #22

P: n/a
Hello, Micah!
You wrote on Fri, 24 Feb 2006 02:45:43 GMT:

MC> It's /not/ formatted output: that's the point. If he must use
MC> fprintf(), he should use "%s", followed by buf. But fputs() or puts()
MC> seem like more reasonable facilities.
That's good point, thanks!

??>>>>> /* extract token from commas */
??>>>>> if (*idx++ != '"')
??>>>>> continue;

MC> You've missed the point. If the line doesn't specify a " after the
MC> keyword "zone ", then it's "not the line you're looking for, move
MC> along." The trouble is, that if it's not the line he's looking for, he
I got you, but the problem is if I can't locate the ' " ' sign in the line,
I can't look for what's inside the "zone " token.
May be print the invalid string an move on should be enough:

if (*idx++ != '"') {
fputs(buf, fn);
fflush(fn);
continue;
}

With best regards, Roman Mashak. E-mail: mr*@tusur.ru
Feb 24 '06 #23

P: n/a
"Roman Mashak" <mr*@tusur.ru> writes:
Hello, Micah!
You wrote on Thu, 23 Feb 2006 23:07:14 GMT:

[skip]
??>> /* skip spaces inside of token */
??>> for (; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
??>> ;

MC> This is the same thing as:

MC> idx += strspn(idx, " \t"); It's not exactly same, the purpose of my code it to keep original buffer and
work only with second pointer.
I don't see how that changes with my line above.
And result of 'strspn()' comes into 'size_t' variable, while I need to work
with 'char *'


Which is why I did +=, and not =.

-Micah
Feb 24 '06 #24

P: n/a
On 2006-02-24, Keith Thompson <ks***@mib.org> wrote:
Jordan Abel <ra*******@gmail.com> writes:
On 2006-02-24, Keith Thompson <ks***@mib.org> wrote:
pete <pf*****@mindspring.com> writes:
Pedro Graca wrote:

> What's the type of a literal string in code?

An array of some number of char.
No implict const qualifier,
just trouble if you attempt to modify the object.

"trouble" being undefined behavior.
This seems like an annoying compromise - I think it should have been
const char - any implementation concerned with supporting old code could
have let it been a warning, or have a compiler switch to put it in a
non-conforming mode, perhaps one which would also allow the
modifications themselves.


I agree that it's annoying, but I think it was pretty much necessary.
Pre-ANSI C didn't have "const". Making string literals const would
break code like this:

int print_string(char *s)
{
return printf("s = \"%s\"\n", s);
}

print_string("hello");


Implementations could still be free to continue to support such code.
The problem is that now the "non-const const" is enshrined in the
standard.

The problem would occur on passing the string literal to a function
expecting a (non-const) char*, not on any attempt to actually write to
it. There would have been no good way to write this program in a
manner compatible with both K&R and ANSI C, though I suppose something
like


If it's K&R code, compile it in a different mode - or with a different
compiler program [cc instead of c89 on unix]. they could have also done
that with implicit int from the start, and maybe with non-prototypes.
Feb 24 '06 #25

P: n/a
Hello, Micah!
You wrote on Fri, 24 Feb 2006 03:00:51 GMT:

??>> It's not exactly same, the purpose of my code it to keep original
??>> buffer and work only with second pointer.

MC> I don't see how that changes with my line above.

??>> And result of 'strspn()' comes into 'size_t' variable, while I need to
??>> work with 'char *'

MC> Which is why I did +=, and not =.
Sorry for my stupidity :-) now I got your thought properly.
Thanks a lot!

With best regards, Roman Mashak. E-mail: mr*@tusur.ru
Feb 24 '06 #26

P: n/a
Hello, Micah!
You wrote on Fri, 24 Feb 2006 02:59:38 GMT:

MC> Well, personally, I'd probably have avoided the separate and
MC> unnecessary buffer sidx[] altogether, and just stored the start and
MC> end of the interesting text.
What's the benefit of this?

MC> char *scur;

MC> Then the loop could be:

MC> for (scur = sidx; *idx != '\0' && *idx != '"'; ++scur, ++idx)
MC> *scur = *idx;

MC> which would at least eliminate the need for i.
But it will introduce one more pointer, *scur.

With best regards, Roman Mashak. E-mail: mr*@tusur.ru
Feb 24 '06 #27

P: n/a

"Micah Cowan" <mi***@cowan.name> wrote in message
news:87************@mcowan.barracudanetworks.com.. .
"Roman Mashak" <mr*@tusur.ru> writes:

-snip-
BTW, strncasecmp() is not a Standard C function (it's POSIX, which is
not on-topic here). The residents on this newsgroup insist that you do
not post code containing functions that are (a) not Standard C
functions and (b) not defined in the provided code. For the sake of
example code on this newsgroup, the easiest way for you to avoid
difficulties is probably to use strncmp() instead. Or, define your own
implementation of strncasecmp()...


Roman here is a definition for an implementation of strncasecmp I found in
some of my code. However, I didn't write it, and I can't verify if it
functions exactly like the strncasecmp Micah is talking about, or is
strictly portable.

int strncasecmp (const char *s1, const char *s2, size_t n)
{
int c1;
int c2;

do
{
c1 = *s1++;
c2 = *s2++;

if (!n--)
return 0; /* strings are equal until end point */

if (c1 != c2)
{
if (c1 >= 'a' && c1 <= 'z')
c1 -= ('a' - 'A');
if (c2 >= 'a' && c2 <= 'z')
c2 -= ('a' - 'A');
if (c1 != c2)
return -1; /* strings not equal */
}
} while (c1);

return 0; /* strings are equal */
}
hopefully those who know better will be willing to help out.

--
MrG{DRGN}
Feb 24 '06 #28

P: n/a
MrG{DRGN} wrote:
.... snip ...
Roman here is a definition for an implementation of strncasecmp I
found in some of my code. However, I didn't write it, and I can't
verify if it functions exactly like the strncasecmp Micah is
talking about, or is strictly portable.

int strncasecmp (const char *s1, const char *s2, size_t n)
{
int c1;
int c2;

do
{
c1 = *s1++;
c2 = *s2++;

if (!n--)
return 0; /* strings are equal until end point */

if (c1 != c2)
{
if (c1 >= 'a' && c1 <= 'z')
c1 -= ('a' - 'A');
if (c2 >= 'a' && c2 <= 'z')
c2 -= ('a' - 'A');
if (c1 != c2)
return -1; /* strings not equal */
}
} while (c1);

return 0; /* strings are equal */
}


Here is what I would do with that to make it portable (but not to
any wide char sets). It may well run faster too.

#include <ctype.h>
int strncasecmp (const unsigned char *s1,
const unsigned char *s2,
size_t n)
{
int c1, c2;

do {
c1 = *s1++; c2 = *s2++;
if (!n--)
return 0; /* strings are equal until end point */
if (c1 != c2) {
c1 = toupper(c1); c2 = toupper(c2);
if (c1 != c2)
return c1 - c2; /* strings not equal */
}
} while (c1);
return 0; /* strings are equal */
}

This allows for use of non-ansi char sets and returns a value whose
sign describes the relationship for unequal strings (as does
strcmp).

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
Feb 24 '06 #29

P: n/a
"Roman Mashak" <mr*@tusur.ru> writes:
MC> You've missed the point. If the line doesn't specify a " after the
MC> keyword "zone ", then it's "not the line you're looking for, move
MC> along." The trouble is, that if it's not the line he's looking for, he
I got you, but the problem is if I can't locate the ' " ' sign in the line,
I can't look for what's inside the "zone " token.
May be print the invalid string an move on should be enough:

if (*idx++ != '"') {
fputs(buf, fn);
fflush(fn);
continue;
}


Well, it's your program, do what you want: just make sure to think
things through enough to know that it really /is/ what you want.

I would probably do something like the above; but then, I'd also
probably take 100-char lines into account, too.

I suppose that, since it's a quick program for your own uses, then in
general you're probably not going to encounter zone lines without
quoted strings, so you don't have to be /too/ careful. IMO, it'd still
be very worthwhile to at least detect and report such conditions, so
you are aware of them if you accidentally encounter one.
Feb 24 '06 #30

P: n/a
CBFalconer <cb********@yahoo.com> writes:
MrG{DRGN} wrote:
Here is what I would do with that to make it portable (but not to
any wide char sets). It may well run faster too.

#include <ctype.h>
int strncasecmp (const unsigned char *s1,
const unsigned char *s2,
size_t n)
Good idea, but since in the real world strncasecmp() is a standardized
API (just not by the C one...), you can't really change the type of
s1 and s2.
{
int c1, c2;

do {
c1 = *s1++; c2 = *s2++;
if (!n--)
return 0; /* strings are equal until end point */
if (c1 != c2) {
c1 = toupper(c1); c2 = toupper(c2);
if (c1 != c2)
return c1 - c2; /* strings not equal */
}
} while (c1);
return 0; /* strings are equal */
}


I wonder how much time that first "if (c1 != c2)" saves... it seems to
me that the most usual case will be true for that condition, in which
case, I'd probably just go immediately to the toupper() conditions and
then compare.

But I think it would vary greatly depending on the application...
Feb 24 '06 #31

P: n/a
"MrG{DRGN}" <Ia****@here.com> writes:
"Micah Cowan" <mi***@cowan.name> wrote in message
news:87************@mcowan.barracudanetworks.com.. .
"Roman Mashak" <mr*@tusur.ru> writes:

-snip-
BTW, strncasecmp() is not a Standard C function (it's POSIX, which is
not on-topic here). The residents on this newsgroup insist that you do
not post code containing functions that are (a) not Standard C
functions and (b) not defined in the provided code. For the sake of
example code on this newsgroup, the easiest way for you to avoid
difficulties is probably to use strncmp() instead. Or, define your own
implementation of strncasecmp()...


Roman here is a definition for an implementation of strncasecmp I found in
some of my code. However, I didn't write it, and I can't verify if it
functions exactly like the strncasecmp Micah is talking about, or is
strictly portable.

int strncasecmp (const char *s1, const char *s2, size_t n)
{
int c1;
int c2;

do
{
c1 = *s1++;
c2 = *s2++;

if (!n--)
return 0; /* strings are equal until end point */

if (c1 != c2)
{
if (c1 >= 'a' && c1 <= 'z')
c1 -= ('a' - 'A');
if (c2 >= 'a' && c2 <= 'z')
c2 -= ('a' - 'A');
if (c1 != c2)
return -1; /* strings not equal */
}
} while (c1);

return 0; /* strings are equal */
}


I'm sure this worked on the system for which it was written; however,
C does not assume ASCII (nor POSIX), nor that ('a' - 'A') will be the
difference between lowercase and uppercase for all letters in the
alphabet. Chuck's reponse includes a more portable version.
Feb 24 '06 #32

P: n/a
Micah Cowan wrote:

"MrG{DRGN}" <Ia****@here.com> writes:
I can't verify if it
functions exactly like the strncasecmp Micah is talking about,
or is strictly portable.

int strncasecmp (const char *s1, const char *s2, size_t n)
{
int c1;
int c2;

do
{
c1 = *s1++;
c2 = *s2++;

if (!n--)
return 0; /* strings are equal until end point */

if (c1 != c2)
{
if (c1 >= 'a' && c1 <= 'z')
c1 -= ('a' - 'A');
if (c2 >= 'a' && c2 <= 'z')
c2 -= ('a' - 'A');
if (c1 != c2)
return -1; /* strings not equal */
}
} while (c1);

return 0; /* strings are equal */
}


I'm sure this worked on the system for which it was written; however,
C does not assume ASCII (nor POSIX), nor that ('a' - 'A') will be the
difference between lowercase and uppercase for all letters in the
alphabet. Chuck's reponse includes a more portable version.


The EBCDIC character set is the well known one
where the alphabet is not contiguously sequential.

http://www.natural-innovations.com/c...ciiebcdic.html

--
pete
Feb 24 '06 #33

P: n/a
Micah Cowan wrote:
Pedro Graca <he****@dodgeit.com> writes:
Micah Cowan wrote:
boa <bo*****@gmail.com> writes:

Roman Mashak wrote:
> /* look for 'zone' token */
> if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {
strlen("zone"), not sizeof().
You mean, strlen("zone ") (with the space); and there's nothing wrong
with sizeof: it's much more efficient. Hmmmm ???

What's the type of a literal string in code?
I thought it was `const char *'


I wish the const were there....

But now. A string literal has type char[].


Agreed.
Since you posted some very helpful test code, I'll explain what you're
seeing...
#include <stdio.h>
#include <string.h>

int main(void) {
int a, b, c, d, e, f;
char * z1 = "forty two";
char z2[] = "forty two";


These two don't involve objects generated, as they're initializers,
and not assignments.


Actually, the first one *does* involve an object. It initialises z1 to
point to an anonymous char array containing the string literal.
a = strlen("forty two");


Not much need to comment here.
b = (int)sizeof("forty two");


This "forty two" requires that the generated program have at least one
character array with a length of 10, and that is the result given by
the string literal, so sizeof will return 10.


This one does not require an object in the generated program. sizeof is
an operator that returns the size of it's operand, not a function, so it
is simple (and I think normal practice) for the compiler to count the
length of the string literal and insert the size that the anonymous
array would be if it bothered to creat it.
c = (int)sizeof(char*);
d = (int)sizeof(const char*);
e = (int)sizeof z1;


The size of all three of these will be implementation-dependant, but
will certainly be the same, as they all test the same type.


const char * is not the same type as char *, but I believe they are
required to have the same representation.
f = (int)sizeof z2; /* Ah! A match */


And, as you've discovered, this should also give 10, for the
explicitly declared array of 10 characters.


Agreed.
printf("%d, %d, %d, %d, %d, %d\n",
a, b, c, d, e, f);
return 0;
}
No Micah, I wasn't doubting you (... well, maybe just a little)
I stand corrected.


Hey, doubt away! I'm certainly human, and have certainly said my share
of stupid things on this list, both formerly and recently. That's why
posting to comp.lang.c is a great way to learn the ins and outs of the
C language: if you say something that reveals a misperception of
reality, you will quickly be corrected. :-)


Yes, and I think I've spotted a couple this time ;-)
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidelines and intro:
http://clc-wiki.net/wiki/Intro_to_clc
Feb 25 '06 #34

P: n/a
Flash Gordon <sp**@flash-gordon.me.uk> writes:
Micah Cowan wrote:
Since you posted some very helpful test code, I'll explain what you're
seeing...
#include <stdio.h>
#include <string.h>
int main(void) {
int a, b, c, d, e, f;
char * z1 = "forty two";
char z2[] = "forty two";

These two don't involve objects generated, as they're initializers,
and not assignments.


Actually, the first one *does* involve an object. It initialises z1 to
point to an anonymous char array containing the string literal.


Well, yes. What I meant was that the string literals themselves don't
generate objects, only the declarations.
a = strlen("forty two");

Not much need to comment here.
b = (int)sizeof("forty two");

This "forty two" requires that the generated program have at least
one
character array with a length of 10, and that is the result given by
the string literal, so sizeof will return 10.


This one does not require an object in the generated program. sizeof
is an operator that returns the size of it's operand, not a function,
so it is simple (and I think normal practice) for the compiler to
count the length of the string literal and insert the size that the
anonymous array would be if it bothered to creat it.


Well, yes. But from the standpoint of the Standard, the object /does/
exist; the implementation must simply behave "as if" the object were
created. While such optimizations are likely, nevertheless from my
preferred POV, and as far as any conforming program knows, the object
exists.
Feb 27 '06 #35

P: n/a
On Thu, 23 Feb 2006 20:16:14 +0100, boa <bo*****@gmail.com> wrote:
Roman Mashak wrote:
Hello, All!

I implemented simple program to eliminate entry from the file having the
following structure (actually it's config file of 'named' DNS package for
those who care and know):
<OT> If you just need the function implemented, on any system where
you would have bind, this could be a one-liner in sed, awk, or perl in
awk mode something like:
awk -vqzone=\"blah\" \
'$1=="zone"{d=$2==qzone} !d{print} $1=="}"{d=0}'
and if the efficiency of this actually matters you're in trouble
anyway. OTOH if this is an exercise or opportunity to work on C
skills, continue. </>
if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )

No need for + (update).
/* skip leading spaces */
for (idx = buf; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;
As noted elsethread this could be strspn. If not, the check for
null/zero is unnecessary; any character that is either space or tab is
necessarily not zero/null. Again below, snipped.
/* look for 'zone' token */
if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {


strlen("zone"), not sizeof().

sizeof lit -1 is fine and arguably better, as noted elsethread.
/* extract token from commas */
if (*idx++ != '"')
continue;
Not commas, quotes.
/* put token into buffer to compare */
while (*idx != '\0' && *idx != '"')
sidx[i++] = *idx++;
Could use strchr for the scan, and ...
sidx[i] = '\0'; i = 0;
if (strcasecmp(sidx, argv[1]) != 0) {


Instead of copying, could (again) strncasecmp in place.

Unless bind allows escapes here (which I don't recall and can't be
arsed to check as it's offtopic) and you need to support that.
Although, you could just require that the user do the matching
escaping on the supplied argument. Or escape it once, at your program
startup, before doing the file scan -- but not in place; you _can_
modify argv[*] strings but not extend them. (And strictly you cannot
modify the argv[*] _pointers_, though you can argv itself.)
- David.Thompson1 at worldnet.att.net
Mar 3 '06 #36

P: n/a
On Thu, 23 Feb 2006 23:07:14 GMT, Micah Cowan <mi***@cowan.name>
wrote:
<snip: other points, and use static var for repeated string literal>
If you decide to follow my advice, this would become:

idx += (sizeof zone_array) - 1;

Actually, my own preference has been to define a macro such as:

#define ARY_STRLEN(a) (sizeof(a)-1)

which leads to a slightly better expression, in terms of
self-documentation:

idx += ARY_STRLEN(zone_array) - 1;

s/- 1// since you did it within the macro.
- David.Thompson1 at worldnet.att.net
Mar 3 '06 #37

This discussion thread is closed

Replies have been disabled for this discussion.