strcpy - my implementation - Page 2

arnuld

I have created my own implementation of strcpy library function. I would
like to have comments for improvements:
/* My version of "strcpy - a C Library Function */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

enum { ARRSIZE = 101 };

char* my_strcpy( char*, char* );

int main( int argc, char** argv )
{
char* pc;

char arr_in[ARRSIZE];
char arr_out[ARRSIZE];

memset( arr_in, '\0', ARRSIZE );
memset( arr_out, '\0', ARRSIZE );
if( 2 != argc )
{
perror("USAGE: ./exec \" your input \"\n");
exit( EXIT_FAILURE );
}
else
{
strcpy( arr_in , argv[1] );
}

pc = my_strcpy( arr_out, arr_in );

while( *pc )
{
printf("*pc = %c\n", *pc++);
}

return EXIT_SUCCESS;
}

char* my_strcpy( char* arr_out, char* arr_in )
{
char* pc;

pc = arr_out;

while( (*arr_out++ = *arr_in++) ) ;

return pc;
}
=============== OUTPUT ======================

[arnuld@dune ztest]$ gcc -ansi -pedantic -Wall -Wextra check_STRCPY.c
[arnuld@dune ztest]$ ./a.out like
*pc = l
*pc = i
*pc = k
*pc = e
[arnuld@dune ztest]$

It works fine without troubles. Now if you change the last return statement in
my_strcpy from "return pc" to "return arr_out", then the while loop in
main() will not print anything at all. I really did not understand it.
Using the array name will give a pointer to its 1st element but in this
case it is giving a pointer to its last element. Why? That's why I
introduced the extra "char* pc" in the first place.
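
A minimal sketch of what happens in that case, using hypothetical helper names that are not part of the program above: inside my_strcpy, arr_out is a local pointer rather than the array itself, and every `*arr_out++` moves it one element forward. When the loop ends it points one past the '\0' that was just copied, so returning it hands main() a pointer into the zero-filled tail of the array.

```c
#include <stddef.h>

/* Hypothetical variant: returns the advanced pointer instead of the start. */
char *copy_return_advanced(char *dest, const char *src)
{
    while ((*dest++ = *src++) != '\0') {
        ;   /* dest steps forward on every copy, including the final '\0' */
    }
    return dest;    /* now one past the terminator that was just written */
}

/* How far the pointer has moved from the start of the buffer. */
ptrdiff_t advanced_by(char *buffer, const char *src)
{
    return copy_return_advanced(buffer, src) - buffer;
}
```

For the input "like" the pointer has moved five elements, past the four letters and the terminator, which is why keeping the original value in a separate pointer such as pc is needed.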

--
www.lispmachine.wordpress.com
my email is @ the above blog.
Google Groups is Blocked. Reason: Excessive Spamming

Sep 8 '08

arnuld

On Mon, 08 Sep 2008 11:51:45 +0500, arnuld wrote:

Here is the new version which puts a check on maximum input:
/* My version of "strcpy - a C Library Function
*
* version 1.1
*
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { ARRSIZE = 4 };

char* my_strcpy( char*, char* );

int main( int argc, char** argv )
{
char* pc;
int input_size;

char src[ARRSIZE+1] = {0};
char dest[ARRSIZE+1] = {0};
if( 2 != argc )
{
perror("USAGE: ./exec \" your input \"\n");
exit( EXIT_FAILURE );
}
else
{
input_size = strlen( argv[1] );

if( ARRSIZE < input_size )
{
fprintf(stderr, "Input must be %d characters or less\n", ARRSIZE);
exit(EXIT_FAILURE);
}

strcpy( src , argv[1] );
}

pc = my_strcpy( dest, src );
while( *pc )
{
printf("*pc = %c\n", *pc++);
}
return EXIT_SUCCESS;
}

char* my_strcpy( char* dest, char* src )
{
char *const pc = dest;

while( (*dest++ = *src++) )
{
;
}

return pc;
}

The one thing I do not understand here: if the arrays are created with
size ARRSIZE or even ARRSIZE+1 ( +1 for extra NULL character), the output
is not affected. The user has to enter 4 characters in this case,
like "Love", + 1 for NULL, but even with ARRSIZE = 4 in total, it works
fine. Is there some problem here?


Sep 9 '08 #51

CBFalconer

arnuld wrote:

>

.... snip ...

>
That whole discussion leaves me wondering whether:

char arrc[100] = {0};

is same as:

char arrc[100];
memset(arrc, '\0', 100);

or whether latter is more expensive than former ? and which one
is advised to use by c.l.c ?

They are basically the same in code generated, although that is
obviously up to the decisions of the implementor. The advantage of
the two statement version is that you can separate the activity
from the sawing off of the memory space (barring use of const), and
the generated code is closer to what you actually typed. To me,
this adds understanding.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

Sep 9 '08 #52

Ian Collins

arnuld wrote:

>On Mon, 08 Sep 2008 11:51:45 +0500, arnuld wrote:

Here is the new version which puts a check on maximum input:

char* my_strcpy( char* dest, char* src )
{

You still haven't fixed the const on the src parameter.
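
A sketch of the signature being asked for, assuming the loop from version 1.1 is kept unchanged and only the type of the src parameter is altered (the local name start is hypothetical):

```c
/* Same copy loop as version 1.1; src is now a pointer to const char,
   so the function promises not to modify the source string. */
char *my_strcpy(char *dest, const char *src)
{
    char *const start = dest;

    while ((*dest++ = *src++) != '\0') {
        ;
    }

    return start;
}
```

With this prototype a string literal or a const-qualified array can be passed as the source without a cast.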
--
Ian Collins.

Sep 9 '08 #53

Nick Keighley

On 9 Sep, 02:16, Keith Thompson <ks...@mib.org> wrote:

Richard <rgr...@gmail.com> writes:
pete <pfil...@mindspring.com> writes:

Richard wrote:
Richard Heathfield <r...@see.sig.invalid> writes:

>>CBFalconer said:

>>>pete wrote:
>arnuld said:
... snip ...
>> while( (*arr_out++ = *arr_in++) ) ;
/* I would go even further than just extra parentheses: */

>>>>    while ((*arr_out++ = *arr_in++) != '\0') {
                  ^^^^^^^^^^^^^^^^^^^^^^

>>>Silly and pointless.
I disagree. An explicit comparison against 0 is pointless in terms
of the code, yes, but it's not pointless in terms of
self-documentation, and it certainly isn't silly.

>I disagree. It is long-winded and unnecessary. "while(*d++=*s++){}" is a
cornerstone of C programming and understanding. Adding the comparison
does nothing to aid the understanding. A style thing maybe.

It is a style thing.
The one and only circumstance
under which I will omit the explicit comparison against zero
(assuming that zero is what the expression is being compared against),
is when the expression is conceptually boolean in nature,
such as something like:

    while (isspace(*c)) {

Could you explain why C does not effectively make the while() above
"boolean" in nature and will not always do so?

It does, of course; that's not the point.

Since I share pete's opinion on this style issue, I'll try to explain.

When I write an if or while statement, I prefer to use an expression
that is *conceptually* boolean. By "conceptually boolean", I mean
that the value of the expression can be thought of as either true or
false, and carries no additional information.

For example, if I'm examining the value of a character, the following
are equivalent:
    if (c) { ... }
    if (c != '\0') { ... }
I prefer to write the latter, because the value of c by itself isn't
just a true or false value, but the result of the "!=" operator is.

Similarly, I would write
    if (strcmp(s1, s2) == 0) { ... }
rather than
    if (!strcmp(s1, s2)) { ... }

and I would write
    if ((ptr = malloc(N)) != NULL) { ... }
rather than
    if (ptr = malloc(N)) { ... }

and so forth.

I'm perfectly well aware (as is pete, I'm sure) that in each case the
two forms are precisely equivalent, and will most likely result in
identical generated code. I'm also aware that some C programmers
(including, if I'm not mistaken, Kernighan and Ritchie themselves)
prefer the terser forms and consider the forms that I prefer to be too
verbose. I don't necessarily think that preference is wrong, I just
don't share it. Finally, I don't have any real difficulty
understanding either form; it might sometimes take me a marginally
longer time to understand something in the shorter form, but it's not
really significant.

I use the same conventions for the same reasons

--
Nick Keighley

Sep 9 '08 #54

Flash Gordon

CBFalconer wrote, On 09/09/08 05:39:

<snip>

However, notice that replacing strcpy is different than adding
strlcpy.

On some implementations it is the same.

strcpy exists in the current libraries. strlcpy does
not.

Apart from the implementations which provide it as an extension,
something they are allowed to do.

Any possible problem is somewhere in the future.

Apart from it being undefined behaviour and the fact that some
implementations do define it.

And, as I
pointed out, those names are alterable in the source code.

I agree with that. However they do not have anything to do with what the
OP was doing which was as an exercise re-implementing strcpy.
--
Flash Gordon

Sep 9 '08 #55

Richard Heathfield

arnuld said:

>On Mon, 08 Sep 2008 17:48:20 +0000, Richard Heathfield wrote:

>I don't think it would be reasonable to call it plagiarism, since it's
blindingly obvious to all concerned that it's basically the K&R2 code.

No, it is not. That I learned from Stroustrup, section 6.2.5, special
edition. I do not even remember that I ever saw that code ever in K&R2.

Nevertheless, it is still basically the K&R2 code. Where do you think
Stroustrup first saw it?

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 9 '08 #56

Richard Heathfield

arnuld said:

>On Mon, 08 Sep 2008 17:48:20 +0000, Richard Heathfield wrote:

>It isn't very good camouflage, though, since any experienced C
programmer will recognise it for what it is. It's just a more elegant
way to write the code. We don't write char foo[8] = { 'H', 'e', 'l',
'l', 'o', '\0', '\0', '\0' } just because it makes explicit the fact
that eight characters are being copied into the array. We write char
foo[8] = "Hello", and trust that competent programmers will understand.

[snip]

That whole discussion leaves me wondering whether:

char arrc[100] = {0};

is same as:

char arrc[100];
memset(arrc, '\0', 100);

or whether latter is more expensive than former ?

Since you're initialising an array of integers (for chars are integers),
they do the same thing. The first version does it in fewer lines of code.
If the type were non-integer (e.g. pointer, or floating point, or struct
or union of any kind), the two versions would not be equivalent and the
memset version would simply be wrong.
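
As a small illustration of the {0} form applied to a non-integer aggregate, here is a sketch with a hypothetical struct type. The initialiser sets the first member explicitly and the remaining members implicitly, so the pointer becomes a null pointer and the double becomes 0.0 whatever bit patterns the implementation uses for those values.

```c
#include <stddef.h>

struct record {          /* hypothetical type, for illustration only */
    char  *name;
    double weight;
    int    count;
};

/* Returns a record whose members all hold their type's zero value. */
struct record make_empty_record(void)
{
    struct record r = {0};
    return r;
}
```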

and which one is advised to use by c.l.c ?

That depends on which c.l.c. subscriber you ask, but I'd go for the version
that is right every time, wouldn't you?


Sep 9 '08 #57

Old Wolf

On Sep 9, 5:34 pm, CBFalconer <cbfalco...@yahoo.com> wrote:

Keith Thompson wrote:
    if (c) { ... }
    if (c != '\0') { ... }

I disagree. The (c) expression worries only about whether the
character is or is not something with a zero value. The (c !=
'\0') expression expressly converts that zeroness into either the
value 0 or the value 1 before testing. Optimization may affect
this.

I would normally expect the second expression to generate larger
code than does the first, with optimization disabled.

I don't know what you're smoking today, but
the above two if statements are exactly
equivalent in effect, and I challenge you
to come up with a compiler that generates
different code for them, let alone one that
actually causes a different code branch to
execute.

Sep 9 '08 #58

Old Wolf

On Sep 9, 6:18 pm, arnuld <sunr...@invalid.address> wrote:

That whole discussion leaves me wondering whether:

   char arrc[100] = {0};

is same as:

   char arrc[100];
   memset(arrc, '\0', 100);

or whether latter is more expensive than former ? and which one is
advised to use by c.l.c ?

They both have the same effect, but the first
one is far better because it is less error-prone.

Anyone who disagrees needs to notice that the
1970s have ended, IMHO.

Examples of memset errors:
http://www.google.com/codesearch?hl=...\)&btnG=Search
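
A sketch of two well-known ways to get a memset call wrong, assuming a hypothetical buffer size and function name; whether these are the same mistakes the search above turns up is not something this example claims. The mistakes are shown only as comments so that the function itself stays correct.

```c
#include <string.h>

enum { BUF_LEN = 100 };   /* hypothetical size, matching arrc[100] */

/* Correct call: the fill value comes second, the byte count third. */
void clear_buffer(char *buf)
{
    memset(buf, 0, BUF_LEN);

    /* Easy slips, deliberately left as comments:
     *   memset(buf, BUF_LEN, 0);     arguments swapped, zero bytes written
     *   memset(buf, 0, sizeof buf);  buf is a pointer here, so this clears
     *                                only sizeof(char *) bytes
     */
}
```

Neither slip is possible with `char arrc[100] = {0};`, since there is no size or argument order to get wrong.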

Sep 9 '08 #59

pete

CBFalconer wrote:

I disagree. The (c) expression worries only about whether the
character is or is not something with a zero value. The (c !=
'\0') expression expressly converts that zeroness into either the
value 0 or the value 1 before testing. Optimization may affect
this.

I would normally expect the second expression to generate larger
code than does the first, with optimization disabled.

I expect
while (c)
to generate the exact same code as
while ((((((c) != 0) != 0) != 0) != 0) != 0)
with optimization disabled, regardless of the type of (c),
and regardless of whether (0) is replaced by ('\0').
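
A sketch of that equivalence in terms of behaviour, with hypothetical function names. It only shows that the two conditions select the same iterations; whether a given compiler emits identical code is up to the implementation.

```c
#include <stddef.h>

/* Counts characters before the first '\0' using the terse condition. */
size_t count_terse(const char *s)
{
    size_t n = 0;
    while (*s) {
        ++s;
        ++n;
    }
    return n;
}

/* Same loop with redundant explicit comparisons stacked on top. */
size_t count_explicit(const char *s)
{
    size_t n = 0;
    while ((((*s) != 0) != 0) != 0) {
        ++s;
        ++n;
    }
    return n;
}
```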

--
pete

Sep 9 '08 #60

pete

Old Wolf wrote:

On Sep 9, 6:18 pm, arnuld <sunr...@invalid.address> wrote:
>That whole discussion leaves me wondering whether:

char arrc[100] = {0};

is same as:

char arrc[100];
memset(arrc, '\0', 100);

or whether latter is more expensive than former ? and which one is
advised to use by c.l.c ?

They both have the same effect, but the first
one is far better because it is less error-prone.

The first way makes it more obvious to the compiler
what it is that needs to be done.

I like code that gives the compiler a better chance
to take advantage of any available information.

--
pete

Sep 9 '08 #61

vippstar

On Sep 9, 11:58 am, pete <pfil...@mindspring.com> wrote:

Old Wolf wrote:
On Sep 9, 6:18 pm, arnuld <sunr...@invalid.address> wrote:
That whole discussion leaves me wondering whether:

char arrc[100] = {0};

is same as:

char arrc[100];
memset(arrc, '\0', 100);

or whether latter is more expensive than former ? and which one is
advised to use by c.l.c ?

They both have the same effect, but the first
one is far better because it is less error-prone.

In this particular case, yes. I have explained in my other post in
this thread why they are not equivalent for pointers/floating points.

The first way makes it more obvious to the compiler
what it is that needs to be done.

I like code that gives the compiler a better chance
to take advantage of any available information.

Also it's far fewer characters to type, and it's more obvious to the
programmer as well.
If there were more object definitions,

char arrc[100];
int i;
size_t n;
/* ... */

memset(arrc, 0, sizeof arrc);

It's not immediately obvious that the array is going to be zeroed.

Sep 9 '08 #62

Richard

pete <pf*****@mindspring.com> writes:

Richard wrote:

>Erm, we are talking about the nul character. or are we?

We were talking about which kind of expressions
I compare explicitly to zero
and which kind of expressions
I don't explicitly compare to zero.

And why. But you snipped so I don't know...

--

Sep 9 '08 #63

Richard

Richard Heathfield <rj*@see.sig.invalid> writes:

CBFalconer said:

>Richard Heathfield wrote:
>>CBFalconer said:

<snip>

>>>
It already tests the value of the underlined
expression for zero/non-zero.

Do you think 'pete' doesn't already know that?

You forget the destination of Usenet messages. They are public,
not person to person. Do you conceive that all non-petes ALL know
that?

Does everyone really expect every poster to explain every nuance of every
aspect of C they use in every article?

Chuck is unable to follow a thread. He also has zero retention for
posters' obvious skills and previous posts when he feels he can utilise
his ignorance of their posting history to belittle the other poster.

Sep 9 '08 #64

James Kuyper

arnuld wrote:

>On Mon, 08 Sep 2008 19:08:08 +0200, Richard wrote:

>>Nate Eldredge <na**@vulcan.lan> writes:
It's hardly necessary to make veiled accusations of plagiarism for
such a trivial piece of code. I suspect if you put 100 C programmers
in clean rooms and asked them to write an implementation of strcpy,
you'd only get about three essentially different versions, and this is
one of them. K&R is probably the most memorable appearance of the
`*p++ = *q++' idiom, but just because one saw it there and continues
to use it doesn't make one a plagiarist. It's a textbook, after all.

>You would be amazed at how few would actually do it this way. There are
many people out there who discourage such stuff. I even read here once
that it "misuses C" ... the mind boggles. I have seen many code bases
where you hardly ever see a pointer used in its natural habitat. A
crying shame IMO.

I am not a native English man so I don't know what you mean by "....used
in its natural habitat..."

He's using a metaphor, talking about pointers as if they were a species
of animal. C is one of the natural habitats for pointers, far more so
than most other languages. What he's saying is that some people
deliberately write C so that there's practically no use of pointers,
which is a shame, because C is the place where pointers belong. I
presume that he means explicitly declared variables of pointer type. I
can't imagine any significant amount of C code being written without the
use of expressions with a pointer type.

Do you want to say that c.l.c discourages *p++ = *q++ ?

No. He's saying that many people never use that idiom, and that some
people deliberately avoid all use of pointers; he's not saying that clc
has that opinion. I can confirm that first point. I've been asked to
interview dozens of people since 1994 for a variety of entry-level C
positions. I've given every single one of them a test built around

while(*p++ = *q++);

Few of them could even tell me what it does. Almost none of them could
explain to me how it works. The sticking point seems to be figuring out
why it is that the loop actually stops looping. Even the ones who could
tell me that it stops when it reaches a null character, generally could
not tell me WHY it stops when it reaches a null character. We had to
hire such people (!), because people who could actually pass this test
seem to be unavailable, at least at the salary levels we could afford to
offer.
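
For readers stuck on the same point, here is a long-hand sketch of that one-line loop, with hypothetical names. The value of an assignment expression is the value that was stored, so the loop condition is simply the character just copied; when that character is '\0' the condition is zero and the loop ends, after the terminator has already been written.

```c
/* Long-hand equivalent of:  while (*p++ = *q++);  */
char *copy_longhand(char *p, const char *q)
{
    char *const start = p;
    char copied;

    do {
        copied = *q;    /* read the current source character */
        *p = copied;    /* store it; this is the value of the assignment */
        ++p;            /* both pointers step forward ... */
        ++q;            /* ... even when the character was '\0' */
    } while (copied != '\0');   /* the copied character decides whether to loop */

    return start;
}
```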

Sep 9 '08 #65

James Kuyper

arnuld wrote:

>On Mon, 08 Sep 2008 17:48:20 +0000, Richard Heathfield wrote:

>It isn't very good camouflage, though, since any experienced C
programmer will recognise it for what it is. It's just a more elegant
way to write the code. We don't write char foo[8] = { 'H', 'e', 'l',
'l', 'o', '\0', '\0', '\0' } just because it makes explicit the fact
that eight characters are being copied into the array. We write char
foo[8] = "Hello", and trust that competent programmers will understand.

[snip]

That whole discussion leaves me wondering whether:

char arrc[100] = {0};

is same as:

char arrc[100];
memset(arrc, '\0', 100);

or whether latter is more expensive than former ? and which one is
advised to use by c.l.c ?

I strongly favor the first form, despite what I'm about to say. Whether
or not the same code is generated depends upon the implementation, and
some implementations get this very wrong. In particular, I remember my
horror when I found out why a particular piece of code was running so
slow. It basically said something like this:

double array[40][5416] = {0.0};

What I eventually figured out is that the compiler was generating code
(at default optimization levels!) equivalent to the following:

double array[40][5416];
array[0][0] = 0.0;
array[0][1] = 0.0;
// 216637 similar lines
array[39][5415] = 0.0;

My object file and executable both got hundreds of thousands of bytes
smaller when I removed the {0} and replaced it with an explicit
initialization loop. The initialization loop actually made the code run
faster, too; probably because of the time wasted in the {0} version by
loading those 216640 initialization lines into memory.

However, implementations that stupid are rare (I hope!). Note: I
complained to the vendor, who defended the compiler by saying that this
was the "natural" way of handling the initialization.
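
The post does not show the explicit initialization loop itself, so the following is only a sketch of what such a loop might look like. The dimensions come from the declaration above; the function name and parameter layout are assumptions.

```c
enum { ROWS = 40, COLS = 5416 };   /* dimensions taken from the post */

/* Sets every element to 0.0 with a nested loop instead of a {0.0} initialiser. */
void zero_array(double a[][COLS], int rows)
{
    int r, c;

    for (r = 0; r < rows; ++r) {
        for (c = 0; c < COLS; ++c) {
            a[r][c] = 0.0;
        }
    }
}
```

Assigning 0.0 element by element stays correct even on an implementation where the representation of 0.0 is not all-bits-zero, which a memset over the array would not guarantee.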

Sep 9 '08 #66

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:

Keith Thompson wrote:
>>
... snip ...
>>
For example, if I'm examining the value of a character, the
following are equivalent:
if (c) { ... }
if (c != '\0') { ... }
I prefer to write the latter, because the value of c by itself
isn't just a true or false value, but the result of the "!="
operator is.

I disagree.

You disagree with my statement about my own preference?

Remarkable.

The (c) expression worries only about whether the
character is or is not something with a zero value. The (c !=
'\0') expression expressly converts that zeroness into either the
value 0 or the value 1 before testing.

C99 6.8.4.1:

In both forms, the first substatement is executed if the
expression compares unequal to 0. In the else form, the second
substatement is executed if the expression compares equal to 0.

I suppose you could call that a conversion to 0 or 1 (since the "=="
and "!=" operators yield only those values), but it certainly doesn't
have to be implemented that way.

Optimization may affect
this.

I would normally expect the second expression to generate larger
code than does the first, with optimization disabled.

I wouldn't -- or rather, I'd expect most compilers to perform some
minimal level of optimization, and therefore to generate identical
code for the two forms, even with no optimization options specified.

(A quick experiment shows that one compiler does generate identical
code, another does not. Sun's SPARC compiler generates "cmp %l0,%g0"
for one form, "cmp %l0,0" for the other. "%g0" is a register whose
value is always 0. The effect is obviously identical.)

But the difference goes away when optimization is enabled. I'm sure
you weren't suggesting this as a reason to use the shorter form.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 9 '08 #67

Keith Thompson

Richard Heathfield <rj*@see.sig.invalid> writes:

arnuld said:

[...]

>That whole discussion leaves me wondering whether:

char arrc[100] = {0};

is same as:

char arrc[100];
memset(arrc, '\0', 100);

or whether latter is more expensive than former ?

Since you're initialising an array of integers (for chars are integers),
they do the same thing. The first version does it in fewer lines of code.
If the type were non-integer (e.g. pointer, or floating point, or struct
or union of any kind), the two versions would not be equivalent and the
memset version would simply be wrong.

A struct, union, or array whose only non-composite sub-members are of
integer type[*] can safely be initialized with memset, setting the
whole thing to all-bits-zero. It's only unsafe if some of the
sub-members are of floating-point or pointer type.

(Note that it's still not entirely equivalent, since memset will zero
any padding bytes, and {0} won't necessarily do so. Also, {0} might,
on some exotic implementation, use a representation of 0 other than
all-bits-zero -- I think.)

The guarantee that all-bits-zero is a valid representation of 0 for
all integer types doesn't appear in the C90 or C99 standard; it was
added in one of the post-C99 Technical Corrigenda, and can be found in
n1256.pdf.

But it's always possible that a pointer or floating-point member might
be added later during maintenance.
[*] That's a clumsy way of saying it, but I couldn't think of a better
one. You have to consider members of the struct or union, or elements
of the array, and recursively for all sub-members and/or sub-elements,
until you get down to things that are of scalar (numeric or pointer)
type.

>and which one is advised to use by c.l.c ?

That depends on which c.l.c. subscriber you ask, but I'd go for the version
that is right every time, wouldn't you?

Agreed. {0} expresses the intent more clearly and avoids certain
problems. The problems it avoids are rare -- which means they're
difficult to detect.


Sep 9 '08 #68

viza

On Tue, 09 Sep 2008 11:06:49 +0500, arnuld wrote:

>On Mon, 08 Sep 2008 11:51:45 +0500, arnuld wrote:

Here is the new version which puts a check on maximum input:
/* My version of "strcpy - a C Library Function
*
* version 1.1
*
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { ARRSIZE = 4 };

This isn't what enum is for. Use

#define ARRSIZE 4

char* my_strcpy( char*, char* );

const !!!!!

int main( int argc, char** argv )
{
char* pc;
int input_size;

char src[ARRSIZE+1] = {0};
char dest[ARRSIZE+1] = {0};
if( 2 != argc )
{
perror("USAGE: ./exec \" your input \"\n"); exit( EXIT_FAILURE );
}

Don't use perror unless a library function that sets errno has failed or
you have set it yourself.
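
A sketch of the usual alternative, assuming a hypothetical helper name: the usage message goes to stderr through fprintf, so no unrelated errno text is appended to it.

```c
#include <stdio.h>

/* Writes a usage message to the given stream and returns fprintf's result.
   perror is reserved for cases where errno actually describes the failure. */
int print_usage(FILE *stream, const char *program_name)
{
    return fprintf(stream, "USAGE: %s \"your input\"\n", program_name);
}
```

In main() this would be called as `print_usage(stderr, argv[0]);` just before `exit(EXIT_FAILURE);`.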

The one thing I do not understand here, if the arrays are created with
size ARRSIZE or even ARRSIZE+1 ( +1 for extra NULL character), the
output is not affected. Since the user has to enter 4 characters in this
case like "Love" + 1 for NULL but even with ARRSIZE = 4 in total, it
works fine. Is there some problem here ?

No, it's the opposite of a problem. It should have exploded your
computer but because there just happened to be a null character (NB: not
NULL character) where you are supposed to put one yourself when you ran
it, it didn't.

You cannot rely on it just being there by coincidence, any more than you
can jump out of a window relying on a truck full of feathers to be
passing by.
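
To make the sizing rule concrete, here is a sketch with hypothetical names: a buffer that must hold up to MAX_CHARS visible characters needs MAX_CHARS + 1 elements, because strcpy writes the terminating null character as well.

```c
#include <string.h>

enum { MAX_CHARS = 4 };   /* visible characters allowed, like ARRSIZE above */

/* Copies input into dest and returns 1 if it fits, otherwise returns 0.
   dest must have room for MAX_CHARS + 1 bytes. */
int copy_if_fits(char *dest, const char *input)
{
    if (strlen(input) > MAX_CHARS) {
        return 0;               /* no room for the characters plus '\0' */
    }
    strcpy(dest, input);        /* writes strlen(input) + 1 bytes */
    return 1;
}
```

Declaring the destination as `char dest[MAX_CHARS]` instead would make the call with "Love" write one byte past the end of the array, which is undefined behaviour even when the program appears to work.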

Sep 10 '08 #69

CBFalconer

Old Wolf wrote:

CBFalconer <cbfalco...@yahoo.com> wrote:
>Keith Thompson wrote:

if (c) { ... }
if (c != '\0') { ... }

I disagree. The (c) expression worries only about whether the
character is or is not something with a zero value. The (c !=
'\0') expression expressly converts that zeroness into either the
value 0 or the value 1 before testing. Optimization may affect
this.

I would normally expect the second expression to generate larger
code than does the first, with optimization disabled.

I don't know what you're smoking today, but the above two if
statements are exactly equivalent in effect, and I challenge you
to come up with a compiler that generates different code for them,
let alone one that actually causes a different code branch to
execute.

I didn't say the effect differed. I said the code generated
differed, before optimization. It has to, because one uses the
value of c, and the other converts that to 0 or 1 before testing.


Sep 10 '08 #70

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:

Old Wolf wrote:
>CBFalconer <cbfalco...@yahoo.com> wrote:
>>Keith Thompson wrote:

if (c) { ... }
if (c != '\0') { ... }

I disagree. The (c) expression worries only about whether the
character is or is not something with a zero value. The (c !=
'\0') expression expressly converts that zeroness into either the
value 0 or the value 1 before testing. Optimization may affect
this.

I would normally expect the second expression to generate larger
code than does the first, with optimization disabled.

I don't know what you're smoking today, but the above two if
statements are exactly equivalent in effect, and I challenge you
to come up with a compiler that generates different code for them,
let alone one that actually causes a different code branch to
execute.

I didn't say the effect differed. I said the code generated
differed, before optimization. It has to, because one uses the
value of c, and the other converts that to 0 or 1 before testing.

No, it doesn't have to differ. I suppose a compiler could generate
painfully naive code with some options, but there's no reason for the
value to be converted to 0 or 1. Even in the case where I found a
difference, there was no such conversion.

In any case, by definition the statement "if (c) { ... }" compares
the value of c to 0. That comparison is done by the equivalent of
"c != 0", which yields a value of 0 or 1. There's just as much basis
(i.e., practically none) for assuming that "if (c)" will convert the
result to 0 or 1 as for assuming that "if (c != '\0')" will do so.


Sep 10 '08 #71

Keith Thompson

viza <to******@gm-il.com.obviouschange.invalid> writes:

On Tue, 09 Sep 2008 11:06:49 +0500, arnuld wrote:

[...]

>enum { ARRSIZE = 4 };

This isn't what enum is for.

Says who?

Use

#define ARRSIZE 4

Why?

Here's what the standard says:

The identifiers in an enumerator list are declared as constants
that have type int and may appear wherever such are permitted.

Perhaps enum wasn't designed for the purpose of declaring single
constants, but it does it quite well (if you can live with the
restriction to type int). It's a common and clever idiom, and I see
nothing wrong with using it.
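
A sketch comparing the two spellings side by side, with a hypothetical macro name added for the comparison. Both are integer constant expressions, so both can size an array; the enumerator is a real identifier of type int that follows the normal scope rules, while the macro is replaced textually by the preprocessor.

```c
enum { ARRSIZE = 4 };        /* enumeration constant of type int */
#define ARRSIZE_MACRO 4      /* hypothetical macro equivalent */

/* Returns 1 if arrays sized either way end up the same size. */
int sizes_match(void)
{
    char by_enum[ARRSIZE + 1];
    char by_macro[ARRSIZE_MACRO + 1];

    return sizeof by_enum == sizeof by_macro;
}
```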


Sep 10 '08 #72

arnuld

On Tue, 09 Sep 2008 08:09:49 +0000, Richard Heathfield wrote:

Nevertheless, it is still basically the K&R2 code. Where do you think
Stroustrup first saw it?

I knew you will be ready to rub me for this ;) .

I meant, I did not see that code in K&R2, I learned that from Stroustrup,
so I don't have much idea of whether K&R2 has this code or not. And I
don't know where Stroustrup saw it first time :P


Sep 10 '08 #73

arnuld

On Tue, 09 Sep 2008 19:07:14 +1200, Ian Collins wrote:

You still haven't fixed the const on the src parameter.

OMG! , I just forgot :-\


Sep 10 '08 #74

Richard

Keith Thompson <ks***@mib.org> writes:

Richard Heathfield <rj*@see.sig.invalid> writes:
>arnuld said:
[...]

>>That whole discussion leaves me wondering whether:

char arrc[100] = {0};

is same as:

char arrc[100];
memset(arrc, '\0', 100);

or whether latter is more expensive than former ?

Since you're initialising an array of integers (for chars are integers),
they do the same thing. The first version does it in fewer lines of code.
If the type were non-integer (e.g. pointer, or floating point, or struct
or union of any kind), the two versions would not be equivalent and the
memset version would simply be wrong.

A struct, union, or array whose only non-composite sub-members are of
integer type[*] can safely be initialized with memset, setting the
whole thing to all-bits-zero. It's only unsafe if some of the
sub-members are of floating-point or pointer type.

(Note that it's still not entirely equivalent, since memset will zero
any padding bytes, and {0} won't necessarily do so. Also, {0} might,
on some exotic implementation, use a representation of 0 other than
all-bits-zero -- I think.)

Phew! I thought it was only me that didn't understand this stuff anymore
after reading posts in c.l.c!

Sep 10 '08 #75

Richard

CBFalconer <cb********@yahoo.com> writes:

Old Wolf wrote:
>CBFalconer <cbfalco...@yahoo.com> wrote:
>>Keith Thompson wrote:

if (c) { ... }
if (c != '\0') { ... }

I disagree. The (c) expression worries only about whether the
character is or is not something with a zero value. The (c !=
'\0') expression expressly converts that zeroness into either the
value 0 or the value 1 before testing. Optimization may affect
this.

I would normally expect the second expression to generate larger
code than does the first, with optimization disabled.

I don't know what you're smoking today, but the above two if
statements are exactly equivalent in effect, and I challenge you
to come up with a compiler that generates different code for them,
let alone one that actually causes a different code branch to
execute.

I didn't say the effect differed. I said the code generated
differed, before optimization. It has to, because one uses the
value of c, and the other converts that to 0 or 1 before testing.

I don't think I ever heard something SO wrong before in a technical news
group. Congratulations Chuck. You have proven me wrong on one thing -
you can indeed get more ridiculous.

Sep 10 '08 #76

CBFalconer

Keith Thompson wrote:

CBFalconer <cb********@yahoo.com> writes:

.... snip ...

>
>I didn't say the effect differed. I said the code generated
differed, before optimization. It has to, because one uses the
value of c, and the other converts that to 0 or 1 before testing.

No, it doesn't have to differ. I suppose a compiler could generate
painfully naive code with some options, but there's no reason for
the value to be converted to 0 or 1. Even in the case where I
found a difference, there was no such conversion.

In any case, by definition the statement "if (c) { ... }" compares
the value of c to 0. That comparison is done by the equivalent of
"c != 0", which yields a value of 0 or 1. There's just as much
basis (i.e., practically none) for assuming that "if (c)" will
convert the result to 0 or 1 as for assuming that "if (c != '\0')"
will do so.

Well, we have different ideas of what optimization does and where
it starts. To me, the compiler parses "if (" and then compiles a
statement, requiring a ')' termination char. If the statement is a
comparison, it must do the conversion to 0 or 1 because those are
the only values that a comparison can yield. When it finds the ')'
termination char it evaluates that statement and makes the decision
based on a zero or non-zero value. Optimization can go over the
code generated above as it wishes to remove unnecessary code.


Sep 10 '08 #77

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:

Keith Thompson wrote:
>CBFalconer <cb********@yahoo.com> writes:

... snip ...
>>
>>I didn't say the effect differed. I said the code generated
differed, before optimization. It has to, because one uses the
value of c, and the other converts that to 0 or 1 before testing.

No, it doesn't have to differ. I suppose a compiler could generate
painfully naive code with some options, but there's no reason for
the value to be converted to 0 or 1. Even in the case where I
found a difference, there was no such conversion.

In any case, by definition the statement "if (c) { ... }" compares
the value of c to 0. That comparison is done by the equivalent of
"c != 0", which yields a value of 0 or 1. There's just as much
basis (i.e., practically none) for assuming that "if (c)" will
convert the result to 0 or 1 as for assuming that "if (c != '\0')"
will do so.

Well, we have different ideas of what optimization does and where
it starts. To me, the compiler parses "if (" and then compiles a
statement, requiring a ')' termination char. If the statement is a
comparison, it must do the conversion to 0 or 1 because those are
the only values that a comparison can yield. When it finds the ')'
termination char it evaluates that statement and makes the decision
based on a zero or non-zero value. Optimization can go over the
code generated above as it wishes to remove unnecessary code.

You mean expression, not statement.

Given:

if (expr) stmt;

the "stmt;" part is executed "if the expression compares unequal to
0". So by your reasoning, if the expression is "c" (the value of an
object of type char), then c must be compared to 0 by the equivalent
of "c != 0", yielding a value of 0 or 1, which determines the
subsequent control flow. If the expression is "c != '\0'", then that
comparison is done, yielding 0 or 1, and then that result is compared
to 0, again yielding 0 or 1.

Typically these trivial comparisons aren't eliminated by a separate
optimization phase, they're just not generated in the first place.
For example, the compiler's processing of a "!=" operator might vary
depending on the context in which it appears; if it's assigned to an
object, then it has to result in a value of 0 or 1 (which may require
extra code), but if it's the top-level operator of an if or while
condition, it might just result directly in a "branch if equal"
instruction (or, more likely, the equivalent in some intermediate
code).
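
The two contexts described above can be sketched as follows, with hypothetical function names. The example only shows the source-level difference; what any particular compiler generates for each is not something it can demonstrate.

```c
/* The result of != is stored in an object, so it must exist as 0 or 1. */
int is_nonzero(char c)
{
    int flag = (c != '\0');
    return flag;
}

/* The same comparison only steers control flow; no 0/1 value is ever stored. */
const char *describe(char c)
{
    if (c != '\0') {
        return "non-zero";
    }
    return "zero";
}
```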


Sep 11 '08 #78
