Bytes IT Community

Trying just to read/understand this code

P: n/a
Below is a section from string.c at this link
http://cvs.opensolaris.org/source/xref/on/usr/src/common/util/string.c that
I am trying to fully understand. I don't fully understand LINE 514, not to
mention the entire inner while loop and what it's trying to accomplish. I
figured that if I can at least understand each line of this strstr function and
why it's written the way it's written, with regard to both performance and
code simplicity, it will help me learn C. Thanks.

497 char *
498 strstr(const char *as1, const char *as2)
499 {
500     const char *s1, *s2;
501     const char *tptr;
502     char c;
503
504     s1 = as1;
505     s2 = as2;
506
507     if (s2 == NULL || *s2 == '\0')
508         return ((char *)s1);
509     c = *s2;
510
511     while (*s1)
512         if (*s1++ == c) {
513             tptr = s1;
514             while ((c = *++s2) == *s1++ && c)
515                 ;
516             if (c == 0)
517                 return ((char *)tptr - 1);
518             s1 = tptr;
519             s2 = as2;
520             c = *s2;
521         }
522
523     return (NULL);
524 }
Aug 5 '06 #1
54 Replies


P: n/a

Wrap the function above in a main then just view the locals in a
debugger. That's the best way to learn what is happening. Also, it's
just basic pointer math, so you might want to write some small programs to
practice your pointer math.
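
For anyone who wants to try this, here is a minimal sketch of a wrapper. It assumes the routine is renamed `my_strstr` (a hypothetical name, chosen only so that it does not clash with the library's own `strstr`), and `trace_my_strstr` is likewise a made-up helper that prints where a match starts. The comment on the `tptr` line marks a reasonable place for a breakpoint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The routine from the question, renamed (hypothetical name) so that it
 * does not collide with the standard library's strstr(). */
char *
my_strstr(const char *as1, const char *as2)
{
    const char *s1, *s2;
    const char *tptr;
    char c;

    s1 = as1;
    s2 = as2;

    if (s2 == NULL || *s2 == '\0')
        return ((char *)s1);
    c = *s2;

    while (*s1)
        if (*s1++ == c) {
            tptr = s1;    /* breakpoint here: watch s1, s2 and c */
            while ((c = *++s2) == *s1++ && c)
                ;
            if (c == 0)
                return ((char *)tptr - 1);
            s1 = tptr;
            s2 = as2;
            c = *s2;
        }

    return (NULL);
}

/* Hypothetical helper: run one search and report the result. */
void
trace_my_strstr(const char *haystack, const char *needle)
{
    char *hit;

    assert(haystack != NULL);
    hit = my_strstr(haystack, needle);
    if (hit != NULL)
        printf("\"%s\" found in \"%s\" at offset %ld\n",
            needle, haystack, (long)(hit - haystack));
    else
        printf("\"%s\" not found in \"%s\"\n", needle, haystack);
}
```

Calling `trace_my_strstr("catenation", "nat")` from a `main` and stepping through it in a debugger shows `s1` and `s2` moving through the two strings.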

Brian

Aug 5 '06 #2

P: n/a
On 2006-08-05, smnoff <rh******@hotmail.com> wrote:
> Below is a section from string.c at this link
> http://cvs.opensolaris.org/source/xref/on/usr/src/common/util/string.c that
> I am trying to fully understand. I don't fully understand LINE 514, not to
> mention the entire inner while loop and what it's trying to accomplish. I
> figured that if I can at least understand each line of this strstr function and
> why it's written the way it's written, with regard to both performance and
> code simplicity, it will help me learn C. Thanks.
>
> 497 char *
> 498 strstr(const char *as1, const char *as2)

We're looking for a sequence of characters in the string as1 that match
as2.

> 499 {
> 500     const char *s1, *s2;
> 501     const char *tptr;
> 502     char c;
> 503
> 504     s1 = as1;
> 505     s2 = as2;
> 506
> 507     if (s2 == NULL || *s2 == '\0')
> 508         return ((char *)s1);
> 509     c = *s2;
> 510
> 511     while (*s1)
> 512         if (*s1++ == c) {
> 513             tptr = s1;

At this point *(s1 - 1) and *s2 both contain copies of the same
character. So this is a candidate for a match, but we need to check if
the two pointers carry on pointing to the same characters if we
increment them each one byte at a time until the last
non-null-terminating character in s2.

strstr("catenation", "nat") should match, but strstr("catenation",
"natj") should return NULL. At this point we've got s1-1 pointing to the
n of "catenation" and s2 pointing to the n of "nat". We need to walk
through both strings, to match up the 'a' and the 't' and find our way
to the 'j'.

We could write the next bit perhaps more readably like this:

    const char *p = s1 - 1;
    const char *q = s2;
    size_t i;

    for (i = 0; q[i] != '\0'; i++)
        if (p[i] != q[i])
            return NULL; /* partial match, but not good enough */

    /* If we get here then as2 is a substring of as1 ... */

> 514             while ((c = *++s2) == *s1++ && c)
> 515                 ;

So, here we're doing the same thing: we increment s2 and s1 for as long
as they continue to point to equal bytes, and s2 hasn't got to a null
terminator. Lines 514 and 515 mean basically this in pseudocode:

    forever
        increment s2
        let c = *s2

        if c != s1
            increment s1
            break
        endif

        increment s1
        if c == '\0' break
    endfor
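
The index-based comparison can be turned into a complete function. What follows is only a sketch under my own assumptions: the names `matches_at` and `readable_strstr` are hypothetical, and a failed partial match moves on to the next starting position instead of returning NULL straight away, which is what a full strstr has to do.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every character of needle matches the characters of
 * candidate starting at candidate[0], otherwise 0. */
static int
matches_at(const char *candidate, const char *needle)
{
    size_t i;

    for (i = 0; needle[i] != '\0'; i++)
        if (candidate[i] != needle[i])
            return 0;    /* also stops if candidate ends first */
    return 1;
}

/* Hypothetical, more verbose equivalent of the routine under discussion. */
char *
readable_strstr(const char *haystack, const char *needle)
{
    const char *p;

    assert(haystack != NULL);
    if (needle == NULL || *needle == '\0')
        return (char *)haystack;

    /* Try each starting position in turn. */
    for (p = haystack; *p != '\0'; p++)
        if (matches_at(p, needle))
            return (char *)p;
    return NULL;
}
```

The inner comparison never reads past the end of the haystack, because a '\0' in `candidate` cannot equal a non-'\0' character of `needle`.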
Aug 5 '06 #3

P: n/a
In your pseudocode, why does the line

    if c != s1

use != as opposed to == as in the actual code?

Aug 7 '06 #4

P: n/a
On 2006-08-07, smnoff <34**************@hotmail.com> wrote:
> In your pseudocode, why is the
>
> if c != s1
>
> use the != as opposed to the == like in the actual code?

It should be c != *s1. My mistake.

But != is right, because in the pseudocode this is the condition for
breaking _out_ of the while loop, but in the actual code, ==, because
it's a condition for staying _in_ the loop.
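
One way to convince yourself that the "stay in" test (==) and the "break out" test (!=) describe the same loop is to write both and compare them. This is a sketch only: `advance_terse` and `advance_verbose` are hypothetical names, and passing the pointers by address is just a device to make the final state visible to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Terse form, as on line 514. On entry *s2p points at the needle
 * character that has just matched and *s1p points one past the matching
 * haystack character. Returns the final value of c. */
char
advance_terse(const char **s1p, const char **s2p)
{
    const char *s1 = *s1p, *s2 = *s2p;
    char c;

    assert(s1 != NULL && s2 != NULL);
    while ((c = *++s2) == *s1++ && c)    /* == means "stay in" */
        ;
    *s1p = s1;
    *s2p = s2;
    return c;
}

/* The same steps written out one per line. */
char
advance_verbose(const char **s1p, const char **s2p)
{
    const char *s1 = *s1p, *s2 = *s2p;
    char c;

    assert(s1 != NULL && s2 != NULL);
    for (;;) {
        s2++;               /* increment s2 */
        c = *s2;            /* let c = *s2 */
        if (c != *s1) {     /* != means "break out" on a mismatch */
            s1++;
            break;
        }
        s1++;
        if (c == '\0')      /* end of needle reached */
            break;
    }
    *s1p = s1;
    *s2p = s2;
    return c;
}
```

Both functions leave the two pointers in the same place and return the same `c`; a return value of 0 means the whole needle matched.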
Aug 7 '06 #5

P: n/a
Any reason why the original programmer of this code put all the increments
on the same line in lines 514 and 515?

And for that matter "jammed" everything into a single line, especially the
assignment of

    c = *++s2

in the while loop statement?

In my opinion, it makes it extremely hard to read what's going on (or even
debug for that matter).

Was this all written in one line of code for performance reasons?

Aug 8 '06 #6

P: n/a
smnoff said:
> Any reason why the original programmer of this code in line 514 and 515
> put all the increments all in the same line?
>
> And for that matter "jammed" everything in a single line? and especially
> the assignment of
>
> c = *++s2
>
> in the while loop statement?

It's relatively idiomatic. I wouldn't feel any particular guilt in writing
code like that. It just means "bump the pointer to grab the value of the
next object along".

> In my opinion, it makes it extremely hard to read what's going on (or even
> debug for that matter.)

In my opinion, it's not a huge deal.

> Was this all written in one line of code for performance reasons?

No. It could be split without any particular performance hit. It would mean
re-coding the loop, of course.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Aug 8 '06 #7

P: n/a
Do you think it's written like that for "job security"?

Aug 8 '06 #8

P: n/a

smnoff wrote:
> Do you think it's written like that for "job security"?

No, it's written like that because any decent C programmer shouldn't
have a problem with it and because it follows the philosophy of tight
code. Tight code has less wriggle room for introducing errors by
maintenance coders. Creating more lines of code and more variables
creates more opportunity for screwing it up. If you are going to modify
that code, you'd better know what you're doing and not just start
hacking at it as so many do. Since this is a low-level function, the
chances of it needing maintenance are slim.

There's a difference between tight code and obfuscated code and this
code is tight, not obfuscated.
Aug 8 '06 #9

P: n/a
smnoff said:
> Do you think it's written like that for "job security"?

No. It's relatively idiomatic.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Aug 8 '06 #10

P: n/a
smnoff wrote:

See below.

Brian
--
Please don't top-post. Your replies belong following or interspersed
with properly trimmed quotes. See the majority of other posts in the
newsgroup.
Aug 8 '06 #11

P: n/a
Wow, for a C language newsgroup that is supposedly idiomatic and can handle
concise syntax (i.e. a top post), they can't seem to figure out where the
replies to postings are and where they belong!!! Maybe the posting was TOO
concise... then again, can't newsgroup postings be idiomatic just like the C
language?

Aug 8 '06 #12

P: n/a
smnoff wrote:

Please don't top-post. Your reply belongs under the portion of the post
you are replying to, not above. See 90% of the posts in this group for
how to do it properly.

> "Richard Heathfield" <in*****@invalid.invalid> wrote in message
> news:5O********************@bt.com...
>> smnoff said:
>>> Any reason why the original programmer of this code in line 514 and 515
>>> put all the increments all in the same line?
>>>
>>> And for that matter "jammed" everything in a single line? and especially
>>> the assignment of
>>>
>>> c = *++s2
>>>
>>> in the while loop statement?
>>
>> It's relatively idiomatic. I wouldn't feel any particular guilt in writing
>> code like that. It just means "bump the pointer to grab the value of the
>> next object along".
>>
>>> In my opinion, it makes it extremely hard to read what's going on (or
>>> even debug for that matter.)
>>
>> In my opinion, it's not a huge deal.
>>
>>> Was this all written in one line of code for performance reasons?
>>
>> No. It could be split without any particular performance hit. It would
>> mean re-coding the loop, of course.
>
> Do you think it's written like that for "job security"?

I don't and I doubt Richard does. c = *++s2 is such a common idiom in C
that most people who have significant experience in C will understand it
without thinking about it. I would do something like that simply because
it is the "normal" way to write it.

>> --
>> Richard Heathfield
>> "Usenet is a strange place" - dmr 29/7/1999
>> http://www.cpax.org.uk
>> email: rjh at above domain (but drop the www, obviously)

Please don't quote people's sigs, i.e. the bit above you quoted from
Richard's post, unless you are commenting on them.
--
Flash Gordon
Still sigless on this computer.
Aug 8 '06 #13

P: n/a
smnoff wrote:
> Wow, for a C language newsgroup that is supposedly idiomatic and can
> handle concise syntax (i.e. a top post), they can't seem to figure
> out where the replies to postings are and where they belong!!! Maybe
> the posting was TOO concise...then again can't newsgroup posting be
> idiomatic just like the C language?

*plonk*


Brian

Aug 8 '06 #14

P: n/a
smnoff said:
> Wow, for a C language newsgroup that is supposedly idiomatic

No, the /expression under discussion/ was idiomatic. The newsgroup itself is
not idiomatic. It's a newsgroup.

> and can handle concise syntax (i.e. a top post),

There's nothing concise about top-posting.

> they can't seem to figure out where the
> replies to postings are and where they belong

Sounds like you're spoiling for a fight. Sorry to disappoint you. If others
want to play that game, that's up to them. But if you can't be bothered to
work out how to post to comp.lang.c in a way that minimises wasted time for
your readers, then I can't be bothered to read your stuff.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Aug 8 '06 #15

P: n/a
"smnoff" <34**************@hotmail.com> writes:
> Wow, for a C language newsgroup that is supposedly idiomatic and can handle
> concise syntax (i.e. a top post), they can't seem to figure out where the
> replies to postings are and where they belong!!! Maybe the posting was TOO
> concise...then again can't newsgroup posting be idiomatic just like the C
> language?

You misunderstand.

There is nothing "concise" about top-posting, and it's inconsistent
with the conventions that have evolved in this newsgroup over the last
few decades. By ignoring those conventions, you make your postings
more difficult to read, and many of us won't bother.

Please read <http://www.caliburn.nl/topposting.html> for more
information.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 8 '06 #16

P: n/a
On 2006-08-08, smnoff <34**************@hotmail.com> wrote:
>>>> 514 while ((c = *++s2) == *s1++ && c)
>>>> 515 ;
> [snip]
> Any reason why the original programmer of this code in line 514 and
> 515 put all the increments all in the same line?
> And for that matter "jammed" everything in a single line? and
> especially the assignment of
> c = *++s2
> in the while loop statement?
> In my opinion, it makes it extremely hard to read what's going on (or
> even debug for that matter.)

You get used to it, I don't think it's hard to read. I think people are
inspired to write C like this when they read the demonstration of
writing strcpy in K&R, which starts with a fairly verbose version, and
ends up saying well you might as well just write:

    while (*s2++ = *s1++);

This is easy to read surely? You often get a compiler warning like
"suggest parentheses around assignment used as truth value" for this
kind of thing, however.
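
As a sketch of how that copy idiom is usually written so that the assignment is visibly deliberate (the function name `copy_string` and the parameter names `dst` and `src` are my own, not K&R's or the library's):

```c
#include <assert.h>
#include <stddef.h>

/* Copies src, including its terminating '\0', into dst and returns dst.
 * The caller must supply a dst buffer large enough for the copy. */
char *
copy_string(char *dst, const char *src)
{
    char *start = dst;

    assert(dst != NULL && src != NULL);
    /* The explicit comparison shows the assignment is intentional; an
     * extra pair of parentheses around the assignment is the other
     * common way to say the same thing. */
    while ((*dst++ = *src++) != '\0')
        ;
    return start;
}
```

The behaviour is the same as the one-line form: each character is copied, then tested, and the loop stops after the terminator has been copied.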

The opensolaris strstr was a bit hard to read, IMHO, because of the way
s1 and s2 were out-of-step so you got a pre-increment on one of them and
a post-increment on the other. Then again, strstr's always going to be a
bit fiddly.

> Was this all written in one line of code for performance reasons?

Unlikely, although it's possible, because performance is a high priority
for library functions like strstr. It may be that this particular way of
writing it produced better code on the compiler/machine this started
life on.
Aug 8 '06 #17

P: n/a
"smnoff" <34**************@hotmail.com> wrote:
> Wow, for a C language newsgroup that is supposedly idiomatic and can handle
> concise syntax (i.e. a top post),

If you luserish, top-posting Outhouse monkeys were at all concerned with
conciseness, you'd learn to bleeding well snip.

Have fun in my bozo bin, Bubba.

Richard
Aug 9 '06 #18

P: n/a
"smnoff" <34**************@hotmail.com> writes:
> Any reason why the original programmer of this code in line 514 and 515 put
> all the increments all in the same line?
>
> And for that matter "jammed" everything in a single line? and especially the
> assignment of
>
> c = *++s2
>
> in the while loop statement?

This would be perfectly readable, bog standard C. Nothing special or
gimmicky. I would be surprised to see someone break it into two
statements such as

    s2++;
    c=*s2;

to be honest. Use the features of the language. Don't try to program a
language like another language.

e.g. do use

    x=(f?a:b);

rather than

    if(f)
        x=a;
    else
        x=b;

Such things can and do seem a little strange at first, but like

    while(*d++=*s++);

it is perfectly clear and recognisable to any C programmer worth his
salt.

Aug 9 '06 #19

P: n/a
I am trying to understand why line 516:

    516 if (c == 0)

is using 0 as the indicator to know when the match is successful, as opposed
to something like '\0'.

Can anyone please explain?

Aug 24 '06 #20

P: n/a

"smnoff" <34**************@hotmail.com> wrote in message
news:rmpHg.100728$LF4.87736@dukeread05...
> I am trying to understand why line 516:
>
> 516 if (c == 0)
>
> is using 0 as the indicator to know when the match is successful as
> opposed to something like '\0'
>
> Can anyone please explain?

C uses zero to represent logical false. Also as the string terminating
character. Also to represent a pointer to nothing, or the null pointer.
Rather an overworked beast.

'\0' to represent a string-terminating character is slightly more correct
than 0, but the integer value of the character must be zero, and 0 is less
typing. So you will see both.
--
www.personal.leeds.ac.uk/~bgy1mm
freeware games to download.

Aug 25 '06 #21

P: n/a
Malcolm wrote:
> "smnoff" <34**************@hotmail.com> wrote in message
>> I am trying to understand why line 516:
>>
>> 516 if (c == 0)
>>
>> is using 0 as the indicator to know when the match is successful
>> as opposed to something like '\0'
>>
>> Can anyone please explain?
>
> C uses zero to represent logical false. Also as the string
> terminating character. Also to represent a pointer to nothing, or
> the null pointer. Rather an overworked beast.
>
> '\0' to represent a string-terminating character is slightly more
> correct than 0, but the integer value of the character must be
> zero, and 0 is less typing. So you will see both.

'\0' is a constant, of type integer. In "if (expression)" the
expression has logical type, and is considered false iff it
evaluates to zero. Any other value is considered true.
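
The equivalence of the different spellings can be checked directly. This is a small sketch with hypothetical function names (`is_terminator`, `char_constant_has_int_size`); the second function relies on the fact that in C, unlike C++, a character constant has type int.

```c
#include <assert.h>

/* Returns 1 if c is the string terminator. The test is written three
 * equivalent ways and the results are checked against each other. */
int
is_terminator(char c)
{
    int by_zero = (c == 0);
    int by_char_constant = (c == '\0');
    int by_negation = !c;

    assert(by_zero == by_char_constant && by_zero == by_negation);
    return by_zero;
}

/* Returns 1 when '\0' occupies the same size as an int, which is the
 * case in C because the constant's type is int. */
int
char_constant_has_int_size(void)
{
    return sizeof('\0') == sizeof(int);
}
```

So the choice between `c == 0` and `c == '\0'` on line 516 is about what reads best to a human.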

--
Chuck F (cb********@yahoo.com) (cb********@maineline.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE maineline address!
Aug 25 '06 #22

P: n/a
CBFalconer wrote:
> Malcolm wrote:
>> "smnoff" <34**************@hotmail.com> wrote in message
>>> I am trying to understand why line 516:
>>>
>>> 516 if (c == 0)
>>>
>>> is using 0 as the indicator to know when the match is successful
>>> as opposed to something like '\0'
>>>
>>> Can anyone please explain?
>>
>> C uses zero to represent logical false. Also as the string
>> terminating character. Also to represent a pointer to nothing, or
>> the null pointer. Rather an overworked beast.
>>
>> '\0' to represent a string-terminating character is slightly more
>> correct than 0, but the integer value of the character must be
>> zero, and 0 is less typing. So you will see both.
>
> '\0' is a constant, of type integer.

ITYM "of type int"

--
pete
Aug 25 '06 #23

P: n/a
pete wrote:
> CBFalconer wrote:
>> Malcolm wrote:
>>> "smnoff" <34**************@hotmail.com> wrote in message
>>>> I am trying to understand why line 516:
>>>>
>>>> 516 if (c == 0)
>>>>
>>>> is using 0 as the indicator to know when the match is successful
>>>> as opposed to something like '\0'
>>>>
>>>> Can anyone please explain?
>>>
>>> C uses zero to represent logical false. Also as the string
>>> terminating character. Also to represent a pointer to nothing, or
>>> the null pointer. Rather an overworked beast.
>>>
>>> '\0' to represent a string-terminating character is slightly more
>>> correct than 0, but the integer value of the character must be
>>> zero, and 0 is less typing. So you will see both.
>>
>> '\0' is a constant, of type integer.
>
> ITYM "of type int"

Not meaning to suggest that "of type integer" is wrong,
because it is correct.

--
pete
Aug 25 '06 #24

P: n/a
On Fri, 25 Aug 2006 04:05:23 UTC, "Malcolm" <re*******@btinternet.com>
wrote:
> "smnoff" <34**************@hotmail.com> wrote in message
> news:rmpHg.100728$LF4.87736@dukeread05...
>> I am trying to understand why line 516:
>>
>> 516 if (c == 0)
>>
>> is using 0 as the indicator to know when the match is successful as
>> opposed to something like '\0'
>>
>> Can anyone please explain?
>
> C uses zero to represent logical false. Also as the string terminating
> character. Also to represent a pointer to nothing, or the null pointer.
> Rather an overworked beast.
> '\0' to represent a string-terminating character is slightly more correct
> than 0,

No. It doesn't matter if you use '\0', 0, 0x00 or 000 or 1-1. Writing
'\0' as the string terminator is only better documentation for the human
reader.

> but the integer value of the character must be zero, and 0 is less
> typing. So you will see both.

The same holds for pointers. Using NULL instead of 0 is only better
documentation. You may even use '\0', but that confuses only the human
reader, not the compiler.
--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!
Aug 25 '06 #25

P: n/a
So what's better performance wise?

    c == 0

or

    c == '\0'

or even something else?

Aug 26 '06 #26

P: n/a
smnoff wrote:
> So what's better performance wise?
>
> c == 0
>
> or
>
> c == '\0'
>
> or even something else?

Neither. They mean exactly the same thing.

Kindly DO NOT top-post. See the links below.

--
Some informative links:
news:news.announce.newusers
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html
Aug 26 '06 #27

P: n/a

pete wrote:
> pete wrote:
>> CBFalconer wrote:
>>> Malcolm wrote:
>>>> "smnoff" <34**************@hotmail.com> wrote in message
>>>>> I am trying to understand why line 516:
>>>>>
>>>>> 516 if (c == 0)
>>>>>
>>>>> is using 0 as the indicator to know when the match is successful
>>>>> as opposed to something like '\0'
>>>>>
>>>>> Can anyone please explain?
>>>>
>>>> C uses zero to represent logical false. Also as the string
>>>> terminating character. Also to represent a pointer to nothing, or
>>>> the null pointer. Rather an overworked beast.
>>>>
>>>> '\0' to represent a string-terminating character is slightly more
>>>> correct than 0, but the integer value of the character must be
>>>> zero, and 0 is less typing. So you will see both.
>>>
>>> '\0' is a constant, of type integer.
>>
>> ITYM "of type int"
>
> Not meaning to suggest that "of type integer" is wrong,
> because it is correct.

"Of integer type" might be right; "of type integer" isn't
appropriate, since 'integer' might be a type name for a
type other than int -

    typedef unsigned long integer;

Aug 26 '06 #28

P: n/a
On Fri, 25 Aug 2006 21:28:34 -0500, in comp.lang.c , "smnoff"
<34**************@hotmail.com> wrote:
> So what's better performance wise?
>
> c == 0
>
> or
>
> c == '\0'

They're identical since '\0' is an integer with value zero.

> or even something else?

As long as the compiler could determine that it was zero, you could
use any expression and it would be replaced by zero by any compiler
worth its salt.

    c== (12-12);

and quite possibly

    c== (strlen("hello")-strlen("hello"));

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Aug 26 '06 #29

P: n/a
"Mark McIntyre" <ma**********@spamcop.net> wrote in message
> they're identical since '\0' is an integer with value zero.

Does '\0' get converted to an integer at compile time? I am thinking yes.
And is NULL also an integer of value 0 as well?
Aug 26 '06 #30

P: n/a

smnoff wrote:
> "Mark McIntyre" <ma**********@spamcop.net> wrote in message
>> they're identical since '\0' is an integer with value zero.
>
> Does '\0' get converted to an integer at compile time? I am thinking yes.
> And is NULL also an integer of value 0 as well?

It doesn't get converted to an integer, it's just an integer.
It might be convenient to think of it as a character, but
the type is int.

Aug 26 '06 #31

P: n/a
On Sat, 26 Aug 2006 16:48:56 -0500, in comp.lang.c , "smnoff"
<34**************@hotmail.com> wrote:
>> "Mark McIntyre" <ma**********@spamcop.net> wrote in message
>> they're identical since '\0' is an integer with value zero.
>
> Does '\0' get converted to an integer at compile time? I am thinking yes.

No. '\0' IS an integer.

From 6.4.4.4 Character constants

    An integer character constant is a sequence of one or more multibyte
    characters enclosed in single-quotes, as in 'x'...
    ....
    The octal digits that follow the backslash in an octal escape sequence
    are taken to be part of the construction of a single character for an
    integer character constant....

> And is NULL also an integer of value 0 as well?

NULL is defined as 0 or '\0'.

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Aug 26 '06 #32

P: n/a
smnoff wrote:
> So what's better performance wise?
>
> c == 0
>
> or
>
> c == '\0'
>
> or even something else?

No.

--
pete
Aug 27 '06 #33

P: n/a
Mark McIntyre wrote:
> On Sat, 26 Aug 2006 16:48:56 -0500, in comp.lang.c , "smnoff"
> <34**************@hotmail.com> wrote:
>>> "Mark McIntyre" <ma**********@spamcop.net> wrote in message
>>> they're identical since '\0' is an integer with value zero.
> <snip>
>> And is NULL also an integer of value 0 as well?
>
> NULL is defined as 0 or '\0'.

Or, amongst other possibilities, (void*)0 which most definitely is not
an integer.
--
Flash Gordon
Aug 27 '06 #34

P: n/a
On Sat, 26 Aug 2006 21:48:56 UTC, "smnoff"
<34**************@hotmail.com> wrote:
>> "Mark McIntyre" <ma**********@spamcop.net> wrote in message
>> they're identical since '\0' is an integer with value zero.
>
> Does '\0' get converted to an integer at compile time? I am thinking yes.

Depends on how it is used. If it is a member of a string, no conversion is
done. If it is an operand of an assignment, it gets converted to the type
of the other operand. '\0' in source can be converted at compile time to
any type, even to a null pointer constant.

> And is NULL also an integer of value 0 as well?

Depends on how NULL is defined. NULL is defined to be a null pointer
constant. If NULL is a macro defined as (void*)0, then conversion to
any type other than a pointer type will fail; if it is defined simply as
0, then conversion is the same as for 0 or '\0'.
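
Whichever way the implementation defines NULL, code that only compares pointers against it behaves the same. A minimal sketch, with a hypothetical function name `is_null_pointer`:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if p is a null pointer. The three tests are equivalent no
 * matter how the implementation chooses to define the NULL macro. */
int
is_null_pointer(const char *p)
{
    int by_macro = (p == NULL);
    int by_zero = (p == 0);    /* 0 here is a null pointer constant */
    int by_negation = !p;

    assert(by_macro == by_zero && by_macro == by_negation);
    return by_macro;
}
```

This is the same test that line 507 of the routine under discussion performs on s2 before dereferencing it.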
--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!
Aug 27 '06 #35

P: n/a
I am a little confused by the WHILE loop at line 511 in the original
post. Lines 512 to 521 are the IF statement, and following that is
line 523, the return statement.

Is the IF statement in lines 512 to 521 seen as the single and only
statement of the WHILE loop in line 511? Or does the loop body extend to
the next semicolon, in the return statement on line 523?

Aug 28 '06 #36

smnoff wrote:
Is line 512 to 521 of the IF statement seen as the single and only line of
the WHILE loop in Line 511?
Yeah. And I totally refuse to buy any argument I've ever heard
against using brackets on control structures, even when they are only
one-statement blocks.
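
A small sketch of the rule behind that "Yeah": a while without braces controls exactly one statement, and an if together with its braced block counts as that one statement. The function count_char is a made-up example, not part of the Solaris source:

```c
#include <stddef.h>

/* Counts how often ch occurs in s. The while has no braces, so its body is
 * the single if statement that follows, block and all. */
size_t count_char(const char *s, char ch)
{
    size_t n = 0;

    while (*s)
        if (*s++ == ch) {
            n++;
        }

    return n;   /* not part of the loop: runs once, after the while ends */
}
```

With this definition, count_char("banana", 'a') evaluates to 3. The return plays the same role as line 523 in the quoted routine.
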

In my shop, you'd format it like this, or maybe you'd be happier
elsewhere :-)

strstr(const char *as1, const char *as2)
{
    const char *s1,
               *s2;
    const char *tptr;
    char c;

    s1 = as1;
    s2 = as2;

    if (s2 == NULL || *s2 == '\0') {
        return ((char *) s1);
    }
    c = *s2;

    while (*s1) {
        if (*s1++ == c) {
            tptr = s1;
            while ((c = *++s2) == *s1++ && c);
            if (c == 0) {
                return ((char *) tptr - 1);
            }
            s1 = tptr;
            s2 = as2;
            c = *s2;
        }
    }
    return (NULL);
}
Aug 28 '06 #37

So I got to ask, why does code get written like this to begin with?
Especially since this is library code? Job Security? Just because they can?
Neat trick?
"jmcgill" <jm*****@email.arizona.edu> wrote in message
news:2%FIg.4034$y61.2475@fed1read05...
smnoff wrote:
>Is line 512 to 521 of the IF statement seen as the single and only line
of
the WHILE loop in Line 511?

Yeah. And I totally refuse to buy any argument I've ever heard
against using brackets on control structures, even when they are only
one-statement blocks.

In my shop, you'd format it like this, or maybe you'd be happier
elsewhere :-)

strstr(const char *as1, const char *as2)
{
    const char *s1,
               *s2;
    const char *tptr;
    char c;

    s1 = as1;
    s2 = as2;

    if (s2 == NULL || *s2 == '\0') {
        return ((char *) s1);
    }
    c = *s2;

    while (*s1) {
        if (*s1++ == c) {
            tptr = s1;
            while ((c = *++s2) == *s1++ && c);
            if (c == 0) {
                return ((char *) tptr - 1);
            }
            s1 = tptr;
            s2 = as2;
            c = *s2;
        }
    }
    return (NULL);
}

Aug 28 '06 #38

smnoff wrote:
So I got to ask, why does code get written like this to begin with?
Machine generated from something higher level maybe?
Especially since this is library code?
It did not look that bad to me, but it had evidently acquired
indentation problems while becoming a USENET message.

I personally think it should be commented, both at the top level of the
function and in the hairy, idiomatic buffer operation. Even a
programmer with decades of experience would prefer to read in English
why this works, than to parse the code.

But then, I imagine it *does* work and even is probably somewhat tuned.
Aug 28 '06 #39

jmcgill wrote:
smnoff wrote:
>Is line 512 to 521 of the IF statement seen as the single and only line of
the WHILE loop in Line 511?

Yeah. And I totally refuse to buy any argument I've ever heard
against using brackets on control structures, even when they are only
one-statement blocks.

In my shop, you'd format it like this, or maybe you'd be happier
elsewhere :-)

strstr(const char *as1, const char *as2)

I think you missed something here. Namely the return type. Especially as
you did not mean for it to return an int! I would also point out that
the user is not allowed to write a function named strstr with external
linkage (or internal linkage if string.h has been included) so I assume
this is discussing possible implementations the library could use.

{
    const char *s1,
               *s2;
    const char *tptr;

Be consistent about whether you are splitting up definitions of the same
type or not.

    char c;

    s1 = as1;
    s2 = as2;

Why not initialise s1 and s2 on definition?

    if (s2 == NULL || *s2 == '\0') {
        return ((char *) s1);
    }
    c = *s2;

    while (*s1) {
        if (*s1++ == c) {
            tptr = s1;
            while ((c = *++s2) == *s1++ && c);
            if (c == 0) {
                return ((char *) tptr - 1);

You don't need the parentheses for return so I would not use them.

            }
            s1 = tptr;
            s2 = as2;
            c = *s2;
        }
    }
    return (NULL);

Again with the parentheses.

}

Other than that I would consider your style acceptable and more readable
than the alternative you were responding to.
--
Flash Gordon
Aug 28 '06 #40

smnoff wrote:
I am a little confused on the WHILE loop at line 511 shown below(or
Please don't top-post. Your replies belong following or interspersed
with properly trimmed quotes.


Brian
Aug 28 '06 #41

jmcgill wrote:
In my shop, you'd format it like this, or maybe you'd be happier
elsewhere :-)
while ((c = *++s2) == *s1++ && c);
ITYM

while ((c = *++s2) == *s1++ && c) {
    ;
}

This is my version of strstr:

#include <stddef.h>

char *str_str(const char *s1, const char *s2);
size_t str_len(const char *s);
char *str_chr(const char *s, int c);
int str_ncmp(const char *s1, const char *s2, size_t n);

char *str_str(const char *s1, const char *s2)
{
    const int c = *s2++;

    if (c != '\0') {
        const size_t n = str_len(s2);

        s1 = str_chr(s1, c);
        while (s1 != NULL && str_ncmp(s1 + 1, s2, n) != 0) {
            s1 = str_chr(s1 + 1, c);
        }
    }
    return (char *)s1;
}

size_t str_len(const char *s)
{
    size_t n;

    for (n = 0; *s != '\0'; ++s) {
        ++n;
    }
    return n;
}

char *str_chr(const char *s, int c)
{
    while (*s != (char)c) {
        if (*s == '\0') {
            return NULL;
        }
        ++s;
    }
    return (char *)s;
}

int str_ncmp(const char *s1, const char *s2, size_t n)
{
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    while (n-- != 0) {
        if (*p1 != *p2) {
            return *p2 > *p1 ? -1 : 1;
        }
        if (*p1 == '\0') {
            return 0;
        }
        ++p1;
        ++p2;
    }
    return 0;
}
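
The same structure can be sketched with the standard <string.h> functions that str_len, str_chr and str_ncmp mirror. The name find_substring is made up for this illustration, and the sketch assumes the standard functions are acceptable substitutes for the hand-written helpers above:

```c
#include <stddef.h>
#include <string.h>

/* Find the first character of the pattern with strchr, then compare the
 * remainder of the pattern with strncmp; repeat from the next candidate. */
char *find_substring(const char *s1, const char *s2)
{
    const int c = *s2++;            /* first pattern character; s2 now points at the rest */

    if (c != '\0') {
        const size_t n = strlen(s2);

        s1 = strchr(s1, c);
        while (s1 != NULL && strncmp(s1 + 1, s2, n) != 0) {
            s1 = strchr(s1 + 1, c); /* try the next occurrence of c */
        }
    }
    return (char *)s1;              /* NULL if no candidate matched */
}
```

An empty pattern skips the search and returns s1 unchanged, which matches what the library strstr does for an empty second argument.
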
--
pete
Aug 29 '06 #42

pete wrote:
> while ((c = *++s2) == *s1++ && c);

ITYM

while ((c = *++s2) == *s1++ && c) {
    ;
}
Yeah, of course. It actually made me cringe earlier. I *know* what's
valid, but I don't *like* it.
This is my version of strstr:
Much easier on the eyes.
Aug 29 '06 #43

Flash Gordon wrote:
I think you missed something here.
I think all I did was run the code through indent and maybe added a pair
of brackets.

Your questions all need to be directed to someone who codes for the open
version of the string library for Solaris, which is where I was led to
believe this code originated.
Aug 29 '06 #44

On Mon, 28 Aug 2006 10:48:24 -0700, jmcgill
<jm*****@email.arizona.edu> wrote:
>smnoff wrote:
>Is line 512 to 521 of the IF statement seen as the single and only line of
the WHILE loop in Line 511?

Yeah. And I totally refuse to buy any argument I've ever heard
against using brackets on control structures, even when they are only
one-statement blocks.

In my shop, you'd format it like this, or maybe you'd be happier
elsewhere :-)

strstr(const char *as1, const char *as2)
{
    const char *s1,
               *s2;
    const char *tptr;
    char c;

    s1 = as1;
    s2 = as2;

    if (s2 == NULL || *s2 == '\0') {
        return ((char *) s1);
    }
    c = *s2;

    while (*s1) {
        if (*s1++ == c) {
            tptr = s1;
            while ((c = *++s2) == *s1++ && c);
            if (c == 0) {
                return ((char *) tptr - 1);
            }
            s1 = tptr;
            s2 = as2;
            c = *s2;
        }
    }
    return (NULL);
}

Ugh!

Oz
--
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Aug 29 '06 #45

ozbear wrote:
>
Ugh!
I didn't write it. I only formatted it.
Don't tell me you preferred the original, with no indents,
and overly economical braces.
Aug 29 '06 #46

jmcgill wrote:
ozbear wrote:
>>
Ugh!

I didn't write it. I only formatted it.
Don't tell me you preferred the original, with no indents,
and overly economical braces.
One can think that both are unnecessarily ugly.

[And I've heard no convincing argument for using brackets on
control structure arguments when they are only one-statement
blocks.]

--
Chris "convincing /to me/, natch" Dollin
"People are part of the design. It's dangerous to forget that." /Star Cops/

Aug 29 '06 #47

jmcgill wrote:
Flash Gordon wrote:
>I think you missed something here.

I think all I did was run the code through indent and maybe added a pair
of brackets.

Your questions all need to be directed to someone who codes for the open
version of the string library for Solaris, which is where I was led to
believe this code originated.
Possibly, or possibly you missed the return type on a cut and paste.
However, it was definitely missing.
--
Flash Gordon
Aug 29 '06 #48


Chris Dollin wrote:
jmcgill wrote:
ozbear wrote:
>
Ugh!
I didn't write it. I only formatted it.
Don't tell me you preferred the original, with no indents,
and overly economical braces.

One can think that both are unnecessarily ugly.

[And I've heard no convincing argument for using brackets on
control structure arguments when they are only one-statement
blocks.]
Out of curiosity, what arguments have you heard?

Sep 3 '06 #49

en******@yahoo.com wrote:
>
Chris Dollin wrote:
>jmcgill wrote:
ozbear wrote:

Ugh!
I didn't write it. I only formatted it.
Don't tell me you preferred the original, with no indents,
and overly economical braces.

One can think that both are unnecessarily ugly.

[And I've heard no convincing argument for using brackets on
control structure arguments when they are only one-statement
blocks.]

Out of curiosity, what arguments have you heard?
That not using brackets means that when additional statements
are added to the controlled statement, the adder won't put
in the brackets and the code will be broken.

One should be consistent.

The local style guide says so.
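
The first of those arguments can be sketched as follows. Both functions are made up for illustration; each is meant to add x to a total and count it only when x is positive. In the unbraced one, the counting statement was added later without adding braces:

```c
/* Unbraced version after a second statement was added. In real code the
 * added line would usually be indented to look like part of the if; it is
 * shown here the way the compiler reads it. */
void tally_unbraced(int x, int *total, int *count)
{
    if (x > 0)
        *total += x;
    (*count)++;         /* not controlled by the if: runs for every x */
}

/* Braced version: both statements are controlled by the if. */
void tally_braced(int x, int *total, int *count)
{
    if (x > 0) {
        *total += x;
        (*count)++;
    }
}
```

Called with x = -5 and both counters at zero, tally_unbraced leaves the total at 0 but bumps the count to 1, while tally_braced leaves both at 0. Whether that risk justifies always writing the braces is the question being argued here.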

--
Chris "unconvinced" Dollin
Meaning precedes definition.

Sep 4 '06 #50
