473,396 Members | 2,085 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

Evaluation of C program

Hi,

I'm a python programmer that's started to play a bit with C as I'll
probably have to make C extensions eventually.. I made this little
program that I'd like to get feedback on, it's basically a find
substring and return pointer to it function and tests for it..

Could this be done better/differently? Is anything fundamentally
wrong?

---

#include <stdio.h>

char *find_substring(char *substring, char *string) {
int index_string, index_substring;
for (index_string = 0; string[index_string] != 0; index_string++) {
index_substring = 0;
do {
if (substring[index_substring] == 0)
goto success;
if (string[index_string + index_substring] !=
substring[index_substring])
goto next;
} while (++index_substring);
success:
return &string[index_string];
next: ;
}
return 0;
}

int main() {
if(find_substring("test", "this is a test"))
printf("Found test in string!\n");

return 0;
}

---

Thanks,

Morten

Feb 16 '06 #1
8 1447
In article <11**********************@g43g2000cwa.googlegroups .com>,
<mo*****@gmail.com> wrote:
I'm a python programmer that's started to play a bit with C as I'll
probably have to make C extensions eventually.. I made this little
program that I'd like to get feedback on, it's basically a find
substring and return pointer to it function and tests for it.. Could this be done better/differently? Is anything fundamentally
wrong? #include <stdio.h>
Your entire find_substring() function could be replaced with a
single call to strstr();
char *find_substring(char *substring, char *string) {
int index_string, index_substring;
for (index_string = 0; string[index_string] != 0; index_string++) {
strings can be longer than an 'int' can hold. Check out size_t
index_substring = 0;
do {
if (substring[index_substring] == 0)
goto success;
goto should rarely be used. Consider using "break" here.
if (string[index_string + index_substring] !=
substring[index_substring])
goto next;
goto should rarely be used. Consider using "continue" here.
} while (++index_substring);
The only way that ++index_substring could be false in that test is
if the int representation overflowed. This suggests that you are using
the wrong control structure; consider using a for().
success:
return &string[index_string];
next: ;
}
return 0;
Returning 0 is not wrong here, but it would be more readable
if you were to use NULL instead of 0.
}
int main() {
int main(void) {
if(find_substring("test", "this is a test"))
printf("Found test in string!\n");
If you do not find the substring, then you print nothing, and
won't know what happened.
return 0;
Returning 0 indicates success, but it could be argued that if you
do not find the substring then your program has failed and so
returning EXIT_FAILURE (from <stdlib.h>) would be more appropriate
in that instance.
}
Could this be done better/differently?


Suppose you don't find a match the first iteration through
because the 10th character in the trial substring does not match
the 10th character in the input string. You then loop back
and start testing from the substring again from the second
character of the input string -- ignoring all the information
you could have gleened by paying attention to what was in
offsets 1 thru 9 that you have already looked at.

Suppose for example that what you found at the 10th character at the
input string was an 'X', and there are no occurances of 'X' in the
trial substring. A moment's reflection would show you that in
such an instance you would be able to skip testing for the
substring as starting from 1 thru 9, since that 'X' will still be
there at offset 9 blocking any possible match.

What this tells you is that there are algorithms which can be
much faster than your search algorithm. Indeed, there is an entire
literature on search algorithms. Some are probably mentioned in the C
FAQ or by googling for "string search algorithms".
--
If you lie to the compiler, it will get its revenge. -- Henry Spencer
Feb 16 '06 #2
>> #include <stdio.h>

Your entire find_substring() function could be replaced with a
single call to strstr();


Yep. This was an exercise so I didn't look, but good to know.
char *find_substring(char *substring, char *string) {
int index_string, index_substring;
for (index_string = 0; string[index_string] != 0; index_string++) {


strings can be longer than an 'int' can hold. Check out size_t


Good point.
index_substring = 0;
do {
if (substring[index_substring] == 0)
goto success;


goto should rarely be used. Consider using "break" here.
if (string[index_string + index_substring] !=
substring[index_substring])
goto next;


goto should rarely be used. Consider using "continue" here.


I don't see the harm in using goto.. it does the job. :-)
} while (++index_substring);


The only way that ++index_substring could be false in that test is
if the int representation overflowed. This suggests that you are using
the wrong control structure; consider using a for().


Well, it will always goto using one of the two statements above it, so
it works. But it should be a long, yes.
success:
return &string[index_string];
next: ;
}
return 0;


Returning 0 is not wrong here, but it would be more readable
if you were to use NULL instead of 0.


Yep.
}

int main() {


int main(void) {
if(find_substring("test", "this is a test"))
printf("Found test in string!\n");


If you do not find the substring, then you print nothing, and
won't know what happened.


Yeah..
return 0;


Returning 0 indicates success, but it could be argued that if you
do not find the substring then your program has failed and so
returning EXIT_FAILURE (from <stdlib.h>) would be more appropriate
in that instance.
}


Could this be done better/differently?


Suppose you don't find a match the first iteration through
because the 10th character in the trial substring does not match
the 10th character in the input string. You then loop back
and start testing from the substring again from the second
character of the input string -- ignoring all the information
you could have gleened by paying attention to what was in
offsets 1 thru 9 that you have already looked at.

Suppose for example that what you found at the 10th character at the
input string was an 'X', and there are no occurances of 'X' in the
trial substring. A moment's reflection would show you that in
such an instance you would be able to skip testing for the
substring as starting from 1 thru 9, since that 'X' will still be
there at offset 9 blocking any possible match.

What this tells you is that there are algorithms which can be
much faster than your search algorithm. Indeed, there is an entire
literature on search algorithms. Some are probably mentioned in the C
FAQ or by googling for "string search algorithms".


OK. This was a bit beyond what I was looking to figure out, but good to
know I guess.

Thanks. :)

-Morten
Feb 16 '06 #3

"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:dt**********@canopus.cc.umanitoba.ca...
In article <11**********************@g43g2000cwa.googlegroups .com>,
<mo*****@gmail.com> wrote:
I'm a python programmer that's started to play a bit with C as I'll
probably have to make C extensions eventually.. I made this little
program that I'd like to get feedback on, it's basically a find
substring and return pointer to it function and tests for it..
Could this be done better/differently? Is anything fundamentally
wrong?

#include <stdio.h>


Your entire find_substring() function could be replaced with a
single call to strstr();
char *find_substring(char *substring, char *string) {
int index_string, index_substring;
for (index_string = 0; string[index_string] != 0; index_string++) {


strings can be longer than an 'int' can hold. Check out size_t
index_substring = 0;
do {
if (substring[index_substring] == 0)
goto success;


goto should rarely be used. Consider using "break" here.
if (string[index_string + index_substring] !=
substring[index_substring])
goto next;


goto should rarely be used. Consider using "continue" here.
} while (++index_substring);


The only way that ++index_substring could be false in that test is
if the int representation overflowed. This suggests that you are using
the wrong control structure; consider using a for().
success:
return &string[index_string];
next: ;
}
return 0;


Returning 0 is not wrong here, but it would be more readable
if you were to use NULL instead of 0.
}


int main() {


int main(void) {
if(find_substring("test", "this is a test"))
printf("Found test in string!\n");


If you do not find the substring, then you print nothing, and
won't know what happened.

return 0;


Returning 0 indicates success, but it could be argued that if you
do not find the substring then your program has failed and so
returning EXIT_FAILURE (from <stdlib.h>) would be more appropriate
in that instance.
}


Could this be done better/differently?


Suppose you don't find a match the first iteration through
because the 10th character in the trial substring does not match
the 10th character in the input string. You then loop back
and start testing from the substring again from the second
character of the input string -- ignoring all the information
you could have gleened by paying attention to what was in
offsets 1 thru 9 that you have already looked at.

Suppose for example that what you found at the 10th character at the
input string was an 'X', and there are no occurances of 'X' in the
trial substring. A moment's reflection would show you that in
such an instance you would be able to skip testing for the
substring as starting from 1 thru 9, since that 'X' will still be
there at offset 9 blocking any possible match.

Not True!

Suppose string = "XXXXXXXXXXY"
and substring = "XXXXXXXXXY"

They do not match beginning from index 0, since 'X' != 'Y'
But you cannot skip to index 9, since they DO match statring at index 1.
What this tells you is that there are algorithms which can be
much faster than your search algorithm. Indeed, there is an entire
literature on search algorithms. Some are probably mentioned in the C
FAQ or by googling for "string search algorithms".
--
If you lie to the compiler, it will get its revenge. -- Henry Spencer


--
Fred L. Kleinschmidt
Boeing Associate Technical Fellow
Technical Architect, Software Reuse Project
Feb 16 '06 #4
In article <Iu********@news.boeing.com>,
Fred Kleinschmidt <fr******************@boeing.com> wrote:
"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:dt**********@canopus.cc.umanitoba.ca...
Suppose for example that what you found at the 10th character at the
input string was an 'X', and there are no occurances of 'X' in the
trial substring. A moment's reflection would show you that in
such an instance you would be able to skip testing for the
substring as starting from 1 thru 9, since that 'X' will still be
there at offset 9 blocking any possible match.

Not True! Suppose string = "XXXXXXXXXXY"
and substring = "XXXXXXXXXY" They do not match beginning from index 0, since 'X' != 'Y'
But you cannot skip to index 9, since they DO match statring at index 1.


But that violates the proposition "and there are no occurances
of 'X' in the trial substring".
--
I was very young in those days, but I was also rather dim.
-- Christopher Priest
Feb 17 '06 #5
mo*****@gmail.com wrote:
I'm a python programmer that's started to play a bit with C as I'll
probably have to make C extensions eventually.. I made this little
program that I'd like to get feedback on, it's basically a find
substring and return pointer to it function and tests for it..

Could this be done better/differently? Is anything fundamentally
wrong?

---

#include <stdio.h>

char *find_substring(char *substring, char *string) {
int index_string, index_substring;
for (index_string = 0; string[index_string] != 0; index_string++) {
index_substring = 0;
do {
if (substring[index_substring] == 0)
goto success;
if (string[index_string + index_substring] !=
substring[index_substring])
goto next;
} while (++index_substring);
success:
return &string[index_string];
next: ;
}
return 0;
}

int main() {
if(find_substring("test", "this is a test"))
printf("Found test in string!\n");

return 0;
}


Here is how I would do it.
--- Source Text ---

#include <stdio.h>
#include <string.h>
/* Returns the starting index of pattern in s or -1 if not found. */

int position(const char *pattern, const char *s)
{
int j, k, plen, slen, res;

plen = strlen(pattern);
slen = strlen(s);
res = -1;
j = 0;
while ((res < 0) && (j + plen < slen)) {
k = 0;
while ((k < plen) && (pattern[k] == s[j + k])) { k++; }
if (k == plen) { res = j; }
j++;
}
return res;
}
int main(void)
{
char s[] = "Hello there!";
char pattern[] = "there";
int pos;

pos = position(pattern, s);
if (pos < 0) {
printf("\"%s\" does not contain \"%s\".\n", s, pattern);
} else {
printf("\"%s\" contains \"%s\" starting at index %d.\n",
s, pattern, pos);
}
return 0;
}

--- End Of Source Text ---
August

--
I am the "ILOVEGNU" signature virus. Just copy me to your
signature. This email was infected under the terms of the GNU
General Public License.
Feb 17 '06 #6
> Here is how I would do it.


--- Source Text ---

#include <stdio.h>
#include <string.h>
/* Returns the starting index of pattern in s or -1 if not found. */

int position(const char *pattern, const char *s)
{
int j, k, plen, slen, res;

plen = strlen(pattern);
slen = strlen(s);
res = -1;
j = 0;
while ((res < 0) && (j + plen < slen)) {
k = 0;
while ((k < plen) && (pattern[k] == s[j + k])) { k++; }
if (k == plen) { res = j; }
j++;
}
return res;
}


This was a nice example. Thanks for posting it. :-)

-Morten
Feb 17 '06 #7
Walter Roberson wrote:
In article <11**********************@g43g2000cwa.googlegroups .com>,
<mo*****@gmail.com> wrote:
I'm a python programmer that's started to play a bit with C as I'll
probably have to make C extensions eventually.. I made this little
program that I'd like to get feedback on, it's basically a find
substring and return pointer to it function and tests for it..
Could this be done better/differently? Is anything fundamentally
wrong?

Suppose you don't find a match the first iteration through
because the 10th character in the trial substring does not match
the 10th character in the input string. You then loop back
and start testing from the substring again from the second
character of the input string -- ignoring all the information
you could have gleened by paying attention to what was in
offsets 1 thru 9 that you have already looked at.

Suppose for example that what you found at the 10th character at the
input string was an 'X', and there are no occurances of 'X' in the
trial substring. A moment's reflection would show you that in
such an instance you would be able to skip testing for the
substring as starting from 1 thru 9, since that 'X' will still be
there at offset 9 blocking any possible match.

What this tells you is that there are algorithms which can be
much faster than your search algorithm. Indeed, there is an entire
literature on search algorithms. Some are probably mentioned in the C
FAQ or by googling for "string search algorithms".


Boyer Horspool Moore comes to mind. I implemented it with
a circular buffer; it's fast.

Stijn

Feb 17 '06 #8
mi****@gmail.com wrote:
Walter Roberson wrote:
<mo*****@gmail.com> wrote:
I'm a python programmer that's started to play a bit with C as
I'll probably have to make C extensions eventually.. I made
this little program that I'd like to get feedback on, it's
basically a find substring and return pointer to it function
and tests for it..

Could this be done better/differently? Is anything fundamentally
wrong?

Suppose you don't find a match the first iteration through
because the 10th character in the trial substring does not match
the 10th character in the input string. You then loop back
and start testing from the substring again from the second
character of the input string -- ignoring all the information
you could have gleened by paying attention to what was in
offsets 1 thru 9 that you have already looked at.

Suppose for example that what you found at the 10th character at the
input string was an 'X', and there are no occurances of 'X' in the
trial substring. A moment's reflection would show you that in
such an instance you would be able to skip testing for the
substring as starting from 1 thru 9, since that 'X' will still be
there at offset 9 blocking any possible match.

What this tells you is that there are algorithms which can be
much faster than your search algorithm. Indeed, there is an entire
literature on search algorithms. Some are probably mentioned in the C
FAQ or by googling for "string search algorithms".


Boyer Horspool Moore comes to mind. I implemented it with
a circular buffer; it's fast.


Knuth-Morris-Pratt has the distinct advantage of allowing operation
on streams and totally avoiding any need for look ahead or look
back. The following is from a post I made here almost two years
ago, and illustrates the use of KMP on a file stream.

/*
Leor Zolman wrote: On 25 Feb 2004 07:34:40 -0800, jo**@ljungh.se (spike) wrote:
Im trying to write a program that should read through a binary
file searching for the character sequence "\name\"

Then it should read the characters following the "\name\"
sequence until a NULL character is encountered.

But when my program runs it gets a SIGSEGV (Segmentation
vioalation) signal.

Whats wrong? And is there a better way than mine to solve
this task (most likely)


I think so. Here's a version I just threw together:

*/

/* And heres another throw -- binfsrch.c by CBF */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <assert.h>

/* The difference between a binary and a text file, on read,
is the conversion of end-of-line delimiters. What those
delimiters are does not affect the action. In some cases
the presence of 0x1a EOF markers (MsDos) does.

This is a version of Knuth-Morris-Pratt algorithm. The
point of using this is to avoid any backtracking in file
reading, and thus avoiding any use of buffer arrays.
*/

size_t chrcount; /* debuggery, count of input chars, zeroed */

/* --------------------- */

/* Almost straight out of Sedgewick */
/* The next array indicates what index in id should next be
compared to the current char. Once the (lgh - 1)th char
has been successfully compared, the id has been found.
The array is formed by comparing id to itself. */
void initnext(int *next, const char *id, int lgh)
{
int i, j;

assert(lgh > 0);
next[0] = -1; i = 0; j = -1;
while (i < lgh) {
while ((j >= 0) && (id[i] != id[j])) j = next[j];
i++; j++;
next[i] = j;
}
#if (0)
for (i = 0; i < lgh; i++)
printf("id[%d] = '%c' next[%d] = %d\n",
i, id[i], i, next[i]);
#endif
} /* initnext */

/* --------------------- */

/* reads f without rewinding until either EOF or *marker
has been found. Returns EOF if not found. At exit the
last matching char has been read, and no further. */
int kmpffind(const char *marker, int lgh, int *next, FILE *f)
{
int j; /* char position in marker to check */
int ch; /* current char */

assert(lgh > 0);
j = 0;
while ((j < lgh) && (EOF != (ch = getc(f)))) {
chrcount++;
while ((j >= 0) && (ch != marker[j])) j = next[j];
j++;
}
return ch;
} /* kmpffind */

/* --------------------- */

/* Find marker in f, display following printing chars
up to some non printing character or EOF */
int binfsrch(const char *marker, FILE *f)
{
int *next;
int lgh;
int ch;
int items; /* count of markers found */

lgh = strlen(marker);
if (!(next = malloc(lgh * sizeof *next))) {
puts("No memory");
exit(EXIT_FAILURE);
}
else {
initnext(next, marker, lgh);
items = 0;
while (EOF != kmpffind(marker, lgh, next, f)) {
/* found, take appropriate action */
items++;
printf("%d %s : \"", items, marker);
while (isprint(ch = getc(f))) {
chrcount++;
putchar(ch);
}
puts("\"");
if (EOF == ch) break;
else chrcount++;
}
free(next);
return items;
}
} /* binfsrch */

/* --------------------- */

int main(int argc, char **argv)
{
FILE *f;

f = stdin;
if (3 == argc) {
if (!(f = fopen(argv[2], "rb"))) {
printf("Can't open %s\n", argv[2]);
exit(EXIT_FAILURE);
}
argc--;
}
if (2 != argc) {
puts("Usage: binfsrch name [binaryfile]");
puts(" (file defaults to stdin text mode)");
}
else if (binfsrch(argv[1], f)) {
printf("\"%s\" : found\n", argv[1]);
}
else printf("\"%s\" : not found\n", argv[1]);
printf("%lu chars\n", (unsigned long)chrcount);
return 0;
} /* main binfsrch */

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
Feb 17 '06 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
by: Leo Breebaart | last post by:
Hi all, I have a question about Python and delayed evaluation. Short-circuiting of Boolean expressions implies that in: >>> if a() and b(): any possible side-effects the call to b() might...
20
by: Ilias Lazaridis | last post by:
" A cooperation between Sun Microsystems and IBM&Co. in conjunction with liberal & high evolutive communities would result in an nearly unbeatable programming platform. My evaluation has shown:...
16
by: Bhushit Joshipura | last post by:
This post contains one question and one proposal. A. May I know why order of evaluation of arguments is not specified in C/C++? I asked a question in comp.lang.c++ for the following...
15
by: Jens.Toerring | last post by:
Hi, I have a possibly rather stupid question about the order of evaluation in a statement like this: result = foo( x ) - bar( y ); How can I make 100% sure that foo(x) is evaluated before...
77
by: berns | last post by:
Hi All, A coworker and I have been debating the 'correct' expectation of evaluation for the phrase a = b = c. Two different versions of GCC ended up compiling this as b = c; a = b and the other...
10
by: int main(void) | last post by:
Hi all, In the following program, #include<stdio.h> int main(void) { int x = 10; int y = 10;
32
by: silpau | last post by:
hi, i am a bit confused on expression evaluation order in expressions involving unary increment.decrement operators along with binary operators. For example in the following expression x...
54
by: Rasjid | last post by:
Hello, I have just joined and this is my first post. I have never been able to resolve the issue of order of evaluation in C/C++ and the related issue of precedence of operators, use of...
11
by: Pietro Cerutti | last post by:
Hi group, here I come with a question which is quite simple per se, but for which I can't find an answer. Does the C standard guarantee that inside an expression such as (x && y) "y" is...
39
by: Boltar | last post by:
Why does C do lazy evaluation for logical boolean operations but not bitwise ones? Ie: the following program prints "1 2" , not "1 1" under gcc main() { int a = 1; int b = 1; 0 && ++a;
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.