  #1  
Old August 28th, 2008, 06:25 AM
hiteshthappa@gmail.com
Guest
 
Posts: n/a
Default simulation of a "wc" command

Hi,
Can anyone please help me find the total number of words in a
file?
I get the newlines, characters and blank spaces correctly, but counting
words is a problem. I have tried many ways but it didn't help.
Here is my code:
#include <string.h>
#include <stdio.h>

main(int argc , char *argv[])
{
FILE *fp;
int ch;
int chr=0;
int totchr=0;
int bspc=0,totbspc=0;
int nline=0;
int word=0,totwrd=0;
int i=0;

fp=fopen(argv[1],"r");
if(argc != 2)
{
printf("\tInsufficient arguments\n");
printf("\tusage: wrd <filename>\n");
exit(0);
}
if(fp==NULL)
{
printf("Error In File Opening\n");
exit(0);
}
else
{
while((ch=fgetc(fp))!=EOF)
{
if(ch == ' ')
{
bspc++;
}
if(ch == '\n')
{
nline++;
}
chr++;
word++;


}
}
//word = bspc + nline;
totchr += chr;
totbspc += bspc;
totwrd += word;

printf("\nFile %s has\n", argv[1]);
printf("\n\twhite spaces are: %d\n", totbspc);
printf("\twords are: %d\n", totwrd);
printf("\tcharacters are: %d\n", totchr);
printf("\tlines are: %d\n\n", nline);

fclose(fp);
}
  #2  
Old August 28th, 2008, 07:05 AM
Andrew Poelstra
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On 2008-08-28, hiteshthappa@gmail.com <hiteshthappa@gmail.com> wrote:
Quote:
hi
Can anyone please help me in finding the total number of words in a
file
I get the newlines, characters and blankspaces correctly but counting
words ia problem.I have tried many ways but it didnt help.
Here is my code......
>
[code snipped]
>
Could you repost with your code indented (two spaces is easiest to
read on Usenet, IMHO) correctly, please?

Also, if you don't need to use format specifiers, the puts() function
will print a string, automatically appending a '\n' to it, so it makes
things a bit easier to read.

--
Andrew Poelstra apoelstra@wpsoftware.com
To email me, use the above email address with .com set to .net
  #3  
Old August 28th, 2008, 08:55 AM
Pietro Cerutti
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

hiteshthappa@gmail.com wrote:
Quote:
hi
Can anyone please help me in finding the total number of words in a
file
I get the newlines, characters and blankspaces correctly but counting
words ia problem.I have tried many ways but it didnt help.
Here is my code......
#include <string.h>
#include <stdio.h>
>
main(int argc , char *argv[])
{
FILE *fp;
int ch;
int chr=0;
int totchr=0;
int bspc=0,totbspc=0;
int nline=0;
int word=0,totwrd=0;
int i=0;
>
fp=fopen(argv[1],"r");
if(argc != 2)
argv[1] may not be accessible. That's why you want the check against
argc. But you need to do that check before you try to access argv[1].
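A minimal sketch of that ordering, assuming a hypothetical helper named open_input (it is not part of the original program): argc is tested first, and argv[1] is only touched once the test has passed.

```c
#include <stdio.h>

/* Hypothetical helper: validates argc first, and only then reads argv[1]. */
FILE *open_input(int argc, char *argv[])
{
    if (argc != 2)               /* argv[1] must not be used unless this passes */
        return NULL;
    return fopen(argv[1], "r");  /* may still be NULL if the file can't be opened */
}
```

The caller would still need to tell the two failure cases apart if it wants separate "usage" and "cannot open" messages.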

Quote:
{
printf("\tInsufficient arguments\n");
printf("\tusage: wrd <filename>\n");
exit(0);
}
if(fp==NULL)
{
printf("Error In File Opening\n");
exit(0);
}
else
{
while((ch=fgetc(fp))!=EOF)
{
if(ch == ' ')
{
bspc++;
}
if(ch == '\n')
{
nline++;
}
chr++;
word++;
>
>
}
}
//word = bspc + nline;
totchr += chr;
totbspc += bspc;
totwrd += word;
>
printf("\nFile %s has\n", argv[1]);
printf("\n\twhite spaces are: %d\n", totbspc);
printf("\twords are: %d\n", totwrd);
printf("\tcharacters are: %d\n", totchr);
printf("\tlines are: %d\n\n", nline);
>
fclose(fp);
}
It's most about your definition of "words" and "white spaces". The unix
utility wc refers to a word as a string of characters delimited by a
blank space. If that's the behavior that you're trying to mimic, then
totbspc may be a good approximation of the number of words in your file.

Some error checking after fgetc returns EOF to see whether an
end-of-file event or an error occurred may be needed at the end.
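As a rough sketch of that check, assuming a hypothetical function count_chars and an out-parameter named read_error (neither appears in the original code), ferror() can be consulted once the loop has stopped:

```c
#include <stdio.h>

/* Counts characters in fp. Sets *read_error to nonzero if the loop
   stopped because of a read error rather than end-of-file. */
long count_chars(FILE *fp, int *read_error)
{
    long n = 0;
    int ch;                      /* int, so EOF stays distinguishable */

    while ((ch = fgetc(fp)) != EOF)
        n++;

    *read_error = ferror(fp);    /* nonzero only if a read error occurred */
    return n;
}
```

If *read_error comes back nonzero, the count is probably incomplete and the program should say so instead of printing it as a final total.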

--
Pietro Cerutti
  #4  
Old August 28th, 2008, 09:56 AM
Barry Schwarz
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On Aug 27, 10:18 pm, hiteshtha...@gmail.com wrote:
Quote:
hi
Can anyone please help me in finding the total number of words in a
file
I get the newlines, characters and blankspaces correctly but counting
words ia problem.I have tried many ways but it didnt help.
Define problem. Be specific. What was your input? What was your
output? What output do you want?
Quote:
Here is my code......
#include <string.h>
#include <stdio.h>
>
main(int argc , char *argv[])
{
FILE *fp;
int ch;
int chr=0;
int totchr=0;
int bspc=0,totbspc=0;
int nline=0;
int word=0,totwrd=0;
int i=0;
>
fp=fopen(argv[1],"r");
if(argc != 2)
{
printf("\tInsufficient arguments\n");
printf("\tusage: wrd <filename>\n");
exit(0);}
Obviously this test should come before the call to fopen.

Why return zero if it failed? Use EXIT_FAILURE from stdlib.h.
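One possible shape for this, assuming a hypothetical helper called check_usage (writing the message to stderr is also an assumption, not something the thread asks for):

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns the status main() should exit with
   when the argument count is wrong, or EXIT_SUCCESS otherwise. */
int check_usage(int argc, const char *progname)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <filename>\n", progname);
        return EXIT_FAILURE;     /* portable "something went wrong" status */
    }
    return EXIT_SUCCESS;
}
```

main() could then start with something like `if (check_usage(argc, "wrd") != EXIT_SUCCESS) return EXIT_FAILURE;` before any call to fopen.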
Quote:
>
if(fp==NULL)
{
printf("Error In File Opening\n");
exit(0);}
>
else
{
while((ch=fgetc(fp))!=EOF)
{
if(ch == ' ')
{
bspc++;}
>
if(ch == '\n')
{
nline++;}
>
chr++;
word++;
Why are you incrementing word for every character? You should
increment it only if the current character is white space (see isspace
in your reference).
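A sketch along those lines, with one deliberate variation: instead of bumping the count on every whitespace character, it counts the start of each run of non-whitespace, so that consecutive blanks and a missing final newline do not skew the result. The struct counts type and the function count_stream are made-up names for illustration.

```c
#include <ctype.h>
#include <stdio.h>

struct counts { long lines, words, chars; };

/* Reads fp to end-of-file and tallies lines, words and characters.
   A "word" here is a maximal run of non-whitespace characters. */
struct counts count_stream(FILE *fp)
{
    struct counts c = {0, 0, 0};
    int ch;
    int in_word = 0;             /* 1 while inside a run of non-whitespace */

    while ((ch = fgetc(fp)) != EOF) {
        c.chars++;
        if (ch == '\n')
            c.lines++;
        if (isspace(ch)) {       /* fgetc returns an unsigned char value, so this is safe */
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;         /* first character of a new word */
            c.words++;
        }
    }
    return c;
}
```

Because fgetc() already returns the character as an unsigned char converted to int, no cast is needed before isspace() in this stream-based version.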
Quote:
>
}
}
A consistent indenting style will save you a lot of time in your
programming efforts.
Quote:
>
//word = bspc + nline;
totchr += chr;
totbspc += bspc;
totwrd += word;
Are the left side operands ever non-zero?
Quote:
>
printf("\nFile %s has\n", argv[1]);
printf("\n\twhite spaces are: %d\n", totbspc);
printf("\twords are: %d\n", totwrd);
printf("\tcharacters are: %d\n", totchr);
printf("\tlines are: %d\n\n", nline);
>
fclose(fp);
>
>
>
}
  #5  
Old August 28th, 2008, 08:15 PM
Bill Reid
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command


Pietro Cerutti <gahr_SPAM_gahr_ME_ch> wrote in message
news:etWdnbY4FOXdxSvVnZ2dneKdnZydnZ2d@giganews.com ...
Quote:
hiteshthappa@gmail.com wrote:
Quote:
Quote:
hi
Can anyone please help me in finding the total number of words in a
file
I get the newlines, characters and blankspaces correctly but counting
words ia problem.I have tried many ways but it didnt help.
Quote:
It's most about your definition of "words" and "white spaces". The unix
utility wc refers to a word as a string of characters delimited by a
blank space. If that's the behavior that you're trying to mimic, then
totbspc may be a good approximation of the number of words in your file.
How about something better than an "approximation"...
Quote:
Some error checking after fgetc returns EOF to see whether an
end-of-file event or an error occurred may be needed at the end.
OK, no assurances this is the "best" way to do this, but here is
how I do it:

char *find_text_field(char *curr_char) {

for(;isspace(*curr_char)!=0;curr_char++);

return curr_char;
}

char *find_next_text_field(char *curr_char) {

for(;isspace(*curr_char)!=0;curr_char++);

for(;*curr_char!='\0';curr_char++)
if(isspace(*curr_char)!=0) break;

for(;isspace(*curr_char)!=0;curr_char++);

return curr_char;
}

unsigned count_text_words(char *text) {
unsigned num_word=0;
char *curr_char;

if(*(curr_char=find_text_field(text))!='\0')
while(*curr_char!='\0') {

num_word++;

if(*(curr_char=find_next_text_field(curr_char))=='\0')
break;
}

return num_word;
}

Now this is assuming you've read the file into a text buffer first, and
you can use it that way, or possibly modify the logic to work with a
text file stream instead (I didn't look in my file utilities library,
because
I'm not sure I really need/use a "word counter" for files, but note
that you are essentially just reading through the text character by
character, so you could just use fgetc() and check for EOF
rather than '\0' throughout)...

---
William Ernest Reid


  #6  
Old August 28th, 2008, 08:45 PM
vippstar@gmail.com
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On Aug 28, 10:07 pm, "Bill Reid" <hormelf...@happyhealthy.net> wrote:
Quote:
Pietro Cerutti <gahr_SPAM_gahr_ME_ch> wrote in message
>
news:etWdnbY4FOXdxSvVnZ2dneKdnZydnZ2d@giganews.com ...
>
Quote:
hiteshtha...@gmail.com wrote:
Quote:
hi
Can anyone please help me in finding the total number of words in a
file
I get the newlines, characters and blankspaces correctly but counting
words ia problem.I have tried many ways but it didnt help.
It's most about your definition of "words" and "white spaces". The unix
utility wc refers to a word as a string of characters delimited by a
blank space. If that's the behavior that you're trying to mimic, then
totbspc may be a good approximation of the number of words in your file.
>
How about something better than an "approximation"...
>
Quote:
Some error checking after fgetc returns EOF to see whether an
end-of-file event or an error occurred may be needed at the end.
>
OK, no assurances this is the "best" way to do this, but here is
how I do it:
>
char *find_text_field(char *curr_char) {
>
for(;isspace(*curr_char)!=0;curr_char++);
Undefined behavior.
Cast *curr_char to (unsigned char).
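For illustration, a variant of the skipping loop with the cast applied. The name skip_space and the const-qualified signature are assumptions for this sketch, not Bill Reid's code:

```c
#include <ctype.h>

/* Returns a pointer to the first non-whitespace character in s
   (or to the terminating '\0' if there is none). */
const char *skip_space(const char *s)
{
    /* The cast keeps the argument in the range isspace() is defined for,
       even where plain char is signed and *s holds a negative value. */
    while (isspace((unsigned char)*s))
        s++;
    return s;
}
```

The loop stops at the terminator on its own, since isspace('\0') is false.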

<snip>
  #7  
Old August 28th, 2008, 08:55 PM
Andrew Poelstra
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On 2008-08-28, vippstar@gmail.com <vippstar@gmail.com> wrote:
Quote:
On Aug 28, 10:07 pm, "Bill Reid" <hormelf...@happyhealthy.net> wrote:
Quote:
>Pietro Cerutti <gahr_SPAM_gahr_ME_ch> wrote in message
>>
>news:etWdnbY4FOXdxSvVnZ2dneKdnZydnZ2d@giganews.com...
>>
Quote:
hiteshtha...@gmail.com wrote:
hi
Can anyone please help me in finding the total number of words in a
file
I get the newlines, characters and blankspaces correctly but counting
words ia problem.I have tried many ways but it didnt help.
It's most about your definition of "words" and "white spaces". The unix
utility wc refers to a word as a string of characters delimited by a
blank space. If that's the behavior that you're trying to mimic, then
totbspc may be a good approximation of the number of words in your file.
>>
>How about something better than an "approximation"...
>>
Quote:
Some error checking after fgetc returns EOF to see whether an
end-of-file event or an error occurred may be needed at the end.
>>
>OK, no assurances this is the "best" way to do this, but here is
>how I do it:
>>
>char *find_text_field(char *curr_char) {
>>
> for(;isspace(*curr_char)!=0;curr_char++);
>
Undefined behavior.
Cast *curr_char to (unsigned char).
>
Not necessarily. If you are sure that the value of *curr_char
will be within the range of unsigned char (or char is unsigned
by default!) the behavior is defined.

Having said that, your advice is still good advice; just not
strictly necessary if you check the input to find_text_field()
carefully enough. :)

--
Andrew Poelstra apoelstra@wpsoftware.com
To email me, use the above email address with .com set to .net
  #8  
Old August 28th, 2008, 09:05 PM
vippstar@gmail.com
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On Aug 28, 10:51 pm, Andrew Poelstra <apoels...@supernova.home> wrote:
Quote:
On 2008-08-28, vipps...@gmail.com <vipps...@gmail.com> wrote:
>
Quote:
On Aug 28, 10:07 pm, "Bill Reid"
Quote:
char *find_text_field(char *curr_char) {
>
Quote:
Quote:
for(;isspace(*curr_char)!=0;curr_char++);
>
Quote:
Undefined behavior.
Cast *curr_char to (unsigned char).
>
Not necessarily. If you are sure that the value of *curr_char
will be within the range of unsigned char (or char is unsigned
by default!) the behavior is defined.
Whether char is unsigned or signed is unspecified.
How do you suggest to "check" the value of *curr_char?

Regardless of what you suggest, it _is_ undefined behavior, in his
code. In *your* code with *your* checks/guarantees, it might not be.

(for example, char c = someval; assert(c >= 0); isspace(c); is not UB)
  #9  
Old August 28th, 2008, 09:45 PM
Harald van Dijk
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On Thu, 28 Aug 2008 13:02:46 -0700, vippstar wrote:
Quote:
On Aug 28, 10:51 pm, Andrew Poelstra <apoels...@supernova.home> wrote:
Quote:
>On 2008-08-28, vipps...@gmail.com <vipps...@gmail.com> wrote:
Quote:
On Aug 28, 10:07 pm, "Bill Reid"
>char *find_text_field(char *curr_char) {
>>
Quote:
> for(;isspace(*curr_char)!=0;curr_char++);
>>
Quote:
Undefined behavior.
Cast *curr_char to (unsigned char).
>>
>Not necessarily. If you are sure that the value of *curr_char will be
>within the range of unsigned char (or char is unsigned by default!) the
>behavior is defined.
>
Whether char is unsigned or signed is unspecified.
It's implementation-defined.
Quote:
How do you suggest to
"check" the value of *curr_char?
One possibility is reading the implementation's documentation. Another is
calling find_text_field(" hello "); or any other string consisting
only of characters in the basic execution character set.
Quote:
Regardless of what you suggest, it _is_ undefined behavior, in his code.
In *your* code with *your* checks/guarantees, it might not be.
>
(for example, char c = someval; assert(c >= 0); isspace(c); is not UB)
If c is within the range of unsigned char, the behaviour of isspace(c) is
specified by the standard regardless of whether you code a check to verify
its value.
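Since the signedness of plain char is implementation-defined, it can also be queried from within C via <limits.h>. A tiny sketch, with plain_char_is_signed as a made-up name:

```c
#include <limits.h>

/* Returns 1 if plain char is a signed type on this implementation, else 0.
   CHAR_MIN is 0 when char is unsigned and negative when it is signed. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

This only reports what the current implementation does. Code that casts to unsigned char before calling isspace() does not need to ask.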
  #10  
Old August 28th, 2008, 10:35 PM
vippstar@gmail.com
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On Aug 28, 11:40 pm, Harald van Dijk <true...@gmail.com> wrote:
Quote:
On Thu, 28 Aug 2008 13:02:46 -0700, vippstar wrote:
Quote:
On Aug 28, 10:51 pm, Andrew Poelstra <apoels...@supernova.home> wrote:
Quote:
On 2008-08-28, vipps...@gmail.com <vipps...@gmail.com> wrote:
On Aug 28, 10:07 pm, "Bill Reid"
char *find_text_field(char *curr_char) {
>
Quote:
Quote:
for(;isspace(*curr_char)!=0;curr_char++);
>
Quote:
Quote:
Undefined behavior.
Cast *curr_char to (unsigned char).
>
Quote:
Quote:
Not necessarily. If you are sure that the value of *curr_char will be
within the range of unsigned char (or char is unsigned by default!) the
behavior is defined.
>
Quote:
Whether char is unsigned or signed is unspecified.
>
It's implementation-defined.
3.4.1
1 implementation-defined behavior
unspecified behavior where each implementation documents how the
choice is made

We seem to agree.
Quote:
Quote:
How do you suggest to
"check" the value of *curr_char?
>
One possibility is reading the implementation's documentation. Another is
Reading the implementation's documentation? Why would you do such a thing
when you can simply write valid C code that doesn't rely on
implementation documentation?
Quote:
calling find_text_field(" hello "); or any other string consisting
only of characters in the basic execution character set.
I don't see how this checks for anything (it's a guarantee that
characters in the basic execution set have a value >= 0)
Quote:
Quote:
Regardless of what you suggest, it _is_ undefined behavior, in his code.
In *your* code with *your* checks/guarantees, it might not be.
>
Quote:
(for example, char c = someval; assert(c >= 0); isspace(c); is not UB)
>
If c is within the range of unsigned char, the behaviour of isspace(c) is
specified by the standard regardless of whether you code a check to verify
its value.
So? Have I said otherwise? (I actually implied exactly _that_ when I
suggested the (unsigned char) cast)
  #11  
Old August 28th, 2008, 10:55 PM
Harald van Dijk
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On Thu, 28 Aug 2008 14:26:47 -0700, vippstar wrote:
Quote:
On Aug 28, 11:40 pm, Harald van Dijk <true...@gmail.com> wrote:
Quote:
>On Thu, 28 Aug 2008 13:02:46 -0700, vippstar wrote:
Quote:
On Aug 28, 10:51 pm, Andrew Poelstra <apoels...@supernova.home>
wrote:
>On 2008-08-28, vipps...@gmail.com <vipps...@gmail.com> wrote:
On Aug 28, 10:07 pm, "Bill Reid"
>char *find_text_field(char *curr_char) {
>>
Quote:
> for(;isspace(*curr_char)!=0;curr_char++);
>>
Quote:
Undefined behavior.
Cast *curr_char to (unsigned char).
>>
Quote:
>Not necessarily. If you are sure that the value of *curr_char will
>be within the range of unsigned char (or char is unsigned by
>default!) the behavior is defined.
>>
Quote:
Whether char is unsigned or signed is unspecified.
>[...]
How do you suggest to
"check" the value of *curr_char?
>>
>One possibility is reading the implementation's documentation. Another
>is
>
Reading the implementations documentation? Why would you do such thing
when you can simply write valid C code that doesn't rely on
implementation documentation?
Because you've already read it for other reasons? Because you mistook the
compiler's documentation as a guarantee that char is always unsigned, on
every compiler? It doesn't need to be a good idea to have people do it.
Quote:
Quote:
>calling find_text_field(" hello "); or any other string consisting
>only of characters in the basic execution character set.
>
I don't see how this checks for anything (it's a guarantee that
characters in the basic execution set have a value >= 0)
I took the liberty of considering a verification by the programmer that
each character is in fact in the basic execution character set as a check.
If you don't approve, then I don't see the point of your question

How do you suggest to "check" the value of *curr_char?

since no check is required.
Quote:
Quote:
Quote:
Regardless of what you suggest, it _is_ undefined behavior, in his
code. In *your* code with *your* checks/guarantees, it might not be.
>>
Quote:
(for example, char c = someval; assert(c >= 0); isspace(c); is not
UB)
>>
>If c is within the range of unsigned char, the behaviour of isspace(c)
>is specified by the standard regardless of whether you code a check to
>verify its value.
>
So? Have I said otherwise? (I actually implied exactly _that_ when I
suggested the (unsigned char) cast)
Yes. I take an unqualified "Undefined behavior." as saying the behaviour
is undefined, not that the behaviour may or may not be undefined. More
clearly, you also claimed rather explicitly "Regardless of what you
suggest, it _is_ undefined behavior, in his code."
  #12  
Old August 28th, 2008, 11:15 PM
vippstar@gmail.com
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On Aug 29, 12:46 am, Harald van Dijk <true...@gmail.com> wrote:
Quote:
On Thu, 28 Aug 2008 14:26:47 -0700, vippstar wrote:
Quote:
On Aug 28, 11:40 pm, Harald van Dijk <true...@gmail.com> wrote:
Quote:
On Thu, 28 Aug 2008 13:02:46 -0700, vippstar wrote:
On Aug 28, 10:51 pm, Andrew Poelstra <apoels...@supernova.home>
wrote:
On 2008-08-28, vipps...@gmail.com <vipps...@gmail.com> wrote:
On Aug 28, 10:07 pm, "Bill Reid"
char *find_text_field(char *curr_char) {
>
Quote:
Quote:
for(;isspace(*curr_char)!=0;curr_char++);
>
Quote:
Quote:
Undefined behavior.
Cast *curr_char to (unsigned char).
>
Quote:
Quote:
Not necessarily. If you are sure that the value of *curr_char will
be within the range of unsigned char (or char is unsigned by
default!) the behavior is defined.
>
Quote:
Quote:
Whether char is unsigned or signed is unspecified.
[...]
How do you suggest to
"check" the value of *curr_char?
>
Quote:
Quote:
One possibility is reading the implementation's documentation. Another
is
>
Quote:
Reading the implementations documentation? Why would you do such thing
when you can simply write valid C code that doesn't rely on
implementation documentation?
>
Because you've already read it for other reasons? Because you mistook the
compiler's documentation as a guarantee that char is always unsigned, on
every compiler? It doesn't need to be a good idea to have people do it.
I asked how you'd check *curr_char in C.
You reply with reading the implementations documentation. It's not a
logical answer.
(indeed, I did not explicitly say "in C", but it was implied, I think)
Quote:
Quote:
Quote:
calling find_text_field(" hello "); or any other string consisting
only of characters in the basic execution character set.
>
Quote:
I don't see how this checks for anything (it's a guarantee that
characters in the basic execution set have a value >= 0)
>
I took the liberty of considering a verification by the programmer that
each character is in fact in the basic execution character set as a check.
If you don't approve, then I don't see the point of your question
>
How do you suggest to "check" the value of *curr_char?
>
since no check is required.
No check is required. Then mr Poelstra did not have a point, not me.
It was him who suggested "checking" the value of *curr_char.
Quote:
Quote:
Quote:
Regardless of what you suggest, it _is_ undefined behavior, in his
code. In *your* code with *your* checks/guarantees, it might not be.
>
Quote:
Quote:
(for example, char c = someval; assert(c >= 0); isspace(c); is not
UB)
>
Quote:
Quote:
If c is within the range of unsigned char, the behaviour of isspace(c)
is specified by the standard regardless of whether you code a check to
verify its value.
>
Quote:
So? Have I said otherwise? (I actually implied exactly _that_ when I
suggested the (unsigned char) cast)
>
Yes. I take an unqualified "Undefined behavior." as saying the behaviour
is undefined, not that the behaviour may or may not be undefined. More
When you rely on implementation-defined behavior where one of the
possible behaviors is undefined, you're invoking undefined behavior.
Quote:
clearly, you also claimed rather explicitly "Regardless of what you
suggest, it _is_ undefined behavior, in his code."
Yes I did, and I was wrong. I did not read the whole code, I only
assumed he used that function in the input of some file stream.
His code did not do such thing, so he is not invoking undefined
behavior. (at least not there)

So yes, I do see your point now. (I was replying to your post as I was
reading it)
  #13  
Old August 28th, 2008, 11:25 PM
Harald van Dijk
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On Thu, 28 Aug 2008 15:12:24 -0700, vippstar wrote:
Quote:
Quote:
Quote:
>On Thu, 28 Aug 2008 13:02:46 -0700, vippstar wrote:
On Aug 28, 10:51 pm, Andrew Poelstra <apoels...@supernova.home>
wrote:
>On 2008-08-28, vipps...@gmail.com <vipps...@gmail.com> wrote:
On Aug 28, 10:07 pm, "Bill Reid"
>char *find_text_field(char *curr_char) {
>>
Quote:
> for(;isspace(*curr_char)!=0;curr_char++);
>>
Quote:
Undefined behavior.
Cast *curr_char to (unsigned char).
>>
Quote:
>Not necessarily. If you are sure that the value of *curr_char
>will be within the range of unsigned char (or char is unsigned by
>default!) the behavior is defined.
>>
Quote:
Whether char is unsigned or signed is unspecified.
>[...]
How do you suggest to
"check" the value of *curr_char?
[...snip...]
Quote:
It was [mr Poelstra] who suggested "checking" the value of *curr_char.
Ah! Thanks for clearing that up. There's something missing in the quoted
material. In the quote, he says you need to be sure that the value of
*curr_char will be within the range of unsigned char, not that you need to
check it. Now that I've looked up the message, I see your point a bit
better.

(No comment on the rest of your message right now.)
  #14  
Old August 29th, 2008, 02:15 AM
Bill Reid
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command


<vippstar@gmail.com> wrote in message
news:6095075f-f1cd-4223-84c8-bb066c925641@d77g2000hsb.googlegroups.com...
Quote:
On Aug 28, 10:07 pm, "Bill Reid" <hormelf...@happyhealthy.net> wrote:
Quote:
Pietro Cerutti <gahr_SPAM_gahr_ME_ch> wrote in message

news:etWdnbY4FOXdxSvVnZ2dneKdnZydnZ2d@giganews.com ...
Quote:
hiteshtha...@gmail.com wrote:
hi
Can anyone please help me in finding the total number of words in a
file
I get the newlines, characters and blankspaces correctly but counting
words ia problem.I have tried many ways but it didnt help.
It's most about your definition of "words" and "white spaces". The unix
utility wc refers to a word as a string of characters delimited by a
blank space. If that's the behavior that you're trying to mimic, then
totbspc may be a good approximation of the number of words in your file.
Quote:
Quote:

How about something better than an "approximation"...
Quote:
Some error checking after fgetc returns EOF to see whether an
end-of-file event or an error occurred may be needed at the end.
OK, no assurances this is the "best" way to do this, but here is
how I do it:

char *find_text_field(char *curr_char) {

for(;isspace(*curr_char)!=0;curr_char++);
>
Undefined behavior.
Cast *curr_char to (unsigned char).
Tee-hee...this old thing from a few weeks ago...

You know if I WAS a troll, not including that SUPER-IMPORTANT!!!!
cast (which appears to be totally unnecessary on my "implementation")
would constitute the PERFECT troll, since it has now spawned a whole
raft of argumentative replies while I just enjoyed another afternoon in
paradise...

---
William Ernest Reid


  #15  
Old August 29th, 2008, 02:25 AM
vippstar@gmail.com
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On Aug 29, 4:06 am, "Bill Reid" <hormelf...@happyhealthy.net> wrote:
Quote:
<vipps...@gmail.com> wrote in message
Quote:
Undefined behavior.
Cast *curr_char to (unsigned char).
>
Tee-hee...this old thing from a few weeks ago...
>
You know if I WAS a troll, not including that SUPER-IMPORTANT!!!!
cast (which appears to be totally unnecessary on my "implementation")
would constitute the PERFECT troll, since it has now spawned a whole
raft of argumentative replies while I just enjoyed another afternoon in
paradise...
Actually, I'm not doubting that you are a troll, because I know you
are one. (thus why I didn't bother reading the rest of your code)
I'm glad I posted my post though; the "spawn of argumentative replies"
was quite informative for me.
Perhaps it was informative for the rest of those who participated, and
perhaps it will be informative to some random visitor. (search engines
commonly direct people to web mirrors of usenet posts)

Lastly, if you think 20 or so replies constitute the perfect troll,
you need to read some scott nudds ;-)
  #16  
Old August 29th, 2008, 11:25 AM
s.dhilipkumar@gmail.com
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

Hi

I have never done this exercise; this is just my version of the
code. :) I have checked just a few basic conditions, and it imitates wc
to an extent.

#include<stdio.h>

char find_char(char x)
{
if( '\n' == x)
return 1;
if( ' ' == x)
return 2;
else
return 3;
}




int main(int argc, char* argv[])
{
FILE *fp=NULL;
char prev=0,ch,ctyp;
int wc=0,nl=0,ws=0;
if (argc != 2)
{
printf ("invlid argument \n");
return 1;
}
fp=fopen(argv[1], "r");
if (fp == NULL)
{
printf("Unable to open the file %s\n",argv[1]);
return 1;
}
while(!feof(fp))
{
ch = fgetc(fp);
ctyp = find_char(ch);
if(ctyp == 1)
{
if ( prev == 3 )
wc++;nl++;
}
else if(ctyp == 2)
{
if(prev == 3)
wc++;
ws++;
}
prev = ctyp;

}

printf("Final Newline = %d Wite space = %d Word Count = %d\n",
nl,ws,wc);
return 0;
}

Sample Input:
line 1
line 2
line 3
line 4
end of line

Sample output:
./my_wc sample.txt
Final Newline = 5 Wite space = 12 Word Count = 11

Expected output
wc sample.txt
5 11 46 sample.txt

Regards,
Dhilip

hiteshtha...@gmail.com wrote:
Quote:
hi
Can anyone please help me in finding the total number of words in a
file
I get the newlines, characters and blankspaces correctly but counting
words ia problem.I have tried many ways but it didnt help.
Here is my code......
#include <string.h>
#include <stdio.h>
>
main(int argc , char *argv[])
{
FILE *fp;
int ch;
int chr=0;
int totchr=0;
int bspc=0,totbspc=0;
int nline=0;
int word=0,totwrd=0;
int i=0;
>
fp=fopen(argv[1],"r");
if(argc != 2)
{
printf("\tInsufficient arguments\n");
printf("\tusage: wrd <filename>\n");
exit(0);
}
if(fp==NULL)
{
printf("Error In File Opening\n");
exit(0);
}
else
{
while((ch=fgetc(fp))!=EOF)
{
if(ch == ' ')
{
bspc++;
}
if(ch == '\n')
{
nline++;
}
chr++;
word++;
>
>
}
}
//word = bspc + nline;
totchr += chr;
totbspc += bspc;
totwrd += word;
>
printf("\nFile %s has\n", argv[1]);
printf("\n\twhite spaces are: %d\n", totbspc);
printf("\twords are: %d\n", totwrd);
printf("\tcharacters are: %d\n", totchr);
printf("\tlines are: %d\n\n", nline);
>
fclose(fp);
}
  #17  
Old August 29th, 2008, 03:45 PM
Andrew Poelstra
Guest
 
Posts: n/a
Default Re: simulation of a "wc" command

On 2008-08-29, s.dhilipkumar@gmail.com <s.dhilipkumar@gmail.com> wrote:
Quote:
Hi
>
I have never done this excersice, this is just my version of
code. :) i have just checked few basic conditions and it imitates wc
to an extent.
>
#include<stdio.h>
>
char find_char(char x)
{
if( '\n' == x)
return 1;
if( ' ' == x)
return 2;
else
return 3;
}
>
This function can probably be replaced by the standard function
isspace() given in <ctype.h>.
Quote:
>
int main(int argc, char* argv[])
{
FILE *fp=NULL;
char prev=0,ch,ctyp;
int wc=0,nl=0,ws=0;
if (argc != 2)
{
printf ("invlid argument \n");
return 1;
}
1 is not guaranteed to be a valid return value from main(). Use
EXIT_FAILURE from <stdlib.h> instead.
Quote:
fp=fopen(argv[1], "r");
if (fp == NULL)
{
printf("Unable to open the file %s\n",argv[1]);
return 1;
}
while(!feof(fp))
Uh-oh. This will cause a fencepost error - instead, check ch
against EOF (and define it as an int, not a char, to hold this
value) and use feof() to confirm that it really was end-of-file,
not another error.
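To make the fencepost concrete, here is a small side-by-side sketch. Both function names are invented for the comparison; the first mirrors the while(!feof(fp)) pattern and the second tests the return value of fgetc().

```c
#include <stdio.h>

/* Pattern from the posted code: feof() only becomes true after a read
   has already failed, so the body runs one extra time with ch == EOF. */
long count_feof_loop(FILE *fp)
{
    long n = 0;
    while (!feof(fp)) {
        int ch = fgetc(fp);
        (void)ch;                /* the EOF value gets counted like a character */
        n++;
    }
    return n;
}

/* Suggested pattern: test the value fgetc() returned, stored in an int. */
long count_eof_checked(FILE *fp)
{
    long n = 0;
    int ch;
    while ((ch = fgetc(fp)) != EOF)
        n++;
    return n;
}
```

On a regular file holding the three characters "abc", the first function returns 4 and the second returns 3. The first form also has no way out of the loop if a read error occurs, because feof() never becomes true in that case.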
Quote:
{
ch = fgetc(fp);
ctyp = find_char(ch);
if(ctyp == 1)
Yuck. Better to explicitly compare ch against ' ' or '\n',
whichever one you meant. Magic numbers are Bad News.
Quote:
{
if ( prev == 3 )
wc++;nl++;
}
else if(ctyp == 2)
{
if(prev == 3)
wc++;
ws++;
}
This logic looks like I could replace it with the simpler:

if(isspace(ch))
++wc;

while(isspace(ch))
{
if(ch == '\n')
++nl;
++ws;
ch = fgetc(fp);
}

which is clearer IMHO.


Also, don't top-post. Your reply belongs below the text.

--
Andrew Poelstra apoelstra@wpsoftware.com
To email me, use the above email address with .com set to .net
 
