  #1  
Old April 21st, 2007, 09:45 PM
ume$h
Guest
 
Posts: n/a
Default No. of 'a' in a text file

/* I wrote the following program to calculate no. of 'a' in the file
c:/1.txt but it fails to give appropriate result. What is wrong with
it? */

#include"stdio.h"
int main(void)
{
FILE *f;
char ch;
long int a=0;
f=fopen("c:/1.txt","r");
while(ch=getc(f)!=EOF)
{
switch(ch)
{
case 'a': a++;break;
}
}
printf("No. of 'a' = %d\n",a);
fclose(f);
return 0;
}

  #2  
Old April 21st, 2007, 09:55 PM
Richard Heathfield
Guest
 
Posts: n/a
Default Re: No. of 'a' in a text file

ume$h said:
Quote:
/* I wrote the following program to calculate no. of 'a' in the file
c:/1.txt but it fails to give appropriate result. What is wrong with
it? */
>
#include"stdio.h"
int main(void)
{
FILE *f;
char ch;
long int a=0;
f=fopen("c:/1.txt","r");
What happens if the fopen fails?
Quote:
while(ch=getc(f)!=EOF)
Are you sure you meant to say this? Consider the precedences of = and !=
and check the return type of getc.
Quote:
{
switch(ch)
{
case 'a': a++;break;
Do you really mean to break out of your loop at this point?
Quote:
}
}
printf("No. of 'a' = %d\n",a);
Did you read the documentation for printf?

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
  #3  
Old April 21st, 2007, 10:15 PM
pete
Guest
 
Posts: n/a
Default Re: No. of 'a' in a text file

ume$h wrote:
Quote:
>
/* I wrote the following program to calculate no. of 'a' in the file
c:/1.txt but it fails to give appropriate result. What is wrong with
it? */
A lot.

Quote:
#include"stdio.h"
#include <stdio.h>
Quote:
int main(void)
{
FILE *f;
char ch;
int ch;
/*
** getc returns int, EOF is type int.
*/

Quote:
long int a=0;
f=fopen("c:/1.txt","r");
What if fopen returns NULL?
Quote:
while(ch=getc(f)!=EOF)
ch is assigned a value of either 1 or 0,
depending on whether or not getc returns EOF.
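To see what the quoted condition actually stores, here is a minimal sketch. The helper names as_written and as_intended are made up for illustration, and the value that getc would have returned is passed in as a parameter so that no file is needed.

```c
#include <stdio.h>   /* for EOF */

/* Hypothetical helper: mimics `ch = getc(f) != EOF`, with the result
** of getc passed in as r. != binds tighter than =, so ch receives
** the result of the comparison, not the character itself. */
int as_written(int r)
{
    int ch;

    ch = r != EOF;                  /* parsed as ch = (r != EOF) */
    return ch;
}

/* Hypothetical helper: mimics `(ch = getc(f)) != EOF`. */
int as_intended(int r)
{
    int ch;
    int more = ((ch = r) != EOF);   /* the assignment happens first */

    (void)more;                     /* only ch matters in this sketch */
    return ch;
}
```

With the loop as posted, switch(ch) is only ever handed 1 while the loop runs, so the 'a' case is never taken.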
Quote:
{
switch(ch)
{
case 'a': a++;break;
}
You need a default case.
Quote:
}
printf("No. of 'a' = %d\n",a);
fclose(f);
return 0;
}
/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
FILE *f;
char *fn = "c:/1.txt";
int c;
long unsigned a = 0;

f = fopen(fn,"r");
if (f != NULL) {
while ((c = getc(f)) != EOF) {
a += c == 'a';
}
printf("No. of 'a' = %lu\n", a);
fclose(f);
} else {
printf("fopen problem with %s\n", fn);
}
return 0;
}

/* END new.c */

--
pete
  #4  
Old April 21st, 2007, 10:15 PM
Umesh
Guest
 
Posts: n/a
Default Re: No. of 'a' in a text file

// This one runs.

#include"stdio.h"
int main(void)
{
FILE *f;
long int a=0;
f=fopen("c:/1.txt","r");
while (getc(f)=='a')
a++;
printf("No. of 'a' = %d\n",a);
fclose(f);
return 0;
}

  #5  
Old April 21st, 2007, 10:25 PM
Richard Tobin
Guest
 
Posts: n/a
Default Re: No. of 'a' in a text file

In article <fI-dnb0QitWC57fbRVnysQA@bt.com>,
Richard Heathfield <rjh@see.sig.invalid> wrote:
Quote:
Quote:
>while(ch=getc(f)!=EOF)
>
>Are you sure you meant to say this? Consider the precedences of = and !=
>and check the return type of getc.
>
Quote:
>{
> switch(ch)
> {
> case 'a': a++;break;
>
>Do you really mean to break out of your loop at this point?
What? That does not break out of the loop.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
  #6  
Old April 21st, 2007, 11:55 PM
Richard Heathfield
Guest
 
Posts: n/a
Default Re: No. of 'a' in a text file

Umesh said:
Quote:
// This one runs.
>
#include"stdio.h"
int main(void)
{
FILE *f;
long int a=0;
f=fopen("c:/1.txt","r");
while (getc(f)=='a')
a++;
printf("No. of 'a' = %d\n",a);
fclose(f);
return 0;
}
On my system, the output of this program is:

"Segmentation fault (core dumped)"

  #7  
Old April 21st, 2007, 11:55 PM
Richard Heathfield
Guest
 
Posts: n/a
Default Re: No. of 'a' in a text file

Richard Tobin said:
Quote:
Richard Heathfield wrote:
>
Quote:
Quote:
>> switch(ch)
>> {
>> case 'a': a++;break;
>>
>>Do you really mean to break out of your loop at this point?
>
What? That does not break out of the loop.
You're right, of course. A crit too far.

  #8  
Old April 21st, 2007, 11:55 PM
Keith Thompson
Guest
 
Posts: n/a
Default Re: No. of 'a' in a text file

pete <pfiland@mindspring.com> writes:
Quote:
ume$h wrote:
[...]
Quote:
Quote:
>{
> switch(ch)
> {
> case 'a': a++;break;
> }
>
You need a default case.
[...]

What for?

The switch statement is equivalent to:

if (ch == 'a') {
a++;
}

which, of course, would be a better way to write it (unless the OP is
planning to expand it to handle other characters).

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
  #9  
Old April 21st, 2007, 11:55 PM
pete
Guest
 
Posts: n/a
Default Re: No. of 'a' in a text file

Umesh wrote:
Quote:
>
// This one runs.
>
#include"stdio.h"
Should be
#include <stdio.h>
Quote:
int main(void)
{
FILE *f;
long int a=0;
f=fopen("c:/1.txt","r");
What if fopen returns a null pointer?
Quote:
while (getc(f)=='a')
a++;
This loop will tally the number of 'a' characters
at the beginning of the file
and will stop as soon as it reads any other character.
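The difference between the two loops can be sketched on a string instead of a file. The function names below are hypothetical and only illustrate the point; they are not taken from anyone's post.

```c
/* What `while (getc(f) == 'a') a++;` computes: the length of the
** run of target characters at the very start of the input. */
long leading_run(const char *s, char target)
{
    long n = 0;

    while (*s != '\0' && *s == target) {
        ++n;
        ++s;
    }
    return n;
}

/* What was actually wanted: every occurrence, wherever it appears. */
long total_count(const char *s, char target)
{
    long n = 0;

    for (; *s != '\0'; ++s) {
        n += (*s == target);
    }
    return n;
}
```

For an input such as "banana" the first function reports 0 and the second reports 3.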
Quote:
printf("No. of 'a' = %d\n",a);
Should be %ld because (a) is long.
Quote:
fclose(f);
That function call is undefined if (f) is a null pointer.
Quote:
return 0;
}
--
pete
  #10  
Old April 22nd, 2007, 12:05 AM
pete
Guest
 
Posts: n/a
Default Re: No. of 'a' in a text file

Keith Thompson wrote:
Quote:
>
pete <pfiland@mindspring.com> writes:
Quote:
ume$h wrote:
[...]
Quote:
Quote:
{
switch(ch)
{
case 'a': a++;break;
}
You need a default case.
[...]
>
What for?
Style?
I forgot that the default case is optional for switch statements.
Quote:
The switch statement is equivalent to:
>
if (ch == 'a') {
a++;
}
>
which, of course, would be a better way to write it (unless the OP is
planning to expand it to handle other characters).
--
pete
  #11  
Old April 22nd, 2007, 03:05 AM
Keith Thompson
Guest
 
Posts: n/a
Default Re: No. of 'a' in a text file

Umesh <fraternitydisposal@gmail.com> writes:
Quote:
// This one runs.
>
#include"stdio.h"
int main(void)
{
FILE *f;
long int a=0;
f=fopen("c:/1.txt","r");
while (getc(f)=='a')
a++;
printf("No. of 'a' = %d\n",a);
fclose(f);
return 0;
}
I'm sure it runs, but it doesn't work. It appears to count the number
of consecutive 'a' characters starting at the beginning of the file,
which I don't think is what you're trying to do.

It also ignores all the advice that's been given to you so far:

Use <stdio.h>, not "stdio.h"

Check the result of fopen().

The "%d" format expects an int; you're giving it a long int.
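Those three points can be put together in a minimal sketch. The helper names count_in_file and report are assumptions made for this illustration, and returning -1 on failure is just one possible convention.

```c
#include <stdio.h>

/* Hypothetical helper: counts occurrences of target in the named file.
** Returns the count, or -1 if the file cannot be opened. */
long count_in_file(const char *path, int target)
{
    FILE *f = fopen(path, "r");
    long n = 0;
    int c;                        /* int, so EOF stays distinguishable */

    if (f == NULL) {
        return -1;                /* fopen can fail; do not touch f */
    }
    while ((c = getc(f)) != EOF) {
        if (c == target) {
            ++n;
        }
    }
    fclose(f);
    return n;
}

/* Hypothetical helper: prints the result with a format that matches
** the argument type. Returns 0 on success, -1 on failure. */
int report(const char *path)
{
    long n = count_in_file(path, 'a');

    if (n < 0) {
        fprintf(stderr, "cannot open %s\n", path);
        return -1;
    }
    printf("No. of 'a' = %ld\n", n);   /* %ld matches long */
    return 0;
}
```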

  #12  
Old April 22nd, 2007, 06:05 AM
Umesh
Guest
 
Posts: n/a
Default No. of 'ab' or specific word in a text file

/*Calculate the no. of occurance of 'ab'. But is this one OK/
EFFICIENT?*/
#include<stdio.h>
int main(void)
{
FILE *f;
long int c,c1,ab=1;
f=fopen("c:/1.txt","r");
while ((c=getc(f))!=EOF && (c1=getc(f))!=EOF) {
if(c=='a' && c1=='b') ab++;}
fclose(f);
printf("No. of 'ab' = %ld\n",ab);
return 0;
}

  #13  
Old April 22nd, 2007, 06:15 AM
Richard Heathfield
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

Umesh said:
Quote:
/*Calculate the no. of occurance of 'ab'. But is this one OK/
EFFICIENT?*/
#include<stdio.h>
int main(void)
{
FILE *f;
long int c,c1,ab=1;
f=fopen("c:/1.txt","r");
while ((c=getc(f))!=EOF && (c1=getc(f))!=EOF) {
if(c=='a' && c1=='b') ab++;}
fclose(f);
printf("No. of 'ab' = %ld\n",ab);
return 0;
}
I ran this, and got the following output:

Segmentation fault (core dumped)


But yes, the core dump did happen very quickly. No efficiency complaints
here.


  #14  
Old April 22nd, 2007, 06:45 AM
Umesh
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file


Richard Heathfield wrote:
Quote:
I ran this, and got the following output:
>
Segmentation fault (core dumped)
>
>
But yes, the core dump did happen very quickly. No efficiency complaints
here.
The program is running in TC++ 4.5 and VC++ 6 compiler. What is wrong
with you?
I wanted to say that if I want to find no. of occurance of
'abcdefghijk' in a text file by modifying this program, it would be a
toilsome job. Is there any alternative? Thank you.

  #15  
Old April 22nd, 2007, 06:45 AM
attn.steven.kuo@gmail.com
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

On Apr 21, 9:59 pm, Umesh <fraternitydispo...@gmail.com> wrote:
Quote:
/*Calculate the no. of occurance of 'ab'. But is this one OK/
EFFICIENT?*/
#include<stdio.h>
int main(void)
{
FILE *f;
long int c,c1,ab=1;

Since getc returns int, I'd
use int for the 'c' and 'c1' variables.

Quote:
f=fopen("c:/1.txt","r");

Others have said that you
need to check to see if fopen succeeds.

Quote:
while ((c=getc(f))!=EOF && (c1=getc(f))!=EOF) {
if(c=='a' && c1=='b') ab++;}
fclose(f);
printf("No. of 'ab' = %ld\n",ab);
return 0;
>
}

And what happens if the file
you're reading contains the
string "bababa\n"? Ask yourself
if the output would be any different
for a file that contained the string
"ababab\n"?

--
Hope this helps,
Steven


  #16  
Old April 22nd, 2007, 06:45 AM
Richard Heathfield
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

Umesh said:
Quote:
>
Richard Heathfield wrote:
Quote:
>I ran this, and got the following output:
>>
>Segmentation fault (core dumped)
>>
>>
>But yes, the core dump did happen very quickly. No efficiency
>complaints here.
>
The program is running in TC++ 4.5 and VC++ 6 compiler.
No, it isn't. The program doesn't run in the compiler. It runs as a
process on the computer. (One might reasonably describe a C program as
running "in" an interpreter if one happened to be using one, but you
aren't.)
Quote:
What is wrong with you?
What a question. Here's a better question: what is wrong with your
program, that causes it to produce a segmentation fault on my system
instead of a graceful error message? Hint: fopen can fail.
Quote:
I wanted to say that if I want to find no. of occurance of
'abcdefghijk' in a text file by modifying this program, it would be a
toilsome job. Is there any alternative? Thank you.
Step 1: make it readable (so that you can correct it).
Step 2: make it correct (so that it's worth speeding up).
Step 3: make it fast (if it isn't already fast enough).

Leaving aside readability issues (although I don't consider your program
to be very readable), you have at least three problems in your program
that stop it from being correct. Firstly, it makes an invalid
assumption that its resource acquisition request is bound to succeed.
Secondly, it starts its count from 1 rather than from 0. And thirdly,
it fails to count any 'a', 'b' pair that are an odd number of bytes
into the file.
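The third problem is easy to reproduce on a string. The sketch below uses two hypothetical functions: one consumes two characters per step, as the posted loop does, and the other advances one character at a time. Both start counting from 0, unlike the posted program.

```c
#include <stddef.h>
#include <string.h>

/* Mimics the posted loop: two characters are consumed per iteration,
** so a pair that starts at an odd offset is never examined. */
long pairs_two_at_a_time(const char *s)
{
    size_t len = strlen(s);
    size_t i;
    long n = 0;

    for (i = 0; i + 1 < len; i += 2) {
        if (s[i] == 'a' && s[i + 1] == 'b') {
            ++n;
        }
    }
    return n;
}

/* Advances one character at a time, so every adjacent pair is seen. */
long pairs_sliding(const char *s)
{
    size_t i;
    long n = 0;

    for (i = 0; s[i] != '\0' && s[i + 1] != '\0'; ++i) {
        if (s[i] == 'a' && s[i + 1] == 'b') {
            ++n;
        }
    }
    return n;
}
```

On "bababa\n" the first function finds no pairs while the second finds two; on "ababab\n" both find three.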

I suggest you fix those three problems before worrying about
performance.

  #17  
Old April 22nd, 2007, 07:05 AM
Ian Collins
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

Umesh wrote:
Quote:
/*Calculate the no. of occurance of 'ab'. But is this one OK/
EFFICIENT?*/
#include<stdio.h>
int main(void)
{
FILE *f;
long int c,c1,ab=1;
f=fopen("c:/1.txt","r");
while ((c=getc(f))!=EOF && (c1=getc(f))!=EOF) {
if(c=='a' && c1=='b') ab++;}
fclose(f);
printf("No. of 'ab' = %ld\n",ab);
return 0;
}
>
Whitespace has plummeted in price over the past few years.

Making your program a little safer and more readable:

#include <stdio.h>

int main( int argc, char** argv )
{
if( argc > 1 )
{
FILE *f = fopen( argv[1],"r");

if( f )
{
int c,c1;
unsigned ab=0;

while ((c=getc(f))!=EOF && (c1=getc(f))!=EOF)
{
if(c=='a' && c1=='b') ab++;
}

fclose(f);
printf("No. of 'ab' = %ld\n",ab);
}
}
return 0;
}

Now try running it on itself and you'll find your logic errors.

--
Ian Collins.
  #18  
Old April 22nd, 2007, 07:15 AM
Richard Heathfield
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

Ian Collins said:
Quote:
Umesh wrote:
<snip>
Quote:
Quote:
>long int c,c1,ab=1;
<snip>
Quote:
Quote:
>printf("No. of 'ab' = %ld\n",ab);
<snip>
Quote:
>
Making you program a little safer and readable.
Laudable, but you have introduced at least one fresh bug, by making ab
into an unsigned int...

<snip>
Quote:
unsigned ab=0;
<snip>
Quote:
printf("No. of 'ab' = %ld\n",ab);
....without fixing the printf to match.
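A minimal sketch of keeping the conversion specifier in step with the argument type. The helper names are hypothetical, and snprintf is assumed to be available (it is C99, not C90).

```c
#include <stdio.h>

/* Hypothetical helper: unsigned int goes with %u. */
int format_unsigned(char *buf, size_t size, unsigned ab)
{
    return snprintf(buf, size, "No. of 'ab' = %u\n", ab);
}

/* Hypothetical helper: long int goes with %ld. */
int format_long(char *buf, size_t size, long ab)
{
    return snprintf(buf, size, "No. of 'ab' = %ld\n", ab);
}
```

Either pairing is fine; what matters is that changing the declaration of ab also means changing the format string.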

  #19  
Old April 22nd, 2007, 10:55 AM
pete
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

Umesh wrote:
Quote:
I wanted to say that if I want to find no. of occurance of
'abcdefghijk' in a text file by modifying this program, it would be a
toilsome job. Is there any alternative? Thank you.
/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
int c;
FILE *f;
char *fn = "c:/1.txt";
long unsigned count = 0;
enum states {A = 0,B,C,D,E,F,G,H,I,J,K} waiting_for = A;

f = fopen(fn,"r");
if (f != NULL) {
while ((c = getc(f)) != EOF) {
switch (c) {
case 'a':
if (waiting_for++ != A) {
waiting_for = A;
}
break;
case 'b':
if (waiting_for++ != B) {
waiting_for = A;
}
break;
case 'c':
if (waiting_for++ != C) {
waiting_for = A;
}
break;
case 'd':
if (waiting_for++ != D) {
waiting_for = A;
}
break;
case 'e':
if (waiting_for++ != E) {
waiting_for = A;
}
break;
case 'f':
if (waiting_for++ != F) {
waiting_for = A;
}
break;
case 'g':
if (waiting_for++ != G) {
waiting_for = A;
}
break;
case 'h':
if (waiting_for++ != H) {
waiting_for = A;
}
break;
case 'i':
if (waiting_for++ != I) {
waiting_for = A;
}
break;
case 'j':
if (waiting_for++ != J) {
waiting_for = A;
}
break;
case 'k':
if (waiting_for == K) {
++count;
}
waiting_for = A;
break;
default:
waiting_for = A;
break;
}
}
printf("No. of 'abcdefghijk' = %lu\n", count);
fclose(f);
} else {
printf("fopen problem with %s\n", fn);
}
return 0;
}

/* END new.c */


--
pete
  #20  
Old April 22nd, 2007, 11:25 AM
pete
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

pete wrote:
Quote:
>
Umesh wrote:
>
Quote:
I wanted to say that if I want to find no. of occurance of
'abcdefghijk' in a text file by modifying this program,
it would be a toilsome job. Is there any alternative? Thank you.
/* BEGIN new.c */

#include <stdio.h>
#include <string.h>

#define STRING "abcdefghijk"

int main(void)
{
int c;
FILE *f;
char *letter;
char *fn = "c:/1.txt";
long unsigned count = 0;
enum states {A = 0,B,C,D,E,F,G,H,I,J,K} waiting_for = A;
const char* const string = STRING;

f = fopen(fn, "r");
if (f != NULL) {
while ((c = getc(f)) != EOF) {
switch (c) {
case 'k':
if (waiting_for == K) {
++count;
}
waiting_for = A;
break;
default:
letter = strchr(string, c);
if (letter == NULL
|| waiting_for++ != letter - string)
{
waiting_for = A;
}
break;
}
}
printf("No. of '%s' = %lu\n", string, count);
fclose(f);
} else {
printf("fopen problem with %s\n", fn);
}
return 0;
}

/* END new.c */


--
pete
  #21  
Old April 22nd, 2007, 11:25 AM
James Kanze
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

On Apr 22, 8:16 am, Richard Heathfield <r...@see.sig.invalid> wrote:
Quote:
Ian Collins said:
Quote:
Quote:
Umesh wrote:
<snip>
Quote:
Quote:
long int c,c1,ab=1;
<snip>
Quote:
Quote:
printf("No. of 'ab' = %ld\n",ab);
<snip>
Quote:
Quote:
Making you program a little safer and readable.
Quote:
Laudable, but you have introduced at least one fresh bug, by making ab
into an unsigned int...
Quote:
<snip>
Quote:
Quote:
unsigned ab=0;
<snip>
Quote:
printf("No. of 'ab' = %ld\n",ab);
Quote:
...without fixing the printf to match.
It's worth pointing out that this was also cross-posted to
comp.lang.c++, and that C++ has better ways of handling this
problem: in C++ code, I'd use std::deque (for the general case,
anyway) and istream, for example, to maintain a sliding two
character window in the file. In C, I'd probably simulate the
use of deque to achieve the same thing---a two character queue
is pretty easy to program. Alternatively, a simple state
machine is an efficient solution in both languages.

In C, for this specific case, I'd probably write something like:

#include <stdio.h>
#include <stdlib.h>

int
main()
{
FILE* f = fopen( "somefile.txt", "r" ) ;
if ( f == NULL ) {
fprintf( stderr, "cannot open: %s\n", "somefile.txt" ) ;
exit( 2 ) ;
}
int prev = '\0' ;
int count = 0 ;
for ( int ch = getc( f ) ; ch != EOF ; ch = getc( f ) ) {
if ( prev == 'a' && ch == 'b' ) {
++ count ;
}
prev = ch ;
}
printf( "%d\n", count ) ;
return 0 ;
}

(I think that this is 100% C. At any rate, gcc -pedantic
-std=c99 -Wall compiles it without warnings.)
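The two-character queue idea generalises to a window holding the last strlen(pat) characters. What follows is only a sketch under stated assumptions: the name count_with_window, the MAX_PAT limit and the -1 error return are all choices made for this example.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PAT 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: counts (possibly overlapping) occurrences of
** pat in the stream by keeping the most recent strlen(pat) characters
** in a small window. Returns -1 for a null stream or null pattern, or
** a pattern that is empty or longer than MAX_PAT. */
long count_with_window(FILE *f, const char *pat)
{
    char window[MAX_PAT];
    size_t m;
    size_t filled = 0;
    long count = 0;
    int c;

    if (f == NULL || pat == NULL) {
        return -1;
    }
    m = strlen(pat);
    if (m == 0 || m > MAX_PAT) {
        return -1;
    }
    while ((c = getc(f)) != EOF) {
        if (filled < m) {
            window[filled++] = (char)c;           /* still filling up */
        } else {
            memmove(window, window + 1, m - 1);   /* drop the oldest */
            window[m - 1] = (char)c;
        }
        if (filled == m && memcmp(window, pat, m) == 0) {
            ++count;
        }
    }
    return count;
}
```

The memmove makes this linear in the pattern length per character read, which is fine for short patterns; a circular buffer would avoid the copying.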

--
James Kanze (Gabi Software) email: james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

  #22  
Old April 22nd, 2007, 11:45 AM
pete
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

pete wrote:
Quote:
>
pete wrote:
Quote:

Umesh wrote:
Quote:
I wanted to say that if I want to find no. of occurance of
'abcdefghijk' in a text file by modifying this program,
it would be a toilsome job. Is there any alternative? Thank you.
/* BEGIN new.c */

#include <stdio.h>
#include <string.h>

#define STRING "abcdef"

int main(void)
{
enum states {A = 0,B,C,D,E,F,G,H,I,J,K} waiting_for = A;
/*
** Just make sure that the enum states cover the STRING
** and you can change STRING to whatever you want
** without having to change any other lines in this program.
*/
int c;
FILE *fp;
char *letter;
char *fn = "c:/1.txt";
long unsigned count = 0;
const char* const string = STRING;

fp = fopen(fn, "r");
if (fp != NULL) {
while ((c = getc(fp)) != EOF) {
if (c == STRING[sizeof STRING - 2]) {
if (waiting_for == sizeof STRING - 2) {
++count;
}
waiting_for = A;
} else {
letter = strchr(string, c);
if (letter == NULL
|| waiting_for++ != letter - string)
{
waiting_for = A;
}
}
}
printf("No. of '%s' = %lu\n", string, count);
fclose(fp);
} else {
printf("fopen problem with %s\n", fn);
}
return 0;
}

/* END new.c */

--
pete
  #23  
Old April 22nd, 2007, 12:05 PM
pete
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

pete wrote:
Quote:
>
pete wrote:
Quote:

pete wrote:
Quote:
>
Umesh wrote:
>
I wanted to say that if I want to find no. of occurance of
'abcdefghijk' in a text file by modifying this program,
it would be a toilsome job. Is there any alternative? Thank you.
Quote:
#define STRING "abcdef"
>
int main(void)
{
enum states {A = 0,B,C,D,E,F,G,H,I,J,K} waiting_for = A;
/*
** Just make sure that the enum states cover the STRING
** and you can change STRING to whatever you want
** without having to change any other lines in this program.
*/
Actually, the following line of code
Quote:
letter = strchr(string, c);
prevents the program from working correctly
with strings like "baby",
that have multiple occurrences of letters.
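One way around that limitation is to replace the strchr lookup with a Knuth-Morris-Pratt failure table, which is a different technique from the enum-based code above. The sketch below works on a string rather than a file to keep it short; the name count_kmp, the MAX_PAT limit and the -1 error return are assumptions for this example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PAT 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: counts overlapping occurrences of pat in text.
** Returns -1 if either argument is null or pat is empty or too long. */
long count_kmp(const char *text, const char *pat)
{
    size_t fail[MAX_PAT];
    size_t m, k, i;
    long count = 0;

    if (text == NULL || pat == NULL) {
        return -1;
    }
    m = strlen(pat);
    if (m == 0 || m > MAX_PAT) {
        return -1;
    }

    /* fail[i] is the length of the longest proper prefix of
    ** pat[0..i] that is also a suffix of pat[0..i]. */
    fail[0] = 0;
    k = 0;
    for (i = 1; i < m; ++i) {
        while (k > 0 && pat[i] != pat[k]) {
            k = fail[k - 1];
        }
        if (pat[i] == pat[k]) {
            ++k;
        }
        fail[i] = k;
    }

    k = 0;   /* number of pattern characters matched so far */
    for (i = 0; text[i] != '\0'; ++i) {
        while (k > 0 && text[i] != pat[k]) {
            k = fail[k - 1];      /* fall back, keep partial match */
        }
        if (text[i] == pat[k]) {
            ++k;
        }
        if (k == m) {
            ++count;
            k = fail[k - 1];      /* allow overlapping matches */
        }
    }
    return count;
}
```

The matching loop looks at one character at a time and never backs up in the input, so the same body could be driven by getc instead of a string index.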

--
pete
  #24  
Old April 22nd, 2007, 01:25 PM
Bruce !C!+
Guest
 
Posts: n/a
Default Re: No. of 'a' in a text file

Maybe it should be like this:
while(ch=getc(f)!=EOF)-----------while(ch=getc(f)&&ch!=EOF)

{ //do something
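It may help to check how that second condition parses. In the sketch below, the hypothetical helper suggested_variant takes the value getc would have returned as r and the previous contents of ch as old_ch, assuming ch already holds some value.

```c
#include <stdio.h>   /* for EOF */

/* Hypothetical helper: mimics `ch = getc(f) && ch != EOF`.
** && binds tighter than =, so the old value of ch is compared with
** EOF and ch then receives 0 or 1, never the character read. */
int suggested_variant(int r, int old_ch)
{
    int ch = old_ch;

    ch = r && ch != EOF;    /* parsed as ch = (r && (ch != EOF)) */
    return ch;
}
```

Since EOF is negative and therefore nonzero, this form would not even stop at end of file; it stops only when getc returns 0. The form that works is the one used earlier in the thread: while ((ch = getc(f)) != EOF), with ch declared as int.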

  #25  
Old April 22nd, 2007, 01:35 PM
Gianni Mariani
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

Umesh wrote:
Quote:
/*Calculate the no. of occurance of 'ab'. But is this one OK/
EFFICIENT?*/
#include<stdio.h>
int main(void)
{
FILE *f;
long int c,c1,ab=1;
f=fopen("c:/1.txt","r");
while ((c=getc(f))!=EOF && (c1=getc(f))!=EOF) {
if(c=='a' && c1=='b') ab++;}
fclose(f);
printf("No. of 'ab' = %ld\n",ab);
return 0;
}
>
That will not work, it will fail for sequences like 'xab'.

The code I attached below shows how you can create a state machine to
search for any sequence of characters. You'll need to adapt it to read
from a file (should be trivial). It's not designed for speed.






#include <map>
#include <vector>
#include <memory>
#include <cassert>
#include <cstring> // std::strlen, used by AddTerminal below

// ======== IteratorTranverser =======================================
/**
* IteratorTranverser is a template class that iterates through
* a pointer or iterator. The pointers passed to it must be valid
* for the life-time of this object.
*/

template <typename Itr, typename t_key_char>
struct IteratorTranverser
{

Itr m_from;
const Itr m_end;

IteratorTranverser( const Itr & i_from, const Itr & i_end )
: m_from( i_from ),
m_end( i_end )
{
}


bool GetChar( t_key_char & l_char )
{
if ( m_from != m_end )
{
l_char = * ( m_from ++ );
return true;
}

return false;
}

bool HasInput( bool i_wait )
{
return m_from != m_end;
}

};


// ======== CombiningTraverser ========================================
/**
*
*
*/

template <typename TraverserTypeFirst, typename TraverserTypeSecond, typename t_key_char>
struct CombiningTraverser
{

TraverserTypeFirst & m_first;
TraverserTypeSecond & m_second;
bool m_use_second;

CombiningTraverser(
TraverserTypeFirst & io_first,
TraverserTypeSecond & io_second
)
: m_first( io_first ),
m_second( io_second ),
m_use_second( false )
{
}

bool GetChar( t_key_char & l_char )
{
if ( ! m_use_second )
{
if ( m_first.GetChar( l_char ) )
{
return true;
}
m_use_second = true;
}

return m_second.GetChar( l_char );
}

bool HasInput( bool i_wait )
{
if ( ! m_use_second )
{
if ( m_first.HasInput( i_wait ) )
{
return true;
}
m_use_second = true;
}

return m_second.HasInput( i_wait );
}

};


/**
* SimpleScanner is a simple scanner generator
*/

template <typename t_key_char, typename t_result>
class SimpleScanner
{
/**
* DFA_State contains a list of transitions
*/

struct DFA_State
{
typedef std::map<t_key_char, DFA_State *> t_map_type;
typedef typename t_map_type::iterator t_iterator;
t_map_type m_transitions;

t_result m_terminal;
bool m_has_val;

DFA_State()
: m_terminal(),
m_has_val( false )
{
}

/**
* FindOrInsertTransition is used to construct the scanner
*/

DFA_State * FindOrInsertTransition( t_key_char i_char )
{
std::pair<t_iterator, bool> l_insert_result =
m_transitions.insert( typename t_map_type::value_type( i_char, 0 ) );

if ( ! l_insert_result.second )
{
return l_insert_result.first->second;
}

return l_insert_result.first->second = new DFA_State;
}


/**
* FindTransition is used to traverse the scanner
*/

DFA_State * FindTransition( t_key_char i_char )
{
t_iterator l_insert_result =
m_transitions.find( i_char );

if ( l_insert_result != m_transitions.end() )
{
return l_insert_result->second;
}

return 0;
}

};

struct DFA_Machine
{

DFA_State * m_initial_state;
DFA_State * m_current_state;
DFA_State * m_last_accept_state;
std::vector<t_key_char> m_str;

DFA_Machine( DFA_State * i_initial_state )
: m_initial_state( i_initial_state ),
m_current_state( i_initial_state ),
m_last_accept_state( 0 )
{
}

/**
* NextChar will traverse the state machine with the next
* character and return the terminal t_result if one exists.
* If i_char does not make a valid transition, o_valid
* is set to false.
*/
bool NextChar( t_key_char i_char )
{
m_str.push_back( i_char );
DFA_State * l_next_state = m_current_state->FindTransition( i_char );

if ( l_next_state )
{
m_current_state = l_next_state;

// If there is an accepting state then we
// can roll back the push-back buffer.
if ( l_next_state->m_has_val )
{
m_last_accept_state = l_next_state;
m_str.clear();

}

return true;
}

m_current_state = m_initial_state;
return false;
}

template <typename Traverser>
bool ScanStream( Traverser & io_traverser, t_result & o_result )
{
t_key_char l_char;

while ( io_traverser.GetChar( l_char ) )
{
bool i_valid;

i_valid = NextChar( l_char );

DFA_State * l_last_accept_state = m_last_accept_state;

// If there are no more transitions or the last
if ( ( ! i_valid ) || ( m_current_state->m_transitions.size() == 0 ) )
{
if ( l_last_accept_state )
{
m_last_accept_state = 0;
m_current_state = m_initial_state;
if ( l_last_accept_state->m_has_val )
{
o_result = l_last_accept_state->m_terminal;
return true;
}
}
return false;
}

// There are transitions ...
assert( m_current_state->m_transitions.size() != 0 );

// If there are transitions (true here) and this is an interactive
// scan (waiting for user input) then wait a little longer, if there
// are no accept states - wait forever (which means calling GetChar).

if ( l_last_accept_state )
{
if ( ! io_traverser.HasInput( true ) )
{
// there is no longer any pending input. We're done.
m_last_accept_state = 0;
m_current_state = m_initial_state;
o_result = l_last_accept_state->m_terminal;
return true;
}
}
}

return false;
}


template <typename TraverserType>
bool DoScan( TraverserType & io_traverser, t_result & o_result )
{
std::vector<t_key_char> l_str = std::vector<t_key_char>();
l_str.swap( m_str );

if ( l_str.size() != 0 )
{
IteratorTranverser< typename std::vector<t_key_char>::iterator, t_key_char > l_tvsr(
l_str.begin(),
l_str.end()
);

CombiningTraverser<
IteratorTranverser< typename std::vector<t_key_char>::iterator, t_key_char >,
TraverserType,
t_key_char
> l_combined( l_tvsr, io_traverser );
bool l_scanned = ScanStream( l_combined, o_result );

// may still have content locally - push that back into the
m_str.insert( m_str.end(), l_tvsr.m_from, l_tvsr.m_end );

return l_scanned;
}
else
{
return ScanStream( io_traverser, o_result );
}

return false;
}

bool HasInput( bool )
{
return m_str.size() != 0;
}

bool GetChar( t_key_char & l_char )
{
if ( m_str.size() != 0 )
{
l_char = m_str.front();
m_str.erase( m_str.begin() );
return true;
}
return false;
}

};

struct Scanner
{
DFA_State * m_initial_state;

Scanner()
: m_initial_state( new DFA_State )
{
}

DFA_Machine * NewMachine()
{
return new DFA_Machine( m_initial_state );
}

/**
* AddTerminal will add a terminal and will return the colliding
* terminal (if there is one)
*/

template <typename t_iterator>
bool AddTerminal(
int i_length,
t_iterator i_str,
const t_result & i_kd,
t_result & o_result
) {

DFA_State * l_curr_state = m_initial_state;

t_iterator l_str = i_str;

for ( int i = 0; i < i_length; ++ i )
{
DFA_State * l_next_state = l_curr_state->FindOrInsertTransition( * l_str );

++ l_str;

l_curr_state = l_next_state;
}

if ( l_curr_state->m_has_val )
{
// We have a collision !
o_result = l_curr_state->m_terminal;
return true;
}

l_curr_state->m_terminal = i_kd;
l_curr_state->m_has_val = true;

#if 0
// actually test the scanner to make sure that we decode what we expect
// to decode
std::auto_ptr<DFA_Machine> l_machine( NewMachine() );

IteratorTranverser< t_iterator > l_tvsr( i_str, i_str + i_length );

const t_result * l_kd2 = l_machine->ScanStream( l_tvsr );

// assert( l_kd2 == i_kd );

return 0;
#endif
return false;

}
};

Scanner m_scanner;


public:

struct Machine
{

DFA_Machine * m_machine;

Machine()
: m_machine( 0 )
{
}

~Machine()
{
if ( m_machine )
{
delete m_machine;
}
}

bool HasInput( bool )
{
if ( m_machine )
{
return m_machine->HasInput( false );
}
return false;
}

bool GetChar( t_key_char & l_char )
{
if ( m_machine )
{
return m_machine->GetChar( l_char );
}
return false;
}

private:

// no copies allowed
Machine( const Machine & );
Machine & operator=( const Machine & );

};

template <typename TraverserType>
bool Traverse( Machine & i_machine, TraverserType & io_traverser, t_result & o_kd )
{
DFA_Machine * l_machine = i_machine.m_machine;

if ( ! l_machine )
{
l_machine = i_machine.m_machine = m_scanner.NewMachine();
}

return l_machine->DoScan( io_traverser, o_kd );

}


bool AddTerminal(
int i_length,
const t_key_char * i_str,
const t_result & i_kd,
t_result & o_result
) {

return m_scanner.AddTerminal( i_length, i_str, i_kd, o_result );

}

bool AddTerminal(
const t_key_char * i_str,
const t_result & i_kd,
t_result & o_result
) {

return m_scanner.AddTerminal( std::strlen( i_str ), i_str, i_kd, o_result );

}

template < typename t_container >
bool AddTerminal(
const t_container i_str,
const t_result & i_kd,
t_result & o_result
) {

return m_scanner.AddTerminal( i_str.size(), i_str.begin(), i_kd, o_result );

}

}; // SimpleScanner




#include <string>
#include <iostream>
#include <ostream>
#include <istream>

class NoisyStr
{
public:
std::string m_value;

NoisyStr()
: m_value( "unassigned" )
{
}

NoisyStr( const std::string & i_value )
: m_value( i_value )
{
}

NoisyStr( const char * i_value )
: m_value( i_value )
{
}

NoisyStr( const NoisyStr & i_value )
: m_value( i_value.m_value )
{
std::cout << "Copied " << m_value << "\n";
}

NoisyStr & operator=( const NoisyStr & i_value )
{
std::cout << "Assigned " << m_value;
m_value = i_value.m_value;
std::cout << " to " << m_value << "\n";
return * this;
}

const char * c_str()
{
return m_value.c_str();
}
};

typedef std::string KeyType;

int main()
{
SimpleScanner< char, KeyType > l_scanner;

KeyType l_collision;

l_scanner.AddTerminal( "abcde", "ZZ", l_collision );
l_scanner.AddTerminal( "xyz", "YY", l_collision );
l_scanner.AddTerminal( "dx_", "DX", l_collision );

static const char l_test[] = "abcde_test_abcdx_xyz";

std::cout << "scanning " << l_test << std::endl;


IteratorTranverser< const char *, char > l_trav( l_test, l_test + sizeof( l_test ) -1 );

SimpleScanner< char, KeyType >::Machine machine;

KeyType l_result;

while (true )
{
if ( l_scanner.Traverse( machine, l_trav, l_result ) )
{
std::cout << l_result.c_str();
}
else
{
char l_ch;

if ( ! machine.GetChar( l_ch ) )
{
if ( ! l_trav.GetChar( l_ch ) )
{
break;
}
}

std::cout << l_ch;

}
}

std::cout << std::endl;

} // main



  #26  
Old April 22nd, 2007, 01:55 PM
Richard Heathfield
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

Gianni Mariani said:
Quote:
Umesh wrote:
Quote:
>/*Calculate the no. of occurance of 'ab'. But is this one OK/
>EFFICIENT?*/
>#include<stdio.h>
>int main(void)
>{
>FILE *f;
>long int c,c1,ab=1;
>f=fopen("c:/1.txt","r");
>while ((c=getc(f))!=EOF && (c1=getc(f))!=EOF) {
>if(c=='a' && c1=='b') ab++;}
>fclose(f);
>printf("No. of 'ab' = %ld\n",ab);
>return 0;
>}
>>
>
That will not work, it will fail for sequences like 'xab'.
>
The code I attached below shows how you can create a state machine to
search for any sequence of characters. You'll need to adapt it to
read
from a file (should be trivial). It's not designed for speed.
Nor for brevity - not at almost 600 lines.

Nor, alas, did gcc like it very much:

foo.c:1: map: No such file or directory
foo.c:2: vector: No such file or directory
foo.c:3: memory: No such file or directory
foo.c:4: cassert: No such file or directory
foo.c:250: unterminated character constant
foo.c:484: string: No such file or directory
foo.c:485: iostream: No such file or directory
foo.c:486: ostream: No such file or directory
foo.c:487: istream: No such file or directory
make: *** [foo.o] Error 1

  #27  
Old April 22nd, 2007, 02:25 PM
Gianni Mariani
Guest
 
Posts: n/a
Default Re: No. of 'ab' or specific word in a text file

Richard Heathfield wrote:
Quote:
Gianni Mariani said:
>
Quote:
>Umesh wrote:
Quote:
>>/*Calculate the no. of occurance of 'ab'. But is this one OK/
>>EFFICIENT?*/
>>#include<stdio.h>
>>int main(void)
>>{
>>FILE *f;
>>long int c,c1,ab=1;
>>f=fopen("c:/1.txt","r");
>>while ((c=getc(f))!=EOF && (c1=getc(f))!=EOF) {
>>if(c=='a' && c1=='b') ab++;}
>>fclose(f);
>>printf("No. of 'ab' = %ld\n",ab);
>>return 0;
>>}
>>>
>That will not work, it will fail for sequences like 'xab'.
>>
>The code I attached below shows how you can create a state machine to
>search for any sequence of characters. You'll need to adapt it to
>read
>from a file (should be trivial). It's not designed for speed.
>
Nor for brevity - not at almost 600 lines.
>
Nor, alas, did gcc like it very much:
....

Try naming the file with a .cpp (C++ extension) and compiling it with a
C++ compiler.


  #28  
Old April 22nd, 2007, 03:25 PM
Richard Heathfield
Default Re: No. of 'ab' or specific word in a text file

Gianni Mariani said:
Quote:
Richard Heathfield wrote:
Quote:
>>
>Nor, alas, did gcc like it very much:
...
>
Try naming the file with a .cpp (C++ extension) and compiling it with
a C++ compiler.
No, thank you. If I want C++, I know where to find it - but I don't
expect to find it in comp.lang.c.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
  #29  
Old April 22nd, 2007, 05:05 PM
Army1987
Default Re: No. of 'ab' or specific word in a text file

"James Kanze" <james.kanze@gmail.com> wrote in message
news:1177237290.678769.185220@e65g2000hsc.googlegroups.com...
Quote:
In C, for this specific case, I'd probably write something like:
[snip]
Quote:
FILE* f = fopen( "somefile.txt", "r" ) ;
What's the point of putting f so far on the right?
Anyhow, I'd write FILE *f rather than FILE* f, or else when you
write FILE* f, g you'll be surprised by results.

[snip]
Quote:
exit( 2 ) ;
The behaviour of this is implementation-defined. On the DS9K,
exit(2) makes the program securely erase the whole disk on exit.
The standard way to do that is exit(EXIT_FAILURE);
Quote:
int prev = '\0' ;
You can use declarations after statements only in C99. No problem
if you have a C99-compliant compiler (gcc isn't). The same for
declarations within the guard of a for loop.
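
To make both points concrete, here is a rough sketch, assuming a made-up helper called count_chars (it is not James Kanze's snipped code). It keeps every declaration ahead of the first statement, so a C89 compiler accepts it, and it reports failure with EXIT_FAILURE instead of a bare number.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not from the thread: counts the characters in the
   file named by path and stores the total in *out. Returns EXIT_SUCCESS or
   EXIT_FAILURE, so the result can be passed to exit() or returned from main. */
int count_chars(const char *path, long *out)
{
    /* C89 style: all declarations come before the first statement. */
    FILE *f;
    long n = 0;
    int c;

    f = fopen(path, "r");
    if (f == NULL) {
        perror(path);
        return EXIT_FAILURE;   /* portable, unlike exit(2) */
    }
    while ((c = getc(f)) != EOF)
        n++;
    fclose(f);
    *out = n;
    return EXIT_SUCCESS;
}
```

The only status values the standard gives a meaning to are 0, EXIT_SUCCESS and EXIT_FAILURE; anything else is up to the implementation.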


  #30  
Old April 22nd, 2007, 05:05 PM
Army1987
Default Re: No. of 'a' in a text file

"Bruce !C!+" <aaniao002@163.com> wrote in message
news:1177244453.808150.103490@e65g2000hsc.googlegroups.com...
Quote:
Maybe it should be like this:
while(ch=getc(f)!=EOF)-----------while(ch=getc(f)&&ch!=EOF)
>
{ //do something
It'll stop if it hits a null character.
Try while ((ch = getc(f)) != EOF)
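
Spelled out as a sketch (the function name count_char is invented for illustration): != binds tighter than =, so without the inner parentheses ch receives the result of the comparison, 1 or 0, instead of the character. With the parentheses in place and ch declared as int, the loop runs until end of file and is not stopped by a null character in the data.

```c
#include <stdio.h>

/* Count how many times target occurs in an open stream. */
long count_char(FILE *f, int target)
{
    long count = 0;
    int ch;   /* int, not char: getc returns int so that EOF is distinguishable */

    /* Inner parentheses: assign first, then compare the assigned value. */
    while ((ch = getc(f)) != EOF) {
        if (ch == target)
            count++;
    }
    return count;
}
```

Since the count is a long, it should be printed with %ld, for example printf("No. of 'a' = %ld\n", n).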


  #31  
Old April 22nd, 2007, 05:25 PM
red floyd
Default Re: No. of 'ab' or specific word in a text file

Army1987 wrote:
Quote:
"James Kanze" <james.kanze@gmail.com> wrote in message
news:1177237290.678769.185220@e65g2000hsc.googlegroups.com...
>
Quote:
>In C, for this specific case, I'd probably write something like:
[snip]
Quote:
> FILE* f = fopen( "somefile.txt", "r" ) ;
What's the point of putting f so far on the right?
Anyhow, I'd write FILE *f rather than FILE* f, or else when you
write FILE* f, g you'll be surprised by results.
Well I wouldn't write it that way. I'd write it:

FILE *f; /* or FILE* f; */
FILE *g;

Single line declarations are probably the best.
Quote:
>
[snip]
Quote:
> exit( 2 ) ;
The behaviour of this is implementation-defined. On the DS9K,
exit(2) makes the program securely erase the whole disk on exit.
The standard way to do that is exit(EXIT_FAILURE);
>
Quote:
> int prev = '\0' ;
You can use declarations after statements only in C99. No problem
if you have a C99-compliant compiler (gcc isn't). The same for
declarations within the guard of a for loop.
>
>
  #32  
Old April 22nd, 2007, 06:35 PM
Alf P. Steinbach
Default Re: No. of 'ab' or specific word in a text file

* Army1987:
Quote:
"James Kanze" <james.kanze@gmail.com> wrote in message
news:1177237290.678769.185220@e65g2000hsc.googlegroups.com...
>
Quote:
>In C, for this specific case, I'd probably write something like:
[snip]
Quote:
> FILE* f = fopen( "somefile.txt", "r" ) ;
What's the point of putting f so far on the right?
Anyhow, I'd write FILE *f rather than FILE* f, or else when you
write FILE* f, g you'll be surprised by results.
The problem you encounter is that it's an ungood idea to have multiple
declarators in one declaration.

That's what you shouldn't be doing.

And when you're not doing that ungood thing, writing FILE* f makes sense.
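
A small C illustration of the surprise under discussion, using int rather than FILE so that it stands alone (the variable names are arbitrary): the * belongs to the declarator, not to the type, so only the first name becomes a pointer.

```c
int value = 42;

/* The * binds to the declarator p only: p is int*, q is a plain int. */
int* p = &value, q = 7;

/* One declarator per declaration leaves no room for doubt,
   whichever side the * is written on. */
int *r = &value;
int* s = &value;
```

With one name per declaration, both spellings mean the same thing and the choice is purely one of style.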

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
  #33  
Old April 22nd, 2007, 07:05 PM
Ernie Wright
Default Re: No. of 'ab' or specific word in a text file

Umesh wrote:
Quote:
I wanted to say that if I want to find no. of occurance of
'abcdefghijk' in a text file by modifying this program, it would be a<