By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
457,733 Members | 930 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 457,733 IT Pros & Developers. It's quick & easy.

Simple noobish question

P: n/a
I am having a problem with a small bit of code. I can't figure out why
it won't work. Here:
//code:
#include <iostream>
#include <fstream>
#include <ctype.h>
#include <string>

using namespace std;

int main()
{
ifstream file_in;
ofstream file_out;
string word_in;
int old_total = 0;
int new_total = 0;
int x = 0;

file_in.open("words.list");
file_out.open("words.out");

while ( !file_in.eof() )
{
getline(file_in, word_in);
old_total++;
int good = 0;
for ( int i = 0; i < word_in.size(); i++ )
{
if ( isalpha( word_in[i] ) != 0)
{
continue;
}
else
{
x = 1;
break;
}
}
if ( x == 0 )
{
file_out << word_in << endl;
new_total++;
}
else
{
continue;
}
}
cout << old_total << " lines in the original file." << endl;
cout << new_total << " lines in new file." << endl;
cout << old_total - new_total << " lines of difference." << endl;

return 0;
}
//code
All this is supposed to do is run through a list of words from one
file and put the one that only have the standard 26 letters of the
English alphabet into another file. It works fine until it hits the
first word with a non-english character, at which time it finishes and
exits. The output is this:

72411 lines in the original file.
589 lines in new file.
71822 lines of difference.

Obviously the new file should have a whole lot more lines than that.
At line 590 is the word Asunción. It is such a minor thing but it is
driving me up a wall none the less. Any comments would be much
appreciated.

Thanks

May 22 '07 #1
Share this Question
Share on Google+
4 Replies


P: n/a
th*************@gmail.com wrote:
I am having a problem with a small bit of code. I can't figure out why
it won't work. Here:
//code:
#include <iostream>
#include <fstream>
#include <ctype.h>
#include <string>

using namespace std;

int main()
{
ifstream file_in;
ofstream file_out;
string word_in;
int old_total = 0;
int new_total = 0;
int x = 0;
^^^^^^^^^^^^

Drop this declaration of 'x'.
>
file_in.open("words.list");
file_out.open("words.out");

while ( !file_in.eof() )
{
getline(file_in, word_in);
old_total++;
int good = 0;
I believe here should be the 'x', not 'good'.
for ( int i = 0; i < word_in.size(); i++ )
{
if ( isalpha( word_in[i] ) != 0)
{
continue;
}
else
{
x = 1;
break;
}
}
if ( x == 0 )
Since 'x' is never reset to 0, you're always falling into the
'continue' part.
{
file_out << word_in << endl;
new_total++;
}
else
{
continue;
}
}
cout << old_total << " lines in the original file." << endl;
cout << new_total << " lines in new file." << endl;
cout << old_total - new_total << " lines of difference." << endl;

return 0;
}
//code
All this is supposed to do is run through a list of words from one
file and put the one that only have the standard 26 letters of the
English alphabet into another file. It works fine until it hits the
first word with a non-english character, at which time it finishes and
exits. The output is this:

72411 lines in the original file.
589 lines in new file.
71822 lines of difference.

Obviously the new file should have a whole lot more lines than that.
At line 590 is the word Asunción. It is such a minor thing but it is
driving me up a wall none the less. Any comments would be much
appreciated.
Use tracing to see how your code executes. Or a debugger.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
May 22 '07 #2

P: n/a
* th*************@gmail.com:
I am having a problem with a small bit of code. I can't figure out why
it won't work. Here:
//code:
#include <iostream>
#include <fstream>
#include <ctype.h>
#include <string>

using namespace std;

int main()
{
ifstream file_in;
ofstream file_out;
string word_in;
This is really used to get a line: misleading name.

int old_total = 0;
int new_total = 0;
int x = 0;
The variable x is used as a boolean.

Declare it as 'bool'.

Use the values 'true' and 'false' instead of '1' and '0'.
file_in.open("words.list");
file_out.open("words.out");
Here you should check for failure opening the files.

while ( !file_in.eof() )
For the first iteration there has not yet been any attempt to read from
the file, and so the eof flag cannot have been set yet, and so it's
meaningless to check it here.

{
getline(file_in, word_in);
Here you should check whether getline fails, and not proceed if getline
fails (which can be due to end of file).

old_total++;
int good = 0;
Here 'x' should be set to '0' (or rather, 'false').

for ( int i = 0; i < word_in.size(); i++ )
{
if ( isalpha( word_in[i] ) != 0)
{
continue;
}
else
{
x = 1;
break;
}
}
The loop above would better be expressed as a separate function.

if ( x == 0 )
{
file_out << word_in << endl;
new_total++;
}
else
{
continue;
}
The else-part is superfluous.

}
cout << old_total << " lines in the original file." << endl;
cout << new_total << " lines in new file." << endl;
cout << old_total - new_total << " lines of difference." << endl;

return 0;
}
//code
All this is supposed to do is run through a list of words from one
file and put the one that only have the standard 26 letters of the
English alphabet into another file. It works fine until it hits the
first word with a non-english character, at which time it finishes and
exits. The output is this:

72411 lines in the original file.
589 lines in new file.
71822 lines of difference.

Obviously the new file should have a whole lot more lines than that.
At line 590 is the word Asunción. It is such a minor thing but it is
driving me up a wall none the less. Any comments would be much
appreciated.
See above.

By the way, please convert tab characters to spaces before posting code.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 22 '07 #3

P: n/a
<th*************@gmail.comwrote:

I am having a problem with a small bit of code. I can't figure out why
it won't work. Here:
//code:
#include <iostream>
#include <fstream>
#include <ctype.h>
#include <string>

using namespace std;

int main()
{
ifstream file_in;
ofstream file_out;
string word_in;
int old_total = 0;
int new_total = 0;
int x = 0;

file_in.open("words.list");
file_out.open("words.out");

while ( !file_in.eof() )
{
getline(file_in, word_in);
old_total++;
int good = 0;
for ( int i = 0; i < word_in.size(); i++ )
{
if ( isalpha( word_in[i] ) != 0)
{
continue;
}
else
{
x = 1;
break;

The character you mention is going to cause that break to exit the loop.
ISTM that this may be the first non-ASCII char in the file and this break
may not be what you want.
}
}
if ( x == 0 )
{
file_out << word_in << endl;
new_total++;
}
else
{
continue;
}
}
cout << old_total << " lines in the original file." << endl;
cout << new_total << " lines in new file." << endl;
cout << old_total - new_total << " lines of difference." << endl;

return 0;
}
//code
All this is supposed to do is run through a list of words from one
file and put the one that only have the standard 26 letters of the
English alphabet into another file. It works fine until it hits the
first word with a non-english character, at which time it finishes and
exits. The output is this:

72411 lines in the original file.
589 lines in new file.
71822 lines of difference.

Obviously the new file should have a whole lot more lines than that.
At line 590 is the word Asunción. It is such a minor thing but it is
driving me up a wall none the less. Any comments would be much
appreciated.

Thanks
May 22 '07 #4

P: n/a
On May 22, 6:45 pm, theronnights...@gmail.com wrote:
I am having a problem with a small bit of code. I can't figure out why
it won't work. Here:
There are a couple of problems.
//code:
#include <iostream>
#include <fstream>
#include <ctype.h>
#include <string>
using namespace std;
int main()
{
ifstream file_in;
ofstream file_out;
string word_in;
int old_total = 0;
int new_total = 0;
int x = 0;
file_in.open("words.list");
file_out.open("words.out");
while ( !file_in.eof() )
This line is definitely wrong. The *only* time it is useful to
check eof() is after a read has failed, to know if it failed
because you were at eof, or because of some other failure.

The standard idiom here would be:

while ( std::getline( file_in, word_in ) ) // ...

In production code, it's generally considered a good idea to
check bad() and eof() *after* the loop: bad() means you had a
hard read error, and !eof() means that there was a format error
in the file---it shouldn't happen with getline().
{
getline(file_in, word_in);
old_total++;
int good = 0;
for ( int i = 0; i < word_in.size(); i++ )
{
if ( isalpha( word_in[i] ) != 0)
This is undefined behavior. You can't legally call the single
argument version of isalpha with a char.

If I were writing this code, I'd use std::find_if, and a
predicate using the std::ctype facet. But I do this sort of
thing often enough that I've got the necessary predicate in my
tool box. For a simple loop, I'd write something like:

for ( std::string::const_iterator it = word_in.begin() ;
it != word_in.end()
&& isalpha( static_cast< unsigned char >( *it ) );
++ it ) {
}
if ( it == word_in.end() ) {
file_out << word_in << std::endl ;
++ new_total ;
}

Note that in general, the use of continue or break in a loop is
a sign of poor programming practice. Especially in this case:
you are using a standard algorithm, called linear search, and it
has a more or less standard implementation, easily recognized
and verified by anyone familiar with the idioms of the language.
{
continue;
}
else
{
x = 1;
break;
}
}
if ( x == 0 )
{
file_out << word_in << endl;
new_total++;
}
else
{
continue;
}
}
cout << old_total << " lines in the original file." << endl;
cout << new_total << " lines in new file." << endl;
cout << old_total - new_total << " lines of difference." << endl;
return 0;}
//code
All this is supposed to do is run through a list of words from one
file and put the one that only have the standard 26 letters of the
English alphabet into another file. It works fine until it hits the
first word with a non-english character, at which time it finishes and
exits. The output is this:
72411 lines in the original file.
589 lines in new file.
71822 lines of difference.
Obviously the new file should have a whole lot more lines than that.
At line 590 is the word Asunción. It is such a minor thing but it is
driving me up a wall none the less. Any comments would be much
appreciated.
Well, there are two obvious problems. The first is that once
you've set x, you never reset it, so it is non-zero for the rest
of the program. Of course, to begin with, it shouldn't be an
int, but a bool, with true and false. And since it is only used
within the loop, it should be declared (and initialized) within
the loop, and not outside. But in the end, if you're writing
idiomatic C++, there's no need for it what so ever.

The second problem is that you are calling isalpha with a char.
If char is signed on your implementation (it usually is), then
it is almost certain that the encoding for ó results in a
negative value, which in turn results in undefined behavior when
calling the single parameter version of isalpha. If you use
this version, you absolutely *must* cast the char to unsigned
char. Otherwise, it's undefined behavior. (With most
implementations, you'll get random results---some accented
characters may return true.)

Note too that whether isalpha( 'ó' ) returns true or false
depends on the locale. By default, here, you are in "C" locale,
where it is guaranteed to return false, but it's something you
generally want to be careful with; at least where I live, real
programs very rarely leave the default locale "C". This is
another advantage of using the std::ctype facet---you can
specify any locale you want at the call site, thus avoiding any
risk of the global locale not being the one you want.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

May 23 '07 #5

This discussion thread is closed

Replies have been disabled for this discussion.