Code review requested, Accelerated C++

Joe Van Dyk

This is my answer to problem 5-10 from Accelerated C++. I'd appreciate
any comments (regarding style, efficiency, standards, etc) you fellows
have.

The resulting program seems reasonably fast, searching through a
dictionary of 300,000+ words in 3 seconds on my PowerBook G4.

// Write a program that finds all the palindromes in a dictionary,
// and also find the largest palindrome.

#include <iostream>
#include <vector>

using std::cout; using std::cin;
using std::string; using std::endl;
using std::vector;

bool is_palindrome(const string&);
bool compare_word_length(const string &, const string &);

int main()
{
cout << "This program will find all the palindromes in a dictionary "
"and will find the largest one." << std::endl;

string input;
vector<string> palindromes;

// Get the palindromes
while (cin >> input)
if (is_palindrome(input))
palindromes.push_back(input);

// Display the palindromes
cout << "\nThe palindromes were:\n";
vector<string>::const_iterator iter;
for (iter = palindromes.begin(); iter != palindromes.end(); ++iter)
{
cout << *iter << '\t';
}
cout << '\n';

// Sorting palindromes
sort (palindromes.begin(), palindromes.end(), compare_word_length);

// Display the shortest and longest one
string smallest = *palindromes.begin();
string largest = *(palindromes.end()-1);

cout << "\nThe shortest palindrome was '" << smallest << "'"
" and the longest one was '" << largest << "'\n";

cout << endl;

return 0;
}

bool is_palindrome(const string &input)
{
string::const_iterator begin, end;
begin = input.begin();
end = input.end() - 1;

while (begin < end)
{
if (*begin != *end)
return false;
begin++;
end--;
}
return true;
}

bool compare_word_length(const string &a, const string &b)
{
return (a.size() < b.size());
}

Jul 22 '05 #1

Subscribe Post Reply

2129

Joe Van Dyk

On 2004-10-16 17:26:45 -0700, Joe Van Dyk <jo*******@nospam.gmail.com> said:

This is my answer to problem 5-10 from Accelerated C++. I'd appreciate
any comments (regarding style, efficiency, standards, etc) you fellows
have.
<snip>

while (begin < end)
{
if (*begin != *end)
return false;
begin++;
end--;

Just realized this should be:
++begin;
--end;

Joe

Jul 22 '05 #2

E. Robert Tisdale

Joe Van Dyk wrote:

This is my answer to problem 5-10 from Accelerated C++. I'd appreciate
any comments (regarding style, efficiency, standards, etc) you fellows
have.

The resulting program seems reasonably fast, searching through a
dictionary of 300,000+ words in 3 seconds on my PowerBook G4.

// Write a program that finds all the palindromes in a dictionary,
// and also find the largest palindrome. cat main.cc // Write a program that finds all the palindromes
// in a dictionary and also find the largest palindrome.

#include <iostream>
#include <vector>

bool is_palindrome(const std::string& input) {
std::string::const_iterator begin = input.begin();
std::string::const_iterator end = input.end() - 1;

while (begin < end) {
if (*begin != *end)
return false;
++begin;
--end;
}
return true;
}

bool compare_word_length(const std::string &a,
const std::string &b) {
return (a.size() < b.size());
}

int main(int argc, char* argv[]) {
std::cout << std::endl <<
"This program will find and print all the palindromes\n"
"in a dictionary then finds and prints\n"
"the the smallest and largest ones." << std::endl;

std::string input;
std::vector<std::string> palindromes;

// Get the palindromes
while (std::cin >> input)
if (is_palindrome(input))
palindromes.push_back(input);

// Display the palindromes
std::cout << std::endl;
std::cout << "The palindromes were:\n";
for (std::vector<std::string>::const_iterator iter =
palindromes.begin(); iter != palindromes.end(); ++iter) {
std::cout << *iter << '\t';
}
std::cout << std::endl;

// Sorting palindromes
sort(palindromes.begin(),
palindromes.end(), compare_word_length);

// Display the shortest and longest one
std::string smallest = *palindromes.begin();
std::string largest = *(palindromes.end() - 1);

std::cout << std::endl;
std::cout << "The shortest palindrome was '" << smallest
<< "' and the longest one was '" << largest << "'.";

std::cout << std::endl;

return 0;
}
g++ -Wall -ansi -pedantic -o main main.cc
time ./main < /usr/share/dict/words
This program will find and print all the palindromes
in a dictionary then finds and prints
the the smallest and largest ones.

The palindromes were:
bib bob boob civic dad deed did dud \
eke ere ewe eye gag gig huh level \
madam non noon nun peep pep pip pop \
pup radar redder refer reviver rotator rotor sees \
sexes solos tit

The shortest palindrome was 'tit' \
and the longest one was 'rotator'.
0.303u 0.007s 0:00.31 96.7% 0+0k 0+0io 0pf+0w wc /usr/share/dict/words

45427 45427 409305 /usr/share/dict/words

Jul 22 '05 #3

John Harrison

"Joe Van Dyk" <jo*******@nospam.gmail.com> wrote in message
news:2004101617264416807%joevandyk@nospamgmailcom. ..

This is my answer to problem 5-10 from Accelerated C++. I'd appreciate
any comments (regarding style, efficiency, standards, etc) you fellows
have.

Basically looks fine to me. But one or two suggestions follow.
The resulting program seems reasonably fast, searching through a
dictionary of 300,000+ words in 3 seconds on my PowerBook G4.

// Write a program that finds all the palindromes in a dictionary,
// and also find the largest palindrome.

#include <iostream>
#include <vector>

using std::cout; using std::cin;
using std::string; using std::endl;
using std::vector;

bool is_palindrome(const string&);
bool compare_word_length(const string &, const string &);

int main()
{
cout << "This program will find all the palindromes in a dictionary "
"and will find the largest one." << std::endl;

string input;
vector<string> palindromes;

// Get the palindromes
while (cin >> input)
if (is_palindrome(input))
palindromes.push_back(input);

// Display the palindromes
cout << "\nThe palindromes were:\n";
vector<string>::const_iterator iter;
for (iter = palindromes.begin(); i ter!=palindromes.end ++iter)
{
cout << *iter << '\t';
}
cout << '\n';
You could consider this as a replacement for the above loop

copy(palindromes.begin (), palindromes.end (), ostream_iterator(cout,
"\t"));
cout << '\n';

You'd need to include <algorithm> for copy, and <iterator>, for
ostream_iterator.

// Sorting palindromes
sort (palindromes.begin(), palindromes.end,compare_word_length
// Display the shortest and longest one
string smallest = *palindromes.begin();
string largest = *(palindromes.end()-1);
I would add a check that you have at least one palindrome, otherwise you are
going to crash, and I'd use front() and back()

if (!palindromes.empty())
{
string smallest = palindromes.front();
string largest = palindromes.back();
...
}

cout << "\nThe shortest palindrome was '" << smallest << "'"
" and the longest one was '" << largest << "'\n";

cout << endl;

return 0;
}

bool is_palindrome(const string &input)
{
string::const_iterator begin, end;
begin = input.begin end = i nput.end-1
I think its better style to combine declaration and initialisation, in some
circumstances it is also more efficient.

string::const_iterator begin = input.begin ();
string::const_iterator end = input.end ()-1;

while (begin < end)
{
if (*begin != *end)
return false;
begin++;
end--;
}
return true;
}

bool compare_word_length(const string &a, const string &b)
{
return (a.size() < b.size());
}

john

Jul 22 '05 #4

Andrew Koenig

"Joe Van Dyk" <jo*******@nospam.gmail.com> wrote in message
news:2004101617264416807%joevandyk@nospamgmailcom. ..

The resulting program seems reasonably fast, searching through a
dictionary of 300,000+ words in 3 seconds on my PowerBook G4.

I haven't read the code in detail, but I did read enough of it to make a
strategic suggestion.

Your program works by making a vector of all of the palindromes, then
sorting it to determine the largest smallest. This technique is legitimate,
but takes substantiallly more space and time than needed.

Consider: If there are n palindromes, your program creates an n-element
vector, all the elements of which must fit in memory at the same time.
Moreover, sorting those elements takes time proportional to n*log(n).

I am not generally fond of optimizing programs too early, but I make a
distinction between local optimizations and writing code in a way that
changes the exponent in the expression of how much time or space a program
takes. In this case, for example, storing an n-element array means you have
a space exponent of 1 (that is, the amount of space is proportional to the
1st power of the amount of input), and if you can reduce that to 0, it may
well be worth doing.

The way to do it is to stop remembering every palindrome, and remember only
the shortest and longest you've seen so far. Each time you find a new one,
print it, check whether it's shorter than the current shortest or longer
than the current longest, and then throw it away.

This strategy has the advantage of running in time proportional to n, rather
than to n*log(n), and in space proportional to the longest word in the file,
rather than proportional to the sum of the lengths of all the palindromes.
That could be a big saving.

Of course you then give up the possibility of printing the palindromes in
alphabetical order, but I don't think you were doing that anyway.

Regards,

Andrew Koenig

Jul 22 '05 #5

Joe Van Dyk

On 2004-10-17 07:56:02 -0700, "Andrew Koenig" <ar*@acm.org> said:

"Joe Van Dyk" <jo*******@nospam.gmail.com> wrote in message
news:2004101617264416807%joevandyk@nospamgmailcom. ..
The resulting program seems reasonably fast, searching through a
dictionary of 300,000+ words in 3 seconds on my PowerBook G4.

I haven't read the code in detail, but I did read enough of it to make
a strategic suggestion.

Your program works by making a vector of all of the palindromes, then
sorting it to determine the largest smallest. This technique is
legitimate, but takes substantiallly more space and time than needed.

Consider: If there are n palindromes, your program creates an n-element
vector, all the elements of which must fit in memory at the same time.
Moreover, sorting those elements takes time proportional to n*log(n).

I am not generally fond of optimizing programs too early, but I make a
distinction between local optimizations and writing code in a way that
changes the exponent in the expression of how much time or space a
program takes. In this case, for example, storing an n-element array
means you have a space exponent of 1 (that is, the amount of space is
proportional to the 1st power of the amount of input), and if you can
reduce that to 0, it may well be worth doing.

The way to do it is to stop remembering every palindrome, and remember
only the shortest and longest you've seen so far. Each time you find a
new one, print it, check whether it's shorter than the current shortest
or longer than the current longest, and then throw it away.

This strategy has the advantage of running in time proportional to n,
rather than to n*log(n), and in space proportional to the longest word
in the file, rather than proportional to the sum of the lengths of all
the palindromes. That could be a big saving.

Of course you then give up the possibility of printing the palindromes
in alphabetical order, but I don't think you were doing that anyway.

Regards,

Andrew Koenig

Thanks for the idea. I tried it the way you suggested and it actually
took longer (4 seconds compared to 3 seconds on my first approach) to
do on a 300K word set. I think it's because of the repeated cout
calls, right? (i.e. we're calling cout everytime we find a palindrome,
and not once at the end.)

I also tried adding all the palindromes to a string (what's the best
way to do string concatenation?) and then printing that out at the end,
but that took about the same amount of time as my first approach.

Joe

Jul 22 '05 #6

Mike Wahler

"Joe Van Dyk" <jo*******@no.spam.gmail.com> wrote in message
news:2004101710234716807%joevandyk@nospamgmailcom. ..

Thanks for the idea. I tried it the way you suggested and it actually
took longer (4 seconds compared to 3 seconds on my first approach) to
do on a 300K word set. I think it's because of the repeated cout
calls, right?
Probably (but we can only guess without seeing the code). The only
way to know for sure is to test it with and without the outputs.
I/O is almost always the slowest part of a program, so it will tend
to distort timing results.
(i.e. we're calling cout everytime we find a palindrome,
and not once at the end.)

I also tried adding all the palindromes to a string (what's the best
way to do string concatenation?)
Use the std::string type's '+' operator. But note that expansion
of a string can result in reallocation and copying. However, there
are a few ways to alleviate that, e.g. first estimate the ultimate size
of the string and pass that value to 'std::string::reserve()' or
perhaps initially create the string (of e.g. characters with value zero)
with that size (via a constructor). But depending upon how many
palindromes you find, this could still result in a rather large
string (and could again impact performance due to e.g. paging on
a 'virtual memory' system).

But note that if your only goal is to find e.g. a 'largest' or
'smallest' item, there's no reason to save all the items. As
Francis suggested, you only need a single string to replace
with the 'current' 'largest' item as you iterate and compare.
and then printing that out at the end,
but that took about the same amount of time as my first approach.

Try several approaches, and Test, Test, Test! :-)

-Mike

Jul 22 '05 #7

David Rubin

Joe Van Dyk <jo*******@nospam.gmail.com> wrote in message news:<2004101617264416807%joevandyk@nospamgmailcom >...

This is my answer to problem 5-10 from Accelerated C++. I'd appreciate
any comments (regarding style, efficiency, standards, etc) you fellows
have.

The resulting program seems reasonably fast, searching through a
dictionary of 300,000+ words in 3 seconds on my PowerBook G4.

// Write a program that finds all the palindromes in a dictionary,
// and also find the largest palindrome.

#include <iostream>
#include <vector>

using std::cout; using std::cin;
using std::string; using std::endl;
using std::vector;

bool is_palindrome(const string&);
bool compare_word_length(const string &, const string &);

int main()
{
cout << "This program will find all the palindromes in a dictionary "
"and will find the largest one." << std::endl;

string input;
vector<string> palindromes;

[snip - store plandromes in the vector, then sort, and pull out the
largest one]

If you store input in this way, use a deque at the very least. It is
much more efficient. Storing into a vector (without pre-allocating
space) is O(n^2). But, given the problem specification, there is no
reason to either store or sort input. You just need to maintain the
"largest" palindrome and display each palindrome as you examine each
input. This reduces the time complexity to O(n), and the space
requirement to O(1). /david

Jul 22 '05 #8

David Rubin

"Andrew Koenig" <ar*@acm.org> wrote in message news:<6K*********************@bgtnsc04-news.ops.worldnet.att.net>...

"Joe Van Dyk" <jo*******@nospam.gmail.com> wrote in message
news:2004101617264416807%joevandyk@nospamgmailcom. ..
The resulting program seems reasonably fast, searching through a
dictionary of 300,000+ words in 3 seconds on my PowerBook G4.
^^^^^^^^^^

[snip] Of course you then give up the possibility of printing the palindromes in
alphabetical order, but I don't think you were doing that anyway.

Presumably, input from a dictionary is already sorted. Not that
"dictionary" implies a literal dictionary, even in the sense of an
ordered list of words, but it's not a bad assumption.

Jul 22 '05 #9

Andrew Koenig

"Joe Van Dyk" <jo*******@no.spam.gmail.com> wrote in message
news:2004101710234716807%joevandyk@nospamgmailcom. ..

Thanks for the idea. I tried it the way you suggested and it actually
took longer (4 seconds compared to 3 seconds on my first approach) to do
on a 300K word set. I think it's because of the repeated cout calls,
right? (i.e. we're calling cout everytime we find a palindrome, and not
once at the end.)

I also tried adding all the palindromes to a string (what's the best way
to do string concatenation?) and then printing that out at the end, but
that took about the same amount of time as my first approach.

Interesting. That suggests that the program spends most of its time in the
I/O library no matter what you do. Although that conclusion is not
surprising, it's also worth testing; it's a useful exercise to try to figure
out a way of doing so.

Jul 22 '05 #10

Andrew Koenig

"David Rubin" <da********@warpmail.net> wrote in message
news:82*************************@posting.google.co m...

If you store input in this way, use a deque at the very least. It is
much more efficient. Storing into a vector (without pre-allocating
space) is O(n^2).

No it isn't. std::vector is required to be implemented in such a way that
calling push_back n times on an initially empty vector executes in O(n)
time. Individual executions might take more than O(1), but the total time
for n calls must be O(n).

Jul 22 '05 #11

Mike Wahler

"Andrew Koenig" <ar*@acm.org> wrote in message
news:2n*****************@bgtnsc05-news.ops.worldnet.att.net...

"David Rubin" <da********@warpmail.net> wrote in message
news:82*************************@posting.google.co m...
If you store input in this way, use a deque at the very least. It is
much more efficient. Storing into a vector (without pre-allocating
space) is O(n^2).

No it isn't. std::vector is required to be implemented in such a way that
calling push_back n times on an initially empty vector executes in O(n)
time. Individual executions might take more than O(1), but the total time
for n calls must be O(n).

aka "amortized constant time".

-Mike

Jul 22 '05 #12

Joe Van Dyk

On 2004-10-17 10:51:44 -0700, "Mike Wahler" <mk******@mkwahler.net> said:

"Joe Van Dyk" <jo*******@no.spam.gmail.com> wrote in message
news:2004101710234716807%joevandyk@nospamgmailcom. ..

Thanks for the idea. I tried it the way you suggested and it actually
took longer (4 seconds compared to 3 seconds on my first approach) to
do on a 300K word set. I think it's because of the repeated cout
calls, right?
Probably (but we can only guess without seeing the code). The only
way to know for sure is to test it with and without the outputs.
I/O is almost always the slowest part of a program, so it will tend
to distort timing results.
(i.e. we're calling cout everytime we find a palindrome,
and not once at the end.)

I also tried adding all the palindromes to a string (what's the best
way to do string concatenation?)

Use the std::string type's '+' operator. But note that expansion
of a string can result in reallocation and copying. However, there
are a few ways to alleviate that, e.g. first estimate the ultimate size
of the string and pass that value to 'std::string::reserve()' or
perhaps initially create the string (of e.g. characters with value zero)
with that size (via a constructor). But depending upon how many
palindromes you find, this could still result in a rather large
string (and could again impact performance due to e.g. paging on
a 'virtual memory' system).

But note that if your only goal is to find e.g. a 'largest' or
'smallest' item, there's no reason to save all the items. As
Francis suggested, you only need a single string to replace
with the 'current' 'largest' item as you iterate and compare.

One of the program requirements was to print out all the palindromes in
the dictionary, so either I'd need to print them out as I found them or
save them somehow.

and then printing that out at the end,
but that took about the same amount of time as my first approach.

Try several approaches, and Test, Test, Test! :-)

-Mike

Jul 22 '05 #13

David Rubin

"Andrew Koenig" <ar*@acm.org> wrote in message news:<2n*****************@bgtnsc05-news.ops.worldnet.att.net>...

"David Rubin" <da********@warpmail.net> wrote in message
news:82*************************@posting.google.co m...
If you store input in this way, use a deque at the very least. It is
much more efficient. Storing into a vector (without pre-allocating
space) is O(n^2).

No it isn't. std::vector is required to be implemented in such a way that
calling push_back n times on an initially empty vector executes in O(n)
time. Individual executions might take more than O(1), but the total time
for n calls must be O(n).

Thanks.

Jul 22 '05 #14

Mike Wahler

"Joe Van Dyk" <jo*******@no.spam.gmail.com> wrote in message
news:2004101717094016807%joevandyk@nospamgmailcom. ..

One of the program requirements was to print out all the palindromes in
the dictionary, so either I'd need to print them out as I found them or
save them somehow.

OK, I didn't thoroughly review the requirments. Outputting as
you go would probably be faster, since you wouldn't have any
overhead caused by storing them.

-Mike

Jul 22 '05 #15

E. Robert Tisdale

Mike Wahler wrote:

Joe Van Dyk wrote:
One of the program requirements was to print out all the palindromes in
the dictionary, so either I'd need to print them out as I found them or
save them somehow.

OK, I didn't thoroughly review the requirments.
Outputting as you go would probably be faster,
since you wouldn't have any overhead caused by storing them.

I was surprised to see how few (35) palindromes there were in my

/usr/share/dict/words

dictionary. Evidently,
virtually *all* of the time is spent searching for palindromes.
It requires almost no time to maintain and sort such a short list.
In this case, Joe's original design was probably best.

Jul 22 '05 #16

Vyacheslav Kononenko

Joe Van Dyk wrote:

On 2004-10-17 07:56:02 -0700, "Andrew Koenig" <ar*@acm.org> said:
"Joe Van Dyk" <jo*******@nospam.gmail.com> wrote in message
news:2004101617264416807%joevandyk@nospamgmailcom. ..
The resulting program seems reasonably fast, searching through a
dictionary of 300,000+ words in 3 seconds on my PowerBook G4.

I haven't read the code in detail, but I did read enough of it to make
a strategic suggestion.

Your program works by making a vector of all of the palindromes, then
sorting it to determine the largest smallest. This technique is
legitimate, but takes substantiallly more space and time than needed.

Consider: If there are n palindromes, your program creates an
n-element vector, all the elements of which must fit in memory at the
same time. Moreover, sorting those elements takes time proportional to
n*log(n).

I am not generally fond of optimizing programs too early, but I make a
distinction between local optimizations and writing code in a way that
changes the exponent in the expression of how much time or space a
program takes. In this case, for example, storing an n-element array
means you have a space exponent of 1 (that is, the amount of space is
proportional to the 1st power of the amount of input), and if you can
reduce that to 0, it may well be worth doing.

The way to do it is to stop remembering every palindrome, and remember
only the shortest and longest you've seen so far. Each time you find
a new one, print it, check whether it's shorter than the current
shortest or longer than the current longest, and then throw it away.

This strategy has the advantage of running in time proportional to n,
rather than to n*log(n), and in space proportional to the longest word
in the file, rather than proportional to the sum of the lengths of all
the palindromes. That could be a big saving.

Of course you then give up the possibility of printing the palindromes
in alphabetical order, but I don't think you were doing that anyway.

Regards,

Andrew Koenig

Thanks for the idea. I tried it the way you suggested and it actually
took longer (4 seconds compared to 3 seconds on my first approach) to do
on a 300K word set. I think it's because of the repeated cout calls,
right? (i.e. we're calling cout everytime we find a palindrome, and not
once at the end.)

I also tried adding all the palindromes to a string (what's the best way
to do string concatenation?) and then printing that out at the end, but
that took about the same amount of time as my first approach.

Joe

How about to combine both methods? Keep palindromes in a vector but do
not sort it. Have two index variebles that will point to shortest and
longest palindrome in the vector.
--
Regards,
Slava

Jul 22 '05 #17

rossum

On Sun, 17 Oct 2004 17:23:47 GMT, Joe Van Dyk
<jo*******@no.spam.gmail.com> wrote:

[snip]

Thanks for the idea. I tried it the way you suggested and it actually
took longer (4 seconds compared to 3 seconds on my first approach) to
do on a 300K word set. I think it's because of the repeated cout
calls, right? (i.e. we're calling cout everytime we find a palindrome,
and not once at the end.) Looking at your original code you were calling cout inside a for loop.
It will be called once for each time round the loop, that is once for
each palindrome. I don't think that is the explanation.

rossum

I also tried adding all the palindromes to a string (what's the best
way to do string concatenation?) and then printing that out at the end,
but that took about the same amount of time as my first approach.

Joe

--

The ultimate truth is that there is no Ultimate Truth

Jul 22 '05 #18

Code review requested, Accelerated C++

Similar topics