468,512 Members | 1,369 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 468,512 developers. It's quick & easy.

Program to find occurences of a word in a file

Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of times
each unique word appears in a given file.
I tried my level best even after the quiz to come up with a solution
but cudnt find an efficient one. :(

This is what I did.

1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.

I was asked to do this without sorting.
Is there a better way to do this?
Neehar
Nov 20 '07 #1
18 3554
Neehar wrote:
Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of times
each unique word appears in a given file.
I tried my level best even after the quiz to come up with a solution
but cudnt find an efficient one. :(

This is what I did.

1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.

I was asked to do this without sorting.
Is there a better way to do this?
std::map<std::string,int>
Nov 20 '07 #2
On 2007-11-19 23:29:38 -0500, Neehar <ne************@gmail.comsaid:
Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of times
each unique word appears in a given file.
I tried my level best even after the quiz to come up with a solution
but cudnt find an efficient one. :(

This is what I did.

1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.

I was asked to do this without sorting.
Is there a better way to do this?
Not sure if it's better, but can't you use a std::map instead?

std::map<string, int wordlist;
string word;

while ((word = readword(cin)) != "")
{
wordlist[word] ++;
}
>

Neehar
--

-kira

Nov 20 '07 #3
Neehar wrote:
Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of times
each unique word appears in a given file.
I tried my level best even after the quiz to come up with a solution
but cudnt find an efficient one. :(

This is what I did.

1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.

I was asked to do this without sorting.
Is there a better way to do this?
std::map< std::string, unsigned long count;
std::string word;
while ( std::cin >word ) {
++ count[ word ];
}
Now, there are issues on deciding what a word is. The above just leaves that
to operator>>.
Best

Kai-Uwe Bux
Nov 20 '07 #4
On Mon, 19 Nov 2007 20:29:38 -0800, Neehar wrote:
I was asked to do this without sorting. Is there a better way to do
this?
This could eliminate the map based solutions as map does an implicit sort
on insertion. I would think a hash based data structure might do, like
Python's dictionaries?

--
Sohail Somani
http://uint32t.blogspot.com
Nov 20 '07 #5
On Tue, 20 Nov 2007 05:44:46 +0000, Sohail Somani wrote:
On Mon, 19 Nov 2007 20:29:38 -0800, Neehar wrote:
>I was asked to do this without sorting. Is there a better way to do
this?

This could eliminate the map based solutions as map does an implicit
sort on insertion. I would think a hash based data structure might do,
like Python's dictionaries?
Should have said why: hash tables have amortized complexity of O(1). I
don't have my CLR book handy otherwise I would give you a more
authoritative reference rather than try and remember back to my comp sci
classes :-)

--
Sohail Somani
http://uint32t.blogspot.com
Nov 20 '07 #6
On Nov 20, 7:42 am, Kai-Uwe Bux <jkherci...@gmx.netwrote:
Neehar wrote:
Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of times
each unique word appears in a given file.
I tried my level best even after the quiz to come up with a solution
but cudnt find an efficient one. :(
This is what I did.
1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.
I was asked to do this without sorting.
Is there a better way to do this?

std::map< std::string, unsigned long count;
std::string word;
while ( std::cin >word ) {
++ count[ word ];
}

Now, there are issues on deciding what a word is. The above just leaves that
to operator>>.
does map initialize values to zero for intrinsic types?

regards,
FM.
Nov 20 '07 #7
On Nov 20, 6:44 am, Sohail Somani <soh...@taggedtype.netwrote:
On Mon, 19 Nov 2007 20:29:38 -0800, Neehar wrote:
I was asked to do this without sorting. Is there a better way to do
this?
This could eliminate the map based solutions as map does an
implicit sort on insertion.
Sort of, and of course, std::map.
I would think a hash based data structure might do, like
Python's dictionaries?
std::unsorted_map, from the next version of the standard?

Another alternative, if you're going to implement the underlying
data structure yourself, would be a trie. Still, I'd start with
std::map< Word, int >. The real problem would be defining Word;
we've not been given enough information to do this. (If the
definition is just the usual meaning in non-technical English,
then I don't think the problem is solvable. On the other hand,
just any sequence of non-white space characters, separated by
white space, makes the solution so trivial it certainly wouldn't
be asked with 30 minutes to solve it.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Nov 20 '07 #8
terminator wrote:
On Nov 20, 7:42 am, Kai-Uwe Bux <jkherci...@gmx.netwrote:
>Neehar wrote:
Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of times
each unique word appears in a given file.
I tried my level best even after the quiz to come up with a solution
but cudnt find an efficient one. :(
This is what I did.
1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.
I was asked to do this without sorting.
Is there a better way to do this?

std::map< std::string, unsigned long count;
std::string word;
while ( std::cin >word ) {
++ count[ word ];
}

Now, there are issues on deciding what a word is. The above just leaves
that to operator>>.

does map initialize values to zero for intrinsic types?
[23.3.1.2/1]:

T& operator[](const key_type& x);
Returns: (*((insert(make_pair(x, T()))).first)).second.

Note that the default-value is T(). For unsigned long, that would be 0.
Best

Kai-Uwe Bux
Nov 20 '07 #9
On Tue, 20 Nov 2007 03:30:24 -0800, James Kanze wrote:
Another alternative, if you're going to implement the underlying data
structure yourself, would be a trie.
Yes of course. I will punish myself for not thinking of that first.
Unfortunately, the punishment will not be public.

--
Sohail Somani
http://uint32t.blogspot.com
Nov 20 '07 #10
On Nov 20, 3:23 pm, Kai-Uwe Bux <jkherci...@gmx.netwrote:
terminator wrote:
On Nov 20, 7:42 am, Kai-Uwe Bux <jkherci...@gmx.netwrote:
Neehar wrote:
Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of times
each unique word appears in a given file.
I tried my level best even after the quiz to come up with a solution
but cudnt find an efficient one. :(
This is what I did.
1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.
I was asked to do this without sorting.
Is there a better way to do this?
std::map< std::string, unsigned long count;
std::string word;
while ( std::cin >word ) {
++ count[ word ];
}
Now, there are issues on deciding what a word is. The above just leaves
that to operator>>.
does map initialize values to zero for intrinsic types?

[23.3.1.2/1]:

T& operator[](const key_type& x);
Returns: (*((insert(make_pair(x, T()))).first)).second.

Note that the default-value is T(). For unsigned long, that would be 0.
long l;
cout<<l<<endl;

you do not expect the above to print '0' for all runs of the program
do you?
I do not think that default ctor for intrinsic types do anything.
BTW I guess the default allocator sets every byte to zero.

regards,
FM.
Nov 22 '07 #11
On Nov 20, 3:23 pm, Kai-Uwe Bux <jkherci...@gmx.netwrote:
terminator wrote:
On Nov 20, 7:42 am, Kai-Uwe Bux <jkherci...@gmx.netwrote:
Neehar wrote:
Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of times
each unique word appears in a given file.
I tried my level best even after the quiz to come up with a solution
but cudnt find an efficient one. :(
This is what I did.
1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.
I was asked to do this without sorting.
Is there a better way to do this?
std::map< std::string, unsigned long count;
std::string word;
while ( std::cin >word ) {
++ count[ word ];
}
Now, there are issues on deciding what a word is. The above just leaves
that to operator>>.
does map initialize values to zero for intrinsic types?

[23.3.1.2/1]:

T& operator[](const key_type& x);
Returns: (*((insert(make_pair(x, T()))).first)).second.

Note that the default-value is T(). For unsigned long, that would be 0.
plz check this to get surprised:

#include <iostream>
#include <conio.h>//a dos header
using namespace std;

template <typename Tstruct testd{
T t;
testd():t(){};
};

int main(){
testd<longtl, *ptr=new testd<long;
cout << tl.t << endl;
cout << ptr->t << endl;
delete ptr;
getch();/*prevent the console from closing before viewing the
output*/
return 0;
};

regards,
FM.
Nov 22 '07 #12
terminator wrote:
On Nov 20, 3:23 pm, Kai-Uwe Bux <jkherci...@gmx.netwrote:
>terminator wrote:
On Nov 20, 7:42 am, Kai-Uwe Bux <jkherci...@gmx.netwrote:
Neehar wrote:
Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of
times each unique word appears in a given file.
I tried my level best even after the quiz to come up with a solution
but cudnt find an efficient one. :(
This is what I did.
1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.
I was asked to do this without sorting.
Is there a better way to do this?
> std::map< std::string, unsigned long count;
std::string word;
while ( std::cin >word ) {
++ count[ word ];
}
>Now, there are issues on deciding what a word is. The above just
leaves that to operator>>.
does map initialize values to zero for intrinsic types?

[23.3.1.2/1]:

T& operator[](const key_type& x);
Returns: (*((insert(make_pair(x, T()))).first)).second.

Note that the default-value is T(). For unsigned long, that would be 0.

long l;
cout<<l<<endl;

you do not expect the above to print '0' for all runs of the program
do you?
Nope. But the above does not default-initialize l. The reason is clause
[8.5/9]:

If no initializer is specified for an object, and the object is of
(possibly cv-qualified) non-POD class type (or array thereof), the object
shall be default-initialized; if the object is of const-qualified type,
the underlying class type shall have a user-declared default constructor.
Otherwise, if no initializer is specified for a nonstatic object, the
object and its subobjects, if any, have an indeterminate initial
value90); ...

With

long l;

you request an uninitialized long, and that is what you get.

I do not think that default ctor for intrinsic types do anything.
Please refer to [8.5/5]

To default-initialize an object of type T means:
? if T is a non-POD class type (clause 9), the default constructor for T
is called (and the initialization is ill-formed if T has no accessible
default constructor);
? if T is an array type, each element is default-initialized;
? otherwise, the object is zero-initialized.

However, that is of little relevance. The real question is, what is the
value of T() if T is a built-in type. To answer that, we need to know about
[8.5/7]:

An object whose initializer is an empty set of parentheses, i.e., (),
shall be value-initialized.

and about [5.2.3/2]:

The expression T(), where T is a simple-type-specifier (7.1.5.2) for a
non-array complete object type or the (possibly cv-qualified) void type,
creates an rvalue of the specified type, which is value-initialized
(8.5; no initialization is done for the void() case).

So that says that long() will be a value-initialized long. Now, look up
[8.5/5]:

To value-initialize an object of type T means:
? if T is a class type (clause 9) with a user-declared constructor (12.1),
then the default constructor for T is called (and the initialization is
ill-formed if T has no accessible default constructor);
? if T is a non-union class type without a user-declared constructor, then
every non-static data member and base-class component of T is
value-initialized;
? if T is an array type, then each element is value-initialized;
? otherwise, the object is zero-initialized

The last item applies and we have to look up zero-initialization in the same
clause:

To zero-initialize an object of type T means:
...
? if T is a scalar type (3.9), the object is set to the value of 0 (zero)
converted to T;
So, if you do:

std::cout << long() << '\n';

you are justified to expect 0.
Best

Kai-Uwe Bux
Nov 22 '07 #13
On Nov 22, 12:34 pm, Kai-Uwe Bux <jkherci...@gmx.netwrote:
terminator wrote:
On Nov 20, 3:23 pm, Kai-Uwe Bux <jkherci...@gmx.netwrote:
terminator wrote:
On Nov 20, 7:42 am, Kai-Uwe Bux <jkherci...@gmx.netwrote:
Neehar wrote:
Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of
times each unique word appears in a given file.
I tried my level best even after the quiz to come up with a solution
but cudnt find an efficient one. :(
This is what I did.
1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.
I was asked to do this without sorting.
Is there a better way to do this?
std::map< std::string, unsigned long count;
std::string word;
while ( std::cin >word ) {
++ count[ word ];
}
Now, there are issues on deciding what a word is. The above just
leaves that to operator>>.
does map initialize values to zero for intrinsic types?
[23.3.1.2/1]:
T& operator[](const key_type& x);
Returns: (*((insert(make_pair(x, T()))).first)).second.
Note that the default-value is T(). For unsigned long, that would be 0.
long l;
cout<<l<<endl;
you do not expect the above to print '0' for all runs of the program
do you?

Nope. But the above does not default-initialize l. The reason is clause
[8.5/9]:

If no initializer is specified for an object, and the object is of
(possibly cv-qualified) non-POD class type (or array thereof), the object
shall be default-initialized; if the object is of const-qualified type,
the underlying class type shall have a user-declared default constructor.
Otherwise, if no initializer is specified for a nonstatic object, the
object and its subobjects, if any, have an indeterminate initial
value90); ...

With

long l;

you request an uninitialized long, and that is what you get.
I do not think that default ctor for intrinsic types do anything.

Please refer to [8.5/5]

To default-initialize an object of type T means:
? if T is a non-POD class type (clause 9), the default constructor for T
is called (and the initialization is ill-formed if T has no accessible
default constructor);
? if T is an array type, each element is default-initialized;
? otherwise, the object is zero-initialized.

However, that is of little relevance. The real question is, what is the
value of T() if T is a built-in type. To answer that, we need to know about
[8.5/7]:

An object whose initializer is an empty set of parentheses, i.e., (),
shall be value-initialized.

and about [5.2.3/2]:

The expression T(), where T is a simple-type-specifier (7.1.5.2) for a
non-array complete object type or the (possibly cv-qualified) void type,
creates an rvalue of the specified type, which is value-initialized
(8.5; no initialization is done for the void() case).

So that says that long() will be a value-initialized long. Now, look up
[8.5/5]:

To value-initialize an object of type T means:
? if T is a class type (clause 9) with a user-declared constructor (12.1),
then the default constructor for T is called (and the initialization is
ill-formed if T has no accessible default constructor);
? if T is a non-union class type without a user-declared constructor, then
every non-static data member and base-class component of T is
value-initialized;
? if T is an array type, then each element is value-initialized;
? otherwise, the object is zero-initialized

The last item applies and we have to look up zero-initialization in the same
clause:

To zero-initialize an object of type T means:
...
? if T is a scalar type (3.9), the object is set to the value of 0 (zero)
converted to T;

So, if you do:

std::cout << long() << '\n';

you are justified to expect 0.

all what I get is that in case of default constructing ,intrinsic
rvalues go zero.

#include <iostream>
#include <conio.h>
using namespace std;

template <typename Tstruct testd{
T t;
testd():t(){};
};

int main(){
testd<longtl, *ptr=new testd<long;
cout <<"One\n"<< tl.t << endl;
cout <<"Two\n"<< ptr->t << endl;
delete ptr;

cout <<"Three\n"<< long() << endl;

long* l=new long;
cout <<"Four\n"<< l << endl;
new(l)long();
cout <<"Five\n"<< l << endl;

delete l;

cout<<"Six\n" << testd<long>().t <<endl;

getch();
return 0;
};

only the third output comes zero.
so in case the container uses placement new nop is done and it cannot
zero-default the value.
now I believe that it is the default allocator - not the ctor - who
resets the allocated bytes to zero .

regards,
FM.
Nov 23 '07 #14
terminator wrote:
On Nov 22, 12:34 pm, Kai-Uwe Bux <jkherci...@gmx.netwrote:
>terminator wrote:
On Nov 20, 3:23 pm, Kai-Uwe Bux <jkherci...@gmx.netwrote:
terminator wrote:
On Nov 20, 7:42 am, Kai-Uwe Bux <jkherci...@gmx.netwrote:
Neehar wrote:
Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of
times each unique word appears in a given file.
I tried my level best even after the quiz to come up with a
solution but cudnt find an efficient one. :(
This is what I did.
1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.
I was asked to do this without sorting.
Is there a better way to do this?
> std::map< std::string, unsigned long count;
std::string word;
while ( std::cin >word ) {
++ count[ word ];
}
>Now, there are issues on deciding what a word is. The above just
leaves that to operator>>.
does map initialize values to zero for intrinsic types?
>[23.3.1.2/1]:
> T& operator[](const key_type& x);
Returns: (*((insert(make_pair(x, T()))).first)).second.
>Note that the default-value is T(). For unsigned long, that would be
0.
long l;
cout<<l<<endl;
you do not expect the above to print '0' for all runs of the program
do you?

Nope. But the above does not default-initialize l. The reason is clause
[8.5/9]:

If no initializer is specified for an object, and the object is of
(possibly cv-qualified) non-POD class type (or array thereof), the
object shall be default-initialized; if the object is of
const-qualified type, the underlying class type shall have a
user-declared default constructor. Otherwise, if no initializer is
specified for a nonstatic object, the object and its subobjects, if
any, have an indeterminate initial value90); ...

With

long l;

you request an uninitialized long, and that is what you get.
I do not think that default ctor for intrinsic types do anything.

Please refer to [8.5/5]

To default-initialize an object of type T means:
? if T is a non-POD class type (clause 9), the default constructor for
T
is called (and the initialization is ill-formed if T has no
accessible default constructor);
? if T is an array type, each element is default-initialized;
? otherwise, the object is zero-initialized.

However, that is of little relevance. The real question is, what is the
value of T() if T is a built-in type. To answer that, we need to know
about
[8.5/7]:

An object whose initializer is an empty set of parentheses, i.e., (),
shall be value-initialized.

and about [5.2.3/2]:

The expression T(), where T is a simple-type-specifier (7.1.5.2) for a
non-array complete object type or the (possibly cv-qualified) void
type, creates an rvalue of the specified type, which is
value-initialized (8.5; no initialization is done for the void() case).

So that says that long() will be a value-initialized long. Now, look up
[8.5/5]:

To value-initialize an object of type T means:
? if T is a class type (clause 9) with a user-declared constructor
(12.1),
then the default constructor for T is called (and the initialization
is ill-formed if T has no accessible default constructor);
? if T is a non-union class type without a user-declared constructor,
then
every non-static data member and base-class component of T is
value-initialized;
? if T is an array type, then each element is value-initialized;
? otherwise, the object is zero-initialized

The last item applies and we have to look up zero-initialization in the
same clause:

To zero-initialize an object of type T means:
...
? if T is a scalar type (3.9), the object is set to the value of 0
(zero)
converted to T;

So, if you do:

std::cout << long() << '\n';

you are justified to expect 0.


all what I get is that in case of default constructing ,intrinsic
rvalues go zero.

#include <iostream>
#include <conio.h>
using namespace std;

template <typename Tstruct testd{
T t;
testd():t(){};
};

int main(){
testd<longtl, *ptr=new testd<long;
cout <<"One\n"<< tl.t << endl;
cout <<"Two\n"<< ptr->t << endl;
delete ptr;

cout <<"Three\n"<< long() << endl;

long* l=new long;
cout <<"Four\n"<< l << endl;
new(l)long();
cout <<"Five\n"<< l << endl;

delete l;

cout<<"Six\n" << testd<long>().t <<endl;

getch();
return 0;
};

only the third output comes zero.
That would be the line

cout <<"Three\n"<< long() << endl;

right? Well, that is all you need.

so in case the container uses placement new nop is done and it cannot
zero-default the value.
now I believe that it is the default allocator - not the ctor - who
resets the allocated bytes to zero .
I have given you all in the standard that answers your question:

[23.3.1.2/1]:

T& operator[](const key_type& x);
Returns: (*((insert(make_pair(x, T()))).first)).second.

Please parse that carefully. Note that the make_pair(x,T()) is evaluated and
then inserted into the map. The resulting iterator is then used to create a
reference to the second entry. Now, which value is it that gets inserted
into the map? Well, that is to ask: what is the value of

make_pair( x, T() )

for T = unsigned long. This reduces to figuring out what T() is for T =
unsigned long. And that problem, I have discussed up-thread. Also note that
this is exactcly the case from the third output of your program.
If you still think that the allocator just nulls the memory, I suggest you
implement your own allocator that does something different with the memory
on allocation. If it does the right thing in the constuct() method, you
should still get the right results.
Best

Kai-Uwe Bux
Nov 23 '07 #15
On Fri, 23 Nov 2007 04:06:38 -0800, terminator wrote:
all what I get is that in case of default constructing ,intrinsic
rvalues go zero.

#include <iostream>
#include <conio.h>
using namespace std;

template <typename Tstruct testd{
T t;
testd():t(){};
};

int main(){
testd<longtl, *ptr=new testd<long; cout <<"One\n"<< tl.t << endl;
cout <<"Two\n"<< ptr->t << endl;
delete ptr;

cout <<"Three\n"<< long() << endl;

long* l=new long;
cout <<"Four\n"<< l << endl;
new(l)long();
cout <<"Five\n"<< l << endl;

delete l;

cout<<"Six\n" << testd<long>().t <<endl;

getch();
return 0;
};

only the third output comes zero.
so in case the container uses placement new nop is done and it cannot
zero-default the value.
now I believe that it is the default allocator - not the ctor - who
resets the allocated bytes to zero .
Well, I tried it by myself and in all except fourth and fifth I got zero.
When I changed fourth and fifth to print pointee instead of pointer I
also got zeroes.
gcc 4.1.3
--
Tadeusz B. Kopec (tk****@NOSPAMPLEASElife.pl)
It's gonna be alright,
It's almost midnight,
And I've got two more bottles of wine.
Nov 23 '07 #16
On Nov 23, 3:54 pm, Kai-Uwe Bux <jkherci...@gmx.netwrote:
terminator wrote:
On Nov 22, 12:34 pm, Kai-Uwe Bux <jkherci...@gmx.netwrote:
terminator wrote:
On Nov 20, 3:23 pm, Kai-Uwe Bux <jkherci...@gmx.netwrote:
terminator wrote:
On Nov 20, 7:42 am, Kai-Uwe Bux <jkherci...@gmx.netwrote:
Neehar wrote:
Hello
For one of the interviews I took recently, I was given an offline
programming quiz.
In 30 minutes I had to write code in C++ to counts the number of
times each unique word appears in a given file.
I tried my level best even after the quiz to come up with a
solution but cudnt find an efficient one. :(
This is what I did.
1. Read all the words from the file and store them in a vector
container.
2. Sort the words in the vector container using most efficient
algorithm ( like Quicksort, Mergesort etc)
3. Then finding duplicates and occurences is very easy.
I was asked to do this without sorting.
Is there a better way to do this?
std::map< std::string, unsigned long count;
std::string word;
while ( std::cin >word ) {
++ count[ word ];
}
Now, there are issues on deciding what a word is. The above just
leaves that to operator>>.
does map initialize values to zero for intrinsic types?
[23.3.1.2/1]:
T& operator[](const key_type& x);
Returns: (*((insert(make_pair(x, T()))).first)).second.
Note that the default-value is T(). For unsigned long, that would be
0.
long l;
cout<<l<<endl;
you do not expect the above to print '0' for all runs of the program
do you?
Nope. But the above does not default-initialize l. The reason is clause
[8.5/9]:
If no initializer is specified for an object, and the object is of
(possibly cv-qualified) non-POD class type (or array thereof), the
object shall be default-initialized; if the object is of
const-qualified type, the underlying class type shall have a
user-declared default constructor. Otherwise, if no initializer is
specified for a nonstatic object, the object and its subobjects, if
any, have an indeterminate initial value90); ...
With
long l;
you request an uninitialized long, and that is what you get.
I do not think that default ctor for intrinsic types do anything.
Please refer to [8.5/5]
To default-initialize an object of type T means:
? if T is a non-POD class type (clause 9), the default constructor for
T
is called (and the initialization is ill-formed if T has no
accessible default constructor);
? if T is an array type, each element is default-initialized;
? otherwise, the object is zero-initialized.
However, that is of little relevance. The real question is, what is the
value of T() if T is a built-in type. To answer that, we need to know
about
[8.5/7]:
An object whose initializer is an empty set of parentheses, i.e., (),
shall be value-initialized.
and about [5.2.3/2]:
The expression T(), where T is a simple-type-specifier (7.1.5.2) for a
non-array complete object type or the (possibly cv-qualified) void
type, creates an rvalue of the specified type, which is
value-initialized (8.5; no initialization is done for the void() case).
So that says that long() will be a value-initialized long. Now, look up
[8.5/5]:
To value-initialize an object of type T means:
? if T is a class type (clause 9) with a user-declared constructor
(12.1),
then the default constructor for T is called (and the initialization
is ill-formed if T has no accessible default constructor);
? if T is a non-union class type without a user-declared constructor,
then
every non-static data member and base-class component of T is
value-initialized;
? if T is an array type, then each element is value-initialized;
? otherwise, the object is zero-initialized
The last item applies and we have to look up zero-initialization in the
same clause:
To zero-initialize an object of type T means:
...
? if T is a scalar type (3.9), the object is set to the value of 0
(zero)
converted to T;
So, if you do:
std::cout << long() << '\n';
you are justified to expect 0.
all what I get is that in case of default constructing ,intrinsic
rvalues go zero.
#include <iostream>
#include <conio.h>
using namespace std;
template <typename Tstruct testd{
T t;
testd():t(){};
};
int main(){
testd<longtl, *ptr=new testd<long;
cout <<"One\n"<< tl.t << endl;
cout <<"Two\n"<< ptr->t << endl;
delete ptr;
cout <<"Three\n"<< long() << endl;
long* l=new long;
cout <<"Four\n"<< l << endl;
new(l)long();
cout <<"Five\n"<< l << endl;
delete l;
cout<<"Six\n" << testd<long>().t <<endl;
getch();
return 0;
};
only the third output comes zero.

That would be the line

cout <<"Three\n"<< long() << endl;

right? Well, that is all you need.
so in case the container uses placement new nop is done and it cannot
zero-default the value.
now I believe that it is the default allocator - not the ctor - who
resets the allocated bytes to zero .

I have given you all in the standard that answers your question:

[23.3.1.2/1]:

T& operator[](const key_type& x);
Returns: (*((insert(make_pair(x, T()))).first)).second.

Please parse that carefully. Note that the make_pair(x,T()) is evaluated and
then inserted into the map. The resulting iterator is then used to create a
reference to the second entry. Now, which value is it that gets inserted
into the map? Well, that is to ask: what is the value of

make_pair( x, T() )
that does it.

thanks.
Nov 23 '07 #17
On Nov 23, 9:31 pm, "Tadeusz B. Kopec" <tko...@NOSPAMPLEASElife.pl>
wrote:
On Fri, 23 Nov 2007 04:06:38 -0800, terminator wrote:
all what I get is that in case of default constructing ,intrinsic
rvalues go zero.
#include <iostream>
#include <conio.h>
using namespace std;
template <typename Tstruct testd{
T t;
testd():t(){};
};
int main(){
testd<longtl, *ptr=new testd<long; cout <<"One\n"<< tl.t << endl;
cout <<"Two\n"<< ptr->t << endl;
delete ptr;
cout <<"Three\n"<< long() << endl;
long* l=new long;
cout <<"Four\n"<< l << endl;
new(l)long();
cout <<"Five\n"<< l << endl;
delete l;
cout<<"Six\n" << testd<long>().t <<endl;
getch();
return 0;
};
only the third output comes zero.
so in case the container uses placement new nop is done and it cannot
zero-default the value.
now I believe that it is the default allocator - not the ctor - who
resets the allocated bytes to zero .

Well, I tried it by myself and in all except fourth and fifth I got zero.
When I changed fourth and fifth to print pointee instead of pointer I
also got zeroes.
gcc 4.1.3
--
I do not claim if it was std behavior;I just point that my copy of
std::map zeros values eventhough the above statements on my platform
do not .And Kai-Uwe Bux just pointed out the reason.

regards,
FM.
Nov 23 '07 #18
On Nov 23, 9:31 pm, "Tadeusz B. Kopec" <tko...@NOSPAMPLEASElife.pl>
wrote:
On Fri, 23 Nov 2007 04:06:38 -0800, terminator wrote:
all what I get is that in case of default constructing ,intrinsic
rvalues go zero.
#include <iostream>
#include <conio.h>
using namespace std;
template <typename Tstruct testd{
T t;
testd():t(){};
};
int main(){
testd<longtl, *ptr=new testd<long; cout <<"One\n"<< tl.t << endl;
cout <<"Two\n"<< ptr->t << endl;
delete ptr;
cout <<"Three\n"<< long() << endl;
long* l=new long;
cout <<"Four\n"<< l << endl;
new(l)long();
cout <<"Five\n"<< l << endl;
delete l;
cout<<"Six\n" << testd<long>().t <<endl;
getch();
return 0;
};
only the third output comes zero.
so in case the container uses placement new nop is done and it cannot
zero-default the value.
now I believe that it is the default allocator - not the ctor - who
resets the allocated bytes to zero .

Well, I tried it by myself and in all except fourth and fifth I got zero.
When I changed fourth and fifth to print pointee instead of pointer I
also got zeroes.
silly me. but derefrencing did not zero mine.

best,
FM.
Nov 23 '07 #19

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

33 posts views Thread by Nick Evans | last post: by
9 posts views Thread by cppaddict | last post: by
2 posts views Thread by RadiationX | last post: by
66 posts views Thread by genestarwing | last post: by
4 posts views Thread by Dameon | last post: by
1 post views Thread by fmendoza | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.