Hi,
I'm using a map with the key of type string, and value of type int.
typedef map<string, int, less<string> >concordance;
I'm finding words within some text and keeping a count of their
frequency.
I'm new to STL, but I've just read through the headers and can't see a
way to sort the results by value. (Maybe there's a way to modify less<>?)
Or... Should I create a second multimap (a copy) and have the int as the
key, and the string as the value, and then trivaially keep that sorted.
Or am I just using inappropriate container types to start with?
As I'm dealing with very large text files, speed is a concern too.
Thanks
Steve 10 7571
Steve Edwards wrote: Hi,
I'm using a map with the key of type string, and value of type int.
typedef map<string, int, less<string> >concordance;
I'm finding words within some text and keeping a count of their frequency.
I'm new to STL, but I've just read through the headers and can't see a way to sort the results by value. (Maybe there's a way to modify less<>?)
"Results" is a misleading word here. std::map<Key, Value> is used to
keep
one (composite) value per key. It's a container, and containers do not
have
results. Functions have results=return values.
Or... Should I create a second multimap (a copy) and have the int as the key, and the string as the value, and then trivaially keep that sorted.
That's one solution, if you often need that representation too, and the
keys
and values don't change. (basic caching pattern).
Or am I just using inappropriate container types to start with?
It's a good container to use when determining the int values, but it
may not
be the best container to keep those int values in.
As I'm dealing with very large text files, speed is a concern too.
Speed of what? Building the collection, or using it? And if the latter,
how?
(And if you talk files, you often are I/O limited anyway)
HTH,
Michiel Salters
In article <gf************ ***********@new s.btinternet.co m>,
Steve Edwards <gf*@lineone.ne t> wrote: Hi,
I'm using a map with the key of type string, and value of type int.
typedef map<string, int, less<string> >concordance;
I'm finding words within some text and keeping a count of their frequency.
I'm new to STL, but I've just read through the headers and can't see a way to sort the results by value. (Maybe there's a way to modify less<>?)
Or... Should I create a second multimap (a copy) and have the int as the key, and the string as the value, and then trivaially keep that sorted.
Or am I just using inappropriate container types to start with?
As I'm dealing with very large text files, speed is a concern too.
I'm think rather than a multimap, one could use "map<int, set<string> >"
That way, the strings will stay alphabetized.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment. I'm new to STL, but I've just read through the headers and can't see a way to sort the results by value. (Maybe there's a way to modify less<>?)
"Results" is a misleading word here. std::map<Key, Value> is used to keep one (composite) value per key. It's a container, and containers do not have
My mistake... I meant contents.
As I'm dealing with very large text files, speed is a concern too.
Speed of what? Building the collection, or using it? And if the latter, how? (And if you talk files, you often are I/O limited anyway)
I've already loaded my large text file in to memory as a simple array
of strings (one per word/token). I'm doing various lexical analyses on
these data. For some of these functions I need to quickly look up the
word as the key (hence my choice of <map>) to get the associated data;
other functions then require that the value mapped to the key can be
retrieved rapidly in order (hence my original question).
Since reading your reply I've built a copy as a multimap with the
key/value swapped, and using both containers together seems to be
working quite well in retrieving by either key or value rapidly.
(Though I don't have an alternative strategy to compare speed with, so
who knows.)
Building the structures is naturally slower, now, but it is an
acceptable tradeoff.
------------------------
In my original map
typedef map<string, int, less<string> >concordance;
every time I find another occurrence of the string key, is there a
quicker way to increment it's value count, than:
myConcordance[theWord] = myConcordance [theWord]+1;
It seems I'm doing 2 lookups of [theWord], can I change the value
in-place instead?
Thanks for your help.
In article <gf************ ***********@new s.btinternet.co m>,
Steve Edwards <gf*@lineone.ne t> wrote: In my original map
typedef map<string, int, less<string> >concordance;
every time I find another occurrence of the string key, is there a quicker way to increment it's value count, than:
myConcordance[theWord] = myConcordance [theWord]+1;
It seems I'm doing 2 lookups of [theWord], can I change the value in-place instead?
++myConcordance[theWord];
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.
In article <po************ *************** ***@news.east.e arthlink.net>,
"Daniel T." <po********@ear thlink.net> wrote: In article <gf************ ***********@new s.btinternet.co m>, Steve Edwards <gf*@lineone.ne t> wrote:
Hi,
I'm using a map with the key of type string, and value of type int.
typedef map<string, int, less<string> >concordance;
I'm finding words within some text and keeping a count of their frequency.
I'm new to STL, but I've just read through the headers and can't see a way to sort the results by value. (Maybe there's a way to modify less<>?)
Or... Should I create a second multimap (a copy) and have the int as the key, and the string as the value, and then trivaially keep that sorted.
Or am I just using inappropriate container types to start with?
As I'm dealing with very large text files, speed is a concern too.
I'm think rather than a multimap, one could use "map<int, set<string> >" That way, the strings will stay alphabetized.
Thanks, I'll try that, see if it's quicker for my lookup needs.
In article <gf************ ***********@new s.btinternet.co m>,
Steve Edwards <gf*@lineone.ne t> wrote: In article <po************ *************** ***@news.east.e arthlink.net>, "Daniel T." <po********@ear thlink.net> wrote:
I'm think rather than a multimap, one could use "map<int, set<string> >" That way, the strings will stay alphabetized.
Thanks, I'll try that, see if it's quicker for my lookup needs.
If nothing else, it will make it faster to determine how many words have
the same occurrence count.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.
In article <po************ *************** ***@news.east.e arthlink.net>,
"Daniel T." <po********@ear thlink.net> wrote: myConcordance[theWord] = myConcordance [theWord]+1;
++myConcordance[theWord];
Over 100 million insertions it dropped from 20s to 12.5s. Thanks.
"map<int, set<string> >"
If nothing else, it will make it faster to determine how many words have the same occurrence count.
That's a benefit, and it's cleaner to iterate through, too.
On 2006-02-21, Steve Edwards <gf*@lineone.ne t> wrote: Hi,
I'm using a map with the key of type string, and value of type int.
typedef map<string, int, less<string> >concordance;
I'm finding words within some text and keeping a count of their frequency.
I'm new to STL, but I've just read through the headers and can't see a way to sort the results by value. (Maybe there's a way to modify less<>?) > Or... Should I create a second multimap (a copy) and have the int as the key, and the string as the value, and then trivaially keep that sorted.
Or am I just using inappropriate container types to start with?
As I'm dealing with very large text files, speed is a concern too.
Here's another idea, which turned out to be more complicated than I at
first thought. The original plan was to store actual map iterators in
the vector, but then calling equal range became problematic, because I
had no iterator to pass in as the value to search for.
#include <iostream>
#include <iterator>
#include <vector>
#include <map>
#include <string>
#include <algorithm>
using std::map;
using std::string;
using std::vector;
using std::cout;
using std::pair;
bool less_first(pair <int, string> const& lhs, pair<int, string> const& rhs)
{
return lhs.first < rhs.first;
}
vector<pair<int , string> > mirror_map(map< string, int> const& m)
{
vector<pair<int , string> > mirror;
for (map<string, int>::const_ite rator i = m.begin(); i != m.end(); ++i)
{
mirror.push_bac k(pair<int, string>(i->second, i->first));
}
std::sort(mirro r.begin(), mirror.end(), less_first);
return mirror;
}
typedef vector<pair<int , string> >::iterator viter;
pair<viter, viter> word_range(vect or<pair<int, string> >& v, int n)
{
return std::equal_rang e(v.begin(),
v.end(),
pair<int, string>(n, ""),
less_first);
}
int main()
{
map<string, int> words;
words["love"] = 4;
words["like"] = 3;
words["the"] = 10;
words["hate"] = 4;
words["toodles"] = 1;
vector<pair<int , string> > mirror = mirror_map(word s);
// The whole mirror
for (viter i = mirror.begin() ; i != mirror.end() ; ++i)
{
cout << i->first << ": " << i->second << "\n";
}
std::cout << "----\n";
// The part of the mirror with value 4.
std::pair<viter , viter> range = word_range(mirr or, 4);
for (viter i = range.first ; i != range.second ; ++i)
{
cout << i->first << ": " << i->second << "\n";
}
return 0;
}
It is more lightweight than a multimap<int, string>, but not by as
much as I had originally hoped. I assumed that the map would not
change after it had been built up, i.e., you don't need the mirror
while building the map.
--
Neil Cerutti
In article <sl************ *********@FIAD0 6.norwich.edu>,
Neil Cerutti <le*******@emai l.com> wrote: Here's another idea, which turned out to be more complicated than I at first thought. The original plan was to store actual map iterators in the vector, but then calling equal range became problematic, because I had no iterator to pass in as the value to search for.
#include <iostream> #include <iterator> #include <vector> #include <map> #include <string> #include <algorithm>
using std::map; using std::string; using std::vector; using std::cout; using std::pair;
bool less_first(pair <int, string> const& lhs, pair<int, string> const& rhs) { return lhs.first < rhs.first; }
vector<pair<int , string> > mirror_map(map< string, int> const& m) { vector<pair<int , string> > mirror; for (map<string, int>::const_ite rator i = m.begin(); i != m.end(); ++i) { mirror.push_bac k(pair<int, string>(i->second, i->first)); } std::sort(mirro r.begin(), mirror.end(), less_first); return mirror; }
typedef vector<pair<int , string> >::iterator viter; pair<viter, viter> word_range(vect or<pair<int, string> >& v, int n) { return std::equal_rang e(v.begin(), v.end(), pair<int, string>(n, ""), less_first); }
int main() { map<string, int> words; words["love"] = 4; words["like"] = 3; words["the"] = 10; words["hate"] = 4; words["toodles"] = 1;
vector<pair<int , string> > mirror = mirror_map(word s);
// The whole mirror for (viter i = mirror.begin() ; i != mirror.end() ; ++i) { cout << i->first << ": " << i->second << "\n"; } std::cout << "----\n";
// The part of the mirror with value 4. std::pair<viter , viter> range = word_range(mirr or, 4); for (viter i = range.first ; i != range.second ; ++i) { cout << i->first << ": " << i->second << "\n"; } return 0; }
It is more lightweight than a multimap<int, string>, but not by as much as I had originally hoped. I assumed that the map would not change after it had been built up, i.e., you don't need the mirror while building the map.
Whoa... thanks for this... I've been staring at it for 20 minutes trying
to get my head around it (I'm new to the stl.) Sorry I can't say
anything more constructive at the moment, until I can understand whether
doing it this way is going to be beneficial. Not having any STL text
book, I'd only found <map> code snippets on the web that seemed to suit
my needs. Having read this and googled for 'pair', I'm amazed I got as
far as I did without understanding them. (Up until now I'd been using
Objective-C which has NSDictionary and similar structures, that hide the
guts of this kind of stuff behind a lot of really intuitive convenience
functions... but I prefer C++ so here I am!) This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: Diego Barros |
last post by:
Hello,
I was wondering if it was posibble to sort a map using the values and not
the keys. Sorting on the keys (if they were strings, for example) is
currently possible. What about, for example, if I have a map with string
keys and object values, can I have custom sorting of the items in the map
based on member variables of said object?
Any help would be appreciated. Thanks.
|
by: Aaron Walker |
last post by:
I want a std::map<std::string, std::string> but I don't want it sorted by keys.
I've been able to simulate this with a vector of pairs and operator.
class umap : public std::vector<std::pair<std::string, std::string > >
{
public:
std::string &operator (std::string s)
{
for (iterator i = begin() ; i != end() ; ++i)
if (i->first == s)
|
by: Ross |
last post by:
The following codes originally provided by a kind responder Toby A Inkster
in another newsgroup for working on displaying a table on web capable of
sorting are modified by me (a newbie). How to disable sorting by say, length
and chart (last few codes at the end)? I find the order is important to
display so cannot just single them out.
<?php
function insert_datatable_cmp ($a, $b) {
return ($a]<$b]) ? -1 : 1;
}
|
by: Tim Henderson |
last post by:
Hi
The question why are there no sorted dictionaries in python, seems to
pop up with unseeming regularity. That question in itself in
nonsensical sense dictionaries are hash-maps, however should python
have a sorted map type object is a good question.
clearly many people like have a sorted map, and sorting the keys every
time seems rather wasteful, as does keeping a separate sorted list of
the keys.
|
by: Gernot Frisch |
last post by:
Hi,
I know I can give a std::map a sorting function in the c'tor. However,
I would need to _change_ the sorting behaviour of an existing
instance. How would I do that?
Thank you,
--
| |
by: zaineb |
last post by:
Hi,
This is a follow-up of sort of this thread:
http://groups.google.com/group/comp.lang.c++/browse_thread/thread/de12af6539e0d645/662cab752779f1fa?lnk=st&q=c%2B%2B+vector+vs+map&rnum=1&hl=en#662cab752779f1fa
I've been chewing on this problem for a while. When I came across the
above thread, I figured that maybe other members have a better idea on
how to approach this.
I'm currently trying to decide between which one of maps or...
|
by: Sjaakie |
last post by:
Hi,
I'm, what it turns out to be, fooling around with 3-tier design.
At several websites people get really enthusiastic about using custom
dataobjects instead of datasets/-tables.
While trying to write such layers myself I got stuck on how to get
filtered or sorted data from the data-layer.
This is what I got:
Objects
|
by: tomek milewski |
last post by:
Hello,
I have a map with keys that can be compared each other. I need a
method that returns the lowest and the greatest key from that map.
Now I'm using begin() and rbegin() which gives iterators to first and
last element of the map.
My question is:
Does begin() and rbegin() guarantee that returned iterators point on
the pair with lowest and gratest value?
|
by: jabbah |
last post by:
I have some data in a map and I want to sort it. Currently I have implemented it like this:
#include <iostream>
#include <map>
#include <string>
using namespace std;
int main(){
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look !
Part I. Meaning of...
|
by: Oralloy |
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed.
This is as boiled down as I can make it.
Here is my compilation command:
g++-12 -std=c++20 -Wnarrowing bit_field.cpp
Here is the code in...
| |
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth.
The Art of Business Website Design
Your website is...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own....
Now, this would greatly impact the work of software developers. The idea...
|
by: conductexam |
last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one.
At the time of converting from word file to html my equations which are in the word document file was convert into image.
Globals.ThisAddIn.Application.ActiveDocument.Select();...
|
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
|
by: 6302768590 |
last post by:
Hai team
i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
| |
by: bsmnconsultancy |
last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...
| |