473,385 Members | 1,707 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

multiset segfault


I'm seeing a bug at the moment that I can't track down. It's
part of a moderately large program, but here is a small program
that exhibits the bug on gcc. (The program code follows at the
bottom of the message.)
The symptom is this:

% g++ --version
g++ (GCC) 3.2.1
[...]
% g++ -W -Wall -pedantic test.cpp
% ./a.out
push_back(3402,tian).
push_back(3401,tian).
push_back(3405,wu)Segmentation fault
%

So it appears something is going wrong in multiset::insert.
I'm not very familiar with the STL, so I suppose I'm abusing
multiset in some way, but for the life of me I can't figure
out how, or how to fix it.
(The goal of the TerminalList class is to be a container
of Terminals, but to allow very fast access to the list of
Terminals containing a certain 'syllable' or 'unino'. I'm
using 'lower_bound' and 'upper_bound' to extract a vector
of Terminals with a given syllable from myStrHash; please
tell me if you can see a better way, but this is unrelated
to the segfault bug, which is why I snipped the code that
does all that from the program below.)

Thanks much,
-Arthur

==code begins==

#include <cstdio>
#include <cctype>
#include <vector>
#include <set>
#include <string>
using namespace std;

struct Terminal
{
int unino;
string syllable;
int d;

Terminal(int u, const string& y, int c):
unino(u), syllable(y), d(c)
{}
};

struct myStrPred {
bool operator() (Terminal* a, Terminal* b)
{ return (a->syllable < b->syllable); }
};

struct TerminalList
{
void push_back(const Terminal& t);

typedef multiset<Terminal*, myStrPred> StrHashT;
vector<Terminal> myVector;
StrHashT myStrHash;
};

void TerminalList::push_back(const Terminal& t)
{
myVector.push_back(t);
Terminal* pt = &myVector.back();

printf("push_back(%04x,%s)",
pt->unino, pt->syllable.c_str());
fflush(stdout);

myStrHash.insert(pt);

printf(".\n");
}

int main()
{
TerminalList characters;
characters.push_back(Terminal(0x3402, "tian", 1));
characters.push_back(Terminal(0x3401, "tian", 1));
characters.push_back(Terminal(0x3405, "wu", 3));
return 0;
}

Jul 22 '05 #1
10 2868
"Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> wrote...
I'm seeing a bug at the moment that I can't track down. It's
part of a moderately large program, but here is a small program
that exhibits the bug on gcc. (The program code follows at the
bottom of the message.)
The symptom is this:

% g++ --version
g++ (GCC) 3.2.1
[...]
% g++ -W -Wall -pedantic test.cpp
% ./a.out
push_back(3402,tian).
push_back(3401,tian).
push_back(3405,wu)Segmentation fault
%

So it appears something is going wrong in multiset::insert.
I'm not very familiar with the STL, so I suppose I'm abusing
multiset in some way, but for the life of me I can't figure
out how, or how to fix it.
The only way to fix it is to insert something that doesn't change
between calls to your 'push_back' function. I am nor really sure
what to recommend. An index should definitely be OK, but then
you need a mechanism to extract the values from the vector...
(The goal of the TerminalList class is to be a container
of Terminals, but to allow very fast access to the list of
Terminals containing a certain 'syllable' or 'unino'. I'm
using 'lower_bound' and 'upper_bound' to extract a vector
of Terminals with a given syllable from myStrHash; please
tell me if you can see a better way, but this is unrelated
to the segfault bug, which is why I snipped the code that
does all that from the program below.)

Thanks much,
-Arthur

==code begins==

#include <cstdio>
#include <cctype>
#include <vector>
#include <set>
#include <string>
using namespace std;

struct Terminal
{
int unino;
string syllable;
int d;

Terminal(int u, const string& y, int c):
unino(u), syllable(y), d(c)
{}
};

struct myStrPred {
bool operator() (Terminal* a, Terminal* b)
{ return (a->syllable < b->syllable); }
};

struct TerminalList
{
void push_back(const Terminal& t);

typedef multiset<Terminal*, myStrPred> StrHashT;
vector<Terminal> myVector;
StrHashT myStrHash;
};

void TerminalList::push_back(const Terminal& t)
{
myVector.push_back(t);
This invalidates all references and pointers to elements of that
vector _if_ reallocation occurs. In your case, reallocation does
not occur until the third element is inserted (probably).
Terminal* pt = &myVector.back();
Here you take an address to an element of the vector. That address
_can_ and _will_ be invalidated by the next push_back.

printf("push_back(%04x,%s)",
pt->unino, pt->syllable.c_str());
fflush(stdout);

myStrHash.insert(pt);
Here you attempt to preserve that address which can and will become
invalid next time you call this function.

printf(".\n");
}
I'd rewrite your class as

------------------------ Only changed parts
struct myStrPred {
vector<Terminal>& container;
myStrPred(vector<Terminal>& c) : container(c) {}
bool operator() (int i1, int i2)
{ return (container[i1].syllable < container[i2].syllable); }
};

struct TerminalList
{
void push_back(const Terminal& t);

typedef multiset<int, myStrPred> StrHashT;
vector<Terminal> myVector;
StrHashT myStrHash;

TerminalList() : myStrHash(myStrPred(myVector)) {}
};

void TerminalList::push_back(const Terminal& t)
{
int next_ind = myVector.size();
myVector.push_back(t);

printf("push_back(%04x,%s)",
myVector[next_ind].unino, myVector[next_ind].syllable.c_str());
fflush(stdout);

myStrHash.insert(next_ind);

printf(".\n");
}
--------------------------------------------------------

int main()
{
TerminalList characters;
characters.push_back(Terminal(0x3402, "tian", 1));
characters.push_back(Terminal(0x3401, "tian", 1));
characters.push_back(Terminal(0x3405, "wu", 3));
return 0;
}

Jul 22 '05 #2

"Arthur J. O'Dwyer" ha escrito:

[segfault stuff deleted]
(The goal of the TerminalList class is to be a container
of Terminals, but to allow very fast access to the list of
Terminals containing a certain 'syllable' or 'unino'. I'm
using 'lower_bound' and 'upper_bound' to extract a vector
of Terminals with a given syllable from myStrHash; please
tell me if you can see a better way, but this is unrelated
to the segfault bug, which is why I snipped the code that
does all that from the program below.)


I've written a library called Boost.MultiIndex, to be promptly released
in Boost 1.32, that can help you construct this sort of hybrid
containers. Documentation can be consulted at

http://boost-consulting.com/boost/li...doc/index.html

In particular, the section "A bidirectional list with fast lookup" in
the
tutorial shows how to construct a container that, AFAICS, can
effectively replace your TerminalList. Maybe this is helpful to you.

Regards,

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

Jul 22 '05 #3

On Thu, 17 Jun 2004, [iso-8859-1] Joaquín Mª López Muñoz wrote:

"Arthur J. O'Dwyer" ha escrito:
(The goal of the TerminalList class is to be a container
of Terminals, but to allow very fast access to the list of
Terminals containing a certain 'syllable' or 'unino'.


I've written a library called Boost.MultiIndex, to be promptly released
in Boost 1.32, that can help you construct this sort of hybrid
containers. Documentation can be consulted at

http://boost-consulting.com/boost/li...doc/index.html

In particular, the section "A bidirectional list with fast lookup" in
the tutorial shows how to construct a container that, AFAICS, can
effectively replace your TerminalList. Maybe this is helpful to you.


Yes and no. ;-( The 'multi_index_container' template *is* exactly
what I'm looking for; I certainly hope Boost becomes at least a
/de facto/ standard for C++ library support, if not part of the
standard itself! But as I understand it, to use 'multi_index_container'
I would need to download and install *all* of Boost (or at least the
large portion of it #included by multi_index_container.hpp) on all
the machines on which I intended to compile my program. And that's
a little much, for my humble purposes. :) [Being as there are three
of them, and one of them is a Linux system on which I have neither
administrative privileges nor megabytes of userspace to spare.]
Thank you for bringing it to my attention, though; I'm going to
have to start looking at what else Boost can do that I've been doing
by hand. Maybe one of these days it will become worth my while to
tie all my larger programs to Boost.

Meanwhile, I'm still wondering what's wrong with my use of
'multiset'...

-Arthur
Jul 22 '05 #4

"Arthur J. O'Dwyer" ha escrito:
[...]

Yes and no. ;-( The 'multi_index_container' template *is* exactly
what I'm looking for; I certainly hope Boost becomes at least a
/de facto/ standard for C++ library support, if not part of the
standard itself! But as I understand it, to use 'multi_index_container'
I would need to download and install *all* of Boost (or at least the
large portion of it #included by multi_index_container.hpp) on all
the machines on which I intended to compile my program.
Yep, that's true.
And that's
a little much, for my humble purposes. :) [Being as there are three
of them, and one of them is a Linux system on which I have neither
administrative privileges nor megabytes of userspace to spare.]
Well, there's a recurrent problem with the entry barriers to using
Boost, the main one being the huge size of the library.
But then again, if your main obstacle is how to convince your manager
to use Boost as part of the common development base, you can
put the matter like this: How long will it take you to craft your
TerminalList solution and debug it? A week? A week of your time
is orders of magnitude more expensive than the HD space taken
by Boost.

That said, there's an utility called bcp for extracting libraries
out of Boost, dragging their dependencies along:

http://boost-consulting.com/boost/tools/bcp/bcp.html

I can comment on how covnenient this tool is, but you might want to
have a look at it.


Thank you for bringing it to my attention, though; I'm going to
have to start looking at what else Boost can do that I've been doing
by hand. Maybe one of these days it will become worth my while to
tie all my larger programs to Boost.

Meanwhile, I'm still wondering what's wrong with my use of
'multiset'...


Ummm... The problem lies in that you're storing pointers to a
*vector* inside myStrHash. But these pointers are *not* stable,
the elements inside myVector will be relocated when its size
grows; in your code this happens at about the second insertion.
You have two solutions:

1. Use an std::list insiead of an std::vector. std::list references
are stable, i.e elements won't be relocated over time.
2. Redefine myStrHash to use indices (numbers) instead of
pointers.

I'd rather use #1, though.

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

Jul 22 '05 #5

On Thu, 17 Jun 2004, [iso-8859-1] Joaquín Mª López Muñoz wrote:

"Arthur J. O'Dwyer" ha escrito:
[re: don't want to install Boost everywhere myself]
Well, there's a recurrent problem with the entry barriers to using
Boost, the main one being the huge size of the library.
But then again, if your main obstacle is how to convince your manager
to use Boost as part of the common development base, you can
put the matter like this: How long will it take you to craft your
TerminalList solution and debug it? A week? A week of your time
is orders of magnitude more expensive than the HD space taken
by Boost.
*If* that were my main obstacle. ;) The program in question is
just a hobby project anyway.
That said, there's an utility called bcp for extracting libraries
out of Boost, dragging their dependencies along:

http://boost-consulting.com/boost/tools/bcp/bcp.html
I'll take a look.

Meanwhile, I'm still wondering what's wrong with my use of
'multiset'...


Ummm... The problem lies in that you're storing pointers to a
*vector* inside myStrHash. But these pointers are *not* stable,


Oh, bother! See, I knew it was something simple... :(
the elements inside myVector will be relocated when its size
grows; in your code this happens at about the second insertion.
You have two solutions:

1. Use an std::list insiead of an std::vector. std::list references
are stable, i.e elements won't be relocated over time.


Aha. This would be the easiest solution, then. Thanks.
For future reference, are there any other STL containers for
which the elements' addresses are guaranteed to be stable?
And I seem to recall that in a std::vector, the elements are
guaranteed to be in contiguous memory --- but this is not true
of std::list --- is it true of any other containers?

I've been relying mostly on the Dinkum C++ library reference
page and on Google for my STL information. Do you have any
suggestions of a better STL reference for the casual C++ user?
[Read: one that might manage to warn me about dumb mistakes like
that one.]

-Arthur
Jul 22 '05 #6


"Arthur J. O'Dwyer" ha escrito:
[...]
Meanwhile, I'm still wondering what's wrong with my use of
'multiset'...
Ummm... The problem lies in that you're storing pointers to a
*vector* inside myStrHash. But these pointers are *not* stable,


Oh, bother! See, I knew it was something simple... :(
the elements inside myVector will be relocated when its size
grows; in your code this happens at about the second insertion.
You have two solutions:

1. Use an std::list insiead of an std::vector. std::list references
are stable, i.e elements won't be relocated over time.


Aha. This would be the easiest solution, then. Thanks.
For future reference, are there any other STL containers for
which the elements' addresses are guaranteed to be stable?
And I seem to recall that in a std::vector, the elements are
guaranteed to be in contiguous memory --- but this is not true
of std::list --- is it true of any other containers?


std::vector is the only container guaranteeing that elements
are in contiguous memory.



I've been relying mostly on the Dinkum C++ library reference
page and on Google for my STL information. Do you have any
suggestions of a better STL reference for the casual C++ user?
[Read: one that might manage to warn me about dumb mistakes like
that one.]


Well,

* C++ FAQ Lite (http://www.parashift.com/c++-faq-lite/) contains
some sections on STL containers, and it is anyway worth reading the
whole contents even if not specifically dealing with STL.
* The reference at cppreference (http://www.cppreference.com/cpp_stl.html)
is a little more readable than Dinkumware's stuff.
* Thinking in C++ (http://www.bruceeckel.com/) is a C++ programming
guide downloadable for free. Vol II has some stuff on STL.

In general, if you google for "STL tutorial" you'll find many more
links. Beware to check the date of the docs you're reading: if it
is prior to 1998 (when the standard was published) then you'd rather
not use it, much info will be outdated.

HTH

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

Jul 22 '05 #7

On Thu, 17 Jun 2004, Arthur J. O'Dwyer wrote:
1. Use an std::list instead of an std::vector. std::list references
are stable, i.e elements won't be relocated over time.


Aha. This would be the easiest solution, then. Thanks.


FYI: changed from std::vector to std::list in the two places I
used pointers to elements [thus ickifying a couple of loops that
expected random access to the underlying vectors, but...], and
the program now works just fine![1] Thanks for your help!

http://www.contrib.andrew.cmu.edu/~ajo/workshop/hao.cpp

-Arthur

[1] - Modulo the ickiness of the source code; it's still a
very rough version. Refactoring is coming... ;)
Jul 22 '05 #8
"Joaquín Mª López Muñoz" <jo*****@tid.es> wrote in message
news:40***************@tid.es...
* The reference at cppreference (http://www.cppreference.com/cpp_stl.html)
is a little more readable than Dinkumware's stuff.


It has more tutorial material, to be sure, but it it also riddled
with inaccuracies and omissions. You'll want to supplement it
with a good reference.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
Jul 22 '05 #9

On Thu, 17 Jun 2004, P.J. Plauger wrote:

"Joaquín Mª López Muñoz" <jo*****@tid.es> wrote in message
news:40***************@tid.es...
* The reference at cppreference (http://www.cppreference.com/cpp_stl.html)
is a little more readable than Dinkumware's stuff.
It has more tutorial material, to be sure, but it it also riddled
with inaccuracies and omissions. You'll want to supplement it
with a good reference.


Yes; that was the site where I found
http://www.cppreference.com/cppmulti...ls.html#insert
^^^^^^^^ The function insert() either:
* inserts val after the element at pos, and returns an iterator
to that element.
* inserts a range of elements from start to end.
* inserts val, but only if val doesn't already exist. The return ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ value is an iterator to the element inserted, and a boolean describing
whether an insertion took place.


That was a little disconcerting. :)

-Arthur

Jul 22 '05 #10

This odious piece of code is your trouble....
void TerminalList::push_back(const Terminal& t)
{
myVector.push_back(t);
Terminal* pt = &myVector.back();

printf("push_back(%04x,%s)",
pt->unino, pt->syllable.c_str());
fflush(stdout);

myStrHash.insert(pt);

printf(".\n");
It is not a good idea to take addresses of objects inside a vector for the
simple
reason as you insert more objects the objects are relocated in memory, and
the
pointer values you are holding onto no longer point to those objects. When
you
do the insert(pt), the compare function you provide will eventually hit one
of these
invalid pt values and you're a gonner sunshine.

dave

"Arthur J. O'Dwyer" <aj*@nospam.andrew.cmu.edu> wrote in message
news:Pi***********************************@unix43. andrew.cmu.edu...
I'm seeing a bug at the moment that I can't track down. It's
part of a moderately large program, but here is a small program
that exhibits the bug on gcc. (The program code follows at the
bottom of the message.)
The symptom is this:

% g++ --version
g++ (GCC) 3.2.1
[...]
% g++ -W -Wall -pedantic test.cpp
% ./a.out
push_back(3402,tian).
push_back(3401,tian).
push_back(3405,wu)Segmentation fault
%

So it appears something is going wrong in multiset::insert.
I'm not very familiar with the STL, so I suppose I'm abusing
multiset in some way, but for the life of me I can't figure
out how, or how to fix it.
(The goal of the TerminalList class is to be a container
of Terminals, but to allow very fast access to the list of
Terminals containing a certain 'syllable' or 'unino'. I'm
using 'lower_bound' and 'upper_bound' to extract a vector
of Terminals with a given syllable from myStrHash; please
tell me if you can see a better way, but this is unrelated
to the segfault bug, which is why I snipped the code that
does all that from the program below.)

Thanks much,
-Arthur

==code begins==

#include <cstdio>
#include <cctype>
#include <vector>
#include <set>
#include <string>
using namespace std;

struct Terminal
{
int unino;
string syllable;
int d;

Terminal(int u, const string& y, int c):
unino(u), syllable(y), d(c)
{}
};

struct myStrPred {
bool operator() (Terminal* a, Terminal* b)
{ return (a->syllable < b->syllable); }
};

struct TerminalList
{
void push_back(const Terminal& t);

typedef multiset<Terminal*, myStrPred> StrHashT;
vector<Terminal> myVector;
StrHashT myStrHash;
};

void TerminalList::push_back(const Terminal& t)
{
myVector.push_back(t);
Terminal* pt = &myVector.back();

printf("push_back(%04x,%s)",
pt->unino, pt->syllable.c_str());
fflush(stdout);

myStrHash.insert(pt);

printf(".\n");
}

int main()
{
TerminalList characters;
characters.push_back(Terminal(0x3402, "tian", 1));
characters.push_back(Terminal(0x3401, "tian", 1));
characters.push_back(Terminal(0x3405, "wu", 3));
return 0;
}

Jul 22 '05 #11

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: joeboatertx | last post by:
I have some simple C++ code I am trying to compile using VS.NET 2003. (Below) Every time I try to compile I get numerous C2784 errors. If I simply comment out both of the chores.insert() lines,...
3
by: John Harrison | last post by:
If you insert equal values into a multiset is it specified what order they will have in the multiset. For instance, what should be the output of this program, is it defined? #include <set>...
1
by: SUPER_SOCKO | last post by:
I am new to STL. I don't know how to access a multiset of a list. My code is the following: #include <list> #include <iostream> #include <set> using namespace std;
5
by: Dale Marchand | last post by:
I'm trying to use an object in two different multiset containers, each with it's own sort method. For the most frequently used, I overrode the operator< method in the class, and for the second I...
2
by: =?iso-8859-1?q?Jo=E3o_Correia?= | last post by:
class CScore { public: int L; int C; CScore(int l, int c) { L = l; C = c;
9
by: neil.johnston | last post by:
I have a cut down example program that uses multiset to order some data. The data arrives from various sources and has a time stamp, data with identical timestamps can arrive and due to fifo's and...
2
by: parvtb | last post by:
Thanks for your patience to read the entire post to understand my confusion I have the user-defined class "tools": class tools{ public: tools( ......) {} friend bool operator<...
4
Digital Don
by: Digital Don | last post by:
Hi, I was looking at the MultiSet STL structure with some "less" keyword. I was told that we can use the "MultiSet" STL structure to automatically sort the content. Is it possible to store and...
3
by: orzeech | last post by:
Hi everyone! I have a following problem: #include<iostream> #include<set> using namespace std;
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.