While writing some code to demonstrate different local search
strategies, I found something kind of unusual. I suspect it's a bug in
my understanding rather than a bug in GCC, and I'm hoping someone here
can help me out.
The task is simple: solve the game Boggle using a pure hill-climbing
strategy, returning a tab-separated list of words as a char*. (I'm
interfacing with Python, so the char* is necessary; that seems to be
the easiest way to get Python strings back from C++ code.) When
reading the data out I wind up overrunning the end of the allocated
space, despite the fact that at first blush it appears I'm doing things
right.
I'm retyping the offending code here. For clarity I'm leaving off the
std:: prefix from some calls, but these should be obvious. The
following is a method, not a function; it's declared inline in the
header file.
=====
char* words() const
{
stringstream ss;
ostream_iterator<string> oi(ss, "\t");
copy(_wordset->begin(), _wordset->end(), oi);
// The stringstream has a trailing '\t' at the end of it
// which we're not going to copy. This will turn into
// a trailing '\0'.
char* rv = new char[ss.str().size()];
memset((void*) rv, 0, ss.str().size());
copy(ss.str().begin(), ss.str().end() - 1, rv);
return rv;
}
=====
On OS X, this code works as expected. On Win32 and Debian, SIGSEGV (or
its Windows equivalent) is caught, with the offending line being the
call to copy. Replacing the last two lines of the method with
string t = ss.str().substr(0, ss.str().size() - 1);
copy(t.begin(), t.end(), rv);
return rv;
.... makes everything work just fine, though.
Can anyone give me a clear, concise description of my error in
understanding? Or is this a bug in GCC? 12 8985
Robert J. Hansen wrote: stringstream ss;
[...] char* rv = new char[ss.str().size()]; memset((void*) rv, 0, ss.str().size()); copy(ss.str().begin(), ss.str().end() - 1, rv); return rv; }
That's a good one :-) Note that the 'str()' method of the string
stream returns the string *by value* that is, for each call to
'str()' you get a fresh copy. In the above code I spotted a total
of four copies which can easily turn into an unnecessary performance
problem even if the code happens to work for whatever unfortunate
reason you encounter (a condidate could be the use of some form of
reference counted copy where the resulting sequence is actually a
valid one). As a result of having multiple copies, you try to
iterate over an invalid range by using the begin iterator of one
sequence and the end iterator of another. Apart from the resulting
range being a different size you also run the danger of trying to
access arbitrary memory in between.
Thus, you should get the string only once and copy it from there.
I'd write the above portion of copying the code something like
below which also avoids the unnecessary call to 'memset()' which
merely assigns values which will be overridden anyway:
char* to_c_string(std::string const& str) {
char* rc = new char[str.size()];
*std::copy(str.begin(), str.end() - 1, rc) * 0;
return rc;
}
...
return to_c_string(ss.str());
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
> Note that the 'str()' method of the string stream returns the string *by value*
Thanks! This is exactly the thing I was missing--for some reason I
thought it was returning by reference. In light of this, your remarks
about unnecessary construction of string objects is well-taken,
although it's not a performance hit in this problem (the code executes
in sub-second time).
Once more, thanks for the reply!
On Thu, 09 Feb 2006 16:53:37 +0100, Dietmar Kuehl
<di***********@yahoo.com> wrote: Robert J. Hansen wrote: stringstream ss;[...] char* rv = new char[ss.str().size()]; memset((void*) rv, 0, ss.str().size()); copy(ss.str().begin(), ss.str().end() - 1, rv); return rv; }
That's a good one :-) Note that the 'str()' method of the string stream returns the string *by value* that is, for each call to 'str()' you get a fresh copy.
The main problem with the code is that it mixes low-level (memset,
char[]) and (insufficiently understood) high level constructs
(stringstream, string, copy). That mixture is always an indication for
problems in the code.
In the above code I spotted a total of four copies which can easily turn into an unnecessary performance problem even if the code happens to work for whatever unfortunate reason you encounter (a condidate could be the use of some form of reference counted copy where the resulting sequence is actually a valid one). As a result of having multiple copies, you try to iterate over an invalid range by using the begin iterator of one sequence and the end iterator of another. Apart from the resulting range being a different size you also run the danger of trying to access arbitrary memory in between.
Thus, you should get the string only once and copy it from there. I'd write the above portion of copying the code something like below which also avoids the unnecessary call to 'memset()' which merely assigns values which will be overridden anyway:
char* to_c_string(std::string const& str) { char* rc = new char[str.size()]; *std::copy(str.begin(), str.end() - 1, rc) * 0;
Don't know what the above line does ...
return rc; }
... return to_c_string(ss.str());
Of course, returning a new-ed object to the caller (delete by caller)
is bad style. Very bad style, IMO.
Best regards,
Roland Pibinger
On 9 Feb 2006 09:33:56 -0800, "Robert J. Hansen"
<ci********@gmail.com> wrote: Note that the 'str()' method of the string stream returns the string *by value*
Thanks! This is exactly the thing I was missing--for some reason I thought it was returning by reference.
You are right and the interface is wrong. There really should be a
'reference-iterface' instead of a 'value-interface'.
Best wishes,
Roland Pibinger
Roland Pibinger wrote: On Thu, 09 Feb 2006 16:53:37 +0100, Dietmar Kuehl <di***********@yahoo.com> wrote:Robert J. Hansen wrote: stringstream ss;[...] char* rv = new char[ss.str().size()]; memset((void*) rv, 0, ss.str().size()); copy(ss.str().begin(), ss.str().end() - 1, rv); return rv; }
That's a good one :-) Note that the 'str()' method of the string stream returns the string *by value* that is, for each call to 'str()' you get a fresh copy.
The main problem with the code is that it mixes low-level (memset, char[]) and (insufficiently understood) high level constructs (stringstream, string, copy). That mixture is always an indication for problems in the code.
Whether the mixture of code is problematic or not depends on the
user's requirements. The problem he is encountering, however, is
not related to this mixture but to the use of iterators to different
temporaries. char* to_c_string(std::string const& str) { char* rc = new char[str.size()]; *std::copy(str.begin(), str.end() - 1, rc) * 0;
Don't know what the above line does ...
Right... There is small but rather obvious typo: the last asterisk
should have been an assignment:
*std::copy(str.begin(), str.end() - 1, rc) = 0;
Sorry for that.
Of course, returning a new-ed object to the caller (delete by caller) is bad style. Very bad style, IMO.
I generally agree on this. However, as was explained in the original
article, the author considered this to be mandatory.
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
> the author considered this to be mandatory
I'd phrase it as "regrettably mandatory", myself. I really dislike how
unsafe and obnoxious-to-use C ways of doing things are the lowest
common denominator for inter-language communications, but for now that
appears to be the only game in town.
On the other hand, I haven't taken a look at Boost::Python yet. Maybe
I should.
On Fri, 10 Feb 2006 11:32:27 +0100, Dietmar Kuehl
<di***********@yahoo.com> wrote: char* to_c_string(std::string const& str) { char* rc = new char[str.size()]; *std::copy(str.begin(), str.end() - 1, rc) * 0;
Don't know what the above line does ...
Right... There is small but rather obvious typo: the last asterisk should have been an assignment:
*std::copy(str.begin(), str.end() - 1, rc) = 0;
Nitpicking ... you need to check for str.size() > 0.
On 10 Feb 2006 08:07:06 -0800, "Robert J. Hansen"
<ci********@gmail.com> wrote: the author considered this to be mandatory I'd phrase it as "regrettably mandatory", myself.
When (on which occasion) do you delete the returned char[]?
I really dislike how unsafe and obnoxious-to-use C ways of doing things are the lowest common denominator for inter-language communications, but for now that appears to be the only game in town.
At least, we have a common, non-proprietary, widely understood
'inter-language'.
Best wishes,
Roland Pibinger
> When ... do you delete the returned char[]?
This is Officially Not My Problem(tm). SWIG creates a wrapper which
takes the char* and takes responsibility for deleting it. Or, rather,
that's my understanding of what SWIG is doing; it wouldn't be
impossible for me to be mistaken.
On 10 Feb 2006 15:05:29 -0800, "Robert J. Hansen"
<ci********@gmail.com> wrote: When ... do you delete the returned char[]? This is Officially Not My Problem(tm). SWIG creates a wrapper which takes the char* and takes responsibility for deleting it.
That would be a surprise to me. How can SWIG know in which way the
char[] has been created? With new[], malloc, a custom allocator, as
static, ...?
Or, rather, that's my understanding of what SWIG is doing; it wouldn't be impossible for me to be mistaken.
Maybe a SWIG expert can comment ...
> How can SWIG know in which way the char[] has been created?
Just hazarding a guess, the "-c++" flag I pass to SWIG might be a good
clue to it that it should use delete.
Robert J. Hansen wrote: How can SWIG know in which way the char[] has been created?
Just hazarding a guess, the "-c++" flag I pass to SWIG might be a good clue to it that it should use delete.
Which one though? Here are the built-in candidates but these are not
the only ones which might exist in a program:
delete p;
delete[] p;
operator delete(p);
operator delete[](p);
My guess would be that it actually uses 'malloc(p)' in which case
the memory is better allocated differently...
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics
by: Ellarco |
last post by:
Im sorry for asking a question that is surely in the archives somewhere, but
I have been unable to locate it.
Its about string memory management. I need to dynamically construct a
C-style string...
|
by: Woodster |
last post by:
I am using std::stringstream to format a string. How can I clear the
stringstream variable I am using to "re use" the same variable?
Eg:
Using std::string
std::string buffer;
buffer =...
|
by: Dylan |
last post by:
Hi again,
In the following program I expected step 3 to assign the values 1 and
2 to iVal1 and iVal2 yet it appears ss.seekg(0, std::ios::beg) does
not move the read position back to the...
|
by: Marcin Kalicinski |
last post by:
Is there a vectorstream class that implements the functionality similar to
std::stringstream but with std::vector, not std::string?
cheers,
Marcin
|
by: Àî°× |
last post by:
hi all:
I want erase a stringstream' content for input new data to it.
example:
std::stringstream stm;
stm<<"this is a string";
std::cout<<stm.str();
// here print:this is a string
|
by: vigacmoe |
last post by:
I was trying to cast some strings to integers with stringstream, then
this strange problem poped up.
Here is my test code:
stringstream conv;
string from;
int to;
from = "1";
conv << from;
|
by: akitoto |
last post by:
Hi there,
I am using the std::stringstream to create a byte array in memory. But
it is not working properly. Can anyone help me?
Code:
#include <vector>
#include <sstream>
#include...
|
by: martinezfive |
last post by:
Hi,
I feel no doubt some documentation contains my answer, so bare with
me. Given the following:
#inclde <stdio.h>
#include <sstream>
void f()
{
std::stringstream a("Hello World!\n");
|
by: dominolog |
last post by:
Hello
I want to convert a char* buffer to a std::string containing a hex
description of it. I use a std::stringstream in the following manner:
std::tstring ToHex( const char* buffer, size_t...
|
by: ryjfgjl |
last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
|
by: BarryA |
last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
|
by: nemocccc |
last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
|
by: Sonnysonu |
last post by:
This is the data of csv file
1 2 3
1 2 3
1 2 3
1 2 3
2 3
2 3
3
the lengths should be different i have to store the data by column-wise with in the specific length.
suppose the i have to...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
|
by: Oralloy |
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers,...
|
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
|
by: Hystou |
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
| | |