help with dirty hack

Nmx
Hi everyone,

I'm writing a patch to a search engine (aspseek
http://www.aspseek.org/) compiled under gcc 3.4.4 on FC3.

At some point, I found this piece of code:

--
// Dirty hack to avoid non-threadsafeness of string class
// We set ref to big value here so it will not reach 0
// Works only with GNU libstdc++ STL!
static long* ref = (long*)config_name.data() - 2;
*ref = 0x40000000;
--

I don't have a clue what this does, except that it has a really bad side
effect: if I change the value of any std::string initialized as an empty
string, then every string initialized the same way has its value modified.
If I strip out this piece of code, the side effect disappears.

Does anyone have a clue what this code really does?

Thanks in advance,

nmx

Jul 28 '06 #1
Nmx wrote:
> I'm writing a patch to a search engine (aspseek
> http://www.aspseek.org/) compiled under gcc 3.4.4 on FC3.
>
> At some point, I found this piece of code:
>
> --

Please do not use the two-dashes-on-a-line as your formatting element.
It's actually a standard separator for the signature, and many news (and
e-mail) programs remove everything below it when quoting. Now I am forced
to manually re-quote your text...

> // Dirty hack to avoid non-threadsafeness of string class
> // We set ref to big value here so it will not reach 0
> // Works only with GNU libstdc++ STL!
> static long* ref = (long*)config_name.data() - 2;
> *ref = 0x40000000;
> --
>
> I don't have a clue what this does, except that it has a really bad side
> effect: if I change the value of any std::string initialized as an empty
> string, then every string initialized the same way has its value modified.
> If I strip out this piece of code, the side effect disappears.
>
> Does anyone have a clue what this code really does?

If 'config_name' is an object of type 'std::string', then this hack
seems to rely on the implementation of the class that stores the *ref*
*count* two longs before the memory location of the buffer. Setting
the ref count to 4 * 2^28 helps to keep it alive or protect from any
reallocation, (maybe?) I have no real idea. You might want to consult
with GNU folks.
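
To make the pointer arithmetic concrete, here is a minimal sketch of a string buffer whose bookkeeping header sits just before the characters. ToyRep, toy_create, toy_rep, toy_release and pin_refcount are hypothetical names, and the two-long layout is an assumption picked so that the count lands two longs before the data. It is not the actual libstdc++ layout, which differs between versions.

```cpp
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical header stored immediately before the character data.
// NOT the real libstdc++ layout: field order and sizes vary by version.
struct ToyRep {
    long refcount;  // number of string objects sharing this buffer
    long capacity;  // characters the buffer can hold
};

// Allocate header and characters as one block; return the character pointer.
char* toy_create(const char* text) {
    std::size_t len = std::strlen(text);
    void* block = std::malloc(sizeof(ToyRep) + len + 1);
    if (!block) throw std::bad_alloc();
    ToyRep* rep = static_cast<ToyRep*>(block);
    rep->refcount = 1;
    rep->capacity = static_cast<long>(len);
    char* data = reinterpret_cast<char*>(rep + 1);
    std::memcpy(data, text, len + 1);
    return data;
}

// Step back over the header to reach the bookkeeping fields.
ToyRep* toy_rep(char* data) {
    return reinterpret_cast<ToyRep*>(data) - 1;
}

// Drop one reference; free the block when the count reaches zero.
// Returns true if the block was freed.
bool toy_release(char* data) {
    ToyRep* rep = toy_rep(data);
    if (--rep->refcount > 0) return false;
    std::free(rep);
    return true;
}

// Same arithmetic as the hack: treat the bytes before the data as longs
// and overwrite the one two slots back. In this toy layout that slot is
// the refcount, so ordinary releases never bring it down to zero.
void pin_refcount(char* data) {
    long* ref = reinterpret_cast<long*>(data) - 2;
    *ref = 0x40000000;
}
```

After pin_refcount(s), a call to toy_release(s) only decrements the count and the block stays allocated. The arithmetic silently hits whatever field happens to sit at that offset, so it is only as good as the layout it was written against.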

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Jul 28 '06 #2
Nmx wrote:
> // Dirty hack to avoid non-threadsafeness of string class
> // We set ref to big value here so it will not reach 0
> // Works only with GNU libstdc++ STL!
> static long* ref = (long*)config_name.data() - 2;
> *ref = 0x40000000;
> --
>
> I don't have a clue what this does, except that it has a really bad side
> effect: if I change the value of any std::string initialized as an empty
> string, then every string initialized the same way has its value modified.
> If I strip out this piece of code, the side effect disappears.
>
> Does anyone have a clue what this code really does?

Nobody answered in an hour, so I will provide the generic answer:

Take it out, and carefully review the rest of the code. Someone capable of
commenting such a diseased piece of code as a mild "hack" is capable of
writing anything and not commenting it.

The hack appears to reach into the secret shared string that's inside many
std::string representations. It sets the reference count to a huge number,
allowing the referred-to string data to leak. That, in turn, lets other
threads read the leaked data after its owning thread has destroyed it.

I don't know why this would affect other strings. The hack might have once
worked "better" than it does now. Your compiler, your input libraries, even
your linker settings may have changed since the hack worked. The symptom you
report might change if you compile a different way.

You need to perform the tiny bit of research that your predecessor deferred.
Find a forum that discusses the thread package this program uses, and ask it
how to pass a string between two threads. They will help you write something
simple and robust.
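
As a rough illustration of "simple and robust", here is a minimal sketch that hands a string from one thread to another through a mutex-protected slot, copying the characters on both sides so the two threads never share a buffer. It assumes C++11 threading facilities, which the gcc 3.4 toolchain in this thread does not have; with another thread package the same shape would use that package's mutex and condition variable. StringMailbox and round_trip are hypothetical names, not anything from aspseek.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical one-slot mailbox; the name and interface are illustrative only.
class StringMailbox {
public:
    // Store a private copy of s and wake the waiting thread.
    void put(const std::string& s) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            // Copy from pointer + length so the slot gets its own characters
            // even on a reference-counted string implementation.
            value_.assign(s.data(), s.size());
            ready_ = true;
        }
        cond_.notify_one();
    }

    // Block until a value is available, then return an independent copy.
    std::string take() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return ready_; });
        ready_ = false;
        return std::string(value_.data(), value_.size());
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::string value_;
    bool ready_ = false;
};

// Send text to a second thread and return what that thread received.
std::string round_trip(const std::string& text) {
    StringMailbox box;
    std::string received;
    std::thread consumer([&] { received = box.take(); });
    box.put(text);
    consumer.join();  // join orders the consumer's write before our read
    return received;
}
```

The copies cost a little, but no thread ever depends on another thread's string staying alive, so no reference count needs to be forged.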

Whatever the problems of this hack, you might get problems if you take it
out. The other thread might access the string after this thread destroys it.

In general, you must maintain this code's value without increasing your
burden of debugging it. Read /Working Effectively with Legacy Code/ by Mike
Feathers to learn to do that.

--
Phlip
http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
Jul 28 '06 #3
Nmx
Victor,

> Please do not use the two-dashes-on-a-line as your formatting element.
> It's actually a standard separator for the signature, and many news (and
> e-mail) programs remove everything below it when quoting. Now I am forced
> to manually re-quote your text...

I didn't realize. Sorry for that...

> If 'config_name' is an object of type 'std::string', then this hack
> seems to rely on the implementation of the class that stores the *ref*
> *count* two longs before the memory location of the buffer. Setting
> the ref count to 4 * 2^28 helps to keep it alive or protect from any
> reallocation, (maybe?) I have no real idea. You might want to consult
> with GNU folks.

Thanks for your reply.

Actually yes, config_name is a std::string object. I will check this
with the GNU guys. By the way, do you know how this can help with the
thread safety of the string class?

Thanks again,

nmx
Jul 28 '06 #4
> Actually yes, config_name is a std::string object. I will check this
> with the GNU guys. By the way, do you know how this can help with the
> thread safety of the string class?
> nmx

Please read my post.

Nmx wrote:
> Victor,
>> Please do not use the two-dashes-on-a-line as your formatting element.
>> It's actually a standard separator for the signature, and many news (and
>> e-mail) programs remove everything below it when quoting. Now I am forced
>> to manually re-quote your text...
>
> I didn't realize. Sorry for that...

The standard separator for the signature is "-- \n", with a space:

--
Phlip
http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
Jul 28 '06 #5
Nmx
Thanks for your explanation, Phlip.

> I don't know why this would affect other strings. The hack might have once
> worked "better" than it does now. Your compiler, your input libraries, even
> your linker settings may have changed since the hack worked. The symptom you
> report might change if you compile a different way.

Actually, this program was originally intended to compile under gcc 2.9;
I'm trying to make it work on gcc 3.4.
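
One possible explanation for the side effect, offered as a guess and not confirmed anywhere in this thread: if the newer library lets every empty string point at one shared buffer, and the hard-coded offset now lands on a different header field than it did under the old compiler, then a single write through the hack could make in-place modification of that shared buffer look legal. The toy model below reproduces that symptom. ToyRep, ToyString, shared_empty and corrupt_shared_capacity are all hypothetical names, and the layout is an assumption, not the library's real one.

```cpp
#include <cstring>

// Hypothetical model: header fields followed by a small fixed buffer.
struct ToyRep {
    long length;
    long capacity;   // 0 for the shared empty buffer: "no room, reallocate"
    char data[16];
};

// One static buffer stands in for the value every empty string points at.
static ToyRep shared_empty = {0, 0, ""};

class ToyString {
public:
    ToyString() : rep_(&shared_empty), owns_(false) {}
    ~ToyString() { if (owns_) delete rep_; }
    ToyString(const ToyString&) = delete;
    ToyString& operator=(const ToyString&) = delete;

    // Write in place when the header claims there is room; otherwise
    // switch to a private buffer first. Text is truncated to 15 chars.
    void assign(const char* text) {
        std::size_t len = std::strlen(text);
        if (len > 15) len = 15;
        if (static_cast<long>(len) > rep_->capacity) {
            if (owns_) delete rep_;
            rep_ = new ToyRep();   // value-initialized, so all fields start at 0
            rep_->capacity = 15;
            owns_ = true;
        }
        std::memcpy(rep_->data, text, len);
        rep_->data[len] = '\0';
        rep_->length = static_cast<long>(len);
    }

    const char* c_str() const { return rep_->data; }

private:
    ToyRep* rep_;
    bool owns_;
};

// Stand-in for the hack landing on the wrong field of the shared buffer.
void corrupt_shared_capacity() {
    shared_empty.capacity = 0x40000000;
}
```

Before corrupt_shared_capacity() runs, assigning to one empty ToyString gives it a private buffer and leaves the others empty. Afterwards, an assignment to any empty ToyString writes straight into shared_empty, and every other empty ToyString reads the new value.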

Thanks again and I'll spend some time reading that book.

nmx
Jul 28 '06 #6
