String constructor returning interned string?

I've just noticed something rather odd and disturbing. The following
code displays "True":

using System;

class Test
{
    public static void Main(string[] args)
    {
        string x = new string ("".ToCharArray());
        string y = new string ("".ToCharArray());
        Console.WriteLine (object.ReferenceEquals (x, y));
    }
}

In other words, new string(...) is *not* returning a new string
reference.

This worries me - not so much for the specific example, but for the
precedent set. What other new ... expressions might return non-new
references? This could have significant implications in multi-
threading, where you may rely on two references being different for
locking purposes.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #1
This is odd...quite honestly. However, have you checked to make sure the JIT
isn't making some kind of very clever optimization here? Perhaps realizing
you're creating two strings from the same source (the char array of an already
interned string), and not creating the new object but instead setting both x
& y to the same reference?
Otherwise this is a quite disturbing find, indeed.

Nov 15 '05 #2
Daniel O'Connell <onyxkirx@--NOSPAM--comcast.net> wrote:
> > This worries me - not so much for the specific example, but for the
> > precedent set. What other new ... expressions might return non-new
> > references? This could have significant implications in
> > multi-threading, where you may rely on two references being different
> > for locking purposes.
>
> This is odd...quite honestly. However, have you checked to make sure the JIT
> isn't making some kind of very clever optimization here? Perhaps realizing
> you're creating two strings from the same source (the char array of an already
> interned string), and not creating the new object but instead setting both x
> & y to the same reference?

It only happens with the empty string, as far as I can see.

> Otherwise this is a quite disturbing find, indeed.

I think it's disturbing either way - basically, if you rely on the new
operator always returning a previously unknown reference, you've got
problems.

However, I've looked at the String(char[]) docs, and the remarks say:

<quote>
If value is a null reference (Nothing in Visual Basic) or contains no
element, an Empty instance is initialized.
</quote>

I suspect if I hadn't known what that meant beforehand (due to seeing
this) I wouldn't have understood it.

I'd have thought this would actually take more work, and that there
wouldn't really be that much benefit in it. I just wonder where else
this might be lurking...
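
To make the special case concrete, here is a minimal sketch that compares the references returned by new string(char[]) for an empty and a non-empty array. The class name EmptyStringCheck is made up for illustration. If the documented remarks hold, the two empty-array lines should report the same reference and the non-empty line should not, but the result may depend on the runtime version, so no output is shown.

```csharp
using System;

// Hypothetical demo class; not part of the original post.
class EmptyStringCheck
{
    public static void Main(string[] args)
    {
        // Two strings built from empty char arrays.
        string emptyA = new string(new char[0]);
        string emptyB = new string(new char[0]);

        // Two strings built from identical, non-empty char arrays.
        string helloA = new string("hello".ToCharArray());
        string helloB = new string("hello".ToCharArray());

        // Per the quoted remarks, the empty case is expected to yield
        // the shared Empty instance rather than a fresh object.
        Console.WriteLine("empty vs empty:        " + object.ReferenceEquals(emptyA, emptyB));
        Console.WriteLine("empty vs string.Empty: " + object.ReferenceEquals(emptyA, string.Empty));

        // Non-empty strings are expected to be distinct objects,
        // even though their contents are equal.
        Console.WriteLine("hello vs hello:        " + object.ReferenceEquals(helloA, helloB));
    }
}
```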

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #3
There's more on .NET's string interning over on Chris Brumme's blog:

http://blogs.gotdotnet.com/cbrumme/P...3-3d7a0dbba270

He even has some example code snippets to illustrate how it can bite you
when calling into unmanaged code.

/kel

Nov 15 '05 #4
Hi, Jon,
Null and empty input are the only cases.
This is to avoid having too many instances of the empty string in the CLR.
But I agree we should probably always create a new string in new String(...).

Gang Peng
[MS]

Nov 15 '05 #5
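
On the locking concern raised at the start of the thread, one way to sidestep the question is to never lock on a string at all. The following is a rough sketch, not anything proposed by the posters: it locks on a private object created with new object(), which nothing outside the class can reach. The Counter class and the padlock field name are assumptions made for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical example class; the names are illustrative only.
class Counter
{
    // A private object used only for locking. No other code can obtain
    // this reference, so no other code can accidentally lock on it.
    private readonly object padlock = new object();
    private int count;

    public void Increment()
    {
        lock (padlock)
        {
            count++;
        }
    }

    public int Count
    {
        get { lock (padlock) { return count; } }
    }

    public static void Main(string[] args)
    {
        Counter counter = new Counter();
        Thread[] threads = new Thread[4];

        for (int i = 0; i < threads.Length; i++)
        {
            // Each thread performs 1000 increments under the lock.
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < 1000; j++)
                {
                    counter.Increment();
                }
            });
            threads[i].Start();
        }

        // Wait for all threads to finish before reading the total.
        foreach (Thread t in threads)
        {
            t.Join();
        }

        Console.WriteLine(counter.Count); // 4 threads x 1000 increments = 4000
    }
}
```

Because the lock object is never a string, the behaviour of new string(...) for empty input has no effect on which threads contend for the lock.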
