String Interning and Thread Locking

Chris Mullins

I've spent some time recently looking into optimizing memory usage in
our products. Much of this was done through the use of string interning. I
took the time to check the numbers on both x86 and x64, and have published
the results here:
http://www.coversant.com/dotnetnuke/...=88&EntryID=24

The benefits for our SoapBox suite of products are pretty compelling,
memory-wise.

Before I roll the changes into our products, I have a very real concern:

The Intern Pool is (according to Richter) process-wide, and therefore shared
by any number of AppDomains. By implication, this also means all the threads
in the process share a single intern pool.

Our products are very heavily multi-threaded, and I'm worried that accessing
the Intern Pool from a number of threads is going to introduce significant
locking that I don't have any control over.

Does anyone know what locking algorithms the Intern Pool is using? Is it
Monitor semantics, Reader-Writer Lock semantics, or (I hope not) a remoting
construct to allow cross-AppDomain synchronization?

I don't have any visibility into the Intern Pool, and without that
visibility I'm very hesitant to use it. I poked around with Reflector, but
it ends up in an internal call method pretty quickly.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise
http://www.coversant.net/blogs/cmullins

Nov 8 '06 #1

Dave Hiniker - MSFT

Hi Chris,

There's a good chance that you can see for yourself in our shared source
release
(http://www.microsoft.com/downloads/d...displaylang=en)
by following that internal call or just searching for a string literal
implementation.

If we don't provide the implementation, I encourage you to write a
multi-threaded test with a representative number of readers and writers. I
believe you will find things very performant, and to check you could compare
against your own implementation of any of those locking schemes.
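
A minimal sketch of such a test, assuming a simple model in which a "write" is a call to String.Intern with a value the pool has not seen and a "read" is a call with a value it already holds. The class name InternContentionTest, the method Run, and the writeEvery parameter are made up for illustration; only String.Intern, Thread, ManualResetEvent and Stopwatch are real framework APIs. The timing also includes the cost of building each candidate string, so treat the numbers as relative, not absolute.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical harness: names and parameters are illustrative assumptions.
public static class InternContentionTest
{
    // Starts threadCount threads that each call String.Intern callsPerThread times.
    // One call in every writeEvery uses a value unique to that thread and iteration
    // (a pool "write"); the others cycle through 16 shared values (pool "reads").
    public static TimeSpan Run(int threadCount, int callsPerThread, int writeEvery)
    {
        if (writeEvery <= 0) throw new ArgumentOutOfRangeException("writeEvery");

        var threads = new Thread[threadCount];
        using (var start = new ManualResetEvent(false))
        {
            for (int t = 0; t < threadCount; t++)
            {
                int id = t; // copy so each closure sees its own thread index
                threads[t] = new Thread(() =>
                {
                    start.WaitOne(); // hold every thread until all are created
                    for (int i = 0; i < callsPerThread; i++)
                    {
                        string candidate = (i % writeEvery == 0)
                            ? "new-" + id + "-" + i
                            : "shared-" + (i % 16);
                        string.Intern(candidate);
                    }
                });
                threads[t].Start();
            }

            Stopwatch watch = Stopwatch.StartNew();
            start.Set(); // release all threads at once
            foreach (Thread thread in threads) thread.Join();
            watch.Stop();
            return watch.Elapsed;
        }
    }
}
```

Comparing Run(1, n, 10) with Run(Environment.ProcessorCount, n, 10) for the same n gives a rough idea of how the pool behaves under contention. Note that every "write" stays in the intern pool for the life of the process, so keep n modest.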

Hope this helps,
Dave

Nov 13 '06 #2

Chris Mullins

My issue is that there's no guarantee that the Shared Source implementation
and the production .NET implementation are the same.

Likewise, unless it's documented somewhere in the ECMA standards, there's no
guarantee it won't change out from under me.

Rather than relying on the built-in string interning, I'm just using a
custom implementation. This will also allow me to sweep the intern table
from time to time and remove unused strings. In a long-running server
process, the idea of a string table that grows without bound and provides
no mechanism for removing values just doesn't have very much appeal.
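
To make the idea concrete, here is a rough sketch of one way such a table could look. This is not the actual implementation described above; the class name SweepableInternTable, the generation counter, and the choice of a plain Monitor (lock) are all assumptions made for illustration. "Unused" is approximated as "not requested since the previous sweep".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical custom intern table; the design is an illustrative assumption.
public sealed class SweepableInternTable
{
    private sealed class Entry
    {
        public readonly string Value;      // the canonical instance
        public long LastUsedGeneration;    // generation in which it was last requested

        public Entry(string value, long generation)
        {
            Value = value;
            LastUsedGeneration = generation;
        }
    }

    private readonly object _gate = new object();
    private readonly Dictionary<string, Entry> _entries =
        new Dictionary<string, Entry>(StringComparer.Ordinal);
    private long _generation;

    // Returns the canonical instance for value, adding it if it is not present.
    public string Intern(string value)
    {
        if (value == null) throw new ArgumentNullException("value");
        lock (_gate)
        {
            Entry entry;
            if (_entries.TryGetValue(value, out entry))
            {
                entry.LastUsedGeneration = _generation;
                return entry.Value;
            }
            _entries.Add(value, new Entry(value, _generation));
            return value;
        }
    }

    // Removes entries that were not requested since the previous sweep,
    // then starts a new generation. Returns the number of entries removed.
    public int Sweep()
    {
        lock (_gate)
        {
            var stale = new List<string>();
            foreach (KeyValuePair<string, Entry> pair in _entries)
            {
                if (pair.Value.LastUsedGeneration < _generation) stale.Add(pair.Key);
            }
            foreach (string key in stale) _entries.Remove(key);
            _generation++;
            return stale.Count;
        }
    }

    public int Count
    {
        get { lock (_gate) { return _entries.Count; } }
    }
}
```

One trade-off of this sketch: a swept string may still be referenced elsewhere, so a later Intern call with an equal value returns a different instance. That costs some memory but does not break anything, as long as callers compare strings by value rather than by reference.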

--
Chris Mullins, MCSD.NET, MCPD:Enterprise
http://www.coversant.net/blogs/cmullins

Nov 14 '06 #3

Brian Gideon

Chris,

I'm certainly not an expert on the internal workings of the CLR, but if
I were absolutely forced to make a guess I would say that the intern
pool probably uses some sort of very fast low-lock (or even lock-free)
hashtable implementation. Though, a monitor-like implementation seems
reasonable. I think a reader-writer lock would be terribly slow in
this situation since the operation is not bound to IO or another blocking
resource. A remoting construct seems unlikely since the call quickly
dives into the internal CLR code, where it's not unreasonable to think
the standard AppDomain access rules do not apply.

Brian

Nov 14 '06 #4

Chris Mullins

Hi Brian,

I'm curious, what makes you think about the lock constructs that way?

For example, in a number of cases, lock-free is slower than locking. It's
only under certain conditions that these two reverse roles, and lock-free
becomes the quicker of the two.

Also, I'm curious as to why you think a reader-writer lock would be slower than
a monitor? There's nothing inherently I/O related to the reader-writer lock
that I know of. For an intern pool, I would imagine 90%+ of the operations
would be reads against a hashtable of some sort - which would imply a
reader-writer lock as an applicable algorithm. What am I missing here?

I agree with your assessment of "standard AppDomain access rules do not
apply", but it's always good to know for sure.

--
Chris Mullins

Nov 14 '06 #5

Brian Gideon

Chris,

Honestly, it's nothing more than a guess. And I'm more than willing to
be very wrong about it.

You bring up a good point that low-lock strategies can be slower in
some scenarios. I hadn't originally considered that.

Similarly, the reader-writer lock would only be faster in some
scenarios as well. It's been my experience (admittedly limited) that
reader-writer locks work best in scenarios where the critical section
spends some time in a wait state. In the case of the intern pool I
would be surprised if a reader-writer lock were faster than a monitor
even if reads outnumbered writes 10 to 1. It might be beneficial for
me to reexamine the ReaderWriterLock and compare it against the Monitor
to see in which scenarios it performs better and where the break-even
point is.
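
A rough sketch of how that comparison might be set up, assuming a shared Dictionary stands in for the intern pool's table. The class LockComparison and its method names are hypothetical; Monitor (via lock) and System.Threading.ReaderWriterLock are the real constructs being compared. The critical section here is just a dictionary access, with no waiting inside it, which is the case in question.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

// Hypothetical micro-benchmark; names and table size are illustrative assumptions.
public static class LockComparison
{
    private const int KeyCount = 1024;
    private static readonly Dictionary<int, string> Table = BuildTable();
    private static readonly object Gate = new object();
    private static readonly ReaderWriterLock RwLock = new ReaderWriterLock();

    private static Dictionary<int, string> BuildTable()
    {
        var table = new Dictionary<int, string>();
        for (int i = 0; i < KeyCount; i++) table[i] = "value";
        return table;
    }

    // Every access, read or write, takes the same exclusive monitor.
    public static void AccessWithMonitor(int key, bool write)
    {
        lock (Gate)
        {
            if (write) { Table[key] = "value"; }
            else { string ignored; Table.TryGetValue(key, out ignored); }
        }
    }

    // Reads share the lock; writes take it exclusively.
    public static void AccessWithReaderWriterLock(int key, bool write)
    {
        if (write)
        {
            RwLock.AcquireWriterLock(Timeout.Infinite);
            try { Table[key] = "value"; }
            finally { RwLock.ReleaseWriterLock(); }
        }
        else
        {
            RwLock.AcquireReaderLock(Timeout.Infinite);
            try { string ignored; Table.TryGetValue(key, out ignored); }
            finally { RwLock.ReleaseReaderLock(); }
        }
    }

    // Times threadCount threads performing opsPerThread accesses each,
    // where one access in every writeEvery is a write.
    public static TimeSpan Time(Action<int, bool> access, int threadCount,
                                int opsPerThread, int writeEvery)
    {
        if (writeEvery <= 0) throw new ArgumentOutOfRangeException("writeEvery");

        var threads = new Thread[threadCount];
        Stopwatch watch = Stopwatch.StartNew();
        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(() =>
            {
                for (int i = 0; i < opsPerThread; i++)
                {
                    access(i % KeyCount, i % writeEvery == 0);
                }
            });
            threads[t].Start();
        }
        foreach (Thread thread in threads) thread.Join();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Calling Time(LockComparison.AccessWithMonitor, 8, 1000000, 10) and then Time(LockComparison.AccessWithReaderWriterLock, 8, 1000000, 10), one after the other and never concurrently, approximates the 10-to-1 case. Varying writeEvery and threadCount should show where, if anywhere, the two cross over on a given machine.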

Brian

Nov 15 '06 #6
