I agree with you that the chance of a compiler change is slim, but if it
weren't, it would be the best-performing option available, with the easiest
maintenance and the least effort, for us at least :)
Yes, this problem is specific to the business, the types of data, and the
architecture in use. Most people aren't used to realtime programming with
constant datastreams, except maybe those involved in low-level networking
layers, hardware, some signal processing, and the financial industry
(probably plenty of others I left out, but it's still a very small subset of
developers dealing with this). We've actually been through all of those
kinds of ideas, and the best-performing solution in this scenario, but the
ugliest code, is an if statement checking the high-access fields and then
delegating to the standard hash for fields accessed less frequently (there's
a rough sketch of this below).
I'm sure there are some new ideas out there; we've come up with a lot here,
but none would perform better than either the if statement (which is ugly
and less efficient when very large) or the array lookup (which would require
the compiler change). Unless we just plain changed to direct member access,
but we usually have to relate those members back to field IDs... and if we
got away from that completely we would just have a lot of very specific
objects. What we were trying to build is a generic object that can optimize
specific fields and fall back for unknown fields: we are writing generic
data objects, but they need to be optimized for specific types of data
because the hash is too much overhead for our high access rate.
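To make that concrete, here's a rough sketch of the if-then-fallback shape
(the GenericMessage class, the field names, and the field IDs 1001/1002 are
all invented for illustration, not our real code):

    using System.Collections;

    // Sketch only: a generic data object with a couple of hard-coded
    // "hot" fields checked by plain ifs, and a Hashtable fallback for
    // everything else.
    public class GenericMessage
    {
        private object priceField;     // hot field, id 1001 (invented)
        private object quantityField;  // hot field, id 1002 (invented)
        private readonly Hashtable otherFields = new Hashtable();

        public object GetField(int fieldId)
        {
            // Ugly but fast: direct checks for the hot fields first...
            if (fieldId == 1001) return priceField;
            if (fieldId == 1002) return quantityField;
            // ...then the standard hash for everything else.
            return otherFields[fieldId];
        }

        public void SetField(int fieldId, object value)
        {
            if (fieldId == 1001) { priceField = value; return; }
            if (fieldId == 1002) { quantityField = value; return; }
            otherFields[fieldId] = value;
        }
    }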
We actually test our system under stress at data rates up to 20,000 or more
messages per second, and each message could be one of these objects with
100+ fields in a hash. Some messages may have as few as 10 fields stored in
the hash, but with disparate field IDs (the hash code could be any value).
So allocating an array sized to the maximum that enum value could reach,
which could be very high, for 20,000 messages per second would obviously be
out: even if the range were just 1000, allocating an array of 1000
references for each message would mean 1000*20,000=20,000,000 object
references * 4 bytes on 32-bit Intel, which would be 80MB. The other
indexing schemes would require a binary search, linear walk, or possibly a
skip list implementation, which are all slower than the if statement method
I mentioned (but a bit cleaner, I must say), and obviously slower than the
array lookup, since the other methods you mentioned (method #2) involve both
a search and a lookup.
We've got a pretty good handle on some exotic data structures, and we've
tried quite a few, including look-aside addressing as you mention, priority
heaps, skip lists, binary trees, red-black trees, custom hashes, priority
queues, etc. When it comes down to it, the fastest is no lookup at all,
which means non-generic, specific code; the next best is hard-coded checks
like ifs for small numbers of items (like jump tables, or ordered jump
tables); and, depending on the number of items, probably a direct memory
lookup such as an array lookup with a known size for each element. Anything
more complex will likely be slower, or if not, non-generic. The other idea
we tossed around was a hardware accelerator for certain memory accesses,
like a private cache we could control, with some special hardware
functionality, that could not be impacted by other threads or processes on
the machine. I don't think we'd have any better luck getting that
implemented than the compiler change, though :(
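For comparison, the array-lookup shape we keep wishing for looks roughly
like this; it only works when the IDs can be remapped into a small dense
range, which is exactly what our sparse field IDs prevent (the DenseMessage
name and the field count are assumptions, just to show the shape):

    // Sketch only: direct array lookup, viable only when field IDs fit
    // a small dense range 0..FieldCount-1. No search, just one index.
    public class DenseMessage
    {
        private const int FieldCount = 16;  // assumed dense ID range
        private readonly object[] slots = new object[FieldCount];

        public object GetField(int denseId)
        {
            return slots[denseId];
        }

        public void SetField(int denseId, object value)
        {
            slots[denseId] = value;
        }
    }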
"Bruce Wood" wrote:
As I mentioned, there are probably plenty of other ways to improve
performance, but none as simple or as straightforward as providing this
compiler option
I don't mean to be confrontational, but I see nothing simple or
straightforward about having Microsoft propose changes to a language
standard to ECMA, have it approved, and then have their army of
programmers implement a change to the C# language, while all the while
ensuring that this doesn't break any code that's already out there...
doesn't seem simple to me. :)
and since we see this problem come up all the time
Perhaps in your business. I've never had this problem before and never
been anywhere else where they had it. That's why I'm fishing around
trying to cook up another solution for you, because I think this is a
problem very specific to your domain, and your odds of having it
fixed by an addition to the language definition are about nil. Not that
I don't want you to succeed... I just think that it's highly unlikely.
:)
Since in the course of one second we could have as many as 13,000
messages, each requiring some of the fields, if we did what you had
earlier suggested and just burned memory, in one second we would easily
eat up 1.3MB, and that is not even the worst-case scenario.
You've completely lost me here. How can the number of messages arriving
chew up more and more memory in a mapping that is fixed at compile
time? The only memory lost by using a sparse array would be one pointer
(either 4 or 8 bytes, depending upon whether you're running 32-bit or
64-bit architecture) for each enum value that doesn't map to anything.
That loss is fixed, and not affected by any messages arriving. It's
just a different way of performing the mapping that your current hash
table gives you, unless I'm missing something. (Highly likely.)
My first two choices for an alternate data structure would be:
1. Replace the hash table by an array of pointers, and use the enum to
index directly into the array. As I pointed out, this is just the same
thing as your hash table, except that it wastes either 4 or 8 bytes for
each value that either has no equivalent enum or has an enum but that
enum doesn't map to anything (but then the latter would have overhead
in your compiler-change scenario as well). This is the classic
speed-for-memory tradeoff.
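Roughly, I'm picturing something like this (the Field enum, its values, and
the names are invented, just to show the shape):

    // Sketch only: a sparse array of references indexed directly by the
    // enum's numeric value; unmapped slots simply stay null.
    public enum Field { Price = 0, Quantity = 3, Symbol = 9, Max = 10 }

    public class SparseFieldMap
    {
        // One slot per possible enum value; each unused slot wastes
        // exactly one reference (4 or 8 bytes).
        private readonly object[] byEnum = new object[(int)Field.Max];

        public object Get(Field f)
        {
            return byEnum[(int)f];
        }

        public void Set(Field f, object value)
        {
            byEnum[(int)f] = value;
        }
    }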
2. Replace the hash table by an array of byte or short values
(depending upon the maximum number of enums that will map at any one
time). Look up the enum, get a value which is an index into another
array. This will cut the memory waste in half (or to a quarter), but at
the cost of limiting the number of enums that can map. This is an
unlikely solution unless you're running a 64-bit architecture and you
know that very few enums will be mapped at any one time.
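Again just a sketch with invented names: a byte per enum value indexes into
a small, densely packed array, with 0xFF standing in for "not mapped":

    // Sketch only: option 2 as a byte index table over a compact array.
    public class CompactFieldMap
    {
        private const byte NotMapped = 0xFF;

        private readonly byte[] indexByEnum;  // one byte per enum value
        private readonly object[] values;     // sized to the mapped count

        public CompactFieldMap(int enumRange, int mappedCount)
        {
            indexByEnum = new byte[enumRange];
            for (int i = 0; i < enumRange; i++)
            {
                indexByEnum[i] = NotMapped;
            }
            values = new object[mappedCount];
        }

        // Establish the mapping once, at setup time.
        public void Map(int enumValue, byte slot)
        {
            indexByEnum[enumValue] = slot;
        }

        public object Get(int enumValue)
        {
            byte slot = indexByEnum[enumValue];
            return slot == NotMapped ? null : values[slot];
        }
    }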