473,850 Members | 2,093 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Hashing function (possibly OT)

Hi,

I apologise in advance if this is off topic, but would appreciate any
pointers you chaps might be able to provide.

I'm relatively novice in the art of C so am after any suggestions about
implementing a very simple hash (or quasi hash)

I need to be able to store a counter and a timestamp for requests from
particular device MAC addresses (6 bytes, presented as xx:xx:xx:xx:xx: xx
where x is a hex digit), ideally in a dynamic data structure. Additionally I
need to be able to access the data quickly, hence my thoughts to use a
hashing function.

This may be naive but I'd thought to code something along the line of a tree
structure, with each octet representing a node holding 256 pointers to the
next set of nodes (from most significant to least significant octect) with
the last octect holding pointers to a simple record holding the data.

This would mean that at most i would need to traverse 6 tree nodes before
hitting the data whatever the volume of addresses.(?)

Does this seem feasible?

If so (and this is where I get unstuck :o)), in C i'd have a structure
defined as an array of 255 pointers, then allocate space for one of these
structures as I parse each octect of the mac address. I'm having difficulty
in defining the structures correctly so that the last but one octet
references the actual data rather than another set of node pointers.

Can someone please advise on the best way to do this? Or if you think the
above is complete pap suggest any easier or more sensible ( please :P)
solutions. I'm open to advice/flaming/suggestions!

Cheers,
Matt
Nov 13 '05 #1
2 2089
Matt Bull wrote:

I need to be able to store a counter and a timestamp for requests from
particular device MAC addresses (6 bytes, presented as xx:xx:xx:xx:xx: xx
where x is a hex digit), ideally in a dynamic data structure. Additionally I
need to be able to access the data quickly, hence my thoughts to use a
hashing function.

This may be naive but I'd thought to code something along the line of a tree
structure, with each octet representing a node holding 256 pointers to the
next set of nodes (from most significant to least significant octect) with
the last octect holding pointers to a simple record holding the data.
<off-topic>

This approach would probably waste a lot of memory. For
example, suppose your entire data base contains just two six-byte
keys that happen to differ in the first byte. You'll allocate
one 256-entry node for the first byte and two for each of the
second through sixth bytes -- eleven 256-entry nodes in all, just
for twelve bytes of key data?

Incidentally, the scheme you describe is known as a "trie,"
and doesn't sound like a hashing method at all.

</off-topic>
If so (and this is where I get unstuck :o)), in C i'd have a structure
defined as an array of 255 pointers,
Don't you mean 256?
then allocate space for one of these
structures as I parse each octect of the mac address. I'm having difficulty
in defining the structures correctly so that the last but one octet
references the actual data rather than another set of node pointers.
The first step is to define your "payload," the data
structure you'll find at the eventual end of whatever search
path you'll take. A typical way to do this in C is with a
struct, something like

typedef struct {
TimeStampType when;
CounterType count;
OtherType other;
} Payload;

Next, you need something capable of pointing to 256
different Payloads. An array of pointers is just the trick:

typedef Payload *Level6[256];

That is, a `Level6' object is an array of 256 pointers to
`Payload' objects. At the next level you'll need a bunch of
pointers to Level6's, and so on up to the Level1 root:

typedef Level6 *Level5[256];
typedef Level5 *Level4[256];
typedef Level4 *Level3[256];
typedef Level3 *Level2[256];
typedef Level2 *Level1[256];

/* finally, here's the root of your trie: */
Level1 root;

.... and there you have it: The lone Level1 object points to 256
Level2's, each Level2 points to 256 Level3's, and so on until
the chosen Level6 object points to the desired Payload.

However, this is an incredibly verbose way to proceed! The
novice C programmer is usually comfortable with the idea of a
"pointer to something" but starts to get queasy when confronted
with a "pointer to pointer to something." The LevelN types have
no real function other than to assign names to various "somethings "
so the novice can visualize them more easily -- throw away all
those names and you get the exact same structure, expressed in
terms of multiple levels of pointers:

Payload *****root[256];

(Actually, there's a subtle difference between the two
formulations: the LevelN version carries the "256-ness" information
along while the latter version doesn't, and this will affect the way
you calculate how much memory to allocate for each node. Still,
the essential structure of a six-level trie is the same.)
Can someone please advise on the best way to do this? Or if you think the
above is complete pap suggest any easier or more sensible ( please :P)
solutions. I'm open to advice/flaming/suggestions!


<off-topic>

Choice of data structure and search/update method really hasn't
anything to do with the C language. Once a choice has been made,
the question of how best to express the solution in C would be
topical, but the selection itself is driven by criteria that would
be just as valid (or invalid) for Fortran, Java, BASIC, COBOL, ...

Also, you haven't revealed enough about your situation to allow
anything more than informed guesswork. For example, how many of
these six-byte keys do you need to store? You say you need to
access them "quickly," but you don't mention how many million queries
per second you need to sustain. Once entered, do the keys persist
forever or can they be deleted? And so on, and so forth -- and
NONE of these apparently vital considerations has anything at all
to do with C. Look elsewhere for advice, and return here if you're
having difficulty with coding the chosen solution in C.

</off-topic>

Good luck!

--
Er*********@sun .com
Nov 13 '05 #2

"Eric Sosman" <Er*********@su n.com> wrote in message
news:3F******** *******@sun.com ...
Matt Bull wrote:

I need to be able to store a counter and a timestamp for requests from
particular device MAC addresses (6 bytes, presented as xx:xx:xx:xx:xx: xx
where x is a hex digit), ideally in a dynamic data structure. Additionally I need to be able to access the data quickly, hence my thoughts to use a
hashing function.

This may be naive but I'd thought to code something along the line of a tree structure, with each octet representing a node holding 256 pointers to the next set of nodes (from most significant to least significant octect) with the last octect holding pointers to a simple record holding the data.
<off-topic>

This approach would probably waste a lot of memory. For
example, suppose your entire data base contains just two six-byte
keys that happen to differ in the first byte. You'll allocate
one 256-entry node for the first byte and two for each of the
second through sixth bytes -- eleven 256-entry nodes in all, just
for twelve bytes of key data?

Incidentally, the scheme you describe is known as a "trie,"
and doesn't sound like a hashing method at all.


I've analysed most of the data we are processing at the moment and the keys
would be in quite a low number of most significant byte "chunks", where the
vast majority of difference lies in lower 3 octets. So it might be ok.

This method seemed to offer the best of both ease of programming and memory
use at the time, although I'm open to suggestions for another method that
wouldn't make my brain melt :o)
</off-topic>
If so (and this is where I get unstuck :o)), in C i'd have a structure
defined as an array of 255 pointers,
Don't you mean 256?


Yes I did! hehe
The first step is to define your "payload," the data
structure you'll find at the eventual end of whatever search
path you'll take. A typical way to do this in C is with a
struct, something like
<snipped good stuff>
(Actually, there's a subtle difference between the two
formulations: the LevelN version carries the "256-ness" information
along while the latter version doesn't, and this will affect the way
you calculate how much memory to allocate for each node. Still,
the essential structure of a six-level trie is the same.)

Ahhh thas excellent, just the info I was after!
Can someone please advise on the best way to do this? Or if you think the above is complete pap suggest any easier or more sensible ( please :P)
solutions. I'm open to advice/flaming/suggestions!


<off-topic>

Choice of data structure and search/update method really hasn't
anything to do with the C language. Once a choice has been made,
the question of how best to express the solution in C would be
topical, but the selection itself is driven by criteria that would
be just as valid (or invalid) for Fortran, Java, BASIC, COBOL, ...


Yes, apologies again for the off topic sway of the query. I'm somewhat stuck
with implementing in C as thats the only supported API the application this
code must interact with supports.
Also, you haven't revealed enough about your situation to allow
anything more than informed guesswork. For example, how many of
these six-byte keys do you need to store? You say you need to
access them "quickly," but you don't mention how many million queries
per second you need to sustain. Once entered, do the keys persist
forever or can they be deleted? And so on, and so forth -- and
NONE of these apparently vital considerations has anything at all
to do with C. Look elsewhere for advice, and return here if you're
having difficulty with coding the chosen solution in C.

Well, at the moment it needs to retain around 100,000 individual "payloads",
persisting in memory until the application is terminated, but generally only
seeing 100-500 new or duplicated keys per second.
</off-topic>

Good luck!


Thanks very much for the information Eric, you've been very helpful!

I'll let you know how I get on.

:o)

Cheers,
Matt
Nov 13 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
2033
by: Pat | last post by:
I want to look for some one-to-one hashing function. In C++, any one-to-one hashing function?
11
3444
by: Wm. Scott Miller | last post by:
Hello all! We are building applications here and have hashing algorithms to secure secrets (e.g passwords) by producing one way hashes. Now, I've read alot and I've followed most of the advice that made sense. One comment I've seen alot about is "securing the hashing routine" but no-one explains how to accomplish this. So how do I secure my hashing routine? Do I use code access security, role based security, ACLs, etc or combination?...
10
2881
by: Dino M. Buljubasic | last post by:
Hi, I am using MD5 to hash my passwords and add them to database as hashed. I have noticed though that some passwords don't get recognized and I suppose that it happen because hashing might introduce some characters in my password that are not handled properly by SQL server then. For example, password 'startreck' works just fine password 'test' does not
19
3858
by: Ole Nielsby | last post by:
How does the GetHashCode() of an array object behave? Does it combine the GetHashCode() of its elements, or does it create a sync block for the object? I want to use readonly arrays as dictionary keys, based on their content, not their identity. Is this feasible using the arrays directly, or do I need to wrap them in a struct that handles GetHashCode and Equal? If so, is such a wrapper present in the standard class library?
6
2268
by: Jayender | last post by:
Hi, What is the difference between Hashing and Encryption ?
4
3425
by: wkatz | last post by:
Hi, Gurus. What hashing algorithm outputs hash value as numbers only? For example, if you pass a “John Q. Public” it will output 23324. If there is no such hashing, how hard is it to hire somebody to write a fairly quick one? It could be some fast hashing and then another function that creates numbers. Much obliged. wkatz.
11
2166
by: January Weiner | last post by:
Hello, I need to use a hashing function for relatively short strings (roughly 16 characters or less). The data that needs to be accessed via hash is just a simple int. I just need to look up values in two dimensional matrices each time that I encounter a pair of these strings. I would like to keep the code as simple and standard as possible, but at the same time to have a reasonable performance.
15
3021
by: Vinodh | last post by:
I am reading about hashing techniques. The map data structure available in C++ STL uses hashing techniques?
24
1966
by: Johnny Jörgensen | last post by:
I'm wondering (and hoping that somebody will be able to answer this): If I calculate the hash value of files (either MD5 or SHA1), can I then be sure that: 1) Two files with the same hash value are in fact identical? 2) Two different files will NEVER have the same hash value? 3) If two files have the same MD5 hash value, they will ALSO have the same
0
9895
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9741
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
11011
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10666
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
10351
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
5735
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5929
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
4546
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
4140
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.