473,800 Members | 2,640 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Hash structure or Binary Search Tree...

I'm reposting this, as everyone went off on the "did this say c++"
tangent, when someone thought he was reading a different group. Hope I get
better results this time.

I need an efficient structure to store a counter and a series of IP
addresses. I would like to use something that's not going to slow down
once given large amounts of data. I'd like it to run as close to O(n) as
possible.

The main goal of this project is to create a daemon, which receives
data delimited by a new line, hashes the data, looks it up and increments
the associated counter. If the counter reaches a configured threshold,
send back 1 to the client, otherwise send 0.

Any comments/questions would be much appreciated...

Thank You,
Sig
Nov 14 '05 #1
7 6286
exonic wrote:
....snip ...
I need an efficient structure to store a counter and a series of
IP addresses. I would like to use something that's not going to
slow down once given large amounts of data. I'd like it to run
as close to O(n) as possible.

The main goal of this project is to create a daemon, which
receives data delimited by a new line, hashes the data, looks it
up and increments the associated counter. If the counter reaches a
configured threshold, send back 1 to the client, otherwise send 0.

Any comments/questions would be much appreciated...


You could use the hashlib.zip library. This reduces your problem
to optimizing the hash functions, and the status feedback from the
library will help you do this. It will run O(1), which is
considerably better than O(n). See:

<http://cbfalconer.home .att.net/download/>

A slight modification of the wdfreq demo program should do it.

--
Chuck F (cb********@yah oo.com) (cb********@wor ldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home .att.net> USE worldnet address!
Nov 14 '05 #2
exonic wrote:
I'm reposting this, as everyone went off on the "did this say c++"
tangent, when someone thought he was reading a different group. Hope I get
better results this time.
Well, you really have a data structure/algorithm question as opposed to
a question about the C language. The newsgroup news:comp.progr amming
would be the right place to ask. (And no, the fact that you're writing
this in C is not relevant.)
I need an efficient structure to store a counter and a series of IP
addresses. I would like to use something that's not going to slow down
once given large amounts of data. I'd like it to run as close to O(n) as
possible.
O(n)? You're joking, right? ;-) You'd achieve O(n) by using a linked
list, the most naive possible approach. A balanced binary tree would
give you O(log n) -- and a hash table O(1) (though, the devil is inthe
constants).
The main goal of this project is to create a daemon, which receives
data delimited by a new line, hashes the data, looks it up and increments
the associated counter. If the counter reaches a configured threshold,
send back 1 to the client, otherwise send 0.
Daemons are also not topical here. Going to news:comp.unix. programmer
would be appropriate.
Any comments/questions would be much appreciated...

<OT>
In a situation like this, the likely access pattern of the IP addresses
can be quite significant. A web search on some of the following might be
fruitful:

Red-black trees
AVL trees
Splay trees
Skip lists
</OT>

In any event, *please* read the FAQ at:

http://www.eskimo.com/~scs/C-faq/top.html

HTH,
--ag

--
Artie Gold -- Austin, Texas

Nov 14 '05 #3
On Mon, 05 Jan 2004 11:33:09 -0500, exonic <ex****@triton. net> wrote
in comp.lang.c:
I'm reposting this, as everyone went off on the "did this say c++"
tangent, when someone thought he was reading a different group. Hope I get
better results this time.

I need an efficient structure to store a counter and a series of IP
addresses. I would like to use something that's not going to slow down
once given large amounts of data. I'd like it to run as close to O(n) as
possible.

The main goal of this project is to create a daemon, which receives
data delimited by a new line, hashes the data, looks it up and increments
the associated counter. If the counter reaches a configured threshold,
send back 1 to the client, otherwise send 0.

Any comments/questions would be much appreciated...

Thank You,
Sig


Why would anyone bother to hash (IPV4) IP addresses? Each and every
distinct IP address is guaranteed to fit into an unsigned long.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.l earn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #4
Jack Klein wrote:
exonic <ex****@triton. net> wrote

.... snip ...
I need an efficient structure to store a counter and a series of
IP addresses. I would like to use something that's not going to
slow down once given large amounts of data. I'd like it to run
as close to O(n) as possible.

The main goal of this project is to create a daemon, which
receives data delimited by a new line, hashes the data, looks it
up and increments the associated counter. If the counter reaches a
configured threshold, send back 1 to the client, otherwise send 0.


Why would anyone bother to hash (IPV4) IP addresses? Each and every
distinct IP address is guaranteed to fit into an unsigned long.


Because he only wants a subset of all possible addresses,
dependant on what were used. Suitable structures include an
adaptive hash table (hashlib) and a linked list. Only the hash
table offers O(1) access times AFAICT.

The hash table solution also automatically adapts to IPV6
addresses, and can easily handle input in either binary (32 or 48
bit words) or text strings.

--
Chuck F (cb********@yah oo.com) (cb********@wor ldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home .att.net> USE worldnet address!
Nov 14 '05 #5
CBFalconer wrote:
Jack Klein wrote:
exonic <ex****@triton. net> wrote

... snip ...
> I need an efficient structure to store a counter and a series of
> IP addresses. I would like to use something that's not going to
> slow down once given large amounts of data. I'd like it to run
> as close to O(n) as possible.
>
> The main goal of this project is to create a daemon, which
> receives data delimited by a new line, hashes the data, looks it
> up and increments the associated counter. If the counter reaches a
> configured threshold, send back 1 to the client, otherwise send 0.


Why would anyone bother to hash (IPV4) IP addresses? Each and every
distinct IP address is guaranteed to fit into an unsigned long.


Because he only wants a subset of all possible addresses,
dependant on what were used. Suitable structures include an
adaptive hash table (hashlib) and a linked list. Only the hash
table offers O(1) access times AFAICT.


A trie offers O(1) access, plus it keeps the data ordered.

Jeremy.
Nov 14 '05 #6
Jeremy Yallop wrote:

CBFalconer wrote:
Jack Klein wrote:
exonic <ex****@triton. net> wrote

... snip ...
> I need an efficient structure to store a counter and a series of
> IP addresses. I would like to use something that's not going to
> slow down once given large amounts of data. I'd like it to run
> as close to O(n) as possible.
>
> The main goal of this project is to create a daemon, which
> receives data delimited by a new line, hashes the data, looks it
> up and increments the associated counter. If the counter reaches a
> configured threshold, send back 1 to the client, otherwise send 0.

Why would anyone bother to hash (IPV4) IP addresses? Each and every
distinct IP address is guaranteed to fit into an unsigned long.


Because he only wants a subset of all possible addresses,
dependant on what were used. Suitable structures include an
adaptive hash table (hashlib) and a linked list. Only the hash
table offers O(1) access times AFAICT.


A trie offers O(1) access, plus it keeps the data ordered.


Does it? I would have expected O(logN) offhand.

--
Chuck F (cb********@yah oo.com) (cb********@wor ldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home .att.net> USE worldnet address!
Nov 14 '05 #7
CBFalconer wrote:

Jeremy Yallop wrote:

CBFalconer wrote:

Because he only wants a subset of all possible addresses,
dependant on what were used. Suitable structures include an
adaptive hash table (hashlib) and a linked list. Only the hash
table offers O(1) access times AFAICT.


A trie offers O(1) access, plus it keeps the data ordered.


Does it? I would have expected O(logN) offhand.


<off-topic>

It's a different "N." A trie takes O(log N) where
N is the length of a single key, not the number of keys
stored. If all keys have the same N, log N is a constant
and the retrieval time is thus O(1).

A few months back somebody on a different newsgroup
was proposing to store about 100000 IPv4 addresses in
four-level trie with 256-way branching at each node.
A few back-of-the-envelope computations convinced him
this would be an extravagantly wasteful data structure:
the top two levels would be full or nearly so, but the
bottom two levels (the bulk of the trie) would be almost
empty, averaging less than two keys apiece.

See also Knuth "The Art of Computer Programming,
Volume III: Sorting and Searching," Section 6.3.

</off-topic>

--
Er*********@sun .com
Nov 14 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

11
2534
by: jova | last post by:
Is there a difference between a Binary Tree and a Binary Search Tree? If so can someone please explain. Thank You.
0
4256
by: j | last post by:
Hi, Anyone out there with binary search tree experience. Working on a project due tomorrow and really stuck. We need a function that splits a binary tree into a bigger one and smaller one(for a random binary search tree. We've tried everything but are in the land of pointer hell. If someone could help it would be a huge help. our code follows. We've tried 2 diff methods the split() and splitter() functions #include <iostream> #include...
4
9021
by: Tarique Jawed | last post by:
Alright I needed some help regarding a removal of a binary search tree. Yes its for a class, and yes I have tried working on it on my own, so no patronizing please. I have most of the code working, even the removal, I just don't know how to keep track of the parent, so that I can set its child to the child of the node to be removed. IE - if I had C / \ B D
12
3206
by: wxs | last post by:
Many times we have a bunch of enums we have from either different enums or the same enum that will have various numeric values assigned. Rarely will there be collisions in numbering between the enums. These enums you might imagine would be like OrderPrice=27, OrderQuantity=50, OrderSide=62. There may be a lot of these. So normally what we would have to do is store these in hash table so hashtable=value could be set with the value....
6
16882
by: fdmfdmfdm | last post by:
This might not be the best place to post this topic, but I assume most of the experts in C shall know this. This is an interview question. My answer is: hash table gives you O(1) searching but according to your hash function you might take more memory than binary tree. On the contrary, binary tree gives you O(logN) searching but less memory. Am I right?
4
11015
by: whitehatmiracle | last post by:
Hello I have written a program for a binary tree. On compiling one has to first chose option 1 and then delete or search. Im having some trouble with the balancing function. It seems to be going into an infinite loop, i think im messing up with the pointers. Heres the code. //press 1, first, Do not press it a second time!!!
11
1191
by: Defected | last post by:
Hi, How i can create a Binary Search Tree with a class ? thanks
4
608
by: sharan | last post by:
Hello Friends I have a problem in Data Structure (In C) i want to implement a General tree having three child nodes of each parent node ...please send me any simple Example( code) by which i can understand General Tree codes.....node be int value if possible than help me for a name(string)
23
5743
by: raylopez99 | last post by:
A quick sanity check, and I think I am correct, but just to make sure: if you have a bunch of objects that are very much like one another you can uniquely track them simply by using an ArrayList or Array, correct? An example: create the object, create an array, the stuff the object into the array. Later on, assume the object is mutable, the object changes, but you can find it, if you have enough state information to uniquely identify...
0
9550
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10501
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10032
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
7574
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5469
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5603
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
4149
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3764
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2944
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.