471,344 Members | 1,539 Online
Bytes | Software Development & Data Engineering Community
Post +

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 471,344 software developers and data experts.

Binary search tree

Hi,

I have to get list of URLs one by one and to find the URLs that I have
more than one time(can't be more than twice).

I thought to put them into binary search tree, this way they'll be
sorted and I'll be able to check if the URL already exist.

Couldn't find any python library that implements trees.
Is there some library of this kind in python? Or can I find it
somewhere else?

Nov 9 '07 #1
5 4014
ma*********@gmail.com wrote:
Hi,

I have to get list of URLs one by one and to find the URLs that I have
more than one time(can't be more than twice).

I thought to put them into binary search tree, this way they'll be
sorted and I'll be able to check if the URL already exist.

Couldn't find any python library that implements trees.
Is there some library of this kind in python? Or can I find it
somewhere else?
Put them into a set. You can check for existence (very fast) and at the end it
is easy to sort.

-Larry
Nov 9 '07 #2
On 2007-11-09, Larry Bates <la*********@websafe.comwrote:
ma*********@gmail.com wrote:
>I have to get list of URLs one by one and to find the URLs
that I have more than one time(can't be more than twice).

I thought to put them into binary search tree, this way
they'll be sorted and I'll be able to check if the URL already
exist.

Couldn't find any python library that implements trees. Is
there some library of this kind in python? Or can I find it
somewhere else?

Put them into a set. You can check for existence (very fast)
and at the end it is easy to sort.
The set is likely the way to go.

Python's library support for binary search trees consists of the
bisect module.

--
Neil Cerutti
Ask about our plans for owning your home --sign at mortgage company
Nov 9 '07 #3
On Nov 9, 4:06 pm, maxim.no...@gmail.com wrote:
Hi,

I have to get list of URLs one by one and to find the URLs that I have
more than one time(can't be more than twice).

I thought to put them into binary search tree, this way they'll be
sorted and I'll be able to check if the URL already exist.

Couldn't find any python library that implements trees.
Is there some library of this kind in python? Or can I find it
somewhere else?
Can you use set() or set.difference()?

Nov 9 '07 #4
ma*********@gmail.com a écrit :
Hi,

I have to get list of URLs one by one and to find the URLs that I have
more than one time(can't be more than twice).

I thought to put them into binary search tree, this way they'll be
sorted and I'll be able to check if the URL already exist.
What about a set ?

s = set()
for url in urls:
if url in s:
print "already have ", url
else:
set.add(url)

Nov 9 '07 #5
On Nov 9, 11:45 pm, Bruno Desthuilliers
<bdesth.quelquech...@free.quelquepart.frwrote:
maxim.no...@gmail.com a écrit :
Hi,
I have to get list of URLs one by one and to find the URLs that I have
more than one time(can't be more than twice).
I thought to put them into binary search tree, this way they'll be
sorted and I'll be able to check if the URL already exist.

What about a set ?

s = set()
for url in urls:
if url in s:
print "already have ", url
else:
set.add(url)
Interesting. For this case I usually used dicts. As in:

d = {}
for url in urls:
if url in d:
print "already have ", url
else:
d[url] = 1

Now, I can see that this method has some superfluous data (the `1`
that is assigned to the dict). So I suppose this is less memory
efficient. But is this slower then? As both implementations use hashes
of the URL to access the data. Just asking out of curiosity ;)

Nov 12 '07 #6

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

reply views Thread by j | last post: by
3 posts views Thread by tsunami | last post: by
4 posts views Thread by Tarique Jawed | last post: by
4 posts views Thread by mathon | last post: by
1 post views Thread by mathon | last post: by
11 posts views Thread by Defected | last post: by
2 posts views Thread by Defected | last post: by

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.