On Nov 9, 11:45 pm, Bruno Desthuilliers
<bdesth.quelquech...@free.quelquepart.frwrote:
maxim.no...@gmail.com a écrit :
Hi,
I have to get list of URLs one by one and to find the URLs that I have
more than one time(can't be more than twice).
I thought to put them into binary search tree, this way they'll be
sorted and I'll be able to check if the URL already exist.
What about a set ?
s = set()
for url in urls:
if url in s:
print "already have ", url
else:
set.add(url)
Interesting. For this case I usually used dicts. As in:
d = {}
for url in urls:
if url in d:
print "already have ", url
else:
d[url] = 1
Now, I can see that this method has some superfluous data (the `1`
that is assigned to the dict). So I suppose this is less memory
efficient. But is this slower then? As both implementations use hashes
of the URL to access the data. Just asking out of curiosity ;)