searching a value of a dict (each value is a list)

Hi,

I have a dictionary with a million keys. Each value in the
dictionary is a list of up to a thousand integers.
The following is a simple example with 5 keys.

dict = {1: [1, 2, 3, 4, 5],
        2: [10, 11, 12],
        900000: [100, 101, 102, 103, 104, 105],
        900001: [20, 21, 22],
        999999: [15, 16, 17, 18, 19]}

I want to find the key whose list of values contains a specific
integer. For example, if I search for 104, 900000 must be returned.

How can I do this with Python? Any ideas?
Dec 9 '07 #1
On Dec 10, 1:23 am, Seongsu Lee <se...@senux.com> wrote:
> Hi,
>
> I have a dictionary with million keys. Each value in the
> dictionary has a list with up to thousand integers.
> Follow is a simple example with 5 keys.
>
> dict = {1: [1, 2, 3, 4, 5],
>         2: [10, 11, 12],
>         900000: [100, 101, 102, 103, 104, 105],
>         900001: [20, 21, 22],
>         999999: [15, 16, 17, 18, 19]}
>
> I want to find out the key value which has a specific
> integer in the list of its value. For example, if I search
> 104 in the list, 900000 must be returned.
>
> How can I do this with Python? Ideas?

Hi,

I just made the dict work bidirectionally, so that I can look
things up by both key and value. A marker was needed to
distinguish the original keys from the keys derived from
values, so I negate each value (value * -1).

from pprint import pprint

dict = {1: [1, 2, 3, 4, 5],
        2: [10, 11, 12],
        900000: [100, 101, 102, 103, 104, 105],
        900001: [20, 21, 22],
        999999: [15, 16, 17, 18, 19]}

# Snapshot the items first, because the loop adds keys to the same dict.
for k, v in list(dict.items()):
    for x in v:
        dict[x * -1] = k  # negated value -> original key

pprint(dict)

{-105: 900000,
-104: 900000,
-103: 900000,
-102: 900000,
-101: 900000,
-100: 900000,
-22: 900001,
-21: 900001,
-20: 900001,
-19: 999999,
-18: 999999,
-17: 999999,
-16: 999999,
-15: 999999,
-12: 2,
-11: 2,
-10: 2,
-5: 1,
-4: 1,
-3: 1,
-2: 1,
-1: 1,
1: [1, 2, 3, 4, 5],
2: [10, 11, 12],
900000: [100, 101, 102, 103, 104, 105],
900001: [20, 21, 22],
999999: [15, 16, 17, 18, 19]}

What do you think of this? Ideas with less space complexity?
Dec 9 '07 #2
Seongsu Lee wrote:
> Hi,
>
> I have a dictionary with million keys. Each value in the
> dictionary has a list with up to thousand integers.
> (...)
>
> I want to find out the key value which has a specific
> integer in the list of its value.

Sorry if this is unhelpful, but have you considered moving your data
model to a proper database?
I ask because, unless someone knows of a specific module, I think we are
squarely in database territory. It is probably the fastest solution, and
not just for the particular operation you are trying to do.
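
A minimal sketch of what the database route could look like, assuming the standard-library sqlite3 module is acceptable. The table name `items`, the index names, and the helpers `build_db` and `find_key` are made up for illustration; nothing in the thread prescribes them.

```python
import sqlite3

def build_db(data, path=":memory:"):
    """Load {key: [values]} into one row per (key, value) pair."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE items (key INTEGER NOT NULL, value INTEGER NOT NULL)")
    conn.executemany(
        "INSERT INTO items (key, value) VALUES (?, ?)",
        ((k, v) for k, values in data.items() for v in values),
    )
    # One index per lookup direction: value -> key and key -> values.
    conn.execute("CREATE INDEX idx_items_value ON items (value)")
    conn.execute("CREATE INDEX idx_items_key ON items (key)")
    conn.commit()
    return conn

def find_key(conn, value):
    """Return a key whose list contains value, or None if there is none."""
    row = conn.execute(
        "SELECT key FROM items WHERE value = ?", (value,)
    ).fetchone()
    return row[0] if row else None
```

Passing a file path instead of ":memory:" keeps the data on disk, which is the point of the suggestion when the data does not fit in RAM.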

Regards,
Pablo
Dec 9 '07 #3
On Dec 10, 1:53 am, Pablo Ziliani <pa...@decode.com.ar> wrote:
> Seongsu Lee wrote:
> > Hi,
> > I have a dictionary with million keys. Each value in the
> > dictionary has a list with up to thousand integers.
> > (...)
> > I want to find out the key value which has a specific
> > integer in the list of its value.
>
> Sorry if this is unhelpful, but have you considered moving your data
> model a proper database?
> I ask because unless someone knows of a specific module, I think we are
> in DB's authentic realm. Is the fastest solution, probably not just for
> this particular operation you are trying to do.
>
> Regards,
> Pablo

Hi Pablo,

Thank you for your post! I wanted to solve the problem within
a given environment, Python, and I think it is solved by
a dictionary with bidirectional keys. I have posted it and would like
to know whether others know a more elegant way to do it.
Dec 9 '07 #4
Seongsu Lee:
> What do you think of this? Ideas with less space complexity?

You can put the second group of keys in a second dictionary, so you
don't have to mangle them, and it may be a bit faster.
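
A small sketch of the second-dictionary idea, assuming (as confirmed later in the thread) that each integer appears in only one list. The name `build_reverse_index` is only illustrative.

```python
def build_reverse_index(data):
    """Map every integer to the key of the list that contains it."""
    reverse = {}
    for key, values in data.items():
        for value in values:
            # If a value appeared in several lists, the last key seen would win.
            reverse[value] = key
    return reverse

data = {1: [1, 2, 3, 4, 5],
        2: [10, 11, 12],
        900000: [100, 101, 102, 103, 104, 105],
        900001: [20, 21, 22],
        999999: [15, 16, 17, 18, 19]}

reverse = build_reverse_index(data)
found = reverse.get(104)  # None would mean "not present"
```

Because the original dict is left untouched, no negation or prefix is needed to tell the two kinds of keys apart.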

Regarding the space complexity, I don't know how you can reduce it
with Python. Probably you can create a list of longs, sort it, and use
bisect on it to find the keys. Each long can combine the integer from
the list, shifted, ORed with the integer that is its key in the original
dict. But I don't know how much you can gain like this.
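
One way the sorted-longs idea could be sketched in Python, assuming non-negative integers and keys that fit in 32 bits (the constant `KEY_BITS` and both function names are hypothetical). Each list element is shifted into the high bits and ORed with its key, the packed numbers are kept in an `array.array` of unsigned 64-bit items, and `bisect_left` finds the first entry for a given element.

```python
from array import array
from bisect import bisect_left

KEY_BITS = 32                      # assumption: every key fits in 32 bits
KEY_MASK = (1 << KEY_BITS) - 1

def build_packed_index(data):
    """Return a sorted array of (value << KEY_BITS) | key entries."""
    packed = sorted((value << KEY_BITS) | key
                    for key, values in data.items()
                    for value in values)
    return array("Q", packed)      # 8 bytes per entry, no per-item objects

def find_key(index, value):
    """Binary-search the packed array for value; return its key or None."""
    pos = bisect_left(index, value << KEY_BITS)
    if pos < len(index) and index[pos] >> KEY_BITS == value:
        return index[pos] & KEY_MASK
    return None
```

The array costs 8 bytes per stored integer, which is far less than a dict entry, at the price of O(log N) lookups instead of average constant time.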

Another solution to reduce space is to use a tiny external module
written in C, Pyrex or D. Here follows some simple D code you can
modify a bit to make it work with Pyd (http://pyd.dsource.org/):
import std.traits: ReturnType;
import std.stdio: writefln;

struct TyInt_int {
    int el, n;
    int opCmp(TyInt_int other) {
        if (el == other.el)
            return 0;
        return (el < other.el) ? -1 : 1;
    }
}

// Returns the insertion point to the right of any entries equal to x.
int bisect(TyElem, TyData, TyFun)(TyElem[] a, TyData x, TyFun key) {
    int lo = 0;
    int hi = a.length;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (x < key(a[mid]))
            hi = mid;
        else
            lo = mid + 1;
    }
    return lo;
}

void main() {
    int[][int] int_arr;
    int_arr[1] = [1, 2, 3, 4, 5];
    int_arr[2] = [10, 11, 12];
    int_arr[900000] = [100, 101, 102, 103, 104, 105];
    int_arr[900001] = [20, 21, 22];
    int_arr[999999] = [15, 16, 17, 18, 19];

    int tot_len = 0;
    foreach(arr; int_arr)
        tot_len += arr.length;

    // Flatten into (element, key) records, then sort by element.
    auto data_arr = new TyInt_int[](tot_len);
    int i = 0;
    foreach(n, arr; int_arr)
        foreach(el; arr)
            data_arr[i++] = TyInt_int(el, n);

    data_arr.sort;
    writefln(bisect(data_arr, 100, (TyInt_int ii){return ii.el;}));
}

Bye,
bearophile
Dec 9 '07 #5
Seongsu Lee <se***@senux.com> wrote:
> Hi,
>
> I have a dictionary with million keys. Each value in the
> dictionary has a list with up to thousand integers.
> Follow is a simple example with 5 keys.
>
> dict = {1: [1, 2, 3, 4, 5],
>         2: [10, 11, 12],
>         900000: [100, 101, 102, 103, 104, 105],
>         900001: [20, 21, 22],
>         999999: [15, 16, 17, 18, 19]}
>
> I want to find out the key value which has a specific
> integer in the list of its value. For example, if I search
> 104 in the list, 900000 must be returned.

Are the integers in the lists unique? I mean, could 104 occur in more than
one list? If it did, would it matter which key was returned?

> How can I do this with Python? Ideas?

When I see something like this my natural response is to think that the data
structure is inappropriate for the use it's being put to.

The code someone else posted to reverse the keys is all very well, but
surely hugely wasteful on cpu, maybe storage, and elapsed time.

Even if the dict in this form is needed for some other reason, couldn't the
code that created it also create a reverse index at the same time?
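
As a rough illustration of building the reverse index at creation time, here is a sketch of a small container that updates both directions on every change. The class name `TwoWayIndex` and its methods are invented for this example, and it assumes each integer belongs to at most one key.

```python
class TwoWayIndex:
    """Keeps key -> values and value -> key in step with each other."""

    def __init__(self):
        self.forward = {}   # key -> list of integers
        self.reverse = {}   # integer -> key

    def add(self, key, values):
        """Store (or replace) the list for key and index its integers."""
        self.remove(key)
        stored = list(values)
        self.forward[key] = stored
        for value in stored:
            self.reverse[value] = key

    def remove(self, key):
        """Drop key and every reverse entry that pointed at it."""
        for value in self.forward.pop(key, ()):
            self.reverse.pop(value, None)

    def key_for(self, value):
        """Return the key whose list holds value, or None."""
        return self.reverse.get(value)
```

Because every mutation goes through `add` or `remove`, the reverse index never needs a separate full pass over the data.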

--
Jeremy C B Nicoll - my opinions are my own.
Dec 9 '07 #6
Jeremy C B Nicoll:
> The code someone else posted to reverse the keys is all very well, but
> surely hugely wasteful on cpu, maybe storage, and elapsed time.

If you are talking about my D code then I know it, the creation of the
first dict has to be skipped, if possible... The code I have posted
must be adapted.

Bye,
bearophile
Dec 9 '07 #7
be************@lycos.com wrote:
> Jeremy C B Nicoll:
> > The code someone else posted ...
>
> If you are talking about my D code then I know it...

No, I meant the code that used Python to iterate over the dict and create
zillions of extra keys. I've deleted earlier posts in the thread and wasn't
sure who suggested that.

--
Jeremy C B Nicoll - my opinions are my own.
Dec 10 '07 #8
> I have a dictionary with million keys. Each value in the
> dictionary has a list with up to thousand integers.
> Follow is a simple example with 5 keys.
>
> dict = {1: [1, 2, 3, 4, 5],
>         2: [10, 11, 12],
>         900000: [100, 101, 102, 103, 104, 105],
>         900001: [20, 21, 22],
>         999999: [15, 16, 17, 18, 19]}
>
> I want to find out the key value which has a specific
> integer in the list of its value. For example, if I search
> 104 in the list, 900000 must be returned.
>
> How can I do this with Python? Ideas?

def find_key(dict, num):
    for k in dict:
        if num in dict[k]:
            return k
Dec 10 '07 #9
On Dec 10, 6:49 am, Jeremy C B Nicoll <jer...@omba.demon.co.uk> wrote:
> Seongsu Lee <se...@senux.com> wrote:
> > Hi,
> > I have a dictionary with million keys. Each value in the
> > dictionary has a list with up to thousand integers.
> > Follow is a simple example with 5 keys.
> > dict = {1: [1, 2, 3, 4, 5],
> >         2: [10, 11, 12],
> >         900000: [100, 101, 102, 103, 104, 105],
> >         900001: [20, 21, 22],
> >         999999: [15, 16, 17, 18, 19]}
> > I want to find out the key value which has a specific
> > integer in the list of its value. For example, if I search
> > 104 in the list, 900000 must be returned.
>
> Are the integers in the lists unique? I mean, could 104 occur in more than
> one list? If it did, would it matter which key was returned?

Yes, the integers in the lists are unique.

> > How can I do this with Python? Ideas?
>
> When I see something like this my natural response is to think that the data
> structure is inappropriate for the use it's being put to.
>
> The code someone else posted to reverse the keys is all very well, but
> surely hugely wasteful on cpu, maybe storage, and elapsed time.
>
> Even if the dict in this form is needed for some other reason, couldn't the
> code that created it also create a reverse index at the same time?

The reason I use the dict for my data is to speed up the search by
key.

The code could also create a reverse index (a reverse dict) at the same
time. But as you said, it wastes space, and I wanted to ask whether
someone knows a way to reduce the wasted space while still searching
fast.
Dec 10 '07 #10
Seongsu Lee <se***@senux.com> wrote:
> The reason I use the dict for my data is to speed up the search by key.

Ok, I understand that once the overhead of creating the dict has been done,
getting access to values within it is quick. And taking the time to create
a set of reverse keys speeds up the reverse access.

Rather than scanning the whole dict and creating reverse keys for everything
in it first, there might be an advantage in making search logic test if
there is a reverse key for the negative integer to be searched for, and if
so use it, otherwise scan the dict creating and examining reverse keys until
the integer is found. That way if the integer you're looking for is early
in the dict you'd only create reverse keys for the integers from the start
of the dict until the required one.
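
A sketch of this scan-on-demand idea, with two assumptions that are not in the post: the reverse entries go into a separate dict rather than being stored as negative keys, and the original dict is not modified while lookups are in progress. The class name `LazyReverseLookup` is hypothetical.

```python
class LazyReverseLookup:
    """Builds the reverse index only as far as each search has to scan."""

    def __init__(self, data):
        self._items = iter(data.items())  # remembers how far we have scanned
        self._reverse = {}                # integer -> key, filled on demand

    def find(self, value):
        # Fast path: a previous search already indexed this integer.
        if value in self._reverse:
            return self._reverse[value]
        # Slow path: continue scanning where the last search stopped.
        for key, values in self._items:
            for v in values:
                self._reverse[v] = key
            if value in self._reverse:
                return self._reverse[value]
        return None                       # scanned everything, not present
```

A search for an integer near the start of the dict indexes only the lists scanned so far; a search for a missing integer completes the index as a side effect.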

We don't know what external(?) process created the data that you've stored
in the dict, nor what you use it for - or more to the point - how often. If
you're going to make just one search of that data then there's little point
in having a fast search after a slow dict creation. On the other hand if
you have many many searches to do the initial overhead might be acceptable.
(I don't know how slow creating the dict would be for a typical example of a
million keys each keying lists of 1-1000 integers.)

> The code could create also a reverse index (a reverse dict) at the
> time. But as you said, it waste the space and I wanted to ask someone
> who may know some way to reduce the waste of space while searching
> fast.

Is the dict used by anything else? If the data in it was held in some other
form would that cause your program (or other programs) lots of problems? If
the range of values of the integers being stored is suitable, you might
sensibly use several or many smaller dicts to store all the data (and thus
save time reverse-keying much less of it).

--
Jeremy C B Nicoll - my opinions are my own.
Dec 10 '07 #11
Seongsu Lee wrote:
> Hi,
>
> I have a dictionary with million keys. Each value in the
> dictionary has a list with up to thousand integers.
> Follow is a simple example with 5 keys.
>
> dict = {1: [1, 2, 3, 4, 5],
>         2: [10, 11, 12],
>         900000: [100, 101, 102, 103, 104, 105],
>         900001: [20, 21, 22],
>         999999: [15, 16, 17, 18, 19]}
>
> I want to find out the key value which has a specific
> integer in the list of its value. For example, if I search
> 104 in the list, 900000 must be returned.
>
> How can I do this with Python? Ideas?

You can try this:

items = {1: [1, 2, 3, 4, 5],
         2: [10, 11, 12],
         900000: [100, 101, 102, 103, 104, 105],
         900001: [20, 21, 22],
         999999: [15, 16, 17, 18, 19]}

# Python 2 syntax (iteritems, print statement).
def findItem(item, dictionary):
    for key, value in dictionary.iteritems():
        if item in value:
            print key, value

findItem(104, items)

This will allow you to work with the existing dataset without needing to
duplicate it. It will print all occurrences.
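
For readers on Python 3, where `iteritems` and the print statement no longer exist, the same scan might be sketched as a generator that yields the matches instead of printing them. The name `find_items` is made up here; the logic is the linear scan described above.

```python
def find_items(item, dictionary):
    """Yield every (key, values) pair whose list contains item."""
    for key, values in dictionary.items():
        if item in values:
            yield key, values

items = {1: [1, 2, 3, 4, 5],
         2: [10, 11, 12],
         900000: [100, 101, 102, 103, 104, 105],
         900001: [20, 21, 22],
         999999: [15, 16, 17, 18, 19]}

matches = list(find_items(104, items))
```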

Also, you should avoid using built-in names like 'dict' for your own
variables; this creates confusion and can make your code misbehave,
since you are rebinding the name.
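
A short demonstration of the kind of misbehaviour meant here, wrapped in a function (the name `shadowing_demo` is only for illustration) so the rebinding stays local to the example.

```python
def shadowing_demo():
    """Show what happens after the name 'dict' is rebound to a dict object."""
    dict = {1: [1, 2, 3]}      # rebinds the built-in name inside this function
    try:
        dict(a=1)              # no longer the constructor, just a dict instance
    except TypeError as exc:
        return str(exc)
    return None

message = shadowing_demo()     # describes the failed call
```

Outside the function the built-in is untouched, but at module level the same assignment would hide `dict` for the rest of the module.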

Hope this helps.

Adonis Vargas

Dec 10 '07 #12
Adonis Vargas:
> Also, you should avoid using built-in names like 'dict' for your own
> variables; this creates confusion and can make your code misbehave,
> since you are rebinding the name.
> Adonis Vargas

After hearing this suggestion for the 300th time, I think it may be
time to fix this problem in Python 3 and make the Python
compiler issue a syntax error if someone tries to rebind names of this
kind, like dict, set, etc.

Bye,
bearophile
Dec 10 '07 #13
On Dec 10, 12:18 pm, Adonis Vargas <adon...@REMOVETHISearthlink.net>
wrote:
> Seongsu Lee wrote:
> > Hi,
> > I have a dictionary with million keys. Each value in the
> > dictionary has a list with up to thousand integers.
> > Follow is a simple example with 5 keys.
> > dict = {1: [1, 2, 3, 4, 5],
> >         2: [10, 11, 12],
> >         900000: [100, 101, 102, 103, 104, 105],
> >         900001: [20, 21, 22],
> >         999999: [15, 16, 17, 18, 19]}
> > I want to find out the key value which has a specific
> > integer in the list of its value. For example, if I search
> > 104 in the list, 900000 must be returned.
> > How can I do this with Python? Ideas?
>
> You can try this:
>
> items = {1: [1, 2, 3, 4, 5],
>          2: [10, 11, 12],
>          900000: [100, 101, 102, 103, 104, 105],
>          900001: [20, 21, 22],
>          999999: [15, 16, 17, 18, 19]}
>
> def findItem(item, dictionary):
>     for key, value in dictionary.iteritems():
>         if item in value:
>             print key, value
>
> findItem(104, items)
>
> This will allow you to work with the existing dataset without needing to
> duplicate it. It will print all occurrences.

Hi,

Yes, it works. But I think it runs in O(n * m), doesn't it?
(n is the number of keys in the dictionary and m is the number of items
in each list.)
So we need to create a reverse index (a reverse dictionary), or
at least something better, I think.

> Also, you should avoid using built-in names like 'dict' for your own
> variables; this creates confusion and can make your code misbehave,
> since you are rebinding the name.

Yep. :)

> Hope this helps.
>
> Adonis Vargas
Dec 10 '07 #14
On Dec 10, 3:50 am, Seongsu Lee <se...@senux.com> wrote:
> On Dec 10, 12:18 pm, Adonis Vargas <adon...@REMOVETHISearthlink.net>
> wrote:
> > Seongsu Lee wrote:
> > > Hi,
> > > I have a dictionary with million keys. Each value in the
> > > dictionary has a list with up to thousand integers.
> > > Follow is a simple example with 5 keys.
> > > dict = {1: [1, 2, 3, 4, 5],
> > >         2: [10, 11, 12],
> > >         900000: [100, 101, 102, 103, 104, 105],
> > >         900001: [20, 21, 22],
> > >         999999: [15, 16, 17, 18, 19]}
> > > I want to find out the key value which has a specific
> > > integer in the list of its value. For example, if I search
> > > 104 in the list, 900000 must be returned.
> > > How can I do this with Python? Ideas?
> > You can try this:
> > items = {1: [1, 2, 3, 4, 5],
> >          2: [10, 11, 12],
> >          900000: [100, 101, 102, 103, 104, 105],
> >          900001: [20, 21, 22],
> >          999999: [15, 16, 17, 18, 19]}
> > def findItem(item, dictionary):
> >     for key, value in dictionary.iteritems():
> >         if item in value:
> >             print key, value
> > findItem(104, items)
> > This will allow you to work with the existing dataset without needing to
> > duplicate it. It will print all occurrences.
>
> Hi,
>
> Yes, it works. But I think it works in O(n * m), doesn't it?
> (n is # of keys in the dictionary and m is # of items in the list.)
> So, we need to create a reverse index. (a reverse dictionary) or
> need something better at least, I think.
> > Also, you should avoid using built-in names like 'dict' for your own
> > variables; this creates confusion and can make your code misbehave,
> > since you are rebinding the name.
>
> Yep. :)
> > Hope this helps.
> > Adonis Vargas

If I'm not mistaken, building a reverse dictionary like that will be
O(n*m) because dict/list access is O(n) (amortized). Somebody correct
me if I'm wrong. In that case, it really depends on how you will use
the dict to see whether you get any benefit from building the reversed
dict. If you want to do several lookups, then the initial overhead
(speed/memory) of building the reversed dict might be worth it so that
you can just run lookups at O(n). But if you only need it once, it is
a waste of time and space to create a reverse dict when your access
time is the same for the lookup as for building the reversed dict.

If you do need more than one lookup, it would also be a good
optimization strategy to build the reverse dict in parallel, as you
execute the first search; that way you can combine the time spent on
building the reverse dict and the lookup, to get a total of O(n*m)
rather than O(n^2*m). The first search is "free" since you need the
reverse dict anyway.
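
To make the trade-off concrete, here is a sketch that counts how many list elements each approach touches on the five-key example from the thread; the function names and counters are illustrative only. For reference, a lookup in a CPython dict is a hash-table access that takes constant time on average, so once the reverse dict exists a search costs one dict access rather than a scan.

```python
def scan_lookup(data, target):
    """Linear search; returns (key or None, number of elements examined)."""
    visited = 0
    for key, values in data.items():
        for value in values:
            visited += 1
            if value == target:
                return key, visited
    return None, visited

def build_reverse(data):
    """One full pass; returns (reverse dict, number of elements examined)."""
    reverse = {}
    visited = 0
    for key, values in data.items():
        for value in values:
            visited += 1
            reverse[value] = key
    return reverse, visited

data = {1: [1, 2, 3, 4, 5],
        2: [10, 11, 12],
        900000: [100, 101, 102, 103, 104, 105],
        900001: [20, 21, 22],
        999999: [15, 16, 17, 18, 19]}

scan_key, scan_cost = scan_lookup(data, 19)   # worst case: last element
reverse, build_cost = build_reverse(data)     # paid once, up front
fast_key = reverse.get(19)                    # one dict access per search
```

On this data a worst-case scan and the full build both touch all 22 elements, which is why the reverse dict only pays off when there is more than one search.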

Regards,
Jordan
Dec 10 '07 #15
On 2007-12-10, MonkeeSage <Mo********@gmail.com> wrote:
> If I'm not mistaken, building a reverse dictionary like that will be
> O(n*m) because dict/list access is O(n) (amortized). Somebody correct
> me if I'm wrong. In that case, it really depends on how you will use
> the dict to see whether you get any benefit from building the reversed
> dict. If you want to do several lookups, then the initial overhead
> (speed/memory) of building the reversed dict might be worth it so that
> you can just run lookups at O(n).

It also depends on whether the dictionary will be mutated between
reverse lookups.

> But if you only need it once, it is a waste of time and space
> to create a reverse dict when your access time is the same for
> the lookup as for building the reversed dict.
>
> If you do need more than one lookup, it would also be a good
> optimization strategy to build the reverse dict in parallel, as
> you execute the first search; that way you can combine the time
> spent on building the reverse dict and the lookup, to get a
> total of O(n*m) rather than O(n^2*m). The first search is
> "free" since you need the reverse dict anyway.

It wouldn't be merely an optimization if reverse lookups and
mutations were interleaved.

--
Neil Cerutti
You only get a once-in-a-lifetime opportunity so many times. --Ike Taylor
Dec 10 '07 #16
On Dec 10, 8:31 am, Neil Cerutti <horp...@yahoo.com> wrote:
> On 2007-12-10, MonkeeSage <MonkeeS...@gmail.com> wrote:
> > If I'm not mistaken, building a reverse dictionary like that will be
> > O(n*m) because dict/list access is O(n) (amortized). Somebody correct
> > me if I'm wrong. In that case, it really depends on how you will use
> > the dict to see whether you get any benefit from building the reversed
> > dict. If you want to do several lookups, then the initial overhead
> > (speed/memory) of building the reversed dict might be worth it so that
> > you can just run lookups at O(n).
>
> It also depends on whether the dictionary will be mutated between
> reverse lookups.
> > But if you only need it once, it is a waste of time and space
> > to create a reverse dict when your access time is the same for
> > the lookup as for building the reversed dict.
> > If you do need more than one lookup, it would also be a good
> > optimization strategy to build the reverse dict in parallel, as
> > you execute the first search; that way you can combine the time
> > spent on building the reverse dict and the lookup, to get a
> > total of O(n*m) rather than O(n^2*m). The first search is
> > "free" since you need the reverse dict anyway.
>
> It wouldn't be merely an optimization if reverse lookups and
> mutations were interleaved.
>
> --
> Neil Cerutti
> You only get a once-in-a-lifetime opportunity so many times. --Ike Taylor

Well true, but you enter a whole other level of complexity in that
case...something like Theta(log(n*(m-n))). I might have calculated
that incorrectly, but that just goes to show how complex a lookup
is(!) in such a case.

Regards,
Jordan
Dec 10 '07 #17
On Dec 10, 9:45 am, MonkeeSage <MonkeeS...@gmail.com> wrote:
> On Dec 10, 8:31 am, Neil Cerutti <horp...@yahoo.com> wrote:
> > On 2007-12-10, MonkeeSage <MonkeeS...@gmail.com> wrote:
> > > If I'm not mistaken, building a reverse dictionary like that will be
> > > O(n*m) because dict/list access is O(n) (amortized). Somebody correct
> > > me if I'm wrong. In that case, it really depends on how you will use
> > > the dict to see whether you get any benefit from building the reversed
> > > dict. If you want to do several lookups, then the initial overhead
> > > (speed/memory) of building the reversed dict might be worth it so that
> > > you can just run lookups at O(n).
> > It also depends on whether the dictionary will be mutated between
> > reverse lookups.
> > > But if you only need it once, it is a waste of time and space
> > > to create a reverse dict when your access time is the same for
> > > the lookup as for building the reversed dict.
> > > If you do need more than one lookup, it would also be a good
> > > optimization strategy to build the reverse dict in parallel, as
> > > you execute the first search; that way you can combine the time
> > > spent on building the reverse dict and the lookup, to get a
> > > total of O(n*m) rather than O(n^2*m). The first search is
> > > "free" since you need the reverse dict anyway.
> > It wouldn't be merely an optimization if reverse lookups and
> > mutations were interleaved.
> > --
> > Neil Cerutti
> > You only get a once-in-a-lifetime opportunity so many times. --Ike Taylor
>
> Well true, but you enter a whole other level of complexity in that
> case...something like Theta(log(n*(m-n))). I might have calculated
> that incorrectly, but that just goes to show how complex a lookup
> is(!) in such a case.
>
> Regards,
> Jordan

Sorry for the double-post... Google is being belligerent right now.
Dec 10 '07 #19
Seongsu Lee:
>I have a dictionary with million keys. Each value in the dictionary has a list with up to thousand integers.<
Let's say each integer can be represented with 32 bits (if there are
fewer distinct numbers then a 3-byte representation may suffice, but
this makes things more complex), that is 2^2 bytes. Let's say there are
2^20 keys, each one associated with 2^10 values. So to represent the
values you need 2^32 bytes. That means 4 GB, so I don't think Python
can store them in RAM, because a Python int object requires
considerably more than 4 bytes (only inside an array.array may it need
just 4 bytes).
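
The estimate can be checked with a few lines of Python. The figures 2^20 keys, 2^10 values per key and 4 bytes per value are the assumptions stated above; the sizes reported by `sys.getsizeof` and `array.itemsize` depend on the Python build and platform, so no specific numbers are claimed for them.

```python
import sys
from array import array

keys = 2**20             # about a million keys (assumed above)
values_per_key = 2**10   # about a thousand integers per list (upper bound)
bytes_per_value = 4      # 32-bit representation

raw_bytes = keys * values_per_key * bytes_per_value
raw_gib = raw_bytes / 2**30          # size of the bare values in GiB

# Per-item cost of a boxed Python int versus a packed C int in an array.
int_object_size = sys.getsizeof(1000)
packed_item_size = array("i").itemsize
```

The bare values alone come to 2^32 bytes; a list of int objects adds the per-object overhead plus a pointer per element on top of that.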

So if you can use only 128 MB of RAM for such a data structure, you need
to store data on the HD too. You can probably use a lower-level language.
On disk you can keep the reverse index, represented as an array of
records/structs, where each struct holds two 32-bit numbers (so
the array is 8 GB). The index is sorted according to the first
element of the struct. The first number is a value from the original
dictionary and the second number is its key. In RAM you can
keep another sorted array that "summarizes" your whole data. When you
need a number you can do a binary search on the array in RAM; that
array gives you the position where you can read (with a seek) a small
part of the file (512 bytes may suffice), and then perform a small binary
search on it too (if the block is very small a linear scan suffices)
to find the number you need. Note that the summarizing data structure
in RAM may be represented with just a Python dict too, so in the end
you can use Python to solve this problem. You may need a lower-level
language to create the 8 GB file on disk (or create it with Python,
but it may take a lot of time; you may sort it with the Unix sort
command).
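
A toy-scale sketch of this two-level scheme, under several assumptions that are not in the post: records are packed with the standard `struct` module as two little-endian unsigned 32-bit integers, the in-RAM summary is a plain list holding the first value of each block, and the sorting is done in memory (for the real 8 GB file it would be done externally, as suggested above). The function names and the `per_block` parameter are hypothetical.

```python
import struct
from bisect import bisect_right

RECORD = struct.Struct("<II")   # (value, key): two unsigned 32-bit integers

def write_index(path, data, per_block=64):
    """Write sorted (value, key) records to path; return the RAM summary.

    The summary holds the first value of every block of per_block records
    (64 records * 8 bytes = 512-byte blocks by default).
    """
    pairs = sorted((v, k) for k, values in data.items() for v in values)
    summary = []
    with open(path, "wb") as f:
        for i, (value, key) in enumerate(pairs):
            if i % per_block == 0:
                summary.append(value)
            f.write(RECORD.pack(value, key))
    return summary

def lookup(path, summary, value, per_block=64):
    """Binary-search the summary, then scan one block read from disk."""
    block = bisect_right(summary, value) - 1
    if block < 0:
        return None                       # smaller than every stored value
    with open(path, "rb") as f:
        f.seek(block * per_block * RECORD.size)
        chunk = f.read(per_block * RECORD.size)
    for v, k in RECORD.iter_unpack(chunk):
        if v == value:
            return k
    return None
```

Each lookup reads a single block from disk, and the summary needs only one integer per block, which is what keeps the RAM footprint small. The sketch relies on the integers being unique, as the original poster confirmed.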

This isn't a complete solution, but I think it may work.

Bye,
bearophile
Dec 10 '07 #20
On Dec 10, 1:28 pm, bearophileH...@lycos.com wrote:
> Seongsu Lee:
> >I have a dictionary with million keys. Each value in the dictionary has a list with up to thousand integers.<
>
> Let's say each integer can be represented with 32 bits (if there are
> fewer distinct numbers then a 3-byte representation may suffice, but
> this makes things more complex), that is 2^2 bytes. Let's say there are
> 2^20 keys, each one associated with 2^10 values. So to represent the
> values you need 2^32 bytes. That means 4 GB, so I don't think Python
> can store them in RAM, because a Python int object requires
> considerably more than 4 bytes (only inside an array.array may it need
> just 4 bytes).
>
> So if you can use only 128 MB of RAM for such a data structure, you need
> to store data on the HD too. You can probably use a lower-level language.
> On disk you can keep the reverse index, represented as an array of
> records/structs, where each struct holds two 32-bit numbers (so
> the array is 8 GB). The index is sorted according to the first
> element of the struct. The first number is a value from the original
> dictionary and the second number is its key. In RAM you can
> keep another sorted array that "summarizes" your whole data. When you
> need a number you can do a binary search on the array in RAM; that
> array gives you the position where you can read (with a seek) a small
> part of the file (512 bytes may suffice), and then perform a small binary
> search on it too (if the block is very small a linear scan suffices)
> to find the number you need. Note that the summarizing data structure
> in RAM may be represented with just a Python dict too, so in the end
> you can use Python to solve this problem. You may need a lower-level
> language to create the 8 GB file on disk (or create it with Python,
> but it may take a lot of time; you may sort it with the Unix sort
> command).
>
> This isn't a complete solution, but I think it may work.
>
> Bye,
> bearophile

Nice. :)

Regards,
Jordan
Dec 14 '07 #21
