i have a large collection of python objects, each of which contains an
integer 6-tuple as part of its data payload. what i need to be able to
do is select only those objects which meet a simple tuple element
wildcard matching criterion. e.g. given the following python objects:
object A includes tuple (1,2,3,4,5,6)
object B includes tuple (1,4,4,4,11,1)
object C includes tuple (1,3,9,1,1,1)
all tuples are unique. for what it's worth, the values in each field
are independent of the other fields and range from 1 to 14. although
'large', my collection is sparse cf. the 14^6 possible tuples.
i want to search on *one only* tuple field/value. if my search
criterion is (*,*,*,4,*,*), then i want to find object A and object B.
if (1,*,*,*,*,*), i want to find objects A, B, and C, etc. i will only
ever specify an integer match for one tuple field.
i can think of some naive approaches, but is there an elegant way to
do this? 9 3356
On Sun, 10 Jun 2007 03:58:44 -0700, bullockbefriending bard wrote:
i have a large collection of python objects, each of which contains an
integer 6-tuple as part of its data payload. what i need to be able to
do is select only those objects which meet a simple tuple element
wildcard matching criterion. e.g. given the following python objects:
object A includes tuple (1,2,3,4,5,6)
object B includes tuple (1,4,4,4,11,1)
object C includes tuple (1,3,9,1,1,1)
all tuples are unique. for what it's worth, the values in each field
are independent of the other fields and range from 1 to 14. although
'large', my collection is sparse cf. the 14^6 possible tuples.
i want to search on *one only* tuple field/value. if my search
criterion is (*,*,*,4,*,*), then i want to find object A and object B.
if (1,*,*,*,*,*), i want to find objects A, B, and C, etc. i will only
ever specify an integer match for one tuple field.
i can think of some naive approaches, but is there an elegant way to
do this?
Instead of passing a wild-card tuple like (*,*,*,4,*,*) simply pass the
integer you want to match and the position you want to match it in.
As a generator:
def matcher(list_of_objects, what, where):
for obj in list_of_objects:
if obj.data[where] == what:
yield obj
As a generator expression:
(obj for obj in list_of_objects if obj.data[what] == where)
--
Steven.
bullockbefriending bard schrieb:
i have a large collection of python objects, each of which contains an
integer 6-tuple as part of its data payload. what i need to be able to
do is select only those objects which meet a simple tuple element
wildcard matching criterion. e.g. given the following python objects:
object A includes tuple (1,2,3,4,5,6)
object B includes tuple (1,4,4,4,11,1)
object C includes tuple (1,3,9,1,1,1)
all tuples are unique. for what it's worth, the values in each field
are independent of the other fields and range from 1 to 14. although
'large', my collection is sparse cf. the 14^6 possible tuples.
i want to search on *one only* tuple field/value. if my search
criterion is (*,*,*,4,*,*), then i want to find object A and object B.
if (1,*,*,*,*,*), i want to find objects A, B, and C, etc. i will only
ever specify an integer match for one tuple field.
i can think of some naive approaches, but is there an elegant way to
do this?
Depends on what you find elegant. Are the criteria runtime-specified,
and is anything other than the * allowerd, e.g. [1,3,8]? IMHO the best
thing is to create a filter-function that you then use to... filter :)
Like this:
def create_filter_predicate(criteria):
"""
criteria is an iterable containing either
an '*' or a list of comma-separated
integers
"""
sub_preds = []
for i, sub_crit in enumerate(criteria):
if sub_crit == "*":
continue
matching_set = set(int(n) for n in sub_crit.split(","))
sub_preds.append((i, matching_set))
def predicate(o):
t = o.my_tuple
for i, ms in sub_preds:
if not t[i] in ms:
return False
return True
return predicate
Diez
Diez B. Roggisch schrieb:
bullockbefriending bard schrieb:
>i have a large collection of python objects, each of which contains an integer 6-tuple as part of its data payload. what i need to be able to do is select only those objects which meet a simple tuple element wildcard matching criterion. e.g. given the following python objects:
object A includes tuple (1,2,3,4,5,6) object B includes tuple (1,4,4,4,11,1) object C includes tuple (1,3,9,1,1,1)
all tuples are unique. for what it's worth, the values in each field are independent of the other fields and range from 1 to 14. although 'large', my collection is sparse cf. the 14^6 possible tuples.
i want to search on *one only* tuple field/value. if my search criterion is (*,*,*,4,*,*), then i want to find object A and object B. if (1,*,*,*,*,*), i want to find objects A, B, and C, etc. i will only ever specify an integer match for one tuple field.
i can think of some naive approaches, but is there an elegant way to do this?
Depends on what you find elegant. Are the criteria runtime-specified,
and is anything other than the * allowerd, e.g. [1,3,8]? IMHO the best
thing is to create a filter-function that you then use to... filter :)
Like this:
def create_filter_predicate(criteria):
"""
criteria is an iterable containing either
an '*' or a list of comma-separated
integers
"""
sub_preds = []
for i, sub_crit in enumerate(criteria):
if sub_crit == "*":
continue
matching_set = set(int(n) for n in sub_crit.split(","))
sub_preds.append((i, matching_set))
def predicate(o):
t = o.my_tuple
for i, ms in sub_preds:
if not t[i] in ms:
return False
return True
return predicate
Obviously the docs are faulty - the criteria looks like this:
['*', '1,2,3']
But I think you can get the gist of it - regardless on how the actual
criteria are entered.
Diez
Instead of passing a wild-card tuple like (*,*,*,4,*,*) simply pass the
integer you want to match and the position you want to match it in.
for sure. that was more for expository purpose rather than how i was
planning to go about it.
As a generator expression:
(obj for obj in list_of_objects if obj.data[what] == where)
above or equivalent list comprehension was what i had in mind as far
as linear search goes. and scanning the list like this will most
likely be 'good enough' performance-wise. however, partly just out of
curiosity, i was wondering if there is some kind of data structure
which might let me find all the matches a bit faster.
bullockbefriending bard schrieb:
>Instead of passing a wild-card tuple like (*,*,*,4,*,*) simply pass the integer you want to match and the position you want to match it in.
for sure. that was more for expository purpose rather than how i was
planning to go about it.
>As a generator expression:
(obj for obj in list_of_objects if obj.data[what] == where)
above or equivalent list comprehension was what i had in mind as far
as linear search goes. and scanning the list like this will most
likely be 'good enough' performance-wise. however, partly just out of
curiosity, i was wondering if there is some kind of data structure
which might let me find all the matches a bit faster.
You can of course create a tree from the tuples, where the first level
of nodes corresponds to the first attribute of the tuple and so forth.
There are certainly cases where the speedup is tremendous - think of a
single integer in the first criteria - but then the overall performance
depends on the real-live queries. If lot's of wildcards are used, you
might end up slower if the tree-walk takes more time than the
C-implemented list-iteration will cost you.
Diez
On Jun 10, 8:58 pm, bullockbefriending bard <kinch1...@gmail.com>
wrote:
i have a large collection of python objects, each of which contains an
integer 6-tuple as part of its data payload. what i need to be able to
do is select only those objects which meet a simple tuple element
wildcard matching criterion. e.g. given the following python objects:
object A includes tuple (1,2,3,4,5,6)
object B includes tuple (1,4,4,4,11,1)
object C includes tuple (1,3,9,1,1,1)
all tuples are unique. for what it's worth, the values in each field
are independent of the other fields and range from 1 to 14. although
'large', my collection is sparse cf. the 14^6 possible tuples.
i want to search on *one only* tuple field/value. if my search
criterion is (*,*,*,4,*,*), then i want to find object A and object B.
if (1,*,*,*,*,*), i want to find objects A, B, and C, etc. i will only
ever specify an integer match for one tuple field.
i can think of some naive approaches, but is there an elegant way to
do this?
You are going to have to tell us how many is in a "large" collection,
how often you will do this query, how much memory you have to spare,
how fast you want the answer back, whether you want the answer as a
generator, or in a list/tuple/set/whatever -- or you are going to have
to do a fair bit of prototyping and benchmarking.
Steven has given you one end of the spectrum: a naive serial scan. Now
I'll give you another end: a naive pre-build all possible answers
method:
[quite untested]
# build 14x6 array of empty sets
answer = [[set() for what in range(14)] for slot in range(6)]
# populate the sucker
for obj in obj_list:
for slot in range(6):
answer[slot][obj.data[slot]-1].add(obj)
Later, the answer to 'Which objects have the value 11 in slot 3?' is
answer[3][11-1]
HTH,
John
quite so, i rephrased docstring to be:
"""criteria is an iterable containing either '*' instances or strings
of comma-separated integers. e.g. ['*','1,2,3', '11,12']"""
thanks very much for the idea! upon further reflection, this seems to
be a more elegant solution for my case than the ad-hoc generator or
list comprehension approach. this is because i *do* have to filter
data based on multiple single field criteria and i know all of these
criteria at the same time - so it makes a lot of sense to do as you
have done and then only do one filter operation to pull out all the
objects i am interested in.
There are certainly cases where the speedup is tremendous - think of a
single integer in the first criteria - but then the overall performance
depends on the real-live queries. If lot's of wildcards are used, you
might end up slower if the tree-walk takes more time than the
C-implemented list-iteration will cost you.
hopefully, won't need to do this then. i'll test the filter approach
tonight. generally speaking, going to be mostly wildcards most of the
time.
On Jun 10, 10:32 pm, bullockbefriending bard <kinch1...@gmail.com>
wrote:
quite so, i rephrased docstring to be:
"""criteria is an iterable containing either '*' instances or strings
of comma-separated integers. e.g. ['*','1,2,3', '11,12']"""
thanks very much for the idea! upon further reflection, this seems to
be a more elegant solution for my case than the ad-hoc generator or
list comprehension approach. this is because i *do* have to filter
data based on multiple single field criteria and i know all of these
criteria at the same time - so it makes a lot of sense to do as you
have done and then only do one filter operation to pull out all the
objects i am interested in.
"""i want to search on *one only* tuple field/value."""
"""i will only ever specify an integer match for one tuple field."""
"""the cheque's in the mail"""
"""of course i'll still love you in the morning"""
So augment my pre-built approach by intersection (or union? your
wording is somewhat vague) of the selected answer sets. This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics
by: Grumfish |
last post by:
In order to familiarize my self with Flash files and their bytecode I've
started to make an assembler. My first problem is writing the bitfields...
|
by: |
last post by:
If I need to check if a certain value does exist in a field, and return
either "yes" or "not" which query would be the most effestive?
|
by: Nels P. Olsen |
last post by:
If you serialize an object graph to persist it somewhere (e.g. a database) between application sessions, then you run into trouble if the serialized...
|
by: fabioppp |
last post by:
I'm using g++, a quite new version.
Why does this code not compile?
#include "HierarchyGenerators.h"
using namespace Loki;
int main()
{...
|
by: Tim Tabor |
last post by:
How can I delete records from a table with a criterion that depends on a
value in an *unrelated* table? In other words, something like:
DELETE ...
|
by: Alan Isaac |
last post by:
I am probably confused about immutable types.
But for now my questions boil down to these two:
- what does ``tuple.__init__`` do?
- what is the...
|
by: turtle |
last post by:
I want to find out the max value of a field on a report if the field
is not hidden. I have formatting on the report and if the field
doesn't meet...
|
by: Joe Blow |
last post by:
What is the best way to detect a TypeNone field in a tuple, or in a list?
I am accessing a MySQL database using the MySQLdb Python interface......
|
by: Joe Strout |
last post by:
Wow, this was harder than I thought (at least for a rusty Pythoneer
like myself). Here's my stab at an implementation. Remember, the
goal is to...
|
by: concettolabs |
last post by:
In today's business world, businesses are increasingly turning to PowerApps to develop custom business applications. PowerApps is a powerful tool...
|
by: Kemmylinns12 |
last post by:
Blockchain technology has emerged as a transformative force in the business world, offering unprecedented opportunities for innovation and...
|
by: CD Tom |
last post by:
This only shows up in access runtime. When a user select a report from my report menu when they close the report they get a menu I've called Add-ins...
|
by: Naresh1 |
last post by:
What is WebLogic Admin Training?
WebLogic Admin Training is a specialized program designed to equip individuals with the skills and knowledge...
|
by: jalbright99669 |
last post by:
Am having a bit of a time with URL Rewrite. I need to incorporate http to https redirect with a reverse proxy. I have the URL Rewrite rules made...
|
by: Matthew3360 |
last post by:
Hi there. I have been struggling to find out how to use a variable as my location in my header redirect function.
Here is my code.
...
|
by: Matthew3360 |
last post by:
Hi, I have a python app that i want to be able to get variables from a php page on my webserver. My python app is on my computer. How would I make it...
|
by: AndyPSV |
last post by:
HOW CAN I CREATE AN AI with an .executable file that would suck all files in the folder and on my computerHOW CAN I CREATE AN AI with an .executable...
|
by: WisdomUfot |
last post by:
It's an interesting question you've got about how Gmail hides the HTTP referrer when a link in an email is clicked. While I don't have the specific...
| |