473,686 Members | 2,886 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

matching objects by a tuple field criterion

i have a large collection of python objects, each of which contains an
integer 6-tuple as part of its data payload. what i need to be able to
do is select only those objects which meet a simple tuple element
wildcard matching criterion. e.g. given the following python objects:

object A includes tuple (1,2,3,4,5,6)
object B includes tuple (1,4,4,4,11,1)
object C includes tuple (1,3,9,1,1,1)

all tuples are unique. for what it's worth, the values in each field
are independent of the other fields and range from 1 to 14. although
'large', my collection is sparse cf. the 14^6 possible tuples.

i want to search on *one only* tuple field/value. if my search
criterion is (*,*,*,4,*,*), then i want to find object A and object B.
if (1,*,*,*,*,*), i want to find objects A, B, and C, etc. i will only
ever specify an integer match for one tuple field.

i can think of some naive approaches, but is there an elegant way to
do this?

Jun 10 '07 #1
9 3484
On Sun, 10 Jun 2007 03:58:44 -0700, bullockbefriend ing bard wrote:
i have a large collection of python objects, each of which contains an
integer 6-tuple as part of its data payload. what i need to be able to
do is select only those objects which meet a simple tuple element
wildcard matching criterion. e.g. given the following python objects:

object A includes tuple (1,2,3,4,5,6)
object B includes tuple (1,4,4,4,11,1)
object C includes tuple (1,3,9,1,1,1)

all tuples are unique. for what it's worth, the values in each field
are independent of the other fields and range from 1 to 14. although
'large', my collection is sparse cf. the 14^6 possible tuples.

i want to search on *one only* tuple field/value. if my search
criterion is (*,*,*,4,*,*), then i want to find object A and object B.
if (1,*,*,*,*,*), i want to find objects A, B, and C, etc. i will only
ever specify an integer match for one tuple field.

i can think of some naive approaches, but is there an elegant way to
do this?
Instead of passing a wild-card tuple like (*,*,*,4,*,*) simply pass the
integer you want to match and the position you want to match it in.

As a generator:
def matcher(list_of _objects, what, where):
for obj in list_of_objects :
if obj.data[where] == what:
yield obj
As a generator expression:

(obj for obj in list_of_objects if obj.data[what] == where)

--
Steven.

Jun 10 '07 #2
bullockbefriend ing bard schrieb:
i have a large collection of python objects, each of which contains an
integer 6-tuple as part of its data payload. what i need to be able to
do is select only those objects which meet a simple tuple element
wildcard matching criterion. e.g. given the following python objects:

object A includes tuple (1,2,3,4,5,6)
object B includes tuple (1,4,4,4,11,1)
object C includes tuple (1,3,9,1,1,1)

all tuples are unique. for what it's worth, the values in each field
are independent of the other fields and range from 1 to 14. although
'large', my collection is sparse cf. the 14^6 possible tuples.

i want to search on *one only* tuple field/value. if my search
criterion is (*,*,*,4,*,*), then i want to find object A and object B.
if (1,*,*,*,*,*), i want to find objects A, B, and C, etc. i will only
ever specify an integer match for one tuple field.

i can think of some naive approaches, but is there an elegant way to
do this?
Depends on what you find elegant. Are the criteria runtime-specified,
and is anything other than the * allowerd, e.g. [1,3,8]? IMHO the best
thing is to create a filter-function that you then use to... filter :)

Like this:

def create_filter_p redicate(criter ia):
"""
criteria is an iterable containing either
an '*' or a list of comma-separated
integers
"""
sub_preds = []
for i, sub_crit in enumerate(crite ria):
if sub_crit == "*":
continue
matching_set = set(int(n) for n in sub_crit.split( ","))
sub_preds.appen d((i, matching_set))
def predicate(o):
t = o.my_tuple
for i, ms in sub_preds:
if not t[i] in ms:
return False
return True
return predicate

Diez
Jun 10 '07 #3
Diez B. Roggisch schrieb:
bullockbefriend ing bard schrieb:
>i have a large collection of python objects, each of which contains an
integer 6-tuple as part of its data payload. what i need to be able to
do is select only those objects which meet a simple tuple element
wildcard matching criterion. e.g. given the following python objects:

object A includes tuple (1,2,3,4,5,6)
object B includes tuple (1,4,4,4,11,1)
object C includes tuple (1,3,9,1,1,1)

all tuples are unique. for what it's worth, the values in each field
are independent of the other fields and range from 1 to 14. although
'large', my collection is sparse cf. the 14^6 possible tuples.

i want to search on *one only* tuple field/value. if my search
criterion is (*,*,*,4,*,*), then i want to find object A and object B.
if (1,*,*,*,*,*), i want to find objects A, B, and C, etc. i will only
ever specify an integer match for one tuple field.

i can think of some naive approaches, but is there an elegant way to
do this?

Depends on what you find elegant. Are the criteria runtime-specified,
and is anything other than the * allowerd, e.g. [1,3,8]? IMHO the best
thing is to create a filter-function that you then use to... filter :)

Like this:

def create_filter_p redicate(criter ia):
"""
criteria is an iterable containing either
an '*' or a list of comma-separated
integers
"""
sub_preds = []
for i, sub_crit in enumerate(crite ria):
if sub_crit == "*":
continue
matching_set = set(int(n) for n in sub_crit.split( ","))
sub_preds.appen d((i, matching_set))
def predicate(o):
t = o.my_tuple
for i, ms in sub_preds:
if not t[i] in ms:
return False
return True
return predicate
Obviously the docs are faulty - the criteria looks like this:

['*', '1,2,3']

But I think you can get the gist of it - regardless on how the actual
criteria are entered.

Diez
Jun 10 '07 #4
Instead of passing a wild-card tuple like (*,*,*,4,*,*) simply pass the
integer you want to match and the position you want to match it in.
for sure. that was more for expository purpose rather than how i was
planning to go about it.

As a generator expression:

(obj for obj in list_of_objects if obj.data[what] == where)
above or equivalent list comprehension was what i had in mind as far
as linear search goes. and scanning the list like this will most
likely be 'good enough' performance-wise. however, partly just out of
curiosity, i was wondering if there is some kind of data structure
which might let me find all the matches a bit faster.

Jun 10 '07 #5
bullockbefriend ing bard schrieb:
>Instead of passing a wild-card tuple like (*,*,*,4,*,*) simply pass the
integer you want to match and the position you want to match it in.

for sure. that was more for expository purpose rather than how i was
planning to go about it.

>As a generator expression:

(obj for obj in list_of_objects if obj.data[what] == where)

above or equivalent list comprehension was what i had in mind as far
as linear search goes. and scanning the list like this will most
likely be 'good enough' performance-wise. however, partly just out of
curiosity, i was wondering if there is some kind of data structure
which might let me find all the matches a bit faster.
You can of course create a tree from the tuples, where the first level
of nodes corresponds to the first attribute of the tuple and so forth.

There are certainly cases where the speedup is tremendous - think of a
single integer in the first criteria - but then the overall performance
depends on the real-live queries. If lot's of wildcards are used, you
might end up slower if the tree-walk takes more time than the
C-implemented list-iteration will cost you.

Diez
Jun 10 '07 #6
On Jun 10, 8:58 pm, bullockbefriend ing bard <kinch1...@gmai l.com>
wrote:
i have a large collection of python objects, each of which contains an
integer 6-tuple as part of its data payload. what i need to be able to
do is select only those objects which meet a simple tuple element
wildcard matching criterion. e.g. given the following python objects:

object A includes tuple (1,2,3,4,5,6)
object B includes tuple (1,4,4,4,11,1)
object C includes tuple (1,3,9,1,1,1)

all tuples are unique. for what it's worth, the values in each field
are independent of the other fields and range from 1 to 14. although
'large', my collection is sparse cf. the 14^6 possible tuples.

i want to search on *one only* tuple field/value. if my search
criterion is (*,*,*,4,*,*), then i want to find object A and object B.
if (1,*,*,*,*,*), i want to find objects A, B, and C, etc. i will only
ever specify an integer match for one tuple field.

i can think of some naive approaches, but is there an elegant way to
do this?
You are going to have to tell us how many is in a "large" collection,
how often you will do this query, how much memory you have to spare,
how fast you want the answer back, whether you want the answer as a
generator, or in a list/tuple/set/whatever -- or you are going to have
to do a fair bit of prototyping and benchmarking.

Steven has given you one end of the spectrum: a naive serial scan. Now
I'll give you another end: a naive pre-build all possible answers
method:

[quite untested]
# build 14x6 array of empty sets
answer = [[set() for what in range(14)] for slot in range(6)]
# populate the sucker
for obj in obj_list:
for slot in range(6):
answer[slot][obj.data[slot]-1].add(obj)

Later, the answer to 'Which objects have the value 11 in slot 3?' is
answer[3][11-1]

HTH,
John

Jun 10 '07 #7
quite so, i rephrased docstring to be:

"""criteria is an iterable containing either '*' instances or strings
of comma-separated integers. e.g. ['*','1,2,3', '11,12']"""

thanks very much for the idea! upon further reflection, this seems to
be a more elegant solution for my case than the ad-hoc generator or
list comprehension approach. this is because i *do* have to filter
data based on multiple single field criteria and i know all of these
criteria at the same time - so it makes a lot of sense to do as you
have done and then only do one filter operation to pull out all the
objects i am interested in.

Jun 10 '07 #8
There are certainly cases where the speedup is tremendous - think of a
single integer in the first criteria - but then the overall performance
depends on the real-live queries. If lot's of wildcards are used, you
might end up slower if the tree-walk takes more time than the
C-implemented list-iteration will cost you.
hopefully, won't need to do this then. i'll test the filter approach
tonight. generally speaking, going to be mostly wildcards most of the
time.

Jun 10 '07 #9
On Jun 10, 10:32 pm, bullockbefriend ing bard <kinch1...@gmai l.com>
wrote:
quite so, i rephrased docstring to be:

"""criteria is an iterable containing either '*' instances or strings
of comma-separated integers. e.g. ['*','1,2,3', '11,12']"""

thanks very much for the idea! upon further reflection, this seems to
be a more elegant solution for my case than the ad-hoc generator or
list comprehension approach. this is because i *do* have to filter
data based on multiple single field criteria and i know all of these
criteria at the same time - so it makes a lot of sense to do as you
have done and then only do one filter operation to pull out all the
objects i am interested in.
"""i want to search on *one only* tuple field/value."""
"""i will only ever specify an integer match for one tuple field."""
"""the cheque's in the mail"""
"""of course i'll still love you in the morning"""

So augment my pre-built approach by intersection (or union? your
wording is somewhat vague) of the selected answer sets.
Jun 10 '07 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
3179
by: Grumfish | last post by:
In order to familiarize my self with Flash files and their bytecode I've started to make an assembler. My first problem is writing the bitfields the format uses often. It is a series of fields, each can be a different number of bits, combined into the least amount of bytes possible. Extra bits in the last byte are padded with zeros. I would like to make a function that takes a size and value for each field needed, calculate the amount of...
20
10145
by: | last post by:
If I need to check if a certain value does exist in a field, and return either "yes" or "not" which query would be the most effestive?
1
1546
by: Nels P. Olsen | last post by:
If you serialize an object graph to persist it somewhere (e.g. a database) between application sessions, then you run into trouble if the serialized objects' existing field order or datatypes change, or if the object is moved from one assembly or namespace to another. Even if we're careful about making such changes, we need to serialize object graphs which may contain objects defined by the user, and users are sure to make incompatible changes...
1
1974
by: fabioppp | last post by:
I'm using g++, a quite new version. Why does this code not compile? #include "HierarchyGenerators.h" using namespace Loki; int main() { Tuple<TYPELIST_2(int,int)> tuple; Field<0>(tuple) = 4; };
2
1757
by: Tim Tabor | last post by:
How can I delete records from a table with a criterion that depends on a value in an *unrelated* table? In other words, something like: DELETE tblPortfolioHistory.* FROM tblPortfolioHistory WHERE (((tblPortfolioHistory.Date)=.));
2
3482
by: Alan Isaac | last post by:
I am probably confused about immutable types. But for now my questions boil down to these two: - what does ``tuple.__init__`` do? - what is the signature of ``tuple.__init__``? These questions are stimulated by http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/303439 Looking at that, what fails if I leave out the following line? ::
7
3235
by: turtle | last post by:
I want to find out the max value of a field on a report if the field is not hidden. I have formatting on the report and if the field doesn't meet a certain criteria then it is hidden. I want to get a max of the field for the ones that are not hidden. is this possible? TIA, KO
3
1189
by: Joe Blow | last post by:
What is the best way to detect a TypeNone field in a tuple, or in a list? I am accessing a MySQL database using the MySQLdb Python interface... this interface returns a tuple object type in response to SQL SELECT statements. My understanding of the MySQLdb interface is that NULL database values are returned as a Python 'None' object. Because I need to create some work fields based on the contents of the database I am doing so by...
1
3440
by: Joe Strout | last post by:
Wow, this was harder than I thought (at least for a rusty Pythoneer like myself). Here's my stab at an implementation. Remember, the goal is to add a "match" method to Template which works like Template.substitute, but in reverse: given a string, if that string matches the template, then it should return a dictionary mapping each template field to the corresponding value in the given string. Oh, and as one extra feature, I want to...
0
8584
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
8516
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9054
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
8768
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8778
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
6440
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5796
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4308
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4532
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.