merge & de-duplicate lists

I need to merge and de-duplicate some lists, and I have some code
which works but doesn't seem particularly elegant. I was wondering if
somebody could point me at a cleaner way to do it.

Here's my function:

+++++++++++++++++++

from sets import Set

def mergeLists(initialList, sourceOfAdditionalLists,
               nameOfMethodToCall):
    workingSet = Set(initialList)
    for s in sourceOfAdditionalLists:
        getList = s.__getattribute__(nameOfMethodToCall)
        workingSet = workingSet.union(Set(
            callable(getList) and getList() or getList))
    return workingSet

+++++++++++++++++++

Two questions - passing the *name* of the method to call, and then
looking it up for each object in the list of extra sources (all of
which need to be new-style objects - not a problem in my application)
seems inelegant. My "sourceOfAdditionalLists" are normally all of the
same class - is there something I can bind at class level that
automagically refers to instance-level attributes when invoked?

Second (hopefully clearer & less obscure) question : is
sets.Set.union() an efficient way to do list de-duplication? Seems
like the obvious tool for the job.
Jul 18 '05 #1
Alan Little wrote:
> from sets import Set
>
> def mergeLists(initialList, sourceOfAdditionalLists,
>                nameOfMethodToCall):
>     workingSet = Set(initialList)
>     for s in sourceOfAdditionalLists:
>         getList = s.__getattribute__(nameOfMethodToCall)

The normal way to write this line would rather be:

    getList = getattr(s, nameOfMethodToCall)

>         workingSet = workingSet.union(Set(
>             callable(getList) and getList() or getList))
>     return workingSet

> Two questions - passing the *name* of the method to call, and then
> looking it up for each object in the list of extra sources (all of
> which need to be new-style objects - not a problem in my application)
> seems inelegant. My "sourceOfAdditionalLists" are normally all of the
> same class - is there something I can bind at class level that
> automagically refers to instance-level attributes when invoked?

I'm not quite sure what you mean here. You can of course play
around with descriptors, e.g. properties or your own custom ones.
But that 'normally' is worrisome -- what happens in the
not-so-normal cases where one or two items are of a different class...?
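
To make the property suggestion concrete, here is a minimal sketch. It
assumes Python 3 and the built-in set rather than sets.Set; the class
Source and the names items and _items are invented for the example and
do not come from the original post. A property is defined once on the
class but evaluated against each instance, so the caller reads a plain
attribute and needs no callable() check:

```python
class Source(object):
    """Hypothetical source object, made up for this illustration."""

    def __init__(self, items):
        self._items = list(items)

    @property
    def items(self):
        # Defined once on the class, evaluated per instance on access.
        return self._items


def merge_lists(initial, sources, attr_name):
    merged = set(initial)
    for s in sources:
        # getattr triggers the property, so the result is already a list.
        merged.update(getattr(s, attr_name))
    return merged


result = merge_lists([1, 2], [Source([2, 3]), Source([3, 4])], "items")
```

This only helps if every source exposes the data as an attribute or
property, which is exactly the 'normally' caveat raised above.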

> Second (hopefully clearer & less obscure) question : is
> sets.Set.union() an efficient way to do list de-duplication? Seems
> like the obvious tool for the job.


Well, .union must make a new set each time and then you throw
away the old one; this is inefficient in much the same way in
which concatenating a list of lists would be if you coded that:

    result = baselist
    for otherlist in otherlists:
        result = result + otherlist

Here, too, the + would each time make a new list and you would
throw away the old one with the assignment. This inefficiency
is removed by in-place updates, e.g. result.extend(otherlist)
for the case of list concatenation, and

    workingSet.update(otherlist)

for your case (don't bother explicitly making a Set out of
the otherlist -- update can take any iterable).
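
The difference between the two approaches can be sketched side by
side. This is only an illustration, written for Python 3 with the
built-in set type instead of sets.Set; the function names
merge_with_union and merge_with_update are invented here. Both
functions return the same result, but the first allocates a new set on
every iteration while the second grows a single set in place:

```python
def merge_with_union(base, others):
    result = set(base)
    for other in others:
        # union() builds a brand-new set on every pass; the old one is discarded.
        result = result.union(other)
    return result


def merge_with_update(base, others):
    result = set(base)
    for other in others:
        # update() adds to the existing set in place and accepts any iterable.
        result.update(other)
    return result


a = merge_with_union([0, 1], [[1, 2], (2, 3)])
b = merge_with_update([0, 1], [[1, 2], (2, 3)])
```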

Overall, I would code your function (including some
renamings for clarity, and the removal of a very error-prone,
obscure and useless and/or -- just imagine what
would happen if getList() returned an EMPTY list...) as
follows:

    def mergeLists(initialList, sourceOfAdditionalLists,
                   nameOfAttribute):
        workingSet = Set(initialList)
        for s in sourceOfAdditionalLists:
            getList = getattr(s, nameOfAttribute)
            if callable(getList):
                getList = getList()
            workingSet.update(getList)
        return workingSet
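
The empty-list problem with the and/or idiom can be reproduced in a
few lines. This is a sketch for Python 3; the class Empty and its
method get_list are hypothetical stand-ins for a source object. Because
an empty list is falsy, "callable(x) and x() or x" falls through to the
last operand and hands back the method itself instead of its result:

```python
class Empty(object):
    """Hypothetical source whose method returns an empty list."""

    def get_list(self):
        return []


attr = getattr(Empty(), "get_list")

# and/or idiom: attr() returns [], which is falsy, so the expression
# falls through to `or attr` and yields the bound method itself.
via_and_or = callable(attr) and attr() or attr

# Explicit test: always the call result when attr is callable.
via_if = attr() if callable(attr) else attr

try:
    set(via_and_or)  # a bound method is not iterable
    and_or_failed = False
except TypeError:
    and_or_failed = True
```

With the explicit callable() test used in the rewritten function, an
empty list simply contributes nothing to the merged set.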
Alex

Jul 18 '05 #2
Alan Little wrote:
> Two questions - passing the *name* of the method to call, and then
> looking it up for each object in the list of extra sources (all of
> which need to be new-style objects - not a problem in my application)
> seems inelegant. My "sourceOfAdditionalLists" are normally all of the
> same class - is there something I can bind at class level that
> automagically refers to instance-level attributes when invoked?

If they are all of the same class, you could just introduce a method
that returns your lists, and call that.

BTW, your design might not be ideal. I personally would rather split
this function in two: one function to merge lists, and a second that
produces the lists to merge (this approach might help with the problem above).
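
One possible shape for that split, as a sketch only: it assumes
Python 3 and the built-in set, and the names lists_from, merge, Box and
get_items are invented for the example. The generator is the only part
that knows how to get a list out of a source object; the merge function
works on any iterables at all:

```python
def lists_from(sources, attr_name):
    """Yield one list per source, calling the attribute if it is callable."""
    for s in sources:
        value = getattr(s, attr_name)
        yield value() if callable(value) else value


def merge(initial, lists):
    """Merge any number of iterables into one de-duplicated set."""
    merged = set(initial)
    for items in lists:
        merged.update(items)
    return merged


class Box(object):  # hypothetical source class
    def __init__(self, items):
        self.items = items

    def get_items(self):
        return self.items


boxes = [Box([1, 2]), Box([2, 3])]
by_method = merge([0], lists_from(boxes, "get_items"))
by_attribute = merge([0], lists_from(boxes, "items"))
```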

regards,
anton.
Jul 18 '05 #3
Thanks for the advice guys. I've only been playing with Python in my
spare time for a few weeks, and am hugely impressed with how clean and
fun and powerful it all is (compared to digging in the Java and Oracle
mines, which is what I do in the day job). Getting this sort of
informed critique of my ideas is great.
Jul 18 '05 #4
Alex wrote:
>> Two questions - passing the *name* of the method to call, and then
>> looking it up for each object in the list of extra sources (all of
>> which need to be new-style objects - not a problem in my application)
>> seems inelegant. My "sourceOfAdditionalLists" are normally all of the
>> same class - is there something I can bind at class level that
>> automagically refers to instance-level attributes when invoked?
>
> I'm not quite sure what you mean here. You can of course play
> around with descriptors, e.g. properties or your own custom ones.
> But that 'normally' is worrisome -- what happens in the
> not-so-normal cases where one or two items are of a different class...?


What I was getting at was that I was thinking I would like to be able
to call my function something like this (pseudocode):

    def f(listOfObjects, method):
        for o in listOfObjects:
            o.method  # passed method gets invoked on the instances

    l = [a, b, c]            # collection of objects of known class C
    result = f(l, C.method)  # call with method of the class

Stashing the method name in a string is a level of indirection that
somehow felt to me like it *ought* to be unnecessary, but I think I
was just thinking in non-pythonic function pointer terms. Which
doesn't work in a world where (a) we don't know if two objects of the
same class have the same methods, and (b) we have the flexibility to
write functions that will work with anything that has a method of the
requisite name, regardless of its type.
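
Both styles can be written without an explicit getattr call. The
following is a sketch for Python 3, where C.items is a plain function
(in the Python 2 of this thread it would be an unbound method that
rejects instances of other classes); the classes C and Unrelated and
the function collect are made up for the illustration.
operator.methodcaller is the standard-library way to say "call the
method with this name on whatever object you are given":

```python
from operator import methodcaller


class C(object):  # hypothetical classes for illustration
    def items(self):
        return [1, 2]


class Unrelated(object):
    def items(self):
        return [2, 3]


def collect(objects, get):
    """Apply the callable `get` to each object and merge the results."""
    merged = set()
    for o in objects:
        merged.update(get(o))
    return merged


# Passing the function stored on the class: fine while every object is a C.
same_class = collect([C(), C()], C.items)

# Looking the method up by name on each object: works for any type
# that has an `items` method.
mixed = collect([C(), Unrelated()], methodcaller("items"))
```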

Just learning out loud here (with the help of your Nutshell book, I
might add).
Jul 18 '05 #5
Alan Little wrote:
> def f(listOfObjects, method):
>     for o in listOfObjects:
>         o.method  # passed method gets invoked on the instances
>
> l = [a, b, c]            # collection of objects of known class C
> result = f(l, C.method)  # call with method of the class


Maybe the following might interest you:

1.

    class A(object):
        def fooA(self):
            return "fooA"

    class B(object):
        def fooB(self):
            return "fooB"

    a, b = A(), B()

    # Each lambda wraps one specific method call.
    invoke_fooA = lambda obj: obj.fooA()
    invoke_fooB = lambda obj: obj.fooB()

    print(invoke_fooA(a))
    print(invoke_fooB(b))

2. If the class is the same you can use:

    class A(object):
        def foo(self):
            return "A.foo"

    # Take the method from the class, then pass the instance explicitly.
    method = A.foo

    a = A()

    print(method(a))

HTH,
anton.

Jul 18 '05 #6
