473,883 Members | 1,604 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

[perl-python] exercise: partition a list by equivalence

here's another interesting algorithmic exercise, again from part of a
larger program in the previous series.

Here's the original Perl documentation:

=pod

merge($pairings ) takes a list of pairs, each pair indicates the
sameness
of the two indexes. Returns a partitioned list of same indexes.

For example, if the input is
merge( [ [1,2], [2,4], [5,6] ] );

that means 1 and 2 are the same. 2 and 4 are the same. Therefore
1==2==4. The result returned is

[[4,2,1],[6,5]];

(ordering of the returned list and sublists are not specified.)

=cut

For those of you unfamiliar with math lingoes, partition a list means
to create sublists, based on some criteria. In our case, the input list
comes in the form of pairs, and its members are the union of all
members in the pairs. The criterion for partition in our case is that
of equivalence, specified in the pairs input.

Try to write a Python code for this. In the Python code, the input
should be a list of couples. (for Perlers, sketch out the algorithm on
paper and try to code in Python directly.)

I'll post Perl & Python code tomorrow.

==This is brought to you by the Perl-Python community.==
== http://xahlee.org/perl-python/python.html ==

Xah
xa*@xahlee.org
http://xahlee.org/PageTwo_dir/more.html

Jul 18 '05 #1
41 3572
>For those of you unfamiliar with math lingoes, partition a list means
to create sublists, based on some criteria.


Typical moronic mathematicians with their exclusionary lingoes...why
can't they just say "create sublists based on some criteria" like
normal people?

- alex23

Jul 18 '05 #2
Xah Lee wrote:
here's another interesting algorithmic exercise, again from part of a
larger program in the previous series.

Here's the original Perl documentation:

=pod

merge($pairings ) takes a list of pairs, each pair indicates the
sameness
of the two indexes. Returns a partitioned list of same indexes.

For example, if the input is
merge( [ [1,2], [2,4], [5,6] ] );

that means 1 and 2 are the same. 2 and 4 are the same. Therefore
1==2==4. The result returned is

[[4,2,1],[6,5]];

(ordering of the returned list and sublists are not specified.)

=cut


Almost a joke:

from numarray import *

def merge(*pairs):
flattened = reduce(tuple.__ add__, pairs, tuple())
m, M = min(flattened), max(flattened)

d = M - m + 1
matrix = zeros((d, d), type = Bool)

for x, y in pairs:
X, Y = x - m, y - m

matrix[X, X] = 1
matrix[X, Y] = 1
matrix[Y, X] = 1
matrix[Y, Y] = 1

while True:
next = greater(dot(mat rix, matrix), 0)
if alltrue(ravel(n ext == matrix)):
break
matrix = next

results = []
for i in range(d):
eqls, = nonzero(matrix[i])
if eqls.size():
if i == eqls[0]:
results.append( tuple(x + m for x in eqls))

return results

Disclaimer: I'm not an expert in numarray and suppose the something can
be dramatically imporved.
Jul 18 '05 #3
On Thu, Feb 17, 2005 at 03:46:20PM -0800, Xah Lee wrote:
here's another interesting algorithmic exercise, again from part of a
larger program in the previous series.

Here's the original Perl documentation:

=pod

merge($pairings ) takes a list of pairs, each pair indicates the
sameness
of the two indexes. Returns a partitioned list of same indexes.

For example, if the input is
merge( [ [1,2], [2,4], [5,6] ] );

that means 1 and 2 are the same. 2 and 4 are the same. Therefore
1==2==4. The result returned is

[[4,2,1],[6,5]];

(ordering of the returned list and sublists are not specified.)

=cut


in Python:

def merge(pairings) :
rev = {}
res = []
for pairing in pairings:
first, second = pairing
has_first = first in rev
has_second = second in rev
if has_first and has_second:
rev[first].extend(rev[second])
rev[second][:] = []
rev[second] = rev[first]
elif has_first:
rev[first].append(second)
rev[second] = rev[first]
elif has_second:
rev[second].append(first)
rev[first] = rev[second]
else:
copy = [first, second]
res.append(copy )
rev[first] = rev[second] = copy
return filter(None, res)

and in Perl:

sub merge($)
{
my %rev = ();
my @res = ();
my ($pairing, $first, $second, $has_first, $has_second);
foreach $pairing (@{$_[0]})
{
($first, $second) = @$pairing;
$has_first = exists $rev{$first};
$has_second = exists $rev{$second};
if ($has_first and $has_second)
{
push @{$rev{$first}} , @{$rev{$second} };
@{$rev{$second} } = ();
$rev{$second} = $rev{$first};
}
elsif (exists $rev{$first})
{
push @{$rev{$first}} , $second;
$rev{$second} = $rev{$first};
}
elsif (exists $rev{$second})
{
push @{$rev{$second} }, $first;
$rev{$first} = $rev{$second};
}
else
{
my @copy = ($first, $second);
push @res, \@copy;
$rev{$first} = $rev{$second} = \@copy;
}
}
return [grep @$_, @res];
}

although in Perl you wouldn't define it to take a reference to a list
(as you did), but rather a list, and you wouldn't define it to return
a reference to a list, but rather a list in list context (and possibly
a reference in scalar context).

Both versions are (IMHO) pretty clear (when the docstring/pod is
included), and O(n) because dict/hash lookup and list
appending/pushing is O(1) in both languages. Both versions can
probably be tweaked for speed quite a bit, but I don't *think* there's
a better-than-O(n) algorithm for this.

Note that the Python version assumes that the pairs' elements are
hashable; your example used numbers, so I thought it was pretty safe
assumption. The Perl version has no such restriction.
--
John Lenton (jo**@grulic.or g.ar) -- Random fortune:
Noble cosa es, aún para un anciano, el aprender.
-- Sófocles. (497-406 a.C.) Filósofo griego.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFCFhXjgPq u395ykGsRAus5AJ 9iCSbGsyP3QHZy/whP195GSVNIwwCc Dyqf
hIA+oWaLBHdSGUi 7t79Wfx8=
=BlC7
-----END PGP SIGNATURE-----

Jul 18 '05 #4

John Lenton wrote:
On Thu, Feb 17, 2005 at 03:46:20PM -0800, Xah Lee wrote:
here's another interesting algorithmic exercise, again from part of a larger program in the previous series.

Here's the original Perl documentation:

=pod

merge($pairings ) takes a list of pairs, each pair indicates the
sameness
of the two indexes. Returns a partitioned list of same indexes.

For example, if the input is
merge( [ [1,2], [2,4], [5,6] ] );

that means 1 and 2 are the same. 2 and 4 are the same. Therefore
1==2==4. The result returned is

[[4,2,1],[6,5]];

(ordering of the returned list and sublists are not specified.)

=cut
in Python:

def merge(pairings) :
rev = {}
res = []
for pairing in pairings:
first, second = pairing
has_first = first in rev
has_second = second in rev

Not robust in the face of input like:
[[1,1]]
or
[[1,2], [1,2]]
or
[[1,2], [2,1]]
or
[[1,2], [2,3], [3,1]]

needs "if first == second: continue" here
if has_first and has_second:
needs "if rev[first] == rev[second]: continue" here
rev[first].extend(rev[second])
rev[second][:] = []
rev[second] = rev[first]
elif has_first:
rev[first].append(second)
rev[second] = rev[first]
elif has_second:
rev[second].append(first)
rev[first] = rev[second]
else:
copy = [first, second]
res.append(copy )
My reaction to the "magic" by which res grows was "omigod that's the
most horrible thing I've seen for quite a while" but there was worse to
come :-)
rev[first] = rev[second] = copy
return filter(None, res)

and in Perl:
[snip]

Both versions are (IMHO) pretty clear (when the docstring/pod is
included), and O(n) because dict/hash lookup and list
appending/pushing is O(1) in both languages. Both versions can
probably be tweaked for speed quite a bit, but I don't *think* there's a better-than-O(n) algorithm for this.
List appending is O(1) only in the amortised sense. Extending is not
O(1) in any sense. Neither is the list comparison that is necessary for
robustness (using your data structure, that is).

You don't need to think. This problem has been extensively picked over
by folk who are a lot smarter than us, starting from 30 years ago.
Google for "disjoint set" and "union-find". One gets the impression
that the best possible algorithm would be slightly *worse* than O(n).

Note that the Python version assumes that the pairs' elements are
hashable; your example used numbers, so I thought it was pretty safe
assumption. The Perl version has no such restriction.


In the real world, the objects under comparison (images, fingerprints,
customer records, etc) may not themselves be hashable and comparison
for equality may be expensive but a unique identifier would be assigned
to each object and that ID would of course be cheaply hashable and
comparable.

Jul 18 '05 #5
On Fri, Feb 18, 2005 at 03:21:10PM -0800, John Machin wrote:
Not robust in the face of input like:
[[1,1]]
or
[[1,2], [1,2]]
or
[[1,2], [2,1]]
or
[[1,2], [2,3], [3,1]]
oops, my bad.

needs "if first == second: continue" here
if has_first and has_second:
needs "if rev[first] == rev[second]: continue" here


an 'is' is enough, and better.
rev[first].extend(rev[second])
rev[second][:] = []
rev[second] = rev[first]
elif has_first:
rev[first].append(second)
rev[second] = rev[first]
elif has_second:
rev[second].append(first)
rev[first] = rev[second]
else:
copy = [first, second]
res.append(copy )


My reaction to the "magic" by which res grows was "omigod that's the
most horrible thing I've seen for quite a while" but there was worse to
come :-)


what is magic about it? is it really that horrible?
List appending is O(1) only in the amortised sense. Extending is not
O(1) in any sense. Neither is the list comparison that is necessary for
robustness (using your data structure, that is).

You don't need to think. This problem has been extensively picked over
by folk who are a lot smarter than us, starting from 30 years ago.
Google for "disjoint set" and "union-find". One gets the impression
that the best possible algorithm would be slightly *worse* than O(n).


Of course! I'd forgotten clean about union-find. And yes, it's
O(n*log(n)) union and find, and my implementation is O(n**2) for
union, O(1) for find; I'm pleased that, in spite of having forgotten
about union-find, I "reinvented " (heh) something that is better than
the "naive" implementation we saw in class.

Now I'm even more worried about your dismissal of this as magic and
horrible.

--
John Lenton (jo**@grulic.or g.ar) -- Random fortune:
Why don't you fix your little problem... and light this candle?
-- Alan Shepherd, the first man into space, Gemini program

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFCFoBIgPq u395ykGsRApUIAK CnNmctznqq1KTfZ hi7Dl8YpjPnbQCg uo2E
z/Ss9rWC64h9vuDrJ M//Ge8=
=rVAU
-----END PGP SIGNATURE-----

Jul 18 '05 #6
Hello John,

Try your python code on this example:
merge([[1,2], [3,4], [1,2], [5,3]])

The result given by your function is:
[[3, 4, 5]]

Sorry...

To Xah: next time you propose an exercise, write some UNIT TESTS!!!
Then people will be able to test if there answers are correct or not.

Cyril

On Fri, 18 Feb 2005 13:20:51 -0300, John Lenton <jo**@grulic.or g.ar> wrote:
On Thu, Feb 17, 2005 at 03:46:20PM -0800, Xah Lee wrote:
here's another interesting algorithmic exercise, again from part of a
larger program in the previous series.

Here's the original Perl documentation:

=pod

merge($pairings ) takes a list of pairs, each pair indicates the
sameness
of the two indexes. Returns a partitioned list of same indexes.

For example, if the input is
merge( [ [1,2], [2,4], [5,6] ] );

that means 1 and 2 are the same. 2 and 4 are the same. Therefore
1==2==4. The result returned is

[[4,2,1],[6,5]];

(ordering of the returned list and sublists are not specified.)

=cut


in Python:

def merge(pairings) :
rev = {}
res = []
for pairing in pairings:
first, second = pairing
has_first = first in rev
has_second = second in rev
if has_first and has_second:
rev[first].extend(rev[second])
rev[second][:] = []
rev[second] = rev[first]
elif has_first:
rev[first].append(second)
rev[second] = rev[first]
elif has_second:
rev[second].append(first)
rev[first] = rev[second]
else:
copy = [first, second]
res.append(copy )
rev[first] = rev[second] = copy
return filter(None, res)

and in Perl:

sub merge($)
{
my %rev = ();
my @res = ();
my ($pairing, $first, $second, $has_first, $has_second);
foreach $pairing (@{$_[0]})
{
($first, $second) = @$pairing;
$has_first = exists $rev{$first};
$has_second = exists $rev{$second};
if ($has_first and $has_second)
{
push @{$rev{$first}} , @{$rev{$second} };
@{$rev{$second} } = ();
$rev{$second} = $rev{$first};
}
elsif (exists $rev{$first})
{
push @{$rev{$first}} , $second;
$rev{$second} = $rev{$first};
}
elsif (exists $rev{$second})
{
push @{$rev{$second} }, $first;
$rev{$first} = $rev{$second};
}
else
{
my @copy = ($first, $second);
push @res, \@copy;
$rev{$first} = $rev{$second} = \@copy;
}
}
return [grep @$_, @res];
}

although in Perl you wouldn't define it to take a reference to a list
(as you did), but rather a list, and you wouldn't define it to return
a reference to a list, but rather a list in list context (and possibly
a reference in scalar context).

Both versions are (IMHO) pretty clear (when the docstring/pod is
included), and O(n) because dict/hash lookup and list
appending/pushing is O(1) in both languages. Both versions can
probably be tweaked for speed quite a bit, but I don't *think* there's
a better-than-O(n) algorithm for this.

Note that the Python version assumes that the pairs' elements are
hashable; your example used numbers, so I thought it was pretty safe
assumption. The Perl version has no such restriction.

--
John Lenton (jo**@grulic.or g.ar) -- Random fortune:
Noble cosa es, aún para un anciano, el aprender.
-- Sófocles. (497-406 a.C.) Filósofo griego.


--
http://mail.python.org/mailman/listinfo/python-list


Jul 18 '05 #7

John Lenton wrote:
On Fri, Feb 18, 2005 at 03:21:10PM -0800, John Machin wrote:
Not robust in the face of input like:
[[1,1]]
or
[[1,2], [1,2]]
or
[[1,2], [2,1]]
or
[[1,2], [2,3], [3,1]]
oops, my bad.

needs "if first == second: continue" here
if has_first and has_second:


needs "if rev[first] == rev[second]: continue" here


an 'is' is enough, and better.


Good point. You're redeeming yourself :-)
rev[first].extend(rev[second])
rev[second][:] = []
rev[second] = rev[first]
elif has_first:
rev[first].append(second)
rev[second] = rev[first]
elif has_second:
rev[second].append(first)
rev[first] = rev[second]
else:
copy = [first, second]
res.append(copy )


My reaction to the "magic" by which res grows was "omigod that's the most horrible thing I've seen for quite a while" but there was worse to come :-)


what is magic about it? is it really that horrible?


Try explaining to the newbies over on the tutor list how despite "res"
only ever *explicitly* having little bits like [3, 4] appended to it,
it (or more properly the thing to which it refers) is actually
festering and growing and being mutated under the surface until at the
finale it bursts out dripping slime just like the creature from the
black lagoon ...

Jul 18 '05 #8
Xah Lee wrote:
merge($pairings ) takes a list of pairs, each pair indicates the
sameness
of the two indexes. Returns a partitioned list of same indexes.

For example, if the input is
merge( [ [1,2], [2,4], [5,6] ] );

that means 1 and 2 are the same. 2 and 4 are the same. Therefore
1==2==4. The result returned is

[[4,2,1],[6,5]];


Not sure how efficient this is, but I decided to take advantage of the
operations provided by sets:

def merge(pairs):
pairs = set(tuple(sorte d(p)) for p in pairings)
merged = []
# Each loop will result in a new, complete sublist in the result.
while pairs:
p = set(pairings.po p())
remove = set([])
for pair in pairs:
pairSet = set(pair)
if pairSet & p:
p |= pairSet
remove.add(pair )
pairs -= remove
merged.append(l ist(p))
return merged

--
Brian Beck
Adventurer of the First Order
Jul 18 '05 #9
Brian Beck wrote:
Brian Beck wrote:
> [code]


Ah heck, nevermind... it worked for my small test cases but the
algorithm is just wrong.

--
Brian Beck
Adventurer of the First Order
Jul 18 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
8140
by: Mark Wilson CPU | last post by:
This must be easy, but I'm missing something... I want to execute a Perl script, and capture ALL its output into a PHP variable. Here are my 2 files: ------------------------------------- test3.pl ------------------------------------- print "PERL Hello from Perl! (plain print)<br>\n"; print STDERR "PERL This is text sent to STDERR<br>\n"; $output="PERL Some output:<br>\n"; for ($i=0; $i<5; $i++) {
3
3466
by: David F. Skoll | last post by:
Hi, I'm tearing my hair out on this one. I'm trying to embed a Perl interpreter into a C program. I need to be able to create and destroy the interpreter periodically, but will never actually have two interpreters at the same time. On Red Hat Linux 7.3 with Perl 5.6.1, the attached program segfaults. On Red Hat 9 with Perl 5.8.0, it works perfectly. Save the program code as "test-embed-perl.c" and the build script as "buildte". ...
1
4699
by: Julia Bell | last post by:
I would like to run the same script on two different platforms. The directory in which the script(s) will be stored is common to the two platforms. (I see the same directory contents regardless of which platform I use to access the directory.) Platform 1: perl is installed in /tps/bin/perl. CPAN modules are available Perl is also installed in /usr/bin/perl Platform 1, but the modules are not accessible with this version. Platform...
1
668
by: sm00thcrimnl13 | last post by:
if i have windows 2000 and know how to write perl scripts, how to i actuvate the script through perl?
1
3654
by: smsabu2002 | last post by:
Hi, I am facing the build problem while installing the DBD-MySql perl module (ver 2.9008) using both GCC and CC compilers in HP-UX machine. For the Build using GCC, the compiler error is produced due to the unknown GCC compiler option "+DAportable". For the Build using CC, the preprocessor error is produced due to the recursive declration of macro "PerlIO" in perlio.h file.
13
3275
by: Otto J. Makela | last post by:
I'm trying to install to php the Perl-1.0.0.tgz package (from http://pecl.php.net/package/perl, enabling one to call perl libraries) to a pre-existing Solaris system. Unfortunately, the attempt fails in a rather dramatic way, spewing out thousands of "relocation remains"... I'm somewhat lost on what to do next, the documentation that came along with the Perl package is somewhat sparse. Anyone have suggestions? % uname -a
6
3021
by: surfivor | last post by:
I may be involved in a data migration project involving databases and creating XML feeds. Our site is PHP based, so I imagine the team might suggest PHP, but I had a look at the PHP documentation for one of the Pear modules for creating XML and it didn't look like much. I've used Perl XML:Twig and I have the impression that there is more Perl stuff available as well as the Perl stuff being well documented as I have a Perl DBI book, a Perl...
4
3720
by: billb | last post by:
I installed a perl extension for PHP to use some perl inside my php primarily because I have perl working with oracle and not php and oracle. So I want to use my old perl scripts, and use the functionality of php. The extension uses the '$perl->eval' function. i am kind of stuck with the syntax or how to put the php variable into the perl script. I have a form where the user puts in a grid reference. Then a php script that gets the...
4
8643
by: vijayarl | last post by:
Hi All, i have the following software installed in my system : 1.OS: Win2k 2.Eclipse Version used :3.4.0 & even the perl too... 1. I have imported the my own perl project in Eclipse, when i tried to run the External Tools --> Perl -w am getting the popup saying then it says : " Variable references empty selection:
1
47518
KevinADC
by: KevinADC | last post by:
Note: You may skip to the end of the article if all you want is the perl code. Introduction Many websites have a form or a link you can use to download a file. You click a form button or click on a link and after a moment or two a file download dialog box pops-up in your web browser and prompts you for some instructions, such as “open” or “save“. I’m going to show you how to do that using a perl script. What You Need Any recent...
0
11118
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10732
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10836
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
10407
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
9564
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7960
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5982
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
4606
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
3
3230
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.