I want to sort sequences of strings lexicographically but those with
longer prefix should come earlier, e.g. for s = ['a', 'bc', 'bd',
'bcb', 'ba', 'ab'], the sorted sequence is ['ab', 'a', 'ba', 'bcb',
'bc', 'bd']. Currently I do it with:
s.sort(cmp=lambda x,y: 0 if x==y else
                       -1 if x.startswith(y) else
                       +1 if y.startswith(x) else
                       cmp(x,y))
Can this be done with an equivalent key function instead of cmp ?
George
On Nov 3, 6:49 pm, George Sakkis <george.sak...@gmail.com> wrote:
I want to sort sequences of strings lexicographically but those with
longer prefix should come earlier, e.g. for s = ['a', 'bc', 'bd',
'bcb', 'ba', 'ab'], the sorted sequence is ['ab', 'a', 'ba', 'bcb',
'bc', 'bd']. Currently I do it with:
s.sort(cmp=lambda x,y: 0 if x==y else
                       -1 if x.startswith(y) else
                       +1 if y.startswith(x) else
                       cmp(x,y))
Can this be done with an equivalent key function instead of cmp ?
George
Your input and output:
s = ['a', 'bc', 'bd', 'bcb', 'ba', 'ab']
r = ['ab', 'a', 'ba', 'bcb', 'bc', 'bd']
To me your lambda looks like an abuse of the inline if expression. So
I suggest to replace it with a true function, that is more readable:
def mycmp(x, y):
    if x == y:
        return 0
    elif x.startswith(y):
        return -1
    elif y.startswith(x):
        return +1
    else:
        return cmp(x, y)
print sorted(s, cmp=mycmp)
It's a peculiar cmp function; I'm still wondering in what situations it
can be useful.
To use the key argument given a cmp function I use the simple code
written by Hettinger:
def cmp2key(mycmp):
    "Converts a cmp= function into a key= function"
    class K:
        def __init__(self, obj, *args):
            self.obj = obj
        def __cmp__(self, other):
            return mycmp(self.obj, other.obj)
    return K
print sorted(s, key=cmp2key(mycmp))
Now I'll look for simpler solutions...
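For readers on Python 3, where the cmp= argument and the built-in cmp() no longer exist, the same wrapper idea is available in the standard library as functools.cmp_to_key. The following is only a sketch under that assumption; the name prefix_cmp is made up for illustration and is not from the posts above.

```python
from functools import cmp_to_key

def prefix_cmp(x, y):
    """Lexicographic order, except that a string sorts before its own prefixes."""
    if x == y:
        return 0
    if x.startswith(y):
        return -1
    if y.startswith(x):
        return 1
    # Python 3 has no built-in cmp(); this expression is the usual replacement.
    return (x > y) - (x < y)

s = ['a', 'bc', 'bd', 'bcb', 'ba', 'ab']
result = sorted(s, key=cmp_to_key(prefix_cmp))
print(result)  # ['ab', 'a', 'ba', 'bcb', 'bc', 'bd']
```

The comparison function is still called O(n log n) times from Python code, so this is a convenience rather than a speed-up.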
Bye,
bearophile
George Sakkis wrote:
s.sort(cmp=lambda x,y: 0 if x==y else
                       -1 if x.startswith(y) else
                       +1 if y.startswith(x) else
                       cmp(x,y))
Probably not what you had in mind ...
>>> s
['a', 'bc', 'bd', 'bcb', 'ba', 'ab']
>>> maxlen = max(len(si) for si in s)
>>> def k(si): return si+'z'*(maxlen-len(si))
...
>>> sorted(s,key=k)
['ab', 'a', 'ba', 'bcb', 'bc', 'bd']
Cheers,
Alan Isaac
George Sakkis <ge***********@gmail.com> writes:
I want to sort sequences of strings lexicographically but those with
longer prefix should come earlier, e.g. for s = ['a', 'bc', 'bd',
'bcb', 'ba', 'ab'], the sorted sequence is ['ab', 'a', 'ba', 'bcb',
'bc', 'bd']. Currently I do it with:
s.sort(cmp=lambda x,y: 0 if x==y else
                       -1 if x.startswith(y) else
                       +1 if y.startswith(x) else
                       cmp(x,y))
Can this be done with an equivalent key function instead of cmp ?
Here's an idea:
>>> sorted(s, key=lambda x: x+'z'*(3-len(x)))
['ab', 'a', 'ba', 'bcb', 'bc', 'bd']
The 3 above is the length of the longest string in the list.
Here's another idea, probably more practical:
>>> sorted(s, key=lambda x: tuple(256-ord(l) for l in x), reverse=True)
['ab', 'a', 'ba', 'bcb', 'bc', 'bd']
HTH
--
Arnaud
Alan G Isaac:
Probably not what you had in mind ...
...
>>> maxlen = max(len(si) for si in s)
>>> def k(si): return si+'z'*(maxlen-len(si))
This looks a little better:
assert all(isinstance(p, str) for p in s)
sorted(s, key=lambda p: p.ljust(maxlen, "\xff"))
If the strings are unicode, that may not work anymore.
I don't know if there are better solutions.
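One possible way to carry the padding idea over to Python 3 text strings is to pad with the largest code point instead of 'z' or a byte value. This is a sketch only: padded_key is a hypothetical helper, and it assumes no word contains the pad character itself, since 'a' and 'a' + pad would then get the same key.

```python
import sys

def padded_key(words):
    """Return a key function that right-pads each word with the largest code point."""
    pad = chr(sys.maxunicode)
    maxlen = max((len(w) for w in words), default=0)
    # Assumption: no word contains `pad`; otherwise distinct words can collide.
    return lambda w: w.ljust(maxlen, pad)

s = ['a', 'bc', 'bd', 'bcb', 'ba', 'ab']
result = sorted(s, key=padded_key(s))
print(result)  # ['ab', 'a', 'ba', 'bcb', 'bc', 'bd']
```

Note that every key is as long as the longest word, so one very long string makes all keys long.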
Bye,
bearophile
Arnaud Delobelle:
Here's another idea, probably more practical:
>>> sorted(s, key=lambda x: tuple(256-ord(l) for l in x), reverse=True)
Nice.
A variant that probably works with unicode strings too:
print sorted(s, key=lambda x: [-ord(l) for l in x], reverse=True)
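To see why the negated key combined with reverse=True gives the wanted order, it may help to look at the two reversals separately. The sketch below is written for Python 3 (print as a function); neg_key and step1 are names chosen here for illustration.

```python
s = ['a', 'bc', 'bd', 'bcb', 'ba', 'ab']

def neg_key(x):
    # Negating each code point reverses the letterwise order only;
    # a list that is a prefix of another list still compares as smaller.
    return [-ord(c) for c in x]

# Ascending sort on the negated key: letters run backwards,
# and each prefix still comes before its extensions.
step1 = sorted(s, key=neg_key)
print(step1)   # ['bd', 'bc', 'bcb', 'ba', 'a', 'ab']

# Reversing the whole order restores the letter order
# and moves each prefix after its extensions.
result = sorted(s, key=neg_key, reverse=True)
print(result)  # ['ab', 'a', 'ba', 'bcb', 'bc', 'bd']
```

Because the key uses ord() per character, it does not depend on any maximum string length or on a particular padding character.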
Bye,
bearophile
be************@lycos.com writes:
Arnaud Delobelle:
> Here's another idea, probably more practical:
> >>> sorted(s, key=lambda x: tuple(256-ord(l) for l in x), reverse=True)
Nice.
A variant that probably works with unicode strings too:
print sorted(s, key=lambda x: [-ord(l) for l in x], reverse=True)
Of course that's better! (although mine will work with unicode if yours
does). It's funny how the obvious escapes me so often. Still I think
the idea of the 'double reverse' (one letterwise, the other listwise)
was quite good.
--
Arnaud
Arnaud Delobelle:
It's funny how the obvious escapes me so often.
In this case it's a well-known cognitive effect: the human mind
clings to the first good/working solution, and resists tuning it further.
For that you may need to think about something else for a short time,
and then look at your solution with a little "fresher" mind.
This (ugly) translation into D + my functional-style libs shows why
Python syntax is a good idea:
import d.all;
void main() {
    auto txt = "a bc bd bcb ba ab".split();
    putr( sorted(txt, (string s){ return map((char c){ return -cast(int)c; }, s); } ).reverse );
}
Long Live To Python! :-)
Bye,
bearophile
On Nov 3, 1:51 pm, bearophileH...@lycos.com wrote:
Arnaud Delobelle:
Here's another idea, probably more practical:
>>> sorted(s, key=lambda x: tuple(256-ord(l) for l in x), reverse=True)
Nice.
A variant that probably works with unicode strings too:
print sorted(s, key=lambda x: [-ord(l) for l in x], reverse=True)
Bye,
bearophile
Awesome! I tested it on a sample list of ~61K words [1] and it's
almost 40% faster, dropping from ~1.05s to ~0.62s. That's still >15
times slower than the default sorting (0.04s) but I guess there's not
much more room for improvement.
George
[1] http://www.cs.pitt.edu/~kirk/cs1501/...ggle/5desk.txt
George Sakkis:
but I guess there's not much more room for improvement.
That's nonsense, Python is a high level language, so there's nearly
always room for improvement (even in programs written in assembly you
can generally find faster solutions).
If speed is what you look for, and your strings are ASCII then this is
much faster:
tab = "".join(map(chr, xrange(256)))[::-1]
s.sort(key=lambda x: x.translate(tab), reverse=True)
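In Python 3 the two lines above do not run as written (xrange is gone, and str.translate expects a mapping rather than a 256-character string). A rough equivalent, assuming pure-ASCII words, is to translate the encoded bytes instead; TAB and ascii_key are names invented for this sketch.

```python
# Table that maps byte value b to 255 - b, reversing the byte order.
TAB = bytes(range(256))[::-1]

def ascii_key(word):
    # Assumption: word is pure ASCII; encode() raises UnicodeEncodeError otherwise.
    return word.encode('ascii').translate(TAB)

s = ['a', 'bc', 'bd', 'bcb', 'ba', 'ab']
result = sorted(s, key=ascii_key, reverse=True)
print(result)  # ['ab', 'a', 'ba', 'bcb', 'bc', 'bd']
```

The key is built by two C-level calls per word instead of a Python-level loop over characters, which is where any speed gain would come from; no timings are claimed here.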
Bye,
bearophile