
UNIX-style sort in Python?

For a while at least I have to work in Windows rather than UNIX, which
is more familiar. I'm trying to do with Python some of the things that
I've done for years in shell, in particular, sort. The shell sort is
pretty easy to use:
% sort -t, +2 +5 imputfilename <return>

where -t sets the field separator, in this case a comma, and +2 and
+5 are the fields to sort on, in that order. The fields are
zero-based, so these are the third and sixth fields.

So, is there a module or function already available that does this?

Lance

Jul 18 '05 #1
10 Replies


Kotlin Sam wrote:
% sort -t, +2 +5 imputfilename <return>
So, is there a module or function already available that does this?


In newer Pythons (CVS and beta-1 for 2.4) you can do

def get_fields(line):
    fields = line.split("\t")
    return fields[1], fields[4]

sorted_lines = sorted(open("imputfilename"), key=get_fields)

For older Pythons you'll need to do the "decorate-sort-undecorate"
("DSU") yourself, like this

lines = [get_fields(line), line for line in open("imputfilename")]
lines.sort()
sorted_lines = [x[1] for x in lines]

There is a slight difference between these two. If fields[1]
and fields[4] are the same between two lines in the comparison
then the first of these sorts by position of each line (it's
a "stable sort") while the latter sorts by the content of the
line.
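
To make that tie-breaking difference concrete, here is a minimal sketch. The sample lines are made up for illustration, and the in-memory list stands in for open("imputfilename"); only get_fields comes from the code above.

```python
# Hypothetical sample data: tab-separated lines with five fields each.
# Lines 0 and 1 have identical values in fields[1] and fields[4].
lines = [
    "z\tb\t0\t0\ty\n",
    "a\tb\t0\t0\ty\n",
    "m\ta\t0\t0\tx\n",
]

def get_fields(line):
    fields = line.split("\t")
    return fields[1], fields[4]

# Key-based sort: stable, so tied lines keep their input order.
by_key = sorted(lines, key=get_fields)

# Plain decorate-sort-undecorate: tied keys fall through to
# comparing the whole line, so ties are ordered by line content.
decorated = [(get_fields(line), line) for line in lines]
decorated.sort()
by_dsu = [pair[1] for pair in decorated]

print(by_key == by_dsu)  # the two orderings differ for the tied lines
```

With this data, the key-based sort leaves the "z" line ahead of the "a" line because that is their order in the input, while the plain DSU version puts the "a" line first.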

Andrew
da***@dalkescientific.com
Jul 18 '05 #2

On 2004-10-18, Kotlin Sam <xa***********@hotmail.com> wrote:
For a while at least I have to work in Windows rather than UNIX, which
is more familiar. I'm trying to do with Python some of the things that
I've done for years in shell, in particular, sort. The shell sort is
pretty easy to use:


Sounds like you need to install Cygwin so you have a real bash
shell and all of the normal shell utilities.

--
Grant Edwards (grante at visi.com)
Yow! I'm in ATLANTIC CITY riding in a comfortable ROLLING CHAIR...
Jul 18 '05 #3

Andrew Dalke <ad****@mindspring.com> wrote:
Kotlin Sam wrote:
% sort -t, +2 +5 imputfilename <return>
So, is there a module or function already available that does this?


In newer Pythons (CVS and beta-1 for 2.4) you can do

def get_fields(line):
    fields = line.split("\t")
    return fields[1], fields[4]

sorted_lines = sorted(open("imputfilename"), key=get_fields)


Quite right -- and, of course, if Kotlin needs get_fields to depend on
the sys.argv parameters, that's easy to arrange.
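
One possible way to arrange that, as a rough sketch only: the names make_get_fields, sort_lines and main are hypothetical, and the argument layout (separator first, then zero-based field numbers) is an assumption rather than anything specified in this thread. It also does not try to reproduce every detail of what sort(1) does with its keys.

```python
import sys

def make_get_fields(separator, indexes):
    """Build a key function that returns the chosen zero-based fields."""
    def get_fields(line):
        fields = line.rstrip("\n").split(separator)
        return tuple(fields[i] for i in indexes)
    return get_fields

def sort_lines(lines, separator, indexes):
    """Return the lines sorted on the selected fields (stable sort)."""
    return sorted(lines, key=make_get_fields(separator, indexes))

def main(argv):
    # Assumed argument layout: argv[0] is the separator,
    # the remaining arguments are zero-based field numbers.
    separator = argv[0]
    indexes = [int(arg) for arg in argv[1:]]
    sys.stdout.writelines(sort_lines(sys.stdin, separator, indexes))
```

Calling main(sys.argv[1:]) at the bottom of a script would then let something like "python fieldsort.py , 2 5 < imputfilename" sort on the third and sixth comma-separated fields (fieldsort.py being a made-up script name).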

For older Pythons you'll need to do the "decorate-sort-undecorate"
("DSU") yourself, like this

lines = [get_fields(line), line for line in open("imputfilename")]
Wrong syntax -- needs to be:

lines = [(get_fields(line), line) for line in open("imputfilename")]
lines.sort()
sorted_lines = [x[1] for x in lines]

There is a slight difference between these two. If fields[1]
and fields[4] are the same between two lines in the comparison
then the first of these sorts by position of each line (it's
a "stable sort") while the latter sorts by the content of the
line.


...and to get exactly the same stable-sort semantics in 2.3, just change
the first of the three statements to:

lines = [(get_fields(line), i, line)
         for i, line in enumerate(open("imputfilename"))]
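
Put together as a complete run, the index-decorated version looks roughly like this. The sample lines are invented for illustration and replace the open file; the comparison against sorted() with key= needs 2.4 or later and is only there to show that the two agree.

```python
# Hypothetical sample data: lines 0 and 1 tie on fields[1] and fields[4].
lines = [
    "z\tb\t0\t0\ty\n",
    "a\tb\t0\t0\ty\n",
    "m\ta\t0\t0\tx\n",
]

def get_fields(line):
    fields = line.split("\t")
    return fields[1], fields[4]

# Decorate with (key, original position, line). On a tied key the
# position decides, so the line text itself is never compared.
decorated = [(get_fields(line), i, line) for i, line in enumerate(lines)]
decorated.sort()
stable = [item[2] for item in decorated]

# Same ordering as the key-based stable sort.
print(stable == sorted(lines, key=get_fields))
```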
Alex
Jul 18 '05 #4

Grant Edwards <gr****@visi.com> wrote:
On 2004-10-18, Kotlin Sam <xa***********@hotmail.com> wrote:
For a while at least I have to work in Windows rather than UNIX, which
is more familiar. I'm trying to do with Python some of the things that
I've done for years in shell, in particular, sort. The shell sort is
pretty easy to use:


Sounds like you need to install Cygwin so you have a real bash
shell and all of the normal shell utilities.


An excellent piece of advice. Cygwin has occasionally saved my sanity in
the past when the weakness of Windows' cmd.exe was getting to me...!-)
Alex
Jul 18 '05 #5

Kotlin Sam wrote:
For a while at least I have to work in Windows rather than UNIX, which
is more familiar. I'm trying to do with Python some of the things that
I've done for years in shell, in particular, sort. The shell sort is
pretty easy to use:


Why don't you just install the UNIX utils on windows? There are native
ports of most of them at http://unxutils.sourceforge.net/
Jul 18 '05 #6

Alex Martelli wrote:
Wrong syntax -- needs to be:

lines = [(get_fields(line), line) for line in open("imputfilename")]


Bah! I all too often forget that () on the LHS of the list
comprehension. :(

Andrew
da***@dalkescientific.com
Jul 18 '05 #7

Andrew Dalke wrote:
Alex Martelli wrote:
lines = [(get_fields(line), line) for line in open("imputfilename")]


Bah! I all too often forget that () on the LHS of the list
comprehension. :(


Me too. Could the grammar conceivably be changed so that it works
without the parentheses there?
--
Michael Hoffman
Jul 18 '05 #8

Michael Hoffman wrote:
Me too. Could the grammar conceivably be changed so that it works
without the parentheses there?


Unlikely. As I recall, Python deliberately uses only one token of
lookahead to resolve ambiguities.
Or see PEP 202:

] BDFL Pronouncements
]
] - The form [x, y for ...] is disallowed; one is required to write
] [(x, y) for ...].
It could be made an arbitrary lookahead in theory, but
as I recall Guido has also said he doesn't want that because
it makes human parsing more complex as well.

Can't find a ready citation for that though.
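
For what it's worth, the rule is easy to check from Python itself. This is a small sketch; is_valid_expression is a hypothetical helper, and the only thing it relies on is the built-in compile() raising SyntaxError for source that does not parse.

```python
# The parenthesized tuple form is what PEP 202 requires.
pairs = [(x, x * x) for x in range(3)]

def is_valid_expression(source):
    """Return True if source compiles as a single Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

# Compile both spellings from strings so that the invalid one does not
# stop this example itself from being parsed.
unparenthesized_ok = is_valid_expression("[x, x * x for x in range(3)]")
parenthesized_ok = is_valid_expression("[(x, x * x) for x in range(3)]")

print(pairs, unparenthesized_ok, parenthesized_ok)
```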

Andrew
da***@dalkescientific.com
Jul 18 '05 #9

On Mon, 18 Oct 2004 09:27:45 +0200, al*****@yahoo.com (Alex Martelli) wrote:
Grant Edwards <gr****@visi.com> wrote:
On 2004-10-18, Kotlin Sam <xa***********@hotmail.com> wrote:
> For a while at least I have to work in Windows rather than UNIX, which
> is more familiar. I'm trying to do with Python some of the things that
> I've done for years in shell, in particular, sort. The shell sort is
> pretty easy to use:


Sounds like you need to install Cygwin so you have a real bash
shell and all of the normal shell utilities.


An excellent piece of advice. Cygwin has occasionally saved my sanity in
the past when the weakness of Windows' cmd.exe was getting to me...!-)

Most of my cmd.exe use is to invoke xxx ..args, where xxx.cmd in a directory on the path
is a one-liner like @python c:\util\xxx.py %* (I don't like the kludgy Windows
first-line trick that requires xxx.py itself to be named xxx.cmd)
;-)

But have you tried msys/mingw? I haven't done a lot with it, but it is nice,
and supports most of the basic utilities including compiler/linker, though
I prefer gvim directly over the vim via msys shell (I probably don't have
the latter configured quite right).

A sampling:

[13:59] ~>ls /
bin doc etc home local m.ico mdk mingw msys.bat msys.ico uninstall
[14:00] ~>ls /bin
awk diff.exe ftp libW11.dll mv.exe sed.exe tr.exe
basename.exe diff3.exe gawk.exe ln.exe od.exe sh.exe true.exe
bunzip2 dirname.exe grep.exe lnkcnv patch.exe sleep.exe uname.exe
bzip2.exe echo gunzip ls.exe printf sort.exe uniq.exe
cat.exe egrep gzip.exe m4.exe ps.exe split.exe vi
chmod.exe env.exe head.exe make.exe pwd start view
cmd ex id.exe makeinfo.exe rm.exe tail.exe vim.exe
cmp.exe expr.exe info.exe md5sum.exe rmdir.exe tar.exe wc.exe
comm.exe false.exe infokey.exe mkdir.exe rvi tee.exe which
cp.exe fgrep install-info.exe mount.exe rview texi2dvi xargs.exe
cut.exe find.exe install.exe msys-1.0.dll rvim texindex.exe
date.exe fold.exe less.exe msysinfo rxvt.exe touch.exe
[14:00] ~>which gcc
/mingw/bin/gcc
[14:00] ~>ls /mingw/bin
a2dll dlltool.exe g77.exe mingw32-c++.exe objdump.exe res2coff.exe
addr2line.exe dllwrap.exe gcc.exe mingw32-g++.exe pexports.exe size.exe
ar.exe dos2unix.exe gccbug mingw32-gcc.exe protoize.exe strings.exe
as.exe drmingw.exe gcov.exe mingw32-make.exe ranlib.exe strip.exe
c++.exe dsw2mak gdb.exe mingwm10.dll readelf.exe unix2dos.exe
c++filt.exe exchndl.dll gprof.exe nm.exe redir.exe unprotoize.exe
cpp.exe g++.exe ld.exe objcopy.exe reimp.exe windres.exe
[14:00] ~>

Regards,
Bengt Richter
Jul 18 '05 #10

Bengt Richter <bo**@oz.net> wrote:
But have you tried msys/mingw? I haven't done a lot with it, but it is nice,
and supports most of the basic utilities including compiler/linker, though
I prefer gvim directly over the vim via msys shell (I probably don't have
the latter configured quite right).


I've used mingw in the past, never tried msys. As for editors, GVIM
does work just fine on Windows, that's the least of the problems...
Alex
Jul 18 '05 #11
