porting shell scripts: system(list), system_pipe(lists)

One of my recent projects has involved taking an accretion of sh and
perl scripts and "doing them right" - making them modular, improving
the error reporting, making it easier to add even more features to
them. "Of course," I'm redoing them in python - much of the cut&paste
reuse has become common functions, which then get made more robust and
have a common style and are callable from other (python) tools
directly, instead of having to exec scripts to get at them. The usual
"glorious refactoring."

Most of it has been great - os.listdir+i.endswith() instead of
globbing, exception handling instead of "exit 1", that sort of thing.
I've run into one weakness, though: executing programs.

Python has, of course, os.fork and os.exec* corresponding to the raw
unix functions. It also has the higher level os.system, popen,
expect, and commands.get* functions. The former need a bunch of
stylized operations performed; the latter *all* involve passing in
strings which then leads one to quoting issues, which can be serious
risks in some applications.

Perl had one very helpful interface for this kind of thing: system and
exec will both take array arguments:
    $ perl -e 'system("echo", "*")'
    *
    $ perl -e 'exec("echo", "*")'
    *
versus
    $ perl -e 'exec("echo *")'
    #.newsrc-dribble# CVS stuff ...
This has always struck me as "correct" - not the overloading,
necessarily, but the use of a list.

So, implementing system this way is easy enough:

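    import os
    import traceback

    # ExecFailed is not shown in the post; a plain Exception subclass is assumed.
    class ExecFailed(Exception):
        pass

    def system(cmd):
        pid = os.fork()
        if pid > 0:
            # waitpid takes an options word (0 = block); os.P_WAIT is a
            # mode for os.spawn* and only works here because it equals 0.
            p, st = os.waitpid(pid, 0)
            if st == 0:
                return
            raise ExecFailed(str(cmd), st)
        elif pid == 0:
            try:
                os.execvp(cmd[0], cmd)
            except OSError:
                traceback.print_exc()
            # Reached only if the exec failed: never return into the caller.
            os._exit(113)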

[The try/except is an interesting issue: if cmd[0] isn't found,
os.execvp throws -- but it is already in the child, and this walks up
the stack to any surrounding try/except, which then continues,
possibly disastrously, whatever that code had been doing *in a
duplicate process*. The _exit explicitly short cuts this.]
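For readers on current Python, the same list-based call can be sketched with the standard-library subprocess module, which appeared after this post was written. This is a minimal Python 3 sketch, not the poster's code; the name system is reused only for comparison.

```python
import subprocess
import sys


def system(cmd):
    """Run cmd (a list of strings) without a shell; raise on failure."""
    # check=True raises subprocess.CalledProcessError on a nonzero exit.
    # A missing executable raises FileNotFoundError in the parent process,
    # so no duplicate child is left running the caller's except blocks.
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # The "*" reaches the program as a literal argument; nothing globs it.
    system([sys.executable, "-c", "import sys; print(sys.argv[1])", "*"])
```

Because no shell is involved, a value such as foo in system(['ci', '-l', foo]) is passed through unchanged, spaces and punctuation included.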

So, this makes a big difference when porting simple bits of shell (and
usually, just in passing, fixing quoting bugs - if you had code that
used to do "ci -l $foo" and it is now "system(['ci', '-l', foo])"
you now properly handle spaces and punctuation in the value of foo,
"for free".) However, the other thing you tend to find in
"advanced"[1] shell scripts is lengthy pipelines. (Sure, you find
while loops and case statements and such - but python's control
structures handle those fine.)

Implementing pipelines takes rather a bit more work, and one might
(not unreasonably) throw up one's hands and just use os.system and
some re.sub's to do the quoting. However, I had enough cases where
the goal really was to run a complex shell pipeline (I also had cases
where the pipeline converted nicely to some inline python code,
especially with the help of the gzip module) that I sat down and
cooked up a pipeline class.

The interface I ended up with is pretty simple:
    g_pipe = pipeline()
    g_pipe.stdin(open("blort.gz", "r"))
    g_pipe.append(["gunzip"])
    g_pipe.append(["sort", "-u"])
    g_pipe.append(["wc", "-l"])
    g_pipe.stdout(open("blort.count", "w"))
    print g_pipe.run()

is equivalent to the sh:
    gunzip < blort.gz | sort -u | wc -l > blort.count

pipeline also has obvious stderr and chdir methods; pipeline.run
actually returns an array with the return status of *each* pipeline
element (which leads to "if filter(None, st): deal_with_error" being a
useful idiom for noticing failures that a shell script would typically
miss.)
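The post does not show the class itself. One way such an interface could be built on top of subprocess.Popen in Python 3 is sketched below. The class name Pipeline and all of its internals are assumptions made for illustration, and the stderr and chdir methods are left out.

```python
import subprocess


class Pipeline:
    """Hypothetical stand-in for the pipeline class described in the post."""

    def __init__(self):
        self._cmds = []
        self._stdin = None    # None means: inherit the parent's stdin
        self._stdout = None   # None means: inherit the parent's stdout

    def stdin(self, f):
        self._stdin = f

    def stdout(self, f):
        self._stdout = f

    def append(self, cmd):
        self._cmds.append(list(cmd))

    def run(self):
        """Start every stage, wait for all, return one exit status each."""
        procs = []
        source = self._stdin
        last = len(self._cmds) - 1
        for i, cmd in enumerate(self._cmds):
            sink = self._stdout if i == last else subprocess.PIPE
            proc = subprocess.Popen(cmd, stdin=source, stdout=sink)
            if procs:
                # Drop the parent's copy of the upstream read end so the
                # upstream stage sees a broken pipe if this stage exits early.
                procs[-1].stdout.close()
            procs.append(proc)
            source = proc.stdout
        return [p.wait() for p in procs]
```

With this sketch the example above would read the same apart from the capitalised class name and opening the files in binary mode ("rb" and "wb"). Note that in Python 3 filter returns a lazy iterator that is always true, so the failure check becomes "if any(st): deal_with_error".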

This has led me to a few questions:

1. Am I being dense? Are there already common modules (included or
otherwise) that do this, or solve the problem some other way?
2. Is there a more pythonic way of expressing the construction?
Would exposing the internal array of commands make more sense,
possibly by "passing through" various array operations on the
class to the internal array (as the use of "append" hints at)? Or
maybe "exec" objects that a "pipe" combiner operates on?
3. Should an interface like this be in a "battery" somewhere? shutil
didn't seem to quite match...
4. Any reason to even try porting this interface to non-unix systems?
Is there a close enough match to os.pipe/os.fork/os.exec/os.wait,
or some other construct that works on microsoft platforms?

_Mark_ <ei****@metacarta.com>

[1] in the Invader Zim sense :)
Jul 18 '05 #1


ei****@metacarta.com wrote:
| One of my recent projects has involved taking an accretion of sh and
| perl scripts and "doing them right" - making them modular, improving
| the error reporting, making it easier to add even more features to
| them. "Of course," I'm redoing them in python - much of the cut&paste
| reuse has become common functions, which then get made more robust and
| have a common style and are callable from other (python) tools
| directly, instead of having to exec scripts to get at them. The usual
| "glorious refactoring."
<<SNIP>>
| Implementing pipelines takes rather a bit more work, and one might
| (not unreasonably) throw up one's hands and just use os.system and
| some re.sub's to do the quoting. However, I had enough cases where
| the goal really was to run a complex shell pipeline (I also had cases
| where the pipeline converted nicely to some inline python code,
| especially with the help of the gzip module) that I sat down and
| cooked up a pipeline class.
|
| The interface I ended up with is pretty simple:
|     g_pipe = pipeline()
|     g_pipe.stdin(open("blort.gz", "r"))
|     g_pipe.append(["gunzip"])
|     g_pipe.append(["sort", "-u"])
|     g_pipe.append(["wc", "-l"])
|     g_pipe.stdout(open("blort.count", "w"))
|     print g_pipe.run()
|
| is equivalent to the sh:
|     gunzip < blort.gz | sort -u | wc -l > blort.count
<<SNIP>>
| _Mark_ <ei****@metacarta.com>
|
| [1] in the Invader Zim sense :)


I think that your pipeline code looks nothing like the original sh
script pipeline which to me counts heavily against it.
Just playing at the cygwin prompt...
    $ ls -l|wc -l > /tmp/lines_in_dir
    $ cat /tmp/lines_in_dir
    465
    $ python
    >>> from os import system
    >>> system(r'''/bin/ls -l|/bin/wc -l > /tmp/lines_in_dir2''')
    0
    >>> system(r'''/bin/cat /tmp/lines_in_dir2''')
    463
    0

I prefer the above because it looks like the original sh command.
Of course, if script security is very important then you may want to
change the way things are implemented again.
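If the shell-string form is kept and any part of the command comes from a variable, the quoting concern can be handled with the standard-library shlex.quote (Python 3, POSIX shells only). The sketch below is illustrative; the function names and the paths passed in are made up.

```python
import shlex
import subprocess


def line_count_command(directory, out_path):
    """Build `ls -l DIR | wc -l > OUT` with both paths quoted for sh."""
    return "ls -l {} | wc -l > {}".format(
        shlex.quote(directory), shlex.quote(out_path)
    )


def run_line_count(directory, out_path):
    """Run the pipeline through the shell; return the shell's exit status."""
    return subprocess.call(line_count_command(directory, out_path), shell=True)
```

The command still reads like the original sh line, while a directory name containing spaces or a semicolon is passed as a single argument instead of being reinterpreted by the shell.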

Cheers, Paddy.

Jul 18 '05 #2
Quoth ei****@metacarta.com:
....
| 1. Am I being dense? Are there already common modules (included or
| otherwise) that do this, or solve the problem some other way?

I can't tell you whether any of them has come to be common, but
there have been a handful of efforts along these lines - process
and pipeline creation.

| 2. Is there a more pythonic way of expressing the construction?
| Would exposing the internal array of commands make more sense,
| possibly by "passing through" various array operations on the
| class to the internal array (as the use of "append" hints at)? Or
| maybe "exec" objects that a "pipe" combiner operates on?

Only thing that comes to mind is error handling. It certainly is
not characteristic of Python functions to return an error status,
rather they typically raise exceptions. Ideally, I would think
the exception type for this would carry the exit status, other
information in the status word, and text from error/diagnostic
output. That last one is particularly important and particularly
awkward to get.
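In current Python the standard-library subprocess module covers much of this: subprocess.CalledProcessError carries the exit status and, when output is captured, the diagnostic text. A minimal Python 3 sketch follows; run_checked is a made-up name.

```python
import subprocess
import sys


def run_checked(cmd):
    """Run cmd (a list); return its stdout, raising with diagnostics on failure."""
    # capture_output collects stdout and stderr; check=True raises
    # CalledProcessError, which carries .returncode and .stderr.
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    failing = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('no such widget\\n'); sys.exit(3)",
    ]
    try:
        run_checked(failing)
    except subprocess.CalledProcessError as err:
        print("exit status:", err.returncode)
        print("diagnostics:", err.stderr.strip())
```

On POSIX systems a negative returncode means the child was killed by that signal number, which covers the "other information in the status word".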

See appended example for a trick to deal with the special case
where a Python exception is caught in the fork.

| 3. Should an interface like this be in a "battery" somewhere? shutil
| didn't seem to quite match...

No one ever likes anyone else's version of this, so it's typically
reinvented as required.

| 4. Any reason to even try porting this interface to non-unix systems?
| Is there a close enough match to os.pipe/os.fork/os.exec/os.wait,
| or some other construct that works on microsoft platforms?

There's os.spawnv, if you haven't noticed that.
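A minimal sketch of os.spawnv in Python 3 follows. The wrapper name spawn_and_wait is made up, and the program run here is simply the current Python interpreter.

```python
import os
import sys


def spawn_and_wait(path, args):
    """Run the program at path with argv list args; return its exit code."""
    # os.P_WAIT blocks until the child finishes; os.P_NOWAIT would return
    # immediately instead. By convention args[0] is the program name.
    return os.spawnv(os.P_WAIT, path, args)


if __name__ == "__main__":
    code = spawn_and_wait(
        sys.executable, [sys.executable, "-c", "import sys; sys.exit(5)"]
    )
    print("child exit code:", code)
```

Like the list form of system, this takes the arguments as a list and involves no shell. The Python documentation now recommends the subprocess module over the os.spawn* family for new code.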

Donn Cave, do**@drizzle.com
-----------
import fcntl
import posix
import sys
import pickle

def spawn_wnw(wait, file, args, env):
    # The pipe carries a pickled exception from the child if the exec fails.
    p0, p1 = posix.pipe()
    pid = posix.fork()
    if pid:
        posix.close(p1)
        # A successful exec closes p1 (close-on-exec), so this reads ''.
        ps = posix.read(p0, 1024)
        posix.close(p0)
        if wait:
            junk, ret = posix.waitpid(pid, 0)
        else:
            ret = pid
        if ps:
            e, v = pickle.loads(ps)
            # Python 2 syntax; in Python 3 this would be "raise v".
            raise e, v
        else:
            return ret
    else:
        try:
            fcntl.fcntl(p1, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
            posix.close(p0)
            posix.execve(file, args, env)
        except:
            e, v, t = sys.exc_info()
            s = pickle.dumps((e, v))
            posix.write(p1, s)
            posix._exit(117)

def spawnw(file, args, env):
    # Return the wait status; without "return" the caller gets None.
    return spawn_wnw(1, file, args, env)

def spawn(file, args, env):
    # Return the child's pid; without "return" the caller gets None.
    return spawn_wnw(0, file, args, env)

pid = spawn('/bin/bummer', ['bummer', '-ever', 'summer'], posix.environ)
Jul 18 '05 #3
