Oh no, my code is being published ... help!

rm
There is a Linux forum that I frequent from time to time on which I
mentioned a couple of scripts that I wrote. The editors of a small
Linux magazine heard and found them interesting enough to ask me to
write an article about them. I accepted gladly, of course. I wrote
the article and submitted it, and I was told to look for it in the
January issue. Sounds good, right?

The thing is I am starting to get a little nervous about it. You see,
programming is not my full time job. I dabble in it from time to
time, mostly to scratch my own itches, as they say. But, I know that
my code is probably far from being of professional quality. So, I was
wondering if some of you would be interested in taking a peek and
offering some suggestions for improving the quality and safety of the
code. Let me try to explain what they do.

Let's say, for example, that you have, as I do, a large directory tree
that you want to compress, containing data that you hardly ever use
but that you want to have easy access to from time to time. In my
case, that directory tree contains the RAW image files that come from
my DSLR camera. Each of those files is about 10 MB. The total size
of that directory tree is about 45 GB, and it is constantly growing.
(Note: I store my finished, "processed" images in a different
directory tree. They are stored as JPEG files, so they are already
compressed.) How would you go about using compression to reclaim some
disk space in a situation like this one?

Well, one way I came up with was to write my own tool to do this job.
I created a program called 7sqz (7Squeeze) that can take care of this
task with ease. It is a Python script that navigates through a
directory tree compressing its contents only, not the actual
directories. As it enters each directory in the tree, it saves all the
files in that directory to an archive in that same directory, giving it
the name of the directory itself. If it finds that the directory
already has an archive file with the correct name it leaves it alone
and goes to the next directory, unless it also finds an uncompressed
file in it. When that happens it simply moves it into the existing
archive file, updating it inside the archive if it was already there.
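
To make the walk concrete, here is a minimal sketch of the decision made in each directory. It is not the actual 7sqz source: the function name plan_squeeze and its ext parameter are made up for illustration, and the call to p7zip is left out, so it only reports what would be squeezed.

```python
import os


def plan_squeeze(root, ext=".7z"):
    """Yield (directory, archive_path, loose_files) for every directory
    under root that still holds files outside its own archive."""
    for dirpath, dirnames, filenames in os.walk(root):
        # The archive carries the name of the directory that holds it.
        name = os.path.basename(os.path.normpath(dirpath))
        archive = name + ext
        # Anything that is not the archive itself still needs squeezing.
        loose = sorted(f for f in filenames if f != archive)
        if loose:
            yield dirpath, os.path.join(dirpath, archive), loose
```

A directory that holds nothing but its own archive yields no entry, which matches the "leave it alone" case described above.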

I also created 7usqz, which is the counterpart of 7sqz. It
will simply go through a specified directory tree looking for archive
files named after the directory that holds them and will uncompress them,
essentially leaving the directory as it was before being squeezed.
Both 7sqz and 7usqz use p7zip for the actual compression, so you need
to have p7zip already installed.

You can obtain 7sqz from here:
http://rmcorrespond.googlepages.com/7sqz

And you can get 7usqz from here:
http://rmcorrespond.googlepages.com/7usqz

After downloading them, save them in a place like /usr/bin and make
sure they are executable.

To use 7sqz you could just give it a target directory as a parameter,
like this:

7sqz /home/some_directory

By default it will use the 7z format (which gives better compression
than zip), but you can use the zip format if you prefer by using the
-m option like this:

7sqz -m zip /home/some_directory

By default it will use Normal as the level of compression, but you can
use Extra or Max if you prefer by using the -l option like this:

7sqz -l Extra /home/some_directory

By default it will just skip any file that causes an error during
compression and will log the error, but you can tell it to "Halt on
Error" with the -e option like this:

7sqz -e /home/some_directory

And of course, you can combine options as you please like this:

7sqz -m zip -l Max -e /home/some_directory

As I said, 7usqz is the opposite counterpart of 7sqz. To use it you
could just give it a target directory as a parameter, like this:

7usqz /home/some_directory

By default it will just skip any file that causes an error during
decompression and will log the error, but you can tell it to "Halt on
Error" with the -e option like this:

7usqz -e /home/some_directory

Please do a few, or better yet a lot of tests, before using it on a
directory that you cannot afford to lose. I believe it has all the
necessary safety precautions to protect your data, but I can't
guarantee it. That is why I'm asking for your help. All I can say is
that I have never lost any data with it and that it works great for
me. What do you think?
Nov 29 '07 #1
rm <rm**********@gmail.com> wrote:
> The thing is I am starting to get a little nervous about it. You see,
> programming is not my full time job. I dabble in it from time to
> time, mostly to scratch my own itches, as they say. But, I know that
> my code is probably far from being of professional quality. So, I was
> wondering if some of you would be interested in taking a peek and
> offering some suggestions for improving the quality and safety of the
> code. Let me try to explain what they do.

There are several places where you do something like this:
    strng = "abcde"
    print str(strng)

Why the calls to str()? These are already strings, and even if they
weren't, "print" will convert its arguments.

You also have a couple of instances of:
    print("Error Squeezing %s...")

The parentheses serve no purpose here, and are unidiomatic.

You have:
    if ErrorHalt == True:
In almost every case, it's better to write this as:
    if ErrorHalt:

The same applies to:
    if debug == 1:
This is really a boolean, and most would write that as:
    if debug:

"Tuple" is spelled with one "p", but that's being awfully picky...

Note that this:
    def Save(self):
        LogFile = open(self.logpath, "w")
        for line in self.LogStrings:
            LogFile.write(line)
        LogFile.close()
can be written as:
    def Save(self):
        open(self.logpath, "w").writelines( self.LogStrings )
but that's really micro-optimization.

Most people would argue that every line in a log file should be written and
flushed to disk immediately, rather than held in memory until some later
time. Otherwise, if your program terminates unexpectedly, all of your log
information will be lost.
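
As a rough sketch of what that could look like (the class name Log and its methods are placeholders, not the names used in the script), each line is written, flushed and synced as soon as it is logged:

```python
import os


class Log(object):
    """Append-only log that pushes every line to disk as it is written."""

    def __init__(self, logpath):
        self.logfile = open(logpath, "a")

    def write(self, line):
        self.logfile.write(line.rstrip("\n") + "\n")
        self.logfile.flush()              # hand the line to the operating system
        os.fsync(self.logfile.fileno())   # ask the OS to commit it to disk

    def close(self):
        self.logfile.close()
```

The os.fsync() call costs some speed, so it is a judgment call; flush() alone already protects the log against a crash of the script itself.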

You assume that p7zip will be installed in /usr/bin/7za. That's not good.
On many of my systems, I install all new packages into /usr/local/bin. Some
people use /opt. You should really check the PATH to look for 7za.
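
One way to do that lookup, as a sketch only: the helper name find_on_path is invented here, and it simply tries each directory listed in PATH in order.

```python
import os


def find_on_path(program, path=None):
    """Return the full path of `program` found on PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", os.defpath)
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

The script could then call find_on_path("7za") once at startup and stop with a clear message if the result is None.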
--
Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Nov 30 '07 #2
Tim Roberts wrote:
> rm <rm**********@gmail.com> wrote:
(snip)
> You also have a couple of instances of:
>     print("Error Squeezing %s...")
>
> The parentheses serve no purpose here, and are unidiomatic.

And FWIW, error messages should go to sys.stderr, not to sys.stdout,
which is for normal program output.

=>
# assuming you imported sys before, of course
print >>sys.stderr, "Error yadda yadda %s " % anything
Nov 30 '07 #3
On Nov 30, 2007 1:19 AM, Tim Roberts <ti**@probo.com> wrote:
> You also have a couple of instances of:
>     print("Error Squeezing %s...")
>
> The parentheses serve no purpose here, and are unidiomatic.

I thought that with the eventual dropping of 'print' as a statement in
Python 3, writing it this way (as if it were a print function) is
preferred, since that will be one fewer thing to convert.

--

# p.d.
Nov 30 '07 #4
On Nov 30, 2007 11:18 AM, Peter Decker <py******@gmail.com> wrote:
> On Nov 30, 2007 1:19 AM, Tim Roberts <ti**@probo.com> wrote:
>> You also have a couple of instances of:
>>     print("Error Squeezing %s...")
>>
>> The parentheses serve no purpose here, and are unidiomatic.
>
> I thought that with the eventual dropping of 'print' as a statement in
> Python 3, writing it this way (as if it were a print function) is
> preferred, since that will be one fewer thing to convert.

No, writing this way will confound the 2to3 tool. Just try it for
yourself. You would have to write print(something) all the time and
disable the fix_print conversion. It is easier and safer to write the
common 2.Xish way and let 2to3 do the work.

--
http://www.advogato.org/person/eopadoan/
Bookmarks: http://del.icio.us/edcrypt
Nov 30 '07 #5
"Eduardo O. Padoan" <ed************@gmail.com> writes:
> No, writing this way will confound the 2to3 tool.

Why? print("foo") is a perfectly valid Python 2 statement. Maybe
it's simply a matter of fixing the tool.
Nov 30 '07 #6
On Fri, 30 Nov 2007 14:36:17 +0100, Hrvoje Niksic wrote:
> "Eduardo O. Padoan" <ed************@gmail.com> writes:
>> No, writing this way will confound the 2to3 tool.
>
> Why? print("foo") is a perfectly valid Python 2 statement. Maybe
> it's simply a matter of fixing the tool.

As this would encourage that stupid style I'd say -1 for that.

Written that way it looks like a function which it isn't. The current
Python version is still 2.5, there's a 2.6 ahead and the promise that the
2.x and 3.x branches will co-exist for some time.

If the function-looking style were adopted for 2.x, do *you* want to
explain to confused newbies why they can write::

    print('hello!')

but this acts "strange"::

    print('hello, my name is ', name)
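
To spell out the "strange" part, here is a small sketch that builds both outputs as plain strings, so it does not depend on which print you have. The value given to name is made up for the example.

```python
name = "World"  # hypothetical value, just for illustration

# In 2.x, print is a statement, so the parentheses build a tuple and the
# statement prints the tuple's repr.
as_statement = str(('hello, my name is ', name))

# In 3.x, print is a function that writes its arguments joined by a space.
as_function = ' '.join(['hello, my name is ', name])

print(as_statement)   # ('hello, my name is ', 'World')
print(as_function)    # hello, my name is  World
```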

Ciao,
Marc 'BlackJack' Rintsch
Nov 30 '07 #7
On Nov 30, 2007 11:36 AM, Hrvoje Niksic <hn*****@xemacs.org> wrote:
> "Eduardo O. Padoan" <ed************@gmail.com> writes:
>> No, writing this way will confound the 2to3 tool.
>
> Why? print("foo") is a perfectly valid Python 2 statement. Maybe
> it's simply a matter of fixing the tool.

print("foo") -> print(("foo"))

If you have any idea of how the tool could understand what you meant,
please report it at bugs.python.org :)

--
http://www.advogato.org/person/eopadoan/
Bookmarks: http://del.icio.us/edcrypt
Nov 30 '07 #8
On Fri, Nov 30, 2007 at 12:25:25PM -0200, Eduardo O. Padoan wrote regarding Re: Oh no, my code is being published ... help!:
> On Nov 30, 2007 11:36 AM, Hrvoje Niksic <hn*****@xemacs.org> wrote:
>> "Eduardo O. Padoan" <ed************@gmail.com> writes:
>>> No, writing this way will confound the 2to3 tool.
>>
>> Why? print("foo") is a perfectly valid Python 2 statement. Maybe
>> it's simply a matter of fixing the tool.
>
> print("foo") -> print(("foo"))

And more to the point

(2.5) print(foo, bar) != (3.0) print(foo, bar)

Nov 30 '07 #9
On 2007-11-30, Eduardo O. Padoan <ed************@gmail.com> wrote:
> On Nov 30, 2007 11:18 AM, Peter Decker <py******@gmail.com> wrote:
>> On Nov 30, 2007 1:19 AM, Tim Roberts <ti**@probo.com> wrote:
>>> You also have a couple of instances of:
>>>     print("Error Squeezing %s...")
>>>
>>> The parentheses serve no purpose here, and are unidiomatic.
>>
>> I thought that with the eventual dropping of 'print' as a
>> statement in Python 3, writing it this way (as if it were
>> a print function) is preferred, since that will be one fewer
>> thing to convert.
>
> No, writing this way will confound the 2to3 tool. Just try it for
> yourself. You would have to write print(something) all the time
> and disable the fix_print conversion. It is easier and safer to
> write the common 2.Xish way and let 2to3 do the work.

Output ought to be centralized to support maintenance, solving the
3.0 compatibility problem as a side-effect.

So the above would be something like:

    my_print("Error Squeezing %s..." % the_thingy)

With my_print defined appropriately for the time and place.
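
For instance, one possible definition, offered only as a sketch: the stream parameter and the my_error helper are additions for illustration, not something from the scripts.

```python
import sys


def my_print(message, stream=None):
    """Single choke point for everything the program prints."""
    if stream is None:
        stream = sys.stdout   # looked up at call time, so redirection works
    stream.write(message + "\n")


def my_error(message):
    """Error messages go to stderr rather than stdout."""
    my_print(message, sys.stderr)
```

Because my_print is an ordinary function call, the calling lines read the same whether print is a statement or a function.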

Moreover, publishing code today with print(...) will, at best,
require a needless digression.

--
Neil Cerutti
Nov 30 '07 #10
rm
On Nov 30, 10:01 am, Neil Cerutti <horp...@yahoo.com> wrote:
> On 2007-11-30, Eduardo O. Padoan <eduardo.pad...@gmail.com> wrote:
>> On Nov 30, 2007 11:18 AM, Peter Decker <pydec...@gmail.com> wrote:
>>> On Nov 30, 2007 1:19 AM, Tim Roberts <t...@probo.com> wrote:
>>>> You also have a couple of instances of:
>>>>     print("Error Squeezing %s...")
>>>> The parentheses serve no purpose here, and are unidiomatic.
>>> I thought that with the eventual dropping of 'print' as a
>>> statement in Python 3, writing it this way (as if it were
>>> a print function) is preferred, since that will be one fewer
>>> thing to convert.
>> No, writing this way will confound the 2to3 tool. Just try it for
>> yourself. You would have to write print(something) all the time
>> and disable the fix_print conversion. It is easier and safer to
>> write the common 2.Xish way and let 2to3 do the work.
>
> Output ought to be centralized to support maintenance, solving the
> 3.0 compatibility problem as a side-effect.
>
> So the above would be something like:
>
>     my_print("Error Squeezing %s..." % the_thingy)
>
> With my_print defined appropriately for the time and place.
>
> Moreover, publishing code today with print(...) will, at best,
> require a needless digression.
>
> --
> Neil Cerutti

Thanks for the great pointers. Exactly what I was looking for. At
least I hope it will save me some embarrassment. :)
Nov 30 '07 #11
rm
Thanks for all the comments. I uploaded revised versions of both
files. If you see any more problems with them or if you have any
suggestions for improving them, I am all ears.

:D
Nov 30 '07 #12
