Oh no, my code is being published ... help!

rm
There is a Linux forum that I frequent from time to time on which I
mentioned a couple of scripts that I wrote. The editors of a small
Linux magazine heard and found them interesting enough to ask me to
write an article about them. I accepted gladly, of course. I wrote
the article and submitted it, and I was told to look for it in the
January issue. Sounds good, right?

The thing is, I am starting to get a little nervous about it. You see,
programming is not my full-time job. I dabble in it from time to
time, mostly to scratch my own itches, as they say. But I know that
my code is probably far from being of professional quality. So I was
wondering if some of you would be interested in taking a peek and
offering some suggestions for improving the quality and safety of the
code. Let me try to explain what the scripts do.

Let's say, for example, that you have, as I do, a large directory tree
that you want to compress, containing data that you hardly ever use
but that you want to have easy access to from time to time. In my
case, that directory tree contains the RAW image files that come from
my DSLR camera. Each of those files is about 10 MB. The total size
of that directory tree is about 45 GB, and it is constantly growing.
(Note: I store my finished, "processed" images in a different
directory tree. They are stored as JPEG files, so they are already
compressed.) How would you go about using compression to reclaim some
disk space in a situation like this one?

Well, one way I came up with was to write my own tool to do this job.
I created a program called 7sqz (7Squeeze) that can take care of this
task with ease. It is a Python script that walks a directory tree,
compressing the files in each directory but not the directories
themselves. As it enters each directory in the tree, it saves all the
files in that directory to an archive in that same directory, giving
the archive the name of the directory itself. If it finds that the
directory already has an archive file with the correct name, it leaves
it alone and goes on to the next directory, unless it also finds an
uncompressed file there. When that happens, it simply moves the file
into the existing archive, updating the copy inside the archive if it
was already there.
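
For readers who want to see the idea in code, here is a minimal sketch of the directory walk described above. It is not the 7sqz source, which is not reproduced in this thread. The names plan_squeeze and build_command are hypothetical, the sketch is written for Python 3, and it only plans the work: it never runs p7zip and never deletes anything. "7za a" is p7zip's add command and "-t" selects the archive type, but the exact command line 7sqz uses is an assumption.

```python
import os


def plan_squeeze(root, ext=".7z"):
    """Yield (archive_path, loose_files) for every directory that needs work.

    The archive is named after the directory that holds it; any file other
    than that archive counts as "loose" and still has to be squeezed.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        archive_name = os.path.basename(os.path.abspath(dirpath)) + ext
        loose = sorted(f for f in filenames if f != archive_name)
        if loose:  # nothing to do if only the archive (or nothing) is present
            yield (os.path.join(dirpath, archive_name),
                   [os.path.join(dirpath, f) for f in loose])


def build_command(archive_path, files, fmt="7z", seven_zip="7za"):
    # Assumed invocation: "a" adds files to the archive, refreshing entries
    # that are already there.
    return [seven_zip, "a", "-t" + fmt, archive_path] + list(files)
```

Keeping the planning step separate from the step that runs the command makes it possible to check which files would be touched before anything is actually compressed.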

I also created 7usqz, which is the counterpart of 7sqz. It simply
goes through a specified directory tree looking for archive files
named after the directory that holds them and uncompresses them,
essentially leaving each directory as it was before being squeezed.
Both 7sqz and 7usqz use p7zip for the actual compression, so you need
to have p7zip already installed.
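
The reverse pass can be sketched the same way. Again, this is an illustration under assumptions rather than the 7usqz source: find_squeezed and build_extract_command are hypothetical names, the code targets Python 3, and it only locates archives and builds a command list. "7za x" extracts with full paths and "-o" (written with no space before the directory) sets the output directory; whether 7usqz calls p7zip this way is not stated in the thread.

```python
import os


def find_squeezed(root, extensions=(".7z", ".zip")):
    """Yield (directory, archive_path) for archives named after their directory."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirname = os.path.basename(os.path.abspath(dirpath))
        for ext in extensions:
            if dirname + ext in filenames:
                yield dirpath, os.path.join(dirpath, dirname + ext)


def build_extract_command(directory, archive_path, seven_zip="7za"):
    # Assumed invocation: extract the archive back into the directory it sits in.
    return [seven_zip, "x", archive_path, "-o" + directory]
```

Archives with any other name are ignored, which matches the description above: only files named after their holding directory are treated as squeezed content.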

You can obtain 7sqz from here:
http://rmcorrespond.googlepages.com/7sqz

And you can get 7usqz from here:
http://rmcorrespond.googlepages.com/7usqz

After downloading them, save them in a place like /usr/bin and make
sure they are executable.

To use 7sqz you could just give it a target directory as a parameter,
like this:

7sqz /home/some_directory

By default it will use the 7z format (which gives better compression
than zip), but you can use the zip format if you prefer by using the
-m option like this:

7sqz -m zip /home/some_directory

By default it will use Normal as the level of compression, but you can
use Extra or Max if you prefer by using the -l option like this:

7sqz -l Extra /home/some_directory

By default it will just skip any file that causes an error during
compression and log the error, but you can tell it to "Halt on
Error" with the -e option like this:

7sqz -e /home/some_directory

And of course, you can combine options as you please like this:

7sqz -m zip -l Max -e /home/some_directory
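
The options above could be parsed along these lines. This is a sketch for Python 3 using the standard argparse module, not the script's actual option handling (which is not shown in the thread). The destination names method, level and halt_on_error are made up for illustration, and accepting the level in any letter case is an assumption.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="7sqz")
    parser.add_argument("-m", dest="method", choices=["7z", "zip"],
                        default="7z", help="archive format")
    # str.capitalize normalises "MAX", "max" and "Max" to "Max" before the
    # value is checked against the allowed choices.
    parser.add_argument("-l", dest="level", type=str.capitalize,
                        choices=["Normal", "Extra", "Max"],
                        default="Normal", help="compression level")
    parser.add_argument("-e", dest="halt_on_error", action="store_true",
                        help="halt on the first error instead of skipping")
    parser.add_argument("directory", help="root of the tree to squeeze")
    return parser.parse_args(argv)
```

Calling parse_args(["-m", "zip", "-l", "Max", "-e", "/home/some_directory"]) mirrors the combined command line shown above.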

As I said, 7usqz is the counterpart of 7sqz. To use it, just give it
a target directory as a parameter, like this:

7usqz /home/some_directory

By default it will just skip any file that causes an error during
decompression and log the error, but you can tell it to "Halt on
Error" with the -e option like this:

7usqz -e /home/some_directory

Please do a few tests, or better yet a lot of them, before using it on
a directory that you cannot afford to lose. I believe it has all the
necessary safety precautions to protect your data, but I can't
guarantee it. That is why I'm asking for your help. All I can say is
that I have never lost any data with it and that it works great for
me. What do you think?
Nov 29 '07 #1
rm <rm**********@gmail.com> wrote:
> The thing is I am starting to get a little nervous about it. You see,
> programming is not my full time job. I dabble in it from time to
> time, mostly to scratch my own itches, as they say. But, I know that
> my code is probably far from being of professional quality. So, I was
> wondering if some of you would be interested in taking a peek and
> offer some suggestions for improving the quality and safety of the
> code. Let me try to explain what they do.

There are several places where you do something like this:

    strng = "abcde"
    print str(strng)

Why the calls to str()? These are already strings, and even if they
weren't, "print" will convert its arguments.

You also have a couple of instances of:

    print("Error Squeezing %s...")

The parentheses serve no purpose here, and are unidiomatic.

You have:

    if ErrorHalt == True:

In almost every case, it's better to write this as:

    if ErrorHalt:

The same applies to:

    if debug == 1:

This is really a boolean, and most would write that as:

    if debug:

"Tuple" is spelled with one "p", but that's being awfully picky...

Note that this:

    def Save(self):
        LogFile = open(self.logpath, "w")
        for line in self.LogStrings:
            LogFile.write(line)
        LogFile.close()

can be written as:

    def Save(self):
        open(self.logpath, "w").writelines( self.LogStrings )

but that's really micro-optimization.

Most people would argue that every line in a log file should be written and
flushed to disk immediately, rather than held in memory and saved at some
later time. If your program terminates unexpectedly, all of the log
information still held in memory will be lost.
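
A minimal sketch of that write-through approach, assuming Python 3. The class name Log and the method name add are hypothetical; the thread only shows the script's Save method, logpath and LogStrings. Each line goes to the file, is flushed from Python's buffer, and is then pushed to disk with os.fsync before the call returns.

```python
import os


class Log:
    """Append-only log that writes every line through to disk immediately."""

    def __init__(self, logpath):
        self._file = open(logpath, "a")

    def add(self, line):
        # Make sure each entry ends with exactly one newline.
        self._file.write(line.rstrip("\n") + "\n")
        self._file.flush()              # empty Python's own buffer
        os.fsync(self._file.fileno())   # ask the OS to commit it to disk

    def close(self):
        self._file.close()
```

The fsync call costs some speed on every entry, which is usually acceptable for an error log that sees only occasional writes.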

You assume that p7zip will be installed in /usr/bin/7za. That's not good.
On many of my systems, I install all new packages into /usr/local/bin. Some
people use /opt. You should really check the PATH to look for 7za.
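
One way to do that lookup, sketched for Python 3, where shutil.which searches the directories on PATH (it did not exist in the Python 2.5 of this thread, where you would loop over os.environ["PATH"] yourself). The function name find_7za is hypothetical, and trying 7z and 7zr as fallbacks is an assumption about which p7zip binaries would be acceptable.

```python
import shutil


def find_7za(names=("7za", "7z", "7zr"), path=None):
    """Return the full path of the first p7zip binary found, or None.

    With path=None the PATH environment variable is searched; passing an
    explicit path string is mainly useful for testing.
    """
    for name in names:
        found = shutil.which(name, path=path)
        if found:
            return found
    return None
```

A script could call this once at startup and exit with a clear message when it returns None, instead of failing later on a hard-coded /usr/bin/7za.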
--
Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Nov 30 '07 #2
Tim Roberts wrote:
> rm <rm**********@gmail.com> wrote:
(snip)
> You also have a couple of instances of:
> print("Error Squeezing %s...")
>
> The parentheses serve no purpose here, and are unidiomatic.

And FWIW, error messages should go to sys.stderr, not to sys.stdout,
which is for normal program output.

=>
# assuming you imported sys beforehand, of course
print >>sys.stderr, "Error yadda yadda %s " % anything
Nov 30 '07 #3
On Nov 30, 2007 1:19 AM, Tim Roberts <ti**@probo.com> wrote:
> You also have a couple of instances of:
> print("Error Squeezing %s...")
>
> The parentheses serve no purpose here, and are unidiomatic.

I thought that, with the eventual dropping of 'print' as a statement in
Python 3, writing it this way (as if it were a print function) was
preferred, since that will be one fewer thing to convert.

--

# p.d.
Nov 30 '07 #4
On Nov 30, 2007 11:18 AM, Peter Decker <py******@gmail.com> wrote:
> On Nov 30, 2007 1:19 AM, Tim Roberts <ti**@probo.com> wrote:
> > You also have a couple of instances of:
> > print("Error Squeezing %s...")
> >
> > The parentheses serve no purpose here, and are unidiomatic.
>
> I thought that with the eventual dropping of 'print' as a statement in
> Python 3, that writing it this way (as if it were a print function) is
> preferred, since that will be one fewer thing to convert.

No, writing it this way will confound the 2to3 tool. Just try it for
yourself. You would have to write print(something) all the time and
disable the fix_print conversion. It is easier and safer to write the
common 2.x-ish way and let 2to3 do the work.

--
http://www.advogato.org/person/eopadoan/
Bookmarks: http://del.icio.us/edcrypt
Nov 30 '07 #5
"Eduardo O. Padoan" <ed************@gmail.com> writes:
> No, writing this way will confound the 2to3 tool.

Why? print("foo") is a perfectly valid Python 2 statement. Maybe
it's simply a matter of fixing the tool.
Nov 30 '07 #6
On Fri, 30 Nov 2007 14:36:17 +0100, Hrvoje Niksic wrote:
> "Eduardo O. Padoan" <ed************@gmail.com> writes:
> > No, writing this way will confound the 2to3 tool.
>
> Why? print("foo") is a perfectly valid Python 2 statement. Maybe
> it's simply a matter of fixing the tool.

As this would encourage that stupid style I'd say -1 for that.

Written that way it looks like a function, which it isn't. The current
Python version is still 2.5, there's a 2.6 ahead, and the promise that the
2.x and 3.x branches will co-exist for some time.

If the function-looking style were adopted for 2.x, do *you* want to
explain to confused newbies why they can write::

print('hello!')

but this acts "strange":

print('hello, my name is ', name)
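
To make the "strange" behaviour concrete, here is a small sketch that runs under Python 3 and reproduces both results. Under the Python 2 print statement, the parenthesised pair is a single tuple expression, so the statement prints the tuple's repr; the sketch imitates that by handing the tuple to print as one argument. The variable names are made up for illustration.

```python
import io

name = "Marc"
args = ('hello, my name is ', name)

# Python 3 function call: two arguments, joined by a single space.
py3_buf = io.StringIO()
print(*args, file=py3_buf)
py3_output = py3_buf.getvalue()

# What the Python 2 statement effectively does with print(a, b):
# it prints ONE object, the tuple (a, b).
py2_buf = io.StringIO()
print(args, file=py2_buf)
py2_output = py2_buf.getvalue()

# With a single argument the parentheses are just grouping, so both
# versions print the same thing.
single_buf = io.StringIO()
print(('hello!'), file=single_buf)
single_output = single_buf.getvalue()
```

Here py3_output is the plain sentence, while py2_output starts with an opening parenthesis and contains the quotes and the comma of the tuple's repr.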

Ciao,
Marc 'BlackJack' Rintsch
Nov 30 '07 #7
On Nov 30, 2007 11:36 AM, Hrvoje Niksic <hn*****@xemacs.org> wrote:
> "Eduardo O. Padoan" <ed************@gmail.com> writes:
> > No, writing this way will confound the 2to3 tool.
>
> Why? print("foo") is a perfectly valid Python 2 statement. Maybe
> it's simply a matter of fixing the tool.

print("foo") -> print(("foo"))

If you have any idea of how the tool could understand what you meant,
please report it at bugs.python.org :)

--
http://www.advogato.org/person/eopadoan/
Bookmarks: http://del.icio.us/edcrypt
Nov 30 '07 #8
On Fri, Nov 30, 2007 at 12:25:25PM -0200, Eduardo O. Padoan wrote regarding Re: Oh no, my code is being published ... help!:
> On Nov 30, 2007 11:36 AM, Hrvoje Niksic <hn*****@xemacs.org> wrote:
> > "Eduardo O. Padoan" <ed************@gmail.com> writes:
> > > No, writing this way will confound the 2to3 tool.
> >
> > Why? print("foo") is a perfectly valid Python 2 statement. Maybe
> > it's simply a matter of fixing the tool.
>
> print("foo") -> print(("foo"))

And more to the point

(2.5) print(foo, bar) != (3.0) print(foo, bar)

Nov 30 '07 #9
On 2007-11-30, Eduardo O. Padoan <ed************@gmail.com> wrote:
> On Nov 30, 2007 11:18 AM, Peter Decker <py******@gmail.com> wrote:
> > On Nov 30, 2007 1:19 AM, Tim Roberts <ti**@probo.com> wrote:
> > > You also have a couple of instances of:
> > > print("Error Squeezing %s...")
> > >
> > > The parentheses serve no purpose here, and are unidiomatic.
> >
> > I thought that with the eventual dropping of 'print' as a
> > statement in Python 3, that writing it this way (as if it were
> > a print function) is preferred, since that will be one fewer
> > thing to convert.
>
> No, writing this way will confound the 2to3 tool. Just try for
> yourself. You would have to write print(something) all the time
> and disable the fix_print convertion. It is easier and safer to
> write the common 2.Xish way and let 2to3 do the work.

Output ought to be centralized to support maintenance, solving the
3.0 compatibility problem as a side effect.

So the above would be something like:

my_print("Error Squeezing %s..." % the_thingy)

With my_print defined appropriately for the time and place.
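
One possible definition, as a sketch only: the thread does not say how my_print should be written, and the stream parameter is an addition for illustration. It uses the stream's write method rather than print, so the same source works whether print is a statement (2.x) or a function (3.0).

```python
import sys


def my_print(message, stream=None):
    """Single place where the program produces a line of output."""
    if stream is None:
        # Looked up at call time so that a redirected sys.stdout is honoured.
        stream = sys.stdout
    stream.write(message + "\n")
```

Passing stream=sys.stderr sends error messages to standard error, and swapping in a log file later means changing only this one function.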

Moreover, publishing code today with print(...) will, at best,
require a needless digression.

--
Neil Cerutti
Nov 30 '07 #10
rm
On Nov 30, 10:01 am, Neil Cerutti <horp...@yahoo.com> wrote:
> On 2007-11-30, Eduardo O. Padoan <eduardo.pad...@gmail.com> wrote:
> > On Nov 30, 2007 11:18 AM, Peter Decker <pydec...@gmail.com> wrote:
> > > On Nov 30, 2007 1:19 AM, Tim Roberts <t...@probo.com> wrote:
> > > > You also have a couple of instances of:
> > > > print("Error Squeezing %s...")
> > > > The parentheses serve no purpose here, and are unidiomatic.
> > > I thought that with the eventual dropping of 'print' as a
> > > statement in Python 3, that writing it this way (as if it were
> > > a print function) is preferred, since that will be one fewer
> > > thing to convert.
> > No, writing this way will confound the 2to3 tool. Just try for
> > yourself. You would have to write print(something) all the time
> > and disable the fix_print convertion. It is easier and safer to
> > write the common 2.Xish way and let 2to3 do the work.
>
> Output ought to be centralized to support maintenance, solving the
> 3.0 compatibility problem as a side effect.
>
> So the above would be something like:
>
> my_print("Error Squeezing %s..." % the_thingy)
>
> With my_print defined appropriately for the time and place.
>
> Moreover, publishing code today with print(...) will, at best,
> require a needless digression.
>
> --
> Neil Cerutti

Thanks for the great pointers. Exactly what I was looking for. At
least I hope it will save me some embarrassment. :)
Nov 30 '07 #11
rm
Thanks for all the comments. I uploaded revised versions of both
files. If you see any more problems with them or if you have any
suggestions for improving them, I am all ears.
Nov 30 '07 #12
