Secure delete with python

Hi everybody.
I was wondering whether there is a method or function already implemented in
Python that supports secure deletion of data.

I'm interested in something that can securely wipe data (from a
single file to a bunch of MBs) and that runs on both Linux and
Windows.

I searched on Google but didn't find anything useful.

Thank you very much in advance.

Boris Genc
Jul 18 '05 #1
Boris Genc <boris.genc@REMOVE_mindless_ME.com> wrote:
Hi everybody.
I was wandering is there a method or a function already implemented in
python that supports secure deletion of data?

I'm interested in something which is able to securely wipe data (from
single file to bunch of MB's), and that should run both on Linux and
Windows.


When people talk about secure deletion of data, they generally mean
things like overwriting the physical disk blocks that used to hold the
file with random data. The details of how you do this are extremely
operating-system dependent (and probably also depend on the kind of file
system, hardware, etc.). Not to mention that the definition of "secure"
will vary with the type of data and who's doing it (e.g. what I
consider secure probably doesn't pass muster with the military).
Jul 18 '05 #2
Boris Genc wrote:
Hi everybody.
I was wandering is there a method or a function already implemented in
python that supports secure deletion of data?

I'm interested in something which is able to securely wipe data (from
single file to bunch of MB's), and that should run both on Linux and
Windows.

I tried on google, but I hadn't found anything useful to me.

Thank you very much in advance.

Boris Genc

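Something like

import os

size = os.path.getsize(path)  # read the size first: opening with "wb" truncates the file
fp = open(path, "wb")
for i in range(size):
    fp.write(b"*")  # bytes, because the file is opened in binary mode
fp.close()
os.unlink(path)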

is probably all you can do in a portable way (multiple write phases with
different data could improve the 'security'). But a problem that cannot be
solved in a portable way is that the data might exist at other locations on the
disk (e.g. temporary file, backup, swapfile...). Unless you know *exactly* that
there *cannot* be another copy of the data, you would have to erase all unused
parts of the filesystem, too - a process that heavily depends on which
filesystem is used.
Jul 18 '05 #3
Benjamin Niemann wrote:
Boris Genc wrote:
Hi everybody.
I was wandering is there a method or a function already implemented in
python that supports secure deletion of data?

I'm interested in something which is able to securely wipe data (from
single file to bunch of MB's), and that should run both on Linux and
Windows.

I tried on google, but I hadn't found anything useful to me.

Thank you very much in advance.

Boris Genc


something like

size = os.path.getsize(path)  # read the size before "wb" truncates the file
fp = open(path, "wb")
for i in range(size):
    fp.write(b"*")
fp.close()
os.unlink(path)


And there is no guarantee that this actually overwrites the old file. The
filesystem may choose to write the new content to another location on the disk,
leaving the original data untouched.
Jul 18 '05 #4
On Mon, 06 Sep 2004 09:10:49 -0400, Roy Smith wrote:
When people talk about secure deletion of data, they generally mean
things like over-writing the physical disk blocks that used to hold the
file with random data. The details of how you do this is extremely
operating system dependent (and probably also on what kind of file
system, hardware, etc). Not to mention that the definition of "secure"
will vary with the type of data, and who's doing it (i.e. what I
consider secure probably doesn't pass muster with the military).


Yes, I was thinking about overwriting the data I want deleted with
random data. I know that things like that are OS specific. I wasn't
thinking about the Gutmann method and its 35 passes; it's more of a
simple utility, more "hide from your sister" than "hide from the
government" type :)

Anyway, thank you guys. Benjamin, I think your method will suit me, thank
you.
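
For the "hide from your sister" case, a single pass of random data may be enough. The following is a minimal sketch, not a vetted tool: the function names `overwrite_with_random` and `simple_wipe` and the 64 KiB chunk size are assumptions made for illustration. It opens the file in "r+b" mode so the existing file is updated rather than truncated, and it cannot guarantee that the filesystem or the drive writes over the original blocks.

```python
import os

def overwrite_with_random(path, chunk_size=64 * 1024):
    """Overwrite the file at `path` once with random bytes, keeping its size."""
    remaining = os.path.getsize(path)
    # "r+b" opens the existing file for update without truncating it.
    with open(path, "r+b") as fp:
        while remaining > 0:
            n = min(chunk_size, remaining)
            fp.write(os.urandom(n))
            remaining -= n
        fp.flush()
        os.fsync(fp.fileno())  # ask the OS to push its buffers to the device

def simple_wipe(path):
    """Overwrite the file once, then remove its directory entry."""
    overwrite_with_random(path)
    os.unlink(path)
```

Working in chunks keeps memory use flat even for files of many megabytes.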

Jul 18 '05 #5
Boris Genc <boris.genc@REMOVE_mindless_ME.com> writes:
I'm interested in something which is able to securely wipe data (from
single file to bunch of MB's), and that should run both on Linux and
Windows.


I wrote something like that:

http://www.nightsong.com/phr/crypto/keytree.py

Explanation at: http://tinyurl.com/67beu
Jul 18 '05 #6
>>>>> "Benjamin" == Benjamin Niemann <b.*******@betternet.de> writes:
fp = open(path, "wb")
for i in range(os.path.getsize(path)):
    fp.write("*")
fp.close()
os.unlink(path)


Benjamin> and there is no guarantee that this actually overwrites
Benjamin> the old file. The filesystem may choose to write the new
Benjamin> content at another location of the disk, leaving the
Benjamin> original data untouched.

Seriously? What OSen are known for doing this? I'd have thought that if
the file size is unchanged, the data is always written over the old
data...

Also, when overwriting a file, it's better to do it several times,
with alternating bit patterns and "syncing" the disk after each
pass. Of course even that is not going to guarantee anything because
it may just go to the hardware cache in the disk unit, but it's
reasonable if you are overwriting lots of data at once.

Performing these steps, you'll at least get a good false sense of
security ;-).
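
The multi-pass idea could be sketched roughly as follows. The function name `multipass_overwrite`, the default patterns (0x55 and 0xAA are complementary alternating bit patterns, followed by zeros) and the chunk size are illustrative assumptions. `os.fsync` only asks the OS to flush its own buffers, so the caveat about the drive's hardware cache still applies.

```python
import os

def multipass_overwrite(path, patterns=(b"\x55", b"\xaa", b"\x00"),
                        chunk_size=64 * 1024):
    """Overwrite `path` once per single-byte pattern, syncing after each pass.

    Returns the number of bytes covered by each pass."""
    size = os.path.getsize(path)
    with open(path, "r+b") as fp:  # update in place, do not truncate
        for pattern in patterns:
            fp.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                fp.write(pattern * n)  # n copies of the one-byte pattern
                remaining -= n
            fp.flush()
            os.fsync(fp.fileno())  # sync before starting the next pass
    return size
```

After the call the file still exists and holds the last pattern; deleting it with `os.unlink` is a separate step.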

--
Ville Vainio http://tinyurl.com/2prnb
Jul 18 '05 #7
On Mon, 06 Sep 2004 15:25:51 +0200, Benjamin Niemann
<b.*******@betternet.de> declaimed the following in comp.lang.python:

fp = open(path, "wb")
Opening for "w", on many systems I've used, basically creates a
new file that may or may not use the same disk region (it definitely
wouldn't on UCSD P-system -- when I used that all files opened for
output were opened in the largest contiguous space on the disk).

Opening the file for "r+" is probably better; since it indicates
one may wish to read from the file along with writing to it, then the
original file must be available -- and I've not heard of any OS that
makes complete copies of a file during updates (I'm not counting the
behavior of editors/word-processors that read the entire file into
memory and create a temporary backup copy).
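
The difference between the two modes is easy to observe from Python. This small sketch (the helper name `size_after_open` is made up for the example) only shows that "wb" truncates the file on open while "r+b" leaves the existing bytes in place; it says nothing about which physical blocks the filesystem uses afterwards.

```python
import os
import tempfile

def size_after_open(mode, payload=b"secret data"):
    """Create a scratch file holding `payload`, reopen it with `mode`, and
    return the file size the OS reports while the file is still open."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as fp:
            fp.write(payload)
        with open(path, mode) as fp:
            return os.fstat(fp.fileno()).st_size
    finally:
        os.unlink(path)  # always clean up the scratch file

print(size_after_open("wb"))   # 0: the old contents are discarded on open
print(size_after_open("r+b"))  # 11: the existing bytes are still there
```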

--
Wulfraed  Dennis Lee Bieber  KD6MOG
wl*****@ix.netcom.com | wu******@dm.net
Bestiaria Support Staff
Home Page: <http://www.dm.net/~wulfraed/>
Overflow Page: <http://wlfraed.home.netcom.com/>

Jul 18 '05 #8
Ville Vainio wrote:
Seriously? What OSen are known for [writing new content at
another location of the disk]? I'd had thought that if
the file size is unchanged, the data is always written over the old
data...


It can even be filesystem specific. Back in the days
of WORM drives (do people still use those?) you could write
once to a place on the drive, but read it many times
(Write Once Read Many). Changing a file meant writing a
new copy of it and writing a new index to point to the
new file, ignoring the old. That is, all copies of the
file would stay on the disk.

The VMS systems always kept an old copy of the file around
unless you explicitly deleted it. By default a directory
listing would only show the most recent copy of the file,
but you could tell it to show all the versions, which
would look like (roughly, it's been 15 years since I last saw VMS)

MYFILE;1
MYFILE;2
...
MYFILE;94

It was believed this feature was a deliberate ploy of
DEC to sell more hard drives. ;)

If you read a file then wait a while, and during that time
the OS decides to defragment the drive, then the location
of the file could easily be changed from underneath you.

Andrew
da***@dalkescientific.com
Jul 18 '05 #9
Ville Vainio wrote:
>>"Benjamin" == Benjamin Niemann <b.*******@betternet.de> writes:
>> fp = open(path, "wb")
>> for i in range(os.path.getsize(path)):
>>     fp.write("*")
>> fp.close()
>> os.unlink(path)


Benjamin> and there is no guarantee that this actually overwrites
Benjamin> the old file. The filesystem may choose to write the new
Benjamin> content at another location of the disk, leaving the
Benjamin> original data untouched.

Seriously? What OSen are known for doing this? I'd had thought that if
the file size is unchanged, the data is always written over the old
data...


VMS, I believe, has a versioning system built into the file system. Each
time a file is saved, a new version is created while the old versions
are still there. All from hearsay though, I have never used or seen VMS
myself.

--
"Codito ergo sum"
Roel Schroeven
Jul 18 '05 #10
Ville Vainio wrote:
>>"Benjamin" == Benjamin Niemann <b.*******@betternet.de> writes:
>> fp = open(path, "wb")
>> for i in range(os.path.getsize(path)):
>>     fp.write("*")
>> fp.close()
>> os.unlink(path)

Benjamin> and there is no guarantee that this actually overwrites
Benjamin> the old file. The filesystem may choose to write the new
Benjamin> content at another location of the disk, leaving the
Benjamin> original data untouched.

Seriously? What OSen are known for doing this? I'd had thought that if
the file size is unchanged, the data is always written over the old
data...

I don't know whether there actually is a filesystem that does this, but
there is no rule (that comes to mind now at least) that forbids it. E.g.
I could imagine some kind of transactional FS that doesn't change the
original file until the transaction finishes (= the file is closed), to avoid
file corruption if a program crashes while writing...

Modern filesystems do lots of things most people (including me) can't
imagine. ReiserFS, e.g., packs several small files into one block. If such
a file grows, (perhaps) the data is moved to a block of its own, and the
old data stays (unreferenced) on disk although you didn't consciously
make a copy of the file...

But I'm just thinking aloud; I don't know whether any of this is true.
But I expect the task of a "secure delete" to be pretty difficult.

Also, when overwriting a file, it's better to do it several times,
with alternating bit patterns and "syncing" the disk after each
pass. Of course even that is not going to guarantee anything because
it may just go to the hardware cache in the disk unit, but it's
reasonable if you are overwriting lots of data at once.

Performing these steps, you'll at least get a good false sense of
security ;-).

Jul 18 '05 #11
Ville Vainio <vi***@spammers.com> writes:
Benjamin> and there is no guarantee that this actually overwrites
Benjamin> the old file. The filesystem may choose to write the new
Benjamin> content at another location of the disk, leaving the
Benjamin> original data untouched.

Seriously? What OSen are known for doing this? I'd had thought that if
the file size is unchanged, the data is always written over the old
data...

That's what log structured file systems do, for example.

Also, when overwriting a file, it's better to do it several times,
with alternating bit patterns and "syncing" the disk after each
pass. Of course even that is not going to guarantee anything because
it may just go to the hardware cache in the disk unit, but it's
reasonable if you are overwriting lots of data at once.


It may never get written to the same sector of the disk as the
original file, even if the OS has tried to overwrite those sectors.
Disk drives themselves will sometimes remap sectors from one place to
another.
Jul 18 '05 #12
On Mon, 06 Sep 2004 20:40:50 GMT, Roel Schroeven
<rs****************@fastmail.fm> declaimed the following in
comp.lang.python:

VMS, I believe, has a versioning system built into the file system. Each
time a file is saved, a new version is created while the old versions

The keyword is "saved"... If opened in an "update" mode, one is
working with just the original file. Things like editors, however,
typically duplicate the contents (with modifications) into a NEW file,
incrementing the version number.

--
Wulfraed  Dennis Lee Bieber  KD6MOG
wl*****@ix.netcom.com | wu******@dm.net
Bestiaria Support Staff
Home Page: <http://www.dm.net/~wulfraed/>
Overflow Page: <http://wlfraed.home.netcom.com/>

Jul 18 '05 #13
Paul Rubin <http://ph****@NOSPAM.invalid> wrote in message news:<7x************@ruckus.brouhaha.com>...
Ville Vainio <vi***@spammers.com> writes:
Benjamin> and there is no guarantee that this actually overwrites
Benjamin> the old file. The filesystem may choose to write the new
Benjamin> content at another location of the disk, leaving the
Benjamin> original data untouched.

Seriously? What OSen are known for doing this? I'd had thought that if
the file size is unchanged, the data is always written over the old
data...


That's what log structured file systems do, for example.
Also, when overwriting a file, it's better to do it several times,
with alternating bit patterns and "syncing" the disk after each
pass. Of course even that is not going to guarantee anything because
it may just go to the hardware cache in the disk unit, but it's
reasonable if you are overwriting lots of data at once.


It may never get written to the same sector of the disk as the
original file, even if the OS has tried to overwrite those sectors.
Disk drives themselves will sometimes remap sectors from one place to
another.


I had this idea once, back when I assumed that the OS writes to the
free blocks nearest the beginning of the disk: simply write a whole
bunch of crap files to fill the blocks where recently deleted files
used to be, then defrag the filesystem, then delete the crap files.

I'm just thinking aloud, in case any of this helps.
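
A capped version of the junk-file idea might look like this. It is a sketch under stated assumptions: the function name `fill_and_remove`, the file name `junk.tmp` and the `max_bytes` cap are all hypothetical, and nothing guarantees that the filesystem places the junk in the blocks that the deleted files used to occupy.

```python
import os

def fill_and_remove(directory, max_bytes, chunk_size=1024 * 1024):
    """Write `max_bytes` of random junk into `directory`, sync it, delete it.

    Returns the number of bytes written. The caller is assumed to pick a
    `max_bytes` value below the free space on the volume."""
    path = os.path.join(directory, "junk.tmp")  # hypothetical file name
    written = 0
    try:
        with open(path, "wb") as fp:
            while written < max_bytes:
                n = min(chunk_size, max_bytes - written)
                fp.write(os.urandom(n))
                written += n
            fp.flush()
            os.fsync(fp.fileno())  # make sure the junk actually leaves the cache
    finally:
        if os.path.exists(path):
            os.unlink(path)  # remove the junk file even if writing failed
    return written
```

The cap could be derived from `shutil.disk_usage(directory).free`, leaving some headroom so that other programs do not run out of space.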
Jul 18 '05 #14
ma**********@gmail.com (Matthew K Jensen) writes:
I had this idea once, when I assumed that the OS wrote to the first
blocks nearest to the beginning of the disk, to where I just simply
write a whole bunch of crap files to fill in blocks that could be the
place where recently deleted files used to be. Then defrag the
filesystem. Then delete the crap files.

I'm just thinking aloud if any of this helps.


If you're 1) in control of what the OS does; and 2) not concerned
about securing the data against serious recovery attempts, then ok,
there's all kinds of stuff you can do that gives reasonable protection.

In practice, 1) you're usually not in control of the OS and so you
can't assume what order blocks are written in; and 2) if you're
writing a security application for use by other people, you don't
necessarily know what kinds of opponents your users will have or what
will happen if their data escapes, so you have to guard against
powerful data recovery techniques (including as-yet-uninvented ones)
as well as casual ones.

I think you're best off assuming that short of melting the platters,
there's no way to ever erase data from a hard drive, i.e. that a
sufficiently powerful attacker can recover every state that the drive
has ever been in. The solution is to write only encrypted data to the
drive, and don't store the key on the drive.
Jul 18 '05 #15
Ville Vainio <vi***@spammers.com> wrote in
news:du*************@amadeus.cc.tut.fi:
Seriously? What OSen are known for doing this? I'd had thought that if
the file size is unchanged, the data is always written over the old
data...


I don't know for certain, but I think it is a pretty safe bet that NTFS
allocates new disc blocks instead of updating the existing ones.

NTFS is a transaction based file system, i.e. it guarantees that any
particular disc operation either completes or doesn't, you can never get
file-system corruption due to a power loss part way through updating a
file. Transactions are written to two transaction logs (in case one is
corrupted on failure), and every few seconds the outstanding transactions
are committed. Once committed there is sufficient information in the
transaction log that even if power is lost the transaction can be
completed, and likewise any transaction that has not been committed has
sufficient information stored that it can be rolled back.

There isn't very much published information on the NTFS internals (any
useful references gratefully received), but so far as I can see writing
updates to a fresh disc block would be the only realistic way to implement
this (otherwise you would need to write the data three times: once to each
transaction log then again to the actual file). If the data is written
separately then the transaction log only needs to store the location of the
new data (so it can be wiped if the transaction is rolled back) and then
update pointers when it is committed.

The other reason why I'm sure overwriting an existing file must allocate
new disc blocks is that NTFS supports compression on files, so if you start
off with a compressed file containing essentially random data and overwrite
it with repeated data (e.g. nulls) it will occupy less disc space.
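
The size argument can be illustrated from Python. In this sketch `zlib` is only a stand-in (NTFS uses its own compression scheme, so the exact numbers are not comparable). It shows that random data is essentially incompressible while a run of nulls shrinks to almost nothing, which is why the overwritten file would need far fewer blocks than the original.

```python
import os
import zlib

size = 1024 * 1024  # 1 MiB of sample data for each case

random_block = os.urandom(size)   # stands in for an "essentially random" file
null_block = b"\x00" * size       # what an overwrite with nulls would store

random_compressed = len(zlib.compress(random_block))
null_compressed = len(zlib.compress(null_block))

print(random_compressed)  # about as large as the input
print(null_compressed)    # a tiny fraction of the input
```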
Jul 18 '05 #16
Paul Rubin wrote:
I think you're best off assuming that short of melting the platters,
there's no way to ever erase data from a hard drive, i.e. that a
sufficiently powerful attacker can recover every state that the drive

The German PC magazine c't sent hard disks that had been overwritten once with
zeros to data recovery firms. No data was recovered. So unless your
opponent has secret service connections, I'd say you are safe. He would
rather watch your screen or log your keystrokes than mess with the HD, if
he's not already in your WLAN, that is.

has ever been in. The solution is to write only encrypted data to the
drive, and don't store the key on the drive.


As a special case, make sure the OS doesn't write the key to disk while swapping.

Peter
Jul 18 '05 #17
For example source code and discussion for Windows see
http://www.sysinternals.com/ntw2k/source/sdelete.shtml

Neil
Jul 18 '05 #18
On Tue, Sep 07, 2004 at 10:40:07AM +0200, Peter Otten wrote:
has ever been in. The solution is to write only encrypted data to the
drive, and don't store the key on the drive.


As a special case, avoid that the OS writes the key to disk while swapping.


or encrypt the swapfile. In fact, encrypt the disk, then partition it;
this is easily done with the device mapper in linux 2.6...

--
John Lenton (jo**@grulic.org.ar) -- Random fortune:
Todo lo que nace es digno de morir. -- Goethe --

Jul 18 '05 #19
"Neil Hodgson" <nh******@bigpond.net.au> writes:
For example source code and discussion for Windows see
http://www.sysinternals.com/ntw2k/source/sdelete.shtml


See also the stuff at briggsoft.com. Apparently a lot of so-called
secure deletion products on Windows don't work nearly as well as
claimed. Kent Briggs hangs out on sci.crypt and has evaluated a lot
of them besides marketing some of his own.
Jul 18 '05 #20
Andrew Dalke <ad****@mindspring.com> wrote:
Ville Vainio wrote:
Seriously? What OSen are known for [writing new content at
> another location of the disk]? I'd had thought that if
the file size is unchanged, the data is always written over the old
data...
It can even be filesystem specific. Back in the days
of WORM drives (do people still use those?)


I know that some government agencies were still using them as recently
as a couple of years ago .. there were some regulations that said that
you had to keep undeletable backups of everything. I think your choice
was basically between a warehouse full of file cabinets and a WORM
drive or two.

-michael
Jul 18 '05 #21
Michael George Lerner <ml*****@NO.SPAMumichPLEASE.edu> wrote:
Andrew Dalke <ad****@mindspring.com> wrote:
Ville Vainio wrote:
Seriously? What OSen are known for [writing new content at
> another location of the disk]? I'd had thought that if
the file size is unchanged, the data is always written over the old
data...

It can even be filesystem specific. Back in the days
of WORM drives (do people still use those?)


I know that some government agencies were still using them as recently
as a couple of years ago .. there were some regulations that said that
you had to keep undeletable backups of everything. I think your choice
was basically between a warehouse full of file cabinets and a WORM
drive or two.


...or much cheaper CD-R (not -RW) disks...? Not sure how well
they'll age, but they're write-once, read-many too, and cheap (besides,
they don't cost much... ;-).
Alex
Jul 18 '05 #22
