
reading files with error

Hi,

I'm trying to read some files (video files) that seem to have some
errors in them. Basically, I cannot copy them off the discs, as that
gives me an error message, but I can still play the files using a media
player like VLC or QuickTime. I understand that copying a file also
invokes error-checking routines, and I am guessing that the files may
have some parity-bit error or something.

Am I able to use Python to force a read of the entire file (full length)?
That is, without checking the read for errors. I know that this is
insidious in many uses, but in media files it may only result in a
skipped frame or something. What I've done is something like this:
f = open('/Volumes/NEW/gameday/morning.dat', 'rb')
data = f.read()
o = open('/Users/mauriceling/Desktop/o.dat', 'wb')
f.close()
o.write(data)
o.close()


What I've noticed is this:
1. sometimes it (Python) only reads to roughly the point of initial copy
error (I try to take note of how far drag-and-drop copying proceeds
before it fails)
2. sometimes it is able to read past the drag-and-drop copying
fail-point, but may or may not reach full length.
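
The fail point in the two observations above can be measured rather than estimated by eye. The following is only a minimal sketch, assuming the read failure surfaces in Python as an OSError; the function name first_read_error and the chunk size are hypothetical, chosen for illustration. It reads in chunks and reports how many bytes came back before the first error.

```python
import os
import tempfile

def first_read_error(path, chunk_size=64 * 1024):
    """Return (bytes_read, error); error is None if the whole file was read."""
    bytes_read = 0
    # buffering=0 gives a raw file object, so each read() is one OS-level read.
    with open(path, "rb", buffering=0) as f:
        while True:
            try:
                chunk = f.read(chunk_size)
            except OSError as exc:
                # The read failed: report how far we got and why.
                return bytes_read, exc
            if not chunk:
                return bytes_read, None  # clean end of file
            bytes_read += len(chunk)

# Demonstration on a healthy temporary file (no damaged disc needed).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 200_000)
demo_path = tmp.name
result = first_read_error(demo_path)
os.remove(demo_path)
print(result)
```

On a damaged disc the first element would be the offset of the first unreadable region, which can be compared between runs.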

What determines whether Python is able to make it past the drag-and-drop
copying fail-point?

Is there any way to do what I want, i.e. force a read of the full length?

Thanks and cheers
Maurice
Sep 18 '05 #1
4 Replies


I have written a program to do something similar. My strategy is:
* use os.read() to read 512 bytes at a time
* when os.read fails, seek to the next multiple of 512 bytes
and write '\0' * 512 to the output
I notice this code doesn't deal properly with short reads, but in my experience
they never happen (because the disk error makes an entire block unreadable, and
a block is never less than 512 bytes).

I use this code on a unix-style system.

import errno
import os
import sys
import time

def dd(src, target, bs=512):
    # Copy src to target in bs-byte blocks, writing zeros for unreadable blocks.
    src = os.open(src, os.O_RDONLY)
    if os.path.exists(target):
        # Resume an earlier run from the end of the existing output.
        target = os.open(target, os.O_WRONLY | os.O_APPEND, 0o600)
        existing = os.lseek(target, 0, SEEK_END)
    else:
        existing = 0
        target = os.open(target, os.O_WRONLY | os.O_CREAT, 0o600)

    total = os.lseek(src, 0, SEEK_END) // bs
    os.lseek(src, existing, SEEK_SET)
    os.lseek(target, existing, SEEK_SET)

    if existing:
        print("starting at", existing)
    i = existing // bs  # blocks processed so far
    f = 0               # blocks that could not be read
    lastrem = -1

    last = start = time.time()
    while True:
        try:
            block = os.read(src, bs)
        except OSError as detail:
            if detail.errno != errno.EIO:
                raise
            # Unreadable block: substitute zeros and skip to the next block.
            block = b"\0" * bs
            os.lseek(src, (i + 1) * bs, SEEK_SET)
            f = f + 1
        if block == b"":
            break  # end of file

        i = i + 1
        os.write(target, block)

        # Update the progress line at most once a second, every 1000 blocks.
        now = time.time()
        if i % 1000 or now - last < 1:
            continue
        last = now

        frac = i / total
        rem = int((now - start) * (1 - frac) / frac)  # estimated seconds left
        if rem < 60 or abs(rem - lastrem) > .5:
            rm, rs = divmod(rem, 60)
            lastrem = rem
            spd = i * bs / (now - start) / 1024 / 1024
            sys.stderr.write("%8d %8d %8d %3.1f%% %6d:%02d %6.1fMB/s\r"
                             % (i, f, i - f, i * 100. / total, rm, rs, spd))
    sys.stderr.write("\n")
    os.close(src)
    os.close(target)
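
The short-read caveat mentioned above can be handled by reading each block at an explicit offset and looping until the block is complete. This is only a sketch under stated assumptions, not the code from the larger program: the names read_block and rescue_copy are hypothetical, it relies on os.pread (available on Unix-style systems only), and it assumes an unreadable block shows up as an EIO error.

```python
import errno
import os
import tempfile

def read_block(fd, offset, bs=512):
    """Read up to bs bytes at offset; return (data, ok)."""
    chunks = []
    got = 0
    while got < bs:
        try:
            # pread reads at an absolute offset, so no seek bookkeeping is needed.
            piece = os.pread(fd, bs - got, offset + got)
        except OSError as exc:
            if exc.errno != errno.EIO:
                raise
            return b"\0" * bs, False  # unreadable block: hand back zeros
        if not piece:
            break  # end of file
        chunks.append(piece)  # short read: keep going until the block is full
        got += len(piece)
    return b"".join(chunks), True

def rescue_copy(src_path, dst_path, bs=512):
    """Copy src to dst block by block; return the number of zero-filled blocks."""
    bad = 0
    src = os.open(src_path, os.O_RDONLY)
    try:
        size = os.fstat(src).st_size
        with open(dst_path, "wb") as dst:
            for offset in range(0, size, bs):
                data, ok = read_block(src, offset, bs)
                if not ok:
                    bad += 1
                    data = data[:size - offset]  # do not pad past end of file
                dst.write(data)
    finally:
        os.close(src)
    return bad

# Demonstration on a healthy file whose size is not a multiple of 512.
with tempfile.TemporaryDirectory() as d:
    src_path = os.path.join(d, "src.dat")
    dst_path = os.path.join(d, "dst.dat")
    payload = bytes(range(256)) * 5 + b"tail"
    with open(src_path, "wb") as f:
        f.write(payload)
    bad_blocks = rescue_copy(src_path, dst_path)
    with open(dst_path, "rb") as f:
        copied = f.read()
print(bad_blocks, copied == payload)
```

Because every block is addressed by its offset, a failed block never disturbs the position of the next read, and the output keeps the same length as the source.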


Sep 18 '05 #2

Sorry but what are SEEK_END and SEEK_SET?

Maurice

--
Maurice Han Tong LING, BSc(Hons)(Biochem), AdvDipComp, CPT, SSN, FIFA,
MASBMB, MAMBIS, MACM
Doctor of Philosophy (Science) Candidate, The University of Melbourne
mobile: +61 4 22781753, +65 96669233
mailing address: Department of Zoology, The University of Melbourne
Royal Parade, Parkville, Victoria 3010, Australia
residence: 9/41 Dover Street, Flemington, Victoria 3031, Australia
resume: http://maurice.vodien.com/maurice_resume.pdf
www: http://www.geocities.com/beldin79/
Sep 18 '05 #3

Maurice Ling <ma*********@acm.org> wrote in message
news:ma************************************@python .org...
Sorry but what are SEEK_END and SEEK_SET?


The Python 2.3 documentation seems to specify the *numeric*
values of these constants only. But since Python's file
objects are "implemented using C's stdio package", you
can read

http://www.opengroup.org/onlinepubs/...ons/lseek.html

Regards,
Christian Stapfer
Sep 18 '05 #4

On Sun, Sep 18, 2005 at 02:15:00PM +1000, Maurice Ling wrote:
Sorry but what are SEEK_END and SEEK_SET?


Oops, that's what I get for snipping a part of a larger program.

SEEK_SET = 0
SEEK_END = 2
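
For what it's worth, newer Python versions (2.5 and later) also expose these values as os.SEEK_SET and os.SEEK_END, so they need not be defined by hand. A small sketch of what the two constants mean when passed to os.lseek, using a throwaway temporary file (the 1000-byte size is just an illustration):

```python
import os
import tempfile

with tempfile.TemporaryFile() as tmp:
    tmp.write(b"a" * 1000)
    tmp.flush()
    fd = tmp.fileno()
    # SEEK_END: the offset is relative to the end of the file, so 0 gives its size.
    size = os.lseek(fd, 0, os.SEEK_END)
    # SEEK_SET: the offset is an absolute position from the start of the file.
    start = os.lseek(fd, 0, os.SEEK_SET)
    second_block = os.lseek(fd, 1 * 512, os.SEEK_SET)

print(os.SEEK_SET, os.SEEK_END, size, start, second_block)
```

os.lseek returns the resulting absolute position, which is why seeking to offset 0 from the end is a common way to get a file's size.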

Jeff


Sep 18 '05 #5
