Stripping specific bytes of a file without opening the entire file...

Greetings all,

I have a group of rather large files (by "group" I mean close to 2x10^7
files, each 12-15 MB), and I need information that is stored in just
the last 512 bytes of each file. Is there a way to extract this
information without loading the entire file into memory?
Right now I'm doing it with fopen() and fread(), and it takes several
hours to process all the files. I know exactly where the
information is; I'm just wondering whether I can open only the
trailing 512-1024 bytes, or even physically strip this data off the
file on disk before opening it. Thanks for the help. P.S. I'm
working on a Linux box running Fedora Core 2, using gcc v3.3.3.

Cheers,
Adam.

Jul 23 '05 #1
7 Replies



Try 'istream::seekg'
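
A minimal sketch of that suggestion, assuming a C++11 compiler (newer than the gcc 3.3.3 mentioned in the question). The function name read_tail and the 512-byte default are illustrative choices, not something given in the thread. It seeks to the end to learn the size, seeks back, and reads only the tail:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Read at most `n` bytes from the end of the file at `path`.
// Returns an empty vector if the file cannot be opened or is empty.
std::vector<char> read_tail(const std::string& path, std::size_t n = 512) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};

    // Seek to the end to learn the file size without reading any data.
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0) return {};

    // Clamp so that files shorter than n bytes are returned whole.
    const std::size_t want = std::min(static_cast<std::size_t>(size), n);
    in.seekg(size - static_cast<std::streamoff>(want), std::ios::beg);

    std::vector<char> tail(want);
    in.read(tail.data(), static_cast<std::streamsize>(want));
    tail.resize(static_cast<std::size_t>(in.gcount()));
    assert(tail.size() <= n);
    return tail;
}
```

Calling read_tail("some_file.dat") would return the last 512 bytes; the file name here is only a placeholder.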

Hope this helps,
-shez-

Jul 23 '05 #2

Your best bet is fseek() which allows you to position the file pointer
at a specified offset in the file. I don't think that you can avoid
opening the file in any event.
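
As a rough sketch of the fseek() approach (the function name read_tail_stdio is made up for illustration, and the negative SEEK_END offset is assumed to behave as it does on Linux): open the file, seek to n bytes before the end, and issue one fread(). Turning off stdio buffering with setvbuf() is an optional assumption here; with buffering on, stdio normally fills only one buffer-sized block rather than the whole file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Copy up to `n` trailing bytes of `path` into `buf` (which must hold n bytes).
// Returns the number of bytes read, or 0 if the file cannot be opened.
std::size_t read_tail_stdio(const char* path, char* buf, std::size_t n) {
    assert(buf != nullptr);
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return 0;

    // Optional: disable stdio buffering, since only one small read is issued.
    std::setvbuf(fp, nullptr, _IONBF, 0);

    // A negative offset from SEEK_END lands n bytes before the end of the file.
    if (std::fseek(fp, -static_cast<long>(n), SEEK_END) != 0) {
        // Seek failed, most likely because the file is shorter than n bytes:
        // fall back to reading from the start.
        std::rewind(fp);
    }
    const std::size_t got = std::fread(buf, 1, n, fp);
    std::fclose(fp);
    return got;
}
```

Only the requested bytes are copied into the caller's buffer; the rest of the file is never read by this function.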

Regards,

Jon Trauntvein

Jul 23 '05 #3

Yeah, I'm trying not to load the entire file into memory. I'm calling
fseek() on the stream returned by fopen(), but it still loads the entire
file into memory, which is a nuisance. Any other ideas?

Cheers,
Adam.

Jul 23 '05 #4


Try using platform-specific functions to prevent the file from being
loaded into memory. Just keep the portability problems in mind.
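
The post does not name any particular functions. One possibility on Linux, offered only as an assumption about what was meant, is the POSIX trio open(), fstat() and pread(), which bypasses stdio entirely. The function name read_tail_posix is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>      // open
#include <sys/stat.h>   // fstat
#include <sys/types.h>  // off_t, ssize_t
#include <unistd.h>     // pread, close

// Read up to `n` trailing bytes of `path` into `buf` with a single pread().
// Returns the number of bytes read, or -1 on error.
ssize_t read_tail_posix(const char* path, char* buf, std::size_t n) {
    assert(buf != nullptr);
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    // Start n bytes before the end, or at offset 0 for files shorter than n.
    const off_t len = static_cast<off_t>(n);
    const off_t start = st.st_size > len ? st.st_size - len : 0;

    // pread() reads at an explicit offset, so no separate seek call is needed.
    const ssize_t got = pread(fd, buf, n, start);
    close(fd);
    return got;
}
```

This is not portable beyond POSIX systems, which matches the portability warning above. A careful caller would also loop on short reads, which this sketch omits.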

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book
http://www.sgi.com/tech/stl -- Standard Template Library

Jul 23 '05 #5


Why would seek functions (either fseek() or istream::seekg()) load the
entire file into memory? What kind of filesystem are you using? I
would think that if the filesystem provides enough information about
the physical layout of files on disk (e.g., inodes in Unix), it
shouldn't need to load the entire file into memory.

Am I wrong?

-shez-

Jul 23 '05 #6

Which platform-specific functions are you thinking of? I'm running
Fedora Core 2 (Red Hat Linux) on an ext3 file system. Portability is
not a concern of mine; I just need it to run on ext3 file systems.

cheers,
adam.

Jul 23 '05 #7


This may not help you, but I just did some tests. On my system
(Windows XP), using fstream to open a file and seekg() to move around
does not cause the file to load into memory. I tested by opening a 30 MB
file, seeking around, then seeking to the beginning and loading the whole
thing into memory, and checking how long each step took and how much
disk activity it required. The only operation that took any
significant time was loading the file. I then ran a test where I
loaded just a small fraction of the file, and it took much less time
than loading the full file.
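
A comparable measurement could be sketched as follows. This is not the poster's actual test code; the names ReadTiming and time_tail_read are invented for illustration, and a C++11 compiler is assumed. It times open + seek + read for the last n bytes of a file, so passing the full file size as n times a whole-file read for comparison:

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

struct ReadTiming {
    std::size_t bytes;    // bytes actually read
    double milliseconds;  // wall-clock time for open + seek + read
};

// Time how long it takes to open `path` and read its last `n` bytes.
ReadTiming time_tail_read(const std::string& path, std::size_t n) {
    const auto t0 = std::chrono::steady_clock::now();
    std::size_t got = 0;

    std::ifstream in(path, std::ios::binary);
    if (in) {
        in.seekg(0, std::ios::end);
        const std::streamoff size = in.tellg();
        if (size > 0) {
            // Clamp the request to the file size.
            const std::size_t want = std::min(static_cast<std::size_t>(size), n);
            in.seekg(size - static_cast<std::streamoff>(want), std::ios::beg);
            std::vector<char> buf(want);
            in.read(buf.data(), static_cast<std::streamsize>(want));
            got = static_cast<std::size_t>(in.gcount());
        }
    }

    const std::chrono::duration<double, std::milli> dt =
        std::chrono::steady_clock::now() - t0;
    assert(got <= n);
    return {got, dt.count()};
}
```

Keep in mind that the operating system caches recently read file data, so a second run over the same file can be much faster than the first; timings are only comparable when that effect is accounted for.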

HTH

Jul 23 '05 #8
