Getting file size of binary file

Is using fseek and ftell a reliable method of getting the file size on a
binary file? I thought I remember reading somewhere it wasn't... If not what
would be the "right" and portable method to obtain it? Thanks.
Nov 14 '05 #1
"Arnold" <ar****@nothpole.com> wrote:
Is using fseek and ftell a reliable method of getting the file size on a
binary file?
No. From 7.19.9.2#3: "A binary stream need not meaningfully support
fseek calls with a whence value of SEEK_END".

To say that this irks me would be a bit of an understatement.
I thought I remember reading somewhere it wasn't... If not what
would be the "right" and portable method to obtain it?


There is none, in ISO C.

To say that _this_ irks me would be a bit of an understatement, as well.
It should at least be possible to get the value of "what the OS thinks
the file size is", but apparently there are reasons why it isn't; I've
never heard one that is convincing, though.

Richard
Nov 14 '05 #2
On Thu, 08 Jan 2004 08:46:35 +0000, Arnold wrote:
Is using fseek and ftell a reliable method of getting the file size on a
binary file? I thought I remember reading somewhere it wasn't... If not what
would be the "right" and portable method to obtain it? Thanks.


try fstat()
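For readers who want to see what that suggestion amounts to, here is a minimal sketch, assuming a POSIX system. `fstat()` and `fileno()` come from POSIX, not from ISO C, so this is not a portable answer to the original question. The helper name `posix_file_size` is made up for illustration.

```c
/* POSIX only: fstat() and fileno() are not part of ISO C. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: returns the size in bytes that the OS reports for
   the file behind fp, or -1 if fstat() fails or fp is not a regular file. */
long long posix_file_size(FILE *fp)
{
    struct stat st;

    if (fp == NULL || fstat(fileno(fp), &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))   /* pipes, terminals, etc. have no useful size */
        return -1;
    return (long long)st.st_size;
}
```

Data still sitting in the stdio buffer is not counted until it has been flushed, so call `fflush()` first on a stream you have been writing to.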
Nov 14 '05 #3
Richard Head <rh***@comcast.net> scribbled the following:
On Thu, 08 Jan 2004 08:46:35 +0000, Arnold wrote:
Is using fseek and ftell a reliable method of getting the file size on a
binary file? I thought I remember reading somewhere it wasn't... If not what
would be the "right" and portable method to obtain it? Thanks.
try fstat()


Which part of the ISO C standard defines fstat()?

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) -------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"My absolute aspect is probably..."
- Mato Valtonen
Nov 14 '05 #4
Richard Head wrote:
On Thu, 08 Jan 2004 08:46:35 +0000, Arnold wrote:
Is using fseek and ftell a reliable method of getting the file
size on a binary file? I thought I remember reading somewhere it
wasn't... If not what would be the "right" and portable method
to obtain it? Thanks.

try fstat()


No, don't. There is no fstat() in standard C. Please do not give
off-topic answers in this newsgroup, where there may be nobody to
make corrections.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #5
Richard Bos wrote:
"Arnold" <ar****@nothpole.com> wrote:

Is using fseek and ftell a reliable method of getting the file size on a
binary file?

No. From 7.19.9.2#3: "A binary stream need not meaningfully support
fseek calls with a whence value of SEEK_END".


From the FAQ for this group:

http://www.eskimo.com/~scs/C-faq/q19.12.html

---
How can I find out the size of a file, prior to reading it in?

If the ``size of a file'' is the number of characters you'll be able to
read from it in C, it is difficult or impossible to determine this
number exactly).

Under Unix, the stat call will give you an exact answer. Several other
systems supply a Unix-like stat which will give an approximate answer.
You can fseek to the end and then use ftell, but these tend to have the
same problems: fstat is not portable, and generally tells you the same
thing stat tells you; ftell is not guaranteed to return a byte count
except for binary files. Some systems provide routines called filesize
or filelength, but these are not portable, either.

Are you sure you have to determine the file's size in advance? Since the
most accurate way of determining the size of a file as a C program will
see it is to open the file and read it, perhaps you can rearrange the
code to learn the size as it reads.
---

Does this look strange to anyone else? There's that lone closing paren
in the first paragraph, but the part that really bothers me is "ftell is
not guaranteed to return a byte count except for binary files." It seems
to be suggesting that the fseek/ftell method would be OK for a binary
file, but the line from the standard that Richard quoted suggests the opposite.
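The FAQ's closing suggestion, learning the size while reading, is the one approach here that uses only ISO C. A minimal sketch, with a made-up helper name `count_bytes`, assuming the stream was opened in binary mode and that the total fits in a `long`:

```c
#include <stdio.h>

/* Hypothetical helper: reads fp from its current position to end-of-file
   and returns the number of bytes read, or -1 on a read error.
   Assumption: the total fits in a long. */
long count_bytes(FILE *fp)
{
    unsigned char buf[4096];
    long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long)n;

    return ferror(fp) ? -1L : total;
}
```

This gives the size "as a C program will see it", which is all the standard lets you rely on. The cost is a full pass over the file, so in practice you would fold the counting into the loop that processes the data.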

To say that this irks me would be a bit of an understatement.

I thought I remember reading somewhere it wasn't... If not what
would be the "right" and portable method to obtain it?

There is none, in ISO C.

To say that _this_ irks me would be a bit of an understatement, as well.
It should at least be possible to get the value of "what the OS thinks
the file size is", but apparently there are reasons why it isn't; I've
never heard one that is convincing, though.


I suppose that it's partly because C deals with streams, not files
directly (for the most part). Many things may not make sense for a
stream, size included. How could the size of stdin be meaningful, for
example? At the same time, there are at least a few standard functions
that only make sense for certain types of streams. Seems like it
wouldn't be such a bad idea to have a few more.

-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.
Nov 14 '05 #6
Richard Bos wrote:

(snip)
No. From 7.19.9.2#3: "A binary stream need not meaningfully support
fseek calls with a whence value of SEEK_END".

To say that this irks me would be a bit of an understatement.
(snip)
To say that _this_ irks me would be a bit of an understatement, as well.
It should at least be possible to get the value of "what the OS thinks
the file size is", but apparently there are reasons why it isn't; I've
never heard one that is convincing, though.


I was reading not so long ago what one of IBM's C compilers for
VM/CMS or MVS does for fseek/ftell. For files with variable length
records, text or binary, ftell returns the block number in the
upper 17 bits, and position in the block in the lower 15 bits.
(OS restrictions tend to keep blocks less than 32K.) I think
it wraps at 128K blocks.
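To make that layout concrete, here is a small sketch of a 17-bit block number packed above a 15-bit offset. It only illustrates the scheme as described above; the function names are made up and this is not IBM's actual code or API.

```c
/* Illustration only: a position value with the block number in the upper
   17 bits and the offset within the block in the lower 15 bits. */
#define OFFSET_BITS 15u
#define BLOCK_BITS  17u
#define OFFSET_MASK ((1ul << OFFSET_BITS) - 1ul)   /* 0x7FFF  */
#define BLOCK_MASK  ((1ul << BLOCK_BITS) - 1ul)    /* 0x1FFFF */

/* Pack a block number and an in-block offset into one value.
   Block numbers wrap at 2^17 = 128K blocks. */
unsigned long pack_position(unsigned long block, unsigned long offset)
{
    return ((block & BLOCK_MASK) << OFFSET_BITS) | (offset & OFFSET_MASK);
}

/* Extract the block number from a packed position. */
unsigned long position_block(unsigned long pos)
{
    return (pos >> OFFSET_BITS) & BLOCK_MASK;
}

/* Extract the offset within the block from a packed position. */
unsigned long position_offset(unsigned long pos)
{
    return pos & OFFSET_MASK;
}
```

Under this layout, block 3 at offset 100 packs to 98404. That equals the number of bytes before the position only if every earlier block is exactly 32768 bytes long, which is why such an ftell value works as a cookie for fseek but not as a byte count.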

MVS keeps track of files in tracks, which can't reliably be
converted to bytes. CMS maps variable length blocks onto
a fixed block file system, but also doesn't accurately
keep track of bytes of file data.

On traditional IBM mainframe OS's, tracks are formatted when
written. The block size is determined by the program, and can
be either fixed or variable length. As an added complication,
files with fixed length blocks will usually have a short block
at the end. If opened for append, this short block stays in
place, so even for fixed length blocks a block count can't
reliably indicate file size.
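A rough sketch of why a block count only brackets the size. It assumes, purely for illustration, that each block holds at least 1 and at most `blksize` bytes; the struct and function names are made up.

```c
/* Hypothetical illustration: bounds on the byte count of a file stored as
   nblocks blocks, each holding between 1 and blksize bytes. */
struct size_bounds {
    unsigned long min_bytes;
    unsigned long max_bytes;
};

struct size_bounds bounds_from_blocks(unsigned long nblocks,
                                      unsigned long blksize)
{
    struct size_bounds b;

    b.min_bytes = nblocks;            /* every block short: 1 byte each */
    b.max_bytes = nblocks * blksize;  /* every block completely full */
    return b;
}
```

With 10 blocks and a block size of 4096, the file could hold anywhere from 10 to 40960 bytes, and nothing in the block count alone says where in that range it falls.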

-- glen

Nov 14 '05 #7
Kevin Goodsell <us*********************@neverbox.com> wrote:
Richard Bos wrote:
It should at least be possible to get the value of "what the OS thinks
the file size is", but apparently there are reasons why it isn't; I've
never heard one that is convincing, though.


I suppose that it's partly because C deals with streams, not files
directly (for the most part). Many things may not make sense for a
stream, size included. How could the size of stdin be meaningful, for
example? At the same time, there are at least a few standard functions
that only make sense for certain types of streams. Seems like it
wouldn't be such a bad idea to have a few more.


Exactly; the function could always return -1 for "not available".
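A sketch of what such a function might look like if you build it yourself on top of fseek and ftell. The name `stream_size` is hypothetical and nothing like it exists in ISO C. It leans on `SEEK_END`, which, per the passage quoted from 7.19.9.2, a binary stream need not meaningfully support, so treat the result as "what this implementation reports", not as a guarantee.

```c
#include <stdio.h>

/* Hypothetical function, not part of ISO C: tries to report the size of
   the file behind a binary stream, and returns -1 for "not available".
   The stream position is restored before returning. */
long stream_size(FILE *fp)
{
    long start, end;

    if (fp == NULL)
        return -1L;
    start = ftell(fp);                    /* remember the current position */
    if (start < 0)
        return -1L;                       /* e.g. the stream is not seekable */
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    end = ftell(fp);
    if (fseek(fp, start, SEEK_SET) != 0)
        return -1L;                       /* could not restore the position */
    return end < 0 ? -1L : end;
}
```

On Unix-like systems this normally yields a byte count for regular files. On a system like the ones described elsewhere in this thread, the value may be only a position cookie, and the -1 path covers streams such as pipes where ftell fails outright.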

Richard
Nov 14 '05 #8
glen herrmannsfeldt <ga*@ugcs.caltech.edu> wrote:
Richard Bos wrote:
To say that _this_ irks me would be a bit of an understatement, as well.
It should at least be possible to get the value of "what the OS thinks
the file size is", but apparently there are reasons why it isn't; I've
never heard one that is convincing, though.


I was reading not so long ago what one of IBM's C compilers for
VM/CMS or MVS does for fseek/ftell. For files with variable length
records, text or binary, ftell returns the block number in the
upper 17 bits, and position in the block in the lower 15 bits.
(OS restrictions tend to keep blocks less than 32K.) I think
it wraps at 128K blocks.

MVS keeps track of files in tracks, which can't reliably be
converted to bytes. CMS maps variable length blocks onto
a fixed block file system, but also doesn't accurately
keep track of bytes of file data.

On traditional IBM mainframe OS's, tracks are formatted when
written. The block size is determined by the program, and can
be either fixed or variable length. As an added complication,
files with fixed length blocks will usually have a short block
at the end. If opened for append, this short block stays in
place, so even for fixed length blocks a block count can't
reliably indicate file size.


That doesn't convince me, either.

The OS has _some_ idea of how large the file is, if only to prevent the
user from writing past the end of it. It should be possible to pass this
knowledge on to the C implementation. If the result is approximate, that
is inherent in the OS, and the user will be expecting it.

Richard
Nov 14 '05 #9
Richard Bos wrote:
glen herrmannsfeldt <ga*@ugcs.caltech.edu> wrote:
(snip)
I was reading not so long ago what one of IBM's C compilers for
VM/CMS or MVS does for fseek/ftell. For files with variable length
records, text or binary, ftell returns the block number in the
upper 17 bits, and position in the block in the lower 15 bits.
(OS restrictions tend to keep blocks less than 32K.) I think
it wraps at 128K blocks.

MVS keeps track of files in tracks, which can't reliably be
converted to bytes.


(snip)
That doesn't convince me, either.

The OS has _some_ idea of how large the file is, if only to prevent the
user from writing past the end of it. It should be possible to pass this
knowledge on to the C implementation. If the result is approximate, that
is inherent in the OS, and the user will be expecting it.


The OS keeps track of how many tracks are allocated, but not how many
bytes are written to each one. The number of bytes you can fit on a
track with a BLKSIZE of 1 is about 1% of the maximum. There also
could be empty tracks allocated but not yet used, after the data.

There is no standard (or non-standard) way to say approximately how
much space a data set takes.

Assuming that every file system is like unix is not a good idea.

-- glen

Nov 14 '05 #10
