Standard C++ file size???

Is there any standard C++ way to determine the size of a
file before it is read?
Oct 5 '08
PeteOlcott wrote:
> On Oct 6, 5:21 am, James Kanze <james.ka...@gmail.com> wrote:
>> On Oct 5, 6:21 pm, Victor Bazarov <v.Abaza...@comAcast.net> wrote:
>>> Peter Olcott wrote:
>>>> Is there any standard C++ way to determine the size of a
>>>> file before it is read?
>>> No. The "standard C++ way" is to open the file for reading,
>>> seek to the end of the file and get the position.
>> That's a frequently used method, but it certainly isn't standard
>> C++. There's no guarantee that the position is convertible to
>> an integral type, and there's no guarantee that the integral
>> value means anything if it is.
>>
>> In practice, this will probably work under Unix, and with binary
>> (but not text) files under Windows. Elsewhere, who knows?
>
> Why would it not work for Text files under Windows?
> (I am only looking for the size that can be block read into memory)

There is a difference between the number of bytes in the file
(physically on the disk) and the number of bytes you get when you read
the file due to the translation happening for the sequence of CR-LF, and
I don't remember which way it goes, you either get more when you read or
when you store it on disk. If there are more characters in the disk
storage, then you should be OK since you're going to allocate more than
you will read, but if it's the other way around, you might be in for a
surprise...
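
A minimal sketch of how one could check which way the translation goes on a given system. The helper names countCharsRead and writeRaw are made up for this illustration. As far as I know, on Windows a text-mode read turns each CR-LF pair into a single '\n', so the stream delivers fewer characters than the file holds on disk; on Unix the two counts should simply match.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Count the characters a stream actually delivers when the file is read
// to the end. Pass std::ios::binary to suppress newline translation.
// (Hypothetical helper, not part of any library.)
std::size_t countCharsRead(const std::string& path,
                           std::ios::openmode mode = std::ios::in)
{
    std::ifstream f(path.c_str(), mode | std::ios::in);
    std::size_t n = 0;
    for (std::istreambuf_iterator<char> it(f), end; it != end; ++it) {
        ++n;
    }
    return n;  // 0 if the file could not be opened
}

// Write the given bytes untranslated, so the on-disk contents are known.
bool writeRaw(const std::string& path, const std::string& bytes)
{
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

Writing "a\r\nb\r\n" with writeRaw and then comparing countCharsRead(path) against countCharsRead(path, std::ios::binary) shows the difference, if there is one, on the platform at hand.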

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Oct 6 '08 #11
On Oct 6, 3:11 pm, Matthias Buelow <m...@incubus.de> wrote:
> James Kanze wrote:
>> Another reasonable meaning is
>> the number of bytes the file occupies on the disk, but I don't
>> know of any system which has a request for this. (Unix
>> certainly doesn't.)
> stat(), lstat(), fstat() will determine the number of blocks
> used.

So they do. (I didn't remember it from when I learned stat.
But that was some time ago.) They also return the block size,
so with a little bit of multiplication... (Of course, this
doesn't include the space actually taken up by the inode:-).
Or in the directory entry. As Gennaro pointed out, the
definition of size is a bit vague to begin with, and I'm sure
that with a little bit of effort, I can come up with one that no
system supports.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Oct 6 '08 #12
On Oct 6, 4:33 pm, PeteOlcott <PeteOlc...@gmail.com> wrote:
> On Oct 6, 5:21 am, James Kanze <james.ka...@gmail.com> wrote:
>> On Oct 5, 6:21 pm, Victor Bazarov <v.Abaza...@comAcast.net> wrote:
>>> Peter Olcott wrote:
>>>> Is there any standard C++ way to determine the size of a
>>>> file before it is read?
>>> No. The "standard C++ way" is to open the file for reading,
>>> seek to the end of the file and get the position.
>> That's a frequently used method, but it certainly isn't
>> standard C++. There's no guarantee that the position is
>> convertible to an integral type, and there's no guarantee
>> that the integral value means anything if it is.
>> In practice, this will probably work under Unix, and with
>> binary (but not text) files under Windows. Elsewhere, who
>> knows?
> Why would it not work for Text files under Windows? (I am
> only looking for the size that can be block read into memory)

Because it doesn't. Try it:

    #include <iostream>
    #include <fstream>
    #include <vector>

    void
    readAll( char const* filename )
    {
        // Opened in text mode on purpose: that is the case under test.
        std::ifstream f( filename ) ;
        if ( ! f ) {
            throw "cannot open" ;
        }
        f.seekg( 0, std::ios::end ) ;
        if ( ! f ) {
            throw "seek error" ;
        }
        long long size = f.tellg() ;
        if ( size < 0 ) {
            throw "tell error" ;
        }
        std::cout << filename << ": size = " << size << std::endl ;
        if ( size != 0 ) {
            f.clear() ;
            f.seekg( 0, std::ios::beg ) ;
            if ( ! f ) {
                throw "rewind failed" ;
            }
            std::vector< char > v( size ) ;
            // Fails if fewer than size characters can be read.
            f.read( &v[ 0 ], size ) ;
            if ( ! f ) {
                throw "read failed" ;
            }
        }
    }

    int
    main( int argc, char** argv )
    {
        for ( int i = 1 ; i != argc ; ++ i ) {
            try {
                readAll( argv[ i ] ) ;
            } catch ( char const* error ) {
                std::cout << argv[ i ] << ": " << error << std::endl ;
            }
        }
        return 0 ;
    }

Compile and try it on some text files. On a variant with some
extra comments, reading the source itself, I get:
readall.cc: size = 1677
under Solaris (g++ or Sun CC), but
readall.cc: size = 1733
readall.cc: read failed
under Windows (compiled with VC++).

If I open the file in binary mode, or use system level requests,
of course, I can make it work.
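
As a sketch of the binary-mode variant mentioned above (the function name readAllBinary is invented for this example): with std::ios::binary there is no newline translation, so on Unix and Windows the position at the end of the stream normally equals the number of characters read() can deliver. That is common practice rather than something the C++ standard promises, so the code still compares gcount() against the expected size.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Read a whole file opened in binary mode, sizing the buffer via seek/tell.
// Relies on the end position being a usable byte count, which is
// implementation-defined behaviour, not a standard guarantee.
std::vector<char> readAllBinary(const std::string& path)
{
    std::ifstream f(path.c_str(), std::ios::in | std::ios::binary);
    if (!f) {
        throw std::runtime_error("cannot open " + path);
    }
    f.seekg(0, std::ios::end);
    const std::streamoff size = f.tellg();
    if (!f || size < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }
    f.seekg(0, std::ios::beg);
    std::vector<char> data(static_cast<std::size_t>(size));
    if (size != 0) {
        f.read(&data[0], static_cast<std::streamsize>(size));
        // A short read means the size estimate was wrong or the file changed.
        if (f.gcount() != size) {
            throw std::runtime_error("short read on " + path);
        }
    }
    return data;
}
```

Run on the same source file, this version should report the on-disk byte count and read it back without the "read failed" seen in text mode under Windows.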

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Oct 6 '08 #13
James Kanze wrote:
>>>>> Is there any standard C++ way to determine the size of a
>>>>> file before it is read?
>>>> No. The "standard C++ way" is to open the file for reading,
>>>> seek to the end of the file and get the position.
>>> That's a frequently used method, but it certainly isn't
>>> standard C++. There's no guarantee that the position is
>>> convertible to an integral type, and there's no guarantee
>>> that the integral value means anything if it is.
>>> In practice, this will probably work under Unix, and with
>>> binary (but not text) files under Windows. Elsewhere, who
>>> knows?
>> Why would it not work for Text files under Windows? (I am
>> only looking for the size that can be block read into memory)
>
> Because it doesn't. Try it:
>
>     #include <iostream>
>     #include <fstream>
>     #include <vector>
>
>     void
>     readAll( char const* filename )
>     {
>         std::ifstream f( filename ) ;
>         if ( ! f ) {
>             throw "cannot open" ;
>         }
>         f.seekg( 0, std::ios::end ) ;
>         if ( ! f ) {
>             throw "seek error" ;
>         }
>         long long size = f.tellg() ;

I think Victor meant that everything stopped here. Yes, the size so
obtained will happily count some garbage as well, and it's not likely
that read() will work with it, but at least that's the number you should
see in the Windows Explorer. In many cases that's all that is needed to
avoid a lot of user complaints :-)

PS: of course, too, the match with Explorer properties and everything I
say above is all a big dance of "likely", "perhaps" and "should be";
nothing, as you mentioned, is really guaranteed.

--
Gennaro Prota | name.surname yahoo.com
Breeze C++ (preview): <https://sourceforge.net/projects/breeze/>
Do you need expertise in C++? I'm available.
Oct 6 '08 #14

"Victor Bazarov" <v.********@com Acast.netwrote in message
news:gc**********@news.datemas.de...
> PeteOlcott wrote:
>> On Oct 6, 5:21 am, James Kanze <james.ka...@gmail.com> wrote:
>>> On Oct 5, 6:21 pm, Victor Bazarov <v.Abaza...@comAcast.net> wrote:
>>>> Peter Olcott wrote:
>>>>> Is there any standard C++ way to determine the size of a
>>>>> file before it is read?
>>>> No. The "standard C++ way" is to open the file for reading,
>>>> seek to the end of the file and get the position.
>>> That's a frequently used method, but it certainly isn't standard
>>> C++. There's no guarantee that the position is convertible to
>>> an integral type, and there's no guarantee that the integral
>>> value means anything if it is.
>>>
>>> In practice, this will probably work under Unix, and with binary
>>> (but not text) files under Windows. Elsewhere, who knows?
>>
>> Why would it not work for Text files under Windows?
>> (I am only looking for the size that can be block read into memory)
>
> There is a difference between the number of bytes in the
> file (physically on the disk) and the number of bytes you
> get when you read the file due to the translation
> happening for the sequence of CR-LF,

I am talking about reading a Text file in binary mode so
there is no translation. I am making a computer language
compiler so my lexical analyzer will treat the text as
binary data.

> and I don't remember which way it goes, you either get
> more when you read or when you store it on disk. If there
> are more characters in the disk storage, then you should
> be OK since you're going to allocate more than you will
> read, but if it's the other way around, you might be in
> for a surprise...
>
> V
> --
> Please remove capital 'A's when replying by e-mail
> I do not respond to top-posted replies, please don't ask

Oct 7 '08 #15
On Oct 7, 4:09 am, "Peter Olcott" <NoS...@SeeScreen.com> wrote:
> "Victor Bazarov" <v.Abaza...@comAcast.net> wrote in message
    [...]
> I am talking about reading a Text file in binary mode so
> there is no translation.

You can't read a text file in binary mode. If you open a file
in binary mode, it is a binary file; if you open it in text
mode, it is a text file.

Outside of C/C++, some operating systems don't make a
distinction (Unix and Windows, for example); a file is a text
file or a binary file only in virtue of how you open it (and
only in C or C++). In other systems (probably most), if the
file was written as text, you can't open it as binary, and vice
versa.

> I am making a computer language compiler so my lexical
> analyzer will treat the text as binary data.

Hmmm. The most logical thing would be for a compiler to open
the files as text. (On some systems, the editors save the files
as text files, and you can't open them in binary.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Oct 7 '08 #16
On Oct 6, 9:12 pm, Gennaro Prota <gennaro/pr...@yahoo.com> wrote:
> James Kanze wrote:
>>>>>> Is there any standard C++ way to determine the size of a
>>>>>> file before it is read?
>>>>> No. The "standard C++ way" is to open the file for reading,
>>>>> seek to the end of the file and get the position.
>>>> That's a frequently used method, but it certainly isn't
>>>> standard C++. There's no guarantee that the position is
>>>> convertible to an integral type, and there's no guarantee
>>>> that the integral value means anything if it is.
>>>> In practice, this will probably work under Unix, and with
>>>> binary (but not text) files under Windows. Elsewhere, who
>>>> knows?
>>> Why would it not work for Text files under Windows? (I am
>>> only looking for the size that can be block read into memory)
>> Because it doesn't. Try it:
>>     #include <iostream>
>>     #include <fstream>
>>     #include <vector>
>>     void
>>     readAll( char const* filename )
>>     {
>>         std::ifstream f( filename ) ;
>>         if ( ! f ) {
>>             throw "cannot open" ;
>>         }
>>         f.seekg( 0, std::ios::end ) ;
>>         if ( ! f ) {
>>             throw "seek error" ;
>>         }
>>         long long size = f.tellg() ;
> I think Victor meant that everything stopped here. Yes, the
> size so obtained will happily count some garbage as well, and
> it's not likely that read() will work with it, but at least
> that's the number you should see in the Windows Explorer.

Which means?

> In many cases that's all that is needed to avoid a lot of user
> complaints :-)

If the requirements specification says to display the value
shown by Windows Explorer, fine. If the goal is to allocate a
buffer so you can read it in one go, it doesn't work. If the
goal is to know exactly how much space the file takes on the
disk, it doesn't work. As you said yourself, size is a rather
vague concept when it comes to files. Unless you're determining
the size so you can display it, in a way that is compatible with
Windows Explorer, then I don't see this working.

More importantly, it's very implementation defined; on some
implementations, it might not even compile. As long as you're
being implementation defined, you might as well use the platform
specific functions and be done with it. Not that they'll
necessarily give you anything more useful, but they'll almost
certainly give you a useless answer a lot faster, and they'll
probably define more or less what their answer really
corresponds to.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Oct 7 '08 #17
Peter Olcott wrote:
> I am talking about reading a Text file in binary mode so
> there is no translation. I am making a computer language
> compiler so my lexical analyzer will treat the text as
> binary data.

Just out of curiosity, why do you need to know the file size for that?
Does your language need a lookahead of more than 1 character?
Oct 7 '08 #18
James Kanze wrote:
    [seeking to the end and getting the position]
>>>> Why would it not work for Text files under Windows? (I am
>>>> only looking for the size that can be block read into memory)
>>> Because it doesn't. Try it:
>>>     #include <iostream>
>>>     #include <fstream>
>>>     #include <vector>
>>>     void
>>>     readAll( char const* filename )
>>>     {
>>>         std::ifstream f( filename ) ;
>>>         if ( ! f ) {
>>>             throw "cannot open" ;
>>>         }
>>>         f.seekg( 0, std::ios::end ) ;
>>>         if ( ! f ) {
>>>             throw "seek error" ;
>>>         }
>>>         long long size = f.tellg() ;
>> I think Victor meant that everything stopped here. Yes, the
>> size so obtained will happily count some garbage as well, and
>> it's not likely that read() will work with it, but at least
>> that's the number you should see in the Windows Explorer.
>
> Which means?

I don't know. It might even vary from one Windows incarnation to
another. The attempt didn't get through, but...

>> In many cases that's all that is needed to avoid a lot of user
>> complaints :-)

...this was really meant to be humorous.

> If the requirements specification says to display the value
> shown by Windows Explorer, fine. If the goal is to allocate a
> buffer so you can read it in one go, it doesn't work. If the
> goal is to know exactly how much space the file takes on the
> disk, it doesn't work. As you said yourself, size is a rather
> vague concept when it comes to files. Unless you're determining
> the size so you can display it, in a way that is compatible with
> Windows Explorer, then I don't see this working.

Sure, I completely agree. "File" itself is probably hard to define in
general, too (for one thing: do you see it from the perspective of the
filesystem or from the perspective of the user and its contents? And
what interpretation of the contents?).

Considering the original question, what about this summary:

Q.: Is there any standard C++ way to determine the size of a file
without "reading" it?

A.: Not a strictly conforming one: the concepts of "file size" and
"file read" themselves, in fact, have no universal meaning; you'll
have to resort to an implementation-defined mechanism, if any,
such as stat() on POSIX platforms, GetFileSize()/GetFileSizeEx()
on Win32. The system documentation, or conformity to further
standards such as POSIX, may/should clarify what meaning of "size"
and/or "read" each of those mechanisms correspond to. This may not
apply to all of the supported file types.

Maybe this should be a FAQ (or two :-).

--
Gennaro Prota | name.surname yahoo.com
Breeze C++ (preview): <https://sourceforge.net/projects/breeze/>
Do you need expertise in C++? I'm available.
Oct 10 '08 #19
On Oct 10, 3:46 pm, Gennaro Prota <gennaro/pr...@yahoo.com> wrote:
> James Kanze wrote:
> <snip>
> Considering the original question, what about this summary:
>
> Q.: Is there any standard C++ way to determine the size of a file
>     without "reading" it?
>
> A.: Not a strictly conforming one: the concepts of "file size" and
>     "file read" themselves, in fact, have no universal meaning; you'll
>     have to resort to an implementation-defined mechanism, if any,
>     such as stat() on POSIX platforms, GetFileSize()/GetFileSizeEx()
>     on Win32. The system documentation, or conformity to further
>     standards such as POSIX, may/should clarify what meaning of "size"
>     and/or "read" each of those mechanisms correspond to. This may not
>     apply to all of the supported file types.
>
> Maybe this should be a FAQ (or two :-).

BTW, I do not think that anybody mentioned in this thread that in most
systems (i.e. any system that allows concurrent access to files), the
data returned by any kind of get file size API might be stale the
instant after it has been returned.

You can't really rely on it, except to treat it as some kind of hint.
Unless of course the OS gives you some way to acquire exclusive access
to that file before the get file size request.

--
Giovanni P. Deretta
Oct 11 '08 #20
