
iostream and files larger than 4GB

Hi,

I am relatively new to C++ regarding its functions and libraries. I need
to access files larger than 4GB, which is AFAIK not possible with the STL
iostream - at least not when using a 32-bit compiler. iostream was my
favorite as my code has to work on files as well as memory buffers...

Could somebody please tell me which functions/classes are best suited in
this case?

BTW: I am currently using Visual C++ 2008 on Win32, but if possible I
want to write my code "as portable as possible".

Robert
Jul 21 '08 #1


Robert Kochem wrote:
> I am relatively new to C++ regarding its functions and libraries. I
> need to access files larger than 4GB, which is AFAIK not possible with
> the STL iostream - at least not when using a 32-bit compiler. iostream
> was my favorite as my code has to work on files as well as memory
> buffers...

Have you actually tried and failed, or is that only your speculation?

> Could somebody please tell me which functions/classes are best suited
> in this case?
>
> BTW: I am currently using Visual C++ 2008 on Win32, but if possible I
> want to write my code "as portable as possible".

AFAIK, even standard C library functions like fread and fseek should
work with large files. And since C++ I/O streams are relatively thin
wrappers around C streams, those are expected to work just as well.
Write a program and see if you get it to work; if not, post your code
and explain the situation.
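For instance, a quick sketch using the MSVC-specific 64-bit stdio
functions (assuming your CRT provides _fseeki64/_ftelli64, which Visual
C++ 2008 does; on POSIX the equivalents are fseeko/ftello):

    #include <cstdio>
    #include <iostream>

    int main()
    {
        std::FILE* f = std::fopen("huge.bin", "rb"); // name is illustrative
        if (!f) return 1;

        // Seek past the 4GB boundary; plain fseek takes a long offset
        // and cannot represent this position in a 32-bit build.
        __int64 pos = 5000000000LL;                  // roughly 4.66 GB
        if (_fseeki64(f, pos, SEEK_SET) != 0) return 1;

        char buf[64];
        std::size_t n = std::fread(buf, 1, sizeof buf, f);
        std::cout << "read " << n << " bytes, now at "
                  << _ftelli64(f) << '\n';
        std::fclose(f);
    }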

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Jul 21 '08 #2

Victor Bazarov wrote:
> Robert Kochem wrote:
>> I am relatively new to C++ regarding its functions and libraries. I
>> need to access files larger than 4GB, which is AFAIK not possible
>> with the STL iostream - at least not when using a 32-bit compiler.
>> iostream was my favorite as my code has to work on files as well as
>> memory buffers...
>
> Have you actually tried and failed, or is that only your speculation?

If you get a "possible loss of data" warning when feeding seekg() a
64-bit integer - what would you expect?

> AFAIK, even standard C library functions like fread and fseek should
> work with large files. And since C++ I/O streams are relatively thin
> wrappers around C streams, those are expected to work just as well.
> Write a program and see if you get it to work; if not, post your code
> and explain the situation.

It may work for files, but can I use it on memory streams?

Robert
Jul 21 '08 #3

Robert Kochem wrote:
> Victor Bazarov wrote:
>> Robert Kochem wrote:
>>> I am relatively new to C++ regarding its functions and libraries. I
>>> need to access files larger than 4GB, which is AFAIK not possible
>>> with the STL iostream - at least not when using a 32-bit compiler.
>>> iostream was my favorite as my code has to work on files as well as
>>> memory buffers...
>>
>> Have you actually tried and failed, or is that only your speculation?
>
> If you get a "possible loss of data" warning when feeding seekg() a
> 64-bit integer - what would you expect?

I expect not to use seekg then. Or switch to a better implementation of
the library.

>> AFAIK, even standard C library functions like fread and fseek should
>> work with large files. And since C++ I/O streams are relatively thin
>> wrappers around C streams, those are expected to work just as well.
>> Write a program and see if you get it to work; if not, post your code
>> and explain the situation.
>
> It may work for files, but can I use it on memory streams?

I don't know what those are, sorry.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Jul 21 '08 #4

Hi,

Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
about that) per file. If the OS can handle both - like, for instance, a
Sun V440 or 64-bit Ubuntu or lots of others - you can compile with -m64
with gcc (and link to 64-bit versions of all libraries) and usually have
the full 64-bit range. Sometimes you have to add something like
LARGE_FILE_SUPPORT, from the top of my memory.

I haven't tried it, but maybe the same applies to 64-bit MS-Windows.
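For the record, on glibc-based systems the usual macro is
_FILE_OFFSET_BITS=64, often together with _LARGEFILE_SOURCE. A minimal
sketch of checking what a build actually gives you, assuming a POSIX
toolchain:

    // Build (assumed glibc/POSIX toolchain):
    //   g++ -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE check_off_t.cpp
    #include <sys/types.h>
    #include <iostream>

    int main()
    {
        // With _FILE_OFFSET_BITS=64, off_t is 8 bytes even in a 32-bit
        // build, and fseeko/ftello take the full 64-bit offset.
        std::cout << "sizeof(off_t) = " << sizeof(off_t) << '\n';
    }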
Regards, Ron AF Greve

http://www.InformationSuperHighway.eu

"Robert Kochem" <ro****@mailueberfall.dewrote in message
news:1b******************************@40tude.net.. .
Hi,

I am relative new to C++ regarding it's functions and libraries. I need to
access files larger than 4GB which is AFAIK not possible with the STL
iostream - at least not if using a 32 bit compiler. iostream was my
favorite as my code has to work on files as well as memory buffers...

Could somebody please help me what functions/classes are the best in this
case?

BTW: I am currently using Visual C++ 2008 on Win32, but if possible I want
to write my code as "portable as possible".

Robert

Jul 21 '08 #5

Victor Bazarov wrote:
>> If you get a "possible loss of data" warning when feeding seekg() a
>> 64-bit integer - what would you expect?
>
> I expect not to use seekg then. Or switch to a better implementation
> of the library.

That is easy to say - but what else should I use?

>> It may work for files, but can I use it on memory streams?
>
> I don't know what those are, sorry.

Maybe that was not the correct name in the C++ realm: I need an
abstraction of the underlying data source. My code has to work on files
as well as on memory buffers, and I call a stream that uses a memory
buffer as its source a memory stream.
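Something like this is what I mean - one function serving both sources
(a minimal sketch; the names are illustrative):

    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>

    // Works on any input stream - file, memory buffer, ...
    std::string read_word(std::istream& in)
    {
        std::string word;
        in >> word;
        return word;
    }

    int main()
    {
        std::istringstream mem("hello from a memory buffer");
        std::cout << read_word(mem) << '\n';      // reads from memory

        std::ifstream file("data.txt");           // illustrative file name
        if (file)
            std::cout << read_word(file) << '\n'; // reads from a file
    }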

Robert
Jul 21 '08 #6

Ron AF Greve wrote:
> Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
> about that) per file.

Sorry, but I can't believe that. Do you really mean that e.g. a 32-bit
Linux filesystem cannot handle files larger than 4GB?

Robert
Jul 21 '08 #7

tni
Visual C++'s iostream doesn't work with large files, not even on x64.
See the following discussion (still true for VS 2005, at least):
http://archive.netbsd.se/?ml=opensce...5-08&m=1634957
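A quick way to see what a given standard library supports (a minimal
sketch; if std::streamoff is only 4 bytes, seekg/tellg cannot address
offsets beyond its range, regardless of what the underlying OS allows):

    #include <ios>
    #include <iostream>

    int main()
    {
        std::cout << "sizeof(std::streamoff) = "
                  << sizeof(std::streamoff) << '\n'
                  << "sizeof(std::streampos) = "
                  << sizeof(std::streampos) << '\n';
    }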
Jul 21 '08 #8

Hi,

In the past there certainly was a time when it couldn't. Currently I
don't have a pure 32-bit Linux installation, although I could test on a
64-bit one compiling for 32 bits (maybe tomorrow). And I am sure a lot
of OSes indeed don't. Look up your flavor and search for "large file
support".

Regards, Ron AF Greve

http://www.InformationSuperHighway.eu

"Robert Kochem" <ro****@mailueberfall.dewrote in message
news:13****************************@40tude.net...
Ron AF Greve schrieb:
>Most 32 bit OS'es have a limitation of 2GB (maybe 4GB I am not sure about
that) per file.

Sorry, but I can't believe that. Do you really mean that e.g. a 32bit
Linux
filesystem can not handle files larger than 4GB?

Robert

Jul 21 '08 #9

Robert Kochem wrote:
> Ron AF Greve wrote:
>> Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
>> about that) per file.
>
> Sorry, but I can't believe that. Do you really mean that e.g. a 32-bit
> Linux filesystem cannot handle files larger than 4GB?

I don't think so either. The 64-bit file API is in no way related to the
64-bit extension of the CPU; even 8-bit CPUs could deal with 64-bit
numbers.

It is a compile-time feature of the runtime library. At the operating
system level there are either two sets of API functions, with and
without large file support, or optional 64-bit extension parameters to
the 32-bit API functions (like Win32). Unfortunately, the C++ runtimes
are not the first ones to support this.

For tasks like that I do not recommend using the iostream libraries at
all. Usually they are not tuned for maximum performance; sometimes the
implementations are more like case studies. And writing files that large
/is/ a question of performance. You might want to control the caching of
the content, or do the I/O asynchronously.
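A minimal sketch of going straight to the OS API on Win32 (CreateFile,
SetFilePointerEx and ReadFile take 64-bit offsets even in 32-bit builds;
the file name here is illustrative):

    #include <windows.h>
    #include <iostream>

    int main()
    {
        HANDLE h = CreateFileA("huge.bin", GENERIC_READ, FILE_SHARE_READ,
                               NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL,
                               NULL);
        if (h == INVALID_HANDLE_VALUE) return 1;

        LARGE_INTEGER pos;
        pos.QuadPart = 5000000000LL;             // an offset beyond 4GB
        if (!SetFilePointerEx(h, pos, NULL, FILE_BEGIN)) return 1;

        char buf[64];
        DWORD n = 0;
        if (ReadFile(h, buf, sizeof buf, &n, NULL))
            std::cout << "read " << n << " bytes\n";
        CloseHandle(h);
    }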
Marcel
Jul 21 '08 #10

Robert Kochem wrote:
> Victor Bazarov wrote:
>>> If you get a "possible loss of data" warning when feeding seekg() a
>>> 64-bit integer - what would you expect?
>>
>> I expect not to use seekg then. Or switch to a better implementation
>> of the library.
>
> That is easy to say - but what else should I use?

A different implementation. Check out 'STLport', Dinkumware...

>>> It may work for files, but can I use it on memory streams?
>>
>> I don't know what those are, sorry.
>
> Maybe that was not the correct name in the C++ realm: I need an
> abstraction of the underlying data source. My code has to work on
> files as well as on memory buffers, and I call a stream that uses a
> memory buffer as its source a memory stream.

So, are you going to write some kind of wrapper for C++ streams and
your own buffers, or are you going to provide your own streambuf
implementation that you plug into C++ streams?
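To illustrate the second option, a minimal read-only streambuf over an
existing memory buffer (just a sketch; production code would also
override seekoff/seekpos so that seeking works):

    #include <streambuf>
    #include <istream>
    #include <iostream>
    #include <string>

    // Presents an existing char buffer as a stream source.
    class membuf : public std::streambuf {
    public:
        membuf(char* data, std::size_t size)
        {
            setg(data, data, data + size); // begin, current, end of get area
        }
    };

    int main()
    {
        char data[] = "42 apples";
        membuf buf(data, sizeof data - 1);
        std::istream in(&buf);             // a plain istream over memory

        int n;
        std::string fruit;
        in >> n >> fruit;
        std::cout << n << ' ' << fruit << '\n';
    }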

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Jul 21 '08 #11

On Jul 22, 7:14 am, "Ron AF Greve" <ron@localhost> wrote:

Please stop top-posting.

> Hi,
>
> Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
> about that) per file.

That's nonsense.

> If the OS can handle both, like for instance a Sun V440 or

A V440 is a machine, not an OS.

> 64-bit Ubuntu or lots of others, you can compile with -m64 with gcc
> (and link to 64-bit versions of all libraries) and usually have the
> full 64-bit range.

Also nonsense.

Ian.

Jul 22 '08 #12

On Jul 21, 8:39 pm, Victor Bazarov <v.Abaza...@comAcast.net> wrote:
> Robert Kochem wrote:
>> I am relatively new to C++ regarding its functions and
>> libraries. I need to access files larger than 4GB, which is
>> AFAIK not possible with the STL iostream - at least not when
>> using a 32-bit compiler. iostream was my favorite as my code
>> has to work on files as well as memory buffers...
>
> Have you actually tried and failed, or is that only your
> speculation?

It's really implementation-defined. I know that some
implementations do have this restriction.

>> Could somebody please tell me which functions/classes are
>> best suited in this case?
>>
>> BTW: I am currently using Visual C++ 2008 on Win32, but if
>> possible I want to write my code "as portable as possible".
>
> AFAIK, even standard C library functions like fread and fseek
> should work with large files.

According to what or whom? The standards (both C and C++) are
really very, very vague about this (intentionally). I think
about all you can portably count on is that you can read
anything you can write. If the library doesn't allow writing
files with more than some upper limit of characters, then
there's no reason to assume that it can read them.

From a quality of implementation point of view, of course, one
would expect that the library not introduce additional
restrictions not present in the OS. But backwards compatibility
issues sometimes pose problems: changing the size of off_t on a
Posix implementation breaks binary compatibility, for example.
So libc.so (the dynamic object which contains the system API and
the basic C library under Solaris) must stick with 32 bit file
offsets, or existing binaries will cease to work. And if
libc.so uses a 32 bit file offset, then any new code which links
against it must, too. So by default, fopen uses a 32 bit file
offset, and only allows access to the first 4 GB of a file, at
least in programs compiled in 32 bit mode. I don't know how
Windows handles this, but I'd be surprised if they didn't
encounter the same problems, at least to some degree.
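For completeness, the usual escape hatch on such systems is the
transitional large-file API (a sketch assuming glibc/Solaris
conventions, where _LARGEFILE64_SOURCE enables fopen64, fseeko64,
ftello64 and off64_t; the file name is illustrative):

    #define _LARGEFILE64_SOURCE   // must precede the system headers
    #include <sys/types.h>
    #include <cstdio>
    #include <iostream>

    int main()
    {
        std::FILE* f = fopen64("huge.bin", "rb");
        if (!f) return 1;

        off64_t pos = 5000000000LL;           // beyond the 32-bit limit
        if (fseeko64(f, pos, SEEK_SET) != 0) return 1;

        std::cout << "now at "
                  << static_cast<long long>(ftello64(f)) << '\n';
        std::fclose(f);
    }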

The obvious solution would be to have three models, instead of
two: a pure 32 bit mode for legacy code, a 32 bit mode with 64
bit file offsets for new 32 bit code, and a 64 bit mode. On the
other hand, even coping with two different models on the same
machine can be confusing enough.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jul 22 '08 #13

On Jul 21, 10:07 pm, Marcel Müller <news.5.ma...@spamgourmet.org>
wrote:
> Robert Kochem wrote:
>> Ron AF Greve wrote:
>>> Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am
>>> not sure about that) per file.
>>
>> Sorry, but I can't believe that. Do you really mean that
>> e.g. a 32-bit Linux filesystem cannot handle files larger
>> than 4GB?
>
> I don't think so either. The 64-bit file API is in no way
> related to the 64-bit extension of the CPU; even 8-bit CPUs
> could deal with 64-bit numbers.
Posix requires off_t to be a typedef for a signed integral type.
It also requires that the file size, in bytes, fit in an
off_t. In the days before long long, the largest signed
integral type was long, normally 32 bits on a 32-bit machine,
which meant that file sizes were limited to 2GB. (Of course,
back then, a file of more than 2GB wouldn't fit on most disks.)

The integration of large file support has been extremely
complex, since breaking existing binaries (which dynamically
link to the system API) was not considered an acceptable option.
The result is that by default, both 32-bit Solaris and 32-bit
Linux do not support files greater than 2GB. (I think that both
have means to do so; it's highly unlikely, however, that the
C++, or even the C, standard library uses these.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jul 22 '08 #14

tni
James Kanze wrote:
> The result is that by default, both 32-bit Solaris and 32-bit
> Linux do not support files greater than 2GB. (I think that both
> have means to do so; it's highly unlikely, however, that the
> C++, or even the C, standard library uses these.)
It is part of the Unix 98 spec, so it is supported by libc/glibc. E.g.,
look at:
http://ac-archive.sourceforge.net/largefile/libc.html
Jul 22 '08 #15

James Kanze wrote:
> But backwards compatibility issues sometimes pose problems:
> changing the size of off_t on a Posix implementation breaks
> binary compatibility, for example. So libc.so (the dynamic
> object which contains the system API and the basic C library
> under Solaris) must stick with 32-bit file offsets, or existing
> binaries will cease to work. And if libc.so uses a 32-bit file
> offset, then any new code which links against it must, too.

I don't know how Solaris implements this in particular, but this could
be solved by providing legacy compatibility libs for older binaries (I
think that, for example, FreeBSD does it that way; it has had a 64-bit
off_t since at least 1996, IIRC).
Jul 22 '08 #16

James Kanze wrote:
> The result is that by default, both 32-bit Solaris and 32-bit
> Linux do not support files greater than 2GB.

What do you mean by that? At least Linux (i386) has had support for
files >2GB for many years now, "out of the box" (that is, by default).
Jul 22 '08 #17

Hi Robert,

I did a test on Sun Solaris using gcc, and indeed even in 32-bit mode it
can handle large files. So apparently the limitation has become a thing
of the past.

However, if you don't believe that it was ever otherwise, here is some
nice reading material:

http://www.unix.org/version2/whatsnew/lfs20mar.html#1.1
Regards, Ron AF Greve

http://www.InformationSuperHighway.eu

"Robert Kochem" <ro****@mailueberfall.dewrote in message
news:13****************************@40tude.net...
Ron AF Greve schrieb:
>Most 32 bit OS'es have a limitation of 2GB (maybe 4GB I am not sure about
that) per file.

Sorry, but I can't believe that. Do you really mean that e.g. a 32bit
Linux
filesystem can not handle files larger than 4GB?

Robert

Jul 22 '08 #18

On Jul 21, 3:14 pm, "Ron AF Greve" <ron@localhost> wrote:
> Hi,
>
> Most 32-bit OSes have a limitation of 2GB (maybe 4GB, I am not sure
> about that) per file.

That's not true. See
http://en.wikipedia.org/wiki/Comparison_of_file_systems
Most file systems in use today will have no problem with 4GB files.
Jul 22 '08 #19

On Jul 22, 3:16 pm, Matthias Buelow <m...@incubus.de> wrote:
> James Kanze wrote:
>> The result is that by default, both 32-bit Solaris and 32-bit
>> Linux do not support files greater than 2GB.
>
> What do you mean by that? At least Linux (i386) has had
> support for files >2GB for many years now, "out of the box"
> (that is, by default).

The OS, yes, but at least on the 32-bit implementations I have
access to, off_t is an int32_t, which means (indirectly) that
the standard FILE* and fstream will have problems with them.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jul 23 '08 #20
