
Standard compliant seek with long long?

David Mathog
#1: Nov 15 '05
I recently ran into a problem where a data file downloaded
from another site contained more than 4 GB of data, and so
the index file to items within that data went from unsigned
4-byte integers to unsigned 8-byte integers. Naturally this
broke my code, which uses fseek() and can only offset by a
long, which on the target OS is a 4-byte integer.
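
To make the failure mode concrete, here is a minimal sketch of a guard that refuses an 8-byte index value that does not fit in a long instead of letting it be truncated silently. The helper name seek_checked is hypothetical and not part of the original code; it only detects the problem, it does not get past the limit.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: seek to an absolute offset taken from a 64-bit index.
 * Returns 0 on success, -1 if the offset cannot be represented in a long
 * (the only offset type fseek() accepts) or if fseek() itself fails. */
int seek_checked(FILE *fp, uint64_t offset)
{
    assert(fp != NULL);
    if (offset > (uint64_t)LONG_MAX)
        return -1;   /* would overflow fseek()'s long parameter */
    return fseek(fp, (long)offset, SEEK_SET) == 0 ? 0 : -1;
}
```

On a platform where long is 4 bytes, any index entry above 2^31-1 makes this return -1, which is exactly the case described above.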

There are ways around this using OS calls, but
as far as I can tell the C99 standard offers no way to write
code that can jump to an arbitrary offset in this type of
large data file. Is there any movement in the standards
community towards solving this problem in the not too
distant future?

Solaris offers fseeko (fseek with offset of type
off_t) and various other lf64 extensions, but the man pages
didn't indicate that it was anything other than a Sun
specific solution.

Thanks,

David Mathog
mathog@caltech.edu
Gordon Burditt
#2: Nov 15 '05

re: Standard compliant seek with long long?


>I recently ran into a problem where a data file downloaded
>from another site contained more than 4Gb of data and so
>the index file to items within that data went from unsigned
>4 byte integers to unsigned 8 byte integers. Naturally this
>broke my code which uses fseek(), and can only offset by
>longs, which on the target OS is a 4 byte integer.
>
>There are ways around this using OS calls, but
>as far as I can tell the C99 standard offers no way to write
>code that can jump to an arbitrary offset in this type of
>large data file. Is there any movement in the standards
>community towards solving this problem in the not too
>distant future?

There are a couple of different standards-based solutions
(both of which require possibly-incompatible changes to
the implementation):

(1) make long bigger than 32 bits.
or
(2) fgetpos() and fsetpos(), which allow for the position to be
contained in what might be a struct containing a (maybe 256k-bit)
track, sector, cylinder, disk number, IPv8 IP address, etc.
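
As a rough illustration of option (2), the sketch below bookmarks the current position with fgetpos(), reads ahead, and returns with fsetpos(). The helper name peek_byte is made up for this example. Note that fpos_t is opaque: standard C only lets you pass fsetpos() a value that an earlier fgetpos() on the same stream produced.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the next byte without consuming it.
 * The position is saved in an opaque fpos_t, so this works no matter
 * how large the file is or how the implementation encodes positions. */
int peek_byte(FILE *fp)
{
    fpos_t here;
    int c;

    assert(fp != NULL);
    if (fgetpos(fp, &here) != 0)   /* remember where we are */
        return EOF;
    c = fgetc(fp);                 /* read ahead by one byte */
    if (fsetpos(fp, &here) != 0)   /* go back to the bookmark */
        return EOF;
    return c;
}
```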

Gordon L. Burditt
tedu
#3: Nov 15 '05

re: Standard compliant seek with long long?


Gordon Burditt wrote:
> >I recently ran into a problem where a data file downloaded
> >from another site contained more than 4Gb of data and so
> >the index file to items within that data went from unsigned
> >4 byte integers to unsigned 8 byte integers. Naturally this
> >broke my code which uses fseek(), and can only offset by
> >longs, which on the target OS is a 4 byte integer.
> >
> >There are ways around this using OS calls, but
> >as far as I can tell the C99 standard offers no way to write
> >code that can jump to an arbitrary offset in this type of
> >large data file. Is there any movement in the standards
> >community towards solving this problem in the not too
> >distant future?
>
> There are a couple of different standards-based solutions
> (both of which require possibly-incompatible changes to
> the implementation):
>
> (1) make long bigger than 32 bits.
> or
> (2) fgetpos() and fsetpos(), which allow for the position to be
> contained in what might be a struct containing a (maybe 256k-bit)
> track, sector, cylinder, disk number, IPv8 IP address, etc.

Since you can only fsetpos() to a location where you once called
fgetpos(), this doesn't really help with random seeking unless you
first fseek() through the file, recording positions as you go.

fseeko() is part of POSIX, so it should be available on most platforms
(but outside strict C).
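
A minimal sketch of the fseeko() route follows. It assumes a POSIX system; the _FILE_OFFSET_BITS macro is the glibc-style switch for a 64-bit off_t and is an assumption about the target platform (some systems need a different switch or none). The helper name seek_large is hypothetical.

```c
/* Assumptions: POSIX.1-2001 system; both macros must precede every #include. */
#define _POSIX_C_SOURCE 200112L
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: seek to an absolute 64-bit offset using fseeko().
 * Returns 0 on success, -1 if the offset does not fit in this platform's
 * off_t or if fseeko() fails. */
int seek_large(FILE *fp, uint64_t offset)
{
    off_t target = (off_t)offset;

    assert(fp != NULL);
    /* Reject values that did not survive the conversion to off_t. */
    if (target < 0 || (uint64_t)target != offset)
        return -1;
    return fseeko(fp, target, SEEK_SET) == 0 ? 0 : -1;
}
```

The matching call for reading the position back is ftello(), which returns an off_t rather than a long.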

David Mathog
#4: Nov 15 '05

re: Standard compliant seek with long long?


tedu wrote:
> Gordon Burditt wrote:
>
>>>I recently ran into a problem where a data file downloaded
>>>from another site contained more than 4Gb of data and so
>>>the index file to items within that data went from unsigned
>>>4 byte integers to unsigned 8 byte integers. Naturally this
>>>broke my code which uses fseek(), and can only offset by
>>>longs, which on the target OS is a 4 byte integer.
>
> fseeko() is part of posix, so it should be available on most platforms
> (but outside strict C).

Thanks, I guess allowing POSIX extensions isn't too much of a stretch.

Still...

Can any of the standards folks explain why ANSI C does not
have fseeko (or equivalent) so that we can write
standard compliant ANSI C code that can randomly access the largest
files supported on a given OS?

I can understand the historical basis for fseek() using only
longs but not, at this late date, why something like fseeko is
not part of the current C standard.

Regards,

David Mathog
mathog@caltech.edu
Keyser Soze
#5: Nov 15 '05

re: Standard compliant seek with long long?


"tedu" <tu@zeitbombe.org> wrote in message news:1129749842.004314.295930@g49g2000cwa.googlegroups.com...
> Gordon Burditt wrote:
> > >I recently ran into a problem where a data file downloaded
> > >from another site contained more than 4Gb of data and so
> > >the index file to items within that data went from unsigned
> > >4 byte integers to unsigned 8 byte integers. Naturally this
> > >broke my code which uses fseek(), and can only offset by
> > >longs, which on the target OS is a 4 byte integer.
> > >
> > >There are ways around this using OS calls, but
> > >as far as I can tell the C99 standard offers no way to write
> > >code that can jump to an arbitrary offset in this type of
> > >large data file. Is there any movement in the standards
> > >community towards solving this problem in the not too
> > >distant future?
> >
> > There are a couple of different standards-based solutions
> > (both of which require possibly-incompatible changes to
> > the implementation):
> >
> > (1) make long bigger than 32 bits.
> > or
> > (2) fgetpos() and fsetpos(), which allow for the position to be
> > contained in what might be a struct containing a (maybe 256k-bit)
> > track, sector, cylinder, disk number, IPv8 IP address, etc.
>
> since you can only fsetpos() to a location where you once called
> fgetpos(), this doesn't really help with random seeking unless you
> fseek() through recording positions.
>
> fseeko() is part of posix, so it should be available on most platforms
> (but outside strict C).

Remember that fseek() can move the file position relative to the beginning of the file (SEEK_SET), the end of the file (SEEK_END), or the current position (SEEK_CUR).

You only run into the 4 GB file size limit when the fseek() origin is set to SEEK_SET or SEEK_END. When using SEEK_CUR you can move
the file position forward by (2^31-1) bytes and backward by (2^31) bytes from the current position.

By combining fseek() with fgetpos() and fsetpos() it should be possible to index any file in 4 GB regions.

You will need to check that the C runtime of your target OS supports files larger than 4 GB.

To avoid some of the nasty surprises that fseek() has, you should open the file as a binary stream. Text file streams tend to have
issues when using fseek().


Walter Roberson
#6: Nov 15 '05

re: Standard compliant seek with long long?


In article <IkU5f.5222$tV6.1559@newssvr27.news.prodigy.net>,
Keyser Soze <noreturn@nothere.com> wrote:
>Remember that fseek() can move the file position relative to the BEGINNING, END, or CURRENT position of the file.

>You only run into the 4Gb file size limit when the fseek() origin is set to SEEK_SET or SEEK_END. When using SEEK_CUR you can move
>the file position forward by (2^31-1) bytes and backwards by (2^31) bytes from the current position.

>By combining fseek() with fgetpos() and fsetpos() it should be possible to index any file in 4Gb regions.

"A binary stream need not meaningfully support fseek calls with
a whence value of SEEK_END."

"For a text stream, either offset shall be zero, or offset shall be
a value returned by an earlier call to the ftell function on the
same stream and whence shall be SEEK_SET."


The first of these means that you cannot move to arbitrary positions
relative to the end of a binary stream.

The second of these means that you cannot index text streams in 4 GB
regions -- you only said that text streams had "issues".


I suppose it would be possible to count the number of times that one
must seek forward by 2^31-1 bytes, and then the further count forward
one must go relative to that, but this would seem to be a poor way to
win a war.
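
For what it is worth, the counting approach might look something like the sketch below. The function name seek_in_steps and its max_step parameter are hypothetical; in real use max_step would be LONG_MAX, and it is a parameter here only so the stepping can be exercised on a small file. It assumes a binary stream, and it assumes the C library is willing to move the position past LONG_MAX at all, which an implementation may refuse to do.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: reach an absolute 64-bit offset on a binary stream
 * by rewinding and then hopping forward with SEEK_CUR, never asking
 * fseek() for more than max_step bytes at a time.
 * Returns 0 on success, -1 as soon as any fseek() call fails. */
int seek_in_steps(FILE *fp, uint64_t offset, long max_step)
{
    assert(fp != NULL);
    assert(max_step > 0);

    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    while (offset > 0) {
        /* Take a full step, or whatever remainder is left. */
        long step = offset > (uint64_t)max_step ? max_step : (long)offset;
        if (fseek(fp, step, SEEK_CUR) != 0)
            return -1;
        offset -= (uint64_t)step;
    }
    return 0;
}
```

A call such as seek_in_steps(fp, offset, LONG_MAX) costs one fseek() per 2^31-1 bytes on a 32-bit long, so a 10 GB offset needs about five forward hops after the rewind.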
--
If you lie to the compiler, it will get its revenge. -- Eric Sosman