How to speed up ftell()/fseek()



Hello,

I am trying to quickly read large binary files (on the order of 100-200 MB)
using ftell() and fseek(). My class gets a pointer to the
data stored in the file, and then uses fseek() to access
and read the data. The problem is that as the file grows
in size, the access time also increases. I initially used
fseek() with the SEEK_SET option, but later switched to SEEK_CUR
in the hope that this would speed up the access, but there
was no improvement. My question is: is there anything else
one can do to make the access time independent
of the file size?
Stream classes are not an option here, as they are even
slower.

L.B.
*-------------------------------------------------------------------*
| Dr. Leslaw Bieniasz, |
| Institute of Physical Chemistry of the Polish Academy of Sciences,|
| Department of Electrochemical Oxidation of Gaseous Fuels, |
| ul. Zagrody 13, 30-318 Cracow, Poland. |
| tel./fax: +48 (12) 266-03-41 |
| E-mail: nb******@cyf-kr.edu.pl |
*-------------------------------------------------------------------*
| Interested in Computational Electrochemistry? |
| Visit my web site: http://www.cyf-kr.edu.pl/~nbbienia |
*-------------------------------------------------------------------*
Jul 23 '05 #1
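
For concreteness, the access pattern described in the question might look roughly like the sketch below. The helper names read_at and read_at_relative are hypothetical (the original class is not shown), and the file is assumed to be opened in binary mode. Both variants ask the library for the same final byte position, one as an absolute offset and one relative to the current position, so it is plausible that switching between SEEK_SET and SEEK_CUR changes little.

```c
#include <stdio.h>

/* Hypothetical helper: read `len` bytes starting at absolute byte `offset`.
   Returns 0 on success, -1 on a failed seek or a short read. */
int read_at(FILE *fp, long offset, void *buf, size_t len)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    return fread(buf, 1, len, fp) == len ? 0 : -1;
}

/* The same read, expressed relative to the current position (SEEK_CUR). */
int read_at_relative(FILE *fp, long offset, void *buf, size_t len)
{
    long here = ftell(fp);              /* where the stream is now */
    if (here < 0 || fseek(fp, offset - here, SEEK_CUR) != 0)
        return -1;
    return fread(buf, 1, len, fp) == len ? 0 : -1;
}
```

A caller would open the file with fopen(path, "rb") and then call read_at(fp, offset, &record, sizeof record) for each record it needs.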
Let me guess: You are using a Microsoft compiler. I once wrote a language
interpreter that did all the necessary token recognition, parsing and
expression evaluation, and it turned out that an lseek I was doing just to
keep track of the current file position (and not to actually seek anywhere)
was taking 50% of the execution time! That was easy to fix because I only
had to use my own counter to keep track of the position myself. In your case
the fseek is really seeking, so I don't know what you can do. Are you sure
the delays are excessive? You would expect some degradation in performance
as the file size increases and the physical seek distances on the disk get
larger.

DW
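
A minimal sketch of the "keep your own counter" idea mentioned above, in case it helps. The tracked_file type and its two functions are hypothetical names, not part of any library, and the counting is only valid for a stream opened in binary mode. The counter replaces calls made purely to ask for the current position, and the seek wrapper also skips the library call when the stream is already at the requested offset.

```c
#include <stdio.h>

/* Hypothetical wrapper: a stream plus a position counter maintained by hand. */
typedef struct {
    FILE *fp;
    long  pos;   /* byte offset of the next read (binary mode assumed) */
} tracked_file;

/* Read up to `len` bytes and advance the counter by what was actually read. */
size_t tracked_read(tracked_file *tf, void *buf, size_t len)
{
    size_t got = fread(buf, 1, len, tf->fp);
    tf->pos += (long)got;
    return got;
}

/* Seek to an absolute offset; returns 0 on success, -1 on failure. */
int tracked_seek(tracked_file *tf, long offset)
{
    if (offset == tf->pos)
        return 0;                       /* already there: no library call */
    if (fseek(tf->fp, offset, SEEK_SET) != 0)
        return -1;
    tf->pos = offset;
    return 0;
}
```

With this in place, tf.pos can be read wherever the code would otherwise call ftell(). Whether that helps in a case where the seeks really move the position would have to be measured.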
Jul 23 '05 #2


Perhaps the following links will give some tips:
http://groups-beta.google.com/group/...5e065030?hl=en
http://groups-beta.google.com/group/...a4c4e9bb?hl=en

--
Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn

Jul 23 '05 #3
Did you consider mmap'ing the file instead? I don't know whether
this is available on your platform, or whether it is faster than fseek(),
but it might be worth a try.

Tom
Jul 23 '05 #4

How can I do the mmapping of a file?
I heard about that but I don't know how to actually do this.
L.B.

Jul 23 '05 #5
Be aware that it's basically a Unix thing... see if you have a header file called sys/mman.h in your system include path.

http://www.gnu.org/software/libc/man...mapped-I_002fO

I think you'll find some sample code if you follow the links in the post by Alex Vinokur earlier in this thread (it's a
bit tricky getting all the parameters right, I seem to recall).

--
Lionel B

Jul 23 '05 #6
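
For readers who want a starting point, here is a rough sketch of mapping a whole file read-only with POSIX mmap(). It assumes a Unix-like system that provides sys/mman.h; the mapped_file type and the map_file/unmap_file names are hypothetical. On Windows the same idea goes through CreateFileMapping and MapViewOfFile, which this sketch does not cover.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical handle for a read-only mapping of an entire file. */
typedef struct {
    const unsigned char *data;  /* first byte of the file */
    size_t size;                /* file length in bytes */
} mapped_file;

/* Map `path` read-only. Returns 0 on success, -1 on failure. */
int map_file(const char *path, mapped_file *out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {  /* empty files cannot be mapped */
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    out->data = p;
    out->size = (size_t)st.st_size;
    return 0;
}

/* Release the mapping created by map_file(). */
void unmap_file(mapped_file *mf)
{
    munmap((void *)mf->data, mf->size);
    mf->data = NULL;
    mf->size = 0;
}
```

Once mapped, a record at byte offset off is simply mf.data + off, so no fseek()/fread() pair is needed for each access. Whether this is actually faster for a given file and access pattern has to be measured.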
when it comes to POSIX and UNIX the best places to go to are IMHO
- http://www.opengroup.org
(in this case
http://www.opengroup.org/onlinepubs/.../xsh/mmap.html)
- usenet: comp.unix.programmer
- the man pages on your systems
- the docs of your system provider (e.g.: http://docs.sun.com)

Tom
Jul 23 '05 #7
Thomas Maier-Komor wrote:
when it comes to POSIX and UNIX the best places to go to are IMHO
- http://www.opengroup.org


Didn't know about this resource - looks very handy.

Thanks,

--
Lionel B

Jul 23 '05 #8
