Bytes | Software Development & Data Engineering Community

How to speed up ftell()/fseek()



Hello,

I am trying to read large binary files (on the order of 100-200 MB)
quickly using ftell() and fseek(). My class gets a pointer to the
data stored in the file, and then uses fseek() to access
and read the data. The problem is that as the file grows
in size, the access time also increases. I initially used
fseek() with the SEEK_SET option, but later switched to SEEK_CUR
in the hope that this would speed up the access, but there
is no improvement. My question is: is there anything else
one can do to make the access time independent
of the file size?
Stream classes are not an option here, as they are even
slower.
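
For readers following along, the access pattern described above might look roughly like the sketch below. The helper name read_at and its parameters are hypothetical and are not taken from the poster's class; the only assumption is a stream opened in binary mode.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post): read `len` bytes
   stored at byte `offset` of an already-open binary stream.
   Returns 0 on success, -1 on failure. */
int read_at(FILE *fp, long offset, void *buf, size_t len)
{
    /* Absolute seek, measured from the start of the file. */
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;

    /* A short read means end of file or an I/O error. */
    if (fread(buf, 1, len, fp) != len)
        return -1;

    return 0;
}
```

A caller would open the file with fopen(path, "rb") and pass the stored offset of each record to read_at().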

L.B.
*-------------------------------------------------------------------*
| Dr. Leslaw Bieniasz, |
| Institute of Physical Chemistry of the Polish Academy of Sciences,|
| Department of Electrochemical Oxidation of Gaseous Fuels, |
| ul. Zagrody 13, 30-318 Cracow, Poland. |
| tel./fax: +48 (12) 266-03-41 |
| E-mail: nb******@cyf-kr.edu.pl |
*-------------------------------------------------------------------*
| Interested in Computational Electrochemistry? |
| Visit my web site: http://www.cyf-kr.edu.pl/~nbbienia |
*-------------------------------------------------------------------*
Jul 23 '05 #1
Let me guess: You are using a Microsoft compiler. I once wrote a language
interpreter that did all the necessary token recognition, parsing and
expression evaluation, and it turned out that an lseek I was doing just to
keep track of the current file position (and not to actually seek anywhere)
was taking 50% of the execution time! That was easy to fix because I only
had to use my own counter to keep track of the position myself. In your case
the fseek is really seeking, so I don't know what you can do. Are you sure
the delays are excessive? You would expect some degradation in performance
as the file size increases and the physical seek distances on the disk get
larger.
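
A minimal sketch of the "keep your own counter" idea, written with stdio rather than the lseek mentioned above. The type counted_file and the function counted_read are hypothetical names introduced only for illustration, and the counter is assumed to be meaningful only for streams opened in binary mode.

```c
#include <stdio.h>

/* Hypothetical wrapper: a stream plus a manually maintained position. */
typedef struct {
    FILE *fp;
    long  pos;   /* bytes read so far, counted from the start of the file */
} counted_file;

/* Read up to `len` bytes and advance the counter by the number of
   bytes actually read, so no ftell()/lseek() call is needed later. */
size_t counted_read(counted_file *cf, void *buf, size_t len)
{
    size_t n = fread(buf, 1, len, cf->fp);
    cf->pos += (long)n;
    return n;
}
```

After initialising with counted_file cf = { fopen(path, "rb"), 0 }, the current position is simply cf.pos; any explicit fseek() would have to update the counter as well.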

DW
Jul 23 '05 #2

Perhaps the following links will give some tips:
http://groups-beta.google.com/group/...5e065030?hl=en
http://groups-beta.google.com/group/...a4c4e9bb?hl=en

--
Alex Vinokur
email: alex DOT vinokur AT gmail DOT com
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn

Jul 23 '05 #3
Did you consider mmap'ing the file instead? I don't know whether
it is available on your platform or whether it is faster than fseek,
but it might be worth a try.

Tom
Jul 23 '05 #4

How can I do the mmapping of a file?
I heard about that but I don't know how to actually do this.
L.B.

Jul 23 '05 #5
Leslaw Bieniasz wrote:
How can I do the mmapping of a file?
I heard about that but I don't know how to actually do this.


Be aware that it's basically a Unix thing... see if you have a header file called sys/mman.h in your system path.

http://www.gnu.org/software/libc/man...mapped-I_002fO

I think you'll find some sample code if you follow the links in the post by Alex Vinokur earlier in this thread (it's a
bit tricky getting all the parameters right, I seem to recall).
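
As a rough illustration of the parameters involved, here is a minimal read-only mapping sketch for a POSIX system that provides sys/mman.h. The names mapped_file, map_file_readonly and unmap_file are hypothetical, and this is not the sample code from the links mentioned above.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical holder for a read-only mapping of a whole file. */
typedef struct {
    const unsigned char *data;  /* first byte of the mapped file */
    size_t size;                /* length of the mapping in bytes */
} mapped_file;

/* Map the whole file read-only. Returns 0 on success, -1 on failure. */
int map_file_readonly(const char *path, mapped_file *out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    /* mmap() cannot map zero bytes, so empty files are rejected here. */
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;

    out->data = p;
    out->size = (size_t)st.st_size;
    return 0;
}

/* Release the mapping created by map_file_readonly(). */
void unmap_file(mapped_file *mf)
{
    munmap((void *)mf->data, mf->size);
    mf->data = NULL;
    mf->size = 0;
}
```

Once mapped, a record at a known byte offset is read as mf.data + offset, with no seek call at all. Whether this is actually faster than fseek() for a given file and platform would need to be measured. On Windows the rough counterpart is CreateFileMapping/MapViewOfFile, which this sketch does not cover.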

--
Lionel B

Jul 23 '05 #6
When it comes to POSIX and UNIX, the best places to go are, IMHO:
- http://www.opengroup.org
  (in this case http://www.opengroup.org/onlinepubs/.../xsh/mmap.html)
- Usenet: comp.unix.programmer
- the man pages on your system
- the docs of your system provider (e.g. http://docs.sun.com)

Tom
Jul 23 '05 #7
Thomas Maier-Komor wrote:
when it comes to POSIX and UNIX the best places to go to are IMHO
- http://www.opengroup.org


Didn't know about this resource - looks very handy.

Thanks,

--
Lionel B

Jul 23 '05 #8
