Fastest way to read a file

Does anybody know what is (factual please, not just a guess) the quickest
method to read data from a file? I am not interested in the format of the
data (i.e. blocks, bytes, strings etc.), just that the IO to read the data is
very quick. I am currently using a StreamReader and have found the ReadLine
method to perform slightly better than the Read method (although it is nice
to have Read's granularity of one byte). Is there any faster reader that
I can use? (Only interested in reads, not writes.)

Thanks.

Mark.
Nov 17 '05 #1
Not sure if this is the fastest but you could give it a try.
public static string FileToStr(string cFileName)
{
    // Create a StreamReader and open the file
    StreamReader oReader = System.IO.File.OpenText(cFileName);

    // Read the entire contents of the file into a string
    string lcString = oReader.ReadToEnd();

    // Close the StreamReader and return the string
    oReader.Close();
    return lcString;
}
Nov 17 '05 #2
Hi Mark,

You're not telling us what format your data is in. If it is raw data, then StreamReader will treat it as text, and reading directly from the FileStream will be faster. I tried with a ~30 MB binary file and StreamReader took roughly three times longer to read all the bytes.

However, since you are using StreamReader to begin with you probably have a text file. Still, it might be worth trying to read it as a byte array with FileStream.Read and convert the bytes to string using the Encoding class.
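
A minimal sketch of that suggestion, reading the raw bytes with FileStream.Read and decoding them only when text is needed. The class and method names (RawFileReader, ReadBytes, ReadText) are made up for illustration, and the sketch assumes the file fits in a single byte array.

```csharp
using System;
using System.IO;
using System.Text;

public static class RawFileReader
{
    // Reads the whole file as raw bytes, with no text decoding involved.
    // Assumption: the file is small enough to fit in one array.
    public static byte[] ReadBytes(string path)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            byte[] data = new byte[fs.Length];
            int offset = 0;
            while (offset < data.Length)
            {
                // Read may return fewer bytes than requested, so loop until full.
                int n = fs.Read(data, offset, data.Length - offset);
                if (n == 0) break; // file ended earlier than its reported length
                offset += n;
            }
            return data;
        }
    }

    // Decodes the bytes with the given encoding.
    // Unlike StreamReader, this does not detect or strip a byte-order mark.
    public static string ReadText(string path, Encoding encoding)
    {
        return encoding.GetString(ReadBytes(path));
    }
}
```

Something like RawFileReader.ReadText(path, Encoding.UTF8) would then stand in for StreamReader.ReadToEnd; whether it is actually faster is worth measuring on your own files.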
On Thu, 28 Jul 2005 11:59:35 +0200, Mark Broadbent <no****@nospam.com> wrote:
> Does anybody know what is (factual please, not just a guess) the quickest
> method to read data from a file? [...]


--
Happy coding!
Morten Wennevik [C# MVP]
Nov 17 '05 #3
Just my 2 cents. I would use a "using" statement to wrap the StreamReader
object. That way it will be closed even if an exception is thrown during
the read.
Something like (from memory):

public static string ReadFile(string path)
{
    using (StreamReader sr = new StreamReader(path))
    {
        return sr.ReadToEnd();
    }
}

--
William Stacey [MVP]

"Eugene Vtial" <ne**@microsoft.com> wrote in message
news:eH**************@TK2MSFTNGP09.phx.gbl...
> Not sure if this is the fastest but you could give it a try.
> public static string FileToStr(string cFileName)
> [...]

Nov 17 '05 #4
As I said, factual please, not guesses. I can try variants myself (and have).
ReadToEnd has no performance gain over ReadLine.
Nov 17 '05 #5
Hi Morten. As I said, I'm not really interested in whatever format a file is
in; I am doing a comparison between files, so I only need to compare
bits. Thanks for the info on StreamReader; I had a thought that this might not
be the best IO object. Using FileStream, the same kind of operation on a
like-for-like basis (200MB) completes in 1 minute (comparable time to
StreamReader ReadLine), whereas StreamReader would have taken 1m 30secs.
I guess I could do async on both files, which should speed things up even
further.
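
For reference, a rough sketch of such a comparison over two FileStreams, one chunk at a time. FileComparer, AreEqual and the 64 KB chunk size are assumptions made up for this example, not anything measured in this thread.

```csharp
using System;
using System.IO;

public static class FileComparer
{
    // Hypothetical chunk size; worth tuning by measurement.
    private const int ChunkSize = 64 * 1024;

    // Returns true when both files have identical length and contents.
    public static bool AreEqual(string pathA, string pathB)
    {
        using (FileStream a = new FileStream(pathA, FileMode.Open, FileAccess.Read, FileShare.Read, ChunkSize))
        using (FileStream b = new FileStream(pathB, FileMode.Open, FileAccess.Read, FileShare.Read, ChunkSize))
        {
            // Different lengths can never match, so skip the reads entirely.
            if (a.Length != b.Length) return false;

            byte[] bufA = new byte[ChunkSize];
            byte[] bufB = new byte[ChunkSize];
            while (true)
            {
                int readA = FillBuffer(a, bufA);
                int readB = FillBuffer(b, bufB);
                if (readA != readB) return false;
                if (readA == 0) return true; // both streams exhausted

                for (int i = 0; i < readA; i++)
                {
                    if (bufA[i] != bufB[i]) return false;
                }
            }
        }
    }

    // Keeps reading until the buffer is full or the stream ends,
    // because a single Read call may return fewer bytes than requested.
    private static int FillBuffer(Stream s, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = s.Read(buffer, total, buffer.Length - total);
            if (n == 0) break;
            total += n;
        }
        return total;
    }
}
```

Checking the lengths first means files of different sizes are rejected without reading any data.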

Thanks again.

Mark.
Nov 17 '05 #6
Well, reading the 30 MB file took about 0.1 second using a FileStream, so a 200 MB file shouldn't take all that much longer. Most likely it is the comparison that takes up the time, or you read too few bytes in each read.
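
One way to check the "too few bytes per read" theory is to time the bare read loop at a few chunk sizes, with no comparison work in it. A small sketch follows; ReadTimer, ReadAll and Time are hypothetical names, and chunkSize is assumed to be positive.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class ReadTimer
{
    // Reads the whole file in chunks of chunkSize bytes and returns the total bytes read.
    public static long ReadAll(string path, int chunkSize)
    {
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            int n;
            while ((n = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                total += n;
            }
        }
        return total;
    }

    // Measures how long one full pass over the file takes at the given chunk size.
    public static TimeSpan Time(string path, int chunkSize)
    {
        Stopwatch sw = Stopwatch.StartNew();
        ReadAll(path, chunkSize);
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Calling Time with, say, 512, 4096 and 65536 on the same file gives a rough picture of how much the chunk size matters, separately from the cost of the comparison itself.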

Try to limit reading

On Thu, 28 Jul 2005 15:30:55 +0200, Mark Broadbent <no****@nospam.com> wrote:
> [...] Using FileStream, the same kind of operation on a like-for-like
> basis (200MB) completes in 1 minute [...]

--
Happy coding!
Morten Wennevik [C# MVP]
Nov 17 '05 #7
Hmm... a 30MB file in 0.1 second; you must have the whole file data in the
cache, so what you are measuring is the memory-to-memory transfer rate.
If you run file IO benchmarks, please flush the file cache before each run.
Willy.
Nov 17 '05 #8

"Mark Broadbent" <no****@nospam.com> wrote in message
news:u7*************@TK2MSFTNGP09.phx.gbl...
> Hi Morten. As I said, I'm not really interested in whatever format a file is
> in; I am doing a comparison between files, so I only need to compare
> bits. Thanks for the info on StreamReader; I had a thought that this might
> not be the best IO object. Using FileStream, the same kind of operation on
> a like-for-like basis (200MB) completes in 1 minute (comparable time to
> StreamReader ReadLine), whereas StreamReader would have taken 1m 30secs.
> I guess I could do async on both files, which should speed things up even
> further.
>
> Thanks again.
>
> Mark.

All FCL file IO classes are simple wrappers over the one and only ReadFile
Win32 IO API, so their performance level will be more or less the same when
using comparable buffer sizes at the core.
The differences in functionality/complexity of the wrapper classes play
only a very small role in the overall IO transfer rate between disk and
process memory.
The FileStream class is the one designed for simple buffered file IO, so
it's the fastest, but again the differences with the others are hard to
measure when used with comparable buffer sizes.
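
To make "comparable buffer sizes" concrete, here is a small sketch that opens a FileStream and a StreamReader with the same explicit buffer size, so a timing comparison between them is like for like. BufferedOpen and the 64 KB value are assumptions for illustration only; the constructor overloads used are the standard ones.

```csharp
using System.IO;
using System.Text;

public static class BufferedOpen
{
    // Hypothetical buffer size shared by both readers.
    public const int BufferSize = 64 * 1024;

    // Opens the file for binary reads with an explicit FileStream buffer size.
    public static FileStream OpenBinary(string path)
    {
        return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, BufferSize);
    }

    // Wraps the same kind of stream in a StreamReader that uses the same buffer size.
    // Disposing the returned reader also closes the underlying FileStream.
    public static StreamReader OpenText(string path)
    {
        return new StreamReader(OpenBinary(path), Encoding.UTF8, true, BufferSize);
    }
}
```

Both return values should be wrapped in a using statement by the caller.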

Willy.
Nov 17 '05 #9
Oh yeah, indeed :P Uncached FileStream seemed to take around 0.6 seconds on a fairly slow disk (assuming it is now uncached). How would I go about ensuring the cache is flushed?
On Thu, 28 Jul 2005 16:24:48 +0200, Willy Denoyette [MVP] <wi*************@telenet.be> wrote:
> Hmm... 30MB file in 0.1 second, you must have the whole file data in the
> cache, so what you are measuring is the memory to memory transfer rate.
> Please if you run file IO benchmarks flush the file cache before each run.
> Willy.

--
Happy coding!
Morten Wennevik [C# MVP]
Nov 17 '05 #10

"Morten Wennevik" <Mo************@hotmail.com> wrote in message
news:op.sum82ntdklbvpo@stone...
> Oh yeah, indeed :P Uncached FileStream seemed to take around 0.6 seconds
> on a fairly slow disk (assuming it is now uncached). How would I go about
> ensuring the cache is flushed?


Note that I'm talking about the File System cache, not the FileStream cache.
Once you have read a file or a portion of a file, that file (or portion) will
be in the FS cache, so your experiment is still using the FS cache.
You can eliminate the FS caching behavior:
1. By opening the file unbuffered, so that the file data is passed directly
from the driver to the application buffer.
2. By flushing the FS cache for this file.
Neither is exposed by the FCL, so you'll have to PInvoke the (1)
CreateFile() with FILE_FLAG_NO_BUFFERING and (2) FlushFileBuffers() APIs.

Willy.


Nov 17 '05 #11
I've found that MemoryMapped files are the quickest form of File IO.
I've only ever used C++ for MM files, so am not sure if this
functionality is available natively with .Net, or if you would have to
use interop.
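
For readers on later framework versions: .NET 4.0 and newer ship a managed wrapper in System.IO.MemoryMappedFiles, so interop is no longer required. A minimal sketch, with MappedReader and SumBytes as made-up names, that maps a file read-only and walks its bytes through a view stream:

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;

public static class MappedReader
{
    // Maps the file read-only and returns the sum of all its bytes,
    // just as a stand-in for "do something with every byte".
    public static long SumBytes(string path)
    {
        long length = new FileInfo(path).Length;
        if (length == 0) return 0; // an empty file cannot be mapped

        long sum = 0;
        byte[] buffer = new byte[64 * 1024];
        using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(
            path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (MemoryMappedViewStream view = mmf.CreateViewStream(0, length, MemoryMappedFileAccess.Read))
        {
            int n;
            while ((n = view.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < n; i++) sum += buffer[i];
            }
        }
        return sum;
    }
}
```

The view is created with the exact file length so the loop does not run into page padding past the end of the file. This only shows that the API exists; it makes no claim about speed.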

RichS

Nov 17 '05 #12
Yes, the FileStream/StreamReader cache is flushed on Close() so I knew you didn't mean that. I'll try to ensure the cache is flushed from now on, thanks :)

On Fri, 29 Jul 2005 09:58:43 +0200, Willy Denoyette [MVP] <wi*************@telenet.be> wrote:
> Note that I'm talking about the File System cache not the FileStream cache,
> [...]

--
Happy coding!
Morten Wennevik [C# MVP]
Nov 17 '05 #13
Yeah I agree, thanks. The performance could be better, but I guess it is
acceptable and I will see how far I can go to optimise. Reading in two
different threads should hopefully improve on what I have.
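
A rough sketch of the two-thread idea, with one file loaded on a worker thread while the calling thread loads the other. ParallelLoader and LoadBoth are hypothetical names, and the sketch assumes both files fit in memory. Whether this helps at all probably depends on whether the two files sit on the same physical disk.

```csharp
using System.IO;
using System.Threading;

public static class ParallelLoader
{
    // Loads two files at the same time: one on a worker thread, one on the caller's thread.
    // Assumption: both files are small enough to hold in memory at once.
    public static void LoadBoth(string pathA, string pathB, out byte[] dataA, out byte[] dataB)
    {
        // Locals are used because a lambda cannot capture out parameters.
        byte[] a = null;
        byte[] b = null;

        // Note: an exception thrown on the worker thread is not handled here.
        Thread worker = new Thread(() => { a = File.ReadAllBytes(pathA); });
        worker.Start();

        b = File.ReadAllBytes(pathB);

        // Wait for the worker before touching its result.
        worker.Join();

        dataA = a;
        dataB = b;
    }
}
```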

With respect to your discussion of flushing the cache to disk,
another method that could be used is the freeware Sysinternals app
"sync.exe", which presumably should perform the same function.

Thanks guys.

Mark.
Nov 17 '05 #14

"RichS" <ri****************@surfcontrol.com> wrote in message
news:11*********************@g43g2000cwa.googlegroups.com...
> I've found that MemoryMapped files are the quickest form of File IO.
> I've only ever used C++ for MM files, so am not sure if this
> functionality is available natively with .Net, or if you would have to
> use interop.
>
> RichS


No, they are not faster. The accesses are faster once mapped into your
process space, but mapping file portions involves IO reads, and these are not
faster than any other managed or unmanaged read IO. Also, this is not
desirable at all when there is only a single process accessing the file.
MemoryMapped files are great for sharing file data, that's all.

Willy.



Nov 17 '05 #15
