Reading real time log files.

Looking for the best approach to reading a real-time log file. The file gets
updated pretty quickly, always appended to the end.

Do I really need to just keep re-opening the file and reading the end of it?

I can't help but think there is a better way to go about this.

Any suggestions would be greatly appreciated.

Thanks
Justin
Jul 19 '07 #1

"Justin Rich" <jr******@yahoo.spam.comwrote in message
news:uy**************@TK2MSFTNGP06.phx.gbl...
> looking for the best approach to reading a real time log file. The file
> gets updated pretty quickly, always appended to the end.
>
> do i really need to just keep re-opening the file and reading the end of
> it?

I don't know why you'd need to reopen. Your position may be invalid after
you hit end-of-file, so just seek to the position you last got data from.
There's something like FileSystemWatcher which should tell you whenever the
size changes.
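
A rough sketch of the seek-and-reread idea (assuming, as the original post says, the file is only ever appended to; the class and method names here are illustrative, not from any of the posts):

```csharp
using System;
using System.IO;
using System.Text;

public class LogTail
{
    static long position;   // byte offset we have read up to so far

    // Read whatever has been appended since the last call.
    // Assumes the file only ever grows; rotation is not handled here.
    public static string ReadNew(string path)
    {
        // FileShare.ReadWrite lets the logging app keep writing while we read.
        using (var fs = new FileStream(path, FileMode.Open,
                   FileAccess.Read, FileShare.ReadWrite))
        {
            fs.Seek(position, SeekOrigin.Begin);   // resume where we left off
            var buffer = new byte[fs.Length - position];
            int read = fs.Read(buffer, 0, buffer.Length);
            position += read;
            return Encoding.UTF8.GetString(buffer, 0, read);
        }
    }
}
```

A FileSystemWatcher (with NotifyFilters.Size set) could call ReadNew() from its Changed event, or a simple timer could poll it; either way the file is reopened but never re-read from the start.
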
Jul 19 '07 #2
I guess I don't understand how the readers work.

I assumed the reader was much like Notepad: when you open the file, a copy
is put into memory and you work off that. If the original file is changed,
it won't reflect that until you close and reopen Notepad.

Are you saying that if I open a stream on that file (some more code-based
info on how I should open the file would be good), it will let me read to
EOF, and then once it reaches EOF I can basically keep trying to read a
line until I no longer get EOF and I actually get some new data?

I haven't put a lot of thought into this... I just know that I've never
done such a thing, and I had ASSUMED that the basic stream readers read
the entire file into memory first. If I'm wrong, this probably won't be so
hard.

Thanks
Justin

"Ben Voigt [C++ MVP]" <rb*@nospam.nospamwrote in message
news:eQ**************@TK2MSFTNGP03.phx.gbl...
>
"Justin Rich" <jr******@yahoo.spam.comwrote in message
news:uy**************@TK2MSFTNGP06.phx.gbl...
>looking for the best approach to reading a real time log file. The file
gets updated pretty quickly, always appended to the end.

do i really need to just keep re-opening the file and reading the end of
it?

I don't know why you'd need to reopen. Your position may be invalid after
you hit end-of-file, so just seek to position you last got data from.
There's something like FileSystemWatcher which should tell you whenever
the size changes.


Jul 19 '07 #3

"Justin Rich" <jr******@yahoo.spam.comwrote in message
news:%2****************@TK2MSFTNGP03.phx.gbl...
>I guess i dont understand how the readers work.

I guess i assumed the reader was much like NotePad... when you open the
file, i copy is put in to memory and you work off that. if the original
file is changed it wont reflect that until you close and reopen notepad.

are you saying that if i open a stream of that file (some more code based
info might be good as to how i should open the file) that it will allow me
to read to EOF and then once it reaches EOF i can basically try to read in
a line until i no longer get EOF and i actually get some new data.

I havent put a lot of thought in to this... i just know that ive never
done such a thing and that i had ASSUMED that the basic stream readers
read the entire file to memory first, if im wrong this probably wont be so
hard.
File I/O is nowhere near that wasteful. The reader classes might do some
caching, to do fewer block reads. Seeking to the appropriate position
should cause the reader to flush its cache, if any. Actually the reader
might not support seeking, only the underlying file, so you would have to
close the reader, seek the file, and create a new reader. If you need a
reader at all, that is, FileStream might be enough.
Jul 19 '07 #4
On Thu, 19 Jul 2007 10:04:30 -0700, Ben Voigt [C++ MVP]
<rb*@nospam.nospam> wrote:
> File I/O is nowhere near that wasteful. The reader classes might do some
> caching, to do fewer block reads.

They do, at least some of them. For example, StreamReader.

> Seeking to the appropriate position
> should cause the reader to flush its cache, if any.

This is not true. If the data at the position being seeked to has already
been buffered, the cache isn't flushed. That's why StreamReader has the
DiscardBufferedData() method. You can't rely on seeking to update the
reader's buffers for you.

> Actually the reader might not support seeking,

Actually, I'm not aware of any reader that does support seeking
explicitly. For example, for StreamReader you need to use the seeking
mechanisms on the base stream. This is one of the reasons that the reader
itself doesn't flush its buffers automatically...it has no practical way
to know that it needs to do that.

> only the underlying file, so you would have to
> close the reader, seek the file, and create a new reader.

I'm not aware of any file-based streams that don't support seeking.
However, yes...if you have a different kind of reader that is based on a
stream that doesn't support seeking (for example, a StringReader, or a
StreamReader that's using a NetworkStream), you can't seek.

> If you need a
> reader at all, that is, FileStream might be enough.

I find it useful to use a StreamReader when I know that my FileStream is
from a text file. No, it's not strictly required, but then none of the
reader classes are. But yes, if you want to seek a reader, you have to
use the underlying i/o class to do that, assuming it's possible at all.
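
Peter's point about the reader's cache can be sketched like this; a minimal example (class name illustrative) that keeps one StreamReader over a FileStream, seeks the base stream, and calls DiscardBufferedData() so the reader doesn't serve stale bytes:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public class FollowingTextReader : IDisposable
{
    readonly FileStream fs;
    readonly StreamReader reader;
    long lastPos;   // where the previous read stopped

    public FollowingTextReader(string path)
    {
        // Share the file with the writer instead of reopening each time.
        fs = new FileStream(path, FileMode.Open,
                 FileAccess.Read, FileShare.ReadWrite);
        reader = new StreamReader(fs, Encoding.UTF8);
    }

    // Return lines appended since the last call.
    public List<string> ReadNewLines()
    {
        fs.Seek(lastPos, SeekOrigin.Begin);
        reader.DiscardBufferedData();   // the seek alone won't clear the cache
        var lines = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
            lines.Add(line);
        lastPos = fs.Position;
        return lines;
    }

    public void Dispose()
    {
        reader.Dispose();   // also closes the underlying FileStream
    }
}
```

One caveat: a partially written last line (no newline yet) would be returned incomplete by this sketch; real code might only consume lines that end in a newline.
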

The one concern I might have regarding all this is the behavior of the
stream and/or reader class upon reaching the end of the stream. For
example, for sure a NetworkStream is simply unusable once you've read to
the end of the stream; the network connection is closed.

Obviously this doesn't specifically apply to file-based streams, but the
OP should definitely double-check to make sure that once he's reached the
end of the stream, the stream and/or reader class doesn't just set some
internal state and never return any more data from beyond that point. I
would _guess_ that it wouldn't do that, but since I haven't done this
specifically, I don't want to say for sure that it's okay.

Pete
Jul 19 '07 #5

"Peter Duniho" <Np*********@nnowslpianmk.comwrote in message
news:op***************@petes-computer.local...
On Thu, 19 Jul 2007 10:04:30 -0700, Ben Voigt [C++ MVP]
<rb*@nospam.nospamwrote:
>> File I/O is nowhere near that wasteful. The reader classes might do some
>> caching, to do fewer block reads.
>
> They do, at least some of them. For example, StreamReader.

StreamReader reads the whole file into memory? From a cursory glance
through Reflector, it uses block reads but doesn't grab the whole file.

>> Seeking to the appropriate position
>> should cause the reader to flush its cache, if any.
>
> This is not true. If the data at the position being seeked to has already
> been buffered, the cache isn't flushed. That's why StreamReader has the
> DiscardBufferedData() method. You can't rely on seeking to update the
> reader's buffers for you.
>
>> Actually the reader might not support seeking,
>
> Actually, I'm not aware of any reader that does support seeking
> explicitly. For example, for StreamReader you need to use the seeking
> mechanisms on the base stream. This is one of the reasons that the reader
> itself doesn't flush its buffers automatically...it has no practical way
> to know that it needs to do that.

If the reader doesn't support a seek operation, then what do you mean by "If
the data at the position being seeked to has already been buffered, the
cache isn't flushed." Apparently then the cache isn't flushed, ever.

>> only the underlying file, so you would have to
>> close the reader, seek the file, and create a new reader.
>
> I'm not aware of any file-based streams that don't support seeking.
> However, yes...if you have a different kind of reader that is based on a
> stream that doesn't support seeking (for example, a StringReader, or a
> StreamReader that's using a NetworkStream), you can't seek.
>
>> If you need a
>> reader at all, that is, FileStream might be enough.
>
> I find it useful to use a StreamReader when I know that my FileStream is
> from a text file. No, it's not strictly required, but then none of the
> reader classes are. But yes, if you want to seek a reader, you have to
> use the underlying i/o class to do that, assuming it's possible at all.
>
> The one concern I might have regarding all this is the behavior of the
> stream and/or reader class upon reaching the end of the stream. For
> example, for sure a NetworkStream is simply unusable once you've read to
> the end of the stream; the network connection is closed.
>
> Obviously this doesn't specifically apply to file-based streams, but the
> OP should definitely double-check to make sure that once he's reached the
> end of the stream, the stream and/or reader class doesn't just set some
> internal state and never return any more data from beyond that point. I
> would _guess_ that it wouldn't do that, but since I haven't done this
> specifically, I don't want to say for sure that it's okay.
>
> Pete

Jul 19 '07 #6
On Thu, 19 Jul 2007 10:51:56 -0700, Ben Voigt [C++ MVP]
<rb*@nospam.nospam> wrote:
>>> File I/O is nowhere near that wasteful. The reader classes might do
>>> some caching, to do fewer block reads.
>>
>> They do, at least some of them. For example, StreamReader.
>
> StreamReader reads the whole file into memory? From a cursory glance
> through Reflector, it uses block reads but doesn't grab the whole file.

No. By "they do" I mean "they do just what you said". That is "do some
caching, to do fewer block reads".

> If the reader doesn't support a seek operation, then what do you mean by
> "If the data at the position being seeked to has already been buffered,
> the cache isn't flushed." Apparently then the cache isn't flushed, ever.

I suppose it depends on how you look at it. To my way of looking at it,
"the cache is flushed" just before new data is read in. As for the
dependency on seeking, I never wrote anything that should be construed to
mean that the reader itself is doing the seeking. Simply that the seeking
is being done.

The fact that the seeking isn't done by the reader not only isn't mutually
exclusive with the statement that the reader's buffer isn't flushed, it's
the explanation as to why it isn't.

Pete
Jul 19 '07 #7

"Peter Duniho" <Np*********@nnowslpianmk.comwrote in message
news:op***************@petes-computer.local...
On Thu, 19 Jul 2007 10:51:56 -0700, Ben Voigt [C++ MVP]
<rb*@nospam.nospamwrote:
>>>> File I/O is nowhere near that wasteful. The reader classes might do
>>>> some caching, to do fewer block reads.
>>>
>>> They do, at least some of them. For example, StreamReader.
>>
>> StreamReader reads the whole file into memory? From a cursory glance
>> through Reflector, it uses block reads but doesn't grab the whole file.
>
> No. By "they do" I mean "they do just what you said". That is "do some
> caching, to do fewer block reads".
>
>> If the reader doesn't support a seek operation, then what do you mean by
>> "If the data at the position being seeked to has already been buffered,
>> the cache isn't flushed." Apparently then the cache isn't flushed, ever.
>
> I suppose it depends on how you look at it. To my way of looking at it,
> "the cache is flushed" just before new data is read in. As for the
> dependency on seeking, I never wrote anything that should be construed to
> mean that the reader itself is doing the seeking. Simply that the seeking
> is being done.
>
> The fact that the seeking isn't done by the reader not only isn't mutually
> exclusive with the statement that the reader's buffer isn't flushed, it's
> the explanation as to why it isn't.

Ok, I was just confused because you said "If the data at the position being
seeked to has already been buffered", and there clearly isn't any such test,
if the reader isn't aware of the seek.

> Pete

Jul 19 '07 #8
I had considered, in light of some of this info, that it might be best to
just rename the file (most logging apps will auto-create a new log file)
and then process that data and archive it.

In doing this I won't really have the most up-to-date info that I would
have monitoring the file as a stream, but what's 5 seconds, really?

Really, all I'm trying to accomplish is log monitoring and consolidation. I
have a handful of web servers, SQL servers, 3rd-party apps, and system logs.

It is rather troublesome to keep track of all of these, and a lot of things
get passed over due to time and the complexity of some of these logs.

I would say the largest log I have is roughly 250MB a day.

What I would like to do is set up a config file that will have a regular
expression for processing the log, and some info as to how to store it in a
DB.

Can you intelligent people give me some suggestions as to some best
practices to approach this? :-D
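
One possible shape for the regex-plus-DB-mapping config idea (all names, groups, and patterns here are hypothetical, just to illustrate named capture groups mapping onto columns):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical config entry: one regex per log source, whose named
// capture groups map straight onto database columns.
public class LogRule
{
    public string Name;
    public Regex Pattern;
    public string[] Columns;   // named groups to persist
}

public static class LogParser
{
    // Apply a rule to one line: returns column -> value, or null on no match.
    public static Dictionary<string, string> Parse(LogRule rule, string line)
    {
        Match m = rule.Pattern.Match(line);
        if (!m.Success)
            return null;
        var row = new Dictionary<string, string>();
        foreach (string col in rule.Columns)
            row[col] = m.Groups[col].Value;
        return row;
    }
}
```

Each returned row could then feed a parameterized INSERT; the per-log config file would just supply the pattern string and the column list.
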

Thanks
Justin

"Ben Voigt [C++ MVP]" <rb*@nospam.nospamwrote in message
news:OD**************@TK2MSFTNGP06.phx.gbl...
> "Peter Duniho" <Np*********@nnowslpianmk.com> wrote in message
> news:op***************@petes-computer.local...
>> On Thu, 19 Jul 2007 10:51:56 -0700, Ben Voigt [C++ MVP]
>> <rb*@nospam.nospam> wrote:
>>>>> File I/O is nowhere near that wasteful. The reader classes might do
>>>>> some caching, to do fewer block reads.
>>>>
>>>> They do, at least some of them. For example, StreamReader.
>>>
>>> StreamReader reads the whole file into memory? From a cursory glance
>>> through Reflector, it uses block reads but doesn't grab the whole file.
>>
>> No. By "they do" I mean "they do just what you said". That is "do some
>> caching, to do fewer block reads".
>>
>>> If the reader doesn't support a seek operation, then what do you mean by
>>> "If the data at the position being seeked to has already been buffered,
>>> the cache isn't flushed." Apparently then the cache isn't flushed, ever.
>>
>> I suppose it depends on how you look at it. To my way of looking at it,
>> "the cache is flushed" just before new data is read in. As for the
>> dependency on seeking, I never wrote anything that should be construed to
>> mean that the reader itself is doing the seeking. Simply that the
>> seeking is being done.
>>
>> The fact that the seeking isn't done by the reader not only isn't
>> mutually exclusive with the statement that the reader's buffer isn't
>> flushed, it's the explanation as to why it isn't.
>
> Ok, I was just confused because you said "If the data at the position
> being seeked to has already been buffered", and there clearly isn't any
> such test, if the reader isn't aware of the seek.
>
>> Pete

Jul 20 '07 #9
In article <uy**************@TK2MSFTNGP06.phx.gbl>, Justin Rich wrote:
> looking for the best approach to reading a real time log file. The file
> gets updated pretty quickly, always appended to the end.
>
> do i really need to just keep re-opening the file and reading the end of it?

For a similar need, I have a log viewer that uses the FileSystemWatcher to
learn when the file is updated then opens the file and reads the new part of
the file. I've put most of the logic into a Stream subclass (FollowingStream)
that hides much of this from the viewing application. This allows me to use
an XmlReader to load the data. When the application switches to a new log file
(renames the old file then creates a new one), the FollowingStream notices that
the file is smaller and starts reading the new file from the beginning.
Unfortunately, it's quite possible that some data at the end of the old log
file is missed. As my log files are XML based, this has meant that the viewer
needed to help with recovery from this situation.
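
Mike's FollowingStream itself isn't shown, but the shrink-detection idea might be sketched roughly like this (a simplified polling version, not his actual class):

```csharp
using System;
using System.IO;
using System.Text;

public class FollowingReader
{
    readonly string path;
    long position;   // offset into the current file

    public FollowingReader(string path) { this.path = path; }

    // Poll once: return any new text. If the file has shrunk, assume it
    // was rotated (renamed and recreated) and restart from offset zero.
    // Note: anything appended to the old file after the last poll is lost,
    // which is the recovery problem described above.
    public string Poll()
    {
        using (var fs = new FileStream(path, FileMode.Open,
                   FileAccess.Read, FileShare.ReadWrite))
        {
            if (fs.Length < position)
                position = 0;   // smaller file => a new log has started
            fs.Seek(position, SeekOrigin.Begin);
            var buf = new byte[fs.Length - position];
            int n = fs.Read(buf, 0, buf.Length);
            position += n;
            return Encoding.UTF8.GetString(buf, 0, n);
        }
    }
}
```
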

Mike

Jul 21 '07 #10
