What is the best way to implement "tail"?

What is the best way to implement "tail -f" in C or C++ with higher
performance than either the Unix shell command "tail -f" or Perl's
File::Tail? Any suggestions appreciated. Thanks.

Sep 19 '07 #1
Owen Zhang <ow***************@gmail.com> writes:
What is the best way to implement "tail -f" in C or C++ and higher
performance compared to either unix shell command "tail -f" or perl
File::Tail ? Any suggestion appreciated. Thanks.
What performance problems have you observed with these
implementations?
--
Ben Pfaff
http://benpfaff.org
Sep 19 '07 #2
On 2007-09-19 19:41, Owen Zhang wrote:
What is the best way to implement "tail -f" in C or C++ and higher
performance compared to either unix shell command "tail -f" or perl
File::Tail ? Any suggestion appreciated. Thanks.
You would need to use platform specific functions to do that, ask in a
group discussing programming on your platform.

--
Erik Wikström
Sep 19 '07 #3
On Sep 19, 2:11 pm, Erik Wikström <Erik-wikst...@telia.com> wrote:
On 2007-09-19 19:41, Owen Zhang wrote:
What is the best way to implement "tail -f" in C or C++ and higher
performance compared to either unix shell command "tail -f" or perl
File::Tail ? Any suggestion appreciated. Thanks.

You would need to use platform specific functions to do that, ask in a
group discussing programming on your platform.
Nonsense. It can be done (disregarding questions of efficiency) in ISO
standard C.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void WaitFiveSecs(void)
{
    clock_t now, later;

    now = clock();
    do
        later = clock();
    while (((later-now)/CLOCKS_PER_SEC) < 5);
}

int main(void)
{
    int datum;

    for(;;)
    {
        if ((datum = getchar()) != EOF)
            putchar(datum);
        else WaitFiveSecs();
    }

    return EXIT_SUCCESS;
}

--
Erik Wikström

Sep 19 '07 #4
On Sep 19, 11:11 am, Erik Wikström <Erik-wikst...@telia.com> wrote:
On 2007-09-19 19:41, Owen Zhang wrote:
What is the best way to implement "tail -f" in C or C++ and higher
performance compared to either unix shell command "tail -f" or perl
File::Tail ? Any suggestion appreciated. Thanks.

You would need to use platform specific functions to do that, ask in a
group discussing programming on your platform.
I think it could be implemented purely in ANSI C without much
difficulty.

A super-simple (stinky) implementation would just read into a ring
buffer of size specified by the user, overwriting as it goes.
When the file has been read, spew out the lines still in the buffer.

Now, suppose we want something a bit smarter.

We could create a static array of characters with 1000 characters/row
and 1000 rows.
Next, we seek to the end of the file, and back up one megabyte.
Next we fgets from the current position into our ring buffer,
overwriting as we go until we hit the end.
If any row exceeds 1000 characters, signal an error.
When we hit the end of the file, just cough up whatever is in the ring
buffer.

I'm sure that there are better ways to do it in standard C, but off
the top of my head I think it would work.
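
The ring-buffer idea above can be sketched in standard C. This is only a minimal illustration under stated assumptions, not what tail(1) actually does: the names tail_lines and MAX_LINE are made up for the example, it reads the stream from the start instead of seeking back one megabyte from the end, and it reports an error for any row longer than 1000 characters, as the post suggests.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 1000   /* per-row limit taken from the post; an assumption */

/* Copy the last `rows` lines of `in` to `out`.
 * Returns 0 on success, -1 on allocation failure or an over-long row. */
int tail_lines(FILE *in, FILE *out, size_t rows)
{
    char (*ring)[MAX_LINE + 2];   /* room for the text, '\n' and '\0' */
    size_t count = 0;             /* lines read so far */
    size_t i, kept, start;

    assert(in != NULL && out != NULL && rows > 0);

    ring = malloc(rows * sizeof *ring);
    if (ring == NULL)
        return -1;

    /* Each new line overwrites the oldest slot in the ring. */
    while (fgets(ring[count % rows], (int)sizeof ring[0], in) != NULL) {
        char *line = ring[count % rows];
        size_t len = strlen(line);

        /* A completely full slot with no newline means the row was too long. */
        if (len == sizeof ring[0] - 1 && line[len - 1] != '\n') {
            free(ring);
            return -1;
        }
        count++;
    }

    /* Print the surviving lines, oldest first. */
    kept = count < rows ? count : rows;
    start = count - kept;
    for (i = 0; i < kept; i++)
        fputs(ring[(start + i) % rows], out);

    free(ring);
    return 0;
}
```

A call such as tail_lines(stdin, stdout, 10) would print the last ten lines of standard input. Memory use is fixed by the row count, at the cost of rejecting long lines.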

On the other hand, I guess that the UNIX shell command tail is
implemented in C and a simple hack job is not going to do better.
For that matter, the Perl version probably just calls the system tail
function or implements it in C underneath the covers anyway.

So I guess that it will be an exercise in futility.
Sep 19 '07 #5
On Sep 19, 3:06 pm, user923005 <dcor...@connx.com> wrote:
On Sep 19, 11:11 am, Erik Wikström <Erik-wikst...@telia.com> wrote:
On 2007-09-19 19:41, Owen Zhang wrote:
What is the best way to implement "tail -f" in C or C++ and higher
performance compared to either unix shell command "tail -f" or perl
File::Tail ? Any suggestion appreciated. Thanks.
You would need to use platform specific functions to do that, ask in a
group discussing programming on your platform.

I think it could be implemented purely in ANSI C without much
difficulty.

A super-simple (stinky) implementation would just read into a ring
buffer of size specified by the user, overwriting as it goes.
When the file has been read, spew out the lines still in the buffer.
The minimum implementation of tail(1) on Unix depends on knowledge of
an admittedly platform-specific behaviour of the standard C I/O
library. On Unix, EOF is not necessarily a permanent and final
condition, and the implementation of the standard library acknowledges
this. I believe that this falls within a gray area of the C standard,
being an implementation-defined behaviour.

Knowing that EOF is possibly a temporary condition, programs can
choose to ignore the EOF indication on a file, and attempt to continue
reading it. tail(1) (when invoked with the -f "follow" flag)
disregards the EOF indicator, waits a bit (for the process that is
writing the file to write some more, and thus extend the file past
its current end-of-file condition), and then attempts to read more
data from the file. No ring buffer necessary.

However, some sort of "out of band" condition or signal must be used
to tell the program to stop reading the file. Typically, this would be
done with an implicit or explicit signal handler that would terminate
the program when a suitable signal is presented to it. SIGINT does
nicely here.
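
A rough sketch of that follow loop, using only the standard signal() interface. The names install_sigint_handler, follow and the idle callback are hypothetical, chosen for this example; the stop flag is checked only when end-of-file is reached, so a file that grows faster than it can be read would delay the exit. Whether a read after EOF returns new data depends on the platform, as noted above.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>

static volatile sig_atomic_t stop_requested = 0;

static void on_sigint(int signum)
{
    (void)signum;
    stop_requested = 1;   /* only set a flag; the loop does the real work */
}

/* Install the handler. Returns 0 on success, -1 on failure. */
int install_sigint_handler(void)
{
    return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
}

/* Copy `in` to `out`, retrying at end-of-file until SIGINT has been seen.
 * `idle` is an optional caller-supplied pause between retries (may be NULL,
 * in which case the loop polls continuously).
 * Returns the number of characters copied. */
long follow(FILE *in, FILE *out, void (*idle)(void))
{
    long copied = 0;
    int c;

    assert(in != NULL && out != NULL);

    for (;;) {
        c = getc(in);
        if (c != EOF) {
            putc(c, out);
            copied++;
            continue;
        }
        fflush(out);              /* show what we have before waiting */
        if (stop_requested || ferror(in))
            break;
        clearerr(in);             /* treat end-of-file as temporary */
        if (idle != NULL)
            idle();
    }
    return copied;
}
```

A program would call install_sigint_handler() once and then follow(stdin, stdout, some_pause_function), leaving the choice of how to wait to the caller.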

Sep 19 '07 #6
On Wed, 19 Sep 2007 11:52:02 -0700,
Lew Pitcher <lp******@teksavvy.com> wrote:
On Sep 19, 2:11 pm, Erik Wikström <Erik-wikst...@telia.com> wrote:
>On 2007-09-19 19:41, Owen Zhang wrote:
What is the best way to implement "tail -f" in C or C++ and higher
performance compared to either unix shell command "tail -f" or perl
File::Tail ? Any suggestion appreciated. Thanks.

You would need to use platform specific functions to do that, ask in a
group discussing programming on your platform.

Nonsense. It can be done (disregarding questions of efficiency) in ISO
standard C.
Not if you also take into account that one of the requirements is that
you need to have 'higher performance' than the other tools mentioned. To get that
sort of performance, you generally need to resort to platform-specific
tricks, or use functions or assumptions that fall outside of ISO
standard.

Martien
--
|
Martien Verbruggen | The problem with sharks is that they are too
| large to get to the shallow end of the gene
| pool. -- Scott R. Godin
Sep 19 '07 #7
On Wed, 19 Sep 2007 11:52:02 -0700, Lew Pitcher wrote:
On Sep 19, 2:11 pm, Erik Wikström <Erik-wikst...@telia.com> wrote:
>On 2007-09-19 19:41, Owen Zhang wrote:
What is the best way to implement "tail -f" in C or C++ and higher
performance compared to either unix shell command "tail -f" or perl
File::Tail ? Any suggestion appreciated. Thanks.

You would need to use platform specific functions to do that, ask in a
group discussing programming on your platform.

Nonsense. It can be done (disregarding questions of efficiency) in ISO
standard C.
[snip]
for(;;)
{
if ((datum = getchar()) != EOF)
putchar(datum);
else WaitFiveSecs();
And in which way is this better than calling clearerr(stdin)?
}
--
Army1987 (Replace "NOSPAM" with "email")
If you're sending e-mail from a Windows machine, turn off Microsoft's
stupid “Smart Quotes” feature. This is so you'll avoid sprinkling garbage
characters through your mail. -- Eric S. Raymond and Rick Moen

Sep 19 '07 #8
Lew Pitcher wrote, On 19/09/07 19:52:
On Sep 19, 2:11 pm, Erik Wikström <Erik-wikst...@telia.com> wrote:
>On 2007-09-19 19:41, Owen Zhang wrote:
>>What is the best way to implement "tail -f" in C or C++ and higher
performance compared to either unix shell command "tail -f" or perl
File::Tail ? Any suggestion appreciated. Thanks.
You would need to use platform specific functions to do that, ask in a
group discussing programming on your platform.

Nonsense. It can be done (disregarding questions of efficiency) in ISO
standard C.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void WaitFiveSecs(void)
{
clock_t now, later;

now = clock();
do
later = clock();
while (((later-now)/CLOCKS_PER_SEC) < 5);
}

int main(void)
{
int datum;

for(;;)
{
if ((datum = getchar()) != EOF)
putchar(datum);
else WaitFiveSecs();
}

return EXIT_SUCCESS;
}
This is not the same as the Unix "tail -f" since you cannot apply it to
a file being written by another process.
--
Flash Gordon
Sep 19 '07 #9
Flash Gordon wrote:
Lew Pitcher wrote, On 19/09/07 19:52:
>On Sep 19, 2:11 pm, Erik Wikström <Erik-wikst...@telia.com> wrote:
>>On 2007-09-19 19:41, Owen Zhang wrote:

What is the best way to implement "tail -f" in C or C++ and higher
performance compared to either unix shell command "tail -f" or perl
File::Tail ? Any suggestion appreciated. Thanks.
You would need to use platform specific functions to do that, ask in a
group discussing programming on your platform.

Nonsense. It can be done (disregarding questions of efficiency) in ISO
standard C.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void WaitFiveSecs(void)
{
clock_t now, later;

now = clock();
do
later = clock();
while (((later-now)/CLOCKS_PER_SEC) < 5);
}

int main(void)
{
int datum;

for(;;)
{
if ((datum = getchar()) != EOF)
putchar(datum);
else WaitFiveSecs();
}

return EXIT_SUCCESS;
}

This is not the same as the Unix "tail -f" since you cannot apply it to
a file being written by another process.
Hmmm..... No.

lpitcher@merlin:~/code/mytail$ head -10 mytail.c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void WaitFiveSecs(void)
{
clock_t now, later;

now = clock();
do
lpitcher@merlin:~/code/mytail$ cc -o mytail mytail.c
lpitcher@merlin:~/code/mytail$ mytail </var/log/messages
Sep 19 04:40:02 merlin syslogd 1.4.1: restart.
Sep 19 05:08:52 merlin -- MARK --
Sep 19 05:28:52 merlin -- MARK --
Sep 19 05:48:52 merlin -- MARK --
Sep 19 06:08:52 merlin -- MARK --
Sep 20 '07 #10
Erik Wikström wrote:
Owen Zhang wrote:
>What is the best way to implement "tail -f" in C or C++ and
higher performance compared to either unix shell command "tail
-f" or perl File::Tail ? Any suggestion appreciated. Thanks.

You would need to use platform specific functions to do that, ask
in a group discussing programming on your platform.
Implementing the -f requires abilities outside of standard C. For
the rest of it, a simple method would be to read the file with
ggets and save the last N pointers returned (not forgetting to free
one when overwriting an earlier one). See ggets.zip on:

<http://cbfalconer.home.att.net/download/>
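
The "save the last N pointers" method might look roughly like this. ggets itself is not part of standard C, so the sketch uses a hypothetical stand-in, read_line, which returns each line in freshly allocated memory; tail_pointers is likewise a name invented for the example, and the real ggets interface may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for ggets: read one line of any length into
 * allocated memory, without the trailing newline.
 * Returns NULL at end of input or if memory runs out. */
static char *read_line(FILE *in)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap), *tmp;
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = getc(in)) != EOF && c != '\n') {
        if (len + 1 == cap) {             /* keep room for the '\0' */
            tmp = realloc(buf, cap *= 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {           /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/* Print the last n lines of `in` to `out`. Returns 0, or -1 on failure. */
int tail_pointers(FILE *in, FILE *out, size_t n)
{
    char **last, *line;
    size_t count = 0, i, kept;

    assert(in != NULL && out != NULL && n > 0);

    last = malloc(n * sizeof *last);
    if (last == NULL)
        return -1;
    for (i = 0; i < n; i++)
        last[i] = NULL;

    while ((line = read_line(in)) != NULL) {
        free(last[count % n]);            /* drop the line being overwritten */
        last[count % n] = line;
        count++;
    }

    kept = count < n ? count : n;
    for (i = 0; i < kept; i++) {
        fputs(last[(count - kept + i) % n], out);
        putc('\n', out);
    }

    for (i = 0; i < n; i++)
        free(last[i]);
    free(last);
    return 0;
}
```

Unlike a fixed array of rows, this places no limit on line length, and at any moment only the N most recent lines are held in memory.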

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Sep 20 '07 #11
Lew Pitcher wrote:
Erik Wikström <Erik-wikst...@telia.com> wrote:
>Owen Zhang wrote:
>>What is the best way to implement "tail -f" in C or C++ and
higher performance compared to either unix shell command "tail
-f" or perl File::Tail ? Any suggestion appreciated. Thanks.

You would need to use platform specific functions to do that, ask
in a group discussing programming on your platform.

Nonsense. It can be done (disregarding questions of efficiency) in
ISO standard C.
Yes, but your method eats up the whole machine, thus preventing any
other process from updating the input file. Not really practical.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>


Sep 20 '07 #12
Lew Pitcher wrote, On 20/09/07 01:27:
Flash Gordon wrote:
>Lew Pitcher wrote, On 19/09/07 19:52:
>>On Sep 19, 2:11 pm, Erik Wikström <Erik-wikst...@telia.com> wrote:
On 2007-09-19 19:41, Owen Zhang wrote:

What is the best way to implement "tail -f" in C or C++ and higher
performance compared to either unix shell command "tail -f" or perl
File::Tail ? Any suggestion appreciated. Thanks.
You would need to use platform specific functions to do that, ask in a
group discussing programming on your platform.
Nonsense. It can be done (disregarding questions of efficiency) in ISO
standard C.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void WaitFiveSecs(void)
{
clock_t now, later;

now = clock();
do
later = clock();
while (((later-now)/CLOCKS_PER_SEC) < 5);
}

int main(void)
{
int datum;

for(;;)
{
if ((datum = getchar()) != EOF)
putchar(datum);
else WaitFiveSecs();
}

return EXIT_SUCCESS;
}
This is not the same as the Unix "tail -f" since you cannot apply it to
a file being written by another process.

Hmmm..... No.

lpitcher@merlin:~/code/mytail$ head -10 mytail.c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void WaitFiveSecs(void)
{
clock_t now, later;

now = clock();
do
lpitcher@merlin:~/code/mytail$ cc -o mytail mytail.c
lpitcher@merlin:~/code/mytail$ mytail </var/log/messages
Sep 19 04:40:02 merlin syslogd 1.4.1: restart.
Sep 19 05:08:52 merlin -- MARK --
Sep 19 05:28:52 merlin -- MARK --
Sep 19 05:48:52 merlin -- MARK --
Sep 19 06:08:52 merlin -- MARK --
.
.
.
Sep 19 20:23:24 merlin popa3d[28471]: 1 (835) deleted, 0 (0) left
Sep 19 20:25:18 merlin sshd[28533]: Accepted publickey for lpitcher from
192.168.11.2 port 32882 ssh2

Seems to work fine for me
Now do it on a file that is not updated for a long time, where "long
time" is defined as long enough for the end of the file to be reached
before the file is extended.
--
Flash Gordon
Sep 20 '07 #13
>On 2007-09-19 19:41, Owen Zhang wrote:
>>>What is the best way to implement "tail -f" [with emphasis on]
higher performance ...
>On Sep 19, 2:11 pm, Erik Wikström <Erik-wikst...@telia.com> wrote:
>>You would need to use platform specific functions ...
In article <11**********************@y42g2000hsy.googlegroups.com>
Lew Pitcher <lp******@teksavvy.com> wrote:
>Nonsense. It can be done (disregarding questions of efficiency) in ISO
standard C.
Indeed; but as I noted with a little editing above, "questions of
efficiency" seem to be central to the original poster.

[snippage]
for(;;)
{
if ((datum = getchar()) != EOF)
putchar(datum);
else WaitFiveSecs();
}
This fails on any system using my stdio, because you neglect to
call clearerr(stdin). As soon as the first EOF occurs, a "sticky"
EOF flag is set on the underlying stream, and further attempts to
read from it signal EOF again, without asking the underlying OS
for more bytes. Using clearerr() on the stream resets this "sticky"
flag, so that subsequent attempts to read from the stream will ask
the OS for more bytes.

(Implementations are allowed, but not required, to have this sort
of "sticky EOF" behavior. Stdio implementations vary. I
made my behavior depend upon a single line in the stdio source code,
so that implementors could change it if desired.)
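
To make the clearerr() point concrete, here is a small sketch of a helper that fetches whatever has been appended since the previous read. The name read_appended is an assumption made for this example, and the approach only helps on systems where a file can grow after end-of-file has been reported. Note that clearerr() also resets the error indicator, so a caller that cares about read errors should check ferror() afterwards.

```c
#include <assert.h>
#include <stdio.h>

/* Read up to `cap` bytes that were appended since the last call.
 * Returns the number of bytes stored in `buf` (0 if nothing is new). */
size_t read_appended(FILE *in, char *buf, size_t cap)
{
    assert(in != NULL && buf != NULL);

    /* Reset the sticky end-of-file indicator so that the next read
     * asks the operating system for more bytes. */
    clearerr(in);
    return fread(buf, 1, cap, in);
}
```

Calling it repeatedly on the same stream, with a pause between calls, gives the basic polling behaviour under discussion.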
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Sep 20 '07 #14
In article <fc********@news1.newsguy.com>,
Chris Torek <no****@torek.net> wrote:
>(Implementations are allowed, but not required, to have this sort
of "sticky EOF" behavior.
C99 says, under fgetc():

If the end-of-file indicator [...] is set, or if the stream is at
end-of-file, the end-of-file indicator [...] is set and the fgetc
function returns EOF.

C90 is less clear: it says:

If the stream is at end-of-file, the end-of-file indicator [...] is
set and the fgetc function returns EOF.

So it seems to have been changed to clarify that sticky behaviour is
required.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Sep 20 '07 #15
On Sep 20, 10:01 am, rich...@cogsci.ed.ac.uk (Richard Tobin) wrote:
[snip]
Anyway, what is the point of the WaitFiveSecs function? If you're
going to busy-wait, why not just keep calling getchar()?
Well, I wasn't going to busy-wait, but I couldn't find a sleep()
function in C90. Of course, you are right; repeated calling of
getchar() would have been better than either sleep() or a busy-wait
loop. (FWIW, I hoped that the clock() call might actually perform some
sort of system-dependent wait itself, thus reducing the overhead of
the busy-wait loop.)
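
For comparison, a wait that does not spin needs something outside ISO C. The sketch below assumes a POSIX system and uses nanosleep(); the wrapper name wait_millis is invented for the example. It is worth noting that clock() reports processor time used by the program, so a loop built on it has to keep the CPU busy for the whole interval.

```c
#define _POSIX_C_SOURCE 199309L   /* request the POSIX declaration of nanosleep */

#include <assert.h>
#include <errno.h>
#include <time.h>

/* Sleep for about `ms` milliseconds without busy-waiting.
 * Returns 0 on success, -1 on an error other than an interrupted sleep. */
int wait_millis(long ms)
{
    struct timespec req;

    assert(ms >= 0);
    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;

    /* If a signal interrupts the sleep, resume with the time remaining. */
    while (nanosleep(&req, &req) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}
```

This trades portability for an idle CPU, which is the compromise the earlier replies about platform-specific functions were pointing at.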

Sep 20 '07 #16
On Thu, 20 Sep 2007 10:07:08 +0000, Richard Tobin wrote:
In article <fc********@news1.newsguy.com>,
Chris Torek <no****@torek.net> wrote:
>>(Implementations are allowed, but not required, to have this sort
of "sticky EOF" behavior.

C99 says, under fgetc():

If the end-of-file indicator [...] is set, or if the stream is at
end-of-file, the end-of-file indicator [...] is set and the fgetc
function returns EOF.

C90 is less clear: it says:

If the stream is at end-of-file, the end-of-file indicator [...] is
set and the fgetc function returns EOF.

So it seems to have been changed to clarify that sticky behaviour is
required.
"Is set" can be interpreted as the passive of "set" rather than
as "is" and an adjective. If the sentence is taken to mean "If the
fgetc() sets the end-of-file indicator, it returns EOF", the
sticky behaviour is not required.

--
Army1987 (Replace "NOSPAM" with "email")
If you're sending e-mail from a Windows machine, turn off Microsoft's
stupid “Smart Quotes” feature. This is so you'll avoid sprinkling garbage
characters through your mail. -- Eric S. Raymond and Rick Moen

Sep 21 '07 #17
In article <pa****************************@NOSPAM.it>,
Army1987 <ar******@NOSPAM.it> wrote:
>C99 says, under fgetc():

If the end-of-file indicator [...] is set, or if the stream is at
end-of-file, the end-of-file indicator [...] is set and the fgetc
function returns EOF.
>"Is set" can be interpreted as the passive of "set" rather than
as "is" and an adjective. If the sentence is taken to mean "If the
fgetc() sets the end-of-file indicator, it returns EOF", the
sticky behaviour is not required.
That makes no sense, because it would mean "if fgetc() set the
end-of-file indicator ... fgetc() sets the end-of-file indicator and
returns EOF".

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Sep 21 '07 #18
On Fri, 21 Sep 2007 16:52:54 +0000, Richard Tobin wrote:
In article <pa****************************@NOSPAM.it>,
Army1987 <ar******@NOSPAM.it> wrote:
[ISO/IEC 9899 7.19.7.1#3]
>>"Is set" can be interpreted as the passive of "set" rather than
as "is" and an adjective. If the sentence is taken to mean "If the
fgetc() sets the end-of-file indicator, it returns EOF", the
sticky behaviour is not required.

That makes no sense, because it would mean "if fgetc() set the
end-of-file indicator ... fgetc() sets the end-of-file indicator and
returns EOF".
But neither would your interpretation: if it is already set, why
set it again? But wait... "If the end-of-file indicator for the
stream is set,[...] the end-of-file indicator for the stream is
set[...]". They can't be the same proposition, can they? OMG...
(The answer is in the paragraph above. You are correct.)

--
Army1987 (Replace "NOSPAM" with "email")
If you're sending e-mail from a Windows machine, turn off Microsoft's
stupid “Smart Quotes” feature. This is so you'll avoid sprinkling garbage
characters through your mail. -- Eric S. Raymond and Rick Moen

Sep 21 '07 #19
In article <pa****************************@NOSPAM.it>,
Army1987 <ar******@NOSPAM.it> wrote:
>But neither would your interpretation, if it is already set why
set it again?
You might think that, but then that's just the question - is it sticky?
>But wait... "If the end-of-file indicator for the
stream is set,[...] the end-of-file indicator for the stream is
set[...]". They can't be the same proposition, can they?
Exactly. The sentence only makes sense if the two instances of "the
end-of-file indicator for the stream is set" are interpreted
differently. Writing standards unambiguously is hard.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Sep 21 '07 #20
ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
[...]
Writing standards unambiguously is hard.
Do you mean that writing unambiguous standards is hard, or do you mean
that writing standards is unambiguously hard?

8-)}

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 21 '07 #21
In article <ln************@nuthaus.mib.org>,
Keith Thompson <ks***@mib.org> wrote:
>Do you mean that writing unambiguous standards is hard, or do you mean
that writing standards is unambiguously hard?
Yes, that's exactly what I mean.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Sep 21 '07 #22
