Regarding: fgets() replacement

"Paul D. Boyle" <bo***@laue.chem.ncsu.edu> wrote:
There was a recent thread in this group which talked about the
shortcomings of fgets(). I decided to try my hand at writing a
replacement for fgets() using fgetc() and realloc() to read a line of
arbitrary length. I know that the better programmers in this group could
write a more robust function, but here is my shot at it anyway.
I would appreciate people's comments on my fget_line() code below
(usage example included). Any constructive criticism welcome regarding
logic, design, style, etc. Thanks.

Your algorithm can end up calling realloc O(lengthof(input)) times.
This will end up shredding some of the existing heaps for some widely
deployed C compilers in existence today. I.e., just input enough
stuff and malloc/realloc will start grinding to a halt. It's not hard
to write up test scenarios to see this happen for yourself with 2 of
the most popular Windows C compilers. Doubling the buffer size at
each step is a simple way to avoid this problem, since it cuts the
number of realloc() calls to O(log(lengthof(input))) (it will also
improve performance by quite a bit.)
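
A minimal sketch of the doubling strategy, for illustration only. The name read_line_doubling() and the initial capacity of 64 are assumptions made here; this is not the original fget_line() and it leaves out the validate() callback.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line from fp, doubling the buffer whenever
 * it fills up. Returns a malloc'd, '\0'-terminated string without the
 * trailing '\n', or NULL on EOF with no data or on allocation failure.
 * If len_out is non-NULL it receives the number of characters stored. */
char *read_line_doubling(FILE *fp, size_t *len_out)
{
    size_t cap = 64;              /* initial capacity: an arbitrary choice */
    size_t len = 0;
    char *buf = malloc(cap);
    int ch;

    if (buf == NULL)
        return NULL;

    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        if (len + 1 >= cap) {     /* always keep one byte free for '\0' */
            size_t new_cap = cap * 2;   /* geometric growth */
            char *tmp;

            if (new_cap <= cap) { /* size_t overflow */
                free(buf);
                return NULL;
            }
            tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        buf[len++] = (char)ch;
    }

    if (ch == EOF && len == 0) {  /* nothing was read at all */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

With this shape a 1000-character line costs five realloc() calls (64 to 2048 bytes) instead of one per character or per small fixed increment.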

Your function also only supports the one semantic of reallocating the
memory required for whatever input is supplied. This isn't going to
work very well on systems like Linux/UNIX, which provide the "yes"
command, an endless source of input. (Consider "yes | ./a.out".) The
time required to fill in a whole swap-file's worth of data is pretty
long; so this is not a marginal situation if a hacker is trying to
slow down your system, for example.
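
One way to offer a second semantic is to let the caller impose an upper bound. The sketch below is an assumption about what such an interface could look like; read_line_capped(), its status codes, and the max_len parameter are hypothetical names, not part of the code under review.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical status codes for the sketch below. */
enum { LINE_OK = 0, LINE_EOF = 1, LINE_TOO_LONG = 2, LINE_NO_MEM = 3 };

/* Hypothetical interface: reads one line of at most max_len characters.
 * On LINE_OK, *out is a malloc'd '\0'-terminated string and *len_out its
 * length. On any other status, *out is NULL and *len_out is 0. */
int read_line_capped(FILE *fp, size_t max_len, char **out, size_t *len_out)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    int ch;

    *out = NULL;
    *len_out = 0;
    if (buf == NULL)
        return LINE_NO_MEM;

    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        if (len == max_len) {     /* refuse to grow past the caller's cap */
            free(buf);
            return LINE_TOO_LONG;
        }
        if (len + 1 >= cap) {     /* keep one byte free for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return LINE_NO_MEM;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }

    if (ch == EOF && len == 0) {
        free(buf);
        return LINE_EOF;
    }
    buf[len] = '\0';
    *out = buf;
    *len_out = len;
    return LINE_OK;
}
```

On LINE_TOO_LONG the rest of the oversized line is still in the stream (minus the character that triggered the check), so the caller has to decide whether to skip to the next '\n' or give up.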

Your function doesn't make any distinction between binary and text
files. The thing is -- if you read a text file in binary mode, you
can receive extraneous characters (like '\0's that would not be read
from a text file). That's all fine, since you return the length, so
the set of characters read is known (with the exception of the last
one, which might be rejected by "validate"). But, just like fgets(),
that puts your function at odds with the string functions in the
standard C library, which never allow an internal character of a
char * string to be '\0' (since that's the end-of-string marker).
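
Because the length comes back separately, a caller can at least detect the situation before handing the buffer to str*() functions. A small sketch, assuming len is the number of characters read, not counting the terminator; has_embedded_nul() is a name made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if any of the first len bytes of buf is
 * '\0', i.e. if strlen(buf) would report fewer than len characters. */
int has_embedded_nul(const char *buf, size_t len)
{
    return memchr(buf, '\0', len) != NULL;
}
```

A caller that gets 1 back can reject the line, or switch to the mem*() functions, which take explicit lengths.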

The validate() function you specify is a lot weaker than it could and
should be. First of all, if the input is fairly large, why not build
a table indexed by all characters and just check the table instead?
It would be a lot faster. Ok, the reason is because a callback
function can process a lot more context -- for example it might be
possible to parse simple grammars via state machine. The way you do
this is by passing an opaque context parameter to it (passed in to
fget_line()) as well as the character index. So it should be:

int (*validate)(int character, int idx, void * context)

And add in the additional parameter void * context to fget_line().
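
To illustrate what the context pointer buys you, here is a sketch of a callback with that signature that accepts a simple decimal number (optional sign, digits, at most one '.'). The return convention (nonzero means accept), struct number_ctx, and the accepted_prefix() driver are all assumptions for the example; the driver only stands in for the loop inside fget_line().

```c
#include <ctype.h>

/* Hypothetical per-call state, passed through the opaque context pointer. */
struct number_ctx {
    int seen_point;   /* has a '.' already been accepted? */
    int digits;       /* number of digits accepted so far */
};

/* Callback with the proposed signature. Assumed convention: nonzero = accept. */
int validate_decimal(int character, int idx, void *context)
{
    struct number_ctx *ctx = context;

    if (idx == 0 && (character == '+' || character == '-'))
        return 1;                      /* a sign is only legal up front */
    if (character == '.' && !ctx->seen_point) {
        ctx->seen_point = 1;           /* state change: no second '.' allowed */
        return 1;
    }
    if (isdigit((unsigned char)character)) {
        ctx->digits++;
        return 1;
    }
    return 0;
}

/* Hypothetical driver: returns how many leading characters of s are accepted. */
int accepted_prefix(const char *s,
                    int (*validate)(int character, int idx, void *context),
                    void *context)
{
    int idx = 0;

    while (s[idx] != '\0' && validate((unsigned char)s[idx], idx, context))
        idx++;
    return idx;
}
```

A lookup table indexed by character could not reject the second '.' in "1.2.3"; the callback can, because the context remembers what has already been seen.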

There also appears to be a flaw in your program that allows
local_buffer to be dereferenced even if it's NULL; i.e., if the first
realloc() fails.
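
The usual guard is to assign realloc()'s result to a temporary and only touch the buffer after checking it. A sketch under that assumption; grow_buffer() is a hypothetical helper, not a function from the posted code.

```c
#include <stdlib.h>

/* Hypothetical helper: grows *buf to new_cap bytes (new_cap must be > 0).
 * Returns 0 on success. On failure returns -1 and leaves *buf and *cap
 * untouched, so a buffer that was NULL before the call is still NULL and
 * the caller must bail out instead of writing through it. */
int grow_buffer(char **buf, size_t *cap, size_t new_cap)
{
    char *tmp = realloc(*buf, new_cap);  /* realloc(NULL, n) acts like malloc(n) */

    if (tmp == NULL)
        return -1;
    *buf = tmp;
    *cap = new_cap;
    return 0;
}
```

Going through the temporary also avoids leaking the old block, which is what happens when the result of a failed realloc() overwrites the only pointer to it.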

If you would like to see how I solved this same problem in a more
general way see:


Paul Hsieh
Nov 14 '05 #1