I use this function that I wrote for inputting strings. It's meant to
return a pointer to mallocated memory holding one input string, or 0 on
error. (Personally, I prefer to use 0 to NULL when returning null
pointers.) It looks pretty watertight to me, but my version of lint
complains about use of deallocated pointers, etc. Is this code
completely safe on all input, or have I missed something?
/* Header files included in the program containing malloc_getstr */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
/* This function gets a string of any length from stdin,
mallocating an appropriate amount of memory. It returns
0 on string length overflow or out-of-memory. Note that
slightly more space is allocated than neccesary (from
1 to 10 characters). */
char* malloc_getstr(void)
{
size_t s=0;
char* c;
char* t;
c=malloc(s+10);
if(!c)
{ /* out of memory */
return 0;
}
for(;;)
{
fgets(c+s,10,stdin);
if(strchr(c+s,'\n')) break; /* \n lets us know if the input has
ended. Check only the most recent
input. */
if(s>SIZE_MAX-9)
{ /* new string length too long to pass as realloc's argument */
free(c);
return 0; /* error return value */
}
s+=9;
t=realloc(c,s+10);
if(!t)
{ /* out of memory */
free(c);
return 0; /* error return value */
}
c=t;
}
*strchr(c+s,'\n')=0; /* replace the \n with a \0 */
return c;
} 8 2621
ais523 wrote: I use this function that I wrote for inputting strings. It's meant to return a pointer to mallocated memory holding one input string, or 0 on error. (Personally, I prefer to use 0 to NULL when returning null pointers.) It looks pretty watertight to me, but my version of lint complains about use of deallocated pointers, etc. Is this code completely safe on all input, or have I missed something?
You've forgotten about end-of-file (and I/O errors).
/* Header files included in the program containing malloc_getstr */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <limits.h>
/* This function gets a string of any length from stdin, mallocating an appropriate amount of memory. It returns 0 on string length overflow or out-of-memory. Note that slightly more space is allocated than neccesary (from 1 to 10 characters). */ char* malloc_getstr(void) { size_t s=0; char* c; char* t; c=malloc(s+10); if(!c) { /* out of memory */ return 0; } for(;;) { fgets(c+s,10,stdin); if(strchr(c+s,'\n')) break; /* \n lets us know if the input has ended. Check only the most recent input. */
... except that if fgets() detected end-of-file or error,
the contents of the buffer are indeterminate (for different
reasons in the two cases, but indeterminate anyhow). The
buffer might not contain a valid string, so the behavior of
strchr() is undefined.
if(s>SIZE_MAX-9) { /* new string length too long to pass as realloc's argument */
Looks like the test is wrong: the argument you're about
to pass to realloc will be s+19, not s+9. Another point of
view is to say that the test is right (approximately), but
in the wrong place: since you've already got s+10 characters
in the buffer, the test as it stands is too late to avoid
trouble. If you want to make the test, I'd suggest making it
right before allocating the memory.
free(c); return 0; /* error return value */ } s+=9; t=realloc(c,s+10); if(!t) { /* out of memory */ free(c); return 0; /* error return value */ } c=t; } *strchr(c+s,'\n')=0; /* replace the \n with a \0 */ return c; }
General impression: Salvageable.
A few observations:
- Everybody writes a "better fgets()" eventually, and my
first used a strategy not unlike yours: call fgets(),
check for '\n', grow the buffer, lather, rinse, repeat.
However, I found that the function became much simpler
when I discarded fgets()/strchr() in favor of getc()
and testing each character as it arrived.
- The magic number 10 appears in five places in the code
(sometimes disguised as 9). I'm not as rabid as those
who insist on using #define for every number other than
0, 1, and perhaps 2, but when an essentially arbitrary
value shows up five times ...
- The test against SIZE_MAX shows praiseworthy diligence
(I failed to include a similar test in my own function,
so my hat's off to you.) The test will probably never
ever fire, but "better safe than sorry."
- Instead of limiting the function's usefulness by hard-
wiring stdin as the data source, why not accept a FILE*
function argument?
- Figuring out a good growth pattern for the buffer is a
matter of guesswork, and there's no one "best" way.
However, you might want to consider growing the buffer
more aggressively as the line gets longer and longer,
the idea being that a line that's already proved to be
at least 200 characters long may be more likely to grow
to 300 than to stop short of 210. (Of course, this
requires you to manage the 10/9 stuff differently.)
- Similarly with the size of the initial allocation: it
seems you're expecting very short input lines most of
the time. I'd suggest a larger starting size, not only
to avoid the work of copying the first part of the line
over and over and over again, but also so realloc() and
its friends won't have to deal with huge numbers of
itty-bitty memory fragments.
--
Eric Sosman es*****@acm-dot-org.invalid
In the time between my starting to write this and now, Eric Sosman has
already provided a response which covered most of what I wrote. But in
true "Me too" fashion, I shall post it anyway.
ais523 wrote: I use this function that I wrote for inputting strings. It's meant to return a pointer to mallocated memory holding one input string, or 0 on error. (Personally, I prefer to use 0 to NULL when returning null pointers.)
Me too.
It looks pretty watertight to me, but my version of lint complains about use of deallocated pointers, etc.
In my experience, lint gets confused if you do anything much more
complicated than a simple
x = malloc(...)
use_x();
free(x);
Is this code completely safe on all input, or have I missed something?
You've missed something. See my comment after your call to fgets.
/* Header files included in the program containing malloc_getstr */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <limits.h>
/* This function gets a string of any length from stdin,
Why not pass the FILE* in as a parameter? That would allow you to use
it in a more general case and you could still provide a wrapper to make
using it for console input convenient.
mallocating an appropriate amount of memory. It returns 0 on string length overflow or out-of-memory. Note that slightly more space is allocated than neccesary (from 1 to 10 characters). */ char* malloc_getstr(void) { size_t s=0; char* c; char* t;
Use variable names with more than a single character! Maybe
char* line;
char* alloc;
size_t len;
c=malloc(s+10); if(!c)
You checked the return value of malloc. Good!
{ /* out of memory */ return 0; } for(;;) { fgets(c+s,10,stdin);
But you forgot to check the return value of fgets! If fgets fails
before any characters are read (say, stdin is already at EOF), then the
string doesn't get null-terminated. And the next thing you do is call
strchr, which assumes a null-terminated string.
If fgets succeeds the first time, you'll be fine even if it fails
thereafter because you don't overwrite the '\0' from the previous
iteration. But you still have that possibility of failure.
My safe line input function uses getc to read one character at a time
instead of mucking with fgets and its line terminators. I find it
easier, and though I haven't tested, I imagine that not having to scan
the characters again (which your strchr does below) makes up for any
lost efficiency. Heck, since getc's a macro, it might even be more
efficient. I've never had occasion to care, though.
if(strchr(c+s,'\n')) break; /* \n lets us know if the input has ended. Check only the most recent input. */
What if input ends (hard EOF) before a \n is read? Since you don't
check the return value of fgets, right now you have an infinite loop.
If you do check the return value of fgets but leave the bug here,
you'll incorrectly report an error condition to the caller (and forget
all the data along the way -- more about that later).
if(s>SIZE_MAX-9)
Why all these magic numbers? Have a #define EXPAND_SIZE 10, and then
that 9 here is really just (EXPAND_SIZE - 1). Also, I have found that
it's better to expand by a factor (eg, like start at size 10, and
double when you run out of space).
{ /* new string length too long to pass as realloc's argument */ free(c);
Hmm... so now you've removed a whole bunch of characters from the input
stream and failed to tell the caller any information about them. You
didn't even say how many you removed! I think that's a bad idea.
return 0; /* error return value */
I think you're trying to multiplex too much information into your
single output interface (the return value). Right now, it's doing
double-duty as the pointer to the dynamically-allocated string and an
error indicator. This is too restrictive. One alternative is:
int malloc_getstr(char** line)
Sometimes changing to a parameter makes things inconvenient for the
caller, since they can't just do something like
puts(malloc_getstr());
anymore, but in this case that would have been a memory leak anyway, so
you don't particularly want to encourage that. So the tradeoff here is
really going from
char* line;
if(line = malloc_getstr()) {
puts(line);
free(line);
}
to
char* line;
if(0 == malloc_getstr(&line)) {
puts(line);
free(line);
}
You also might want to consider allowing the user to pass a pointer to
size_t as a parmeter which, if not NULL, the object it points to gets
the number of characters read. After all, in the function you know this
already, and it's inconvenient to the tune of O(n) for the caller to
determine.
} s+=9; t=realloc(c,s+10);
Just a style issue: I would prefer to start c as NULL (or 0, if you
prefer) and then use realloc instead of malloc as my allocation
function, plus arrange my code in such a way that it appears only once
(albeit in a loop).
if(!t) { /* out of memory */ free(c);
Here you go again, forgetting things that might be useful to the
caller. If your user goes through the trouble to type a bazillion
characters as input, do you think he'll appreciate you just forgetting
it all because of a system limitation?
return 0; /* error return value */ } c=t; } *strchr(c+s,'\n')=0; /* replace the \n with a \0 */
This call is suspect. Right now, the only exit point from the loop
(without returning) is when you find a '\n', so it's safe -- but that
might (should) change.
return c; }
CBFalconer has posted code to his "ggets" function (search the
archives), which is an excellent implementation of the kind of thing
you're trying to do. I have my own which behaves a little differently,
but his is already out there and probably much more peer-reviewed than
mine. Even if you stick with a home-grown one, I suggest you study his.
Josh
Josh Sebastian wrote:
.... snip ... CBFalconer has posted code to his "ggets" function (search the archives), which is an excellent implementation of the kind of thing you're trying to do. I have my own which behaves a little differently, but his is already out there and probably much more peer-reviewed than mine. Even if you stick with a home-grown one, I suggest you study his.
Available at <http://cbfalconer.home.att.net/download/ggets.zip>
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
"CBFalconer" <cb********@yahoo.com> wrote in message
news:43***************@yahoo.com... Josh Sebastian wrote: ... snip ... CBFalconer has posted code to his "ggets" function (search the archives), which is an excellent implementation of the kind of thing you're trying to do. I have my own which behaves a little differently, but his is already out there and probably much more peer-reviewed than mine. Even if you stick with a home-grown one, I suggest you study his.
Available at <http://cbfalconer.home.att.net/download/ggets.zip>
Falconer's ggets could be improved slightly:
Its signature is
int fggets(char* *ln, FILE *f);
The first thing it should do is check to ensure that ln!=NULL and f!=NULL
--
Fred L. Kleinschmidt
Boeing Associate Technical Fellow
Technical Architect, Software Reuse Project
"Fred Kleinschmidt" <fr******************@boeing.com> writes: "CBFalconer" <cb********@yahoo.com> wrote in message news:43***************@yahoo.com...
[...] Available at <http://cbfalconer.home.att.net/download/ggets.zip>
Falconer's ggets could be improved slightly:
Its signature is int fggets(char* *ln, FILE *f);
The first thing it should do is check to ensure that ln!=NULL and f!=NULL
Why should it do that? Just don't call it with null pointers.
--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Fred Kleinschmidt wrote: "CBFalconer" <cb********@yahoo.com> wrote in message Josh Sebastian wrote: ... snip ... CBFalconer has posted code to his "ggets" function (search the archives), which is an excellent implementation of the kind of thing you're trying to do. I have my own which behaves a little differently, but his is already out there and probably much more peer-reviewed than mine. Even if you stick with a home-grown one, I suggest you study his.
Available at <http://cbfalconer.home.att.net/download/ggets.zip>
Falconer's ggets could be improved slightly:
Its signature is int fggets(char* *ln, FILE *f);
The first thing it should do is check to ensure that ln!=NULL and f!=NULL
Yeah, except that he's following the ANSI C committee's lead in
suggesting that a language and its libraries should not actually assist
with programmer safety. This "peer reviewed" function has other design
problems however:
1) The function will always fail if it is unable to allocate 112 bytes
(even if the input size is smaller than that.)
2) The (linear) realloc strategy will cause real world heaps to
fragment like crazy on very long inputs (like the kinds you see in
typical weblogs from attempted virus attacks against IIS.) Since he
calls realloc O(n) times, and realloc itself is an O(n) function (think
about it) that means the function is O(n^2).
3) The external function fggets returns with enum values that are only
declared locally in the file. Its done in a well defined way, but its
really kind of annoying. I mean, why not just declare all the return
values and throw them in the .h file? If its a namespace issue, there
is no reason not to pick hard coded values, since there are so few
output values. (There is also the 3-valued output for *ln (filled in,
NULL, or not touched) that can be used to represent various errors.)
4) The function doesn't have a consistent output on out-of-memory
conditions.
5) The function inherits all the other problems of fgets(), namely that
it cannot support binary files (since it ignores '\0's that are read
from stdin) and although the total length of the input is never more
than O(1) away from the inner loop, this information is not computed
and is lost by the time the function returns (which usually means that
one way or another, you will end up calling strlen or something like it
one additional time for that string, after it has been output.)
You can find a somewhat easier to understand, and better documented
dynamic string input function that doesn't have these sorts of problems
here: http://www.pobox.com/~qed/userInput.html
--
Paul Hsieh http://www.pobox.com/~qed/ http://bstring.sf.net/ we******@gmail.com schrieb: Fred Kleinschmidt wrote:"CBFalconer" <cb********@yahoo.com> wrote in messageJosh Sebastian wrote: ... snip ...
CBFalconer has posted code to his "ggets" function (search the archives), which is an excellent implementation of the kind of thing you're trying to do. I have my own which behaves a little differently, but his is already out there and probably much more peer-reviewed than mine. Even if you stick with a home-grown one, I suggest you study his.
Available at <http://cbfalconer.home.att.net/download/ggets.zip> Falconer's ggets could be improved slightly:
Its signature is int fggets(char* *ln, FILE *f);
The first thing it should do is check to ensure that ln!=NULL and f!=NULL
Yeah, except that he's following the ANSI C committee's lead in suggesting that a language and its libraries should not actually assist with programmer safety.
True enough; it is IMO still a good source for seeing how input
can be done yourself. If someone understands it, moving on to
more efficient or more secure versions is easier.
This "peer reviewed" function has other design problems however:
1) The function will always fail if it is unable to allocate 112 bytes (even if the input size is smaller than that.)
2) The (linear) realloc strategy will cause real world heaps to fragment like crazy on very long inputs (like the kinds you see in typical weblogs from attempted virus attacks against IIS.) Since he calls realloc O(n) times, and realloc itself is an O(n) function (think about it) that means the function is O(n^2).
3) The external function fggets returns with enum values that are only declared locally in the file. Its done in a well defined way, but its really kind of annoying. I mean, why not just declare all the return values and throw them in the .h file? If its a namespace issue, there is no reason not to pick hard coded values, since there are so few output values. (There is also the 3-valued output for *ln (filled in, NULL, or not touched) that can be used to represent various errors.)
4) The function doesn't have a consistent output on out-of-memory conditions.
5) The function inherits all the other problems of fgets(), namely that it cannot support binary files (since it ignores '\0's that are read from stdin) and although the total length of the input is never more than O(1) away from the inner loop, this information is not computed and is lost by the time the function returns (which usually means that one way or another, you will end up calling strlen or something like it one additional time for that string, after it has been output.)
You can find a somewhat easier to understand, and better documented dynamic string input function that doesn't have these sorts of problems here:
http://www.pobox.com/~qed/userInput.html
Nicely explained although I do not like the belligerent tone.
Will definitely go into my list of resources to point newbies to.
BTW: Finding a line like
#define system(s) #error "....."
on the page does not exactly instill awe.
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
On 17 Feb 2006 17:25:08 -0800, in comp.lang.c , we******@gmail.com
wrote: Yeah, except that he's following the ANSI C committee's lead in suggesting that a language and its libraries should not actually assist with programmer safety.
If you'd bothered to say that differently, I might have agreed with
you. As it is, you merely come across as a bitter person who dislikes
C and has some axe to grind against the committee. Perhaps you wanted
to attend but weren't allowed to, or the majority disagreed with your
ideas.
5) The function inherits all the other problems of fgets(), namely that it cannot support binary files
You seem to have a misunderstanding about fgets. That isn't a
'problem' with fgets, its not supposed to handle binary files.
and is lost by the time the function returns (which usually means that one way or another, you will end up calling strlen or something like it one additional time for that string, after it has been output.)
This also isn't a 'problem', its simply not intended to calculate it.
Perhaps you should consider a less agressive approach to life?
Mark McIntyre
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
----== Posted via Newsfeeds.Com - Unlimited-Unrestricted-Secure Usenet News==---- http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups
----= East and West-Coast Server Farms - Total Privacy via Encryption =---- This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics
by: /dev/null |
last post by:
Is there a function in php that you can pass a string headed for a mysql
query and have it make it 'safe'? For example it would escape out the '
found in the string.
Thanks!
|
by: Michael Hill |
last post by:
I have this input element:
<input type='text' name='street' maxlength='20' size='20'>
that I only want the user to be able to input to based on a item from a
select list:
<select...
|
by: Neil Schemenauer |
last post by:
The title is perhaps a little too grandiose but it's the best I
could think of. The change is really not large. Personally, I
would be happy enough if only %s was changed and the built-in was...
|
by: Alan |
last post by:
hi all,
I want to define a constant length string, say 4
then in a function at some time, I want to set the string to a constant
value, say a
below is my code but it fails
what is the correct...
|
by: Simon Schaap |
last post by:
Hello,
I have encountered a strange problem and I hope you can help me to
understand it. What I want to do is to pass an array of chars to a
function that will split it up (on every location where...
|
by: Jonathan Burd |
last post by:
Greetings everyone,
Here is a random string generator I wrote for an application
and I'm wondering about the thread-safety of this function.
I was told using static and global variables cause...
|
by: Cor |
last post by:
Hallo,
I have promised Jay B yesterday to do some tests.
The subject was a string evaluation that Jon had send in. Jay B was in doubt
what was better because there was a discussion in the C#...
|
by: Erik |
last post by:
Hello,
For many years ago I implemented my own string buffer class, which
works fine except assignments - it copies the char* buffer instead of
the pointer. Therefore in function calls I pass...
|
by: Alex Pavluck |
last post by:
Hello. I get the following error with the following code. Is there
something wrong with my Python installation?
code:
import types
something = input("Enter something and I will tell you the...
|
by: DJRhino |
last post by:
Was curious if anyone else was having this same issue or not....
I was just Up/Down graded to windows 11 and now my access combo boxes are not acting right. With win 10 I could start typing...
|
by: tracyyun |
last post by:
Hello everyone,
I have a question and would like some advice on network connectivity. I have one computer connected to my router via WiFi, but I have two other computers that I want to be able to...
|
by: NeoPa |
last post by:
Hello everyone.
I find myself stuck trying to find the VBA way to get Access to create a PDF of the currently-selected (and open) object (Form or Report).
I know it can be done by selecting :...
|
by: NeoPa |
last post by:
Introduction
For this article I'll be using a very simple database which has Form (clsForm) & Report (clsReport) classes that simply handle making the calling Form invisible until the Form, or all...
|
by: Teri B |
last post by:
Hi, I have created a sub-form Roles. In my course form the user selects the roles assigned to the course.
0ne-to-many. One course many roles.
Then I created a report based on the Course form and...
|
by: isladogs |
last post by:
The next Access Europe meeting will be on Wednesday 1 Nov 2023 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM)
Please note that the UK and Europe revert to winter time on...
|
by: NeoPa |
last post by:
Introduction
For this article I'll be focusing on the Report (clsReport) class. This simply handles making the calling Form invisible until all of the Reports opened by it have been closed, when it...
|
by: isladogs |
last post by:
The next online meeting of the Access Europe User Group will be on Wednesday 6 Dec 2023 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM).
In this month's session, Mike...
|
by: GKJR |
last post by:
Does anyone have a recommendation to build a standalone application to replace an Access database? I have my bookkeeping software I developed in Access that I would like to make available to other...
| |