snprintf rationale?

P: n/a
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?

What is the name and address of the person responsible for this?

Mike
Nov 14 '05 #1
27 Replies


P: n/a
Michael B Allen <mb*****@ioplex.com> writes:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?


The declaration of snprintf() is:

int snprintf(char * restrict s, size_t n,
const char * restrict format, ...);

The number of characters available is passed in as the second argument.
To test for overflow, check whether the result is greater than or equal to n.

If you call snprintf() with n==0, it won't write any characters, but
it will return the number of characters that would have been written.
You can then allocate the appropriate space and call snprintf() again
with the same arguments, but with a non-zero n.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #2

P: n/a
In article <pa********************************@ioplex.com>
Michael B Allen <mb*****@ioplex.com> wrote:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?
This lets you allocate a buffer that is big enough, without having
to do many passes:

needed = snprintf(NULL, 0, fmt, arg1, arg2);
if (needed < 0) ... handle error ...
mem = malloc(needed + 1);
if (mem == NULL) ... handle error ...
result = snprintf(mem, needed + 1, fmt, arg1, arg2);

It is also consistent with fprintf(), which returns the number of
characters printed.
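
For readers who want the two-pass idiom above as something that compiles, here is a minimal sketch. The helper name format_pair and the "%s=%d" format are assumptions made up for illustration; they are not from the post. The error handling (return NULL) is just one possible choice.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats "name=value" into freshly malloc'd memory.
 * Returns NULL on encoding error or allocation failure; the caller frees. */
char *format_pair(const char *name, int value)
{
    assert(name != NULL);

    /* Pass 1: n == 0, so nothing is written and s may be a null pointer. */
    int needed = snprintf(NULL, 0, "%s=%d", name, value);
    if (needed < 0)
        return NULL;

    char *mem = malloc((size_t)needed + 1);   /* +1 for the '\0' */
    if (mem == NULL)
        return NULL;

    /* Pass 2: same format and arguments, now with real space. */
    int result = snprintf(mem, (size_t)needed + 1, "%s=%d", name, value);
    if (result != needed) {                   /* should not happen */
        free(mem);
        return NULL;
    }
    return mem;
}
```

The first call only measures; the second call does the actual formatting into a buffer that is known to be large enough.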
This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?
Given a buffer "buf" of size "size":

result = snprintf(buf, size, fmt, arg);

if (result >= 0 && result < size)
all_is_well();
else
needed_more_space();
What is the name and address of the person responsible for this?


That would be me. :-)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #3

P: n/a
Michael B Allen wrote:

What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?


Snprintf is guaranteed not to overflow.

This works:

if (snprintf (buf, buflen, "...", ....) < buflen)
puts ("No overflow occurred");
else
puts ("Overflow might have occurred");

This also works:

buflen = snprintf (NULL, 0, "...", ....);
buf = malloc (buflen + 1) ;
snprintf (buf, buflen + 1, "...", ....);

Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo no****@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
Seen on comp.lang.python:
Q : If someone has the code in python for a buffer overflow,
please post it.
A : Python does not support buffer overflows, sorry.
Nov 14 '05 #4

P: n/a


Michael B Allen wrote:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?
It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this is that snprintf
takes a size_t parameter and returns an int, which is broken by design.
Returning 0 on error and the hypothetical number of written characters
including the string terminator with return type size_t would IMO have
been better.

This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?
From the C99 standard:
"7.19.6.5 The snprintf function
Synopsis
1
#include <stdio.h>
int snprintf(char * restrict s, size_t n,
const char * restrict format, ...);

Description
2 The snprintf function is equivalent to fprintf, except that the output
is written into an array (specified by argument s) rather than to a
stream. If n is zero, nothing is written, and s may be a null pointer.
Otherwise, output characters beyond the n-1st are discarded rather than
being written to the array, and a null character is written at the end
of the characters actually written into the array. If copying takes
place between objects that overlap, the behavior is undefined.

Returns
3 The snprintf function returns the number of characters that would have
been written had n been sufficiently large, not counting the terminating
null character, or a negative value if an encoding error occurred.
Thus, the null-terminated output has been completely written if and only
if the returned value is nonnegative and less than n.
"

So, I would use allocated buffers and do something along the lines

char *buf;
size_t buf_size;
int retval;

buf = NULL;
buf_size = 0;

while (1) {
retval = snprintf(buf, buf_size, "Test with buf of size %zu\n",
buf_size);
if (retval < 0) {
/* Treat encoding error or die */
}
else if (retval<buf_size) {
break; /* We finally made it */
}
else {
char *tmp;
if ( (tmp=realloc(buf, (size_t) retval + 1)) == NULL ) {
/* Give up trying to write this string or die */
}
buf = tmp;
buf_size = (size_t) retval + 1;
}
}

I did not test it but you see that it deals with the problem
that, depending on buf_size, the length of the output varies
so we need to adjust the size a second time.

What is the name and address of the person responsible for this?


I think this is slightly OT here. Try comp.std.c but I guess
they won't tell you either.
-Michael
--
E-Mail: Mine is a gmx dot de address.

Nov 14 '05 #5

P: n/a
Michael Mair wrote:

Michael B Allen wrote:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?


It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this
is that snprintf takes a size_t parameter and returns an int,
which is broken by design.


There's also an environmental limit, which is the minimum value for
the maximum number of characters produced by any single conversion:
509 in C89,
4095 in C99.

--
pete
Nov 14 '05 #6

P: n/a


pete wrote:
Michael Mair wrote:
Michael B Allen wrote:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?


It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this
is that snprintf takes a size_t parameter and returns an int,
which is broken by design.

There's also an environmental limit, which is the minimum value for
the maximum number of characters produced by any single conversion:
509 in C89,
4095 in C99.


Thank you :-)
I was completely unaware of this.
However, this does not really affect that this switching of types
in between is ugly.

Cheers
Michael
--
E-Mail: Mine is a gmx dot de address.

Nov 14 '05 #7

P: n/a
Michael Mair wrote:

pete wrote:
Michael Mair wrote:
Michael B Allen wrote:

What is the rationale for snprintf to
"return the number of characters
(excluding the trailing '\0')
which would have been written to the final
string if enough space had been available"?

It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this
is that snprintf takes a size_t parameter and returns an int,
which is broken by design.

There's also an environmental limit, which is the minimum value for
the maximum number of characters produced by any single conversion:
509 in C89,
4095 in C99.


Thank you :-)
I was completely unaware of this.
However, this does not really affect that this switching of types
in between is ugly.


I think it has to do with snprintf being based on the
functionality of fprintf and with fprintf being older than size_t.

--
pete
Nov 14 '05 #8

P: n/a
Michael B Allen <mb*****@ioplex.com> wrote:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

This just twists my noodle in a knot every time! What is the proper way
to test the return value for overflow?


So what else would you have it return? The number of characters it
actually did write? That's almost always useless information, since it's
easily found using strlen(). The number of characters it would've
written had it had the space, however, is very useful.
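
A small sketch of that point, under the assumption of a plain "%s" format: the return value says what was wanted, strlen() says what actually landed in the buffer, and the difference is what got cut off. The function name chars_lost is invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: how many characters did not fit into buf? */
size_t chars_lost(char *buf, size_t n, const char *text)
{
    assert(buf != NULL && n > 0 && text != NULL);

    int wanted = snprintf(buf, n, "%s", text);
    if (wanted < 0)
        return 0;                          /* encoding error: nothing to compare */

    /* would-have-written minus actually-written */
    return (size_t)wanted - strlen(buf);
}
```

With a 4-byte buffer and the text "hello", snprintf() returns 5 while the buffer holds "hel", so two characters were dropped.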

Richard
Nov 14 '05 #9

P: n/a
Erik de Castro Lopo <no****@mega-nerd.com> writes:
[...]
Snprintf is guaranteed not to overflow.


Well, sort of; it will overflow if you tell it to.

For example,

char buf[5];
snprintf(buf, 30, "%s", "This string is too big");

But assuming the arguments are consistent, yes, it's guaranteed not to
overflow.
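
One common way to keep the arguments consistent is to take the size from the array itself with sizeof instead of typing a number. A minimal sketch, assuming the buffer is a real array member (the struct tag and the function set_tag are made up for illustration):

```c
#include <assert.h>
#include <stdio.h>

struct tag {
    char text[5];
};

/* Hypothetical example: returns 1 if s fit completely, 0 if it was truncated
 * or an encoding error occurred. */
int set_tag(struct tag *t, const char *s)
{
    assert(t != NULL && s != NULL);

    /* sizeof t->text is the true array size (5), never a guessed constant. */
    int r = snprintf(t->text, sizeof t->text, "%s", s);
    return r >= 0 && (size_t)r < sizeof t->text;
}
```

Note that this only works on arrays; applied to a char pointer, sizeof yields the size of the pointer, not of the buffer it points to.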

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #10

P: n/a
Michael Mair <Mi**********@invalid.invalid> writes:
Michael B Allen wrote:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?


It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this is that snprintf
takes a size_t parameter and returns an int, which is broken by design.
Returning 0 on error and the hypothetical number of written characters
including the string terminator with return type size_t would IMO have
been better.

[...]

The following:

snprintf(buf, buf_size, "");

is a legitimate call to snprintf; it returns 0 but doesn't indicate an
error.

If ISO C had a "ssize_t" type (a signed equivalent of size_t), this
would be a good place to use it. (POSIX defines ssize_t; ISO C
doesn't.)

An alternative might be to have the return value just indicate success
or failure, and return the number of bytes via a separate size_t*
argument, but that would make the function more difficult to use.

In practice, returning int is only going to be a problem if the length
of the string would exceed INT_MAX characters. This is unlikely on
systems with 16-bit int, and even more unlikely on systems with 32-bit
or larger int. I agree that it's a wart, but I'm not sure there's a
good way to fix it.
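
To make the "separate size_t* argument" alternative concrete, here is a rough sketch of what such an interface could look like. The name format_len and its exact signature are assumptions for illustration only. Since it is built on vsnprintf(), it inherits the same INT_MAX limit and only shows the shape of the interface.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: returns 0 on success or -1 on an encoding error, and
 * stores the untruncated length (excluding the '\0') through *len. */
int format_len(size_t *len, char *buf, size_t size, const char *fmt, ...)
{
    assert(len != NULL && fmt != NULL);

    va_list ap;
    va_start(ap, fmt);
    int r = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (r < 0)
        return -1;
    *len = (size_t)r;      /* safe: r is known to be nonnegative here */
    return 0;
}
```

Every call now needs an extra variable and an extra argument, which is the "more difficult to use" cost mentioned above.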

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #11

P: n/a
Keith Thompson wrote:
Michael Mair <Mi**********@invalid.invalid> writes:
Michael B Allen wrote:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?
It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this is that snprintf
takes a size_t parameter and returns an int, which is broken by design.
Returning 0 on error and the hypothetical number of written characters
including the string terminator with return type size_t would IMO have
been better.


[...]

The following:

snprintf(buf, buf_size, "");

is a legitimate call to snprintf; it returns 0 but doesn't indicate an
error.


With my suggestion, this would have returned 1 ('\0') which is distinct
from 0 :-)

If ISO C had a "ssize_t" type (a signed equivalent of size_t), this
would be a good place to use it. (POSIX defines ssize_t; ISO C
doesn't.)
Yep, I really do not understand why we were not given that toy by
C99... especially since in other places the standard goes to some length
to avoid saying ssize_t (for example when describing the *printf/*scanf
length modifier z, referring to size_t or the corresponding signed
type...).
Losing half the positive range of size_t is certainly better than
a potential int/size_t problem.

An alternative might be to have the return value just indicate success
or failure, and return the number of bytes via a separate size_t*
argument, but that would make the function more difficult to use.
Indeed.

In practice, returning int is only going to be a problem if the length
of the string would exceed INT_MAX characters. This is unlikely on
systems with 16-bit int, and even more unlikely on systems with 32-bit
or larger int. I agree that it's a wart, but I'm not sure there's a
good way to fix it.


Well, apart from the differences to fprintf() which will lead to
problems with people too lazy to look up snprintf(), I still hold
that returning -- as size_t value -- the numbers of characters to
be written _including_ the string terminator or zero on error would
have been the easiest and probably best way.
However, this is purely academic as we already have the wart.
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 14 '05 #12

P: n/a
On Mon, 22 Nov 2004 04:15:02 -0500, Chris Torek wrote:
In article <pa********************************@ioplex.com> Michael B
Allen <mb*****@ioplex.com> wrote:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?


This lets you allocate a buffer that is big enough, without having to do
many passes:

needed = snprintf(NULL, 0, fmt, arg1, arg2);
if (needed < 0) ... handle error ...
mem = malloc(needed + 1);
if (mem == NULL) ... handle error ...
result = snprintf(mem, needed + 1, fmt, arg1, arg2);


I see. This is reasonable. I was wondering why it didn't just return -1
but I prefer this behavior. If I want something dumber I can wrap it.
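
A "dumber" wrapper of the kind mentioned here might look like the following sketch. The name strict_snprintf is invented; the only assumption is that the caller wants a plain failure indication instead of the would-be length.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: returns the number of characters written on success,
 * or -1 if the output was truncated or an encoding error occurred. */
int strict_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    assert(buf != NULL && size > 0 && fmt != NULL);

    va_list ap;
    va_start(ap, fmt);
    int r = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (r < 0 || (size_t)r >= size)
        return -1;         /* error, or the result did not fit */
    return r;
}
```

On truncation the buffer still holds a terminated prefix of the output; the wrapper just reports the call as failed.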

Thanks,
Mike
Nov 14 '05 #13

P: n/a
Michael Mair <Mi**********@invalid.invalid> writes:
Keith Thompson wrote:
Michael Mair <Mi**********@invalid.invalid> writes:
Michael B Allen wrote:

What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

It allows you to find out by how much to enlarge your final string
in order to fit it in. What is troubling me about this is that snprintf
takes a size_t parameter and returns an int, which is broken by design.
Returning 0 on error and the hypothetical number of written characters
including the string terminator with return type size_t would IMO have
been better.

[...]
The following:

snprintf(buf, buf_size, "");

is a legitimate call to snprintf; it returns 0 but doesn't indicate an
error.


With my suggestion, this would have returned 1 ('\0') which is distinct
from 0 :-)


Right, I missed the "including the string terminator" clause. I think
that would be counterintuitive, since most similar functions return
the length (strlen()) of the string excluding the terminator. But in
any case we're stuck with the current behavior.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #14

P: n/a
Michael Mair wrote:

Keith Thompson wrote:
If ISO C had a "ssize_t" type (a signed equivalent of size_t),


That's how I think ptrdiff_t should have been defined.
Losing half the positive range of size_t is certainly better than
a potential int/size_t problem.


--
pete
Nov 14 '05 #15

P: n/a
In <cn********@news2.newsguy.com> Chris Torek <no****@torek.net> writes:
In article <pa********************************@ioplex.com>
Michael B Allen <mb*****@ioplex.com> wrote:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?


This lets you allocate a buffer that is big enough, without having
to do many passes:

needed = snprintf(NULL, 0, fmt, arg1, arg2);
if (needed < 0) ... handle error ...
mem = malloc(needed + 1);
if (mem == NULL) ... handle error ...
result = snprintf(mem, needed + 1, fmt, arg1, arg2);


No point in calling snprintf again, sprintf would do just fine:

result = sprintf(mem, fmt, arg1, arg2);

And, unless you're really paranoid, you don't even need its return value,
because it *must* be equal to ``needed'' (if the original snprintf call
succeeded, a sprintf call with the same arguments and in the same locale
must return the same value).

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union
Nov 14 '05 #16

P: n/a
"Dan Pop" <Da*****@cern.ch> wrote in message
news:co***********@sunnews.cern.ch...
In <cn********@news2.newsguy.com> Chris Torek <no****@torek.net> writes:
In article <pa********************************@ioplex.com>
Michael B Allen <mb*****@ioplex.com> wrote:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?


This lets you allocate a buffer that is big enough, without having
to do many passes:

needed = snprintf(NULL, 0, fmt, arg1, arg2);
if (needed < 0) ... handle error ...
mem = malloc(needed + 1);
if (mem == NULL) ... handle error ...
result = snprintf(mem, needed + 1, fmt, arg1, arg2);


No point in calling snprintf again, sprintf would do just fine:

result = sprintf(mem, fmt, arg1, arg2);

And, unless you're really paranoid, you don't even need its return value,
because it *must* be equal to ``needed'' (if the original snprintf call
succeeded, a sprintf call with the same arguments and in the same locale
must return the same value).


Bad advice: if the code for assessing the length is duplicated for the actual
formatting, chances are these copies will get out of sync!
How is it better to call sprintf() instead of snprintf()? What is there to
gain?

If you are really paranoid, check again that result == needed indeed. But by
all means call snprintf().

Chqrlie.
Nov 14 '05 #17

P: n/a
"Charlie Gordon" <ne**@chqrlie.org> writes:
"Dan Pop" <Da*****@cern.ch> wrote in message

[...]
No point in calling snprintf again, sprintf would do just fine:

result = sprintf(mem, fmt, arg1, arg2);

And, unless you're really paranoid, you don't even need its return value,
because it *must* be equal to ``needed'' (if the original snprintf call
succeeded, a sprintf call with the same arguments and in the same locale
must return the same value).


Bad advice : if the code for assessing the length is duplicated for
the actual formatting, chances are these copies will get out of
sync! how is it better to call sprintf() instead of snprintf() ?
What is there to gain ?


I suspect any actual implementation is going to use much of the same
underlying code for all the *printf functions. And even if the code
is duplicated, any discrepancy between the length determined by
snprintf() and the length determined by sprintf() would be a bug in
the implementation. If you're unwilling to assume that the
implementation gets things right, you probably shouldn't be using it
in the first place. If you happen to know of a specific bug in some
implementation, it can make sense to guard against it in your code
(*and* submit a bug report to the implementer), but guarding against
implementation bugs in general is a waste of time. Or did I miss your
point?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #18

P: n/a
Keith Thompson wrote:
"Charlie Gordon" <ne**@chqrlie.org> writes:
"Dan Pop" <Da*****@cern.ch> wrote in message


[...]
No point in calling snprintf again, sprintf would do just fine:

result = sprintf(mem, fmt, arg1, arg2);

And, unless you're really paranoid, you don't even need its return value,
because it *must* be equal to ``needed'' (if the original snprintf call
succeeded, a sprintf call with the same arguments and in the same locale
must return the same value).


Bad advice : if the code for assessing the length is duplicated for
the actual formatting, chances are these copies will get out of
sync! how is it better to call sprintf() instead of snprintf() ?
What is there to gain ?


I suspect any actual implementation is going to use much of the same
underlying code for all the *printf functions. And even if the code
is duplicated, any discrepancy between the length determined by
snprintf() and the length determined by sprintf() would be a bug in
the implementation. If you're unwilling to assume that the
implementation gets things right, you probably shouldn't be using it
in the first place. If you happen to know of a specific bug in some
implementation, it can make sense to guard against it in your code
(*and* submit a bug report to the implementer), but guarding against
implementation bugs in general is a waste of time. Or did I miss your
point?


The only thing which makes it _necessary_ to check again the return
value I can think of is a dependence of arg1/arg2 on needed or mem
(as I illustrated elsethread).

Charlie's point, AFAICS, is that you easily can make a slip of the kind
"Oh, I will always copy over the format string" and then forget it
once. This may lead to exactly the buffer overrun which was to be
avoided in the first place.
As Dan referred to the const char * fmt in both cases, this point is
moot here; but in general, I agree that it certainly does not hurt to
always use snprintf...
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 14 '05 #19

P: n/a
>In <cn********@news2.newsguy.com> Chris Torek <no****@torek.net> writes:
needed = snprintf(NULL, 0, fmt, arg1, arg2);
if (needed < 0) ... handle error ...
mem = malloc(needed + 1);
if (mem == NULL) ... handle error ...
result = snprintf(mem, needed + 1, fmt, arg1, arg2);

In article <co***********@sunnews.cern.ch> Dan Pop <Da*****@cern.ch> wrote:
No point in calling snprintf again, sprintf would do just fine:

result = sprintf(mem, fmt, arg1, arg2);

And, unless you're really paranoid, you don't even need its return value,
because it *must* be equal to ``needed'' (if the original snprintf call
succeeded, a sprintf call with the same arguments and in the same locale
must return the same value).


Call it paranoia if you like, but I recommend the two-snprintf()
method anyway. For instance, suppose we have:

char *some_subroutine(const char *str, int n) {
...
needed = snprintf(NULL, 0, "%s%d", str, n);
...
mem = malloc(needed + 1);
...
return mem;
}

and suppose the caller passes, as the "str" argument, a pointer to
freed space (or similarly invalid pointer) that gets overwritten
with some new value by the malloc() call. In particular, if
strlen(str) increases, the second snprintf() will "want to" write
more characters than we just allocated.

Obviously such a call is faulty -- but using snprintf() twice will
limit any additional damage, and the fact that the number of
characters printed has changed may help find the bug. The *cost*
of using snprintf() twice (instead of snprintf() followed by plain
sprintf()) is likely negligible; it may even save CPU time and/or
(code) space.
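
The "limit any additional damage" argument can be sketched without invoking undefined behavior by letting the source string legitimately change between the two calls. The function name second_pass is made up, and the scenario is a stand-in for the faulty-caller case described above, not a reproduction of it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical second pass: mem holds needed + 1 bytes, sized by an earlier
 * snprintf(NULL, 0, "%s%d", ...) call. Returns 0 if the output still has the
 * length measured earlier, -1 if something changed in between. */
int second_pass(char *mem, int needed, const char *str, int n)
{
    assert(mem != NULL && needed >= 0 && str != NULL);

    /* Bounded write: even if str grew, at most needed + 1 bytes are touched. */
    int result = snprintf(mem, (size_t)needed + 1, "%s%d", str, n);
    return result == needed ? 0 : -1;
}
```

If the string grew after the measuring call, the output is truncated inside the allocation and the mismatch between the two return values flags the problem; plain sprintf() would have written past the end.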

As someone else noted, there is also the "parallel construction"
bonus, which I admit is kind of a soft squidgy human-factor thing
-- but as I see it, that makes two or three advantages (however
small they may be), and no disadvantages, so one might as well do
it this way.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #20

P: n/a
"Michael Mair" <Mi**********@invalid.invalid> wrote in message
news:30*************@uni-berlin.de...
Keith Thompson wrote:
"Charlie Gordon" <ne**@chqrlie.org> writes:
"Dan Pop" <Da*****@cern.ch> wrote in message


[...]
No point in calling snprintf again, sprintf would do just fine:

result = sprintf(mem, fmt, arg1, arg2);

And, unless you're really paranoid, you don't even need its return value,
because it *must* be equal to ``needed'' (if the original snprintf call
succeeded, a sprintf call with the same arguments and in the same locale
must return the same value).

Bad advice : if the code for assessing the length is duplicated for
the actual formatting, chances are these copies will get out of
sync! how is it better to call sprintf() instead of snprintf() ?
What is there to gain ?


I suspect any actual implementation is going to use much of the same
underlying code for all the *printf functions. And even if the code
is duplicated, any discrepancy between the length determined by
snprintf() and the length determined by sprintf() would be a bug in
the implementation. If you're unwilling to assume that the
implementation gets things right, you probably shouldn't be using it
in the first place. If you happen to know of a specific bug in some
implementation, it can make sense to guard against it in your code
(*and* submit a bug report to the implementer), but guarding against
implementation bugs in general is a waste of time. Or did I miss your
point?


Charlie's point, AFAICS, is that you easily can make a slip of the kind
"Oh, I will always copy over the format string" and then forget it
once. This may lead to exactly the buffer overrun which was to be
avoided in the first place.
As Dan referred to the const char * fmt in both cases, this point is
moot here; but in general, I agree that it certainly does not hurt to
always use snprintf...


Exactly. The bold programmer is his own worst enemy.
In Chris Torek's example snprintf() is called with a variable format string.
This is *bad* practice. It prevents casual and compiler-assisted verification
of parameter consistency. The unknown format string fmt points to may well
refer to more than 2 parameters, and not necessarily of the correct type.
Calling snprintf() twice with the same arguments may well produce different
output in this case (or much worse, UB, of course).
What I was referring to in my reply was not the snprintf() and sprintf()
implementations getting out of sync: they most likely share a common
implementation. But the programmer's code for those 2 calls runs the risk of
diverging over time, via ill-advised maintenance. Call it the law of
evolution applied to C code. Even from day one, the 2 calls may have been
written differently by mistake.
It would be much less error-prone to wrap those in some kind of loop executed
twice with a single snprintf() statement.
GNU defines asprintf() for this purpose. The implementation is straightforward
and can be made an easy addition to projects targeted at non-gcc platforms.

int asprintf(char **strp, const char *fmt, ...);
int vasprintf(char **strp, const char *fmt, va_list ap);

strp will receive a pointer to storage allocated via malloc(). This is exactly
what Chris Torek does in his example, except more general and usable.
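
For platforms that lack it, a portable stand-in can be sketched on top of vsnprintf() and va_copy (both C99). The names my_asprintf and my_vasprintf are made up here to avoid clashing with the real functions, and returning -1 on failure is an assumption modeled on the GNU interface, not a copy of its implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for vasprintf(): on success stores a malloc'd string
 * in *strp and returns its length; on failure returns -1. */
int my_vasprintf(char **strp, const char *fmt, va_list ap)
{
    assert(strp != NULL && fmt != NULL);

    va_list ap2;
    va_copy(ap2, ap);                 /* the measuring pass consumes a copy */
    int needed = vsnprintf(NULL, 0, fmt, ap2);
    va_end(ap2);
    if (needed < 0)
        return -1;

    char *mem = malloc((size_t)needed + 1);
    if (mem == NULL)
        return -1;

    /* One format string, one argument list: the two passes cannot drift apart. */
    int result = vsnprintf(mem, (size_t)needed + 1, fmt, ap);
    if (result != needed) {
        free(mem);
        return -1;
    }
    *strp = mem;
    return result;
}

/* Hypothetical stand-in for asprintf(), built on the function above. */
int my_asprintf(char **strp, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int r = my_vasprintf(strp, fmt, ap);
    va_end(ap);
    return r;
}
```

The caller writes the format and arguments once and frees the result with free() when done.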

Chqrlie.


Nov 14 '05 #21

P: n/a
In <co********@news1.newsguy.com> Chris Torek <no****@torek.net> writes:
In <cn********@news2.newsguy.com> Chris Torek <no****@torek.net> writes:
needed = snprintf(NULL, 0, fmt, arg1, arg2);
if (needed < 0) ... handle error ...
mem = malloc(needed + 1);
if (mem == NULL) ... handle error ...
result = snprintf(mem, needed + 1, fmt, arg1, arg2);

In article <co***********@sunnews.cern.ch> Dan Pop <Da*****@cern.ch> wrote:
No point in calling snprintf again, sprintf would do just fine:

result = sprintf(mem, fmt, arg1, arg2);

And, unless you're really paranoid, you don't even need its return value,
because it *must* be equal to ``needed'' (if the original snprintf call
succeeded, a sprintf call with the same arguments and in the same locale
must return the same value).


Call it paranoia if you like, but I recommend the two-snprintf()
method anyway. For instance, suppose we have:

char *some_subroutine(const char *str, int n) {
...
needed = snprintf(NULL, 0, "%s%d", str, n);
...
mem = malloc(needed + 1);
...
return mem;
}

and suppose the caller passes, as the "str" argument, a pointer to
freed space (or similarly invalid pointer) that gets overwritten
with some new value by the malloc() call. In particular, if
strlen(str) increases, the second snprintf() will "want to" write
more characters than we just allocated.


Your first example was, at least, realistic: it presented the code as
a compact block, exactly as it is going to be used in real code. There
is no point in mixing any other code between the lines, as your highly
artificial second example does.

My point is that each function call serves a well defined purpose:
the snprintf call is used to determine the length of a buffer needed
by sprintf to generate the desired output. Once you have achieved this
purpose, you can *safely* use sprintf to generate the output, and I
don't buy excuses along the lines of "but I am an idiot who cannot rely
on his ability to call both functions correctly".

Checking the return value of the second s[n]printf call merely complicates
the code for no redeeming benefits. That is, unless you get paid
proportionally with the number of lines of source code you write ;-)
Obviously such a call is faulty -- but using snprintf() twice will
limit any additional damage, and the fact that the number of
characters printed has changed may help find the bug. The *cost*
of using snprintf() twice (instead of snprintf() followed by plain
sprintf()) is likely negligible; it may even save CPU time and/or
(code) space.


Please enlighten me about the way to implement snprintf in such a way
that it uses less (or even the same number of) CPU cycles as sprintf (for
the same arguments). Unless I'm missing something, snprintf has to do
everything sprintf does *and* perform some extra work.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union
Nov 14 '05 #22

P: n/a
>In <co********@news1.newsguy.com> Chris Torek <no****@torek.net> writes:
Obviously such a call is faulty -- but using snprintf() twice will
limit any additional damage, and the fact that the number of
characters printed has changed may help find the bug. The *cost*
of using snprintf() twice (instead of snprintf() followed by plain
sprintf()) is likely negligible; it may even save CPU time and/or
(code) space.

In article <co**********@sunnews.cern.ch> Dan Pop <Da*****@cern.ch> wrote:
Please enlighten me about the way to implement snprintf in such a way
that it uses less (or even the same number of) CPU cycles as sprintf (for
the same arguments). Unless I'm missing something, snprintf has to do
everything sprintf does *and* perform some extra work.


Well, as it turns out, in my own implementation, it uses pretty
much the same number of instructions -- sprintf and snprintf both
call __svfprintf, which does all the real work, and __svfprintf
uses internal write routines to copy into the output buffer(s) for
stdio FILE objects. The "buffer" for a string is simply the memory
area being written to, and if it fills up, extra data are quietly
ignored.

Since the usual (and only correct!) case is, as you pointed out, that
the second snprintf() does not overflow, the "ignore overflow" code
is not used -- so sprintf and snprintf take the same code path. The
only divergence is at the top of the first call, where sprintf says:

fake_file_struct.len = 32767; /* XXX */

(I think I used 32767, but it might be some other constant), while
snprintf has an extra argument and says:

fake_file_struct.len = buffer_size;

Hence one has a "move constant" instruction while the other has a
"copy argument" instruction; and of course, in the calls to sprintf
or snprintf, we get either "no parameter" or "buffer size parameter",
which may or may not require some instruction(s) depending on
calling sequence. A typical example might be:

mov %l1,%o0 ! buffer
mov %l5,%o1 ! fmt
mov %l6,%o2 ! arg
call sprintf

vs:

mov %l1,%o0 ! buffer
mov 80,%o1 ! sizeof buffer
mov %l5,%o2 ! fmt
mov %l6,%o3 ! arg
call snprintf

The extra instruction in the call to snprintf here is made up for
by the fact that "fake_file_struct.len = 32767" requires three
instructions inside sprintf, while "fake_file_struct.len = buffer_size"
requires only one (on the SPARC -- other architectures will differ).
(On the SPARC, we actually get a one-instruction advantage for
snprintf, but that is only because I used a smaller-than-4096-byte
buffer.)
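
As an illustration only, the shared-engine arrangement described above could be sketched like this. It is not the actual stdio code: struct fake_file, put_char, toy_vformat, toy_sprintf and toy_snprintf are hypothetical names, and the toy engine understands only %s and %%. The point is that the two wrappers differ only in how they initialize the length field.

```c
#include <limits.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for the "fake FILE" mentioned in the post. */
struct fake_file {
    char  *ptr;    /* next byte to write */
    size_t len;    /* room left, not counting the terminating '\0' */
    int    total;  /* characters the complete output would need */
};

/* Store one character if there is room; count it either way, so that
   extra data are quietly ignored but still measured. */
static void put_char(struct fake_file *f, char c)
{
    if (f->len > 0) {
        *f->ptr++ = c;
        f->len--;
    }
    f->total++;
}

/* Toy engine: handles only %s and %%; everything else is copied as is. */
static int toy_vformat(struct fake_file *f, const char *fmt, va_list ap)
{
    for (; *fmt != '\0'; fmt++) {
        if (fmt[0] == '%' && fmt[1] == 's') {
            const char *s = va_arg(ap, const char *);
            while (*s != '\0')
                put_char(f, *s++);
            fmt++;                      /* skip the 's' */
        } else if (fmt[0] == '%' && fmt[1] == '%') {
            put_char(f, '%');
            fmt++;                      /* skip the second '%' */
        } else {
            put_char(f, *fmt);
        }
    }
    return f->total;
}

int toy_sprintf(char *buf, const char *fmt, ...)
{
    struct fake_file f = { buf, INT_MAX, 0 };   /* "a large number" */
    va_list ap;
    va_start(ap, fmt);
    int n = toy_vformat(&f, fmt, ap);
    va_end(ap);
    *f.ptr = '\0';
    return n;
}

int toy_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    /* Reserve one byte for '\0'; with size 0 nothing is ever stored. */
    struct fake_file f = { buf, size > 0 ? size - 1 : 0, 0 };
    va_list ap;
    va_start(ap, fmt);
    int n = toy_vformat(&f, fmt, ap);
    va_end(ap);
    if (size > 0)
        *f.ptr = '\0';
    return n;
}
```

When the output fits, both wrappers run exactly the same engine path; the only difference is the initial value of len.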

Despite using (in effect) the same number of instructions, I claim
the snprintf() call potentially runs (slightly) faster. Why?
Because we just called it, so it is probably already in the
instruction cache. (Of course, this depends on how the I-cache
interacts with the actual printf engine -- __svfprintf, in my
stdio.)

Again, the advantages of using snprintf twice (instead of snprintf
once, then sprintf once) are anywhere from tiny to nonexistent,
but the disadvantages appear always to be nonexistent. If one
balances "up to one microgram" against "zero micrograms", the way
to bet is that the up-to-one-microgram version is heavier. :-)
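
For comparison, a rough sketch of the snprintf-twice variant argued for above, with the second return value checked against the first. The helper name format_pair_checked and the "%s=%ld" format are invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc()ed string, or NULL on any failure. */
char *format_pair_checked(const char *key, long value)
{
    /* Pass 1: measure only. */
    int need = snprintf(NULL, 0, "%s=%ld", key, value);
    if (need < 0)
        return NULL;

    size_t size = (size_t)need + 1;
    char *buf = malloc(size);
    if (buf == NULL)
        return NULL;

    /* Pass 2 is bounded as well. If the count differs from pass 1, the two
       calls disagreed (for example, the arguments changed in between), so
       the result is rejected instead of overrunning the buffer. */
    int wrote = snprintf(buf, size, "%s=%ld", key, value);
    if (wrote != need) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

The extra cost over the sprintf version is one size argument and one comparison; whether that is worth the added source lines is exactly what the two posters disagree about.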
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #23

P: n/a
On 24 Nov 2004 19:51:38 GMT, Chris Torek <no****@torek.net> wrote:
In <co********@news1.newsguy.com> Chris Torek <no****@torek.net> writes:
Obviously such a call is faulty -- but using snprintf() twice will
limit any additional damage, and the fact that the number of
characters printed has changed may help find the bug. The *cost*
of using snprintf() twice (instead of snprintf() followed by plain
sprintf()) is likely negligible; it may even save CPU time and/or
(code) space.


In article <co**********@sunnews.cern.ch> Dan Pop <Da*****@cern.ch> wrote:
Please enlighten me about the way to implement snprintf in such a way
that it uses less (or even the same number of) CPU cycles as sprintf (for
the same arguments). Unless I'm missing something, snprintf has to do
everything sprintf does *and* perform some extra work.


Well, as it turns out, in my own implementation, it uses pretty
much the same number of instructions


Which is not only irrelevant to standard C, but does not at all
invalidate Dan's observation.

Personally, I would use sprintf for the second call, else next year I
would be scratching my head and wondering why I used an unneeded
constraint. (Gee, did I lose a line of code somewhere?)

--
Al Balmer
Balmer Consulting
re************************@att.net
Nov 14 '05 #24

P: n/a
Chris Torek wrote:
In article <pa********************************@ioplex.com>
Michael B Allen <mb*****@ioplex.com> wrote:
What is the rationale for snprintf to "return the number of characters
(excluding the trailing '\0') which would have been written to the final
string if enough space had been available"?

What is the name and address of the person responsible for this?

That would be me. :-)


Congratulations! It's a great change and I loved it the moment I saw it.

--
Ron House ho***@usq.edu.au
http://www.sci.usq.edu.au/staff/house
Nov 14 '05 #25

P: n/a
In <co*********@news2.newsguy.com> Chris Torek <no****@torek.net> writes:
Again, the advantages of using snprintf twice (instead of snprintf
once, then sprintf once) are anywhere from tiny to nonexistent,
but the disadvantages appear always to be nonexistent.


You're forgetting the real comparison: calling snprintf to do the real
work and checking its return value, as suggested in your post versus
calling sprintf to do the real work and ignoring its return value, as
suggested in my post.

I don't really care about the extra cycles, but I do care about the
source code bloat.

The reason I object to using snprintf for doing the real work is that it
is a conceptually more complex function and the added complexity doesn't
buy you anything. Furthermore, this added complexity is a minor potential
source of additional bugs (you have one more function argument that you
have to get right).

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union
Nov 14 '05 #26

P: n/a
In <co*********@news2.newsguy.com> Chris Torek <no****@torek.net> writes:
Since the usual (and only correct!) case is, as you pointed out, that
the second snprintf() does not overflow, the "ignore overflow" code
is not used -- so sprintf and snprintf take the same code path. The
only divergence is at the top of the first call, where sprintf says:

fake_file_struct.len = 32767; /* XXX */

(I think I used 32767, but it might be some other constant), while
snprintf has an extra argument and says:

fake_file_struct.len = buffer_size;


Does this mean that your sprintf implementation is broken? If I got you
right, it truncates its output to some arbitrary length.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union
Nov 14 '05 #27

P: n/a
In article <co**********@sunnews.cern.ch>, Dan Pop <Da*****@cern.ch> wrote:
Does this mean that your sprintf implementation is broken? If I got you
right, it truncates its output to some arbitrary length.


Well, yes; but also no. The Standard says only that I have to
handle a small fixed length, and I handle a larger fixed length. :-)

If I may digress a bit...

I really do not remember what I set the size to -- possibly INT_MAX,
as the read-count field in the stdio structure is an "int" (signed
for various reasons, although actually the write count is the only
one that *has* to be signed). I do, however, remember where I
first saw this magic constant of 32767, which also might help
explain how snprintf came about in the first place.

Some time in the 1980s, I was working on a problem with a program
that used sprintf to print into an internal buffer, but was
overrunning the buffer. I investigated the stdio implementation
at the time, and saw that it did pretty much just what I expected:
set the output pointer to point to the buffer you handed it, and
set the length to "a large number" (32767, in this case -- the code
had obviously been taken from the 16-bit PDP-11 environment, where
it was the maximum possible value; though I doubt signedness was
really required, so that it could have been unsigned and hence
65535).

In any case, it was immediately obvious that I could clone the code
but set the length smaller. There was a special flag associated
with the stdio data structure as well. The only question I had
was: what happens if you exceed the supplied count?

The answer, as it turns out, was that the stdio internal code would
call write() on whatever file descriptor number was lying around
in the stdio structure. The special flag was ignored entirely!
Trying to write snprintf() using this particular printf engine
(named "_doprnt" and written in VAX assembly language) turned out
to be hopeless. I considered modifying the assembly language; but
this was unappealing for multiple reasons.

When I discovered that we needed to rewrite the engine in C (both
because of the in-progress ANSI C standard, and because the folks
at UCBerkeley were porting the system to the Tahoe, which did not
have the VAX's "editpc" instruction), I took advantage of the
rewrite to fix a bunch of things, including adding snprintf(). (I
also wanted a "malloc()ing sprintf", a la GNU's asprintf(), but
that got vetoed. I did get funopen() put in, but it was not
adopted for C99.)

The only trick I missed in that early snprintf() implementation
was allowing the buffer to be NULL if the size was given as 0.
Given my experience with programs with buffer overruns and programs
that wanted an "allocating sprintf", I did know that snprintf()
had to limit its actual output, but return the number of characters
needed for the complete output. (Of course, we also needed va_copy()
so that vsnprintf() could be used to build an allocating vasprintf(),
but again that had to wait for C99. The two-pass method also has
those "atomicity issues" that can be solved using funopen() to
build asprintf()/vasprintf(), which would have been my preferred
solution.)
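
A minimal sketch of the two-pass allocating formatter mentioned above, built from C99 va_copy() and vsnprintf(). The names my_vasprintf and my_asprintf are hypothetical, picked so as not to clash with a platform's own asprintf()/vasprintf(); this is not the GNU or BSD implementation.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical two-pass vasprintf: on success stores a malloc()ed string
   in *out and returns its length; on failure returns -1. */
int my_vasprintf(char **out, const char *fmt, va_list ap)
{
    va_list ap2;
    va_copy(ap2, ap);                 /* the measuring pass uses its own copy */
    int need = vsnprintf(NULL, 0, fmt, ap2);
    va_end(ap2);
    if (need < 0)
        return -1;

    char *buf = malloc((size_t)need + 1);
    if (buf == NULL)
        return -1;

    /* Second pass writes for real; a differing count means the arguments
       changed between the passes (the atomicity issue noted above). */
    int wrote = vsnprintf(buf, (size_t)need + 1, fmt, ap);
    if (wrote != need) {
        free(buf);
        return -1;
    }
    *out = buf;
    return wrote;
}

/* Variadic convenience wrapper around my_vasprintf. */
int my_asprintf(char **out, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = my_vasprintf(out, fmt, ap);
    va_end(ap);
    return n;
}
```

Without va_copy() the first vsnprintf() call would leave ap in an indeterminate state, which is why this construction had to wait for C99.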
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #28
