
Eradicating a Buffer Overflow

Hello everyone, here is a sample program that I think has a possible
buffer overflow vulnerability:

#include <stdio.h>

int main(int argc, char *argv[])
{
    char junk[10]; /* Possibly dangerous */
    char wday[4];
    char mon[4];
    char time[9];
    int day;
    int year;

    if (argc == 2) {
        sscanf(argv[1],
               "%.3s, %d %.3s %4d %.8s %s",
               wday,
               &day,
               mon,
               &year,
               time,
               junk);
    }

    return 0;
}

Now, from what I can tell, wday, mon, and time will all be safe, because
there is a very strict limit to how much will be scanned in. The problem
seems to be the junk buffer.

As you can guess, this is designed to take a specifically formatted date
string and read it into variables. However, in the date format I am
processing (mbox/overview file type dates), there is an extra bit after
the time that could be an arbitrary length. Generally, it's not bigger
than 10, which is why I initially used that value, but it did not click in
my head before that this would cause a problem. Then, while I was thinking
about it today, I realized that you could put in more than 10 characters
after the time section of the string, and overflow the program. My
question is, what is the proper way of handling this? How can I remedy it?
I could change %s to %.9s or something of that nature, but that would be
ugly, because I would end up with a bunch of whitespace and padding at the
beginning or the end. Ideally I would not want that. How could I make this
a safer process?

- Arctic Fidelity

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Nov 15 '05 #1
23 Replies


"Arctic Fidelity" <sp**@sacrificumdeo.net> wrote:
> char junk[10]; /* Possibly dangerous */
> char wday[4];
> char mon[4];
> char time[9];
> int day;
> int year;
>
> if (argc == 2) {
>     sscanf(argv[1],
>            "%.3s, %d %.3s %4d %.8s %s",
>            wday,
>            &day,
>            mon,
>            &year,
>            time,
>            junk);
> }

> Then, while I was thinking
> about it today, I realized that you could put in more than 10 characters
> after the time section of the string, and overflow the program.

Yes, that's correct.

> My question is, what is the proper way of handling this? How can I remedy it?
> I could change %s to %.9s or something of that nature, but that would be
> ugly, because I would end up with a bunch of whitespace and padding at the
> beginning or the end.


If it really is junk, I would not bother to read it at all. If you want
to inspect it, and it can be of any length, there are several things you
can do. The simplest solution is probably not to copy it into another
char array, but to set a char pointer inside the argument string.

Richard
Nov 15 '05 #2
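
A minimal sketch of the "pointer inside the string" idea from the reply above, assuming the fixed fields are parsed with sscanf and the %n directive is used to learn where the trailing part begins. The function name find_trailer is hypothetical, and the widths are written as %3s and %8s (no dot), which is the standard spelling for a maximum field width in the scanf family.

```c
#include <stdio.h>

/*
 * Parse "Wdy, DD Mon YYYY HH:MM:SS <rest>" and return a pointer to <rest>
 * inside the original string, or NULL if the fixed fields do not match.
 * find_trailer is a hypothetical name used only for this illustration.
 */
const char *find_trailer(const char *s)
{
    char wday[4], mon[4], tm[9];
    int day, year;
    int used = -1;  /* stays -1 if sscanf stops before reaching %n */

    /* %n stores the number of characters consumed so far; it is not
       counted in the return value, so success is still 5. */
    if (sscanf(s, "%3s, %d %3s %4d %8s %n",
               wday, &day, mon, &year, tm, &used) != 5 || used < 0)
        return NULL;

    return s + used;  /* nothing is copied, so the trailer may be any length */
}
```

Because the trailer is never copied into a fixed-size array, its length cannot overflow anything; the caller just inspects the returned pointer.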

On Mon, 24 Oct 2005 11:04:40 -0400, Richard Bos
<rl*@hoekstra-uitgeverij.nl> wrote:
If it really is junk, I would not bother to read it at all. If you want
to inspect it, and it can be of any length, there are several things you
can do. The simplest solution is probably not to copy it into another
char array, but to set a char pointer inside the argument string.


Hmm, alright, not sure I understand that. It *is* junk, but since it is
junk, do I just leave out the "%s" part in sscanf and sscanf will not read
it? How would I just dump that extra part?

- Arctic Fidelity

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Nov 15 '05 #3

"Arctic Fidelity" <sp**@sacrificumdeo.net> wrote:
> On Mon, 24 Oct 2005 11:04:40 -0400, Richard Bos
> <rl*@hoekstra-uitgeverij.nl> wrote:
>> If it really is junk, I would not bother to read it at all. If you want
>> to inspect it, and it can be of any length, there are several things you
>> can do. The simplest solution is probably not to copy it into another
>> char array, but to set a char pointer inside the argument string.
>
> Hmm, alright, not sure I understand that. It *is* junk, but since it is
> junk, do I just leave out the "%s" part in sscanf and sscanf will not read
> it?

Quite.

> How would I just dump that extra part?


What do you mean, "dump"? You have a single string. You're not reading
from a file. You don't need to dump anything.

BTW, I note that you're doing this to a command line argument. You are
aware that any single command line argument - that is, any single member
of argv[] - is highly unlikely to contain your _entire_ command line,
and will probably not even contain any spaces unless what- or whoever
called your program has taken special precautions to see that it does?

Richard
Nov 15 '05 #4

Arctic Fidelity:
....
Hmm, alright, not sure I understand that. It *is* junk, but since it is
junk, do I just leave out the "%s" part in sscanf and sscanf will not
read it? How would I just dump that extra part?


You could use %*s, but since it is the last argument of sscanf, you could
just drop it as well.

Jirka
Nov 15 '05 #5
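
A small sketch of the %*s suggestion above: the asterisk tells sscanf to match the field but not store it, so no destination buffer is needed for the trailing part. The function name parse_date and its parameter list are assumptions made for this illustration, and the widths use the standard %3s / %8s form.

```c
#include <stdio.h>

/*
 * parse_date is a hypothetical helper. It fills the five date fields and
 * returns the number of fields assigned (5 on success). The final %*s
 * matches the trailing token without storing it anywhere.
 */
int parse_date(const char *s, char wday[4], int *day,
               char mon[4], int *year, char tm[9])
{
    return sscanf(s, "%3s, %d %3s %4d %8s %*s",
                  wday, day, mon, year, tm);
}
```

Since %*s is the last directive, leaving it out gives the same result here; it matters only when more fields follow the one being skipped.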

On Mon, 24 Oct 2005 12:01:41 -0400, Richard Bos
<rl*@hoekstra-uitgeverij.nl> wrote:
>> How would I just dump that extra part?
>
> What do you mean, "dump"? You have a single string. You're not reading
> from a file. You don't need to dump anything.

Hehe, I was not aware that you could specify a format string in sscanf
that did not encompass the entire string. :-) My bad. I suppose the
correct word is "ignore".

> BTW, I note that you're doing this to a command line argument. You are
> aware that any single command line argument - that is, any single member
> of argv[] - is highly unlikely to contain your _entire_ command line,
> and will probably not even contain any spaces unless what- or whoever
> called your program has taken special precautions to see that it does?


Yes, I am aware of that. This was just the smallest reasonable program
that I could make to demonstrate my question. My actual use of this
function has nothing to do with command line arguments or any such thing.
I figured you all wouldn't want to see the rest of the code, when all I
was asking about was a simple buffer overflow.

- Arctic Fidelity

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Nov 15 '05 #6

On Mon, 24 Oct 2005 11:04:40 -0400, Richard Bos
<rl*@hoekstra-uitgeverij.nl> wrote:
If it really is junk, I would not bother to read it at all. If you want
to inspect it, and it can be of any length, there are several things you
can do.


BTW, thank you for the quick response. I learn something new every day I
read this group.

- Arctic Fidelity

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Nov 15 '05 #7

On Mon, 24 Oct 2005 12:03:59 -0400, Jirka Klaue <jk****@tkn.tu-berlin.de>
wrote:
You could use %*s, but since it is the last argument of sscanf, you could
just drop it as well.


Thank you very much. I had no idea that %*s worked like that with sscanf.
I re-read the information in my manual about that, and lo, and behold,
look what I find! :-) Thanks a bunch.

- Arctic Fidelity

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Nov 15 '05 #8

"Arctic Fidelity" <sp**@sacrificumdeo.net> wrote:

# sscanf(argv[1],

# after the time section of the string, and overflow the program. My
# question is, what is the proper way of handling this? How can I remedy it?

Do you realize you aren't required to use *scanf? If the tools are
too difficult to use, get better tools.

--
SM Ryan http://www.rawbw.com/~wyrmwif/
One of the drawbacks of being a martyr is that you have to die.
Nov 15 '05 #9

On Mon, 24 Oct 2005 18:35:20 -0400, SM Ryan
<wy*****@tango-sierra-oscar-foxtrot-tango.fake.org> wrote:
Do you realize you aren't required to use *scanf? If the tools are
too difficult to use, get better tools.


I suppose I should say that I am unsure of what other tools in the
Standard C Library allow me to extract, in one function call, all the date
information from a string that I need, in such a straightforward fashion.
If there is, I'd love to hear it. :-) I personally came across a sample
usage of sscanf in documentation, and found that it was much faster
compared to my original idea of single character stepping through the date
string.

- Arctic Fidelity

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Nov 15 '05 #10

In article <op***************@frostbite.hsd1.va.comcast.net>,
Arctic Fidelity <sp**@sacrificumdeo.net> wrote:
> I personally came across a sample
> usage of sscanf in documentation, and found that it was much faster
> compared to my original idea of single character stepping through the date
> string.


Faster? In what sense? Faster to write the code, or faster execution
time, or faster to debug the security problems?
--
Chocolate is "more than a food but less than a drug" -- RJ Huxtable
Nov 15 '05 #11

SM Ryan <wy*****@tango-sierra-oscar-foxtrot-tango.fake.org> wrote:
"Arctic Fidelity" <sp**@sacrificumdeo.net> wrote:

# sscanf(argv[1],

# after the time section of the string, and overflow the program. My
# question is, what is the proper way of handling this? How can I remedy it?

Do you realize you aren't required to use *scanf? If the tools are
too difficult to use, get better tools.


And what better tool would you use in this particular situation?

Richard
Nov 15 '05 #12

On Mon, 24 Oct 2005 20:52:37 -0400, Walter Roberson
<ro******@ibd.nrc-cnrc.gc.ca> wrote:
Faster? In what sense? Faster to write the code, or faster execution
time, or faster to debug the security problems?


Faster to debug the code, the security issues, faster to write the code,
though I am not sure about execution speed, I haven't tested that.

- Arctic Fidelity

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Nov 15 '05 #13

rl*@hoekstra-uitgeverij.nl (Richard Bos) wrote:
# SM Ryan <wy*****@tango-sierra-oscar-foxtrot-tango.fake.org> wrote:
#
# > "Arctic Fidelity" <sp**@sacrificumdeo.net> wrote:
# >
# > # sscanf(argv[1],
# >
# > # after the time section of the string, and overflow the program. My
# > # question is, what is the proper way of handling this? How can I remedy it?
# >
# > Do you realize you aren't required to use *scanf? If the tools are
# > too difficult to use, get better tools.
#
# And what better tool would you use in this particular situation?

I usually write my own parser with things like state machines, strchr,
isxxx, strtol, etc. I prefer writing longer code if necessary to ensure
I have it under control.

Then again I'm not the one who felt the need to ask others if a scanf
format was safe.

--
SM Ryan http://www.rawbw.com/~wyrmwif/
Why are we here?
whrp
Nov 15 '05 #14
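
To make the hand-written approach above concrete, here is a rough sketch of two field readers built on strtol and isalpha. The names read_int and read_word are hypothetical, and this is only one possible shape for such helpers, not the parser the poster actually uses. Each helper advances a cursor through the string and reports failure instead of writing past a buffer.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Reads a decimal integer at *p (strtol skips leading whitespace),
   advances *p past it, and returns 1 on success or 0 on failure. */
int read_int(const char **p, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(*p, &end, 10);
    if (end == *p || errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;  /* no digits, or value does not fit in an int */
    *out = (int)v;
    *p = end;
    return 1;
}

/* Copies at most cap-1 alphabetic characters at *p into dst and
   terminates it. Assumes cap >= 1. Returns 1 if any were copied. */
int read_word(const char **p, char *dst, size_t cap)
{
    size_t n = 0;

    while (**p == ' ')
        ++*p;
    while (isalpha((unsigned char)**p) && n + 1 < cap)
        dst[n++] = *(*p)++;
    dst[n] = '\0';
    return n > 0;
}
```

A caller would chain these, checking each return value and the separator characters (',' and ' ') between fields, which is longer than one sscanf call but makes every bound explicit.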

> BTW, I note that you're doing this to a command line argument. You are
> aware that any single command line argument - that is, any single member
> of argv[] - is highly unlikely to contain your _entire_ command line,
> and will probably not even contain any spaces unless what- or whoever
> called your program has taken special precautions to see that it does?


Would you please tell me how to do that?
In my _limited_ knowledge, if the command line contains whitespaces
then the strings separated by these whitespaces are passed as different
members of argv[]. How to get all that in a single member?

Thanks in advance.

Nov 15 '05 #15

In article <11*********************@g14g2000cwa.googlegroups.com>,
WhoCares? <va***********@sify.com> wrote:
>> BTW, I note that you're doing this to a command line argument. You are
>> aware that any single command line argument - that is, any single member
>> of argv[] - is highly unlikely to contain your _entire_ command line,
>> and will probably not even contain any spaces unless what- or whoever
>> called your program has taken special precautions to see that it does?
>
> Would you please tell me how to do that?
> In my _limited_ knowledge, if the command line contains whitespaces
> then the strings separated by these whitespaces are passed as different
> members of argv[]. How to get all that in a single member?


Mechanisms for passing spaces in arguments are OS or shell specific,
and should be asked about in an appropriate newsgroup.

[OT]

Commonly, passing spaces in involves quoting of arguments. But the
exact quote characters and escape rules are OS or shell specific.

Unix ksh:

A3="arg 3"
./myprog "arg 1" 'arg 2' $A3

The behaviour in the last of those cases especially is not the same
on other shells or OS's.
--
I am spammed, therefore I am.
Nov 15 '05 #16

"WhoCares?" <va***********@sify.com> wrote:

[ Please do not remove attributions that are still relevant. Thanks. ]
>> BTW, I note that you're doing this to a command line argument. You are
>> aware that any single command line argument - that is, any single member
>> of argv[] - is highly unlikely to contain your _entire_ command line,
>> and will probably not even contain any spaces unless what- or whoever
>> called your program has taken special precautions to see that it does?
>
> Would you please tell me how to do that?

Depends on the OS, and under some OSes, on the shell.

> In my _limited_ knowledge, if the command line contains whitespaces
> then the strings separated by these whitespaces are passed as different
> members of argv[].


That's the most usual case, yes. But consider an OS which does not have
a command line, but allows you to fill in program parameters in the
symlink properties dialog. Or consider a shell which allows you to
escape whitespace.

Richard
Nov 15 '05 #17

In article <op***************@frostbite.hsd1.va.comcast.net>,
Arctic Fidelity <sp**@sacrificumdeo.net> wrote:
> On Mon, 24 Oct 2005 20:52:37 -0400, Walter Roberson
> <ro******@ibd.nrc-cnrc.gc.ca> wrote:
>> Faster? In what sense? Faster to write the code, or faster execution
>> time, or faster to debug the security problems?
>
> Faster to debug the code, the security issues, faster to write the code,
> though I am not sure about execution speed, I haven't tested that.


You snipped the context that you had encountered scanf() in some
documentation and had found using it to be faster.

With regard to debugging the security issues, you should be
taking into account that in order to debug those issues, you ended
up having to post to Usenet and to track through several days of
discussions in order to get the security issues clear. If you had
written your own small code section that did not use scanf(),
then you could have had it done in perhaps half an hour. And since
debugging the code includes debugging the security issues, you
weren't done debugging the code for at least several days.

Similarly, part of writing the code is debugging it, and documenting
it. Again you had the several days of delay while you found out
what scanf() does. So writing the code was in fact slower than if you
had taken a more direct approach without scanf().

The only "faster" left is execution time, which you indicate that
you did not measure.

I think you should be reconsidering whether it was any "faster"
to use scanf() or not. It looks to me that using scanf() was slower
in every measure you were taking into account.
--
Chocolate is "more than a food but less than a drug" -- RJ Huxtable
Nov 15 '05 #18

On Tue, 25 Oct 2005 11:40:23 -0400, Walter Roberson
<ro******@ibd.nrc-cnrc.gc.ca> wrote:
>>> Faster? In what sense? Faster to write the code, or faster execution
>>> time, or faster to debug the security problems?
>>
>> Faster to debug the code, the security issues, faster to write the code,
>> though I am not sure about execution speed, I haven't tested that.
>
> You snipped the context that you had encountered scanf() in some
> documentation and had found using it to be faster.

My sincere apologies. I shall try to take better note of that.

> With regard to debugging the security issues, you should be
> taking into account that in order to debug those issues, you ended
> up having to post to Usenet and to track through several days of
> discussions in order to get the security issues clear. If you had
> written your own small code section that did not use scanf(),
> then you could have had it done in perhaps half an hour. And since
> debugging the code includes debugging the security issues, you
> weren't done debugging the code for at least several days.

Actually, I received a response that answered my question well enough
that I knew where and how to look in my documentation, and I had the
entire question settled, from first post to resolution, in far less than
half a day. Writing and properly fixing an equivalent working program of
my own would have taken at least that long, since my speed at writing C
code is not nearly fast enough for that yet.

> Similarly, part of writing the code is debugging it, and documenting
> it. Again you had the several days of delay while you found out
> what scanf() does. So writing the code was in fact slower than if you
> had taken a more direct approach without scanf().

Whether the approach is more direct or not is debatable. I would say that
in comparison with the estimated amount of time it would have taken to
properly fix and verify the code I would have written, sscanf() + Usenet
discussion time (and by this I mean the time before I fixed the problem)
was faster.

> The only "faster" left is execution time, which you indicate that
> you did not measure.
>
> I think you should be reconsidering whether it was any "faster"
> to use scanf() or not. It looks to me that using scanf() was slower
> in every measure you were taking into account.

Having reconsidered it, I have come to the conclusion that sscanf
seems to be at least equivalent in "speed" (with regards to those issues
stated above) to writing my own code by hand, taking into account my
relative speed at writing such code at this moment in time.

- Arctic Fidelity

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Nov 15 '05 #19

On Tue, 25 Oct 2005 08:43:33 -0400, SM Ryan
<wy*****@tango-sierra-oscar-foxtrot-tango.fake.org> wrote:
Then again I'm not the one who felt the need to ask others if a scanf
format was safe.


In some regards, I feel almost as though having asked this question has
earned me even the slightest bit of disdain from some particular readers
of this group. Am I missing something? Forgive me if I am reading into
such things. I am under the impression, perhaps, that scanf and such
functions have at them a group of people who are in at least partially
strong objection to their use? If so, is there some history or methods or
something else about these scanf tools with which I am not familiar that
has earned them such apparent dislike? If not, well, then do please ignore
the far too naive jabberings of a simpleton of the C world, a relative
newcomer.

- Arctic Fidelity

--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
Nov 15 '05 #20

Arctic Fidelity wrote:
> ... I am under the impression, perhaps, that scanf and such
> functions have at them a group of people who are in at least
> partially strong objection to their use? If so, is there some
> history or methods or something else about these scanf tools
> with which I am not familiar that has earned them such apparent
> dislike?


The problem lies with teachers and tutors who are too eager to
teach the entire language as quickly as possible, without focussing
on details. You'll often see newbies totally unaware of the basic
problems with code like...

scanf("%s", my_string);
...

The problem with scanf is that it is too easy to misuse, especially
if you're just using 'default' options. Of course, that doesn't mean
it can't be used correctly. It just means that people are prone to
preferring alternative tools.

Unfortunately, writing bullet proof input routines that deal with
both good and bad input robustly (without undefined behaviour), and
which are able to continue with further input (in the absence of EOF),
is not a trivial task in C!

--
Peter

Nov 15 '05 #21
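
As an illustration of the point above, a bare %s has no upper bound, while a %Ns conversion stores at most N characters plus the terminating null. The sketch below ties the width to the buffer size with a stringifying macro so the two cannot drift apart. TOKEN_LEN, STR, STR_ and read_token are all hypothetical names, and sscanf is used here only so the example needs no input stream; the same width rule applies to scanf and fscanf.

```c
#include <stdio.h>

#define TOKEN_LEN 15      /* maximum characters stored, excluding the null */
#define STR_(x) #x
#define STR(x) STR_(x)    /* expands TOKEN_LEN first, then stringifies it */

/*
 * read_token is a hypothetical helper. The format string is assembled
 * as "%15s", so at most TOKEN_LEN characters are written to dst before
 * the terminating null. Returns 1 if a token was read, 0 otherwise.
 */
int read_token(const char *src, char dst[TOKEN_LEN + 1])
{
    return sscanf(src, "%" STR(TOKEN_LEN) "s", dst) == 1;
}
```

Input longer than the width is truncated rather than overflowing; whatever remains is simply left unread.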

"Arctic Fidelity" <sp**@sacrificumdeo.net> wrote:
On Tue, 25 Oct 2005 08:43:33 -0400, SM Ryan
<wy*****@tango-sierra-oscar-foxtrot-tango.fake.org> wrote:
Then again I'm not the one who felt the need to ask others if a scanf
format was safe.


In some regards, I feel almost as though having asked this question has
earned me even the slightest bit of disdain from some particular readers
of this group.


Some particular readers, undoubtedly. But you might read the group for a
bit longer before deciding how much to worry about Mr. SM "Could not
quote properly for his life" Ryan's opinion of yourself. As for me, I
feel no disdain for you.

Richard
Nov 15 '05 #22

"Arctic Fidelity" <sp**@sacrificumdeo.net> wrote in message
news:op***************@frostbite.hsd1.va.comcast.net...
> On Mon, 24 Oct 2005 18:35:20 -0400, SM Ryan
> <wy*****@tango-sierra-oscar-foxtrot-tango.fake.org> wrote:
>> Do you realize you aren't required to use *scanf? If the tools are
>> too difficult to use, get better tools.
>
> I suppose I should say that I am unsure of what other tools in the
> Standard C Library allow me to extract, in one function call, all the
> date information from a string that I need, in such a straightforward fashion.

So write a function of your own. Duh.

> If there is, I'd love to hear it. :-) I personally came across a sample
> usage of sscanf in documentation, and found that it was much faster
> compared to my original idea of single character stepping through the date
> string.


Faster is so 1960's. You can't tell what is "faster" by looking at code.
Are you writing code for a microwave oven? If your code is for a modern
CPU then a good compiler will probably modify your code into something
fast. If you detect a slowdown, or want to, then run a profiler. No one
(except Gods - who may post here from time to time) can predict a
speedup - things you do to speed up your code may prevent the compiler
from speeding up your code. Just write code that solves the problem.

--
Mabden
Nov 15 '05 #23

On Mon, 24 Oct 2005 10:09:52 -0400, "Arctic Fidelity"
<sp**@sacrificumdeo.net> wrote:

<snip>
> sscanf(argv[1],
>        "%.3s, %d %.3s %4d %.8s %s", <snip: various args ending with junk which is char[10]>

Those should be %3s etc. "Dot" numbers in *scanf are nonstandard. (Cf.
*printf where %Ns pads to minimum and %.Ns truncates to maximum.)

> As you can guess, this is designed to take a specifically formatted date
> string and read it into variables. However, in the date format I am
> processing (mbox/overview file type dates), there is an extra bit after
> the time that could be an arbitrary length. Generally, it's not bigger
> than 10, which is why I initially used that value, but it did not click in
> my head before that this would cause a problem. Then, while I was thinking
> about it today, I realized that you could put in more than 10 characters
> after the time section of the string, and overflow the program. My
> question is, what is the proper way of handling this? How can I remedy it?

As already answered, the real answer is %*s or nothing, but one nit:

> I could change %s to %.9s or something of that nature, but that would be
> ugly, because I would end up with a bunch of whitespace and padding at the
> beginning or the end. <snip>


*scanf %s, with or without a length limit, will always skip leading
whitespace and stop at following whitespace, so even if the supplied
string (which you said later isn't really an argv[] string) contains
padding this particular format wouldn't put it in the variable.

- David.Thompson1 at worldnet.att.net
Nov 15 '05 #24
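
Putting the two corrections above together, here is a sketch for the case where the trailing part should be kept after all: the widths are written without dots, and the last field is bounded with %9s to fit a 10-byte buffer. The function name parse_with_trailer is hypothetical. It also illustrates the whitespace point: the leading spaces before the trailer are skipped, so the stored value has no padding.

```c
#include <stdio.h>

/*
 * parse_with_trailer is a hypothetical helper. It stores the time field
 * and at most 9 characters of the trailing token, and returns the number
 * of fields assigned (6 when every field matched).
 */
int parse_with_trailer(const char *s, char tm[9], char junk[10])
{
    char wday[4], mon[4];
    int day, year;

    /* %9s skips leading whitespace, stops at the next whitespace, and
       never stores more than 9 characters plus the terminating null. */
    return sscanf(s, "%3s, %d %3s %4d %8s %9s",
                  wday, &day, mon, &year, tm, junk);
}
```

A trailer longer than 9 characters is cut short rather than overflowing junk; whether silent truncation is acceptable depends on what the caller does with the value.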
