Bounds checked string library

A function like strcpy currently takes two unbounded pointers.

Unbounded pointers, i.e. pointers that carry no
range information, have catastrophic failure modes,
especially when *writing* to main memory.

A better string library would accept *bounded* pointers.
We would have then:
char *strcpyN(char *destination, size_t bound1,
char *src,size_t bound2);
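
The post gives only the prototype, so the behaviour below is an assumption: bound1 is taken to be the capacity of the destination, bound2 the maximum number of bytes that may be read from src, and an overlong source is truncated. The source parameter is also declared const here, which the original prototype does not do. A minimal sketch under those assumptions:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical semantics, not specified in the post:
   - never writes more than bound1 bytes to destination
   - never reads more than bound2 bytes from src
   - always terminates destination, truncating if needed
   Returns destination, or NULL if the arguments are unusable. */
char *strcpyN(char *destination, size_t bound1,
              const char *src, size_t bound2)
{
    size_t i = 0;

    if (destination == NULL || src == NULL || bound1 == 0)
        return NULL;

    /* Leave room for the terminator in the destination. */
    while (i < bound1 - 1 && i < bound2 && src[i] != '\0') {
        destination[i] = src[i];
        i++;
    }
    destination[i] = '\0';
    return destination;
}
```

A call might look like strcpyN(buf, sizeof buf, input, input_size), where buf is an array and input_size is whatever bound the caller knows for input.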

Bounded pointers are used in C in many interfaces.
This is absolutely nothing new.

Their use could become more general if the
functions in the C library dropped their obsession
with unbounded pointers and accepted this type too.

Of course, clever compilers could automatically pass
size information to the called function, but that would
be just an improvement. What is needed is a standard
that allows general use of this type of
pointer in applications that need it.

Because in many applications security is more
important than sparing a few cycles.

Of course there exist many string libraries that do
this, but each has its own syntax. Much better
would be if standard C would encourage the use
of bounded pointers with a string library
that uses them.

jacob
Nov 14 '05 #1
On Sat, 14 Feb 2004 22:48:48 +0100, in comp.lang.c , "jacob navia"
<ja***@jacob.remcomp.fr> wrote:
A function like strcpy takes now, two unbounded pointers.

Unbounded pointers, i.e. pointers where there is no
range information, have catastrophic failure modes
specially when *writing* to main memory.
Jacob
in this and your other post, I personally feel you're solving problems that
don't exist. A good compiler will have its own way to check them during
debugging, and careful programming will avoid them in production.

Plus frankly, I don't see this as a problem anyway. I'm passing an array
to a function and doing something with it - in your model I need to know
how big it is when I write the fn, which is a serious problem. Imagine I'm
reading in data from a file, and mallocing the memory for it, I don't know
at compile time how much memory ie how large an array I need.

I understand what you're trying to do but I do genuinely think that it's a
problem that's already solvable by adequate quality programming.
A better string library would accept *bounded* pointers.
We would have then:
char *strcpyN(char *destination, size_t bound1,
char *src,size_t bound2);
strncpy does most of this already. The bit it doesn't do, checking the size
of the destination, is trivial to check yourself before calling it.
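
For illustration, one way the manual check described here could be packaged. The helper name copy_checked and its return convention are assumptions made for this sketch, not part of any standard library:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits.
   Returns 0 on success, -1 if the arguments are unusable or
   src (plus its terminator) does not fit in dst_size bytes.
   On failure dst is left untouched. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;
    if (strlen(src) >= dst_size)
        return -1;                /* refuse rather than truncate */

    strncpy(dst, src, dst_size - 1);
    /* strncpy writes no terminator when src has exactly
       dst_size - 1 characters, so terminate explicitly. */
    dst[dst_size - 1] = '\0';
    return 0;
}
```

Whether to refuse or to truncate on overflow is a design choice; this sketch refuses so that the caller has to decide what to do with data that does not fit.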
Their use could be made more generalized when the
functions in the C library would leave the obsession
with unbounded pointers and accept this type too.


better to create a new type, say "string", which contains the size info
within the type. What does that remind me of... :-)
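
A rough sketch of what such a type could look like in plain C. The struct layout and the function names string_init and string_append are invented for this example; they are one possible design, not an existing interface:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type that carries its own size information. */
struct string {
    size_t capacity;  /* bytes available in data, terminator included */
    size_t length;    /* characters stored, terminator excluded */
    char  *data;      /* storage supplied by the caller */
};

/* Wrap an existing buffer as an empty string. Returns 0 on success. */
int string_init(struct string *s, char *buffer, size_t capacity)
{
    if (s == NULL || buffer == NULL || capacity == 0)
        return -1;
    s->capacity = capacity;
    s->length = 0;
    s->data = buffer;
    buffer[0] = '\0';
    return 0;
}

/* Append text. Returns -1 and leaves s unchanged if it does not fit. */
int string_append(struct string *s, const char *text)
{
    size_t n;

    if (s == NULL || text == NULL)
        return -1;
    n = strlen(text);
    if (n >= s->capacity - s->length)
        return -1;
    memcpy(s->data + s->length, text, n + 1);  /* copies the terminator too */
    s->length += n;
    return 0;
}
```

Because the capacity travels with the string, no call site needs to pass or re-check a size.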
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #2

"Mark McIntyre" <ma**********@spamcop.net> wrote in message
news:rj********************************@4ax.com...
On Sat, 14 Feb 2004 22:48:48 +0100, in comp.lang.c , "jacob navia"
<ja***@jacob.remcomp.fr> wrote:
A function like strcpy takes now, two unbounded pointers.

Unbounded pointers, i.e. pointers where there is no
range information, have catastrophic failure modes
specially when *writing* to main memory.
Jacob
in this and your other post, I personally feel you're solving problems

that don't exist. A good compiler will have its own way to check them during
debugging, and careful programming will avoid them in production.

No. There is no way to write the strcpy function without
provoking a catastrophic failure with unbounded pointers.

Plus frankly, I don't see this as a problem anyway. I'm passing an array
to a function and doing something with it - in your model I need to know
how big it is when I write the fn, which is a serious problem.
This is precisely my point. This *is* a serious problem. You
*must* check the bounds of the array when writing to it.

Most C programmers do not do it because it is incredibly
tedious:

if (strlen(src) < sizeof(dst))
    strcpy(dst, src);

You see a lot of code like that?

Imagine I'm
reading in data from a file, and mallocing the memory for it, I don't know
at compile time how much memory ie how large an array I need.

fread accepts bounded pointers, since the input buffer is bounded
by the size to read in!!!
I understand what you're trying to do but I do genuinely think that its a
problem thats already solvable by adequate quality programming.

Yes, but it is VERY TEDIOUS so most people (me included)
do not do it!!!

This is precisely the problem. The interface of those functions
is plain wrong.
A better string library would accept *bounded* pointers.
We would have then:
char *strcpyN(char *destination, size_t bound1,
char *src,size_t bound2);


strncpy does most of this already. The bit it doesn't do, checking the

size of the destination, is trivial to check yourself before calling it.


At EACH CALL ???

This is of course possible but it is BAD DESIGN!

You are doing what a machine could do much faster.
What's the use of computers if we are going to waste
time and effort doing their job?
Their use could be made more generalized when the
functions in the C library would leave the obsession
with unbounded pointers and accept this type too.


better to create a new type, say "string", which contains the size info
within the type. What does that remind me of... :-)


Yes. The other solution is to overload the [] operator
and use bounded strings. This is much easier but
probably would provoke such an outcry that a smaller
but still useful solution is better.

jacob
Nov 14 '05 #3
The concept of the C language is to give the programmer the power of
assembly language, but with increased visual comprehension.

This is the point you are missing.

If you want specialized languages, try Perl or D, for example,
which have the specialized features you are referring to.
Nov 14 '05 #4
On Sun, 15 Feb 2004 01:02:59 +0100, in comp.lang.c , "jacob navia"
<ja***@jacob.remcomp.fr> wrote:

"Mark McIntyre" <ma**********@spamcop.net> wrote in message
news:rj********************************@4ax.com...
On Sat, 14 Feb 2004 22:48:48 +0100, in comp.lang.c , "jacob navia"
<ja***@jacob.remcomp.fr> wrote:
>A function like strcpy takes now, two unbounded pointers.
>
>Unbounded pointers, i.e. pointers where there is no
>range information, have catastrophic failure modes
>specially when *writing* to main memory.
Jacob
in this and your other post, I personally feel you're solving problems

that
don't exist. A good compiler will have its own way to check them during
debugging, and careful programming will avoid them in production.


No. There is no way to write the strcpy function without
provoking a catastrophic failure with unbounded pointers.


That's not what I said. The compiler writer can implement bounds checking in
debug mode without changing the syntax of strcpy. And you, the user of
strcpy, can avoid buffer overflows by writing careful code.
Plus frankly, I don't see this as a problem anyway. I'm passing an array
to a function and doing something with it - in your model I need to know
how big it is when I write the fn, which is a serious problem.


This is precisely my point. This *is* a serious problem. You
*must* check the bounds of the array when writing to it.


Rubbish. When you read 4 bytes from a file into a 12-byte array, do you
need to check that the destination is big enough first? When you copy those
4 bytes into a 128 byte array, do you need to check again?

When you need to check, you should check. When you don't, you don't.
Most C programmers do not do it because is incredible
tedious:
If you do it like this, it's tedious. However, if your code requires such
checking, then you /should/ put it in.
if (strlen(src) < sizeof(dst))
    strcpy(dst, src);
This is not how to do it though.
You see a lot of code like that?


I see a lot of code that makes sure its array bounds can't be overflowed.
But not by daft checks like the above.
Imagine I'm
reading in data from a file, and mallocing the memory for it, I don't know
at compile time how much memory ie how large an array I need.


fread accepts bounded pointers, since the input buffer is bounded
by the size to read in!!!


Who said I was using fread? And what's to stop the simple mistake

int somedata[128];
fread(somedata, 129,1, file);
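
One common way to make this kind of mistake less likely is to derive the size arguments from the array itself instead of typing numbers by hand. The macro ARRAY_COUNT and the helper read_ints are names made up for this sketch:

```c
#include <stdio.h>

/* Number of elements in a true array (not valid for a pointer). */
#define ARRAY_COUNT(a) (sizeof (a) / sizeof (a)[0])

/* Hypothetical helper: reads at most count ints into buffer and
   returns how many were actually read. */
size_t read_ints(int *buffer, size_t count, FILE *file)
{
    if (buffer == NULL || file == NULL)
        return 0;
    return fread(buffer, sizeof buffer[0], count, file);
}
```

With int somedata[128], the call read_ints(somedata, ARRAY_COUNT(somedata), file) cannot ask for more elements than the array holds. This only works where the array declaration is visible; once the array has decayed to a pointer, the count has to be passed along explicitly.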
I understand what you're trying to do but I do genuinely think that its a
problem thats already solvable by adequate quality programming.


Yes, but it is VERY TEDIOUS so most people (me included)
do not do it!!!


Then you're doing it wrong.
>A better string library would accept *bounded* pointers.


strncpy does most of this already. The bit it doesn't do, checking the

size
of the destination, is trivial to check yourself before calling it.


At EACH CALL ???


No, at each declaration.
>Their use could be made more generalized when the
>functions in the C library would leave the obsession
>with unbounded pointers and accept this type too.


better to create a new type, say "string", which contains the size info
within the type. What does that remind me of... :-)


Yes. The other solution is to overload the [] operator
and use bounded strings. This is much easier but
probably would provoke such an outcry that a smaller
but still useful solution is better.


I believe the phrase I'm searching for is "you know where C++ can be
found...".

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #5


jacob navia wrote:
"Mark McIntyre" <ma**********@spamcop.net> wrote in message
news:rj********************************@4ax.com...
On Sat, 14 Feb 2004 22:48:48 +0100, in comp.lang.c , "jacob navia"
<ja***@jacob.remcomp.fr> wrote:

A function like strcpy takes now, two unbounded pointers.

Unbounded pointers, i.e. pointers where there is no
range information, have catastrophic failure modes
specially when *writing* to main memory.
Jacob
in this and your other post, I personally feel you're solving problems


that
don't exist. A good compiler will have its own way to check them during
debugging, and careful programming will avoid them in production.



No. There is no way to write the strcpy function without
provoking a catastrophic failure with unbounded pointers.


Plus frankly, I don't see this as a problem anyway. I'm passing an array
to a function and doing something with it - in your model I need to know
how big it is when I write the fn, which is a serious problem.



This is precisely my point. This *is* a serious problem. You
*must* check the bounds of the array when writing to it.

Most C programmers do not do it because it is incredibly
tedious:

if (strlen(src) < sizeof(dst))
    strcpy(dst, src);

You see a lot of code like that?


No, I don't, and that's because where I work we
consider the use of strcpy() to be bad practice
and use strncpy() instead. Unadorned strcat(),
strcpy(), gets(), sprintf(), etc. don't pass code inspections.
They are treated as historical curiosities
which should not, in general, be used in production
code. And yes, checking the size of the destination
is standard coding practice in our shop.
Imagine I'm
reading in data from a file, and mallocing the memory for it, I don't know
at compile time how much memory ie how large an array I need.


fread accepts bounded pointers, since the input buffer is bounded
by the size to read in!!!

I understand what you're trying to do but I do genuinely think that it's a
problem that's already solvable by adequate quality programming.



Yes, but it is VERY TEDIOUS so most people (me included)
do not do it!!!

This is precisely the problem. The interface of those functions
is plain wrong.


strcat(), et al., were the first of the "strxxx" functions
to be issued and put into the library. After the problems
were identified, the "strnxxx" functions were issued.
(This was before ANY standard, I believe.) I would have loved
to see the former retired from the libraries, but that did
not happen, probably because too much existing code would break.
Yes, the functions have an interface which is dangerous when
used without discipline, but that, IMO, is not a reason
to change (or augment) the language specification.

Patient: "Doctor, it hurts when I do this."
Doctor: "Then don't do it."

A better string library would accept *bounded* pointers.
We would have then:
char *strcpyN(char *destination, size_t bound1,
char *src,size_t bound2);
strncpy does most of this already. The bit it doesn't do, checking the


size
of the destination, is trivial to check yourself before calling it.



At EACH CALL ???

This is of course possible but it is BAD DESIGN!


Why is it bad design? Please explain.

You are doing what a machine could do much faster.
What's the use of computers if we are going to waste
time and effort doing their job?

Their use could be made more generalized when the
functions in the C library would leave the obsession
with unbounded pointers and accept this type too.

As far as I know, there is nothing in the standard
which specifies the contents of a pointer. (Please
correct me if I am wrong.) Thus an *implementation*
may choose to somehow store the limits (bounds) of the valid area
for that pointer to operate on as part of the pointer
representation and do something implementation-dependent
when the pointer is out of bounds.
This would, of course, break code which relies
on pointer arithmetic, but that is bad practice
anyway. I know of no implementation which has
chosen to do so.
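
What an implementation might do internally can be approximated at the source level with an explicit "fat pointer". This is only a sketch of the idea; struct bounded_ptr and bounded_store are hypothetical names, and a real implementation would hide this inside the pointer representation:

```c
#include <stddef.h>

/* Hypothetical bounded pointer: a base address plus the number of
   bytes that may legally be accessed through it. */
struct bounded_ptr {
    char  *base;  /* first valid byte */
    size_t size;  /* count of valid bytes starting at base */
};

/* Store value at the given index.
   Returns 0 on success, -1 if the index is out of bounds. */
int bounded_store(struct bounded_ptr p, size_t index, char value)
{
    if (p.base == NULL || index >= p.size)
        return -1;
    p.base[index] = value;
    return 0;
}
```

Here an out-of-bounds write is reported through the return value; an implementation doing this transparently would have to pick its own implementation-dependent reaction, such as trapping.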

The implementation may also choose to implement
the standard libraries in a way in which the size
of the array_of_whatever is stored in the size_t
bytes just before the array and perform bounds
checking that way. (Much like some malloc implementations
keep track of the size of an allocated area.)
Again, I know of no *C* implementation
which has chosen to do so. (Although some Fortran
string-handling packages do exactly that.)
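
The size-before-the-array idea can be sketched in user code for char arrays. Everything here (struct sized_block, sized_alloc, sized_capacity, sized_copy, sized_free) is invented for illustration and relies on a C99 flexible array member; it is not how any particular C library works:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: the capacity sits just before the data. */
struct sized_block {
    size_t capacity;
    char   data[];   /* C99 flexible array member */
};

/* Allocate capacity bytes and remember the capacity in the header. */
char *sized_alloc(size_t capacity)
{
    struct sized_block *b;

    if (capacity > SIZE_MAX - sizeof *b)
        return NULL;
    b = malloc(sizeof *b + capacity);
    if (b == NULL)
        return NULL;
    b->capacity = capacity;
    return b->data;
}

/* Recover the capacity from a pointer returned by sized_alloc. */
size_t sized_capacity(const char *data)
{
    const struct sized_block *b = (const struct sized_block *)
        (const void *)(data - offsetof(struct sized_block, data));
    return b->capacity;
}

/* Copy src into dst only if it fits; dst must come from sized_alloc. */
int sized_copy(char *dst, const char *src)
{
    size_t n = strlen(src);

    if (n >= sized_capacity(dst))
        return -1;
    memcpy(dst, src, n + 1);
    return 0;
}

void sized_free(char *data)
{
    if (data != NULL)
        free(data - offsetof(struct sized_block, data));
}
```

The obvious limitation is that the check is only valid for pointers that really came from sized_alloc; passing any other pointer to sized_capacity reads memory that is not a header.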

Thus there appear to be, at least on the face of it,
a couple of possible ways to do this without changing
the language definition.

better to create a new type, say "string", which contains the size info
within the type. What does that remind me of... :-)



Yes. The other solution is to overload the [] operator
and use bounded strings. This is much easier but
probably would provoke such an outcry that a smaller
but still useful solution is better.

jacob



--
Ñ
"It is impossible to make anything foolproof because fools are so
ingenious" - A. Bloch

Nov 14 '05 #6

"Nick Landsberg" <hu*****@att.net> wrote in message
if (strlen(src) < sizeof(dst))
    strcpy(dst, src);
You see a lot of code like that?

No, I don't, and that's because where I work we
consider the use of strcpy() to be bad practice
and use strncpy() instead.

The problem is that src will be longer than dst for a reason. Maybe someone
has an unusually long address, for example. Simply putting in a check
removes the undefined behaviour, but substitutes wrong behaviour, which is
no benefit at all. So you need either to exit the program with an error
message or code some sort of intelligent address truncator.

As far as I know, there is nothing in the standard
which specifies the contents of a pointer. (Please
correct me if I am wrong.) Thus an *implementation*
may choose to somehow store the limits (bounds) of the valid area
for that pointer to operate on as part of the pointer
representation and do something implementation-dependent
when the pointer is out of bounds.
This would, of course, break code which relies
on pointer arithmetic, but that is bad practice
anyway. I know of no implementation which has
chosen to do so.

It is actually illegal to calculate an invalid address, unless it is one
beyond the end of a valid range, so strictly conforming code would not
break.
The AS400 implementation actually does this (see the size of a
sizeof(pointer) thread).
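
To illustrate the one-past-the-end rule mentioned above: the address just beyond the last element may be computed and compared against, but not dereferenced. The function sum_ints is a made-up example:

```c
#include <stddef.h>

/* Sums n ints using an end pointer one past the last element. */
long sum_ints(const int *values, size_t n)
{
    const int *p;
    const int *end;
    long total = 0;

    if (values == NULL)
        return 0;

    end = values + n;  /* valid to compute and compare, never dereferenced */
    for (p = values; p != end; ++p)
        total += *p;
    return total;
}
```

Computing values + n + 1, by contrast, would already be undefined behaviour, even if the result were never used.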


Nov 14 '05 #7
jacob navia wrote:
A function like strcpy takes now, two unbounded pointers.
No, it takes two pointers. They are bounded by common sense programming
safeguards (such as Everything Must Be Somewhere).
Unbounded pointers, i.e. pointers where there is no
range information, have catastrophic failure modes
specially when *writing* to main memory.
Not if you don't let them fail.
A better string library would accept *bounded* pointers.
A /different/ string library would accept bounded pointers.
We would have then:
char *strcpyN(char *destination, size_t bound1,
char *src,size_t bound2);

Bounded pointers are used in C in many interfaces.
This is absolutely nothing new.
Indeed. Lots of people have written string libraries. (Me too!)
Their use could be made more generalized when the
functions in the C library would leave the obsession
with unbounded pointers and accept this type too.
That would be a mistake. C is fast right now, because it assumes the
programmer knows what he's doing. When bounds-checking makes sense, the C
programmer puts it in (and, if he doesn't, on his own head be it). If it
doesn't make sense, why bother? You'll just slow everything down.
Of course, clever compilers could pass automatically
size information to the called function, but that would
be just an improvement. What is needed is a standard
that would allow generalized use of this type of
pointers in applications that need them.

Because in many applications security is more
important than sparing a few cycles.
If you want C++'s std::string, you know where to find it. And if you don't,
well, here is C - it's lean and it's mean and it's very, very keen, so
please don't slow it down for the rest of us just because some people don't
know when to bounds-check and when not to.
Of course there exist many string libraries that do
this, but each has its own syntax. Much better
would be if standard C would encourage the use
of bounded pointers with a string library
that uses them.


If it's all the same to you, I think I prefer it the way it is.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton
Nov 14 '05 #8
In 'comp.lang.c', "jacob navia" <ja***@jacob.remcomp.fr> wrote:
Most C programmers do not do it because it is incredibly
tedious:

if (strlen(src) < sizeof(dst))
    strcpy(dst, src);

You see a lot of code like that?


I have this in my personal standard C library in my "STR" module :

/* ---------------------------------------------------------------------
STR_safecopy()
---------------------------------------------------------------------
safe string copy
In case of overflow, the string is truncated.
---------------------------------------------------------------------
I: destination address
I: destination size
I: source address
O: destination address
--------------------------------------------------------------------- */
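#include <string.h>

char *STR_safecopy (char *const des
                   ,size_t const size
                   ,char const *const src)
{
   char *s_out = NULL;

   if (des && size && src)
   {
      /* Copy at most size - 1 characters, but never read
         past the terminator of src. */
      size_t len = strlen (src);

      if (len > size - 1)
      {
         len = size - 1;
      }
      memcpy (des, src, len);
      des[len] = 0;
      s_out = des;
   }

   return s_out;
}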

On my "str.h", I could add if I dared:

#undef strcpy()
#define strcpy(a, b) assert (0)

to prevent the use of strcpy() and encourage the use of my function
instead...

--
-ed- em**********@noos.fr [remove YOURBRA before answering me]
The C-language FAQ: http://www.eskimo.com/~scs/C-faq/top.html
C-reference: http://www.dinkumware.com/manuals/reader.aspx?lib=cpp
FAQ de f.c.l.c : http://www.isty-info.uvsq.fr/~rumeau/fclc/
Nov 14 '05 #9
Emmanuel Delahaye <em**********@noos.fr> wrote in message news:<Xn*************************@212.27.42.66>...
In 'comp.lang.c', "jacob navia" <ja***@jacob.remcomp.fr> wrote:
Most C programmers do not do it because it is incredibly
tedious:

if (strlen(src) < sizeof(dst))
    strcpy(dst, src);

You see a lot of code like that?
I have this in my personal standard C library in my "STR" module :


<snip>
> On my "str.h", I could add if I dared:
>
> #undef strcpy()

#undef strcpy

> #define strcpy(a, b) assert (0)


I tend to do...

assert(!"strcpy() disabled: use foobar()");

...because it makes the assert print something more useful than, say...

Assertion failed at blah.c line 6: 0

--
Peter
Nov 14 '05 #10
