473,405 Members | 2,354 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,405 software developers and data experts.

C++ Compiler with a -Wwarn-use-of-strcpy or similar option??

I need to automatically search and replace all fixed size
buffer strcpy's with strncpy's (or better yet, strlcpy's)
as a security and stability audit. The code base is large
and it is not feasable to manually perform these changes.

I would like perhaps a C++ parser that can automatically
detect use of a strcpy to a buffer of fixed size. For instance,

struct x {
char member[128];
}
...
struct x X;
...
strcpy (X.member, p); /* <-- should generate a warning here */

but

struct x {
char *member;
}
...
struct x X;
...
strcpy (X.member, p); /* <-- should NOT generate a warning */

(The second case is too complex to fix at this point.)

Is there any way of doing this? Our code is C++ (not C) and I
have, for example, looked at

http://codeworker.free.fr/ScriptsRepository.html

but this does not seem to provide an easy solution.

I am anticipating writing a script that can search and replace
"strcpy (x.member, p);" with "strlcpy (x.member, p, sizeof(x.member));"
provided the script can be guaranteed that the replacement is valid
(and I suppose only a full C++ parser would know if it is valid).

Can GCC be modified to give such a warning?

thanks

-paul
Jul 22 '05 #1
7 3306
On Fri, 3 Sep 2004 21:55:54 UTC, ps****@icon.co.za (Paul Sheer) wrote:
I need to automatically search and replace all fixed size
buffer strcpy's with strncpy's (or better yet, strlcpy's)
as a security and stability audit. The code base is large
and it is not feasable to manually perform these changes.

I would like perhaps a C++ parser that can automatically
detect use of a strcpy to a buffer of fixed size. For instance,

struct x {
char member[128];
}
...
struct x X;
...
strcpy (X.member, p); /* <-- should generate a warning here */

but

struct x {
char *member;
}
...
struct x X;
...
strcpy (X.member, p); /* <-- should NOT generate a warning */

(The second case is too complex to fix at this point.)

Is there any way of doing this? Our code is C++ (not C) and I
have, for example, looked at

http://codeworker.free.fr/ScriptsRepository.html

but this does not seem to provide an easy solution.

I am anticipating writing a script that can search and replace
"strcpy (x.member, p);" with "strlcpy (x.member, p, sizeof(x.member));"
provided the script can be guaranteed that the replacement is valid
(and I suppose only a full C++ parser would know if it is valid).

Can GCC be modified to give such a warning?

thanks

-paul


Paul,

I've worked for several companies that have generated their own
macros to watch the string and memory routines and inform at
compiler time if there was going to be a problem. There are
many times though when the code won't give you clues and even at
runtime you can't catch everything. However, it is generally
useful to use such macros occasionally, or as a rule, to test that
your code adheres to some reasonable practice. I've also seen
similar macros used to track memory leaks and similar offenses
that can sometime be checked this way. There are some public
domain versions of these as well. I don't have any to offer,
but I'll describe the general idea for you.

Say that you want to insure that strcpy tries to protect itself
against copying a large space into a smaller space. NULLs could
be detected at compile time. Certain run time conditinos could
also be checked for... such as NULL values.

The general method is to declare macros of the same name as
the function to check. The macro does compile time checking
and generates a call to an equivalent function that you may
fee is more suitable to your needs. A compile time switch is
sometimes used to differentiate when the real or enhanced code
is called. The macros must be defined before they are used
and the original functions reliably removed from view. #ifdefs
and such are generally used.

Consider the strcpy function. It is defined as

char *strcpy (char *destination, const char * const source);

or equivalent. The replacement function could be something
like the following.

char *StrCpy (char *destination, const char * const source);

In this case StrCpy might handle improper parameters a bit
better or provide some form of useful side effect such as
writing to a log file. For memory leak tracers, this is where
you keep a table of allocated memory and watch it get freed.
Upon a final, special call, your code would list memory not
freed and perhaps the module, line, and function that allocated
it.

The macro for strcpy would include comparing the pointers
to the size of a pointer on your system and if they aren't
that length, then they point to a measurable space defined
by sizeof(). You can then write an error/warning at compile
time that the sizeof a source is larger than the sizeof a
destination. There are likely many instances in your code
where it isn't possible to tell though.

As a general rule you should compile with the maximum warning
level and treat all warnings as errors. If the code is correct,
fix it so as not to issue the warning. Otherwise fix the code.
For groups that don't already do this, slowly increase the error
level and correct errors as you go. For very dirty code this may
take a long time. However, if you've seen one mistake, like
using an assignment in a condition, chances are it was also done
elsewhere. Look for all of that specific warning and fix the
code or remove the warning.

David
Jul 22 '05 #2
ps****@icon.co.za (Paul Sheer) wrote in news:9ec0d967.0409031355.1f5bb1
@posting.google.com:
I need to automatically search and replace all fixed size
buffer strcpy's with strncpy's (or better yet, strlcpy's)
as a security and stability audit. The code base is large
and it is not feasable to manually perform these changes.


Err... why not grep for them? (OK, granted I'm assuming you're on a
unixish platform....)
Jul 22 '05 #3
ps****@icon.co.za (Paul Sheer) wrote in message news:<9e************************@posting.google.co m>...
I need to automatically search and replace all fixed size
buffer strcpy's with strncpy's (or better yet, strlcpy's)
as a security and stability audit. The code base is large
and it is not feasable to manually perform these changes.

I would like perhaps a C++ parser that can automatically
detect use of a strcpy to a buffer of fixed size. For instance,


Paul,

All that seems to be a bit of overkill if you source code base is
large. Do you also plan on replacing the entire str* and mem* family?

There a many tools on the market to help diagnose buffer overruns.
Take a look at mpatrol and Valgrind for example for Unix platforms.
Boundscheck and Purify work on Windows. My company markets Dynamic
Leak Check (DLC) for this purpose. The DLC checks for heap overruns,
leaks and many other errors. The big advantage of the DLC is that no
recompile is required.

To answer your question directly, I don't know of any tools that do
what you are looking for. I am a bit of an expert in the area. You
may check for tools that do 'static analysis'.

If you are motivated, it should be easy to modify gcc to give a
warning if strcpy is called. You could also use the LD_PRELOAD on many
platform to overload strcpy and spit out a runtime error.

Matthew
Dynamic Memory Solutions
www.dynamic-memory.com
Jul 22 '05 #4
[[I'll just reply to all three responses together...]]
From: Matthew Fisher (ma*****@dynamic-memory.com)
I need to automatically search and replace all fixed size [...]
All that seems to be a bit of overkill if you source code base is
large. Do you also plan on replacing the entire str* and mem* family?


we are going to start with strcpy which is very widely
used in our source code.

i personally believe it is currently the biggest problem with
this code base.

There a many tools on the market to help diagnose buffer overruns.
i do use valgrind on Linux.
Take a look at mpatrol and Valgrind for example for Unix platforms.
Boundscheck and Purify work on Windows. My company markets Dynamic
Leak Check (DLC) for this purpose. The DLC checks for heap overruns,
leaks and many other errors. The big advantage of the DLC is that no
recompile is required.
yes, but what if no overruns are detected with the test
data that you run the program with?

i really need to solve potential problems for the future
stability of the code.

To answer your question directly, I don't know of any tools that do
what you are looking for. I am a bit of an expert in the area. You
may check for tools that do 'static analysis'.

If you are motivated, it should be easy to modify gcc to give a
warning if strcpy is called. You could also use the LD_PRELOAD on many
platform to overload strcpy and spit out a runtime error.
i started looked at gcc. i got expand_builtin_strcpy to give the
warning i want. however, it is taking much time to understand how
gcc's parse tree is organized and i can't seem to get it working
for the general case.

any help would be appreciated.

------------------- From: David (Fl************@United.Com)
I need to automatically search and replace all fixed size
I've worked for several companies that have generated their own
macros to watch the string and memory [...]


yes, me too...

off the top of my head, something like,

#define strcpy(a,b) \
do { \
char *dummy; \
fprintf (stderr, "%s:%d: Warning, unbounded strcpy(%s,%s)\n", \
__FILE__, __LINE__, #a, #b); \
if (sizeof(dummy) == sizeof(a)) real_strcpy (a, b); \
else strlcpy (a, b, sizeof(a)); \
} while (0)

is only a partial solution. it can still break many things.

------------------- From: Andre Kostur (nn******@kostur.net)
I need to automatically search and replace all fixed size


Err... why not grep for them? (OK, granted I'm assuming you're on a
unixish platform....)


a blind search and replace across a gig of code WILL get me fired.

i have to GUARANTEE that my changes will not break anything.

-paul
Jul 22 '05 #5
ok, managed to patch gcc to do this

diff follows...

-paul

---------------

diff -u -r gcc-ORIG/builtins.c gcc/builtins.c
--- gcc-ORIG/builtins.c 2003-05-05 18:59:13.000000000 +0200
+++ gcc/builtins.c 2004-09-05 16:00:47.000000000 +0200
@@ -74,7 +74,6 @@
tree built_in_decls[(int) END_BUILTINS];

static int get_pointer_alignment PARAMS ((tree, unsigned int));
-static tree c_strlen PARAMS ((tree));
static const char *c_getstr PARAMS ((tree));
static rtx c_readstr PARAMS ((const char *,
enum machine_mode));
@@ -232,7 +231,7 @@
Unfortunately, string_constant can't access the values of const
char
arrays with initializers, so neither can we do so here. */

-static tree
+tree
c_strlen (src)
tree src;
{
diff -u -r gcc-ORIG/c-tree.h gcc/c-tree.h
--- gcc-ORIG/c-tree.h 2002-09-16 20:33:18.000000000 +0200
+++ gcc/c-tree.h 2004-09-05 16:06:49.000000000 +0200
@@ -323,4 +323,7 @@
extern GTY(()) tree static_ctors;
extern GTY(()) tree static_dtors;

+/* in builtins.c */
+extern tree c_strlen PARAMS ((tree));
+
#endif /* ! GCC_C_TREE_H */
diff -u -r gcc-ORIG/c-typeck.c gcc/c-typeck.c
--- gcc-ORIG/c-typeck.c 2003-04-18 08:50:45.000000000 +0200
+++ gcc/c-typeck.c 2004-09-05 16:19:36.000000000 +0200
@@ -57,6 +57,7 @@
static tree decl_constant_value_for_broken_optimization PARAMS
((tree));
static tree default_function_array_conversion PARAMS ((tree));
static tree lookup_field PARAMS ((tree, tree));
+static void warn_bad_usage PARAMS ((tree, tree, tree, tree));
static tree convert_arguments PARAMS ((tree, tree, tree, tree));
static tree pointer_diff PARAMS ((tree, tree));
static tree unary_complex_lvalue PARAMS ((enum tree_code, tree,
int));
@@ -1525,6 +1526,10 @@
/* fntype now gets the type of function pointed to. */
fntype = TREE_TYPE (fntype);

+ /* Warn about use of strcpy to a buffer of fixed sized and
+ other common programming errors. */
+ warn_bad_usage (TYPE_ARG_TYPES (fntype), params, name, fundecl);
+
/* Convert the parameters to the types declared in the
function prototype, or apply default promotions. */

@@ -1559,6 +1564,61 @@
return require_complete_type (result);
}

+/* Check the function call and arguments against certain good
+ programming ethics. TYPELIST, VALUES, NAME, FUNDECL have the
+ same sense as in convert_arguments below */
+
+static void
+warn_bad_usage (typelist, values, name, fundecl)
+ tree typelist, values, name, fundecl;
+{
+ tree typetail, valtail;
+ tree parm[3];
+ int parmnum;
+
+ for (valtail = values, typetail = typelist, parmnum = 0;
+ valtail; valtail = TREE_CHAIN (valtail), parmnum++)
+ {
+ tree type = typetail ? TREE_VALUE (typetail) : 0;
+ tree val = TREE_VALUE (valtail);
+ if (type == void_type_node)
+ return;
+
+ /* See convert_arguments below */
+ if (TREE_CODE (val) == NON_LVALUE_EXPR)
+ val = TREE_OPERAND (val, 0);
+ parm[parmnum] = val;
+ }
+
+ switch (DECL_FUNCTION_CODE (fundecl))
+ {
+ case BUILT_IN_STRCPY:
+ if (ARRAY_TYPE == TREE_CODE (TREE_TYPE (parm[0])))
+ {
+ tree dlen = c_sizeof (TREE_TYPE (parm[0]));
+ tree slen = c_strlen (parm[1]);
+ if (slen == 0)
+ {
+ warning
+ ("strcpy to an array of fixed size, use strncpy instead, or better
yet, use strlcpy");
+ }
+ else
+ {
+ slen = size_binop (PLUS_EXPR, slen, ssize_int (1));
+ if (tree_int_cst_lt (dlen, slen))
+ {
+ error
+ ("trying to strcpy a constant string to a fixed size array of
insufficient size",
+ IDENTIFIER_POINTER (name));
+ }
+ }
+ }
+ break;
+ default:
+ return;
+ }
+}
+
/* Convert the argument expressions in the list VALUES
to the types in the list TYPELIST. The result is a list of
converted
argument expressions.
diff -u -r gcc-ORIG/expr.c gcc/expr.c
--- gcc-ORIG/expr.c 2003-04-23 01:08:15.000000000 +0200
+++ gcc/expr.c 2004-09-05 15:49:44.000000000 +0200
@@ -9471,7 +9471,12 @@
{
STRIP_NOPS (arg);

- if (TREE_CODE (arg) == ADDR_EXPR
+ if (TREE_CODE (arg) == STRING_CST)
+ {
+ *ptr_offset = size_zero_node;
+ return arg;
+ }
+ else if (TREE_CODE (arg) == ADDR_EXPR
&& TREE_CODE (TREE_OPERAND (arg, 0)) == STRING_CST)
{
*ptr_offset = size_zero_node;
diff -u -r gcc-ORIG/tree.h gcc/tree.h
--- gcc-ORIG/tree.h 2003-03-24 19:59:37.000000000 +0200
+++ gcc/tree.h 2004-09-04 20:05:32.000000000 +0200
@@ -155,6 +155,8 @@
unsigned lang_flag_5 : 1;
unsigned lang_flag_6 : 1;
unsigned unused_1 : 1;
+
+ tree param_before_array_conversion;
};

/* The following table lists the uses of each of the above flags and
Jul 22 '05 #6
> I need to automatically search and replace all fixed size
buffer strcpy's with strncpy's (or better yet, strlcpy's)
as a security and stability audit. The code base is large
and it is not feasable to manually perform these changes.

I would like perhaps a C++ parser that can automatically
detect use of a strcpy to a buffer of fixed size. For instance,

struct x {
char member[128];
}
...
struct x X;
...
strcpy (X.member, p); /* <-- should generate a warning here */

but

struct x {
char *member;
}
...
struct x X;
...
strcpy (X.member, p); /* <-- should NOT generate a warning */

(The second case is too complex to fix at this point.)


Well you need no any additional parser or whatever, just C++.
Write simple header file strcpy2.h :

------------------------------------------
class decorated_char_p
{
private: char* p_;
public: decorated_char_p( char* p ) : p_(p) { }
operator char* () { return p_; }
};

void strcpy2( decorated_char_p p1, const char* p2 ) { strcpy(p1,p2); }
template< int N > void strcpy2( char (&arg)[N] ) {
int x[0]; // force syntax error at instantiation
}
#define strcpy strcpy2
------------------------------------------

Then include it to all .cpp files where you have references to strcpy.
You need to include it AFTER inclusion of string.h, just put it after
last #include in the beginning of the file.

You should have syntax error like "cannot allocate an array of
constant size 0"
whenever you tend to use strcpy with an array.

We need class decorated_char_p because if we overload template
function with no conversion of argument, compiler always choses
nontemplate function. Thus we have to introduce additional "barrier"
with implicit construction of decorated_char_p object.

-- Mikhail Kupchik
University of KPI, Kiev, Ukraine
Jul 22 '05 #7
>
Well you need no any additional parser or whatever, just C++.
Write simple header file strcpy2.h :

------------------------------------------
class decorated_char_p
{
private: char* p_;
public: decorated_char_p( char* p ) : p_(p) { }
operator char* () { return p_; }
};

void strcpy2( decorated_char_p p1, const char* p2 ) { strcpy(p1,p2); }
template< int N > void strcpy2( char (&arg)[N] ) {
int x[0]; // force syntax error at instantiation
}
#define strcpy strcpy2
------------------------------------------

Then include it to all .cpp files where you have references to strcpy.
You need to include it AFTER inclusion of string.h, just put it after
last #include in the beginning of the file.

You should have syntax error like "cannot allocate an array of
constant size 0"
whenever you tend to use strcpy with an array.

We need class decorated_char_p because if we overload template
function with no conversion of argument, compiler always choses
nontemplate function. Thus we have to introduce additional "barrier"
with implicit construction of decorated_char_p object.

-- Mikhail Kupchik


I made a minor change, but YES, this actually works.
It's quite brilliant actually. See the first strcpy gives no warning,
the second does. (My updated gcc patch can be gotten from the
gcc-patches mailing list BTW.)

#include <string>

class decorated_char_p
{
private: char* p_;
public: decorated_char_p( char* p ) : p_(p) { }
operator char* () { return p_; }
};

void strcpy2( decorated_char_p p1, const char* p2 ) { strcpy(p1,p2); }

template< int N > void strcpy2( char (&arg)[N], const char* p2 ) {
strcpy_to_fixed_array_warning();
}

#define strcpy strcpy2

char p[256];

void
foo (void)
{
char *q;
q = p;
strcpy (q, "hello");
strcpy (p, "hello");
}

-paul
Jul 22 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
by: TQ Pham | last post by:
Hi, How can you disable finstrument-functions option for included files ? For example: I have A.cxx, which include B.hxx I want to compile A with -finstrument-functions, but it always do the...
4
by: Paul Sheer | last post by:
I need to automatically search and replace all fixed size buffer strcpy's with strncpy's (or better yet, strlcpy's) as a security and stability audit. The code base is large and it is not feasable...
15
by: Ramaraj M Bijur | last post by:
Hi All, Could anyone help me to resolve following problem in C the IDE used is Microsoft VC++, Please let me know the which option in VC++ will do the needful The problem statement:...
5
by: saravanan.linux | last post by:
Hi, Is there any way to find out, whether the executable is compiled with -g option enabled? ThankX in advace.
0
by: tombeaux | last post by:
is there a compiler or linker option to increase the stack size in C#? need to step through code in Debug mode so editing the compiled binary with EDITBIN isn't viable.
8
by: Charles Sullivan | last post by:
I have a program written in C under Linux (gcc) which a user has ported to run under AT&T SysV R4. He sent me a copy of his makelog which displays a large number of compiler warnings similar to...
19
by: David W | last post by:
float nanometers(long pm) { return pm / 1000.0f; } void f() { if(nanometers(309311L) == nanometers(309311L)) { // do something }
5
by: Gregor Kofler | last post by:
Refactoring my old DualSelectBox , I thought it would be much smarter to get rid of the new Option() stuff, and instead just switch childNodes between the two select elements. Something like that:...
2
by: =?Utf-8?B?TUNN?= | last post by:
I am using the compiler: Microsoft.VisualBasic.VBCodeProvider What are the different options I can specify using <providerOption>? So far I have: <system.codedom> <compilers> <compiler...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.