Safe subset of C? - Page 2

Robert Vazan

I am looking for other people's attempts to create a safe subset of C and
enforce it with scripts. Does anybody know of anything like this?

By "safe", I mean the following:
* Strongly typed memory. No way to reinterpret it as a bunch of bytes
* Recovery from invalid and NULL pointers by some means other than a crash
* The possibility of isolating a piece of code by not giving it key pointers

A library used to support such a safe subset must not introduce its own flaws.
For example, it is not a good idea to use int proxies for pointers like
the Unix API does, because this allows pointer guessing and consequently
prevents isolation.

Nov 13 '05


Robert Vazan

On Sat, 22 Nov 2003 12:45:39 -0600, James Hu wrote:

On 2003-11-22, Robert Vazan <ro*********@privateweb.sk> wrote:
Add arrays,

Why?

Bounds of simple C arrays can be looked up, but it is computationally
costly. It is better to store the item count next to the array, which
implies a custom array type, so no raw C arrays.

I was questioning myself about disallowing unions. Since acquiring
the value of a union member other than the last one stored into invokes
unspecified behavior, a "good enough" lint should be able to flag this.

That's heuristics. It catches such behavior sometimes, but not always.
Heuristic tools increase in complexity without bounds and they never quite
make it.

If unspecified and undefined behaviors are not allowed, memory
deallocation should be safe.

Deallocation invalidates all variables that pointed into the freed area. I
need a working verifier, not just 1000 pages of rules. Undefined behaviors
that appear during memory deallocation cannot be caught without aiding
the verifier with extra syntax.

I guess one could require an interface for each type to be allocated and
deallocated:

int *malloc_int_array(int number_of_ints);
void free_int_array(int *int_array);

And have the free wrapper do whatever it needed to do to make sure it
was freeing something its corresponding wrapper allocated.

Standard malloc and free can do this already.

Your rules also prohibit interfaces.

How so?

I should have said virtual functions. Virtual functions need to downcast
pointer passed to them. C++ will do it invisibly and safely, but C
requires cast from void pointer to structure pointer.

Nov 13 '05 #11

Simon Biber

"Robert Vazan" <ro*********@pr ivateweb.sk> wrote:

On Sun, 23 Nov 2003 01:21:18 +0000, Gordon Burditt wrote:
I think it is possible to have the compiler compile this to "safe"
assembly language with one of three opcodes: halt EXIT_SUCCESS,
halt EXIT_FAILURE, or branch-to-self.

Sure, but I am uncertain whether your subset is really the only option.

I am reasonably certain that Gordon was joking!

However, it does bear some wisdom -- a completely 'safe subset' is
a pipe dream.

A static compile-time lint checker is quite limited; you can do a lot
more with run-time checking for array bounds, format specifiers,
generally regulating access to memory.

--
Simon.

Nov 13 '05 #12

James Hu

On 2003-11-23, Robert Vazan <ro*********@privateweb.sk> wrote:

On Sat, 22 Nov 2003 12:45:39 -0600, James Hu wrote:
On 2003-11-22, Robert Vazan <ro*********@privateweb.sk> wrote:
Add arrays,

Why?

Bounds of simple C arrays can be looked up, but it is computationally
costly. It is better to store the item count next to the array, which
implies a custom array type, so no raw C arrays.
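
A minimal sketch of the custom array type described above. The int_array struct and the function names are hypothetical (nothing in the thread names them). The count is stored next to the data, so every access can be checked in constant time:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counted array of int: the element count travels with the data. */
typedef struct {
    size_t count;
    int *items;
} int_array;

/* Allocates a zero-filled array. Returns 0 on success, -1 on failure. */
int int_array_init(int_array *a, size_t count)
{
    if (a == NULL)
        return -1;
    a->items = NULL;
    a->count = 0;
    if (count > 0) {
        a->items = calloc(count, sizeof *a->items);
        if (a->items == NULL)
            return -1;
    }
    a->count = count;
    return 0;
}

/* Bounds-checked read: stores the element in *out, or returns -1. */
int int_array_get(const int_array *a, size_t index, int *out)
{
    if (a == NULL || out == NULL || index >= a->count)
        return -1;
    assert(a->items != NULL);  /* invariant: count > 0 implies storage exists */
    *out = a->items[index];
    return 0;
}

/* Bounds-checked write. Returns 0 on success, -1 if index is out of range. */
int int_array_set(int_array *a, size_t index, int value)
{
    if (a == NULL || index >= a->count)
        return -1;
    assert(a->items != NULL);
    a->items[index] = value;
    return 0;
}

/* Releases the storage and leaves the array empty, so later accesses fail cleanly. */
void int_array_free(int_array *a)
{
    if (a == NULL)
        return;
    free(a->items);
    a->items = NULL;
    a->count = 0;
}
```

An out-of-range access returns -1 instead of touching memory, so the caller decides how to recover.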

You want a safe C subset with built-in runtime protection? Just use
a safer language.

In C, I would say your best option is to use tests to achieve code
coverage and boundary conditions on code that is instrumented
specifically to catch such errors, and this instrumentation should be
compile time removable once verification is complete.

Some of what you want to do can be achieved through static analysis,
but it requires extra hints provided in the form of stylized comments
that the preprocessor can understand.

I was questioning myself about disallowing unions. Since acquiring
the value of a union member other than the last one stored into
invokes unspecified behavior, a "good enough" lint should be able to
flag this.

That's heuristics. It catches such behavior sometimes, but not always.
Heuristic tools increase in complexity without bounds and they never
quite make it.

Of course they are complex. But writing provably correct code can also
increase in complexity without bounds (the complexity of writing the
code increases with the complexity of the software specification), and
some would argue they never quite make it either.

If unspecified and undefined behaviors are not allowed, memory
deallocation should be safe.

Deallocation invalidates all variables that pointed into the freed area. I
need a working verifier, not just 1000 pages of rules. Undefined
behaviors that appear during memory deallocation cannot be caught
without aiding the verifier with extra syntax.

A runtime diagnostic tool, such as purify, can verify the correctness
of your program with proper test coverage.

I guess one could require an interface for each type to be allocated and
deallocated:

int *malloc_int_array(int number_of_ints);
void free_int_array(int *int_array);

And have the free wrapper do whatever it needed to do to make sure it
was freeing something its corresponding wrapper allocated.

Standard malloc and free can do this already.

My suggestion prevents allocating a structure and assigning it to some
other pointer type.
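
One way the wrappers above could be sketched, assuming a hidden header with a tag placed in front of the data. The header layout, the INT_ARRAY_MAGIC value, the int_array_length helper, and the int return type of free_int_array (the prototype above returns void) are all assumptions made for illustration. The tag check is best effort only: handing it a pointer that never came from malloc_int_array still reads memory in front of that pointer, which is undefined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define INT_ARRAY_MAGIC 0x494E5441UL  /* arbitrary tag value, hypothetical */

typedef struct {
    unsigned long magic;  /* set by malloc_int_array, cleared by free_int_array */
    size_t count;
    int items[];          /* C99 flexible array member */
} int_array_block;

int *malloc_int_array(int number_of_ints)
{
    if (number_of_ints < 0)
        return NULL;
    size_t n = (size_t)number_of_ints;
    if (n > (SIZE_MAX - sizeof(int_array_block)) / sizeof(int))
        return NULL;  /* the size computation would overflow */

    int_array_block *block = malloc(sizeof *block + n * sizeof(int));
    if (block == NULL)
        return NULL;
    block->magic = INT_ARRAY_MAGIC;
    block->count = n;
    return block->items;
}

/* Element count of an array that came from malloc_int_array. */
size_t int_array_length(const int *int_array)
{
    const int_array_block *block = (const int_array_block *)
        ((const char *)int_array - offsetof(int_array_block, items));
    assert(block->magic == INT_ARRAY_MAGIC);  /* caller bug otherwise */
    return block->count;
}

/* Returns 0 on success, -1 if the tag does not match. */
int free_int_array(int *int_array)
{
    if (int_array == NULL)
        return 0;  /* mirrors free(NULL): nothing to do */
    int_array_block *block = (int_array_block *)
        ((char *)int_array - offsetof(int_array_block, items));
    if (block->magic != INT_ARRAY_MAGIC)
        return -1;
    block->magic = 0;  /* best-effort guard against a second free */
    free(block);
    return 0;
}
```

Because the only way to get an int_array_block is through malloc_int_array, the result can only be assigned to an int pointer without a cast, which is the property the suggestion is after.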

Your rules also prohibit interfaces.

How so?

I should have said virtual functions. Virtual functions need to
downcast pointer passed to them. C++ will do it invisibly and safely,
but C requires cast from void pointer to structure pointer.

Downcasting can be performed safely with the proper instrumentation.
The object that is the context of the interface should be opaque,
and the function that creates such objects can set a special
field with a signature value that the other routines can check
against before attempting the downcast.
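
A rough sketch of the signature check described above. The object_header struct, the circle and square types, and the tag values are hypothetical. It assumes every object passed through the interface begins with the same header, so reading the signature is well defined whichever type the pointer really refers to:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag values; any distinct constants would do. */
#define CIRCLE_SIGNATURE 0x43495243UL
#define SQUARE_SIGNATURE 0x53515552UL

/* Assumption: every object handed to the interface starts with this header. */
typedef struct {
    unsigned long signature;
} object_header;

typedef struct {
    object_header header;
    double radius;
} circle;

typedef struct {
    object_header header;
    double side;
} square;

/* Constructors are the only place a signature is written. */
void circle_init(circle *c, double radius)
{
    c->header.signature = CIRCLE_SIGNATURE;
    c->radius = radius;
}

void square_init(square *s, double side)
{
    s->header.signature = SQUARE_SIGNATURE;
    s->side = side;
}

/* Checked downcast: returns NULL unless context really is a circle. */
circle *circle_from_context(void *context)
{
    object_header *header = context;  /* void * converts without a cast */
    if (header == NULL || header->signature != CIRCLE_SIGNATURE)
        return NULL;
    return context;
}

/* Interface function taking an untyped context, as a virtual call would. */
int circle_area(void *context, double *out)
{
    assert(out != NULL);  /* caller bug, not a bad-object error */
    circle *c = circle_from_context(context);
    if (c == NULL)
        return -1;
    *out = 3.141592653589793 * c->radius * c->radius;
    return 0;
}
```

In a real library the struct definitions would stay in the implementation file, so callers only ever hold pointers and cannot forge a signature by accident.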

If my rules are relaxed to remove the union restriction (but still
prohibit unspecified and undefined behavior), as I had suggested
earlier, then the downcasting can be safely achieved via accessing union
members at the cost of explicitly enumerating the types that are safe to
downcast to.
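
The union variant could look roughly like this. The shape type and its members are hypothetical. A kind field records which member was stored last, and the code only ever reads that member, so no unspecified behavior is involved:

```c
#include <assert.h>
#include <stddef.h>

/* The explicit list of types it is safe to downcast to. */
typedef enum { SHAPE_CIRCLE, SHAPE_SQUARE } shape_kind;

typedef struct {
    shape_kind kind;  /* records which union member was stored last */
    union {
        struct { double radius; } circle;
        struct { double side; } square;
    } as;
} shape;

shape shape_make_circle(double radius)
{
    shape s;
    s.kind = SHAPE_CIRCLE;
    s.as.circle.radius = radius;
    return s;
}

shape shape_make_square(double side)
{
    shape s;
    s.kind = SHAPE_SQUARE;
    s.as.square.side = side;
    return s;
}

/* Reads only the member named by kind. Returns 0 on success, -1 otherwise. */
int shape_area(const shape *s, double *out)
{
    assert(out != NULL);  /* caller bug, not a bad-object error */
    if (s == NULL)
        return -1;
    switch (s->kind) {
    case SHAPE_CIRCLE:
        *out = 3.141592653589793 * s->as.circle.radius * s->as.circle.radius;
        return 0;
    case SHAPE_SQUARE:
        *out = s->as.square.side * s->as.square.side;
        return 0;
    }
    return -1;  /* kind holds a value outside the enumeration */
}
```

The cost mentioned above shows up in shape_kind: adding a new type means touching the enum, the union, and every switch.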

-- James

Nov 13 '05 #13

Robert Vazan

On Mon, 24 Nov 2003 01:37:17 +1100, Simon Biber wrote:

I am reasonably certain that Gordon was joking!

I understood that too. Jokes are often used to make claims that nobody can
argue with (it was a joke, so what), but that still make it into people's
minds. I wanted to show that I don't share his pessimistic view.

However, it does bear some wisdom -- a completely 'safe subset' is
a pipe dream.

What, the Java sandbox doesn't work? I must disable it in my browser...
Processes don't work? Poor ISPs granting shell access to customers. I know
that both Java and Unix have security holes, but the concept is good.

A static compile-time lint checker is quite limited; you can do a lot
more with run-time checking for array bounds, format specifiers,
generally regulating access to memory.

A supporting library can do the run-time checking instead of the language. A
verifier can then enforce use of the library. The art is to design it so that
the result still looks like C.

Nov 13 '05 #14

Robert Vazan

On Sun, 23 Nov 2003 10:41:40 -0600, James Hu wrote:

You want a safe C subset with built-in runtime protection? Just use
a safer language.

C has certain advantages like tool support, simplicity, and a large share of
smart programmers. The only debugger for Java is in Microsoft's J++, AFAIK.

Some of what you want to do can be achieved through static analysis,
but it requires extra hints provided in the form of stylized comments
that the preprocessor can understand.

Stylized comments are acceptable.

But writing provably correct code can also
increase in complexity without bounds (the complexity of writing the
code increases with the complexity of the software specification), and
some would argue they never quite make it either.

Proof for certain aspects can be easy to inline into code and easy to
verify. Complexity grows only if you try to prove everything. Allowing
a small library to protect itself is just about enough for me.

Nov 13 '05 #15

Sheldon Simms

On Sun, 23 Nov 2003 18:40:21 +0100, Robert Vazan wrote:

On Mon, 24 Nov 2003 01:37:17 +1100, Simon Biber wrote:
A static compile-time lint checker is quite limited; you can do a lot
more with run-time checking for array bounds, format specifiers,
generally regulating access to memory.

A supporting library can do the run-time checking instead of the language. A
verifier can then enforce use of the library. The art is to design it so that
the result still looks like C.

Why? If you don't like how C works, why not just use a different language?

Nov 13 '05 #16

Richard Heathfield

Robert Vazan wrote:

<snip>

[...] C requires cast from void pointer to structure pointer.

No, it doesn't.

#include <time.h>
void foo(void *p)
{
    struct tm *ptm = p; /* no cast required */
}

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #17

James Hu

On 2003-11-23, Robert Vazan <ro*********@privateweb.sk> wrote:

On Sun, 23 Nov 2003 10:41:40 -0600, James Hu wrote:
Some of what you want to do can be achieved through static analysis,
but it requires extra hints provided in the form of stylized comments
that the preprocessor can understand.

Stylized comments are acceptable.

http://www.google.com/search?q=splint&btnI=I'm+Feeling+Lucky

But writing provably correct code can also increase in complexity
without bounds (the complexity of writing the code increases with the
complexity of the software specification), and some would argue they
never quite make it either.

Proof for certain aspects can be easy to inline into code and
easy to verify.

That is a rather simplistic view, and it is a naive application that
leaves such verification code enabled all the time (e.g., verifying
that qsort really sorted the array after each invocation).

Complexity grows only if you try to prove everything.
Allowing a small library to protect itself is just about
enough for me.

Complexity grows whenever the system you are verifying becomes more
complex. Suppose you are just verifying a small library. Whenever
you add a new interface, you have increased the complexity and the
proof burden. This is true both of interfaces you expose to clients
of the library and of interfaces to other sub-systems that
the small library depends upon.

Program correctness is getting to be off-topic for this newsgroup.
If you want to pursue the issue further, I would suggest following
up in comp.software-eng.

Anyway, most C programmers will use assert() (or implement their own
variation of it) to verify assumptions.
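
For example, a small function might state its assumptions with the standard assert() macro like this (max_item is a made-up function for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Returns the largest element of a non-empty array. */
int max_item(const int *items, size_t count)
{
    assert(items != NULL);  /* assumption: the caller passes a real array */
    assert(count > 0);      /* assumption: the array is not empty */

    int best = items[0];
    for (size_t i = 1; i < count; i++) {
        if (items[i] > best)
            best = items[i];
    }
    return best;
}
```

Defining NDEBUG before including <assert.h> (for instance with -DNDEBUG on many compilers) turns each assert() into a no-op, so the checks can be compiled out once verification is complete.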

-- James

Nov 13 '05 #18

Sheldon Simms

On Sun, 23 Nov 2003 18:55:58 +0100, Robert Vazan wrote:

On Sun, 23 Nov 2003 10:41:40 -0600, James Hu wrote:
You want a safe C subset with built-in runtime protection? Just use
a safer language.

C has certain advantages like tool support, simplicity, and a large share of
smart programmers.

I guess you figure that all those "smart programmers" are incapable of
using any other language. I can't speak for anyone else, but I wouldn't be
interested in working in crippled C. However, I have no problem learning a
new language if that's what the project requires.

The only debugger for Java is in Microsoft's J++, AFAIK.

There are many debuggers for Java, usually integrated in one of the very
many IDEs for Java. There is also a command line debugger that comes with
the standard Java distribution.

Nov 13 '05 #19

Simon Biber

"Robert Vazan" <ro*********@pr ivateweb.sk> wrote in message
news:pa******** *************** *****@privatewe b.sk...

On Mon, 24 Nov 2003 01:37:17 +1100, Simon Biber wrote:
I am reasonably certain that Gordon was joking!

I understood that too. Jokes are often used to make claims that nobody can
argue with (it was a joke, so what), but that still make it into people's
minds. I wanted to show that I don't share his pessimistic view.

However, it does bear some wisdom -- a completely 'safe subset' is
a pipe dream.

What, the Java sandbox doesn't work? I must disable it in my browser...

It has the potential for misuse, such as spamming lots of windows or
unkillable dialog boxes... see even this JavaScript (yes I know it's
not Java, but it's still an example of a sandboxed language):

while(1) alert("Please Click OK");

which on many (older) browsers required a forced kill of the program.

Processes don't work? Poor ISPs granting shell access to customers. I know
that both Java and Unix have security holes, but the concept is good.

Fewer and fewer ISPs do grant shell access in my experience. The costs
associated with system admin and general policing of customers are high.

A static compile-time lint checker is quite limited; you can do a lot
more with run-time checking for array bounds, format specifiers,
generally regulating access to memory.

A supporting library can do the run-time checking instead of the language. A
verifier can then enforce use of the library. The art is to design it so that
the result still looks like C.

So you need to regulate array access; how? Your supporting library must
hook into every single array access:

int int_item(const int *array, size_t index);
long long_item(const long *array, size_t index);
short short_item(const short *array, size_t index);
double double_item(const double *array, size_t index);
float float_item(const float *array, size_t index);
char char_item(const char *array, size_t index);
etc.

Then you must redefine every single library function so it accesses arrays in
terms of these accessor functions?!
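
The repetition in the prototypes above could be reduced with a macro. This is only a sketch: DEFINE_ITEM_ACCESSOR is a hypothetical name, and the generated accessors take an extra count parameter and an out pointer, neither of which is in the prototypes above, because a bare pointer carries no bounds to check against.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Generates a bounds-checked accessor:
 *   int NAME(const TYPE *array, size_t count, size_t index, TYPE *out);
 * It returns 0 and stores the element in *out, or returns -1 on a bad access.
 */
#define DEFINE_ITEM_ACCESSOR(TYPE, NAME)                        \
    int NAME(const TYPE *array, size_t count, size_t index,     \
             TYPE *out)                                         \
    {                                                           \
        assert(out != NULL); /* caller bug, not a data error */ \
        if (array == NULL || index >= count)                    \
            return -1;                                          \
        *out = array[index];                                    \
        return 0;                                               \
    }

DEFINE_ITEM_ACCESSOR(int, int_item)
DEFINE_ITEM_ACCESSOR(long, long_item)
DEFINE_ITEM_ACCESSOR(double, double_item)
```

This cuts down the typing, but it does not answer the second objection: every library function that walks an array would still have to be rewritten to go through the accessors.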

--
Simon.

Nov 13 '05 #20
