Pointer to "base" type - what does the Standard say about this? - Page 2

Stephan Beal

Hi, all!

Before i ask my question, i want to clarify that my question is not
about the code i will show, but about what the C Standard says should
happen.

A week or so ago it occurred to me that one can implement a very basic
form of subclassing in C (the gurus certainly already know this, but
it was news to me). What i've done (shown below) seems to work all
fine and well, and does exactly what i'd expect, but i'm asking about
it because when i switched to a higher optimization level on gcc i
started getting warnings about type-punned pointers violating "strict
mode." That got me wondering, does it mean "strict C mode" or "strict
GCC mode"? i don't care much about the latter, as long as i comply
with the former. To be clear (again), my question is not GCC-specific.
My question is whether or not the approach i've taken here is legal
according to The Standard. i only ask because GCC suggests (at some
optimization levels, anyway) that i might be violating some C rule
without knowing i'm doing so.

The code (sorry for the length - it's about as short as i can make
this example in C while still keeping it readable):

// ------------------- begin code
#include <stdio.h>
#include <stdlib.h>

struct base_type; // unfortunate fwd decl
// Public API for base_type objects:
struct base_public_api
{
    void (*func1)( struct base_type const * self );
    long (*func2)( struct base_type const * self, int );
};
typedef struct base_public_api base_public_api;

// Base-most type of the abstract interface
struct base_type
{
    base_public_api api;
};
typedef struct base_type base_type;

// Implementation of base_type abstract interface
struct sub_type
{
    base_public_api api;
    int member1;
};
typedef struct sub_type sub_type;

#define MARKER if(1) printf("MARKER: %s:%d:%s():\n",__FILE__,__LINE__,__func__); if(1) printf

#define SUBP ((sub_type const *)self)
void impl_f1( base_type const * self )
{
    MARKER("SUBP->member1=%d\n", SUBP->member1);
}
long impl_f2( base_type const * self, int x )
{
    return SUBP->member1 * x;
}

// Now here's the part which is dubious: note the concrete types here:
static const sub_type sub_type_inst = { {impl_f1,impl_f2}, 42 };
static base_type const * sub_inst = (base_type const*) &sub_type_inst;
// ^^^^ "warning: dereferencing type-punned pointer will break strict-aliasing rules"

int main( int argc, char const ** argv )
{

    sub_inst->api.func1(sub_inst);
    MARKER("func2() ==%ld\n", sub_inst->api.func2(sub_inst, 2) );
    return 0;
}
// ------------------- end code

On my box that looks like:
stephan@jareth: ~/tmp$ ls -la inher.c
-rw-r--r-- 1 stephan stephan 1184 2008-11-05 14:43 inher.c
stephan@jareth: ~/tmp$ make inher
cc inher.c -o inher
stephan@jareth: ~/tmp$ ./inher
MARKER: inher.c:34:impl_f1():
SUBP->member1=42
MARKER: inher.c:48:main():
func2()==84
Am i headed down a Dark Path with this approach? Or is there a better/
more acceptable approach to simulating single inheritance in C? (i'm
not averse to changing the model, but i really do need some form of
separate interface/implementation for what i'm doing.)

Many thanks in advance for your insights.

PS (not relevant to the question, really): what's the point of all
that? i'm working on a library where i really need abstract base
interfaces (with only one level of inheritance necessary), and this
approach seems to be fairly clear (though a tad bit verbose at times).
i've used it to implement subclasses of an abstract stream interface,
for example, so my library can treat FILE handles and in-memory
buffers (or client-supplied stream types, with an appropriate wrapper)
with the same read/write API.

PS2: my apologies for the dupe post on comp.lang.c.moderated - i
inadvertently posted to that group.

Nov 5 '08


Andrey Tarasevich

Stephan Beal wrote:

> Many thanks in advance for your insights.

The technique you describe has been used in C since forever. The other
posters already gave you a quote from the language specification, which
validates this useful technique, and which was actually included into
the language specification specifically for that purpose.

(The valid struct<->first member conversion actually made it into C++
specification as well).

--
Best regards,
Andrey Tarasevich

Nov 5 '08 #11

Stephan Beal

On Nov 5, 7:57 pm, Hallvard B Furuseth <h.b.furus...@usit.uio.no>
wrote:

> Stephan Beal writes:
>> Doh, i spoke too soon:
>>
>> http://www.cellperformance.com/mike_...standing_stric...
>>
>> Says:
>>
>> "In C99, it is illegal to create an alias of a different type than the
>> original. This is often refered to as the strict aliasing rule."
>
> Not quite. What is illegal, with some exceptions, is to access an

<HUGE snip>

Wow, thanks for that! Now i've got some reading to do :).

The base struct macro you show is basically how i'm initializing my
subclasses, with the exception that i have one extra degree of
indirection - the subtypes hold an object which itself holds the
common API (that approach seems to simplify maintenance of the
subclass implementations in the event of a change in the base API).

Nov 5 '08 #12

Stephen Sprunk

Stephan Beal wrote:

> On Nov 5, 6:36 pm, Jean-Marc Bourguet <j...@bourguet.org> wrote:
>> The problem isn't with a coffee machine -- the compiler for which will
>> probably have a simple optimizer -- but with a complex optimizer which uses
>> the aliasing rules to drive the optimisation. And so the code will
>> work... until the optimizer sees an opportunity to use the fact that two
>> things shouldn't alias.
>
> Optimization is certainly a potential problem. i can conceive that for
> some reason a compiler might optimize or pad these differently:
>
> struct sub1
> {
>     base_api api;
>     int m1;
>     double m2;
> };
>
> struct sub1
> {
>     base_api api;
>     double m1;
>     char const * m2;
> };
>
> but my knowledge of explicit optimizations done by any given compiler
> is pretty minimal. i'm much better versed in C++ than C.

Those structs will likely be padded differently, yes, but you _should_
be able to cast from one to the other as long as you only access the
initial elements that they have in common (in this case, only "api").
Compilers are required to pad consistently enough that, as far into the
struct as the element types remain the same, they will be at the same
offsets; this is deliberate to allow casts to access them.

Think about this example:

struct point {
    int x;
    int y;
};
struct circle {
    int x;
    int y;
    int radius;
};
struct rect {
    int x;
    int y;
    int width;
    int height;
};

Any time you need a struct point, you can safely cast a struct circle
and access x or y. This is very, very bare-bones inheritance and
polymorphism.

>> In the OP case, there is hope. Replacing
>>
>> static base_type const * sub_inst = (base_type const*) &sub_type_inst;
>>
>> by
>>
>> static base_type const * sub_inst = &sub_type_inst.api;
>
> That was my original thought, but the point of passing (base_type
> [const]*) as the "self" argument of base_type_api was to give me a
> level of indirection which i want (for storage of subtype-specific
> data without requiring subclasses to literally redeclare the whole
> public API), and i lose that (and features based off of it) if i pass
> a (base_type_api*) and cast it to a (sub_type*) (which in my eyes is
> just plain wrong, even if it might work in this case).

You could also do the above as:

struct point {
    int x;
    int y;
};
struct circle {
    struct point center;
    int radius;
};

The syntax to access x and y isn't quite as pretty, but the layout in
memory will be the same (a pointer to a struct is guaranteed to be
equivalent to a pointer to its first element) and the compiler should be
quiet if you cast a struct circle to a struct point.

I haven't read all your code due to the length, so I'm not entirely sure
this helps, but I've used the same tricks in OO code of my own.

> gcc doesn't complain until i turn on -O2 or higher or turn on the
> -fstrict-aliasing flag. tcc doesn't complain at all.

If TCC doesn't complain, it probably doesn't have enough optimizing
intelligence to care about aliasing problems. GCC is pretty aggressive
in that area, but there's a huge cost in complexity to detect aliasing
(or lack thereof), which TCC probably can't afford given its name.

S

Nov 5 '08 #13

jameskuyper

Stephen Sprunk wrote:
....

> Those structs will likely be padded differently, yes, but you _should_
> be able to cast from one to the other as long as you only access the
> initial elements that they have in common (in this case, only "api").
> Compilers are required to pad consistently enough that, as far into the
> struct as the element types remain the same, they will be at the same
> offsets; this is deliberate to allow casts to access them.

The relevant section of the standard makes that guarantee only if the
two structs are members of the same union. In practice, it generally
works for a much wider range of cases than the ones guaranteed by the
standard.

Nov 5 '08 #14

Eric Sosman

Stephen Sprunk wrote:

> Stephan Beal wrote:
>>
>> Optimization is certainly a potential problem. i can conceive that for
>> some reason a compiler might optimize or pad these differently:
>>
>> struct sub1
>> {
>>     base_api api;
>>     int m1;
>>     double m2;
>> };
>>
>> struct sub1

'sub2', I think?

>> {
>>     base_api api;
>>     double m1;
>>     char const * m2;
>> };
>>
>> but my knowledge of explicit optimizations done by any given compiler
>> is pretty minimal. i'm much better versed in C++ than C.
>
> Those structs will likely be padded differently, yes, but you _should_
> be able to cast from one to the other as long as you only access the
> initial elements that they have in common (in this case, only "api").
> Compilers are required to pad consistently enough that, as far into the
> struct as the element types remain the same, they will be at the same
> offsets; this is deliberate to allow casts to access them.
> [...]

See James Kuyper's response, but the struct layout is not the
only issue. Even if the layouts are as you want them, the compiler
is allowed to "know" that a `struct sub1*' and a `struct sub2*' do
not point to the same thing (see Hallvard Furuseth's response). An
aggressive optimizer might assume that storing to the `api' element
of something pointed at by a `struct sub1*' does not affect the `api'
element of something pointed at by a `struct sub2*', and if in fact
they point at the same memory odd things could happen.

One clean way to handle this is to pack all the allied types
into a union, but this requires that you know all those types up
front, which can be irksome. Another clean way to handle it is to
use a `base_api*' and point it at the `api' element of whichever
structs you're dealing with. Since the `api' is the first element
in each struct, it is always safe to convert the struct pointer to
a `base_api*' and back.

--
Er*********@sun.com

Nov 5 '08 #15

CBFalconer

Stephan Beal wrote:

> Antoninus Twink <nos...@nospam.invalid> wrote:
>
> .... snip ...
>
>> If you're interested in head-on-a-pin discussions about whether
>> it will work on embedded C for a coffee machine with half the
>> standard library missing, the "regulars" will no doubt be along
>> soon with their usual grandstanding answers.
>
> i'm only interested in conforming to The Standard. My programming
> won't allow me to sleep at night if i knowingly make use of a
> compiler-specific extension (with the exception of a couple very
> common extensions, like free placement of var decls in functions,
> instead of all at the front).

Then you should entirely ignore Twink. He is a troll, and only
interested in disturbing the newsgroup.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

Nov 5 '08 #16

CBFalconer

Stephan Beal wrote:

> Jean-Marc Bourguet <j...@bourguet.org> wrote:
>
>> The problem isn't with a coffee machine -- the compiler for which
>> will probably have a simple optimizer -- but with a complex
>> optimizer which uses the aliasing rules to drive the
>> optimisation. And so the code will work... until the optimizer sees
>> an opportunity to use the fact that two things shouldn't alias.
>
> Optimization is certainly a potential problem. i can conceive that
> for some reason a compiler might optimize or pad these differently:

....

The following references may be helpful. The C99 ones are the
standard, while the n869_txt.bz2 is a bzipped version of n869.txt,
which in turn is the last version available as a text file.

Some useful references about C:
<http://www.ungerhu.com/jxh/clc.welcome.txt>
<http://c-faq.com/> (C-faq)
<http://benpfaff.org/writings/clc/off-topic.html>
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf> (C99)
<http://cbfalconer.home.att.net/download/n869_txt.bz2> (pre-C99)
<http://www.dinkumware.com/c99.aspx> (C-library)
<http://gcc.gnu.org/onlinedocs/> (GNU docs)
<http://clc-wiki.net/wiki/C_community:comp.lang.c:Introduction>

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

Nov 6 '08 #17

Stephan Beal

On Nov 5, 11:07 pm, Eric Sosman <Eric.Sos...@sun.com> wrote:

> See James Kuyper's response, but the struct layout is not the
> only issue. Even if the layouts are as you want them, the compiler
> is allowed to "know" that a `struct sub1*' and a `struct sub2*' do
> not point to the same thing (see Hallvard Furuseth's response). An
> aggressive optimizer might assume that storing to the `api' element
> of something pointed at by a `struct sub1*' does not affect the `api'
> element of something pointed at by a `struct sub2*', and if in fact
> they point at the same memory odd things could happen.

That thought kept me up much of the night :(.

> One clean way to handle this is to pack all the allied types
> into a union, but this requires that you know all those types up
> front, which can be irksome. Another clean way to handle it is to
> use a `base_api*' and point it at the `api' element of whichever
> structs you're dealing with. Since the `api' is the first element
> in each struct, it is always safe to convert the struct pointer to
> a `base_api*' and back.

Coincidentally, that's the approach that is on my list to try out
tonight. It requires the fewest changes and "seems" to be the safest
(aside from the union) so far.

:)

Nov 6 '08 #18

Stephan Beal

A follow up on how i got to a safe solution...

On Nov 6, 7:07 am, Stephan Beal <sgb...@googlemail.com> wrote:

> On Nov 5, 11:07 pm, Eric Sosman <Eric.Sos...@sun.com> wrote:
>> One clean way to handle this is to pack all the allied types
>> into a union, but this requires that you know all those types up

i've ruled out the union idea because concrete impls can be provided
by client code, and i obviously can't link those in to my lib.

>> front, which can be irksome. Another clean way to handle it is to
>> use a `base_api*' and point it at the `api' element of whichever
>> structs you're dealing with. Since the `api' is the first element
>> in each struct, it is always safe to convert the struct pointer to
>> a `base_api*' and back.

Here's what i've ended up doing, which offers both the approach with
the safety guarantee and the
extension-which-might-work-but-is-technically-unsafe approach:

typedef struct base_api {
    void (*member1)( struct foo const * self );
    int (*member2)( struct foo const * self, int arg1 );
    ...
    void const * implData;
} base_api;

Now my Base type looks like:

typedef struct base {
    base_api const * api;
} base;

(This extra level of indirection isn't really necessary any longer,
and i may get rid of it.)

For my particular cases, all of my implementations can (and in fact
should) be initialized with constant, immutable data (it may be
instance-specific but should be immutable). With this approach i no
longer need concrete "subclasses" - i only need concrete
implementations of base, which allows me to completely avoid the
((base*)mySubT) cast. The impl functions can require that the
api->implData object be set to some implementation-specific value, which
the impls can then cast to their heart's content.

What's all this for?

As part of c11n (http://s11n.net/c11n/) i need abstract interfaces for
3 particular object types. The interfaces are used by the rest of the
API and only care that impls follow the rules defined in the API docs
for the base class API. For example, i have an interface called
c11n_marshaller, which is a marshaller type for de/serializing objects
of a specific type (we need one implementation/instance per
serializable type). Some common cases (e.g. well-known PODs) can be
combined into a single implementation of the base_api functions,
differing only in the metadata they need for the marshalling
conversion. To do this we point the api->implData to some instance-
specific static struct containing that metadata which differs from POD
type to POD type (e.g. a printf/scanf specifier). For the c11n_stream
interface, the (void * implData) (non-const) member will hold info for
the underlying native stream object (e.g. FILE handle or in-memory
buffer).

Anyway...

Thanks a thousand times to all of you for your feedback - it's helped
me move away from a potentially horrible design mistake!

Nov 6 '08 #19
