Object persistence in C

I am writing software to make a general storage
facility of any kind of objects to/from disk.

The intermediate format used is XML, using the schema
(modified a bit) of Microsoft: xmlns="x-schema:xop-schema.xml"

Operation:
----------
The software generates several C functions that implement the
writing of the XML. To make things more concrete suppose
the following setup:

typedef struct tagG {
    int tab[10];
} Tab;
typedef struct tagstruct {
    char a;
    short b;
    int c;
    unsigned d;
    long e;
    long long f;
    long double g;
    double h;
    char * str;
    Tab tab;
    struct tagstruct *Next;
} structure;

The "wizard" software generates the following functions:
----------------------------------------------
//@ Serialization function for structure structure
int structureSerialize(structure *data,FILE *out)
{
    int i;
    unsigned char *p;
    if (data == NULL)
        return 0;
    if (!initialized) {
        InitXmlWriter(out);
        initialized=1;
    }
    fprintf(out,"<Object id=\"ID%x\" typename=\"structure\">\n",(int)data);
    fprintf(out,"\t<byte name=\"a\">%d</byte>\n",data->a);
    fprintf(out,"\t<int name=\"b\">%d</int>\n",data->b);
    fprintf(out,"\t<int name=\"c\">%d</int>\n",data->c);
    fprintf(out,"\t<unsignedInt name=\"d\">%u</unsignedInt>\n",data->d);
    fprintf(out,"\t<int name=\"e\">%ld</int>\n",data->e);
    fprintf(out,"\t<long name=\"f\">%lld</long>\n",data->f);
    // Type long double not supported natively.
    // Using hexadecimal encoding
    p = (unsigned char *)&data->g;
    fprintf(out,"\t<bin.hex name=\"g\">");
    for(i=0; i<12;i++) {
        fprintf(out,"%x",*(p++) & 0xff);
    }
    fprintf(out,"</bin.hex>\n");
    fprintf(out,"\t<double name=\"h\">%.15g</double>\n",data->h);
    // Assume char * points to strings
    fprintf(out,
        "\t<string name=\"str\" xml:space=\"preserve\">%s</string>\n",
        data->str);
    fprintf(out,"\t<IDREF name=\"tab\">ID%x</IDREF>\n",&data->tab);
    fprintf(out,"\t<IDREF name=\"Next\">ID%x</IDREF>\n",data->Next);
    fprintf(out,"</Object>\n");
    structureSerialize(data->Next,out); // follow the Next pointer
    TabSerialize(&data->tab,out); // Follow embedded structures
    return 1;
}
-----------------------------------------------------------------
This function, when called will generate the following xml:
----------------------------------------------------
<Object id="ID12ff00" typename="structure">
    <byte name="a">-56</byte>
    <int name="b">3876</int>
    <int name="c">-254</int>
    <unsignedInt name="d">598877</unsignedInt>
    <int name="e">777899</int>
    <bin.hex name="g">000000080ff7f00</bin.hex>
    <double name="h">687.988877</double>
    <string name="str" xml:space="preserve">A string</string>
    <IDREF name="tab">ID12ff40</IDREF>
    <IDREF name="Next">ID0</IDREF>
</Object>
---------------------------------------------------------

Design principles:
------------------

1) The software will follow pointers and should be able to cope with
complicated and messy graphs, even if they contain loops.
To do this it records the address of each object stored.
(Not shown in the example above)
2) Since the address of each object is unique, the implementation
contains no embedded objects, just references (pointers) to
other objects. All objects are stored under the ObjectStore
tag (not shown).

3) Open issues are what to do with:
A) Unions. In my opinion there is no way to know which of the
members of the union is valid, so unions will not be followed
and just stored in binary form.
B) Function pointers. There is no easy way to know what is
the name of the function stored in a function pointer.
Storing the pointer may be useful if the program is loaded
at the same address.
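
A minimal sketch of the address-recording idea from point 1, assuming a small fixed-size table is good enough for illustration. The names `seen_table`, `seen_before` and `MAX_SEEN` are hypothetical and are not part of the generated code shown above.

```c
#include <stddef.h>

#define MAX_SEEN 1024            /* hypothetical fixed capacity */

struct seen_table {
    const void *addr[MAX_SEEN];  /* addresses already written out */
    size_t      count;
};

/* Returns 1 if p was already recorded, 0 if it was newly recorded,
   and -1 if the table is full. */
static int seen_before(struct seen_table *t, const void *p)
{
    size_t i;
    for (i = 0; i < t->count; i++)
        if (t->addr[i] == p)
            return 1;
    if (t->count == MAX_SEEN)
        return -1;
    t->addr[t->count++] = p;
    return 0;
}
```

A generated serializer could call `seen_before` on entry and return at once when the result is 1, so a cyclic list terminates. The linear search is only for brevity; a large graph would probably want a hash table.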

I have followed a bit the literature about this, and I have never
seen any C implementation. Just C++ ones, where the problems are
much bigger than in C since they have to cope with multiple
inheritance hierarchies, templates, whatever. Happily in C everything
is much simpler.

Questions:

Are any of you aware of an implementation of this in C?

What would you propose for unions and function pointers?

Are there any other standards for datatypes in XML besides
the one mentioned above?

Thanks in advance for your time

jacob
Nov 15 '05 #1
jacob navia wrote:
I am writing software to make a general storage
facility of any kind of objects to/from disk.


You might try comp.programming, since they deal in a lot of the
algorithmic questions.

Jon
----
Learn to program using Linux assembly language
http://www.cafeshops.com/bartlettpublish.8640017
Nov 15 '05 #2


jacob navia wrote:
I am writing software to make a general storage
facility of any kind of objects to/from disk.
[...]
The "wizard" software generates the following functions:
----------------------------------------------
//@ Serialization function for structure structure
int structureSerialize(structure *data,FILE *out)
{
    int i;
    unsigned char *p;
    if (data == NULL)
        return 0;
    if (!initialized) {
        InitXmlWriter(out);
        initialized=1;
    }
Is `initialized' a static variable somewhere? If so,
it seems you can have only one XmlWriter stream active at
a time, or maybe even at all.

A possible alternative would be to wrap the FILE* in
a struct of its own along with whatever state variables
are needed, so you can do

XmlWriter *outxml = NewXmlWriter(out);

.... and then pass an XmlWriter* to all the wizard-generated
("charmed?") functions.
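
A minimal sketch of that suggestion, assuming the only state needed is the "prolog already written" flag. `XmlWriterBegin` and the struct layout are hypothetical, and the prolog text is a placeholder, since the thread does not show what `InitXmlWriter` actually writes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical writer state: one per output stream, no globals. */
typedef struct XmlWriter {
    FILE *out;
    int   initialized;   /* has the prolog been written yet? */
} XmlWriter;

/* Allocates a writer for an already-open stream; NULL on failure. */
XmlWriter *NewXmlWriter(FILE *out)
{
    XmlWriter *w = malloc(sizeof *w);
    if (w != NULL) {
        w->out = out;
        w->initialized = 0;
    }
    return w;
}

/* Writes the (placeholder) prolog once per writer.
   Returns 1 on success, 0 if the stream reports an error. */
int XmlWriterBegin(XmlWriter *w)
{
    if (!w->initialized) {
        fprintf(w->out, "<?xml version=\"1.0\"?>\n");
        w->initialized = 1;
    }
    return !ferror(w->out);
}
```

Because the flag lives in the struct, two writers on two streams do not interfere with each other. The caller releases a writer with `free()`.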
    fprintf(out,"<Object id=\"ID%x\" typename=\"structure\">\n",(int)data);
Non-portable (as I expect you know), since the conversion
from pointer to int is implementation-defined and perhaps
meaningless. Even if the conversion does something simple
like "just copy the bits," the generated object IDs might
not be unique (if int is narrower than pointer, say, or if
dynamic memory management re-uses a free()d object's memory).
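
One hedged way around the width problem is to convert through `uintptr_t` and print with `PRIxPTR`. The helper name `FormatObjectId` is hypothetical. This sketch does nothing about the second problem, an address being reused after `free()`.

```c
#include <inttypes.h>
#include <stdio.h>

/* Formats an object ID from its address into buf and returns the
   snprintf() result. uintptr_t is optional in C99; where it exists, a
   void * converted to it and back compares equal to the original. */
int FormatObjectId(char *buf, size_t size, const void *obj)
{
    return snprintf(buf, size, "ID%" PRIxPTR, (uintptr_t)obj);
}
```

Two objects that are alive at the same time get different IDs this way, since their addresses differ and the conversion can be reversed.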
    fprintf(out,"\t<byte name=\"a\">%d</byte>\n",data->a);
    fprintf(out,"\t<int name=\"b\">%d</int>\n",data->b);
    fprintf(out,"\t<int name=\"c\">%d</int>\n",data->c);
...
Bleah. Have you considered a table-driven solution?
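
A rough sketch of what a table-driven variant might look like, assuming the wizard emits one descriptor table per struct instead of one `fprintf` per field. `struct Sample`, `FieldDesc`, `FieldKind` and `WriteFields` are all hypothetical names, and only three integer kinds are handled.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical three-field struct standing in for the one in the post. */
struct Sample { char a; short b; int c; };

enum FieldKind { F_CHAR, F_SHORT, F_INT };

struct FieldDesc {
    const char    *name;   /* value of the name="..." attribute */
    const char    *tag;    /* XML element name */
    enum FieldKind kind;
    size_t         offset; /* from offsetof() */
};

static const struct FieldDesc SampleFields[] = {
    { "a", "byte", F_CHAR,  offsetof(struct Sample, a) },
    { "b", "int",  F_SHORT, offsetof(struct Sample, b) },
    { "c", "int",  F_INT,   offsetof(struct Sample, c) },
};

/* Walks the table and writes one element per field.
   Returns 1 on success, 0 if the stream reports an error. */
int WriteFields(const void *obj, const struct FieldDesc *fd, size_t n,
                FILE *out)
{
    size_t i;
    for (i = 0; i < n; i++) {
        const char *p = (const char *)obj + fd[i].offset;
        long v = 0;
        switch (fd[i].kind) {
        case F_CHAR:  v = *(const char *)p;  break;
        case F_SHORT: v = *(const short *)p; break;
        case F_INT:   v = *(const int *)p;   break;
        }
        fprintf(out, "\t<%s name=\"%s\">%ld</%s>\n",
                fd[i].tag, fd[i].name, v, fd[i].tag);
    }
    return !ferror(out);
}
```

The generator then only has to emit the table, and the same `WriteFields` loop serves every struct.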
    // Type long double not supported natively.
    // Using hexadecimal encoding
    p = (unsigned char *)&data->g;
    fprintf(out,"\t<bin.hex name=\"g\">");
    for(i=0; i<12;i++) {
        fprintf(out,"%x",*(p++) & 0xff);
    }
Non-portable, of course.
    structureSerialize(data->Next,out); // follow the Next pointer
    TabSerialize(&data->tab,out); // Follow embedded structures
I'd have expected these to be done in the opposite order
(but I haven't read the M'soft specs). Either way, though,
using recursion to chase what might be a long linked list is
not a wonderful idea.
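
For the simple linked-list case, the recursion can be replaced by a loop. A minimal sketch, assuming a non-cyclic list and a hypothetical `struct Node` with a single payload field:

```c
#include <stdio.h>

/* Hypothetical list node; only the link and one payload field matter. */
struct Node {
    int          c;
    struct Node *Next;
};

/* Serializes every node reachable through Next with a loop instead of
   recursion, so stack use stays constant however long the list is.
   Returns the number of nodes written, or -1 on a stream error. */
long SerializeList(const struct Node *head, FILE *out)
{
    long n = 0;
    const struct Node *p;
    for (p = head; p != NULL; p = p->Next) {
        fprintf(out, "<Object><int name=\"c\">%d</int></Object>\n", p->c);
        n++;
    }
    return ferror(out) ? -1 : n;
}
```

A general object graph with several pointers per node would need an explicit work list of pending objects rather than a single loop variable.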
return 1;
If `1' means "success," maybe this should be written
as `return !ferror(out);' or some such.
3) Open issues are what to do with:
A) Unions. In my opinion there is no way to know which of the
members of the union is valid, so unions will not be followed
and just stored in binary form.
Hence non-portable.
B) Function pointers. There is no easy way to know what is
the name of the function stored in a function pointer.
Storing the pointer may be useful if the program is loaded
at the same address.
... and hasn't been recompiled or even relinked, and
hasn't been loaded with a newer version of a shared library,
and isn't running under a debugger and ...

There's also the problem that C doesn't define the
conversion of a function pointer to any numeric datum; the
only way to get a portable representation would be to deal
with the pointer's constituent bytes. The byte stream would
be interpretable by but meaningless to a recipient other than
the same program (if lucky), hence non-portable.

If you have a table of "pointable" functions you can
translate the pointer to a name easily enough -- and such
a table would seem necessary on the receiving end, to get
from name back to function pointer again. If you get hold
of a function pointer whose target is not in your table,
I think you should announce a serialization failure.
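
A minimal sketch of such a table, assuming all serializable callbacks share one signature. `Callback`, `OnOpen`, `OnClose`, `FnName` and `FnLookup` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <string.h>

typedef void (*Callback)(void);

/* Two hypothetical callbacks a program might store in its structures. */
static int opened;
void OnOpen(void)  { opened = 1; }
void OnClose(void) { opened = 0; }

static const struct { const char *name; Callback fn; } FnTable[] = {
    { "OnOpen",  OnOpen  },
    { "OnClose", OnClose },
};
#define FN_COUNT (sizeof FnTable / sizeof FnTable[0])

/* Pointer to name, for the writer. NULL means "not in the table",
   which the caller should treat as a serialization failure. */
const char *FnName(Callback fn)
{
    size_t i;
    for (i = 0; i < FN_COUNT; i++)
        if (FnTable[i].fn == fn)
            return FnTable[i].name;
    return NULL;
}

/* Name to pointer, for the reader. NULL means "unknown name". */
Callback FnLookup(const char *name)
{
    size_t i;
    for (i = 0; i < FN_COUNT; i++)
        if (strcmp(FnTable[i].name, name) == 0)
            return FnTable[i].fn;
    return NULL;
}
```

Only the name goes into the XML, so the data stays meaningful after a recompile or relink as long as the receiving program has the same names in its table.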
Questions:

Are any of you aware of an implementation of this in C?

What would you propose for unions and function pointers?
If you can't support them usefully, don't support them
at all. Opinion only; YMMV.
Are there any other standards for datatypes in XML besides
the one mentioned above?


I don't know. Probably. My counter-question: Since you're
committed to a non-portable representation anyhow (c.f. the
treatment of `long double'), why fool around with XML? What
advantage does it offer if the portably-packaged content isn't
itself portable?

--
Er*********@sun.com

Nov 15 '05 #3
jacob navia wrote:
3) Open issues are what to do with:
A) Unions. In my opinion there is no way to know which of the
members of the union is valid, so unions will not be followed
and just stored in binary form.
B) Function pointers. There is no easy way to know what is
the name of the function stored in a function pointer.
Storing the pointer may be useful if the program is loaded
at the same address.
What would you propose for unions and function pointers?

jacob


You could provide serialisation functions that take a list of union
types and/or already assigned function pointers for that particular
structure, in order of appearance in the structure. You could easily
do this by overloading the function, using your C compiler with
overloading extensions ;->

This is the way I would have done it. Probably you have already thought
of better solutions.

Bahadir

Nov 15 '05 #4
Thanks for your answer. I reply below:

Eric Sosman wrote:

jacob navia wrote:
I am writing software to make a general storage
facility of any kind of objects to/from disk.
[...]
The "wizard" software generates the following functions:
----------------------------------------------
//@ Serialization function for structure structure
int structureSerialize(structure *data,FILE *out)
{
    int i;
    unsigned char *p;
    if (data == NULL)
        return 0;
    if (!initialized) {
        InitXmlWriter(out);
        initialized=1;
    }

Is `initialized' a static variable somewhere? If so,
it seems you can have only one XmlWriter stream active at
a time, or maybe even at all.


In this first implementation yes. I will improve that later, creating
an output stream type, that will contain the static
data.
A possible alternative would be to wrap the FILE* in
a struct of its own along with whatever state variables
are needed, so you can do

XmlWriter *outxml = NewXmlWriter(out);

Exactly. Thanks for pointing this.
... and then pass an XmlWriter* to all the wizard-generated
("charmed?") functions.

    fprintf(out,"<Object id=\"ID%x\" typename=\"structure\">\n",(int)data);

Non-portable (as I expect you know), since the conversion
from pointer to int is implementation-defined and perhaps
meaningless. Even if the conversion does something simple
like "just copy the bits," the generated object IDs might
not be unique (if int is narrower than pointer, say, or if
dynamic memory management re-uses a free()d object's memory).


You are right. Will change that to (intptr_t) and include
<stdint.h>

    fprintf(out,"\t<byte name=\"a\">%d</byte>\n",data->a);
    fprintf(out,"\t<int name=\"b\">%d</int>\n",data->b);
    fprintf(out,"\t<int name=\"c\">%d</int>\n",data->c);
...

Bleah. Have you considered a table-driven solution?


Note that you are seeing the code generated by the "wizard", not
the code of the wizard itself. This is straightforward to generate
and easy to follow.

    // Type long double not supported natively.
    // Using hexadecimal encoding
    p = (unsigned char *)&data->g;
    fprintf(out,"\t<bin.hex name=\"g\">");
    for(i=0; i<12;i++) {
        fprintf(out,"%x",*(p++) & 0xff);
    }

Non-portable, of course.


True. I have to investigate writing ratios of big precision
integers, since integers are supported with 64 bit precision,
maybe I can express a long double as a/b where a and b are 64 bit
quantities.

    structureSerialize(data->Next,out); // follow the Next pointer
    TabSerialize(&data->tab,out); // Follow embedded structures

I'd have expected these to be done in the opposite order
(but I haven't read the M'soft specs). Either way, though,
using recursion to chase what might be a long linked list is
not a wonderful idea.


You have a point here. But I do not see an easy way out other than
recurse.
return 1;

If `1' means "success," maybe this should be written
as `return !ferror(out);' or some such.


Yes, good suggestion.
3) Open issues are what to do with:
A) Unions. In my opinion there is no way to know which of the
members of the union is valid, so unions will not be followed
and just stored in binary form.

Hence non-portable.


I do not see what I could do other than that.

B) Function pointers. There is no easy way to know what is
the name of the function stored in a function pointer.
Storing the pointer may be useful if the program is loaded
at the same address.

... and hasn't been recompiled or even relinked, and
hasn't been loaded with a newer version of a shared library,
and isn't running under a debugger and ...

There's also the problem that C doesn't define the
conversion of a function pointer to any numeric datum; the
only way to get a portable representation would be to deal
with the pointer's constituent bytes. The byte stream would
be interpretable by but meaningless to a recipient other than
the same program (if lucky), hence non-portable.


I say "may" be useful. Probably I should bail out with an error, the
same as when I find a union.
If you have a table of "pointable" functions you can
translate the pointer to a name easily enough -- and such
a table would seem necessary on the receiving end, to get
from name back to function pointer again. If you get hold
of a function pointer whose target is not in your table,
I think you should announce a serialization failure.

Questions:

Are any of you aware of an implementation of this in C?

What would you propose for unions and function pointers?

If you can't support them usefully, don't support them
at all. Opinion only; YMMV.


I think I will do that. Better warn the user of unsupported
features.
Are there any other standards for datatypes in XML besides
the one mentioned above?

I don't know. Probably. My counter-question: Since you're
committed to a non-portable representation anyhow (c.f. the
treatment of `long double'), why fool around with XML? What
advantage does it offer if the portably-packaged content isn't
itself portable?


Well, besides the long double problem, other types are 100% portable.
Other ways to encode the long double in a portable way would be
to split it in mantissa, sign and exponent, and store them in portable
types: mantissa in a 64 bit unsigned integer (supported natively),
sign and exponent (without the bias) as normal integers.
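
A rough sketch of that split using the standard `frexpl` and `ldexpl` functions, assuming a finite value (no infinities or NaNs) and a significand of at most 64 bits; on an implementation with a wider `long double` the low bits are truncated. `LdParts`, `LdSplit` and `LdJoin` are hypothetical names.

```c
#include <math.h>
#include <stdint.h>

struct LdParts {
    int      negative;   /* 1 if the sign bit is set */
    int      exponent;   /* unbiased binary exponent from frexpl() */
    uint64_t mantissa;   /* top 64 bits of the significand */
};

/* Splits a finite long double so that
   |x| == mantissa * 2^(exponent - 64), up to truncation to 64 bits. */
struct LdParts LdSplit(long double x)
{
    struct LdParts p;
    long double m;
    p.negative = signbit(x) ? 1 : 0;
    m = frexpl(fabsl(x), &p.exponent);      /* m is 0 or in [0.5, 1) */
    p.mantissa = (uint64_t)ldexpl(m, 64);   /* scale up to an integer */
    return p;
}

/* Inverse of LdSplit(). */
long double LdJoin(struct LdParts p)
{
    long double v = ldexpl((long double)p.mantissa, p.exponent - 64);
    return p.negative ? -v : v;
}
```

All three parts are plain integers, so they fit the natively supported XML types. On most systems the program has to be linked with the math library.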

Thanks for your input.

jacob
Nov 15 '05 #5


jacob navia wrote:
Thanks for your answer. I reply below:

Well, besides the long double problem, other types are 100% portable.
Well, "100% portable to the implementations where they're
portable." ;-) An `int', for example, is only portable if
its value is in the range -32767 <= i <= 32767, a `char'
(considered as a number) is only portable if 0 <= c <= 127,
and other types have similar "value bands" of portability.

And then there's floating-point: You're converting to
text with "%.15g", but you really don't know how many decimal
digits you need to guarantee that the receiver can read back
exactly the same value the sender serialized. If you can use
C99 features, consider flavors of "%a" instead.
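
A minimal sketch of the "%a" approach, assuming a C99 library, binary floating point and finite values. The helper names `FormatDoubleHex` and `ParseDoubleHex` are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Writes x as a C99 hexadecimal floating-point string.
   Returns the snprintf() result. */
int FormatDoubleHex(char *buf, size_t size, double x)
{
    return snprintf(buf, size, "%a", x);
}

/* Reads the text back; C99 strtod() accepts the hexadecimal form. */
double ParseDoubleHex(const char *text)
{
    return strtod(text, NULL);
}
```

With no precision given, "%a" writes enough hexadecimal digits to represent the value exactly when the floating-point radix is a power of two, so the reader gets back the same value without any decimal rounding.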

There's also the nasty issue of infinities and NaNs, which
(1) are not supported on all implementations and (2) can have
implementation-defined text formats (see 7.19.6.1/8).
Other ways to encode the long double in a portable way would be
to split it in mantissa, sign and exponent, and store them in portable
types: mantissa in a 64 bit unsigned integer (supported natively),
sign and exponent (without the bias) as normal integers.


I still don't understand why `long double' should be any
more troublesome than `double' or `float'. It's supported on
all conforming C implementations (albeit with different ranges
and precisions, but that doesn't seem to bother you for any
of the other types). Why is `long double' special?

--
Er*********@sun.com

Nov 15 '05 #6
Eric Sosman wrote:
I still don't understand why `long double' should be any
more troublesome than `double' or `float'. It's supported on
all conforming C implementations (albeit with different ranges
and precisions, but that doesn't seem to bother you for any
of the other types). Why is `long double' special?

Because the XML reader should support natively double/float/64 bit
ints and 32 bit ints. Long double isn't in that list.

This is from the specs I have read at the microsoft site that
described the xop-schema that extends the XML datatype schema.
Nov 15 '05 #7
Eric Sosman wrote:
[snip] And then there's floating-point: You're converting to
text with "%.15g", but you really don't know how many decimal
digits you need to guarantee that the receiver can read back
exactly the same value the sender serialized. If you can use
C99 features, consider flavors of "%a" instead.


Using the IEEE 754 representation DBL_DIG is 15. That's why I used
that. Isn't that correct? What value would you use?

And of course, if the reading machine has 16 bits ints, some values
can't be read back as such, or if it doesn't support
floating point, etc etc.
Nov 15 '05 #8
jacob navia <ja***@jacob.remcomp.fr> writes:
Eric Sosman wrote:
[snip]
And then there's floating-point: You're converting to
text with "%.15g", but you really don't know how many decimal
digits you need to guarantee that the receiver can read back
exactly the same value the sender serialized. If you can use
C99 features, consider flavors of "%a" instead.


Using the IEEE 754 representation DBL_DIG is 15. That's why I used
that. Isn't that correct? What value would you use?


I'm jumping into the middle of this without having read some of the
previous discussion, but ...

If you're assuming IEEE 754 representation, you're not writing
completely portable C code. That's not necessarily a horrible
thing, but you should at least document your assumptions.

As for what value you should use, why not just use DBL_DIG? (Or do
you need DBL_DIG+1 to guarantee you can retrieve the original value?
Perhaps a floating-point expert can clarify.)
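
A small check that may help with this question, assuming an IEEE 754 `double`. DBL_DIG (15) counts the decimal digits that survive a trip from text to `double` and back to text; going the other way, from `double` to text and back, can need up to 17 significant digits (C11 names this DBL_DECIMAL_DIG). The helper `RoundTrips` is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if x survives a text round trip with the given number of
   significant decimal digits, 0 otherwise. */
int RoundTrips(double x, int digits)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%.*g", digits, x);
    return strtod(buf, NULL) == x;
}
```

For example, the double nearest to 0.30000000000000004 prints as 0.3 with both 15 and 16 digits and reads back as a different value, while 17 digits restores it. So under this assumption neither DBL_DIG nor DBL_DIG+1 is enough in general.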

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #9
[ Jumping in the middle of a discussion which is happening in two forums is
probably a bad idea, but... ]

In news:42***********************@news.wanadoo.fr,
jacob navia wrote:
Because the XML reader should support natively double/float/64 bit
ints and 32 bit ints. Long double isn't in that list.


You are writing for Win32/64, ain't you? Then long double has the same
representation as double on these platforms, so the XML reader will not make
any problem.

Of course Virginia, it is not portable to make such an assumption. But it is
exactly as not portable as would it be to assume that long long int is _not_
an 128-bit wide type (which is not handled either).
Antoine

Nov 15 '05 #10
