I want to make a C# library of some classes

Chris F Clark

Please excuse the length of this post; I am unfortunately long-winded
and don't know how to make my postings more brief.

I have a C++ class library (and application generator, called
Yacc++(r) and the Language Objects Library) that I have converted over
to C#. It works okay. However, in the C# version, one has to build
the class library into the generated application, because I haven't
structured this one thing right. I would like to fix that obvious
hole for my clients, so that it works "just like" the C++ version.
(Well, there will always be some differences, for example the C#
version is nicely garbage collected while in the C++ version, that is
done in the C++ style.)

Ok, here is a summary of the key problem.

There is this class "yy_ref_obj" which is an application specific
class. That is, each application run through the generator gets its
own custom yy_ref_obj class that is tailored to that application.

There are some other classes, like "yy_psr_dflt_obj", which are also
generated as part of the application. In fact, the yy_psr_dflt_obj
class is the factory class for the yy_ref_obj class.
That is, all the yy_ref_obj objects are created by calls to functions
in the yy_psr_dflt_obj class.

However, there are common methods and members that each
yy_psr_dflt_obj class uses; these are encapsulated in classes like
yy_psr_obj, which is a base class for the yy_psr_dflt_obj class and is
one of the classes I would like to put into the class library.

The problem is that the base class has some arrays of yy_ref_objs and
member functions which return elements of those arrays. In C++, I use
pointers to yy_ref_objs to resolve this problem; an array of
yy_ref_objs in C++ is simply a pointer to the first yy_ref_obj in the
array. Thus, the base yy_psr_obj class can use a "forward declared"
(opaque) version of a yy_ref_obj (i.e. class yy_ref_obj;) to declare
the functions that return yy_ref_objs. Those functions are often
virtual and defined in the derived yy_psr_dflt_obj class, which is
where actual yy_ref_objs are needed in any case. What makes this work
in C++ is that any time one needs a real yy_ref_obj, one has brought in
an application specific header file and the code can determine what the
application specific layout is. If one has only the "headers" for the
class library loaded, one cannot see into a yy_ref_obj, so one cannot
tell its size or shape, and one doesn't care. (If this were a
functional programming group, I would explain that the library uses
parametric polymorphism, i.e. the parameter is strongly typed, but the
library is agnostic to what that type actually is, if that helps anyone
understand the problem.)

However, I don't know the best way to approach the same problem in C#.
The current solution uses the yy_ref_obj class from the application in
the library code, which couples the two pieces of code together too
strongly. I have thought about just using the "Object" class, but I
really want code in the application that calls functions in the
library to get back application specific objects and not have to cast
them. I have also thought of using an interface class (e.g. a
yy_ref_ptr class) in the library code and having something that
implicitly converts a yy_ref_ptr to a yy_ref_obj, but I'm not sure of
the C# way of expressing that idea.
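
The implicit-conversion idea could be sketched roughly as follows. This is only an illustration under assumptions: RefPtr, MyData, and ConversionDemo are hypothetical names, not part of Yacc++. C# does not permit user-defined conversions to or from an interface type, so the library-side handle in this sketch is a class rather than an interface, and the conversion operator is declared in the application's data class.

```csharp
using System;

// Library side (hypothetical): an opaque handle. It is a class, not an
// interface, because user-defined conversions to or from interfaces
// are not allowed in C#.
public class RefPtr
{
    private readonly object target;
    public RefPtr(object target) { this.target = target; }
    public object Target { get { return target; } }
}

// Application side (hypothetical): the generated data class supplies
// the conversion from the library handle to itself.
public class MyData
{
    public int numberNouns;
    public int numberVerbs;

    public static implicit operator MyData(RefPtr p)
    {
        return (MyData)p.Target; // the cast still exists, but only here
    }
}

public static class ConversionDemo
{
    public static int Run()
    {
        RefPtr handle = new RefPtr(new MyData());
        MyData data = handle;     // implicit conversion applies on assignment
        data.numberNouns++;
        // handle.numberNouns++;  // does not compile: member access never
        //                        // triggers a user-defined conversion
        MyData again = handle;    // same underlying object
        return again.numberNouns;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

The commented-out line shows the limitation: the conversion helps when the result is assigned to a typed variable, but an expression such as myData(2).numberNouns++ would still need a typed return value.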

Note, I do have this application generation program (Yacc++)
involved, so I can generate classes and member functions that do the
real work (with yy_ref_objs) in the application context. I just want
to keep the non-application specific pieces in the library as much as
possible.

Note the current code is targeted to VS 2003, but I can target VS 2005
if that will help.

Thanks for any help or suggestions,
-Chris

*****************************************************************************
Chris Clark Internet : co*****@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)
------------------------------------------------------------------------------

Apr 22 '06 #1

Lebesgue

You seem to be very, very much bound to C++ thinking. When coding in C#,
first of all, you should change your naming style to something more human
than yy_psr_dflt_obj for a class.
Your post is so long and contains so many details that I don't understand
your problem. Are you having trouble compiling to a class library? Or
designing the interfaces between library and application? In fact, you've
provided the answer to some of your questions in the question itself, like
the one about interfaces. Yes, you should use them. But only after learning
some C#.
Maybe this could help you dig into your interface-implementation issue:
http://msdn2.microsoft.com/en-us/library/87d83y5b.aspx, but I'd strongly
suggest reading a C# book first.

Apr 22 '06 #2

Chris F Clark

I apologize for what appears to you to be C++ thinking. It is more
eclectic than that, as I probably program in 3 or 4 languages on any
given day: C++, Lisp, Perl, and Verilog. The names of the classes are
historical and have been that way for about 15 years and correspond to
conventions commonly used by compiler writers, which go farther back
than that--the roots of the names are partly in PL/I and partly in C.
However, I will switch to using Camel names to make the post more
understandable. I have read several C# books and understand
interfaces, as well as a host of other concepts. I have interfaces in
my library to encapsulate certain concepts, like IParser (something
that parses), ILexer (something that lexes), IErrorPosition (some
place you can report an error associated with, e.g. a file name and
line number), IErrorMessage (some kind of error that can be reported),
and so forth. However, I've only written around 100k lines of C# code
so far, so I'm far from facile in it.

Ok, so a user gives a specification (a grammar) to the application
generator that describes some language (perhaps simplified-English
with paragraphs, sentences, phrases, nouns, verbs, and so forth or
perhaps a programming language with blocks, statements, expressions,
variables, and constants). In that specification, the application
designer provides some code that is to be executed as the language is
recognized. In that code are some computations that need to be
performed and some data structures to perform those computations on.
The application generator reads that in and spits out a C# program
that reads in texts in the given language and performs the desired
calculations.

To make that more precise, let's say the user is counting nouns and
verbs in English sentences. The user may want the resulting program
to keep two kinds of counters, numberVerbs and numberNouns. So, they
make a class MyData { int numberVerbs; int numberNouns; }. The
generated program keeps objects of this class for the user and does
the user specified calculations on those objects, which might look
like this:

    sentence: adjective* noun { myData(2).numberNouns++; }
              adverb* verb { myData(4).numberVerbs++; } object?;

Now, the problem is that "MyData myData(int whichOne)" is a member
function of the IParser interface that is in fact at least partially
implemented in the BaseParser class that I want in the library.
However, the MyData class is application specific and some
applications will have more than one parser and those parsers will have
different MyData classes. I don't want the user to have to write
{ MyEnglishData(myData(2)).numberNouns++; } in the English parser and
{ MyPascalData(myData(1)).firstVariable } in their Pascal parser.

Now, in other languages, I have gotten that to work by making
EnglishParser implement the IArrayMyData interface and having the code
that was specific to different application specific MyData classes be
handled in that. The BaseParser class has an IArrayMyData member
(i.e. IArrayMyData myParser;) that is the EnglishParser that it uses
to access the actual data, as in:

//// This I want in a pre-compiled library:

    class BaseParser : IParser {
        IArrayMyData myParser; // application specific parser
        MyData myData(int whichOne)
        {
            int firstEntry = myParser.firstEntry();
            return myParser.GetNextEntry(firstEntry, whichOne);
        }
        ....
    }

//// In (generated) application code:

    namespace EnglishParserSpace;
    class MyData { int numberVerbs; int numberNouns; }
    class EnglishParser : BaseParser, IArrayMyData
    {
        ... // including the code we copied from the user's spec
        void sawNounInSentence() { myData(2).numberNouns++; }
        void sawVerbInSentence() { myData(4).numberVerbs++; }
    }

//// In other (generated) application code:

    namespace PascalParserSpace;
    class MyData { MyData firstVariable; ... }
    class PascalParser : BaseParser, IArrayMyData
    {
        ...
        void sawVarInDeclaration() { myData(1).firstVariable = myData(2); }
    }

The problem is how can I compile BaseParser into a library that
doesn't depend on knowing the type of MyData and without requiring the
users to cast the result of the myData function back to the correct
type?
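
One possible direction, assuming the VS 2005 (C# 2.0) target is acceptable, is to make the library classes generic in the data type, which is the closest C# analogue of the parametric polymorphism described earlier in the thread. The following is a minimal sketch, not the Yacc++ design: IParser<TData>, BaseParser<TData>, the slot-count constructor, and GenericDemo are hypothetical, and the IArrayMyData lookup is replaced by a plain list to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Library side (hypothetical): compiled once, with no reference to any
// concrete data class.
public interface IParser<TData>
{
    TData myData(int whichOne);
}

public abstract class BaseParser<TData> : IParser<TData>
    where TData : class, new()
{
    private readonly List<TData> entries = new List<TData>();

    protected BaseParser(int slotCount)
    {
        for (int i = 0; i < slotCount; i++) entries.Add(new TData());
    }

    // Returns the application's own type, so callers need no cast.
    public TData myData(int whichOne) { return entries[whichOne]; }
}

// Generated application code for one parser.
namespace EnglishParserSpace
{
    public class MyData { public int numberVerbs; public int numberNouns; }

    public class EnglishParser : BaseParser<MyData>
    {
        public EnglishParser() : base(8) { }
        public void sawNounInSentence() { myData(2).numberNouns++; }
        public void sawVerbInSentence() { myData(4).numberVerbs++; }
    }
}

// Generated application code for another parser with a different MyData.
namespace PascalParserSpace
{
    public class MyData { public MyData firstVariable; }

    public class PascalParser : BaseParser<MyData>
    {
        public PascalParser() : base(4) { }
        public void sawVarInDeclaration() { myData(1).firstVariable = myData(2); }
    }
}

public static class GenericDemo
{
    public static int CountNouns()
    {
        EnglishParserSpace.EnglishParser english = new EnglishParserSpace.EnglishParser();
        english.sawNounInSentence();
        english.sawNounInSentence();
        english.sawVerbInSentence();
        return english.myData(2).numberNouns;
    }

    public static bool PascalLinksVariables()
    {
        PascalParserSpace.PascalParser pascal = new PascalParserSpace.PascalParser();
        pascal.sawVarInDeclaration();
        return object.ReferenceEquals(pascal.myData(1).firstVariable, pascal.myData(2));
    }

    public static void Main()
    {
        Console.WriteLine(CountNouns());
        Console.WriteLine(PascalLinksVariables());
    }
}
```

The class constraint matters here: if TData were a struct, myData(2) would return a copy and the increment would not compile. Generics are not available in VS 2003 (C# 1.x), so this sketch only applies if the newer target is chosen.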

Hope this helps,
-Chris

Apr 22 '06 #3

Lebesgue

Now I hope I understood your problem.
The first thing that comes to my mind is that you should simply include and
compile the BaseParser with the generated library. That seems too simple to
me, and I suppose you've got reasons not to do it.

> I don't want the user to have to write
> { MyEnglishData(myData(2)).numberNouns++; } in the English parser and
> { MyPascalData(myData(1)).firstVariable } in their Pascal parser.

If I understand correctly, you need a kind of "logical late binding" here,
so the second thing that comes to my mind is to let them type
myData(2).numberNouns++ and let the code generator add the cast according to
the context the statement is used in, or try to infer the type from the
property name they used. I don't know which one is applicable in your case.

If you don't want to or can't do it this way, the answer in general is to use
code instead of data. So instead of trying to cast the result to some data
structure, let the result do what it is supposed to do by providing a
polymorphic method to do it. This would possibly involve inferring the type
too, would surely add extra complexity to the code, and is not applicable in
all scenarios, but it would do the job. Since it is generated code, I don't
think the extra complexity or lack of understandability would hurt.
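
The "code instead of data" suggestion might look something like the sketch below. All names here (ISemanticData, OnRecognized, EnglishData, PolymorphicDemo) are hypothetical and chosen only for illustration; the point is that the library sees nothing but an interface, and the application-specific work happens inside a polymorphic method, so no cast appears at the call site.

```csharp
using System;

// Library side (hypothetical): the parser only knows this interface.
public interface ISemanticData
{
    void OnRecognized(string symbol);
}

public class BaseParser
{
    private readonly ISemanticData[] entries;

    public BaseParser(ISemanticData[] entries) { this.entries = entries; }

    public ISemanticData myData(int whichOne) { return entries[whichOne]; }
}

// Application side (hypothetical): the data object does the work itself.
public class EnglishData : ISemanticData
{
    public int numberNouns;
    public int numberVerbs;

    public void OnRecognized(string symbol)
    {
        if (symbol == "noun") numberNouns++;
        else if (symbol == "verb") numberVerbs++;
    }
}

public static class PolymorphicDemo
{
    public static int Run()
    {
        EnglishData counters = new EnglishData();
        BaseParser parser = new BaseParser(new ISemanticData[] { counters });

        parser.myData(0).OnRecognized("noun");
        parser.myData(0).OnRecognized("verb");
        parser.myData(0).OnRecognized("noun");

        // Encode both counters in one number: nouns * 10 + verbs.
        return counters.numberNouns * 10 + counters.numberVerbs;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

This compiles without generics, but it changes what the user writes in the grammar actions: a method call on the interface replaces direct field access such as numberNouns++.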

I will think about it tomorrow and eventually let you know if I find out how
it may be done in this scenario.

Apr 23 '06 #4

Chris F Clark

Lebesgue wrote (replying to me):

> Now I hope I understood your problem.
> The first thing that comes to my mind is that you should simply include and
> compile the BaseParser with the generated library - seems too simple for me
> and I suppose you've got reasons not to do it.

Yes, there are actually more (at least 1, but fewer than 5) classes
between the BaseParser and the ApplicationParser, plus auxiliary
classes that are involved. The point is all this code (and it is a
fair amount) is application invariant and if they use multiple
ApplicationParsers in one program, we want the code to be shared.

> If I understand correctly, you need a kind of "logical late binding"
> here, so the second thing that comes into my mind is to let them
> type myData(2).numberNouns++ and let the code generator add the cast
> according to context the command is used in or try to infer the
> type from the property name they used - don't know which one is
> applicable in your case.

We would also like to avoid having the tool (application generator) do
anything to the copied code besides copying it. There are a couple of
reasons for that.

First, it makes certain our tool doesn't get tied to a version of the
language. If someone makes a new version or variation of C#, we don't
want to have to change our tool to support that new version. More
importantly, we don't want to risk our tool getting the rules of C#
wrong and hindering someone from doing something valid because of an
error on our part.

Second, we want users to be able to write code that interoperates with
our generated code (and looks like the code that we copy from) but not
have that code processed by our tool. For example, someone might
write another class that calls member functions in one of our
generated classes. We want all the "magic" stuff that makes our code
correct to apply to their code as well, at least when they are
interacting with our classes. The more subtle case of that is when
they derive their class from one of our generated classes. We want what
makes our class work to make their class work, and they will be
writing their class in "vanilla" C#, so our solution must depend on
vanilla C#.

To explain this point more, in the early 90's our generated code was
written "very carefully" as a series of "include" files that we
combined with some "very sophisticated" source code with some "very
interesting" macros. As a result, our code although object-oriented
would actually compile in C as well as C++. However, the code was not
obvious to anyone outside our company. It was effectively a magic
incantation. Around 95, we redid our entire library getting rid of
any attempt to support C and simplifying the C++ so that "mere
mortals" could understand it. Some parts of the code are still
slightly "unusual" C++, just because we want to retain certain
historical things (e.g. the traditional yy prefix always associated
with "yacc" tools). However, the code is still readable because we
stick to a simple subset of C++ (e.g. no templates, no multiple
inheritance, no complicated macros) and have small easily composed
classes in the library. We are never going back.

The ideal is that someone can take the generated code plus the library,
include it in their project, and edit it just as if it were code they
had written. We have customers who do that. We are even trying to
make our tool handle the round-trip case, where the users have edited
the generated output and need to make a change to the grammar and have
the two changes get integrated "automatically" with minimal user
intervention.

We want our C# to have the same feel. Mostly, we have been
successful. Moreover, most of the C++ features work identically in
the C# version, so our customers who are used to the C++ version just
have to learn C# and not C# plus a new version of Yacc++.

This does not mean the code is the same. In C++ we use pointer types
extensively, while in C# we use reference types, and that makes
numerous small syntactic (and occasionally semantic) differences.

Still, the result is roughly what we want, the C# version "feels" like
C# code, not a port of a C++ library into C#, while still providing
most of the subtle advances we made in our C++ version over 15 years
of refining it. Moreover, it looks hopeful that we will make
refinements in the C# version that will improve the structure of our
C++ version also.

This library issue is one thing we don't have solved.

I'm going to try to put a better understanding of our library problem
in a separate posting.

-Chris

Apr 23 '06 #5
