Compiling Data Files Into a Program

Let's say you want to implement a Dictionary class, which contains a
vector of DictionaryEntry. Assume each DictionaryEntry has two
members, a word and a definition.

Now assume your program needs to create a Dictionary *object* to be
populated with values that come from a text file with a format like
this:

<dict.txt>

APPLE
a fruit

ANT
an insect

....etc...

</dict.txt>

Clearly it would not be hard to write a parser that went through the
text file and populated the object. This, however, makes the program
dependent on an uncompiled text file, which could be a problem if,
e.g., the words and definitions were all top secret.

One solution which I don't like is to write a program that converts
the text file into a .cpp file which, in turn, defines the dictionary
object you'll need and populates it. The result of the conversion
might look like:

<dict.cpp>
Dictionary SECRET_DICTIONARY;

DictionaryEntry E1("APPLE","a fruit");
SECRET_DICTIONARY.addEntry(E1);

DictionaryEntry E2("ANT","an insect");
SECRET_DICTIONARY.addEntry(E2);

....etc...
</dict.cpp>

Then your client program could #include dict.cpp and use
SECRET_DICTIONARY as needed. Of course, this requires you to:

1. Write, compile, and run the conversion program to produce dict.cpp
2. #include dict.cpp in your program, and then compile that.

Thus a two step compilation process. Is there a better way to handle
this situation?

Thanks for any suggestions,
cpp
Jul 22 '05 #1
Encrypt the text file that contains the definitions.
Jul 22 '05 #2
cppaddict wrote:

<Snip>
> Clearly it would not be hard to write a parser that went through the
> text file and populated the object. This, however, makes the program
> dependent on an uncompiled text file, which could be a problem if,
> e.g., the words and definitions were all top secret.

In that case it would be better to associate an MD5 hash of the word with a
definition, as otherwise anyone running strings on your executable would
have the word available. I am not sure that there is much you can do about
the definitions themselves, but the keyword you can protect with a one-way
hash.
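
The idea of storing a hash of the word instead of the word itself could be sketched as below. The standard library has no MD5, so this illustration swaps in 64-bit FNV-1a, which is not a cryptographic hash and would not resist a determined attacker; HashedEntry and lookup are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// FNV-1a, 64-bit. NOT cryptographic: it stands in for MD5 only to keep the
// sketch free of external libraries.
std::uint64_t fnv1a64(const std::string& s) {
    std::uint64_t h = 0xcbf29ce484222325ULL;   // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 0x100000001b3ULL;                 // FNV prime
    }
    return h;
}

struct HashedEntry {
    std::uint64_t wordHash;   // hash of the word, stored instead of the word
    const char* definition;   // still plain text, as noted above
};

// Returns the definition whose stored hash matches `word`, or nullptr.
const char* lookup(const HashedEntry* table, std::size_t n, const std::string& word) {
    const std::uint64_t h = fnv1a64(word);
    for (std::size_t i = 0; i < n; ++i)
        if (table[i].wordHash == h) return table[i].definition;
    return nullptr;
}
```

In practice the hashes would be computed by the conversion step on the development machine, so only the numbers and the definitions end up in the executable.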
> One solution which I don't like is to write a program that converts
> the text file into a .cpp file which, in turn, defines the dictionary
> object you'll need and populates it. The result of the conversion
> might look like:
<Snip>

Well, if you are going for something that is actually secret, then having the
plain text converted on a development machine it never leaves would surely
be a good thing?

Sure, you have to write the utility, but that is not that hard; a simple sed
or awk script springs to mind. It may even be doable with 'tr' (not sure).

> Then your client program could #include dict.cpp and use

Why not link dict.o? It may be that it is easier on your platform to produce
an object file from a flat text file than it is to produce C++, and even if
C++ is the intermediate target, I would still not include it directly.
Use extern and let the linker sort them out!

> SECRET_DICTIONARY as needed. Of course, this requires you to:
>
> 1. Write, compile, and run the conversion program to produce dict.cpp
> 2. #include dict.cpp in your program, and then compile that.
>
> Thus a two step compilation process. Is there a better way to handle
> this situation?


What is the problem with a two-step process? That is why make exists.

dictionary.cpp: dictionary.csv script.awk
	awk .... whatever

dictionary.o: dictionary.cpp
	CC -c dictionary.cpp -o dictionary.o (or whatever)

Thus any change to the dictionary or to the awk script that produces it will
result in dictionary.cpp being regenerated and then in dictionary.o being
regenerated followed by (I assume) a relink.

There are (in most cases) highly platform dependent ways to include a data
file as an object at link time, but they tend to be non portable.
Depending on your platform, I would look at the man pages for binutils.

I would also note that if I can read the executable I can probably find your
word list even if it is linked in as an object. Also, any platform-dependent
trick like this will probably result in (at best) a C-style string which is
the entire contents of the file.

In general I would put the data file in whatever location your OS has for
platform-independent data files, and possibly do something clever with the
permissions. Platform-dependent tricks have a nasty tendency to bite you!

Regards, Dan.
--
And on the evening of the first day, the lord said.... LX1, Go!
And there was light.
The email address *IS* valid, do not remove the spamblock.
Jul 22 '05 #3
Dan, thanks very much for your comments. A couple points of
clarification:
> Why not link dict.o? It may be that it is easier on your platform to produce
> an object file from a flat text file than it is to produce C++, and even if
> C++ is the intermediate target, I would still not include it directly.
> Use extern and let the linker sort them out!

How would you do this? Assuming the existence of dict.o, what is the
code to extern it?

> I would also note that if I can read the executable I can probably find your
> word list even if it is linked in as an object. Also, any platform-dependent
> trick like this will probably result in (at best) a C-style string which is
> the entire contents of the file.

How is it that someone could read the strings in the executable?
Wouldn't that require difficult reverse engineering?

Thanks again,
cpp

Jul 22 '05 #4
cppaddict <he***@hello.com> wrote:
>> I would also note that if I can read the executable I can probably find your
>> word list even if it is linked in as an object. Also, any platform-dependent
>> trick like this will probably result in (at best) a C-style string which is
>> the entire contents of the file.
>
> How is it that someone could read the strings in the executable?
> Wouldn't that require difficult reverse engineering?


No, a simple hex editor (which can be downloaded from any number of web
sites for free) can let someone look at your code. What they will see is
a bunch of garbage (where the actual code is) and occasionally some
blocks of text like, "Apple\0" and "a fruit\0".

Joe Laughlin had the best solution. While developing the program, keep
the dict.txt file unencrypted, but write the code so that adding a
decrypter will be a simple matter of changing one line of code. Then,
when everything works and you are ready to ship, add that line, encrypt
the file and test. If everything works fine, hand it over to your
customers. QED.
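
One way to arrange the "change one line" switch is sketched below, with every name hypothetical. All readers of the file go through a single function pointer that starts out as a pass-through. The XOR routine is only a placeholder showing where a real decrypter would plug in; it is trivially breakable and is not encryption.

```cpp
#include <string>

// Pass-through used during development, while dict.txt is still plain text.
std::string noDecrypt(const std::string& data) { return data; }

// Placeholder only: XOR with a fixed byte is trivially breakable.
// A real decrypter with the same signature would replace it.
std::string xorDecrypt(const std::string& data) {
    std::string out = data;
    for (char& c : out) c = static_cast<char>(c ^ 0x5A);
    return out;
}

// The one line to change before shipping: point this at the real decrypter.
std::string (*const decryptDictionary)(const std::string&) = noDecrypt;

// Every reader of the dictionary file goes through this function, so the
// parser never needs to know whether the file on disk was encrypted.
std::string loadDictionaryText(const std::string& rawFileContents) {
    return decryptDictionary(rawFileContents);
}
```

The parser then works on the string returned by loadDictionaryText, and the shipping build differs from the development build by that single initializer.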
Jul 22 '05 #5
> No, a simple hex editor (which can be downloaded from any number of web
> sites for free) can let someone look at your code. What they will see is
> a bunch of garbage (where the actual code is) and occasionally some
> blocks of text like, "Apple\0" and "a fruit\0".


Very interesting.... and good to know.

Thanks,
cpp
Jul 22 '05 #6
cppaddict wrote:
>> No, a simple hex editor (which can be downloaded from any number of web
>> sites for free) can let someone look at your code. What they will see is
>> a bunch of garbage (where the actual code is) and occasionally some
>> blocks of text like, "Apple\0" and "a fruit\0".
>
> Very interesting.... and good to know.

Or even easier, the 'strings' utility will scan any file (as a raw stream of
bytes) and print any string of printable ASCII longer than a specified
number of characters (sometimes very revealing on Word documents....).
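
To make the point concrete, here is a rough sketch of the scan described above. It is not the source of the real 'strings' tool, just the same idea in a few lines; printableRuns is a made-up name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns every run of printable ASCII (0x20..0x7E) in `bytes` that is at
// least `minLen` characters long, in order of appearance.
std::vector<std::string> printableRuns(const std::string& bytes, std::size_t minLen) {
    std::vector<std::string> runs;
    std::string current;
    for (unsigned char c : bytes) {
        if (c >= 0x20 && c <= 0x7E) {
            current += static_cast<char>(c);
        } else {
            // A non-printable byte ends the current run.
            if (current.size() >= minLen) runs.push_back(current);
            current.clear();
        }
    }
    if (current.size() >= minLen) runs.push_back(current);  // run ending at end of input
    return runs;
}
```

Fed the raw bytes of an executable that embeds the dictionary as string literals, a scan like this would list the words and definitions without any disassembly.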

As to how to link an object file with the rest of your code, that is
platform dependent, but assuming the object file contains a C object (to
avoid name mangling issues) called dict, of type array of dictionary, I
would stick

typedef struct {
    const char *word;
    const char *definition;
} dictionary; /* Or whatever your structure is */
extern dictionary dict[];

in a suitable header file (say dict.h). This should then allow your code
using dict to compile to object files; then just link them all together in
whatever way your build environment provides.
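
Put together, the pieces might look like the sketch below. It is shown as a single file so it compiles on its own; in a real build the three parts would live in dict.h, the generated source behind dict.o, and the client. An extern array of unknown bound carries no size, so the sketch assumes a companion dict_count variable, which is not part of the suggestion above, and define is a made-up helper.

```cpp
#include <cstddef>
#include <cstring>

// --- would live in dict.h ---
extern "C" {
    typedef struct {
        const char* word;
        const char* definition;
    } dictionary;
    extern const dictionary dict[];       // defined in the generated object file
    extern const std::size_t dict_count;  // hypothetical: number of entries in dict
}

// --- would live in the generated source that is compiled to dict.o ---
extern "C" {
    const dictionary dict[] = {
        { "APPLE", "a fruit" },
        { "ANT", "an insect" },
    };
    const std::size_t dict_count = sizeof dict / sizeof dict[0];
}

// --- client code: needs only the declarations from dict.h ---
const char* define(const char* word) {
    for (std::size_t i = 0; i < dict_count; ++i)
        if (std::strcmp(dict[i].word, word) == 0) return dict[i].definition;
    return nullptr;  // word not in the table
}
```

The client never includes the generated source; the linker resolves dict and dict_count when the object files are combined.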

As you have not told us what toolchain you are using it is kind of hard to
be more precise.

Regards, Dan (who is much more comfortable in C than C++, which is why the
above has a C flavour).

Jul 22 '05 #7
On one of my embedded systems applications, I used an assembly language
file to contain the data. The assembly language has a directive for
including a file as binary data. We just placed a global symbol at the
beginning as well as some alignment operators:
ALIGN 3
EXPORT Dictionary_Data
Dictionary_Data
INCBIN "Dictionary_Data.txt"
END

We then refer to the data using the "extern" C keyword. This was a lot
simpler and less error prone than using the C language array
initialization syntax. The assembly language option also allowed us
to force the data on a given alignment boundary, which was required
by the hardware. The C language cannot guarantee that data is placed
on a given alignment boundary.
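
On the alignment point: the post predates C11 and C++11, which added _Alignas and alignas respectively, so a compiler for either standard can now be asked for a boundary directly. A small sketch, with stand-in bytes in place of the real file contents and an assumed 8-byte requirement (the post does not say what the hardware needed):

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the contents of Dictionary_Data.txt. The 8-byte boundary is
// an assumption made for this example.
alignas(8) const unsigned char Dictionary_Data[] =
    "APPLE\na fruit\n\nANT\nan insect\n";

// True when `p` sits on a multiple of `boundary` bytes.
bool isAligned(const void* p, std::size_t boundary) {
    return reinterpret_cast<std::uintptr_t>(p) % boundary == 0;
}
```

This does not replace the INCBIN approach for pulling in a file verbatim; it only covers the alignment half of the argument.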

--
Thomas Matthews


Jul 22 '05 #8
cppaddict <he***@hello.com> wrote in message news:<1c********************************@4ax.com>...
> Let's say you want to implement a Dictionary class, which contains a
> vector of DictionaryEntry. Assume each DictionaryEntry has two
> members, a word and a definition.

[snip]

What you are really talking about here is a single table database.
C++ is not specifically a database-ish language. If you really need
a huge big dictionary, you really want a database to go with.
So, you should either:
1) buy a standard database tool and do it in that, or
2) roll your own.

Which you do depends on how much your own time is worth, how much
you expect this system to expand, whether there will ever be another
table, what kind of interrogation of the table(s) you want to do, etc.
socks
Jul 22 '05 #9
cppaddict <he***@hello.com> wrote in message news:<1c********************************@4ax.com>...
> One solution which I don't like is to write a program that converts
> the text file into a .cpp file which, in turn, defines the dictionary
> object you'll need and populates it. The result of the conversion
> might look like:
>
> <dict.cpp>
> Dictionary SECRET_DICTIONARY;
>
> DictionaryEntry E1("APPLE","a fruit");
> SECRET_DICTIONARY.addEntry(E1);
>
> DictionaryEntry E2("ANT","an insect");
> SECRET_DICTIONARY.addEntry(E2);
>
> ...etc...
> </dict.cpp>
>
> Then your client program could #include dict.cpp and use
> SECRET_DICTIONARY as needed. Of course, this requires you to:
>
> 1. Write, compile, and run the conversion program to produce dict.cpp
> 2. #include dict.cpp in your program, and then compile that.
>
> Thus a two step compilation process. Is there a better way to handle
> this situation?


Actually that method is fine and is used all over the place.

#include is normally handled by the preprocessor. From this point of view,
the preprocessor is simply a conversion program that reads multiple source
files and creates a single file that is handed off to the actual compiler.

lex/yacc (i.e. flex/bison) are even more like what you propose. You write
a lexer or a parser in a special language that more closely models what
you are doing. lex/yacc then convert the file into a C file which is
compiled normally.

So go ahead and do it that way. Give a dictionary file a special extension
so build tools can recognize what it is.

samuel
Jul 22 '05 #10
