
How to insert a unique ID into a binary file created by compilation?

P: n/a
Hello all,
How can I insert a unique ID into the binary file created by the compiler,
so that after compiling I can identify the source I used?

Thanks

Nov 15 '05 #1
24 Replies


P: n/a
In article <11**********************@g43g2000cwa.googlegroups.com>,
pristo <Ma*******@gmail.com> wrote:
> hello All,
> how can i insert unique ID into binary file (that created by compiler)?
> (so after compiling i can to identify the src that i use)


Not portable. Can't discuss it here. Blah, blah, blah.

Nov 15 '05 #2

P: n/a


pristo wrote On 11/08/05 11:36,:
> hello All,
> how can i insert unique ID into binary file (that created by compiler)?
> (so after compiling i can to identify the src that i use)


In portable C, the closest you can come is to create
a static variable initialized with the ID you desire. The
source file name, date, and time are often used for this:

    static const char version_info[] =
        "@@@ VERSION INFO @@@"
        " Compiled from " __FILE__
        " on " __DATE__
        " at " __TIME__;

Many systems have utilities that can extract and display
the string constants from an executable or object file,
and ways to separate the "@@@ VERSION INFO @@@" strings
from the others (on Unix systems, you could use `strings'
and `grep'). Some source-management systems can be made
to provide part or all of the data, which could automate
the inclusion of a version identifier for each file. Of
course, none of these utilities is part of the C language,
and they will vary from system to system.

A disadvantage of this technique is that some compilers
will issue diagnostic messages about variables that are
declared but not used; such compilers are likely to complain
about the `version_info' variable.
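
One way to quiet the "unused" diagnostic is to hand the string out through a small accessor. The following is only a sketch: the names VERSION_MARKER, get_version_info and get_version_text are invented for illustration, and a linker that strips unreferenced functions may still drop everything unless some reachable code calls the accessor.

```c
#include <assert.h>
#include <string.h>

/* Marker prefix makes the string easy to pick out with strings/grep. */
#define VERSION_MARKER "@@@ VERSION INFO @@@"

static const char version_info[] =
    VERSION_MARKER
    " Compiled from " __FILE__
    " on " __DATE__
    " at " __TIME__;

/* Hypothetical accessor: returning the array counts as a use, so the
   compiler has no reason to warn about an unused variable. */
const char *get_version_info(void)
{
    return version_info;
}

/* Hypothetical helper: returns the readable part after the marker. */
const char *get_version_text(void)
{
    const char *p = get_version_info();

    assert(strncmp(p, VERSION_MARKER, strlen(VERSION_MARKER)) == 0);
    return p + strlen(VERSION_MARKER);
}
```

A program with some output channel could print get_version_text() in response to a version option; one without any output channel would still need to reference the accessor from code that is known to be linked in.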

--
Er*********@sun.com

Nov 15 '05 #3

P: n/a
> how can i insert unique ID into binary file (that created by compiler)?
> (so after compiling i can to identify the src that i use)


One common method of inserting an ID is:

static char rcsid[] = "$Id: foo.c,v 1.7 2005/11/08 12:01:32 root Exp $";

A compiler might give a warning about an unreferenced variable (and
might be smart enough to try to discard it): so reference it, for
example, make "foo -V" print out the version string.

The method of INSERTING the id is portable. The method of extracting
it isn't, but locating such a string is fairly easy and makes few
assumptions beyond the file being a raw binary file: scan until
you see a $, followed by a known keyword, followed by a colon, then
output the stuff from a $ to the next $. I think if non-text
characters (whatever that means) or newlines are encountered between
the two $, it's not output. I've never seen the program that
extracts the headers have a false match, although constructing a
failure case is easy.

RCS and CVS (source code control systems) will even maintain the
strings for you with a current version, time stamp, and user who
last checked it in. You actually put in:
static char rcsid[] = "$Id$";
and RCS or CVS will expand it for you.
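
The scan described above might be sketched roughly as follows. This is not the source of any real extraction tool: the function name find_keyword_string and its interface are made up for illustration, and it works on a buffer already read into memory rather than on a file.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Scan buf[0..len) for "$<keyword>:" ... "$" and copy the whole match,
   both '$' characters included, into out as a NUL-terminated string.
   Returns 1 on success, 0 if no acceptable match is found or the
   caller's buffer is too small. */
int find_keyword_string(const unsigned char *buf, size_t len,
                        const char *keyword, char *out, size_t outsz)
{
    size_t klen, i, j;

    assert(buf != NULL && keyword != NULL && out != NULL);
    klen = strlen(keyword);

    for (i = 0; i + klen + 2 <= len; i++) {
        /* Need '$', then the keyword, then ':'. */
        if (buf[i] != '$' ||
            memcmp(buf + i + 1, keyword, klen) != 0 ||
            buf[i + 1 + klen] != ':')
            continue;

        /* Walk to the closing '$'; abandon this candidate if a
           non-printable character (newline included) comes first. */
        for (j = i + klen + 2; j < len; j++) {
            if (buf[j] == '$') {
                size_t n = j - i + 1;

                if (n + 1 > outsz)
                    return 0;
                memcpy(out, buf + i, n);
                out[n] = '\0';
                return 1;
            }
            if (!isprint(buf[j]))
                break;
        }
    }
    return 0;
}
```

Calling it with the keyword "Id" on the contents of an executable would return the first expanded $Id: ... $ string, if there is one.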

Gordon L. Burditt
Nov 15 '05 #4

P: n/a
If you want to mark a binary that is built from more than one source
file (e.g. main.c, other.c, etc.), each with its own rcsid, and you want
one unique mark for the whole binary, a good way to do this is to create
your own marker, something like the piece of code below. (Notice that
I'm not testing for I/O, string, or any other errors in this example.)

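    #include <stdio.h>
    #include <string.h>

    typedef unsigned char BYTE;

    /* FOOTER rather than _FOOTER: names starting with an underscore
       and a capital letter are reserved for the implementation */
    typedef struct tagFooter {
        BYTE area1[10];
        BYTE area2[10];
    } FOOTER;
    #define size_footer sizeof(FOOTER)

    int main(int argc, char **argv)
    {
        FILE *f;
        FOOTER myInfo;

        if (argc < 3) {
            /* we expect to receive at least 2 parameters from command line */
            return 1;
        }
        /* strncpy zero-pads short input and never overruns the 10-byte areas */
        strncpy((char *)myInfo.area1, argv[1], sizeof myInfo.area1);
        strncpy((char *)myInfo.area2, argv[2], sizeof myInfo.area2);

        /* "r+b" opens the existing file for update; "wb" would truncate it */
        f = fopen("myfile.bin", "r+b");
        fseek(f, 0, SEEK_END);
        fwrite(&myInfo, size_footer, 1, f);
        fclose(f);
        return 0;
    }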

Instead of getting the values from the command line, you could use a CRC
or anything else you like. Also, it's pretty easy to write a program to
show the footer information for a marked file.
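
A reader for such a footer could look something like this. It is a sketch only: read_footer is a made-up name, the struct is repeated so the fragment stands alone, and it assumes the implementation supports SEEK_END on binary streams and does not pad the file with extra null characters.

```c
#include <assert.h>
#include <stdio.h>

typedef struct tagFooter {
    unsigned char area1[10];
    unsigned char area2[10];
} FOOTER;

/* Read the last sizeof(FOOTER) bytes of an open binary stream into *out.
   Returns 1 on success, 0 on any seek or read failure. */
int read_footer(FILE *f, FOOTER *out)
{
    assert(f != NULL && out != NULL);

    /* Assumption: SEEK_END is meaningful for this binary stream. */
    if (fseek(f, -(long)sizeof *out, SEEK_END) != 0)
        return 0;
    return fread(out, sizeof *out, 1, f) == 1;
}
```

The two areas are not guaranteed to be NUL-terminated when the original strings were 10 characters or longer, so a display program should print them with an explicit length.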

Hope this can help you.

Nov 15 '05 #5

P: n/a
First, thanks to all for helping.
I tried your suggestion, but it seems that the compiler optimizes the
variable away.
I am writing the code for an embedded system and I don't have an
interface to print the version out.
So how can I make the compiler think that I use the variable, so that
it does not optimize it away?

Pristo

Nov 15 '05 #6

P: n/a
> If you want to mark the binary file that is generated by more than one
> source code: e.g (main.c, other.c, etc.) which will cause more than one
> rcsid,

Nothing wrong with more than one ID, if you make them static.
Often that's the whole point: you get the version number of
everything that was used to create the executable.

> and you want one unique mark for the whole binary, a good way to

A common way to deal with this is to use the RCS ID in the version.c
file or the one in the main program as the primary ID, and that's
the one that the -V flag prints.

> do this is creating your own marker, something like the piece of code
> below: (notice that I'm not testing for i/o, string or any other errors
> in this example).

Appending random crap to an executable may destroy it. Particularly
if the executable has a checksum, digital signature, or other
integrity check. Also, the program doing it *SHOULD* be flagged
as a virus.

There's no guarantee that SEEK_END on a binary file will work.
(Some OS such as CP/M only keep file sizes in sectors, so
there may be unwritten stuff at the end of the last sector in
a binary file, with no way to tell how much of it is unwritten.)

Gordon L. Burditt


Nov 15 '05 #7

P: n/a
Thanks for helping.
I tried your solution,
and it seems that the compiler optimizes the variable away.
I am writing for an embedded system and I don't have an interface to
print the version out.
How can I make the compiler not optimize this constant away?

Thanks

Nov 15 '05 #8

P: n/a
In article <11*********************@g49g2000cwa.googlegroups.com>,
pristo <Ma*******@gmail.com> wrote:
> i try use your suggestion but its seem that the compiler is optmize the
> variable.
> i write the code for embdded system and i dont have interface to print
> the version out
> so how can i cause the compiler think that i use the variable and not
> to optmize it?


Sometimes declaring the variable as volatile is enough to get the
compiler to leave it alone even if it isn't used. Sometimes
simple tricks like taking its strlen() are enough. It depends on
how smart the compiler is about finding "dead code" and "dead variables".
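
A sketch of the volatile approach, with an invented ID string and an invented function name (build_id_length). The loop is written by hand because passing a volatile array to strlen() would need a cast that throws the qualifier away. Whether this is enough still depends on the toolchain: the function has to be called from code that is actually linked in, for example once at startup.

```c
#include <assert.h>
#include <stddef.h>

/* The ID text is a made-up placeholder. */
static volatile const char build_id[] = "@(#)build-id 1.0 (placeholder)";

/* Every read of build_id goes through the volatile qualifier, so the
   compiler has to emit real loads and cannot discard the array as long
   as this function itself is kept. */
size_t build_id_length(void)
{
    size_t n = 0;

    while (build_id[n] != '\0') {
        n++;
        assert(n < sizeof build_id);   /* the terminator must be in range */
    }
    return n;
}
```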
--
If you lie to the compiler, it will get its revenge. -- Eric Sosman
Nov 15 '05 #9

P: n/a
pristo wrote:
> thx for helping.
> i try use your solution
> and its seen that the copiler optimize the variable,
> i write for embedded system and i dont have interface for print the
> version out
> how can i cause the compiler not to optmize this constant?

Please quote the relevant parts of posts your reply to.

Your compiler or linker may have an option that tells it to emit statics
even if they're not used. For gcc, it is notable that
-fkeep-static-consts does *not* do the trick in general; this only works
if you're not optimizing. Consult your documentation.

Otherwise, you'll have to try and outsmart your compiler (this is
generally a bad idea, but it may be the only option). That is, involve
the constant in expressions that are too complicated for the compiler to
detect as going unused or evaluating to a constant.

If you find such an expression, it may be a good idea to encapsulate it
in a function, otherwise it'll look very mysterious.

S.
Nov 15 '05 #10

P: n/a
Skarmander wrote:
> pristo wrote:
>> thx for helping.
>> i try use your solution
>> and its seen that the copiler optimize the variable,
>> i write for embedded system and i dont have interface for print the
>> version out
>> how can i cause the compiler not to optmize this constant?
>
> Please quote the relevant parts of posts your reply to.

....you reply to.

S.
--
I hate typo fixes, but sometimes they're necessary
Nov 15 '05 #11

P: n/a


Skarmander wrote On 11/08/05 16:01,:
> pristo wrote:
>> thx for helping.
>> i try use your solution
>> and its seen that the copiler optimize the variable,
>> i write for embedded system and i dont have interface for print the
>> version out
>> how can i cause the compiler not to optmize this constant?
>
> Please quote the relevant parts of posts your reply to.
>
> Your compiler or linker may have an option that tells it to emit statics
> even if they're not used. For gcc, it is notable that
> -fkeep-static-consts does *not* do the trick in general; this only works
> if you're not optimizing. Consult your documentation.


Also check your documentation to see whether your compiler
already "fingerprints" its outputs, and whether tools to display
the fingerprints exist. Such things aren't part of the C language,
but since you seem to be concerned with just one particular C
implementation you might as well use whatever it offers.

--
Er*********@sun.com

Nov 15 '05 #12

P: n/a
On 8 Nov 2005 12:26:02 -0800, in comp.lang.c , "pristo"
<Ma*******@gmail.com> wrote:
> so how can i cause the compiler think that i use the variable and not
> to optmize it?


Use it in your code.

Or search the documentation for compiler flags to disable optimisation
for that specific object. There may be pragmas.
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>

Nov 15 '05 #13

P: n/a
On Tue, 08 Nov 2005 19:20:34 -0000, go***********@burditt.org (Gordon
Burditt) wrote:
>> how can i insert unique ID into binary file (that created by compiler)?
>> (so after compiling i can to identify the src that i use)
> One common method of inserting an ID is:
>
> static char rcsid[] = "$Id: foo.c,v 1.7 2005/11/08 12:01:32 root Exp $";


It's not a bad idea to use a format recognized by the Unix "what"
command. Excerpt from man page:

The what command searches the given files for all occurrences of the
pattern that get(1) substitutes for %Z% (currently @(#) at this
printing) and prints out what follows until the first ", >, new-line,
\, or null character. For example, if the C program in file f.c
contains

char ident[] = "@(#)identification information";

Then you can write a portable version of "what" to go along with it
:-)
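
The core of such a portable "what" might be sketched like this. It follows the terminator list in the man page excerpt, but it is not the real utility: what_first is an invented name, it only reports the first match, and it scans a buffer in memory rather than a file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the text following the first "@(#)" in buf[0..len) into out,
   stopping at '"', '>', newline, '\\', NUL, or the end of the buffer.
   Output is truncated silently if out is too small.
   Returns 1 if a marker was found, 0 otherwise. */
int what_first(const unsigned char *buf, size_t len, char *out, size_t outsz)
{
    static const char marker[] = "@(#)";
    const size_t mlen = sizeof marker - 1;
    size_t i, n;

    assert(buf != NULL && out != NULL && outsz > 0);

    for (i = 0; i + mlen <= len; i++) {
        if (memcmp(buf + i, marker, mlen) != 0)
            continue;

        n = 0;
        for (i += mlen; i < len && n + 1 < outsz; i++) {
            unsigned char c = buf[i];

            if (c == '"' || c == '>' || c == '\n' || c == '\\' || c == '\0')
                break;
            out[n++] = (char)c;
        }
        out[n] = '\0';
        return 1;
    }
    return 0;
}
```

A complete tool would read each named file in binary mode and keep scanning after the first hit, printing every identification string it finds.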

--
Al Balmer
Balmer Consulting
re************************@att.net
Nov 15 '05 #14

P: n/a
Alan Balmer <al******@att.net> writes:
> On Tue, 08 Nov 2005 19:20:34 -0000, go***********@burditt.org (Gordon
> Burditt) wrote:
>>> how can i insert unique ID into binary file (that created by compiler)?
>>> (so after compiling i can to identify the src that i use)
>>
>> One common method of inserting an ID is:
>>
>> static char rcsid[] = "$Id: foo.c,v 1.7 2005/11/08 12:01:32 root Exp $";
>
> It's not a bad idea to use a format recognized by the Unix "what"
> command.


<OT>
The "what" command is part of SCCS, a mostly obsolete source control
system that isn't even installed on a lot of Unix-like systems.
See also RCS, CVS, and "ident".
</OT>

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 15 '05 #15

P: n/a
> Appending random crap to an executable may destroy it.

The random "crap" you mention is just an example of how to implement a
digital signature, as this binary file belongs to him. Besides, the
"rcsid" example requires a version control system, which may not be
available.

> There's no guarantee that SEEK_END on a binary file will work.

Sorry my friend, but the issue with SEEK_END is with text files, not
binary.

Nov 15 '05 #16

P: n/a
On Wed, 09 Nov 2005 00:50:59 GMT, Keith Thompson <ks***@mib.org>
wrote:
> Alan Balmer <al******@att.net> writes:
>> On Tue, 08 Nov 2005 19:20:34 -0000, go***********@burditt.org (Gordon
>> Burditt) wrote:
>>>> how can i insert unique ID into binary file (that created by compiler)?
>>>> (so after compiling i can to identify the src that i use)
>>>
>>> One common method of inserting an ID is:
>>>
>>> static char rcsid[] = "$Id: foo.c,v 1.7 2005/11/08 12:01:32 root Exp $";
>>
>> It's not a bad idea to use a format recognized by the Unix "what"
>> command.
>
> <OT>
> The "what" command is part of SCCS, a mostly obsolete source control
> system that isn't even installed on a lot of Unix-like systems.
> See also RCS, CVS, and "ident".
> </OT>

<Still OT>
It's a standalone command, not dependent on SCCS. Our HP-UX systems
always have it, but never have SCCS (or RCS or CVS, for that matter.)
It's probably available for most Unix systems (all of which have stuff
available which may or may not be installed.)

What's more, it's a very useful command, and as I suggested, not hard
to implement yourself. Better, imo, than inventing a new format with
no meaning to anyone else.
--
Al Balmer
Balmer Consulting
re************************@att.net
Nov 15 '05 #17

P: n/a
In article <11**********************@g49g2000cwa.googlegroups.com>,
CarlosB <ca*******@gmail.com> wrote:
[without proper attribution to another poster]
>> There's no guarantee that SEEK_END on a binary file will work.
> Sorry my friend, but the issue with SEEK_END is with text files, not
> binary.


C89 4.9.9.2 The fseek Function

A binary stream need not meaningfully support
fseek calls with a whence value of SEEK_END.

For a text stream, either offset shall be zero, or offset shall be
a value returned by an earlier call to the ftell function on the same
stream and whence shall be SEEK_SET.
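
In practice that means code using SEEK_END on a binary stream should at least check the results instead of assuming they are meaningful. A minimal sketch (binary_stream_size is an invented name, and a successful return still says nothing about trailing padding):

```c
#include <assert.h>
#include <stdio.h>

/* Returns the stream's size in bytes as reported via SEEK_END,
   or -1L if the implementation cannot seek or report a position. */
long binary_stream_size(FILE *f)
{
    assert(f != NULL);

    if (fseek(f, 0L, SEEK_END) != 0)
        return -1L;
    return ftell(f);   /* ftell itself returns -1L on failure */
}
```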

--
I was very young in those days, but I was also rather dim.
-- Christopher Priest
Nov 15 '05 #18

P: n/a
On 2005-11-08, Gordon Burditt <go***********@burditt.org> wrote:
> There's no guarantee that SEEK_END on a binary file will work.
> (Some OS such as CP/M only keep file sizes in sectors, so
> there may be unwritten stuff at the end of the last sector in
> a binary file, with no way to tell how much of it is unwritten.)


But those will be null bytes. You could always decide "that is part of
the file for purposes of what i'm doing" - just keep in mind that there
also may be extra null bytes after whatever _you_ wrote.
Nov 15 '05 #19

P: n/a
On Wed, 9 Nov 2005 19:36:20 +0000 (UTC), Jordan Abel
<jm****@purdue.edu> wrote:
> On 2005-11-08, Gordon Burditt <go***********@burditt.org> wrote:
>> There's no guarantee that SEEK_END on a binary file will work.
>> (Some OS such as CP/M only keep file sizes in sectors, so
>> there may be unwritten stuff at the end of the last sector in
>> a binary file, with no way to tell how much of it is unwritten.)
>
> But those will be null bytes. You could always decide "that is part of
> the file for purposes of what i'm doing" - just keep in mind that there
> also may be extra null bytes after whatever _you_ wrote.


They won't necessarily be null. Even if they are, how do you know they
weren't written on purpose, and part of the data?
--
Al Balmer
Balmer Consulting
re************************@att.net
Nov 15 '05 #20

P: n/a
On 2005-11-09, Alan Balmer <al******@att.net> wrote:
> On Wed, 9 Nov 2005 19:36:20 +0000 (UTC), Jordan Abel
> <jm****@purdue.edu> wrote:
>> On 2005-11-08, Gordon Burditt <go***********@burditt.org> wrote:
>>> There's no guarantee that SEEK_END on a binary file will work.
>>> (Some OS such as CP/M only keep file sizes in sectors, so
>>> there may be unwritten stuff at the end of the last sector in
>>> a binary file, with no way to tell how much of it is unwritten.)
>> But those will be null bytes. You could always decide "that is part of
>> the file for purposes of what i'm doing" - just keep in mind that there
>> also may be extra null bytes after whatever _you_ wrote.
>
> They won't necessarily be null.


C89 4.9.2

A binary stream is an ordered sequence of characters that can
transparently record internal data. Data read in from a binary stream
shall compare equal to the data that were earlier written out to that
stream, under the same implementation. Such a stream may, however, have
an implementation-defined number of null characters appended.

IOW... If the underlying filesystem doesn't zero such "garbage
characters" out, the C implementation is required to on closing the
file.
> Even if they are, how do you know they weren't written on purpose, and
> part of the data?


You'll have to keep track of that yourself. Anyway, it's
"Implementation-defined", not unspecified, so you have to be able to
look it up in the documentation.
Nov 15 '05 #21

P: n/a
>> Even if they are, how do you know they weren't written on purpose, and
>> part of the data?
>
> You'll have to keep track of that yourself. Anyway, it's
> "Implementation-defined", not unspecified, so you have to be able to
> look it up in the documentation.


You can't keep track of that yourself. The idea here was that the
code takes a random executable, having absolutely no idea whatever
of the format of an executable, and appends some stuff to the end
of it to stamp it with some kind of tag, hopefully without destroying
its usefulness as an executable in the process. Since the code has
no idea what the format of an executable is, it has no way to tell
whether trailing 0 bytes are an intentional part of the executable
or not.

A possible consequence of doing this is that some of the global
variables that are supposed to be zeroed on program loading are
now initialized to the contents of the tag. Or the executable
could fail its checksum and the OS would refuse to run it.

Gordon L. Burditt
Nov 15 '05 #22

P: n/a
On 2005-11-09, Gordon Burditt <go***********@burditt.org> wrote:
>>> Even if they are, how do you know they weren't written on purpose, and
>>> part of the data?
>>
>> You'll have to keep track of that yourself. Anyway, it's
>> "Implementation-defined", not unspecified, so you have to be able to
>> look it up in the documentation.
>
> You can't keep track of that yourself. The idea here was that the
> code takes a random executable, having absolutely no idea whatever
> of the format of an executable, and appends some stuff to the end
> of it to stamp it with some kind of tag, hopefully without destroying
> its usefulness as an executable in the process. Since the code has
> no idea what the format of an executable is, it has no way to tell
> whether trailing 0 bytes are an intentional part of the executable
> or not.


It doesn't need to. It just has to leave everything it finds intact. I
didn't say it should overwrite the null bytes. I said it should write
beyond it, leaving everything it finds, including the null bytes,
intact.
Nov 15 '05 #23

P: n/a
At about the time of 11/8/2005 1:01 PM, Skarmander stated the following:
> Otherwise, you'll have to try and outsmart your compiler (this is
> generally a bad idea, but it may be the only option). That is, involve
> the constant in expressions that are too complicated for the compiler to
> detect as going unused or evaluating to a constant.
>
> If you find such an expression, it may be a good idea to encapsulate it
> in a function, otherwise it'll look very mysterious.
>
> S.


For an embedded solution, that might not be a bad idea. It would throw
off those who are trying to hack the code. But because the compiler is
compiling for an embedded system, there may not be a way to disable this
optimization, since embedded systems often have only a few K of program
storage. For example, the Atmel ATmega16 has only 8K program locations,
each 16 bits wide. The instruction word size is 16 bits, and it's a RISC.

--
Daniel Rudy

Email address has been base64 encoded to reduce spam
Decode email address using b64decode or uudecode -m

Why geeks like computers: look chat date touch grep make unzip
strip view finger mount fcsk more fcsk yes spray umount sleep
Nov 15 '05 #24

P: n/a
>On 2005-11-09, Gordon Burditt <go***********@burditt.org> wrote:
>> ... The idea here was that the
>> code takes a random executable, having absolutely no idea whatever
>> of the format of an executable, and appends some stuff to the end
>> of it to stamp it with some kind of tag, hopefully without destroying
>> its usefulness as an executable in the process. Since the code has
>> no idea what the format of an executable is, it has no way to tell
>> whether trailing 0 bytes are an intentional part of the executable
>> or not.

In article <sl*******************@random.yi.org>
Jordan Abel <jm****@purdue.edu> wrote:
> It doesn't need to. It just has to leave everything it finds intact. I
> didn't say it should overwrite the null bytes. I said it should write
> beyond it, leaving everything it finds, including the null bytes,
> intact.


Indeed, you can certainly append data to the file (if it is writeable
in such fashion at all) -- but systems exist in which this will
make the file non-executable. In particular, some systems include
a "cryptographically-strong" checksum of the binary in the binary,
and will not execute a binary whose checksum is incorrect. (This
does not guarantee that the binary will *not* run either, even
ignoring the -- tiny, one hopes -- possibility that the checksum
still matches: it depends on whether the checksum covers the entire
binary, or only the parts that are used when executing the binary.)

There are also systems that *will* run the resulting binary, but
on which it will behave incorrectly. In particular, the tail end
of the file will be mapped as data+bss on Unix-like systems that
do virtual memory by mapping correctly-aligned files. The first
few bytes of "bss", which are supposed to be zero, will now be
nonzero and contain the appended data. This may cause some
static-duration variables that must be zero to be nonzero:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        static int mbz; /* must be 0 */

        if (mbz != 0)
            puts("broken");
        return mbz == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
    }

(One's chances of hitting "mbz" here are small in some cases, large
in others, depending on any shared library mechanisms. You can
often "improve" the chance by fiddling with linker options.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 15 '05 #25
