ASM => C

Anyone know of a translator that converts an Intel Pentium assembly
listing into C? The quality of output code doesn't have to be great,
so long as it's accurate.
Nov 14 '05 #1
> Anyone know of a translator that converts an Intel Pentium assembly
> listing into C? The quality of output code doesn't have to be great,
> so long as it's accurate.


It is possible to write an emulator for a Pentium CPU in C.
From there, you can add

unsigned char memory[] = {
(code for program goes here)
(oh, yes, you probably have to throw in a copy of
the BIOS ROM and the OS, too)
};

and the emulator will run the code, and it's written in C.
(You will probably have to do something more specific about I/O
getting to a real device, and the emulator probably won't run
real-time).
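
A minimal sketch of the emulator idea above, assuming a flat byte array as memory and handling only four real IA-32 opcodes (NOP 0x90, HLT 0xF4, INC r32 0x40-0x47, MOV r32, imm32 0xB8-0xBF). The names `struct cpu`, `run`, and `program` are made up for illustration; flags, segmentation, the BIOS, and I/O are all ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy program: mov eax, 42 ; inc eax ; nop ; hlt */
const unsigned char program[] = {
    0xB8, 0x2A, 0x00, 0x00, 0x00,   /* mov eax, 42 (little-endian imm32) */
    0x40,                           /* inc eax                           */
    0x90,                           /* nop                               */
    0xF4                            /* hlt                               */
};

/* Register file in IA-32 encoding order: eax, ecx, edx, ebx, esp, ebp, esi, edi. */
struct cpu {
    uint32_t reg[8];
    uint32_t eip;
};

/* Fetch-decode-execute loop.
 * Returns 0 when HLT is reached, -1 on an unknown opcode or on
 * running off the end of memory. EFLAGS updates are not modelled. */
int run(struct cpu *c, const unsigned char *mem, size_t size)
{
    for (;;) {
        if (c->eip >= size)
            return -1;
        unsigned char op = mem[c->eip++];

        if (op == 0x90) {                       /* nop */
            continue;
        } else if (op == 0xF4) {                /* hlt */
            return 0;
        } else if (op >= 0x40 && op <= 0x47) {  /* inc r32 */
            c->reg[op - 0x40]++;
        } else if (op >= 0xB8 && op <= 0xBF) {  /* mov r32, imm32 */
            if (size - c->eip < 4)
                return -1;
            uint32_t imm = (uint32_t)mem[c->eip]
                         | (uint32_t)mem[c->eip + 1] << 8
                         | (uint32_t)mem[c->eip + 2] << 16
                         | (uint32_t)mem[c->eip + 3] << 24;
            c->reg[op - 0xB8] = imm;
            c->eip += 4;
        } else {
            return -1;                          /* not handled by this sketch */
        }
    }
}
```

Starting from a zeroed `struct cpu`, `run(&c, program, sizeof program)` should return 0 and leave 43 in `c.reg[0]`. A real Pentium emulator needs hundreds more opcodes, which is rather the point being made.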

Gordon L. Burditt
Nov 14 '05 #2
go***********@burditt.org wrote...
>> Anyone know of a translator that converts an Intel Pentium assembly
>> listing into C? The quality of output code doesn't have to be great,
>> so long as it's accurate.
>
> It is possible to write an emulator for a Pentium CPU in C.
> From there, you can add
>
> unsigned char memory[] = {
> (code for program goes here)
> (oh, yes, you probably have to throw in a copy of
> the BIOS ROM and the OS, too)
> };
>
> and the emulator will run the code, and it's written in C.
> (You will probably have to do something more specific about I/O
> getting to a real device, and the emulator probably won't run
> real-time).
>
> Gordon L. Burditt


I am looking for something that more directly translates an .s (or
.asm) file into a .c file -- but that's an interesting observation
you've made.
Nov 14 '05 #3
Generally speaking, what you're asking cannot be done. Even assuming the
assembly was generated by a C compiler, it's still impossible in theory.
The general statement is "You can't make steak from a hamburger" -
information is destroyed in the compilation process; you cannot recover
the original code. Go to vivisimo and search for 'decompilation'.

You *can* generate a C program that does the equivalent of what the
assembly code does, although that depends on how strictly you define
'equivalent'. What you cannot do is create a C program that is guaranteed
to do all and only the things the original C program did. Information on
variable types can be lost. For example, a variable may be signed in the
C code, but the assembly gives no indication because there was nothing in
the C that actually made use of the sign. If you make it unsigned in the
decompilation, it's possible to get behaviour that is different from the
original program.
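
A small illustration of why the lost signedness matters, as a sketch rather than anything from a real decompiler. The function names are hypothetical. On the usual two's-complement machines both functions receive identical bit patterns, and only the declared type decides which comparison the compiler emits.

```c
/* Signed comparison: -1 is less than 1. */
int less_than_signed(int a, int b)
{
    return a < b;
}

/* Unsigned comparison: the value converted from -1 is UINT_MAX,
 * which is greater than 1, so the result flips. */
int less_than_unsigned(unsigned a, unsigned b)
{
    return a < b;
}
```

If the original code never compared or divided the variable, the assembly gives no hint which of the two types was meant, and code added later to the decompiled source can behave differently from the same change made to the original.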

That being said, my rates are semi-reasonable.

--
#include <standard.disclaimer>
_
Kevin D Quitt USA 91387-4454 96.37% of all statistics are made up
Per the FCA, this address may not be added to any commercial mail list
Nov 14 '05 #4
Something that calls itself Unsolved Mysteries wrote:
> Anyone know of a translator
> that converts an Intel Pentium assembly listing into C?

In general, no.
C does not implement all Intel machine instructions.

Design information is discarded when C is converted to assembler.
More generally, this is a problem
when programmers fail to document their designs
*before* they implement them as C programs
because they never get around to documentation
after the design is working.
It is very hard to "reverse engineer" undocumented C code
because the original author's intent is unknown.

> The quality of output code doesn't have to be great,
> so long as it's accurate.


I used Google

http://www.google.com/

to search for

+"convert assembler to C"

and I found a couple of things that might interest you.
Nov 14 '05 #5
"Unsolved Mysteries" <um@domain.invalid> wrote in message
news:MP************************@news.verizon.net...
> I am looking for something that more directly translates an .s (or
> .asm) file into a .c file -- but that's an interesting observation
> you've made.


Disassembly works because there is a 1:1 (or nearly so) correspondence
between machine and assembly code. There is no such correspondence between
assembly and C; there are an infinite number of C sources that could result
in the same assembly listing and vice versa.
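
To make the many-to-one point concrete, here is a sketch of two differently written C functions that compute the same thing. The names `sum_index` and `sum_pointer` are invented for this example. An optimizing compiler may well emit the same instructions for both, though nothing guarantees it, and a translator working back from the assembly has no way to tell which one the author wrote.

```c
#include <stddef.h>

/* Version 1: indexed for loop. */
long sum_index(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Version 2: pointer walk with a while loop. */
long sum_pointer(const int *a, size_t n)
{
    const int *end = a + n;   /* one past the last element */
    long total = 0;
    while (a != end)
        total += *a++;
    return total;
}
```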

So, if you want the original C source corresponding to a given assembly
file, you're completely out of luck. If you want _any_ C source that might
compile to a given assembly listing, you might have a chance of writing such
a program, but AFAIK none exists. Even with debug symbols (which aren't
guaranteed to exist), the C you end up with is unlikely to even
superficially resemble the original C program or even any C that a human is
likely to write.

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Aaron Sorkin

Nov 14 '05 #6
st*****@sprunk.org wrote...
> "Unsolved Mysteries" <um@domain.invalid> wrote in message
>> I am looking for something that more directly translates an .s (or
>> .asm) file into a .c file -- but that's an interesting observation
>> you've made.
>
> Disassembly works because there is a 1:1 (or nearly so) correspondence
> between machine and assembly code. There is no such correspondence between
> assembly and C; there are an infinite number of C sources that could result
> in the same assembly listing and vice versa.
>
> So, if you want the original C source corresponding to a given assembly
> file, you're completely out of luck. If you want _any_ C source that might
> compile to a given assembly listing, you might have a chance of writing such
> a program, but AFAIK none exists. Even with debug symbols (which aren't
> guaranteed to exist), the C you end up with is unlikely to even
> superficially resemble the original C program or even any C that a human is
> likely to write.


The mapping between some large number of C programs and a given
assembler listing is understood, and OK.

I don't care at all how close to the original program it is.
Nov 14 '05 #7


Unsolved Mysteries wrote:
> st*****@sprunk.org wrote...
>> "Unsolved Mysteries" <um@domain.invalid> wrote in message
>>> I am looking for something that more directly translates an .s (or
>>> .asm) file into a .c file -- but that's an interesting observation
>>> you've made.
>>
>> Disassembly works because there is a 1:1 (or nearly so) correspondence
>> between machine and assembly code. There is no such correspondence between
>> assembly and C; there are an infinite number of C sources that could result
>> in the same assembly listing and vice versa.
>>
>> So, if you want the original C source corresponding to a given assembly
>> file, you're completely out of luck. If you want _any_ C source that might
>> compile to a given assembly listing, you might have a chance of writing such
>> a program, but AFAIK none exists. Even with debug symbols (which aren't
>> guaranteed to exist), the C you end up with is unlikely to even
>> superficially resemble the original C program or even any C that a human is
>> likely to write.
>
> The mapping between some large number of C programs and a given
> assembler listing is understood, and OK.
>
> I don't care at all how close to the original program it is.


Then what's the purpose of creating "trashy" C source?
The value of a source file is that it can be read and
understood, then modified and recompiled to produce a new
program. If it's unreadable (or nearly so) it's also
unmodifiable (o.n.s.) -- so, what do you intend to do with
your cow made from hamburger?

--
Er*********@sun.com

Nov 14 '05 #8
er*********@sun.com wrote...


> Unsolved Mysteries wrote:
>> st*****@sprunk.org wrote...
>>> "Unsolved Mysteries" <um@domain.invalid> wrote in message
>>>> I am looking for something that more directly translates an .s (or
>>>> .asm) file into a .c file -- but that's an interesting observation
>>>> you've made.
>>>
>>> Disassembly works because there is a 1:1 (or nearly so) correspondence
>>> between machine and assembly code. There is no such correspondence between
>>> assembly and C; there are an infinite number of C sources that could result
>>> in the same assembly listing and vice versa.
>>>
>>> So, if you want the original C source corresponding to a given assembly
>>> file, you're completely out of luck. If you want _any_ C source that might
>>> compile to a given assembly listing, you might have a chance of writing such
>>> a program, but AFAIK none exists. Even with debug symbols (which aren't
>>> guaranteed to exist), the C you end up with is unlikely to even
>>> superficially resemble the original C program or even any C that a human is
>>> likely to write.
>>
>> The mapping between some large number of C programs and a given
>> assembler listing is understood, and OK.
>>
>> I don't care at all how close to the original program it is.
>
> Then what's the purpose of creating "trashy" C source?


I said it doesn't have to be the original C. While I think we could
agree that there are many, many readable C programs that do the same
thing, your question implies otherwise.

> The value of a source file is that it can be read and
> understood, then modified and recompiled to produce a new
> program. If it's unreadable (or nearly so) it's also
> unmodifiable (o.n.s.) -- so, what do you intend to do with
> your cow made from hamburger?


But if it's a readable C program, then your question is badly formed.
Nonetheless: C is more portable than ASM, last time I looked.

--
"It is much easier to propagandize a public that believes in its own
freedom." - Robert McChesney
Nov 14 '05 #9
"Eric Sosman" <er*********@sun.com> wrote in message
news:d1**********@news1brm.Central.Sun.COM...
> Unsolved Mysteries wrote:
>> st*****@sprunk.org wrote...
>>> So, if you want the original C source corresponding to a given
>>> assembly file, you're completely out of luck. If you want _any_ C
>>> source that might compile to a given assembly listing, you might
>>> have a chance of writing such a program, but AFAIK none exists.
>>> Even with debug symbols (which aren't guaranteed to exist), the
>>> C you end up with is unlikely to even superficially resemble the
>>> original C program or even any C that a human is likely to write.
>>
>> The mapping between some large number of C programs and a given
>> assembler listing is understood, and OK.
>>
>> I don't care at all how close to the original program it is.
>
> Then what's the purpose of creating "trashy" C source?
> The value of a source file is that it can be read and
> understood, then modified and recompiled to produce a new
> program. If it's unreadable (or nearly so) it's also
> unmodifiable (o.n.s.) -- so, what do you intend to do with
> your cow made from hamburger?


If nothing else, it makes a great project for undergrads ;)

Depending on how smart the decompiler is, it might do a reasonable job of
ferreting out calling conventions, flow control instructions, etc. With
debug information available, it could even get the function and variable
names (and types?) right. That's certainly more readable/modifiable for me
than what I get from a disassembler, but the usefulness is still low
compared to the original source.

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Aaron Sorkin

Nov 14 '05 #10
Unsolved Mysteries wrote:
> er*********@sun.com wrote...
>> Unsolved Mysteries wrote:
>>> [...]
>>> I don't care at all how close to the original program it is.
>>
>> Then what's the purpose of creating "trashy" C source?
>
> I said it doesn't have to be the original C. While I think we could
> agree that there are many, many readable C programs that do the same
> thing, your question implies otherwise.


No (or at any rate, I don't think so): I'm suggesting
that mechanical dis-compiling is likely to produce one of
the many possible *un*readable C sources for the object code.

>> The value of a source file is that it can be read and
>> understood, then modified and recompiled to produce a new
>> program. If it's unreadable (or nearly so) it's also
>> unmodifiable (o.n.s.) -- so, what do you intend to do with
>> your cow made from hamburger?


> But if it's a readable C program, then your question is badly formed.


If the output is readable, consider yourself either lucky
or an excellent reader ... In any case, questions are valid or
invalid on their premises, not on whatever the answer turns out
to have been.

> Nonetheless: C is more portable than ASM, last time I looked.


It depends rather strongly on the C: you have but to lurk
on this newsgroup for a few days to see as many examples of
wildly non-portable C as you can stomach. Here's a plausible
example: somewhere in the object code you find instructions
that load a `double' register from one location and store
it to another. Your dis-compiler may well generate

*(double*)p = *(double*)q;

... which accurately reflects the object code. Portable?
By no means! What was *really* going on was

struct st { short s; int i; };
struct st x = { 42, 42 };
/* here come the instructions in question: */
struct st y = x;

... where the compiler decided to copy an eight-byte struct
by copying an eight-byte `double'. How portable is this?
Not very! The compiler has taken advantage of its own non-
portable knowledge in generating the code, as it is permitted
to do. Is the idea that sizeof(struct st) == sizeof(double)
portable? No, it is not. How about alignment: Is there any
guarantee that "alignof(struct st)" >= "alignof(double)"?
No, there is not. How about preservation of representation:
Is there any guarantee that loading something that might look
like a signalling NaN into a `double' register will preserve
its bit pattern for the subsequent store? No, there is not.
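
As a sketch of the portable alternatives to the cast-to-`double` copy: the helper names below are invented, and `_Alignof` assumes a C11 compiler. Plain struct assignment and `memcpy` both leave the choice of instructions to the compiler, while `double_trick_fits` merely reports whether the size and alignment assumptions happen to hold on the machine at hand.

```c
#include <stddef.h>
#include <string.h>

struct st { short s; int i; };

/* Portable: struct assignment; the compiler decides how to move the bytes. */
void copy_assign(struct st *dst, const struct st *src)
{
    *dst = *src;
}

/* Portable: byte-wise copy of the whole object, padding included. */
void copy_bytes(struct st *dst, const struct st *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Returns 1 if this implementation happens to satisfy the layout
 * assumptions behind the double-sized copy, 0 otherwise. */
int double_trick_fits(void)
{
    return sizeof(struct st) == sizeof(double)
        && _Alignof(struct st) >= _Alignof(double);
}
```

Even where `double_trick_fits()` returns 1, reading a `struct st` object through a `double` lvalue still breaks the language's aliasing rules, so the mechanically generated line is something only the compiler itself is entitled to do.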

If you hope to dis-compile on machine A and re-compile
on machine B and get working code, you may well hope and your
hope may be rewarded, at least some of the time. But you would
be well-advised not to expect much ...

--
Eric Sosman
es*****@acm-dot-org.invalid
Nov 14 '05 #11
es*****@acm-dot-org.invalid wrote...
> Unsolved Mysteries wrote:
>> er*********@sun.com wrote...
>>> Unsolved Mysteries wrote:
>>>> [...]
>>>> I don't care at all how close to the original program it is.
>>>
>>> Then what's the purpose of creating "trashy" C source?
>>
>> I said it doesn't have to be the original C. While I think we could
>> agree that there are many, many readable C programs that do the same
>> thing, your question implies otherwise.
>
> No (or at any rate, I don't think so): I'm suggesting
> that mechanical dis-compiling is likely to produce one of
> the many possible *un*readable C sources for the object code.


OK. I guess there _would_ be more unreadable than readable resulting
sources, and purely mechanical rendering would be much more likely to
produce one of the former than one of the latter.

>>> The value of a source file is that it can be read and
>>> understood, then modified and recompiled to produce a new
>>> program. If it's unreadable (or nearly so) it's also
>>> unmodifiable (o.n.s.) -- so, what do you intend to do with
>>> your cow made from hamburger?


>> But if it's a readable C program, then your question is badly formed.


> If the output is readable, consider yourself either lucky
> or an excellent reader ... In any case, questions are valid or
> invalid on their premises, not on whatever the answer turns out
> to have been.
>
>> Nonetheless: C is more portable than ASM, last time I looked.
>
> It depends rather strongly on the C: you have but to lurk
> on this newsgroup for a few days to see enough examples of
> wildly non-portable C as you can stomach.


Cripes. I'm 0-2 here. Point taken.

> Here's a plausible
> example: somewhere in the object code you find instructions
> that load a `double' register from one location and store
> it to another. Your dis-compiler may well generate
>
> *(double*)p = *(double*)q;
>
> ... which accurately reflects the object code. Portable?
> By no means! What was *really* going on was
>
> struct st { short s; int i; };
> struct st x = { 42, 42 };
> /* here come the instructions in question: */
> struct st y = x;
>
> ... where the compiler decided to copy an eight-byte struct
> by copying an eight-byte `double'. How portable is this?
> Not very! The compiler has taken advantage of its own non-
> portable knowledge in generating the code, as it is permitted
> to do. Is the idea that sizeof(struct st) == sizeof(double)
> portable? No, it is not. How about alignment: Is there any
> guarantee that "alignof(struct st)" >= "alignof(double)"?
> No, there is not. How about preservation of representation:
> Is there any guarantee that loading something that might look
> like a signalling NaN into a `double' register will preserve
> its bit pattern for the subsequent store? No, there is not.
>
> If you hope to dis-compile on machine A and re-compile
> on machine B and get working code, you may well hope and your
> hope may be rewarded, at least some of the time. But you would
> be well-advised not to expect much ...


--
"It is much easier to propagandize a public that believes in its own
freedom." - Robert McChesney
Nov 14 '05 #12
Mac
On Tue, 15 Mar 2005 18:33:49 +0000, Unsolved Mysteries wrote:
> Anyone know of a translator that converts an Intel Pentium assembly
> listing into C? The quality of output code doesn't have to be great,
> so long as it's accurate.


Funny, I thought this was in the FAQ list, but when I went to look for it
I couldn't find it.

It probably should be in the FAQ list.

--Mac

Nov 14 '05 #13
"Unsolved Mysteries" <um@domain.invalid> wrote:
> st*****@sprunk.org wrote...
>> So, if you want the original C source corresponding to a given assembly
>> file, you're completely out of luck. If you want _any_ C source that might
>> compile to a given assembly listing, you might have a chance of writing such
>> a program, but AFAIK none exists. Even with debug symbols (which aren't
>> guaranteed to exist), the C you end up with is unlikely to even
>> superficially resemble the original C program or even any C that a human is
>> likely to write.
>
> The mapping between some large number of C programs and a given
> assembler listing is understood, and OK.


For a pre-determined compiler, used on a pre-determined platform, with
pre-determined options, perhaps. If you don't know any of that, it isn't
understood and, when push comes to shove, cannot be.

> I don't care at all how close to the original program it is.


Then you have a better chance at generating _something_, but still
hardly a chance of generating something legible, let alone maintainable.

Richard
Nov 14 '05 #14
On Wed, 16 Mar 2005 03:08:31 GMT, Unsolved Mysteries <um@domain.invalid>
wrote:
> Cripes. I'm 0-2 here. Point taken.


I tried to tell ya.

--
#include <standard.disclaimer>
_
Kevin D Quitt USA 91387-4454 96.37% of all statistics are made up
Per the FCA, this address may not be added to any commercial mail list
Nov 14 '05 #15
Unsolved Mysteries wrote:
> Anyone know of a translator that converts an Intel Pentium assembly
> listing into C? The quality of output code doesn't have to be great,
> so long as it's accurate.


I know there are decompilers that can generate C code, but the problem is
that they only catch simple things, such as loops. The resulting C
source still is far from portable, especially because escapes to assembler
are made throughout the source.
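
To give a feel for the difference, here is a hand-written sketch in the style a mechanical translator might produce, next to what a person would write. This is not the output of any actual tool; `sub_401000`, the register-named variables, and the labels are all invented for illustration.

```c
/* Mechanical style: register names as variables, jumps as gotos. */
int sub_401000(const int *arg0, int arg1)
{
    int eax = 0;
    int ecx = 0;
loc_1:
    if (ecx >= arg1)
        goto loc_2;
    eax += arg0[ecx];
    ecx++;
    goto loc_1;
loc_2:
    return eax;
}

/* The same computation as a person would write it. */
int sum(const int *values, int count)
{
    int total = 0;
    for (int i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

Both functions return the same result for the same input. A decompiler that recognizes the loop can get from the first form to something like the second, but the meaningful names are gone unless debug information survived.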

One big issue is the stub that a C compiler puts in front of every program
it generates. This stub may not (and probably will not) be recognized as
such, causing the decompiler to generate some strange code which is
completely redundant.

If what you are trying to do is reverse engineer a program by getting the C
equivalent for it, any C generator will only help so much; an
understanding of the machine on which the original program was intended to
run is still necessary, as well as some basic ASM knowledge.

If your goal is to generate a portable version of a piece of software, you
might as well forget about it and see if you can make a deal with the
original developer.

Good luck either way!

--
Martijn
http://www.sereneconcepts.nl
Nov 14 '05 #16
"Unsolved Mysteries" <um@domain.invalid> wrote in message
news:MP************************@news.verizon.net...
> Anyone know of a translator that converts an Intel Pentium assembly
> listing into C? The quality of output code doesn't have to be great,
> so long as it's accurate.


Check out Software Migrations, Ltd. http://www.smltd.com/
They supply a service to do this.
I believe it to be very high quality, although I have not used it.

--
Ira D. Baxter, Ph.D., CTO 512-250-1018
Semantic Designs, Inc. www.semdesigns.com

Nov 14 '05 #17
