Mostly for testing reasons I'd like to see if it makes sense to choose
the following approach for just-in-time compilation of shaders for a
renderer:
Seeing as the shaders themselves consist mostly of very basic operations,
I'd like to translate them into assembly, have an assembler compile the
binary code and then call the resulting machine code from C++.
The thing is that up until now I have only used inline assembly in my
C++ projects, so there are a few things I hardly know anything about, and
I would be very grateful if anyone here could point me in the right
direction:
- Given a set of asm instructions, say "addl $5, %%eax" or "add eax, 5"
respectively, how would I go about translating just this one line into
binary? (In a way that doesn't mean I'll have to rewrite the whole
thing when porting to a different OS, if at all possible :)
- How do I jump into the binary from my C++ app in a way that lets me jmp
back at the end of my assembly code segment?
Thanks!
Dec 23 '06
ne**@rtrussell.co.uk wrote:
Robert Mabee wrote:
>>I believe in fact it is set to prevent such, but the trap handler does "what you want" (for suitable choice of what to want...) by flushing the cache and changing the pages from R/W data to read-only code.
Your evidence for that is what exactly? I'm pretty sure it's not the
case.
The evidence that Windows has such a trap handler is your prior claim
that such code works. A cache flush is needed on any CPU that has
an I cache that doesn't snoop the bus. I recall discussions about such
a CPU from Intel but couldn't say if this is a problem with recent
chips, but the code has to be right on the worst case that might still
be running.
>>It might get really expensive if the code wrote to its own page.
Which of course is exactly what self-modifying code does. Yes it's
expensive in performance, but IA-32 processors support it without any
intervention from the OS (see my quote from the Intel Optimization
Manual elsewhere in this thread).
Then where does the expense come from? I mean by my remark to warn the
OP to do all the writing to the fabricated code before jumping into it,
which will either make no difference (your model) or a vast improvement
(my model) versus a possible implementation that mixes writes and code
fetches to the same page.
I checked the quote -- it unfortunately doesn't say anything about
whether prior implementations also do this right.
All this talk of cache flush penalties: How is
a call to a just-created block of code any
different to the CPU than an indirect call through
a register? In either case the CPU must load
new code into the pipeline that can't be prefetched.
I assume the OP would only use generated code
for performance, which implies it will be in a loop
that gets used enough times to overcome any
setup penalties... just like most performance code.
Best regards,
Bob Masta
dqatechATdaqart aDOTcom
D A Q A R T A
Data AcQuisition And Real-Time Analysis www.daqarta.com
Home of DaqGen, the FREEWARE signal generator
Robert Mabee wrote:
Then where does the expense come from?
This is what the Intel optimization document says:
"Software should avoid writing to a code page in the same 1 KB subpage
of that is being executed or fetching code in the same 2 KB subpage of
that is currently being written. In addition, sharing a page containing
directly or speculatively executed code with another processor as a
data page can trigger an SMC condition that causes the entire pipeline
of the machine and the trace cache to be cleared. This is due to the
self-modifying code condition".
I mean by my remark to warn the
OP to do all the writing to the fabricated code before jumping into it,
which will either make no difference (your model) or a vast improvement
(my model) versus a possible implementation that mixes writes and code
fetches to the same page.
I never suggested that there was no performance hit, I simply wanted to
emphasise that writing machine code to data memory and then executing
it is supported by the processor, and requires no user or OS
intervention (such as flushing the instruction cache) - the necessary
steps are carried out by the CPU itself. This is the 'SMC condition'
referred to by Intel.
I checked the quote -- it unfortunately doesn't say anything about
whether prior implementations also do this right.
It refers to "Self-modifying code (SMC) that ran correctly on Pentium
III processors and prior implementations". Since 'prior
implementations' presumably include the 80386, the implication is that
all IA-32 processors have supported self-modifying code. Certainly I
have never encountered any unexpected behavior from dynamically
assembling code and then executing it in data memory, which all
versions of BBC BASIC (right back to the 6502) have done.
Richard. http://www.rtrussell.co.uk/
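For anyone who wants the generator to run on more than IA-32: the no-flush guarantee discussed above is specific to that processor family, so an explicit synchronisation step after the last write is the usual precaution elsewhere. A possible sketch, assuming GCC or Clang (for __builtin___clear_cache) or Windows (for FlushInstructionCache); the function name sync_generated_code is made up for illustration:

```cpp
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: call it after the last write to the generated code
// and before the first jump into it.
void sync_generated_code(void* begin, std::size_t length)
{
#if defined(_WIN32)
    // Windows API: flushes the instruction cache for the given range.
    FlushInstructionCache(GetCurrentProcess(), begin, length);
#elif defined(__GNUC__)
    // GCC/Clang builtin; it has no effect on targets that need no flush.
    char* first = static_cast<char*>(begin);
    __builtin___clear_cache(first, first + length);
#else
    // Assumption: no known primitive on this toolchain, so do nothing.
    (void)begin;
    (void)length;
#endif
}
```

The helper never modifies the buffer, so calling it on x86 costs little and keeps the calling code identical across targets.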
You can use or study the GNU lightning library: http://www.gnu.org/software/lightning
GNU lightning is a library that generates assembly language code at
run-time; it is very fast, making it ideal for Just-In-Time compilers, and
it abstracts over the target CPU, as it exposes to the clients a
standardized RISC instruction set inspired by the MIPS and SPARC chips.
GNU lightning 1.0 has been released and is usable in complex code
generation tasks. The available backends cover the x86, SPARC and PowerPC
architectures; the floating point interface is still experimental though,
and developed for the x86 only.
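To make the idea of a run-time code generator a bit more concrete, here is a minimal hand-rolled sketch in C++ of the encoding step for the "add eax, 5" line from the question. It does not use GNU lightning's API; the class X86Emitter and its method names are made up for illustration. The opcodes used (B8 for mov eax, imm32; 05 for add eax, imm32; C3 for ret) are the standard x86 encodings, but the sketch covers only these three instructions:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical minimal emitter: appends raw x86 machine code to a byte buffer.
class X86Emitter {
public:
    // mov eax, imm32  ->  B8 followed by the immediate
    void mov_eax_imm32(std::int32_t imm) { byte(0xB8); imm32(imm); }

    // add eax, imm32  ->  05 followed by the immediate
    void add_eax_imm32(std::int32_t imm) { byte(0x05); imm32(imm); }

    // ret  ->  C3
    void ret() { byte(0xC3); }

    const std::vector<std::uint8_t>& bytes() const { return code_; }

private:
    void byte(std::uint8_t b) { code_.push_back(b); }

    // x86 stores immediates in little-endian order, lowest byte first.
    void imm32(std::int32_t value)
    {
        const std::uint32_t u = static_cast<std::uint32_t>(value);
        for (int i = 0; i < 4; ++i)
            byte(static_cast<std::uint8_t>(u >> (8 * i)));
    }

    std::vector<std::uint8_t> code_;
};
```

With this, add_eax_imm32(5) appends the five bytes 05 05 00 00 00. Nothing here depends on the operating system, only on the CPU. A real emitter also has to deal with ModRM bytes, other registers and operand sizes, which is the work that libraries like the ones mentioned in this thread take off your hands.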
regards,
lajos
Hi,
I have a project, SoftWire, that does exactly what you intend to do: https://gna.org/projects/softwire/. It's free for use under the LGPL
license. Its commercial successor is used in SwiftShader, an advanced
software renderer.
Executing the binary code is as simple as treating the pointer to the
memory buffer as a function pointer, and calling it. Memory can be made
executable with the following code (straight from SoftWire):
#ifdef WIN32
    unsigned long oldProtection;
    VirtualProtect(machineCode, length, PAGE_EXECUTE_READWRITE,
                   &oldProtection);   // #include <windows.h>
#elif __unix__
    mprotect(machineCode, length, PROT_READ | PROT_WRITE | PROT_EXEC);
    // #include <sys/mman.h>
#endif
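As a rough end-to-end sketch of the above (this is not SoftWire's actual code): the hypothetical helper run_machine_code below copies a byte buffer into freshly allocated memory, switches the pages to executable and calls the buffer through a function pointer. It assumes an x86 or x86-64 CPU, a calling convention that returns an int in eax, and an OS that lets a process make its own anonymous pages executable. All writes happen before the first call. Getting back into C++ needs no explicit jmp; ending the generated code with ret is enough.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

// Hypothetical helper (not from SoftWire): runs `code` as an int(*)() function.
int run_machine_code(const std::vector<std::uint8_t>& code)
{
    assert(!code.empty());
    const std::size_t length = code.size();

#ifdef _WIN32
    void* mem = VirtualAlloc(nullptr, length, MEM_COMMIT | MEM_RESERVE,
                             PAGE_READWRITE);
    if (mem == nullptr) throw std::runtime_error("VirtualAlloc failed");
    std::memcpy(mem, code.data(), length);           // write everything first
    DWORD oldProtection;
    if (!VirtualProtect(mem, length, PAGE_EXECUTE_READ, &oldProtection)) {
        VirtualFree(mem, 0, MEM_RELEASE);
        throw std::runtime_error("VirtualProtect failed");
    }
#else
    // mmap returns page-aligned memory, which mprotect requires.
    void* mem = mmap(nullptr, length, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) throw std::runtime_error("mmap failed");
    std::memcpy(mem, code.data(), length);           // write everything first
    if (mprotect(mem, length, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, length);
        throw std::runtime_error("mprotect failed");
    }
#endif

    // Treat the buffer as a function; the generated `ret` returns here.
    auto fn = reinterpret_cast<int (*)()>(mem);
    const int result = fn();

#ifdef _WIN32
    VirtualFree(mem, 0, MEM_RELEASE);
#else
    munmap(mem, length);
#endif
    return result;
}
```

For example, on x86 the eleven bytes B8 25 00 00 00 05 05 00 00 00 C3 (mov eax, 37; add eax, 5; ret) should come back as 42. In a renderer you would keep the memory around and call it many times rather than allocating and freeing per call.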
Kind regards,
Nicolas Capens