473,785 Members | 2,823 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Can anyone improve this any further......

I am writing a byte-code interpreter/emulator for a language that
exclusively uses strings for variables (i.e all variables are pushed onto
the stack as strings). Therefore for arithmetic operations I need to
convert the string to a 32 bit integer, carry out the arithmetic operation,
then convert it back to a string before pushing it back onto the stack.
This can happen millions of times per second so it is essential that I have
optimised (for speed) atol() and ltoa() functions.

Here is my attempt at ltoa()... (The __fastcall keyword is to tell my
compiler to pass arguments through registers and is used instead of inline
because it seems to be just as fast and this function gets called from
various modules ..).

static char * __fastcall myltoa(long n,char * s)
{
register long r, k;
register int flag = 0;
register int next;

next = 0;
if (n == 0) {
s[next++] = '0';
}
else {
if (n < 0) {
s[next++] = '-';
n = -n;
}

if(n < 100) goto label2;
if(n < 100000) goto label1;

k = 1000000000;
r = n/k;
if(flag) s[next++] = '0' + r;
else if(r > 0){ s[next++] = '0' + r;flag = 1;}
n -= r * k;

k = 100000000;
r = n/k;
if(flag) s[next++] = '0' + r;
else if(r > 0){ s[next++] = '0' + r;flag = 1;}
n -= r * k;

k = 10000000;
r = n/k;
if(flag) s[next++] = '0' + r;
else if(r > 0){ s[next++] = '0' + r;flag = 1;}
n -= r * k;

k = 1000000;
r = n/k;
if(flag) s[next++] = '0' + r;
else if(r > 0){ s[next++] = '0' + r;flag = 1;}
n -= r * k;

k = 100000;
r = n/k;
if(flag) s[next++] = '0' + r;
else if(r > 0){ s[next++] = '0' + r;flag = 1;}
n -= r * k;
label1:
k = 10000;
r = n/k;
if(flag) s[next++] = '0' + r;
else if(r > 0){ s[next++] = '0' + r;flag = 1;}
n -= r * k;

k = 1000;
r = n/k;
if(flag) s[next++] = '0' + r;
else if(r > 0){ s[next++] = '0' + r;flag = 1;}
n -= r * k;

k = 100;
r = n/k;
if(flag) s[next++] = '0' + r;
else if(r > 0){ s[next++] = '0' + r;flag = 1;}
n -= r * k;
label2:
k = 10;
r = n/k;
if(flag) s[next++] = '0' + r;
else if(r > 0){ s[next++] = '0' + r;flag = 1;}
n -= r * k;

r=n;
s[next++] = '0' + r;
}
s[next] = '\0';
return(s);
}
The goto's are there because I recognised that the majority of numbers are
less than 100 (mainly return codes from functions) and those that are not
less than 100 tend to be less than 100,000 (this language is not usually
used for mathemetics). This did give me another 15-20% improvement in
speed on a tight loop going some arithmetic..

By the way, I have tried pre-calculating integer values for all strings
pushed onto the stack (by always appending the 4 byte integer value on the
stack at the end of the string, but it actually caused a slowdown since the
I have to do a atol() on every function return value or string concatenation
operation even if the value is not going to be used in an arithmetic or
comparison operation.

Any hints would be welcome...

Sean
Nov 14 '05 #1
17 2088
Sean Kenwrick wrote:

I am writing a byte-code interpreter/emulator for a language that
exclusively uses strings for variables (i.e all variables are pushed onto
the stack as strings). Therefore for arithmetic operations I need to
convert the string to a 32 bit integer, carry out the arithmetic operation,
then convert it back to a string before pushing it back onto the stack.
This can happen millions of times per second so it is essential that I have
optimised (for speed) atol() and ltoa() functions.

Here is my attempt at ltoa()... (The __fastcall keyword is to tell my
compiler to pass arguments through registers and is used instead of inline
because it seems to be just as fast and this function gets called from
various modules ..).

static char * __fastcall myltoa(long n,char * s)
{
register long r, k;
register int flag = 0;
register int next;

next = 0;
if (n == 0) {
s[next++] = '0';
}
else {
if (n < 0) {
s[next++] = '-';
n = -n;
Undefined behavior if (-LONG_MAX > n)
}

if(n < 100) goto label2;
if(n < 100000) goto label1;

k = 1000000000;
r = n/k;
if(flag) s[next++] = '0' + r;
if(flag)
flag is already known to be zero, at this point.
else if(r > 0){ s[next++] = '0' + r;flag = 1;}
Any hints would be welcome...


Please time this for me:

#include <limits.h>
char *lto_a(long n, char *s)
{
char c, *p, *q;
int flag = 0;

q = p = s;
if (0 > n) {
*p++ = '-';
++s;
if (-LONG_MAX > n) {
flag = 1;
n = LONG_MAX;
} else {
n = -n;
}
}
do {
*p++ = (char)(n % 10 + '0');
n /= 10;
} while (n != 0);
if (flag) {
++*s;
}
for (*p = '\0'; --p > s; ++s) {
c = *s;
*s = *p;
*p = c;
}
return q;
}

--
pete
Nov 14 '05 #2

"pete" <pf*****@mindsp ring.com> wrote in message
news:40******** ***@mindspring. com...
Sean Kenwrick wrote:

I am writing a byte-code interpreter/emulator for a language that
exclusively uses strings for variables (i.e all variables are pushed onto the stack as strings). Therefore for arithmetic operations I need to
convert the string to a 32 bit integer, carry out the arithmetic operation, then convert it back to a string before pushing it back onto the stack.
This can happen millions of times per second so it is essential that I have optimised (for speed) atol() and ltoa() functions.

Here is my attempt at ltoa()... (The __fastcall keyword is to tell my
compiler to pass arguments through registers and is used instead of inline because it seems to be just as fast and this function gets called from
various modules ..).

static char * __fastcall myltoa(long n,char * s)
{
register long r, k;
register int flag = 0;
register int next;

next = 0;
if (n == 0) {
s[next++] = '0';
}
else {
if (n < 0) {
s[next++] = '-';
n = -n;
Undefined behavior if (-LONG_MAX > n)
}

if(n < 100) goto label2;
if(n < 100000) goto label1;

k = 1000000000;
r = n/k;
if(flag) s[next++] = '0' + r;


if(flag)
flag is already known to be zero, at this point.
else if(r > 0){ s[next++] = '0' + r;flag = 1;}


Any hints would be welcome...


Please time this for me:

#include <limits.h>
char *lto_a(long n, char *s)
{
char c, *p, *q;
int flag = 0;

q = p = s;
if (0 > n)

*p++ = '-';
++s;
if (-LONG_MAX > n) {
flag = 1;
n = LONG_MAX;
} else {
n = -n;
}
}
do {
*p++ = (char)(n % 10 + '0');
n /= 10;
} while (n != 0);
if (flag) {
++*s;
}
for (*p = '\0'; --p > s; ++s) {
c = *s;
*s = *p;
*p = c;
}
return q;
}

--
pete


After I change the call to __fastcall and use register variables It's very
slightly slower (mine took 4650ms, yours took 4860ms for a 1,000,000 times
loop doing some simple arithmetic). The difference is probably in the
strrev() required at the end - but I like the solution of going from the
right to left and I think I have a version that might not need the strrev()
at the end..... I will post my attempt and the timing later...

Thanks
Sean
Nov 14 '05 #3
What about this?

#include <limits.h>

#if INT_MAX==0x7FFF FFFF && INT_MIN+2147483 647>=-1
/* 32 bit signed arithmetic assumed */
char *myltoa(long n, char *s)
{
unsigned short j = 1;
if (n<0)
{ /* handle negatives */
s[0] = '-'; ++j;
if (INT_MIN+214748 3647==-1 && n==INT_MIN)
{ /* handle -2147483648 the hard way */
s[1] = '2'; ++j;
n = -147483648;
}
n = -n;
}
/* find length using kinda dichotomy */
s[j += (n<10000)?(n<10 0)?(n>=10):(n>= 1000)+2:
(n<100000000)?( n<1000000)?(n>= 100000)+4:
(n>=10000000)+6 :(n>=1000000000 )+8] = '\0';
/* convert from right to left */
do s[--j] = n%10+'0'; while ((n/=10)!=0);
/* all done */
return(s);
}
#endif
François Grieu
Nov 14 '05 #4
Sean Kenwrick wrote:
After I change the call to __fastcall and use register variables
It's very slightly slower (mine took 4650ms,
yours took 4860ms for a 1,000,000 times
loop doing some simple arithmetic).
The difference is probably in the strrev() required at the end
- but I like the solution of going from the
right to left and I think I have a version that might not need
the strrev() at the end.....
I will post my attempt and the timing later...


Don't forget the (-LONG_MAX > n) case.
Thank you.

--
pete
Nov 14 '05 #5

"Sean Kenwrick" <sk*******@hotm ail.com> wrote in message
news:c0******** **@titan.btinte rnet.com...

"pete" <pf*****@mindsp ring.com> wrote in message
news:40******** ***@mindspring. com...
Sean Kenwrick wrote:

I am writing a byte-code interpreter/emulator for a language that
exclusively uses strings for variables (i.e all variables are pushed onto the stack as strings). Therefore for arithmetic operations I need to convert the string to a 32 bit integer, carry out the arithmetic operation, then convert it back to a string before pushing it back onto the stack. This can happen millions of times per second so it is essential that I have optimised (for speed) atol() and ltoa() functions.

Here is my attempt at ltoa()... (The __fastcall keyword is to tell my
compiler to pass arguments through registers and is used instead of inline because it seems to be just as fast and this function gets called from
various modules ..).

static char * __fastcall myltoa(long n,char * s)
{
register long r, k;
register int flag = 0;
register int next;

next = 0;
if (n == 0) {
s[next++] = '0';
}
else {
if (n < 0) {
s[next++] = '-';
n = -n;
Undefined behavior if (-LONG_MAX > n)
}

if(n < 100) goto label2;
if(n < 100000) goto label1;

k = 1000000000;
r = n/k;
if(flag) s[next++] = '0' + r;


if(flag)
flag is already known to be zero, at this point.
else if(r > 0){ s[next++] = '0' + r;flag = 1;}


Any hints would be welcome...


Please time this for me:

#include <limits.h>
char *lto_a(long n, char *s)
{
char c, *p, *q;
int flag = 0;

q = p = s;
if (0 > n)

*p++ = '-';
++s;
if (-LONG_MAX > n) {
flag = 1;
n = LONG_MAX;
} else {
n = -n;
}
}
do {
*p++ = (char)(n % 10 + '0');
n /= 10;
} while (n != 0);
if (flag) {
++*s;
}
for (*p = '\0'; --p > s; ++s) {
c = *s;
*s = *p;
*p = c;
}
return q;
}

--
pete


After I change the call to __fastcall and use register variables It's very
slightly slower (mine took 4650ms, yours took 4860ms for a 1,000,000 times
loop doing some simple arithmetic). The difference is probably in the
strrev() required at the end - but I like the solution of going from the
right to left and I think I have a version that might not need the

strrev() at the end..... I will post my attempt and the timing later...

Thanks
Sean


By the way I mis-remembered the timings above, they were actually 2650 and
2860ms...

Here is the version I ended up with of the above algorithm. I can be
certain that the long passed does not exceed +/- LONG_MAX so I don't need
these tests, and I've done away with the strrev() at the end by filling the
string backwards from the 11th char and then returning a pointer to where
the actual number starts in the string..

char * __fastcall myltoa(long n, char *s)
{
register char *p;
register int flag=0;

p=s+12;
*p--='\0'; /* Null terminate */

// If n is negative
if (0 > n) {
flag=1;
n = -n;
}
do {
// Next char is n%10 (Rightmost digit)
*p-- = (char)(n % 10 + '0');
// Strip off the rightmost digit
n /= 10;
} while (n != 0); // Until n is 0

if(flag)
*p-- = '-';
// Return the offset into the string where the number starts
return p+1;
}

However there was no significant change in the timings so I suspect that the
problem with the above is that it is doing two divides per digit (in fact
one modulo and 1 divide) - and divides are typically alot slower than all
the other arithmetic operations.. In my version I do 1 divide, 1 multiply
and 1 subtraction which might be a saving of quite a few cycles per digit...

I can't see any way of optimising this version any firther unless there is a
way to get rid of one of the divides...
Sean
Nov 14 '05 #6

"Francois Grieu" <fg****@microne t.fr> wrote in message
news:fg******** *************** ***@news.cis.df n.de...
What about this?

#include <limits.h>

#if INT_MAX==0x7FFF FFFF && INT_MIN+2147483 647>=-1
/* 32 bit signed arithmetic assumed */
char *myltoa(long n, char *s)
{
unsigned short j = 1;
if (n<0)
{ /* handle negatives */
s[0] = '-'; ++j;
if (INT_MIN+214748 3647==-1 && n==INT_MIN)
{ /* handle -2147483648 the hard way */
s[1] = '2'; ++j;
n = -147483648;
}
n = -n;
}
/* find length using kinda dichotomy */
s[j += (n<10000)?(n<10 0)?(n>=10):(n>= 1000)+2:
(n<100000000)?( n<1000000)?(n>= 100000)+4:
(n>=10000000)+6 :(n>=1000000000 )+8] = '\0';
/* convert from right to left */
do s[--j] = n%10+'0'; while ((n/=10)!=0);
/* all done */
return(s);
}
#endif
François Grieu


This is similar to my solution expect that I didn't bother checking against
INT_MIN/MAX (because I know that these can't be exceeded) and I didn't
bother filling the string from the first character (so no need for the
length calculation (I just return a pointer to the offset in the string).
But this suffers from the same problem that I had with my version (See
previous post) in that it effectively does two divides per digit and this
seems to be causing the function to be slower than my orignal by about 10%..

If there is any way to get rid of one of the divides then this would
probably be faster..

Sean
Nov 14 '05 #7
Sean Kenwrick wrote:
However there was no significant change in the timings
so I suspect that the problem with the above is that it
is doing two divides per digit
(in fact one modulo and 1 divide)
- and divides are typically alot slower than all
the other arithmetic operations.. In my version I do 1 divide,
1 multiply and 1 subtraction which might be a saving of quite
a few cycles per digit...

I can't see any way of optimising this version any
firther unless there is a way to get rid of one of the divides...


There is a way, but is it faster ?

#include <stdlib.h>
ldiv_t ldiv(long numer, long denom);

--
pete
Nov 14 '05 #8

"pete" <pf*****@mindsp ring.com> wrote in message
news:40******** ***@mindspring. com...
Sean Kenwrick wrote:
However there was no significant change in the timings
so I suspect that the problem with the above is that it
is doing two divides per digit
(in fact one modulo and 1 divide)
- and divides are typically alot slower than all
the other arithmetic operations.. In my version I do 1 divide,
1 multiply and 1 subtraction which might be a saving of quite
a few cycles per digit...

I can't see any way of optimising this version any
firther unless there is a way to get rid of one of the divides...


There is a way, but is it faster ?

#include <stdlib.h>
ldiv_t ldiv(long numer, long denom);

--
pete


I got rid of the extra divide and replaced it with a multiply and subtract
(to do the modulo bit). It is now as fast as my original, and even
slightly faster in some cases since I don't need to short-cut numbers less
than 100,000 or 100 for example... It is also a much cleaner solution so I
will use this instead. Here is my final version...

char * __fastcall myltoa(long n, char *s)
{
register int flag=0;
register int x;

s+=12;
*s--='\0'; /* Null terminate */

/* If n is negative */
if (0 > n) {
flag=1;
n = -n;
}
do {
/* Next char is n%10 (Rightmost digit) */
x=n/10;
*s-- = n -(x*10) + '0';
/* Strip off the rightmost digit */
n = x;
} while (n != 0); /* Until n is 0 */

if(flag)
*s-- = '-';
/* Return the offset into the string where the number starts */
return s+1;
}
Sean
Nov 14 '05 #9
In article <c0**********@t itan.btinternet .com>,
"Sean Kenwrick" <sk*******@hotm ail.com> wrote:
I got rid of the extra divide and replaced it with a multiply and subtract
(to do the modulo bit). It is now (faster)


In my experience, is unusual that % is slower than / is for unsigned
quatities. I guess your system has a problem with modulo of signed
quantities.
Since apparently you have no requirement that the string returned
starts where s does, maybe try:

/* 32 bit signed arithmetic assumed */
/* does not handle -2147483648 */
char *myltoa(long n, char *s)
{
char neg;
*(s += 11) = neg = '\0';
if (n<0)
{
n = -n; /* does not handle -2147483648 */
neg = '-';
}
do *--s = (unsigned)n%10+ '0';
while ((n = (unsigned)n/10)!=0);
if (neg!='\0')
*--s = neg;
return(s);
}
François Grieu
Nov 14 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

15
1709
by: Brandon J. Van Every | last post by:
Is anyone using Python for .NET? I mean Brian's version at Zope, which simply accesses .NET in a one-way fashion from Python. http://www.zope.org/Members/Brian/PythonNet Not the experimental ActiveState stuff, which tried to compile IL and apparently didn't succeed. Two motives for the question: 1) whether to use it for my C++ / C# / .NET / Python (?) game project. It's a prototype, so in this context a "mostly working beta" is...
13
2679
by: James Conrad St.John Foreman | last post by:
One of the larger tables in our database is now 6.8 million rows (1 per financial transaction since 2000). Every time an amendment is made to a booking, new rows are added to the table for each transaction, so in general we never have any call to update old rows. Usually, we only deal with analysis on transactions in the current financial year or the previous one, but *occasionally* we'll want to go back further. So I'm thinking as...
9
4617
by: Peng Jian | last post by:
I have a function that is called very very often. Can I improve its efficiency by declaring its local variables to be static?
65
12617
by: Skybuck Flying | last post by:
Hi, I needed a method to determine if a point was on a line segment in 2D. So I googled for some help and so far I have evaluated two methods. The first method was only a formula, the second method was a piece of C code which turned out to be incorrect and incomplete but by modifieing it would still be usuable. The first method was this piece of text:
0
2499
by: Brian Basquille | last post by:
Hello all. Quick question for anyone that has used RawMouse / RawInputMouse (and for anyone that hasn't used it.. look here: http://link.mywwwserver.com/~jstookey/arcade/rawmouse/). RawMouse is basically a program for managing multiple mouses independently but connected to the same machine using MAME. It's been ported to C# recently enough and having finally gotten it to read
0
1162
by: Laurence Parry | last post by:
Some people over on Wikipedia seem to have the idea that VB.NET is not a major programming language: http://en.wikipedia.org/wiki/Template_talk:Major_programming_languages_small#Visual_Basic_.NET I think they're wrong. But I had to agree with what one guy said - the general view in the geeky programming community is that VB.NET is a poor brother to C#; some still think it's just a slightly improved version of VB! I don't think this will...
19
2205
by: Alan Silver | last post by:
Hello, Having discovered what I believe to be two CSS bugs in IE7, I have submitted bug reports to MS. AFAIK, these don't get acted on until they receive votes form others to say they are worth investigating. As (I naively assume) it's in everyone's interest to see IE7 support CSS properly, would anyone care to vote for these bugs, so that they might get fixed before IE7 comes out of beta?
0
1280
by: Jeff Rush | last post by:
(inspired by a thread on the psf-members list about Python certification) Those who know me know I am no fan of the programmer certification industry and I agree that your typical certificate arrangement, e.g. to show HR to get that box checked in your candidate file, so a company can skip responsibility in vetting a candidate, is un-Pythonic. However, I do think we need a good answer to the engineering manager or team leader who asks,...
2
1385
by: Bowlderster | last post by:
Hi,all. I produce a function to analysis the test data,which is wave signal, and stored as array a. I want to figure out how many times it upcrosses zero,which means that when a<0,and a>0, it upcrosses zero one time.I need store the numbers of upcross zero, and the index where it upcrosses zero, for example, index i(a>0) and index j(a<0&&a>0).Between a and a, there is no upcross zero signal. The maximum and minimum value between a and a...
2
2775
by: sdanda | last post by:
Hi , Do you have any idea how to improve my java class performance while selecting and inserting data into DB using JDBC Connectivity ......... This has to work for more than 8,00,000 of records ..... Can you give some performance tips if you have known 1) For this I am using oci driver ( because I m using oracle 10g) instead of thin driver 2) In that programme I m using prepared statement instead of statement 3) I am...
0
9645
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10330
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10153
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10093
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9952
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
8976
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
6740
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5381
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5511
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.