Bytes | Software Development & Data Engineering Community

is there any faster way to parse string into float number

Dear all,

i need to parse billions of numbers from a file into float numbers for
further calculation.
i'm not satisfied with the speed of atof() function on my machine(i'm
using visual c++ 6).

so i wonder is there any faster way to parse string(char array) into
float number.

Thanks
Leo Jay

Mar 16 '06 #1
Py***********@gmail.com wrote:
i need to parse billions of numbers from a file into float numbers for
further calculation.
i'm not satisfied with the speed of atof() function on my machine(i'm
using visual c++ 6).

so i wonder is there any faster way to parse string(char array) into
float number.


Maybe. atof() has to handle all possible formats - thus if you know the
format of your strings you should be able to write a faster parser
specialised for your own needs.
But what exactly is your c++-question?;)

Mathias
Mar 16 '06 #2
thanks for your reply.

any kind of valid float number expression would be possible.
here is an excerpt from one of my files for example: "8.2109000000e+04
-2.8705000000e+04 0 0 0.800000 0.270000 2.160000 9.9000000000e-01"
so, i have to handle all possible formats too. :(
as far as i know, there is a c++ standard header file named cmath, and
there is also a function which is defined in cmath named atof.
so, my c++ question is how to speed up the c++ function named atof? or
is there any decent c++ way to achieve my purpose. ;)

thanks a million.
Leo Jay

Mar 16 '06 #3
Col
If you somehow turn the char* into a C++ string object (i.e. create a
string object by passing it the char*) and pass that object to this
function, it can solve your problem.

float value(string str)
{
istream ob;
ob>>str;
return str;
}
Regards,
Apoorv

Mar 16 '06 #4
Py***********@gmail.com wrote:
as far as i know, there is a c++ standard header file named cmath, and
there is also a function which is defined in cmath named atof.


That's just C's 'atof()' function made available through namespace
'std'. Also, 'atof()' is not in <cmath> but in <cstdlib>.

Also, I doubt that you can squeeze a major performance gain out of
'atof()'.
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
Mar 16 '06 #5
Col <ap***********@gmail.com> wrote:
Py***********@gmail.com wrote:
i'm not satisfied with the speed of atof() function on my machine(i'm
using visual c++ 6).

so i wonder is there any faster way to parse string(char array) into
float number.

float value(string str)
{
istream ob;
ob>>str;
return str;
}


I doubt that this will be faster than using atof. Besides the fact
that this will not even work, you will be constructing a temporary
string object and a temporary istream object here. This is not what
makes things faster.
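
For reference, a minimal sketch of what the quoted function was presumably meant to do, assuming the intent was to read one number through a string stream. It takes the char* directly and builds the stream from it; returning 0.0f on failure is a choice made here, not something from the original post. This only shows a version that compiles and works. It is not expected to beat atof().

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Parse one float from the NUL-terminated text using a string stream.
// Returns 0.0f if the text does not start with a valid number
// (an assumption of this sketch, mirroring atof()'s behaviour).
float value(const char* text)
{
    assert(text != 0);                      // precondition: non-null input

    std::istringstream in((std::string(text)));
    float result = 0.0f;
    if (!(in >> result))                    // extraction failed
        return 0.0f;
    return result;
}
```

Every call still constructs a temporary string and a stream, which is the overhead described above.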

If the OP proved atof to be the bottleneck, he should try writing a
faster version himself. Since you need all the functionality atof gives
you, I agree this might be a bit difficult. You might be able to reduce
the number of function calls tho:

You said you have a string "8.2109000000e+04 -2.8705000000e+04 0 0
0.800000 0.270000 2.160000 9.9000000000e-01". So instead of parsing
every float separately and finding the next space, you could have your
function parse all floats at once and store the results in an array. This
way you will 1) decrease the number of parameters you have to pass to the
function, 2) decrease the number of copied return values and 3) are
also able to have your function return the pointer *after* the parsing:

char* atofs (char* buf, std::vector <float>& out)
{
    while we have something left to parse
    {
        parse the float in 'buf'

        if parsing went alright
            out.push_back (the resulting float);
        else
            break;
    }

    return buf;
}

Obviously, the above is pseudo-code. Note that the more floats you
will pass at once (ie. the longer your string for parsing is), the
greater the speed difference might be. But there is no guarantee that
anything will be faster afterwards. Nonetheless, I would take the chance
and just try it.
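
One possible runnable version of the pseudo-code, as a sketch only. It assumes std::strtod does the per-number parsing (the pseudo-code does not say which parser to use) and it takes a const char* instead of char*. strtod skips leading whitespace itself and reports where it stopped, so no separate search for the next space is needed.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Parse every whitespace-separated number in the NUL-terminated string
// `buf` and append the results to `out`. Returns a pointer to the position
// where parsing stopped (end of input or the first unparsable text).
const char* atofs(const char* buf, std::vector<float>& out)
{
    assert(buf != 0);                       // precondition: non-null input

    for (;;)
    {
        char* end = 0;
        const double d = std::strtod(buf, &end);
        if (end == buf)                     // nothing converted: stop
            break;
        out.push_back(static_cast<float>(d));
        buf = end;                          // continue after this number
    }
    return buf;
}
```

Calling out.reserve() with an estimate before the call avoids repeated reallocation when the line holds many values.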

hth
--
jb

(reply address in rot13, unscramble first)
Mar 16 '06 #6
To Col: i don't think the istream is faster than atof, just as cout is
much slower than printf.

To Dietmar Kuehl: but in visual c++ 6.0, i found the atof in math.h
which is included in cmath.

To Jakob Bieling: that's a good idea, i will try it. thanks.

Thank all of you for your help!!!

Mar 16 '06 #7
Leo jay wrote:
To Dietmar Kuehl: but in visual c++ 6.0, i found the atof in math.h
which is included in cmath.


Well, the standard location of 'atof()' is <cstdlib> or <stdlib.h>.
It may also be defined in <math.h> and/or in <cmath> but relying on
this will render your program non-portable.

However, from what you have said, you are probably looking into
different issues than portability with respect to the location of
'atof()'...
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
Mar 16 '06 #8
Leo jay wrote:
To Jakob Bieling: that's a good idea, i will try it. thanks.


I doubt that the processing of multiple floating point values will
make much of a difference. However, here are some hints on what
indeed might make some difference:

- 'atof()' is supposed to cope with really *all* kinds of floating
point values, including hexadecimal representations (these are
used for exact external representation of IEEE floating point
values). If you, at least, don't use these, you can save a little
bit of preprocessing to figure out the format.

- The example values you have shown exhibit only a relatively small
number of significant digits - most of them are just meaningless
zeros. If it is acceptable to have only 'log10(ULONG_MAX)' (i.e.
nine for typical 32 bit machines) significant decimal digits
after the decimal point, you can use an unsigned long to
represent the fraction part of the mantissa. A similar argument
cannot be applied to the integer part unless the floating point
format is known to produce scientific notation if the number of
integer digits exceeds e.g. six.
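
The two hints can be combined into a specialised parser along these lines. This is a sketch under stated assumptions, not a drop-in replacement for atof(): the name parse_simple_float is made up here, the input is assumed to be well formed ([sign] digits [. digits] [e [sign] digits]), hex floats and "inf"/"nan" are not handled, fraction digits beyond the ninth are ignored, and the result is not guaranteed to be correctly rounded.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: parse one decimal number starting at `p` and store
// the position after it in `*end`. Assumes well-formed input (see above).
double parse_simple_float(const char* p, const char** end)
{
    assert(p != 0 && end != 0);
    static const double pow10_table[10] =
        { 1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9 };

    while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r')
        ++p;                                // skip leading whitespace

    bool negative = false;
    if (*p == '-') { negative = true; ++p; }
    else if (*p == '+') { ++p; }

    double int_part = 0.0;                  // integer digits, any count
    while (*p >= '0' && *p <= '9')
        int_part = int_part * 10.0 + (*p++ - '0');

    unsigned long frac = 0;                 // at most nine fraction digits,
    int frac_digits = 0;                    // so a 32 bit unsigned long fits
    if (*p == '.')
    {
        ++p;
        while (*p >= '0' && *p <= '9')
        {
            if (frac_digits < 9)
            {
                frac = frac * 10 + static_cast<unsigned long>(*p - '0');
                ++frac_digits;
            }
            ++p;                            // extra digits are skipped
        }
    }
    double result = int_part + frac / pow10_table[frac_digits];

    if (*p == 'e' || *p == 'E')
    {
        ++p;
        bool exp_negative = false;
        if (*p == '-') { exp_negative = true; ++p; }
        else if (*p == '+') { ++p; }
        int exponent = 0;
        while (*p >= '0' && *p <= '9')
            exponent = exponent * 10 + (*p++ - '0');
        result *= std::pow(10.0, static_cast<double>(exp_negative ? -exponent : exponent));
    }

    *end = p;
    return negative ? -result : result;
}
```

Whether this beats the library routine depends on the compiler and the data, so it is worth measuring against atof() on the real files before adopting it.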

In addition, I actually doubt that 'atof()' is indeed your bottleneck.
Unless the system you are using has really fast I/O, your actual
bottleneck is more than likely I/O rather than 'atof()'. Have you
profiled your application and traced your performance problem to the
use of 'atof()'?
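
Short of a real profiler, one rough way to answer that question is to read the data into memory first and time the conversion on its own, then compare it with the total run time. A sketch, assuming the text is already in a NUL-terminated buffer; the helper name time_parse_only is made up here, and std::clock measures CPU time with limited resolution, so use a large buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <ctime>

// Hypothetical helper: convert every number in the in-memory buffer `buf`
// with strtod, store how many were found in `*count`, and return the CPU
// time spent on the conversion alone, in seconds (no file I/O involved).
double time_parse_only(const char* buf, std::size_t* count)
{
    assert(buf != 0 && count != 0);
    *count = 0;

    const std::clock_t start = std::clock();
    for (;;)
    {
        char* end = 0;
        std::strtod(buf, &end);
        if (end == buf)                     // nothing converted: stop
            break;
        ++*count;
        buf = end;
    }
    const std::clock_t stop = std::clock();

    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

If the time returned here is a small fraction of the program's total run time, the bottleneck is more likely the I/O than the conversion.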
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
Mar 16 '06 #9
Write yourself a function that specializes in string-to-float
conversion. If you have difficulties, grab a book on compilers or parsers
and read about it.

But I doubt a significant performance gain over atof can be achieved,
though it is worthwhile to try.

Also, do an analysis on the pattern of the strings in the file. Some
optimization can sometimes be done by caching the most recent
calculations, etc.
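
One possible reading of the caching idea, as a sketch only: remember the text and value of the most recently converted token and reuse the value when the next token has identical text (runs of "0" in the sample data, for instance). The names LastTokenCache and parse_cached are made up here, and strtod stands in for whatever parser is actually used.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical one-entry cache for the most recently converted token.
struct LastTokenCache
{
    char text[32];                          // copy of the cached token
    double value;                           // its converted value
    bool valid;                             // false until something is stored
    unsigned long hits;                     // how often the cache was used

    LastTokenCache() : value(0.0), valid(false), hits(0) { text[0] = '\0'; }
};

// Convert the NUL-terminated `token`, reusing the cached value when the
// text is identical to the previous token.
double parse_cached(const char* token, LastTokenCache& cache)
{
    assert(token != 0);

    if (cache.valid && std::strcmp(token, cache.text) == 0)
    {
        ++cache.hits;
        return cache.value;                 // same text as last time
    }

    const double v = std::strtod(token, 0);
    const std::size_t length = std::strlen(token);
    if (length < sizeof cache.text)         // only cache tokens that fit
    {
        std::memcpy(cache.text, token, length + 1);
        cache.value = v;
        cache.valid = true;
    }
    return v;
}
```

The string comparison has a cost of its own, so this only pays off if repeated tokens are common. Checking cache.hits on a sample file shows quickly whether they are.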

Regards,
Ben
Mar 17 '06 #10
