Is there any faster way to parse a string into a float number?

Dear all,

I need to parse billions of numbers from a file into float numbers for
further calculation.
I'm not satisfied with the speed of the atof() function on my machine (I'm
using Visual C++ 6).

So I wonder: is there any faster way to parse a string (char array) into
a float number?

Thanks
Leo Jay

Mar 16 '06 #1
Py***********@gmail.com wrote:
> i need to parse billions of numbers from a file into float numbers for
> further calculation.
> i'm not satisfied with the speed of atof() function on my machine(i'm
> using visual c++ 6).
>
> so i wonder is there any faster way to parse string(char array) into
> float number.


Maybe. atof() has to handle all possible formats - thus if you know the
format of your strings you should be able to write a faster parser
specialised for your own needs.
But what exactly is your C++ question? ;)

Mathias
Mar 16 '06 #2
Thanks for your reply.

Any kind of valid float number expression is possible.
Here is an excerpt from one of my files, for example: "8.2109000000e+04
-2.8705000000e+04 0 0 0.800000 0.270000 2.160000 9.9000000000e-01"
So I have to handle all possible formats too. :(
As far as I know, there is a C++ standard header file named cmath, and
there is also a function named atof defined in cmath.
So my C++ question is: how can I speed up the C++ function named atof? Or
is there any decent C++ way to achieve my purpose? ;)

thanks a million.
Leo Jay

Mar 16 '06 #3
Col
If you somehow convert the char* to a C++ string object (i.e. create a
string object by passing it the char*) and pass that object to this
function, it can solve your problem.

float value(string str)
{
istream ob;
ob>>str;
return str;
}
Regards,
Apoorv

Py***********@gmail.com wrote:
> Dear all,
>
> i need to parse billions of numbers from a file into float numbers for
> further calculation.
> i'm not satisfied with the speed of atof() function on my machine(i'm
> using visual c++ 6).
>
> so i wonder is there any faster way to parse string(char array) into
> float number.
>
> Thanks
> Leo Jay


Mar 16 '06 #4
Py***********@gmail.com wrote:
> as far as i know, there is a c++ standard header file named cmath, and
> there is also a function which is defined in cmath named atof.


That's just C's 'atof()' function made available through namespace
'std'. Also, 'atof()' is not in <cmath> but in <cstdlib>.

Also, I doubt that you can squeeze a major performance gain out of
'atof()'.
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
Mar 16 '06 #5
Col <ap***********@gmail.com> wrote:
> Py***********@gmail.com wrote:
>> i'm not satisfied with the speed of atof() function on my machine(i'm
>> using visual c++ 6).
>>
>> so i wonder is there any faster way to parse string(char array) into
>> float number.
>
> float value(string str)
> {
> istream ob;
> ob>>str;
> return str;
> }


I doubt that this will be faster than using atof. Besides the fact
that this will not even work, you will be constructing a temporary
string object and a temporary istream object here. This is not what
makes things faster.
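
For reference, a working version of the stream idea might look like the sketch below. This is only a guess at what the snippet was meant to do: the name value_from_stream and the choice of std::istringstream are assumptions, not something posted in the thread. It still constructs a stream object on every call, so there is no reason to expect it to beat atof.

```cpp
#include <cassert>
#include <sstream>

// Hypothetical helper (not from the thread): convert one token to float
// through a string stream. Returns 0.0f when the text is not a number,
// which mirrors what atof() reports for bad input.
float value_from_stream(const char* text)
{
    assert(text != nullptr);
    std::istringstream in(text);   // a new stream object is built on every call
    float result = 0.0f;
    if (!(in >> result))           // extraction failed: no valid number found
        return 0.0f;
    return result;
}
```

Calling value_from_stream("0.5") yields 0.5f. Any speed comparison with atof would have to be measured on the actual data.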

If the OP has proved atof to be the bottleneck, he should try writing a
faster version himself. Since you need all the functionality atof gives
you, I agree this might be a bit difficult. You might be able to reduce
the number of function calls, though:

You said you have a string "8.2109000000e+04 -2.8705000000e+04 0 0
0.800000 0.270000 2.160000 9.9000000000e-01". So instead of parsing
every float separately and finding the next space, you could have your
function parse all floats at once and store the results in an array. This
way you will 1) decrease the number of parameters you have to pass to the
function, 2) decrease the number of copied return values and 3) be
able to have your function return the pointer *after* the parsed text:

char* atofs (char* buf, std::vector <float>& out)
{
    while we have something left to parse
    {
        parse the float in 'buf'

        if parsing went alright
            out.push_back (the resulting float);
        else
            break;
    }

    return buf;
}

Obviously, the above is pseudo-code. Note that the more floats you
pass at once (i.e. the longer your string for parsing is), the
greater the speed difference might be. But there is no guarantee that
anything will be faster afterwards. Nonetheless, I would take the chance
and just try it.
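
One way the pseudo-code could be made concrete is sketched below. It assumes std::strtod as the per-number parser (the thread does not say which one to use) and takes a const char* rather than a char*. strtod reports where it stopped, so no separate search for the next space is needed.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Parse every whitespace-separated number in 'buf' and append it to 'out'.
// Returns a pointer just past the last number that was parsed, so the
// caller can see where parsing stopped.
const char* atofs(const char* buf, std::vector<float>& out)
{
    assert(buf != nullptr);
    for (;;) {
        char* end = nullptr;
        const double value = std::strtod(buf, &end);  // skips leading whitespace itself
        if (end == buf)   // nothing converted: end of input or unparsable text
            break;
        out.push_back(static_cast<float>(value));
        buf = end;        // continue right after the number just parsed
    }
    return buf;
}
```

Feeding it the sample line from the thread fills the vector with eight floats and returns a pointer to the terminating '\0'. Whether this is actually faster than calling atof per token has to be measured.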

hth
--
jb

(reply address in rot13, unscramble first)
Mar 16 '06 #6
To Col: I don't think istream is faster than atof, just as cout is
much slower than printf.

To Dietmar Kuehl: but in Visual C++ 6.0, I found atof in math.h,
which is included by cmath.

To Jakob Bieling: that's a good idea, I will try it. Thanks.

Thank all of you for your help!!!

Mar 16 '06 #7
Leo jay wrote:
> To Dietmar Kuehl: but in visual c++ 6.0, i found the atof in math.h
> which is included in cmath.


Well, the standard location of 'atof()' is <cstdlib> or <stdlib.h>.
It may also be defined in <math.h> and/or in <cmath> but relying on
this will render your program non-portable.

However, from what you have said, you are probably looking into
different issues than portability with respect to the location of
'atof()'...
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
Mar 16 '06 #8
Leo jay wrote:
> To Jakob Bieling: that's a good idea, i will try it. thanks.


I doubt that the processing of multiple floating point values will
make much of a difference. However, here are some hints on what
indeed might make some difference:

- 'atof()' is supposed to cope with really *all* kinds of floating
point values, including hexadecimal representations (these are
used for exact external representation of IEEE floating point
values). If you, at least, don't use these, you can save a little
bit of preprocessing to figure out the format.

- The example values you have shown exhibit only a relatively small
number of significant digits - most of them are just meaningless
zeros. If it is acceptable to have only 'log10(ULONG_MAX)' (i.e.
nine for typical 32 bit machines) significant decimal digits
after the decimal point, you can use an unsigned long to
represent the fraction part of the mantissa. A similar argument
cannot be applied to the integer part unless the floating point
format is known to produce scientific notation if the number of
integer digits exceeds e.g. six.
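
A minimal sketch of such a specialised parser follows. The name parse_simple_float and its interface are made up for illustration. It assumes plain decimal input only (optional sign, digits, optional fraction, optional exponent), keeps at most nine fraction digits in an unsigned long as suggested above, and accumulates the integer part in a double. It does not round correctly in every case, so results can differ from atof() in the last bits.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical specialised parser: no hex floats, no "inf"/"nan", no locale.
// On success *end points just past the number; if no digits were found,
// *end is set to the original 's' and 0.0 is returned.
double parse_simple_float(const char* s, const char** end)
{
    assert(s != nullptr && end != nullptr);
    const char* const start = s;
    while (*s == ' ' || *s == '\t' || *s == '\r' || *s == '\n') ++s;

    bool negative = false;
    if (*s == '+' || *s == '-') { negative = (*s == '-'); ++s; }

    bool any_digits = false;
    double int_part = 0.0;                       // integer part may be long: use double
    for (; *s >= '0' && *s <= '9'; ++s) {
        int_part = int_part * 10.0 + (*s - '0');
        any_digits = true;
    }

    unsigned long frac = 0, scale = 1;           // fraction digits as an integer
    if (*s == '.') {
        ++s;
        for (int kept = 0; *s >= '0' && *s <= '9'; ++s) {
            any_digits = true;
            if (kept < 9) {                      // nine digits fit in 32 bits
                frac = frac * 10 + static_cast<unsigned long>(*s - '0');
                scale *= 10;
                ++kept;
            }                                    // further digits are skipped
        }
    }
    if (!any_digits) { *end = start; return 0.0; }

    double value = int_part + static_cast<double>(frac) / static_cast<double>(scale);

    if (*s == 'e' || *s == 'E') {
        const char* p = s + 1;
        bool exp_negative = false;
        if (*p == '+' || *p == '-') { exp_negative = (*p == '-'); ++p; }
        if (*p >= '0' && *p <= '9') {            // only accept 'e' when digits follow
            int exponent = 0;
            for (; *p >= '0' && *p <= '9'; ++p)
                if (exponent < 10000) exponent = exponent * 10 + (*p - '0');
            value *= std::pow(10.0, exp_negative ? -exponent : exponent);
            s = p;
        }
    }
    *end = s;
    return negative ? -value : value;
}
```

For "8.2109000000e+04" this gives a value within rounding error of 82109. If bit-exact agreement with atof() matters, this shortcut is not suitable.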

In addition, I actually doubt that 'atof()' is indeed your bottleneck.
Unless the system you are using has really fast I/O, your actual
bottleneck is more than likely I/O rather than 'atof()'. Have you
profiled your application and traced your performance problem to the
use of 'atof()'?
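
Short of a full profiler run, one rough way to check is to time the conversion alone on tokens that are already in memory, and compare that with the time needed just to read the file. The helper below is a sketch under that assumption. The name time_atof is hypothetical, and it uses std::chrono, which needs a much newer compiler than Visual C++ 6.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <string>
#include <vector>

// Time only the atof() calls on tokens already held in memory, so file
// I/O is excluded. Returns elapsed seconds. 'sum' receives the total of
// all parsed values so the compiler cannot discard the loop.
double time_atof(const std::vector<std::string>& tokens, int repeats, double& sum)
{
    assert(repeats > 0);
    sum = 0.0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r)
        for (const std::string& token : tokens)
            sum += std::atof(token.c_str());
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

If this figure is small next to the time spent reading the file, a faster parser will not help much.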
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
Mar 16 '06 #9
Py***********@gmail.com wrote:
> Dear all,
>
> i need to parse billions of numbers from a file into float numbers for
> further calculation.
> i'm not satisfied with the speed of atof() function on my machine(i'm
> using visual c++ 6).
>
> so i wonder is there any faster way to parse string(char array) into
> float number.
>
> Thanks
> Leo Jay


Write yourself a function that specializes in string-to-float
conversion. If you have difficulties, grab a book on compilers or parsers
and read about it.

I doubt a significant performance gain over atof can be achieved,
though it is worthwhile to try.

Also, analyse the patterns of the strings in the file. Some
optimization can sometimes be done by caching the most recent
calculations, etc.

Regards,
Ben
Mar 17 '06 #10
