472,808 Members | 2,005 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,808 software developers and data experts.

ScientificPython - LeastSquareFit diverges

Dear all,

I am trying to fit a powerlaw to a small dataset using
Scientific.Functions.LeastSquares fit.
Unfortunately, the algorithm seems to diverge and throws an
OverflowException.
Here is how I try it:
>>from Scientific.Functions.LeastSquares import leastSquaresFit

data = [
.... (2.5, 589.0, 0.10000000000000001),
.... (7.5, 442.0, 0.10000000000000001),
.... (12.5, 96.0, 0.10000000000000001),
.... (17.5, 36.0, 0.10000000000000001),
.... (22.5, 16.0, 0.10000000000000001),
.... (27.5, 7.0, 0.10000000000000001),
.... (32.5, 6.0, 0.10000000000000001),
.... (37.5, 3.0, 0.10000000000000001),
.... (42.5, 3.0, 0.10000000000000001),
.... (47.5, 1.0, 0.10000000000000001),
.... (52.5, 3.0, 0.10000000000000001),
.... (57.5, 1.0, 0.10000000000000001),
.... (67.5, 1.0, 0.10000000000000001),
.... (77.5, 2.0, 0.10000000000000001),
.... (82.5, 1.0, 0.10000000000000001),
.... (87.5, 2.0, 0.10000000000000001)
.... ]
>>>
def powerlaw((a,b),x) :
.... return a*x**b
....
>>params,chisq = leastSquaresFit(powerlaw,(10,-3),data)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File
"/usr/lib/python2.4/site-packages/Scientific/Functions/LeastSquares.py",
line 72, in leastSquaresFit
next_chi_sq, next_alpha = _chiSquare(model, next_p, data)
File
"/usr/lib/python2.4/site-packages/Scientific/Functions/LeastSquares.py",
line 22, in _chiSquare
f = model(parameters, point[0])
File "<stdin>", line 2, in powerlaw
File
"/usr/lib/python2.4/site-packages/Scientific/Functions/FirstDerivatives.py",
line 182, in __rpow__
return pow(other, self)
File
"/usr/lib/python2.4/site-packages/Scientific/Functions/FirstDerivatives.py",
line 171, in __pow__
raise OverflowError, "Numerical result of pow(%s,%s) out of range."
% (self.value,other.value-1)
OverflowError: Numerical result of pow(2.5,8376.79243687) out of range.
>>>

I added some debugging output in
/usr/lib/python-2.4/site-packages/Scientifc/Functions/LeastSquares.py
in the function _chiSquare that prints the fit parameters during the
Levenberg-Marquardt iteration.
The procedure seems do diverge after the first step:

((10, [1]), (-3, [0, 1]))
[(-67402.311817579117, [1]), (8377.7924368716158, [0, 1])]

Note that I could easily fit the above data using gnuplots internal
fitting procedure. Any idea what is going wrong here? Is it a known
problem? Are there any work arounds or other packages?

Any help is appreciated!

- harold -

Jul 18 '06 #1
4 2320

"Harold Fellermann" <da******@googlemail.comwrote in message
news:11**********************@s13g2000cwa.googlegr oups.com...
I am trying to fit a powerlaw to a small dataset using
Scientific.Functions.LeastSquares fit.
This is a bit off-topic here, and normally better for the scipy list, but I
have some experience with nonlinear least squares.
Unfortunately, the algorithm seems to diverge and throws an
OverflowException.
Assuming the program is okay, this means that either the function
mismatches the data or the initial values are too far off to converge.
Here is how I try it:
>>>from Scientific.Functions.LeastSquares import leastSquaresFit

data = [
... (2.5, 589.0, 0.10000000000000001),
... (7.5, 442.0, 0.10000000000000001),
... (12.5, 96.0, 0.10000000000000001),
I presume that tuples are x, y, some_error_indicator. But the last does
not matter here.
>>>def powerlaw((a,b),x) :
... return a*x**b
Did you try plotting logx versus log y to see if you get approximately a
straight line? If so, the intercept and slope are estimates of loga and b.
...
>>>params,chisq = leastSquaresFit(powerlaw,(10,-3),data)
I presume (10,-3) is the starting (a,b). But, for instance 10*7.5**-3 =
..02, which has no relation to 442, whereas, for instance, 1000*7.5-.75 =
221, which is in the ballpark, at least. So (a,b)=(1000, -.75) might have
a chance.

Terry Jan Reedy

Jul 18 '06 #2
On 18.07.2006, at 15:59, Harold Fellermann wrote:
>>>def powerlaw((a,b),x) :
... return a*x**b
Fitting power laws is a tricky business, you need a pretty good
initial guess to get convergence.
Note that I could easily fit the above data using gnuplots internal
fitting procedure. Any idea what is going wrong here? Is it a known
problem? Are there any work arounds or other packages?
My suggestion is to fit, at least as a first step, the logarithms of
your data points:

import Numeric as N

def powerlaw_log((a, b), x) :
return N.log(a) + b*N.log(x)

params1, chisq = leastSquaresFit(powerlaw_log, (10., -3.),
[(x, N.log(y)) for x, y, sigma in
data])
You can then use those parameters as starting values for fitting your
original problem:

params2, chisq = leastSquaresFit(powerlaw, params1, data)

Doing this for your data yields:

params1: [9469.9675999067185, -2.0881423620750521]

params2: [1591.4025775162165, -1.0112284948049179]

The big difference between the two fits is a further indicator for a
stability problem. I would trust the first set more than the second one.

As a general rule, the model to be fitted should be a smoothly
varying function of the parameters, and the same should be true for
the derivatives.

The second general rule is never to trust a non-linear fit algorithm
blindly. Look at your data first, see if the model can be a good fit,
and play with some paramater values to get a feeling for how they
influence the fit. Plotting your data set, it is immediately clear
that the first point ruins any nice power law behaviour. You might
thus prefer to do the fit without the first point, and you will get a
much better defined exponent:

params1: [31363.301954929859, -2.4047303053979046]
params2: [182522.2346197216, -2.9893640209815757]

Plotting the models corresponding to these two sets together with the
data, you will see that everything coincides well for large x values,
meaning that the first two points make all the difference - another
pointer towards a lack of stability in the fit.

Konrad.
--
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: hi****@cnrs-orleans.fr
---------------------------------------------------------------------
Jul 18 '06 #3
Thanks for your advices, Terry and Konrad,

using the linear fit as initial condition for the pawerlow fit works
pretty well for my data.
(I already had the two calculations but performed them vice versa ...
:-) Anyway, I had
the impression that the leastSquaresFit in Scientific Python is an
implementation of
the Levenberg Marquardt algorithm as it is presented in the Numerical
Recipes. Accoring
to reviews, this algorithm is not famous for its stability
(e.g. http://www.stanford.edu/class/cme302/wnnr/nr.html). Better
implementations
are out there (e.g. http://www.ics.forth.gr/~lourakis/levmar/). Are
there any plans to
improve the SciPy algorithm? Would it be a welcome contribution to
SciPy to work
this part out?

- harold -

Jul 19 '06 #4
On Jul 19, 2006, at 16:53, Harold Fellermann wrote:
:-) Anyway, I had
the impression that the leastSquaresFit in Scientific Python is an
implementation of
the Levenberg Marquardt algorithm as it is presented in the Numerical
Recipes.
True.
Accoring
to reviews, this algorithm is not famous for its stability
(e.g. http://www.stanford.edu/class/cme302/wnnr/nr.html). Better
implementations
are out there (e.g. http://www.ics.forth.gr/~lourakis/levmar/). Are
there any plans to
improve the SciPy algorithm? Would it be a welcome contribution to
SciPy to work
this part out?
Yes, definitely. And no, I have no plans to do it myself any time
soon. The current implementation has always been sufficient for my
needs, and time is scarce...

BTW, ScientificPython (http://dirac.cnrs-orleans.fr/
ScientificPython/) is not the same thing as SciPy (http://
www.scipy.org/). Both are scientific libraries for Python, but their
focus is different: ScientificPython aims at providing pythonic
modules for scientific computing, whereas SciPy's objective is to
provide Python interfaces to the large pool of scientific libraries
from the Fortran/C/C++ world. Scientific users of Python should
probably have both of them installed.

Konrad.
--
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: hi****@cnrs-orleans.fr
---------------------------------------------------------------------
Jul 19 '06 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: Xcal | last post by:
Anyone knows how to make a Gaussian fit to a histogram data using Python, or where I can find a library that helps me in this task? Tks everyone... Paxcal
4
by: sdhyok | last post by:
Hi, I am trying to build up a system handling time series data a lot. Do you know any well-designed python class specially for time series data? Thanks in advance. Shin, Daehyok
17
by: Just | last post by:
While googling for a non-linear equation solver, I found Math::Polynomial::Solve in CPAN. It seems a great little module, except it's not Python... I'm especially looking for its poly_root()...
0
by: Zhuanshi He | last post by:
I try to compile ScientificPython-2.4.9 (http://starship.python.net/~hinsen/ScientificPython/) under Windows XP cygwin environment using python 2.3.3 ,and gcc (GCC) 3.3.3 (cygwin special). The...
109
by: Andrew Thompson | last post by:
It seems most people get there JS off web sites, which is entirely logical. But it is also a great pity since most of that code is of such poor quality. I was looking through the JS FAQ for any...
9
by: skilpat | last post by:
I am going to be doing a lot of work with large data sets stored in various netCDF files, and after checking out the alternatives, I would really like to go with SciPy. The problem is that SciPy...
8
by: Russ | last post by:
Does it ever make sense to derive a class from a basic type such as float or int? Suppose, for example, that I want to create a class for physical scalars with units. I thought about deriving from...
43
by: parallelpython | last post by:
Has anybody tried to run parallel python applications? It appears that if your application is computation-bound using 'thread' or 'threading' modules will not get you any speedup. That is because...
17
by: pyramid | last post by:
Hello I am working on one of my lab for this week, which calculates the approximate value of pi. Listed below is the actual problem, which I am posting here, so that you can see the different...
0
by: kcodez | last post by:
As a H5 game development enthusiast, I recently wrote a very interesting little game - Toy Claw ((http://claw.kjeek.com/))。Here I will summarize and share the development experience here, and hope it...
2
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Sept 2023 starting at 18:00 UK time (6PM UTC+1) and finishing at about 19:15 (7.15PM) The start time is equivalent to 19:00 (7PM) in Central...
0
by: Taofi | last post by:
I try to insert a new record but the error message says the number of query names and destination fields are not the same This are my field names ID, Budgeted, Actual, Status and Differences ...
14
DJRhino1175
by: DJRhino1175 | last post by:
When I run this code I get an error, its Run-time error# 424 Object required...This is my first attempt at doing something like this. I test the entire code and it worked until I added this - If...
5
by: DJRhino | last post by:
Private Sub CboDrawingID_BeforeUpdate(Cancel As Integer) If = 310029923 Or 310030138 Or 310030152 Or 310030346 Or 310030348 Or _ 310030356 Or 310030359 Or 310030362 Or...
0
by: lllomh | last post by:
Define the method first this.state = { buttonBackgroundColor: 'green', isBlinking: false, // A new status is added to identify whether the button is blinking or not } autoStart=()=>{
0
by: lllomh | last post by:
How does React native implement an English player?
0
by: Mushico | last post by:
How to calculate date of retirement from date of birth
2
by: DJRhino | last post by:
Was curious if anyone else was having this same issue or not.... I was just Up/Down graded to windows 11 and now my access combo boxes are not acting right. With win 10 I could start typing...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.