# Robust statistics and optimization from Python

I use Python to generate a huge amount of data in a .csv file, which I then process using Excel. In particular, I use Excel's Solver to solve a number of non-linear equations, and then regress the results of hundreds of calls to Solver against a set of known values, enabling me to calibrate my model. This is a pain: I'd much rather perform all the computations in Python and improve on Excel's regression as well.

Questions:

1. Is there a way to perform (or make a call to) a non-linear optimization from Python?
2. Do Python packages for robust statistics (robust regression in particular) exist? If so, which one would you recommend?

Thanks, as always, in advance for the guidance.

Thomas Philips

Aug 29 '05 #1
tk****@hotmail.com wrote:

> I use Python to generate a huge amount of data in a .csv file, which I then process using Excel. In particular, I use Excel's Solver to solve a number of non-linear equations, and then regress the results of hundreds of calls to Solver against a set of known values, enabling me to calibrate my model. This is a pain: I'd much rather perform all the computations in Python and improve on Excel's regression as well.
>
> Questions:
>
> 1. Is there a way to perform (or make a call to) a non-linear optimization from Python?

Look at scipy.

http://www.scipy.org

    In : from scipy import optimize
    In : optimize?
    ....
    Optimization Tools
    ==================

    A collection of general-purpose optimization routines.

      fmin          -- Nelder-Mead Simplex algorithm
                       (uses only function calls)
      fmin_powell   -- Powell's (modified) level set method
                       (uses only function calls)
      fmin_cg       -- Non-linear (Polak-Ribiere) conjugate gradient algorithm
                       (can use function and gradient).
      fmin_bfgs     -- Quasi-Newton method
                       (can use function and gradient)
      fmin_ncg      -- Line-search Newton Conjugate Gradient
                       (can use function, gradient and hessian).
      leastsq       -- Minimize the sum of squares of M equations in
                       N unknowns given a starting estimate.

    Constrained Optimizers (multivariate)

      fmin_l_bfgs_b -- Zhu, Byrd, and Nocedal's L-BFGS-B constrained optimizer
                       (if you use this please quote their papers -- see help)
      fmin_tnc      -- Truncated Newton Code originally written by Stephen Nash
                       and adapted to C by Jean-Sebastien Roy.
      fmin_cobyla   -- Constrained Optimization BY Linear Approximation

> 2. Do Python packages for robust statistics (robust regression in particular) exist? If so, which one would you recommend?

Offhand, I can't think of any, but it's easy enough to do maximum likelihood with Laplacians and the functions above. If you find suitable FORTRAN or C code that implements a particular "robust" algorithm, it can probably be wrapped for scipy relatively easily.

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Aug 29 '05 #2
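The "maximum likelihood with Laplacians" suggestion above can be sketched roughly as follows. With Laplacian (double-exponential) errors, the negative log-likelihood is, up to constants, the sum of absolute residuals, so minimizing it gives a least-absolute-deviations fit that is less sensitive to outliers than least squares. The exponential model, the synthetic data, and the names `model`, `residuals`, and `laplace_nll` are assumptions made up for this illustration; only `optimize.leastsq` and `optimize.fmin` come from the listing above.

```python
import numpy as np
from scipy import optimize


def model(params, x):
    """Hypothetical model for illustration only: y = a * exp(b * x)."""
    a, b = params
    return a * np.exp(b * x)


def residuals(params, x, y):
    """Residual vector, in the form optimize.leastsq expects."""
    return y - model(params, x)


def laplace_nll(params, x, y):
    """Negative log-likelihood under Laplacian errors, up to constants.

    For a fixed scale this is just the sum of absolute residuals.
    """
    return np.abs(residuals(params, x, y)).sum()


# Synthetic data (an assumption): exact model values plus one gross outlier.
true_params = np.array([2.0, 0.5])
x = np.linspace(0.0, 2.0, 21)
y = model(true_params, x)
y[5] += 10.0

# Ordinary least squares first; the outlier pulls this fit away from the truth.
start = np.array([1.0, 1.0])
ls_fit, ier = optimize.leastsq(residuals, start, args=(x, y))

# Nelder-Mead needs only function values, so the non-smooth absolute-value
# objective is acceptable. The least-squares result is used as the start.
lad_fit = optimize.fmin(laplace_nll, ls_fit, args=(x, y),
                        xtol=1e-8, ftol=1e-8,
                        maxiter=2000, maxfun=4000, disp=False)

print("least squares fit:", ls_fit)
print("Laplacian ML fit: ", lad_fit)
```

Nelder-Mead is chosen here because the objective is not differentiable where a residual is zero; the gradient-based routines in the listing would need a smoothed objective instead.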

Robert Kern wrote:

> If you find suitable FORTRAN or C code that implements a particular "robust" algorithm, it can probably be wrapped for scipy relatively easily.

An alternative would be to call R (a free statistical package) from Python, using something like the R/SPlus - Python Interface at http://www.omegahat.org/RSPython/. Many statistical algorithms, including those for robust statistics, have been implemented in R, usually by wrapping C or Fortran 77 code.

Aug 29 '05 #3

Use R. It's pretty high-end, and there is an interface for Python.

Aug 29 '05 #4

be*******@aol.com wrote:

> Robert Kern wrote:
>
> > If you find suitable FORTRAN or C code that implements a particular "robust" algorithm, it can probably be wrapped for scipy relatively easily.
>
> An alternative would be to call R (a free statistical package) from Python, using something like the R/SPlus - Python Interface at http://www.omegahat.org/RSPython/.

Unless you really want to call Python from R (as opposed to calling R from Python), I strongly suggest that you use RPy (http://rpy.sf.net) rather than RSPython. RPy is much easier to install and use, and far less buggy, than RSPython.

> Many statistical algorithms, including those for robust statistics, have been implemented in R, usually by wrapping C or Fortran 77 code.

Yup, and if you really don't like the extra dependency or extra memory requirements of R and RPy, it is often possible to port the R code back to Python (or Numeric Python), with some effort.

Tim C

Aug 29 '05 #5
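To give a feel for what porting a robust routine to plain Python might involve, here is a minimal sketch of Huber M-estimation for a straight line, fitted by iteratively reweighted least squares with a MAD-based scale. It is similar in spirit to robust regression routines available in R, but it is not a port of any particular R function. The function names, the tuning constant `k=1.345`, and the test data are assumptions chosen for illustration, and only the standard library is used.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Closed-form weighted least squares for y = a + b * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def huber_line_fit(x, y, k=1.345, iterations=50, tol=1e-10):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    a, b = weighted_line_fit(x, y, [1.0] * len(x))  # start from plain OLS
    for _ in range(iterations):
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for normal errors.
        scale = median(abs(r) for r in resid) / 0.6745
        if scale == 0.0:
            break  # at least half the points are fitted exactly
        cutoff = k * scale
        # Huber weights: 1 inside the cutoff, shrinking as cutoff/|r| outside.
        w = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in resid]
        a_new, b_new = weighted_line_fit(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


# Made-up data: y = 1 + 2x plus small fixed perturbations and one outlier.
noise = [0.1, -0.2, 0.15, -0.05, 0.0, 0.0, 0.2, -0.1, 0.05, -0.15, 0.1]
x = [float(i) for i in range(11)]
y = [1.0 + 2.0 * xi + ni for xi, ni in zip(x, noise)]
y[5] += 30.0  # gross outlier in the middle of the x range

ols_a, ols_b = weighted_line_fit(x, y, [1.0] * len(x))
rob_a, rob_b = huber_line_fit(x, y)
print("OLS intercept, slope:  ", ols_a, ols_b)
print("Huber intercept, slope:", rob_a, rob_b)
```

This kind of estimator guards against outliers in `y` only; a bad point at an extreme `x` value can still dominate the fit, which is one reason the established R implementations may be preferable for serious work.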

