Bytes | Software Development & Data Engineering Community

Questions about mathematical and statistical functionality in Python

Greetings Pythoners!

I hope you'll indulge an ignorant outsider. I work at a financial software
firm, and the tool I currently use for my research is R, a software
environment for statistical computing and graphics. R is designed with
matrix manipulation in mind, and it's very easy to do regression and time
series modeling, and to plot the results and test hypotheses. The kinds of
functionality we rely on the most are standard and robust versions of
regression and principal component / factor analysis, Bayesian methods such
as Gibbs sampling and shrinkage, and optimization by linear, quadratic,
Newtonian / nonlinear, and genetic programming; frequently used graphics
include QQ plots and histograms. In R, these procedures are all available
as functions (some of them are in auxiliary libraries that don't come with
the standard distribution, but are easily downloaded from a central
repository).

For a variety of reasons, the research group is considering adopting Python.
Naturally, I am curious about the mathematical, statistical, and graphical
functionality available in Python. Do any of you out there use Python in
financial research, or other intense mathematical/statistical computation?
Can you compare working in Python with working in a package like R or S-Plus
or Matlab, etc.? Which of the procedures I mentioned above are available in
Python? I appreciate any insight you can provide. Thanks!

-- TMK --
212-460-5430 home
917-656-5351 cell
Jun 14 '07 #1
I'd look at the following modules:

matplotlib - http://matplotlib.sourceforge.net/
numpy - http://numpy.scipy.org/

Finally, this website lists other resources: http://www.astro.cornell.edu/staff/loredo/statpy/
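
For a flavour of what numpy offers for the regression and principal component work mentioned in the question, here is a minimal sketch. The data are synthetic and the variable names are made up for illustration; it covers only ordinary least squares and a plain SVD-based PCA, not the robust variants.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Synthetic data (an assumption for illustration):
# y = 1.5 + 2.0*x1 - 0.5*x2 + small Gaussian noise
n = 500
X = rng.normal(size=(n, 2))
y = 1.5 + X @ np.array([2.0, -0.5]) + rng.normal(scale=0.1, size=n)

# Ordinary least squares: prepend an intercept column and solve min ||A b - y||
A = np.column_stack([np.ones(n), X])
beta, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)

# Principal components via SVD of the centered data matrix
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_variance = s**2 / (n - 1)  # eigenvalues of the sample covariance
components = Vt                      # rows are the principal directions
scores = Xc @ Vt.T                   # data projected onto the components

print("OLS coefficients (intercept, x1, x2):", beta)
print("explained variance per component:", explained_variance)
```

The estimated coefficients should land close to the values used to generate the data, and the explained variances sum to the total sample variance of X.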

Mike

Jun 14 '07 #2
I use both R and Python for my work. I think R is probably better for
most of the stuff you are mentioning. I do any sort of heavy
lifting--database queries/tabulation/aggregation in Python and load the
resulting data frames into R for analysis and graphics.
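
The split described above (query and aggregate in Python, analyse in R) might look roughly like this. It is only a sketch under stated assumptions: the trades table, its columns, and the CSV hand-off are hypothetical, and only the standard library is used.

```python
import csv
import io
import sqlite3

# Hypothetical trades table, built in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (ticker TEXT, qty INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [("AAA", 100, 10.0), ("AAA", 50, 11.0), ("BBB", 200, 5.0)],
)

# Aggregate in SQL: one row per ticker with total quantity and notional value.
rows = conn.execute(
    "SELECT ticker, SUM(qty) AS total_qty, SUM(qty * price) AS notional "
    "FROM trades GROUP BY ticker ORDER BY ticker"
).fetchall()
conn.close()

# Write a CSV with a header row. An in-memory buffer is used here; swap it for
# open("summary.csv", "w", newline="") to produce a file on disk.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["ticker", "total_qty", "notional"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

On the R side, a file written this way can be loaded as a data frame with read.csv().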
--
Michael Hoffman
Jun 14 '07 #3
I would second that. It is not either/or. Use Python, including Numpy
and matplotlib and packages from SciPy, for some things, and R for
others. And you can even embed R in Python using RPy - see
http://rpy.sourceforge.net/
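
As a rough idea of the Python side, the histogram and QQ plot from the original question can be sketched with Numpy and matplotlib. The sample is made up, the output file name is arbitrary, and the theoretical quantiles come from the standard library's statistics.NormalDist (Python 3.8 or later) to keep dependencies small.

```python
import statistics

import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
sample = rng.normal(loc=0.0, scale=1.0, size=300)  # made-up data

# Normal QQ plot: sorted sample against standard-normal quantiles evaluated
# at the plotting positions (i - 0.5) / n.
n = sample.size
probs = (np.arange(1, n + 1) - 0.5) / n
std_normal = statistics.NormalDist()
theoretical = np.array([std_normal.inv_cdf(p) for p in probs])
ordered = np.sort(sample)

fig, (ax_hist, ax_qq) = plt.subplots(1, 2, figsize=(9, 4))
ax_hist.hist(sample, bins=20)
ax_hist.set_title("Histogram")
ax_qq.scatter(theoretical, ordered, s=8)
ax_qq.set_xlabel("Theoretical quantiles")
ax_qq.set_ylabel("Sample quantiles")
ax_qq.set_title("Normal QQ plot")
fig.savefig("hist_qq.png")
```

If the sample is close to normal, the points in the right-hand panel fall near a straight line.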

We use the combination of Python, Numpy (actually, the older Numeric
Python package, but soon to be converted to Numpy), RPy and R in our
NetEpi Analysis project - exploratory epidemiological analysis of large
data sets - see http://sourceforge.net/projects/netepi - and it is a
good combination - Python for the Web interface, data manipulation and
data heavy-lifting, and for some of the more elementary statistics, and
R for more involved statistical analysis and graphics (with the option
of using matplotlib or other Python-based graphics packages for some
tasks if we wish). The main thing to remember, though, is that indexing
is zero-based in Python and 1-based in R...
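
A small illustration of that indexing difference, with the R equivalents shown in comments. The helper function is hypothetical and exists only to make the off-by-one conversion explicit.

```python
# R:      x <- c(10, 20, 30, 40); x[1] is 10; x[2:3] is 20 30 (both ends inclusive)
# Python: the first element is x[0], and slices exclude the stop index.
x = [10, 20, 30, 40]

first = x[0]     # R: x[1]
middle = x[1:3]  # R: x[2:3]
last = x[-1]     # R: x[length(x)]; in R, x[-1] instead drops the first element


def r_index_to_python(i: int) -> int:
    """Convert a 1-based R index to a 0-based Python index (hypothetical helper)."""
    if i < 1:
        raise ValueError("R positional indices start at 1")
    return i - 1


print(first, middle, last, x[r_index_to_python(4)])
```

Negative indices are a second trap: they count from the end in Python but exclude elements in R.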

Tim C
Jun 14 '07 #4
Thirded. I use R, Python, and Matlab along with other languages (I hate Pipeline
Pilot) in my work, and from what I've seen nothing can compare with R when it
comes to stats. I love R, from its brilliant CRAN system (PyPI needs serious
work to be considered in the same class as CPAN et al.) to its delicious Emacs
integration.

I just wish there was a way to distribute R packages without requiring the
user to separately install R.

In a similar vein, I wish there was a reasonable Free Software equivalent to
Spotfire. The closest I've found (and they're nowhere near as good) are
Orange (http://www.ailab.si/orange) and WEKA
(http://www.cs.waikato.ac.nz/ml/weka/). Orange is written in Python, but it's
tied to Qt 2.x, as the 3.x series was not available on Windows under the GPL.
Josh Gilbert
Jun 15 '07 #5
