OLAP and pivot tables

George Sakkis

After a brief search, I didn't find any Python package related to OLAP
and pivot tables. Did I miss anything? To be more precise, I'm not so
interested in a full-blown OLAP server with an RDBMS backend, but
rather a pythonic API for constructing datacubes in memory, slicing and
dicing them, drilling down or up dimensions and exposing them in some
suitable form to a presentation layer. I've hacked a first cut of a
pivot table implementation and an XHTML generator that produces
hierarchical HTML tables, but it's not particularly general or easily
extensible so far. Is there any interest at all in a pythonic version
of something like JOLAP or XMLA?
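
To make the kind of API in question concrete, here is a minimal sketch of an in-memory pivot with slicing and drill-down. It is not the implementation mentioned above; the names `pivot` and `slice_records` and the sample `sales` data are assumptions made up for illustration.

```python
from collections import defaultdict


def pivot(records, rows, cols, value, aggregate=sum):
    """Group dict records by the `rows` and `cols` fields and aggregate `value`.

    Returns a dict mapping (row_key, col_key) tuples to aggregated values.
    """
    cells = defaultdict(list)
    for rec in records:
        row_key = tuple(rec[f] for f in rows)
        col_key = tuple(rec[f] for f in cols)
        cells[row_key, col_key].append(rec[value])
    return {key: aggregate(vals) for key, vals in cells.items()}


def slice_records(records, **fixed):
    """Slice: keep only records whose fields equal the fixed dimension values."""
    return [r for r in records if all(r[k] == v for k, v in fixed.items())]


# Hypothetical fact table: one dict per record.
sales = [
    {"year": 2005, "region": "EU", "product": "A", "units": 10},
    {"year": 2005, "region": "US", "product": "A", "units": 7},
    {"year": 2006, "region": "EU", "product": "B", "units": 4},
    {"year": 2006, "region": "EU", "product": "A", "units": 3},
]

# Roll-up: units by year (rows) and region (columns).
by_year_region = pivot(sales, rows=["year"], cols=["region"], value="units")

# Drill-down: add the product dimension to the rows.
by_year_product = pivot(sales, rows=["year", "product"], cols=["region"], value="units")

# Slice: fix region to "EU", then pivot year against product.
eu_only = pivot(slice_records(sales, region="EU"),
                rows=["year"], cols=["product"], value="units")
```

Drilling up or down here is just a matter of passing fewer or more fields in `rows` or `cols`; a real library would also need dimension hierarchies and a presentation layer.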

George

May 26 '06 #1

Ben Stroud

I'd be interested as well. I posted a similar question to the ruby
mailing list a few months ago to no avail. Ideally, someone much more
talented than myself would create an open OLAP library in C that could be
interfaced with dynamic languages easily (I ordered some OLAP books and
started in on this, and decided I was in over my head for now). As far
as free software, all I've been able to find is java-based Mondrian.
Maybe it could serve as a reference implementation for someone.

Cheers,
Ben

May 26 '06 #2

Duncan Smith

I have a few applications that require the generation of large numbers
of contingency tables from a higher-dimensional base table. The
approaches I've tried (Numeric arrays / dictionary-based sparse arrays /
various caching schemes / searches on subset lattices for previously
generated 'super'-tables that can be marginalised from etc.) still
represent major bottlenecks. So, I guess I would be interested.
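
For readers unfamiliar with the idea, the following is a rough sketch of dictionary-based sparse tables combined with a cache that marginalises from the smallest previously generated super-table. It is only an illustration of the general approach described above, not the code from those applications; `contingency`, `marginalise`, `TableCache` and the sample rows are hypothetical names and data.

```python
from collections import Counter


def contingency(rows, variables):
    """Sparse contingency table: count each combination of `variables`."""
    return Counter(tuple(row[v] for v in variables) for row in rows)


def marginalise(table, variables, keep):
    """Sum a sparse table over every variable that is not in `keep`."""
    positions = [variables.index(v) for v in keep]
    out = Counter()
    for key, count in table.items():
        out[tuple(key[p] for p in positions)] += count
    return out


class TableCache:
    """Cache of marginal tables, each derived from a cached super-table."""

    def __init__(self, rows, variables):
        self.variables = tuple(variables)
        # The full base table is always available as a fallback source.
        self.tables = {self.variables: contingency(rows, self.variables)}

    def get(self, keep):
        keep = tuple(keep)
        if keep not in self.tables:
            # Any cached table whose variables include `keep` can be used;
            # pick the one with the fewest non-empty cells to sum over.
            candidates = [vs for vs in self.tables if set(keep) <= set(vs)]
            source = min(candidates, key=lambda vs: len(self.tables[vs]))
            self.tables[keep] = marginalise(self.tables[source], source, keep)
        return self.tables[keep]


# Hypothetical base data with three categorical variables.
rows = [
    {"sex": "F", "smoker": "yes", "age": "old"},
    {"sex": "F", "smoker": "no", "age": "young"},
    {"sex": "M", "smoker": "yes", "age": "old"},
    {"sex": "F", "smoker": "yes", "age": "young"},
]
cache = TableCache(rows, ["sex", "smoker", "age"])
sex_smoker = cache.get(["sex", "smoker"])  # derived from the base table
sex_only = cache.get(["sex"])              # derived from the smaller sex/smoker table
```

The linear scan over cached tables is the simplest possible stand-in for a search on the subset lattice and would itself become a bottleneck with many cached tables.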

Duncan

May 26 '06 #3

Tim Churches

The NetEpi Analysis project (see http://sourceforge.net/projects/netepi),
although not strictly an OLAP or datacube engine, might offer some of
the things you are looking for. It is intended for exploratory
epidemiological analysis of (potentially large) health-related datasets,
but should work with most types of data for which an OLAP engine would
be useful. Underneath there is a vertically-disaggregated,
ordinally-mapped, set-theoretic data selection and summarisation engine,
which is a pompous way of saying that it holds data column-wise in
memory-mapped Numpy (Numeric Python) arrays, and uses some fast
(custom-written) set functions on inverted indexes on the ordinal
positions of column values to select and summarise data (entirely at
run-time, cf most OLAP engines, which rely on a degree of
pre-summarisation along pre-chosen dimensions). It is all Python and
thus has a Python(ic) API, including an SQL-like WHERE clause parser
for data selection (OK, SQL is not Pythonic, but that's just for data
subsetting). It includes quite a few statistical functions and nice
graphics courtesy of R (http://www.r-project.org) (which is embedded via
RPy - http://rpy.sourceforge.net/). Full support for missing values and
weighted datasets is provided (but not full support for survey data with
complex sample designs - that's forthcoming). Currently it works well
with datasets in the 5-10 million row range, but the basic design lends
itself easily to parallelisation if you have bigger datasets, and
preliminary work indicates good speed improvements - something we want
to pursue given all these multi-core CPUs which are now available at
reasonable cost. Be warned that NetEpi Analysis is currently only of
beta quality, and is a bit of a pig to install, on Linux/Unix/Mac OS X
only at present. We hope to be able to release a production-ready Version
1.0 by the end of 2006, possibly with MS-Windows support as well.
However, the core data summarisation/subsetting engine is thought to be
sound (and there are some unit tests to attest to that).
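
As a very rough illustration of the column-wise, inverted-index idea described above, here is a toy sketch in plain Python. It is not the NetEpi Analysis API and does not use memory-mapped arrays or the custom set functions; the class `ColumnStore`, its methods and the sample data are all assumptions invented for this example.

```python
class ColumnStore:
    """Toy column-wise store with an inverted index per column."""

    def __init__(self, columns):
        # columns: dict mapping column name -> list of values (equal lengths)
        self.columns = columns
        self.nrows = len(next(iter(columns.values())))
        # Inverted index: column name -> value -> set of row positions.
        self.index = {}
        for name, values in columns.items():
            positions = {}
            for i, v in enumerate(values):
                positions.setdefault(v, set()).add(i)
            self.index[name] = positions

    def select(self, **criteria):
        """Row positions matching all `column=value` criteria (set intersection)."""
        rows = set(range(self.nrows))
        for name, wanted in criteria.items():
            rows &= self.index[name].get(wanted, set())
        return rows

    def count_by(self, name, rows):
        """Summarise at run time: count selected rows per value of column `name`."""
        counts = {}
        for value, positions in self.index[name].items():
            n = len(positions & rows)
            if n:
                counts[value] = n
        return counts


# Hypothetical dataset held column-wise.
store = ColumnStore({
    "sex": ["F", "M", "F", "F"],
    "state": ["NSW", "NSW", "VIC", "NSW"],
    "age": [34, 51, 29, 62],
})

nsw_women = store.select(sex="F", state="NSW")
women_by_state = store.count_by("state", store.select(sex="F"))
mean_age = sum(store.columns["age"][i] for i in nsw_women) / len(nsw_women)
```

Nothing is pre-summarised here: every selection and count is computed from the indexes on demand, which is the contrast with pre-aggregating OLAP engines drawn above.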

Probably not quite what you were after but I thought it worth a mention.
Please post follow-ups, if any, to the NetEpi mailing list:
http://sourceforge.net/mail/?group_id=123700

Tim C

May 26 '06 #4
