By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
431,827 Members | 2,155 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 431,827 IT Pros & Developers. It's quick & easy.

Access - Crosstab Query Performance

P: 1
Hi,
I have finally been converted to Access as the data set Iím using grew to a ridiculous size. I am analysing one yearsí worth of data and the database itself has grown to be quite large at just over 1GB with a few million rows number of rows. This looks at about 700 stations and for each stations there are many 10,000ís of rows of data, hence the need to switch to a database.

I then also have an excel spreadsheet that contains some reference data for each station that we intend to update over time, hence why I wanted to keep it separate.

My question is regarding performance of the database in general. I am running cross tab queries that can take in excess of 20+minutes to run. If this is normal for a data set of this size then I will have to live with it but if not I would appreciate any suggestions on how to speed things up.

I created a query that combines the two data sets, the raw data and the station key using a relationship, the code of which I have shown below, and this runs relatively quickly (<5 seconds) which was quite encouraging.
Linked Data Query:

Expand|Select|Wrap|Line Numbers
  1. SELECT [2013 Data Private].Hour
  2.    , [2013 Data Private].Dte
  3.    , [2013 Data Private].Unidad
  4.    , [2013 Data Private].[Type Offer]
  5.    , [2013 Data Private].Volume
  6.    , [2013 Data Private].Price
  7.    , [2013 Data Private].OM
  8.    , [Linked MIBEL Codes].Type
  9. FROM [Linked MIBEL Codes] 
  10.    INNER JOIN [2013 Data Private] 
  11.       ON [Linked MIBEL Codes].[UP Code] 
  12.          = [2013 Data Private].Unidad
  13. WHERE ((
  14.    ([2013 Data Private].[Type Offer])="v") 
  15.       AND (([2013 Data Private].OM)="O"));
I then built another query off the first (this is the start of many layers) to be able to quickly filter the data under certain criteria, one query being built to filter price in a range of a min, mid and max scenario. Code again shown below for the min filter (one of three) and whilst performance was worse (<10 seconds) I was still happy with this. Admittedly the code does not look good from below but the purpose of the query is to filter by the 14 station types, each with a different price. As I said I think this part works fine in most but for reference purposes

Expand|Select|Wrap|Line Numbers
  1. SELECT [2013 Data Private].Hour
  2.    , [2013 Data Private].Dte
  3.    , [2013 Data Private].Unidad
  4.    , [2013 Data Private].[Type Offer]
  5.    , [2013 Data Private].Volume, [2013 Data Private].Price
  6.    , [2013 Data Private].OM
  7.    , [Linked MIBEL Codes].Type
  8. FROM [Linked MIBEL Codes] 
  9.    INNER JOIN [2013 Data Private] 
  10.       ON [Linked MIBEL Codes].[UP Code] 
  11.          = [2013 Data Private].Unidad
  12. WHERE ((
  13.    ([2013 Data Private].[Type Offer])="v") 
  14.       AND (([2013 Data Private].Price)<15) 
  15.       AND (([2013 Data Private].OM)="O") 
  16.       AND (([Linked MIBEL Codes].Type)="COAL")) 
  17.    OR ((([2013 Data Private].[Type Offer])="v") 
  18.       AND (([2013 Data Private].Price)<30) 
  19.       AND (([2013 Data Private].OM)="O") 
  20.       AND (([Linked MIBEL Codes].Type)="GAS")) 
  21.    OR ((([2013 Data Private].[Type Offer])="v") 
  22.       AND (([2013 Data Private].Price)<50) 
  23.       AND (([2013 Data Private].OM)="O") 
  24.       AND (([Linked MIBEL Codes].Type)="OIL")) 
  25.    OR ((([2013 Data Private].[Type Offer])="v") 
  26.       AND (([2013 Data Private].Price)<5) 
  27.       AND (([2013 Data Private].OM)="O") 
  28.       AND (([Linked MIBEL Codes].Type)="NUKE")) 
  29.    OR ((([2013 Data Private].[Type Offer])="v") 
  30.       AND (([2013 Data Private].Price)<30) 
  31.       AND (([2013 Data Private].OM)="O") 
  32.       AND (([Linked MIBEL Codes].Type)="Thermal")) 
  33.    OR ((([2013 Data Private].[Type Offer])="v") 
  34.       AND (([2013 Data Private].Price)<15) 
  35.       AND (([2013 Data Private].OM)="O") 
  36.       AND (([Linked MIBEL Codes].Type)="Dom Coal")) 
  37.    OR ((([2013 Data Private].[Type Offer])="v") 
  38.       AND (([2013 Data Private].Price)<5) 
  39.       AND (([2013 Data Private].OM)="O") 
  40.       AND (([Linked MIBEL Codes].Type)="Geothermal")) 
  41.    OR ((([2013 Data Private].[Type Offer])="v") 
  42.       AND (([2013 Data Private].Price)<5) 
  43.       AND (([2013 Data Private].OM)="O") 
  44.       AND (([Linked MIBEL Codes].Type)="Pumping Stor.")) 
  45.    OR ((([2013 Data Private].[Type Offer])="v") 
  46.       AND (([2013 Data Private].Price)<5) 
  47.       AND (([2013 Data Private].OM)="O") 
  48.       AND (([Linked MIBEL Codes].Type)="Hydro")) 
  49.    OR ((([2013 Data Private].[Type Offer])="v") 
  50.       AND (([2013 Data Private].Price)<15) 
  51.       AND (([2013 Data Private].OM)="O") 
  52.       AND (([Linked MIBEL Codes].Type)="N/a")) 
  53.    OR ((([2013 Data Private].[Type Offer])="v") 
  54.       AND (([2013 Data Private].Price)<5) 
  55.       AND (([2013 Data Private].OM)="O") 
  56.       AND (([Linked MIBEL Codes].Type)="Solar PV")) 
  57.    OR ((([2013 Data Private].[Type Offer])="v") 
  58.       AND (([2013 Data Private].Price)<5) 
  59.       AND (([2013 Data Private].OM)="O") 
  60.       AND (([Linked MIBEL Codes].Type)="Solar Ther.")) 
  61.    OR ((([2013 Data Private].[Type Offer])="v") 
  62.       AND (([2013 Data Private].Price)<5) 
  63.       AND (([2013 Data Private].OM)="O") 
  64.       AND (([Linked MIBEL Codes].Type)="Wind")) 
  65.    OR ((([2013 Data Private].[Type Offer])="v") 
  66.       AND (([2013 Data Private].Price)<15) 
  67.       AND (([2013 Data Private].OM)="O") 
  68.       AND (([Linked MIBEL Codes].Type)="UNKNOWN"));
However the problem comes when I begin to build cross tab queries off this final query which contains the filtered results. The first is a simple average of prices across months however this now takes 5 minutes to run on its own for me to be able to get a table that can be exported to excel. In addition I would love for the cross tab query to show a weighted average of price instead of the simple average it does now.
Average Price Crosstab Query:

Expand|Select|Wrap|Line Numbers
  1. TRANSFORM 
  2.    Avg([Price Development Volume Query Min].Price) 
  3.          AS AvgOfPrice
  4.    SELECT [Price Development Volume Query Min].Type
  5.       , [Price Development Volume Query Min].Unidad
  6.       , Avg([Price Development Volume Query Min].Price) 
  7.          AS [Total Of Price]
  8. FROM [Price Development Volume Query Min]
  9. GROUP BY [Price Development Volume Query Min].Type
  10.       , [Price Development Volume Query Min].Unidad
  11. PIVOT Format([Dte],"mmm") 
  12.    In
  13.    ("Jan","Feb","Mrz","Apr"
  14.       ,"Mai","Jun","Jul","Aug"
  15.       ,"Sep","Okt","Nov","Dez");
The final cross tab query is supposed to simply sum up the total volume offered in each hour, for each day and for each station. Whilst I fully appreciate this gives a v.large data set the query takes in excess of 20 minutes to run and often causes access to crash.
Volume Crosstab Query:

Expand|Select|Wrap|Line Numbers
  1. TRANSFORM 
  2.    Sum([Price Development Volume Query Min].Volume) 
  3.       AS [Volume]
  4.    SELECT [Price Development Volume Query Min].Type
  5.       , [Price Development Volume Query Min].Unidad
  6.       , [Price Development Volume Query Min].Date
  7. FROM [Price Development Volume Query Min]
  8. GROUP BY [Price Development Volume Query Min].Type
  9.       , [Price Development Volume Query Min].Unidad
  10.       , [Price Development Volume Query Min].Date
  11. PIVOT [Price Development Volume Query Min].Hour;
Any feedback on how to improve performance, if only slightly, would be greatly appreciated. Am fully expecting that I have built in a lot of excess that has potential to be stripped out
Mar 31 '14 #1
Share this Question
Share on Google+
1 Reply


Expert Mod 2.5K+
P: 2,545
A downside of crosstabs is that Access does not know even what fields it is going to have as a result of the Transform statement until it completely evaluates the underlying Select statement. Having done that, Access then has to sort out the summing and so on under the columns concerned. This can take a considerable amount of time to do, as you are finding.

One factor in your SQL which will cause a deterioration in performance is that you have a complex WHERE clause with multiple text comparisons such as

Expand|Select|Wrap|Line Numbers
  1. ([2013 Data Private].[Type Offer])="v") 
  2.        AND (([2013 Data Private].Price)<15) 
  3.        AND (([2013 Data Private].OM)="O") 
  4.        AND (([Linked MIBEL Codes].Type)="COAL"))
I would suggest simplifying or eliminating these as much as possible, as text comparisons take many times longer to execute than numeric comparisons, and the complex sets of ANDs and ORs will not help this process.

Similarly, GROUP BY statements based on text fields are more time consuming to execute than those based on simpler numeric fields, so if you have numeric code alternatives for your text values you may wish to group on those instead.

Even your Pivot statement is based on a text conversion - from date to a month name - which will also cause a performance hit when compared to pivoting on a month number (such as month([dte]) in place of your format([dte], "mmm")).

Bear in mind as I mentioned that Access cannot even know the names of all the fields resulting from the Transform statement until all the underlying SQL has been executed across all rows of all parts of the underlying SQL, so comparing the performance of a crosstab to its underlying Select statements is not going to give a true picture of what is efficient and what is not.

One final piece of advice is never feed a crosstab into another crosstab. I'm not sure that you are, but if you are I would keep any queries on which you base the final crosstab as SELECT statements with no crosstabbing. Access can really struggle if one crosstab feeds another.

-Stewart
Mar 31 '14 #2

Post your reply

Sign in to post your reply or Sign up for a free account.