Bytes IT Community

Using SQL Server for Data Processing

ml
My employers currently use Access for processing large volumes of data
for reporting and simple modelling, which involves a lot of make
table/update queries etc.

I have been asked to work on a retail forecasting model, but looking at
the spec. the base data tables have approximately 6 million records.

I know this can theoretically be done using Access (Access 97) but in
reality it is extremely slow (even with the necessary tables optimised
with indexes etc), plus the 1GB file limit is going to be a problem.

My question is, what is SQL Server like for data processing (as opposed
to just "serving" data) ? Would this be the recommended option for the
above requirements ?

These models wouldn't have more than a couple of concurrent users on the
whole, but it would be grinding lots of data a lot of the time.

Thanks in advance for your opinions.
Nov 13 '05 #1
8 Replies


ml wrote:
My employers currently use Access for processing large volumes of data
for reporting and simple modelling, which involves a lot of make
table/update queries etc.

I have been asked to work on a retail forecasting model, but looking
at the spec. the base data tables have approximately 6 million
records.
I know this can theoretically be done using Access (Access 97) but in
reality it is extremely slow (even with the necessary tables optimised
with indexes etc), plus the 1GB file limit is going to be a problem.

My question is, what is SQL Server like for data processing (as
opposed to just "serving" data) ? Would this be the recommended
option for the above requirements ?

These models wouldn't have more than a couple of concurrent users on
the whole, but it would be grinding lots of data a lot of the time.

Thanks in advance for your opinions.


It depends. If a good percentage of your slowness is caused by heavy
traffic on the LAN when Access pulls data across to "process" it, then SQL
Server would likely perform a lot better, because all of that I/O would
stay on the server and only the results would be sent over the LAN (if you
do it correctly, that is). If it is "just the processing" then SQL Server
might not do any better and might actually be slower. The extra security,
data integrity, etc., that SQL Server provides comes at the expense of
additional overhead.
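
As a rough illustration (the table and column names here are just made up),
the idea is to send the server a query that returns only the summary you
need, rather than linking the whole table and crunching it in Access:

-- Hypothetical example: aggregate on the server so only summary rows cross the LAN
SELECT StoreID, ProductID,
       SUM(SalesQty)   AS TotalQty,
       SUM(SalesValue) AS TotalValue
FROM   dbo.SalesHistory
GROUP BY StoreID, ProductID;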

Of course, when most places set up a SQL Server they put it on a pretty
impressive box (multiple Xeon processors, tons of RAM, and a good RAID disk
system). If you did that then the processing on SQL Server would be a lot
faster, but that would be due to the hardware, not SQL Server itself.

If you are running this Access app over a network, though, my guess is that
LAN traffic IS a significant part of the problem. You could certainly run
some tests using the free MSDE version of SQL Server that comes with Office
and gauge the results.

--
I don't check the Email account attached
to this message. Send instead to...
RBrandt at Hunter dot com

Nov 13 '05 #2

ml wrote:
My employers currently use Access for processing large volumes of data
for reporting and simple modelling, which involves a lot of make
table/update queries etc.

I have been asked to work on a retail forecasting model, but looking at
the spec. the base data tables have approximately 6 million records.

I know this can theoretically be done using Access (Access 97) but in
reality it is extremely slow (even with the necessary tables optimised
with indexes etc), plus the 1GB file limit is going to be a problem.

My question is, what is SQL Server like for data processing (as opposed
to just "serving" data) ? Would this be the recommended option for the
above requirements ?

These models wouldn't have more than a couple of concurrent users on the
whole, but it would be grinding lots of data a lot of the time.

Thanks in advance for your opinions.


SQL-Server is as good at data processing as the person who writes the
T-SQL which does the data processing. That is, 95% of the time it's
dreadful.

There are some very important questions involved in your project:

1. Do you want the server to grind lots of data, or do you want the
server to hold the data before and after the grinding which occurs
somewhere else, sending and receiving the data when needed? How powerful
is your server? Does it have other jobs, databases, applications? If
this grinding is very demanding consider off-loading it to a dedicated
ball-breaker machine.

2. What technology do you intend to use to grind the data? VBA and its
relative T-SQL are very archaic script languages which are enormously
inefficient and lacking for this type of work. Houdini, Christ and
Muhammad as a team couldn't make these technologies fast, and Bill Gates
has not done so.

3. Who will write the code/script that will do this grinding? I would
suggest to you that a skillful programmer may produce code that is
thousands (yes !REALLY!) of times faster than someone who is not so
skillful.

If your organization does not have the in-house expertise and experience
to deal with these questions, my opinion is that your employers should
invest some money in the project, do a careful evaluation of available
application developers, and hire the best to do the job.

--
Lyle
--
Nov 13 '05 #3

Hi,

Most of the time people try to emulate Access when they upgrade to SQL
Server. These technologies are very different in their respective
designs (Access does all its processing on the client; SQL Server is
client/server), so carrying on as before is like driving your Lada to
work with a BMW sitting in the garage.

That said, just moving to SQL Server without doing any optimisation will
normally bring some improvement in performance. In every instance where
you have data loops within data loops, these can be optimised to be
several hundred times faster, for two reasons: first, T-SQL is a data
manipulation language and is optimised as one; second, the code runs
closer to the data, so commands do not have to pass down the wire and
back again.
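
To give a flavour (table and column names below are just made up), a
typical Access loop that opens a recordset, edits each row and updates it
becomes a single set-based statement in T-SQL:

-- Hypothetical example: one set-based UPDATE instead of looping row by row
UPDATE s
SET    s.ForecastQty = s.BaseQty * f.UpliftFactor
FROM   dbo.SalesBase AS s
       INNER JOIN dbo.Factors AS f
               ON f.ProductID = s.ProductID;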

To get the best performance, a complete rethink of your data structures
is important. The use of indexes is a double-edged sword: lots of
indexes are good for finding things, but bad for updates and inserts.

If you have one specific query that is responsible for most of your
traffic, then the use of a clustered index can improve things a great
deal. For example, if your data is always being searched on the same
field, physically ordering the data by that field can give great
improvements in speed.
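
For instance (hypothetical table and field names), if the big table is
nearly always searched by product:

-- Hypothetical example: physically order the table by the field you always search on
CREATE CLUSTERED INDEX IX_SalesBase_ProductID
ON dbo.SalesBase (ProductID);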

The thing I try to keep in mind when developing in SQL Server is this:
try to keep all the business rules in SQL rather than in the front-end,
and let the back-end do as much of the work as possible. Think about
the possibility of there being several front-ends to the data (Access,
HTTP/ASP/ASP.NET) and make the SQL Server do the work.

Good use of triggers (code run when table actions happen) can reduce
client-side code massively.
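
A trivial sketch of what I mean (names are invented) -- stamping rows with
the time they were last changed in the back-end, so no front-end has to
remember to do it:

-- Hypothetical example: a trigger that maintains a LastUpdated column
CREATE TRIGGER trg_SalesBase_Update
ON dbo.SalesBase
AFTER UPDATE
AS
BEGIN
    UPDATE s
    SET    s.LastUpdated = GETDATE()
    FROM   dbo.SalesBase AS s
           INNER JOIN inserted AS i
                   ON i.SalesID = s.SalesID;
END;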

If I don't get performance improvements of 2-3 times overall when
upgrading from Access to SQL Server, then I feel I have failed, as my
main selling points of the upgrade are reliability and performance.

If you have specific questions on moving to SQL Server, please post
here.

Regards

Alex

Nov 13 '05 #4

Lyle Fairfield <ly******@yahoo.ca> wrote in
news:Kh*********************@read2.cgocable.net:
ml wrote:
My employers currently use Access for processing large
volumes of data for reporting and simple modelling, which
involves a lot of make table/update queries etc.

I have been asked to work on a retail forecasting model, but
looking at the spec. the base data tables have approximately
6 million records.

I know this can theoretically be done using Access (Access
97) but in reality it is extremely slow (even with the
necessary tables optimised with indexes etc), plus the 1GB
file limit is going to be a problem.

My question is, what is SQL Server like for data processing
(as opposed to just "serving" data) ? Would this be the
recommended option for the above requirements ?

These models wouldn't have more than a couple of concurrent
users on the whole, but it would be grinding lots of data a
lot of the time.

Thanks in advance for your opinions.


SQL-Server is as good at data processing as the person who
writes the T-SQL which does the data processing. That is, 95%
of the time it's dreadful.

There are some very important questions involved in your
project:

1. Do you want the server to grind lots of data, or do you
want the server to hold the data before and after the grinding
which occurs somewhere else, sending and receiving the data
when needed? How powerful is your server? Does it have other
jobs, databases, applications? If this grinding is very
demanding consider off-loading it to a dedicated ball-breaker
machine.

2. What technology do you intend to use to grind the data? VBA
and its relative T-SQL are very archaic script languages which
are enormously inefficient and lacking for this type of work.
Houdini, Christ and Muhammad as a team couldn't make these
technologies fast, and Bill Gates has not done so.

3. Who will write the code/script that will do this grinding?
I would suggest to you that a skillful programmer may produce
code that is thousands (yes !REALLY!) of times faster than
someone who is not so skillful.

If your organization does not have the in-house expertise and
experience to deal with these questions, my opinion is that
your employers should invest some money in the project, do a
careful evaluation of available application developers, and
hire the best to do the job.

--
Lyle
--


Cheers and applause.

--
Bob Quintal

PA is y I've altered my email address.
Nov 13 '05 #5

> 2. What technology do you intend to use to grind the data? VBA and its
relative T-SQL are very archaic script languages which are enormously
inefficient and lacking for this type of work. Houdini, Christ and
Muhammad as a team couldn't make these technologies fast, and Bill Gates
has not done so.


Hi Lyle

What technology is efficient for this type of work?

Thanks - Joe
Nov 13 '05 #6

"Joe Black" <jo********@hotmail.com> wrote in news:diL8e.19690$1S4.1957089
@news.xtra.co.nz:
2. What technology do you intend to use to grind the data? VBA and its
relative T-SQL are very archaic script languages which are enormously
inefficient and lacking for this type of work. Houdini, Christ and
Muhammad as a team couldn't make these technologies fast, and Bill Gates
has not done so.


Hi Lyle

What technology is efficient for this type of work?

Thanks - Joe


In my opinion ... something with advanced array handling techniques. I
interpret data grinding to mean repeated scans through data. Traversing
recordsets is extremely slow. Traversing arrays is extremely fast. In
addition I would want to be able to push, pop, shift, unshift, slice,
splice and sort arrays.
If the grinding is to involve floating point calculations ... something
that will compile to native code.
If the grinding is to include text manipulation something with a strong
Regular Expression or similar capability.
And if I could find something that compiled my loops to a linear outcome,
that would be great too.

--
Lyle
--
From ADO28.chm

Deprecated Components
Each of the following components is considered obsolete. While these
components are still supported in this release of the Microsoft® Data
Access Components (MDAC), they may be removed in the future. When writing
new applications, you should avoid using these deprecated components. When
modifying existing applications, you are strongly encouraged to remove any
dependency on these components.

ODBC Provider (MSDASQL)
You are strongly encouraged to use one of the native OLE DB Providers
instead of the Microsoft Open Database Connectivity (ODBC) Provider. Native
OLE DB Providers provide better application stability and performance.
Furthermore, native OLE DB Providers will be supported in the future,
whereas MSDASQL will not have any new features added to it, will not be
available on 64-bit, and will not be accessible from the OLE DB NET Data
Provider.

Nov 13 '05 #7

ml
Thanks all for your replies.

I suppose my first concern about this new model is the size of the base
data, i.e. it will definitely go over the 1GB limit in Access 97, whereas
SQL Server is able to deal with massive datasets. I suppose one way to
deal with the limit would be to have each table in a separate database,
but this is hardly ideal.

Until I become more fluent in T-SQL (I have a reasonable understanding of
SQL from within Access), what kind of performance change should I expect
to see by linking the tables on a SQL Server into Access via ODBC or
pass-through queries and performing the updates etc. from within Access?
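
For instance (table and column names made up, just to show what I mean),
I'm picturing a pass-through query that sends the T-SQL straight to the
server, something like:

-- Hypothetical example: the kind of UPDATE I'd put in a pass-through query,
-- so the work happens on the server instead of pulling 6m rows into Access
UPDATE dbo.BaseData
SET    AdjustedValue = RawValue * 1.05
WHERE  PeriodID = 200501;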

Even performing updates on a table of 6m records on a hard drive takes a
long time (and that's with the indexes dropped for the inserts/updates).

I don't think the actual resulting dataset (the 'output' data) would be
huge, but as I said the base tables are pretty hefty.

With regard to the calculations being performed, they will probably be
numerical factors and manipulation applied to the base data, which in
turn would be aggregated.

As you can probably tell I haven't used SQL Server (although this is
something I am keen to learn), so don't be too harsh if these seem like
somewhat ignorant questions!

Unfortunately getting a dedicated developer isn't an option due to
current budget constraints.

I can't really give you details of the server setup because there
currently isn't one ! If this is feasible, I will put a case together
and try and get a dedicated box for the processing.

Cheers, and please feel free to post any further thoughts.
Nov 13 '05 #8

rkc
ml wrote:
Cheers, and please feel free to post any further thoughts.


Have you considered looking at existing retail forecasting
software?
Nov 13 '05 #9
