Bytes IT Community

Advice on middleware products for TRUE Scaling out of SQL Server

We have a 3-month-old quad-processor/dual-core server running SQL
Server 2005, and already it is getting close to hitting the CPU wall.
An 8-way CPU box is prohibitively expensive and out of the question. I
am looking desperately for a way to TRULY scale out SQL Server...in the
same way that IIS can be scaled out via App Center.

The "in the box" solution for SQL Server 2005 scaling out is the DMV.
Unfortunately this solution makes the system less available rather than
more (one server outage takes them all out for that table) and requires
serious rearchitecting of the software to use. Contrast this to IIS
and AppCenter, where each added server makes the system more available
and requires no rearchitecting to work.

Before someone says "what you want can't be done in a
database"...Oracle has an application server middleware product that
lets you do both of the above. Just plug in a new server with Oracle on
it, and you've doubled your capacity. But SQL Server 2005 doesn't yet
have a similar capability.

So I read with great interest the following article that talks about
why this is the case with SQL Server. There are two issues that make
it very difficult to do:
http://www.sql-server-performance.co...bility_availab...
You can create a crude pool using replication, but the performance
times look horrendous.

However, the article also talks about the latest developments in this
field...specifically MIDDLEWARE that can create a scale-out solution
that is more available and that requires simply adding new servers to
scale out.

I found two companies which seem to offer this new capability:
http://www.metaverse.cc/newsevents.asp?cid=17999
and
http://www.pcticorp.com/product.aspx

Both companies appear to have patents or a patent pending on the
process. I tried to contact metaverse but got no reply, despite their
recent press release. I just emailed Pcticorp today to see if I could
learn more about their product.

My question for this group is:
Does anyone have experience with either of the two products (or any
others that provide this capability)?

Many thanks in advance for your help.

Ian Ippolito
http://www.rentacoder.com

Apr 14 '06 #1
17 Replies



"IanIpp" <ia**********@gmail.com> wrote in message
news:11**********************@z34g2000cwc.googlegroups.com...
> We have a 3-month-old quad-processor/dual-core server running SQL
> Server 2005, and already it is getting close to hitting the CPU wall.
> An 8-way CPU box is prohibitively expensive and out of the question. I
> am looking desperately for a way to TRULY scale out SQL Server...in the
> same way that IIS can be scaled out via App Center.

Well the first thing I would say is make damn sure it's not a code issue.

To relate a story, we had two boxes maxing out and were ready to buy a 3rd
in order to handle the load.

After reading a white paper on performance tuning, I was able to work with
our developers to rewrite a single stored procedure and get to the point
where ONE box was handling the entire load and still had room to scale.

Ok, that's an extreme case (the boxes were basically doing only the one
thing), but it shows how much of a difference simple tuning can make.

Ok, assuming that you've done that: if you can break any of the stuff into
read-only queries, one thing that might work is to set up the current server
as a "publishing" server and use replication to push the data to "read-only"
servers.
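
A rough sketch of what that would look like in T-SQL, for the curious. All the names below (database, publication, article) are invented for illustration, and in practice you'd likely drive this through the Management Studio replication wizards rather than raw procedure calls:

```sql
-- Hypothetical sketch: mark a database for transactional replication and
-- publish one table; subscribers then serve the read-only SELECT traffic.
-- All object names are made up for illustration.
USE master;
EXEC sp_replicationdboption
     @dbname  = N'MyAppDb',
     @optname = N'publish',
     @value   = N'true';

USE MyAppDb;
EXEC sp_addpublication
     @publication = N'ReadOnlyPub',
     @status      = N'active',
     @repl_freq   = N'continuous';   -- transactional, not snapshot

EXEC sp_addarticle
     @publication   = N'ReadOnlyPub',
     @article       = N'BidRequests',
     @source_object = N'BidRequests';
```

Subscriptions are then added per read-only server, and the application has to be pointed at the subscribers for its SELECT-only workload.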
> The "in the box" solution for SQL Server 2005 scaling out is the DMV.
DMV, I'm not familiar with that acronym.
> Unfortunately this solution makes the system less available rather than
> more (one server outage takes them all out for that table) and requires
> serious rearchitecting of the software to use. Contrast this to IIS
> and AppCenter where each added server makes the system more available,
> and requires no rearchitecting to work.
>
> Before someone says "what you want can't be done in a
> database"...Oracle has an application server middleware product that
> lets you do both of the above. Just plug in a new server with Oracle on
> it, and you've doubled your capacity. But SQL Server 2005 doesn't yet
> have a similar capability.
>
> So I read with great interest the following article that talks about
> why this is the case with SQL Server. There are two issues that make
> it very difficult to do:
> http://www.sql-server-performance.co...bility_availab...
> You can create a crude pool using replication, but the performance
> times look horrendous.
Not necessarily. We do fine with it.

> However, the article also talks about the latest developments in this
> field...specifically MIDDLEWARE that can create a scale-out solution
> that is more available and that requires simply adding new servers to
> scale out.
>
> I found two companies which seem to offer this new capability:
> http://www.metaverse.cc/newsevents.asp?cid=17999
> and
> http://www.pcticorp.com/product.aspx
>
> Both companies appear to have patents or a patent pending on the
> process. I tried to contact metaverse but got no reply, despite their
> recent press release. I just emailed Pcticorp today to see if I could
> learn more about their product.
>
> My question for this group is:
> Does anyone have experience with either of the two products (or any
> others that provide this capability)?

That I can't say much on. Sorry.

> Many thanks in advance for your help.
>
> Ian Ippolito
> http://www.rentacoder.com

Apr 14 '06 #2

IanIpp wrote:
> We have a 3-month-old quad-processor/dual-core server running SQL
> Server 2005, and already it is getting close to hitting the CPU wall.
> An 8-way CPU box is prohibitively expensive and out of the question. I


HP ProLiant DL585-G1 128GB/2.4GHz/DC/4P
Availability Date 11/08/05
TPC-C Throughput 202,551
http://www.tpc.org/tpcc/results/tpcc...p?id=105100101

Are you doing more than 100,000 transactions/minute in an OLTP system
and can't afford an 8-way machine?

Then I guess your problem lies elsewhere, not with SQL2K5.

Apr 14 '06 #3

Is your problem during ETL or Query processing?

Apr 14 '06 #4

Greg D. Moore (Strider) (mo****************@greenms.com) writes:
> Well the first thing I would say is make damn sure it's not a code issue.
>
> To relate a story, we had two boxes maxing out and were ready to buy a 3rd
> in order to handle the load.
>
> After reading a white paper on performance tuning, I was able to work with
> our developers to rewrite a single stored procedure and get to the point
> where ONE box was handling the entire load and still had room to scale.
>
> Ok, that's an extreme case (the boxes were basically doing only the one
> thing), but it shows how much of a difference simple tuning can make.
>
> Ok, assuming that you've done that, if you can break any of the stuff
> into read-only queries, one thing that might work is to set up the current
> server as a "publishing" server and use replication to push the data to
> "read-only" servers.


To be blunt, I think Ian has a lot of potential here. Provided of course,
that he has control over the code. If he has some sleazy third-party
app, tuning may not be that much of an option. Then again, SQL 2005
offers plan guides where you can give hints or complete plans to queries
without direct access to the source code. And he can still add indexes.
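
For anyone who hasn't seen plan guides, a minimal sketch follows. The statement text, parameter, and guide name here are invented for illustration; the point is that sp_create_plan_guide attaches an OPTION hint to any matching query without the application's source being touched:

```sql
-- Hypothetical sketch of a SQL 2005 plan guide: attach an OPTION hint
-- to a query you cannot edit. Statement text and names are invented.
EXEC sp_create_plan_guide
    @name  = N'Guide_BidRequestList',
    @stmt  = N'SELECT * FROM dbo.BidRequests WHERE CategoryId = @cat',
    @type  = N'SQL',
    @module_or_batch = NULL,
    @params = N'@cat int',
    @hints  = N'OPTION (OPTIMIZE FOR (@cat = 1))';
```

The statement text must match what the application actually submits, which is the fiddly part in practice.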
>> The "in the box" solution for SQL Server 2005 scaling out is the DMV.
> DMV, I'm not familiar with that acronym.


Dynamic Management Views, the new interface to engine-state information
in SQL 2005.

I guess that Ian was thinking of DPV, Distributed Partitioned Views.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Apr 14 '06 #5


"Erland Sommarskog" <es****@sommarskog.se> wrote in message
news:Xn**********************@127.0.0.1...
> Greg D. Moore (Strider) (mo****************@greenms.com) writes:
>> Ok, assuming that you've done that, if you can break any of the stuff
>> into read-only queries, one thing that might work is to set up the current
>> server as a "publishing" server and use replication to push the data to
>> "read-only" servers.
> To be blunt, I think Ian has a lot of potential here. Provided of course,
> that he has control over the code. If he has some sleazy third-party
> app, tuning may not be that much of an option. Then again, SQL 2005
> offers plan guides where you can give hints or complete plans to queries
> without direct access to the source code. And he can still add indexes.


True, I was just giving him the benefit of the doubt. :-)

>>> The "in the box" solution for SQL Server 2005 scaling out is the DMV.
>> DMV, I'm not familiar with that acronym.
> Dynamic Management Views, the new interface to engine-state information
> in SQL 2005.
>
> I guess that Ian was thinking of DPV, Distributed Partitioned Views.


That makes a LOT more sense. :-)


--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

Apr 14 '06 #6

All,

Thanks for the replies.

1) Sorry...yes, I did mean DPV instead of DMV! A distributed
partitioned view is a way to split a table's data across multiple
databases in SQL 2005 while still querying and updating it as one
(which MSFT calls a federation). It sounds good in concept until you
realize that the outage of any one federated database kills access to
the table on all of them. So rather than scaling and becoming MORE
available...this solution scales at the expense of less availability.
On the other hand, the middleware products I quoted above promise BOTH
scaling and availability.
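
To make the availability problem concrete, a distributed partitioned view looks roughly like this (server, database, and table names are invented; each member table carries a CHECK constraint on the partitioning column so the optimizer can prune partitions):

```sql
-- Hypothetical sketch of a DPV spanning two linked servers. Because the
-- view UNIONs remote member tables, an outage of either server breaks
-- queries against the whole view. All names are invented.
CREATE VIEW dbo.AllBidRequests
AS
SELECT * FROM Server1.AppDb.dbo.BidRequests_Low    -- CHECK (BidRequestId <= 500000)
UNION ALL
SELECT * FROM Server2.AppDb.dbo.BidRequests_High;  -- CHECK (BidRequestId >  500000)
```

Every node hosting the view depends on every member table being reachable, which is exactly the availability trade-off described above.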

2) One other thing..."scaling up" (adding more CPUs) has a less than
linear effect on performance because switching and other things hurt
performance. In other words...if you go from a 2 CPU machine to a 4 CPU
machine...you won't get double the performance (some sources quote the
last CPU as only increasing performance 15-17% rather than 50%). But
"scaling out" (adding more servers) with a middleware product (if it
truly works) promises a linear increase in performance. So it would not
only be cheaper but better...and that is always what everyone is looking
for. That is why I'm hoping that someone here has experience with such
products and can comment on them.
That is why I'm hoping that someone here has experience with such
products and can comment on them.

3) Regarding tuning queries, etc.

Yes, we have control over the code but we already run
extensive/constant query tuning and add/adjust indexes and regularly
use the Database Tuning Advisor (see my post here for some of the
existing bugs I've found in SQL 2005's DTA:
http://rentacoder.com/CS/blogs/real_...03/17/447.aspx
). We also update statistics and defrag the indices (and rebuild the
ones that can't be defragged). There are 2 bugs I have open tickets on
with indices not being defragged even after rebuilding...and not on
small tables, but large ones with thousands of pages of data. I'll
update my blog once MSFT gives more information on what is going on.

But if you are growing, tuning, defragging indices, etc. can only get
you so far. Eventually you WILL run into the limitations of your
hardware. Guaranteed. So it's not a true solution...it just delays
the inevitable.
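
For reference, the maintenance loop described above boils down to something like this in SQL 2005 (the table name is invented; thresholds are a matter of taste):

```sql
-- Sketch: check fragmentation via the DMV, then rebuild or reorganize.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name                     AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 10;

ALTER INDEX ALL ON dbo.BidRequests REBUILD;   -- or REORGANIZE for light fragmentation
UPDATE STATISTICS dbo.BidRequests WITH FULLSCAN;
```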

Regarding:
> HP ProLiant DL585-G1 128GB/2.4GHz/DC/4P
> Availability Date 11/08/05
> TPC-C Throughput 202,551
> http://www.tpc.org/tpcc/results/tpcc...p?id=105100101
> Are you doing more than 100,000 transactions/minute in an OLTP system
> and can't afford an 8-way machine? Then I guess your problem lies
> elsewhere, not with SQL2K5.

I looked at this result and was encouraged for a minute that perhaps we
might be able to make better use of the hardware "somehow". But then I
looked deeper. I have to wonder how applicable these #s are to "real
life". Maybe I'm off base, but our machine is the top of the line Dell
quad processor/dual core model...and comes in at about $35k (just the
machine...no software licenses). The machine in this contest was
priced at half a million dollars ($500,000)! What are they running
this thing on? My understanding is that Microsoft devotes an entire
team to doing exotic things to the hardware that companies without PhDs
in computer engineering and system design cannot do. I have also heard
(but don't know if it's true) that they add features to SQL Server
after receiving their workload to make it perform well...and if so then
this is something else that no one else can do. If I'm wrong about
any of this, someone please correct me. If I'm essentially right, then
it's not reasonable to expect these rates.

So in that vein...I'd be interested to hear about anyone with "real
life" implementations and the TPS they are achieving. First, what is
the best way to measure TPS? I found the Perfmon counter
"batches/sec"...is that what others use? If so, then we are at about
6,000-8,000/minute on a 64-bit quad processor/dual core machine and
currently at 70-80% CPU capacity. This is far below the 202,000/minute
of the TPC benchmark. What do other people get on this stat?
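
For what it's worth, that counter can also be read from inside the server rather than Perfmon. Note the DMV exposes a cumulative value (total batches since startup), so you have to sample it twice and divide the delta by the elapsed seconds to get an actual rate:

```sql
-- Sketch: raw Batch Requests/sec counter (cumulative since server start).
SELECT cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Batch Requests/sec'
  AND object_name LIKE N'%SQL Statistics%';
```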

4) Does anyone have experience with a read-only database in a real life
situation? In reading some papers it seems that using replication to
do this will severely slow down your inserts and updates (one quoted
50-80%). That isn't a realistic solution. Another possible solution
is mirroring and using a snapshot, but if you do this, Microsoft won't
support your database anymore (also not realistic). Maybe log shipping
is the best way...what is your real life experience?

5) Does anyone have experience using these middleware products which
promise better performance and price than traditional scaling up and
more availability than traditional "scaling out" via DPVs or a read
only database?

Ian Ippolito
www.RentACoder.com

Apr 19 '06 #7

Stu
Hey Ian,

Not sure if this will help you or not, but I'm interested in
"real-world" statistics as well. Using your batch requests/sec metric,
we're getting anywhere from 21,000-30,000 batches per minute on a
32-bit dual Xeon machine with 8 gig of RAM. We run a mixture of both
OLTP and OLAP databases on that box.

HTH,
Stu

Apr 19 '06 #8

Ian,
Just curious, is this a transactional system or data warehouse or both?
We have extensive experience with scaling out SQL Server, but only
from a data warehousing (ETL and Reporting) perspective and not a
transactional perspective.
Thanks,
Brad

Apr 19 '06 #9

Thanks Stu...that is interesting. We're getting fewer per minute than
you are on twice the # of processors. What percentage of CPU usage
does your box average?

Brad...this is a transactional system (OLTP) database only, with no
data warehousing running on it.

Does anyone else want to comment on their configuration, batches/minute
(or other metric) and CPU usage?

Ian

Apr 19 '06 #10

Stu
Hey Ian, on average we hover around 30%, but it's been a long painful
journey to get there. And those are Xeon processors, so to Windows,
it appears as a quad-processor box.

Apr 19 '06 #11


IanIpp (ia**********@gmail.com) writes:
> 3) Regarding tuning queries, etc.
>
> Yes, we have control over the code but we already run
> extensive/constant query tuning and add/adjust indexes and regularly
> use the Database Tuning Advisor (see my post here for some of the
> existing bugs I've found in SQL 2005's DTA:
> http://rentacoder.com/CS/blogs/real_...03/17/447.aspx
> ). We also update statistics and defrag the indices (and rebuild the
> ones that can't be defragged). There are 2 bugs I have open tickets on
> with indices not being defragged even after rebuilding...and not on
> small tables, but large ones with thousands of pages of data. I'll
> update my blog once MSFT gives more information on what is going on.


I don't want to belittle, but I have a strong feeling that you still
have a lot to gain by tuning the application. Maybe you've passed all
the simple ones: adding indexes, finding bad queries etc, and you
will now have to look for more structural issues. That is, how much
iterative processing (cursors and the like) do you have?

After all, the numbers Stu gave for his system were appallingly better
than yours.

Of course, the TPC-C benchmark was even further afield, but that is a
value that demonstrates the outer edge of what is possible at all.

I can't give any numbers for our system, but none of our customers are
close to the load that yours and Stu's systems see.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Apr 19 '06 #13

That's interesting info Stu.

Erland,
> I don't want to belittle, but I have a strong feeling that you still
> have a lot to gain by tuning the application. Maybe you've passed all
> the simple ones: adding indexes, finding bad queries etc, and you
> will now have to look for more structural issues. That is, how much
> iterative processing (cursors and the like) do you have?

Virtually no cursors and iterative processing, due to all the problems
with them.
> After all, the numbers Stu gave for his system were appallingly better
> than yours.

Erland, it's premature to judge one way or the other, since it's not
necessarily an apples to apples comparison.

Imagine this example. Two systems are both perfectly tuned and the
applications are perfectly designed. One is doing one query over and
over again: SELECT one_field FROM [SimpleTable], and that table is only
a few thousand rows. The second server is doing another query over
and over again: SELECT <lots of fields...> FROM [Table1] INNER JOIN
[Table2] INNER JOIN [Table3] INNER JOIN [Table4] ... with tables that
each have a few million rows. In this example, system 1 will get a
significantly higher # of TPS. This doesn't mean that you can jump to
the conclusion that system #2 is out of tune...it just means its job
requirements force it to do more work.

I'll give you a real life example. This is the most heavily used page
on the site (about 60% of the volume of traffic) and thus 60% of the
queries to the database:

http://www.rentacoder.com/RentACoder...ngExpiration=1

That page is actually called from a # of different places (newest bid
requests, my bid requests, search bid requests, browse bid request
category)...all of them lead to that page. But the end result is
always the same thing...show a list of bid requests. It seems simple
until you realize that there are over a million rows in our table of
registered people...and that must be joined to. We have half a million
bid requests and that table must be joined to. Connected to each of
these is an average of 50 bids (50 x half a million = 25 million rows)
and this table must be queried to produce some of the summary information. Etc,
etc.

Now maybe Stu's typical transaction is equally demanding. But without
asking him, we can't yet tell. (By the way Stu, do you know your
heaviest volume transaction and what kinds of tables sizes are
involved?)

Some other interesting things. The biggest killer of time on that page
is the fact that it involves paging. This means that:
a) Everyone expects you to provide a feature that says Page # 1 of
<some #>...meaning you need to know how many total rows are in the
result set...even if you don't return them. So this requires doing a
COUNT (slow).
b) Paging is handled using a great new SQL Server 2005 feature called
ROW_NUMBER(). That feature shaves off several orders of magnitude of
time versus SQL 2000, as you can see:
http://sqljunkies.com/WebLog/amachan...1/03/4945.aspx.
Unfortunately it doesn't work properly on a DISTINCT query (which is
understandable)...which requires structuring it as a ROW_NUMBER() of a
subquery. But there is a bug in 2005...it can do this...but as soon as
you add the BETWEEN or WHERE clause (which is what allows you to save
time via this method) it gives an error and won't run. I've got a bug
report open with MSFT on this one too (I'm sure they love me...I have
way too many tickets open right now).
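
For anyone curious, the subquery structure in question looks roughly like this (table and column names are simplified stand-ins; on an unaffected build the outer filter slices one page out of the numbered, deduplicated set):

```sql
-- Sketch of ROW_NUMBER() paging over a DISTINCT subquery (names simplified).
SELECT BidRequestId, Title, PostedDate
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY d.PostedDate DESC) AS rn, d.*
    FROM (SELECT DISTINCT BidRequestId, Title, PostedDate
          FROM dbo.BidRequests) AS d
) AS numbered
WHERE rn BETWEEN 21 AND 30;   -- page 3 at 10 rows per page
```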

But my point is that back in the SQL 2000 days you didn't have this new
feature...and you just had to put up with the slowness if you were
doing paging.

Ian

Apr 19 '06 #14

Stu
I'm reluctant to give too many details about our structure for fear
that I'd be compromising our business model; I can say that our primary
application is a data warehouse structure that involves several million
rows of data per day. Data is loaded in small batches (every minute,
24x7), and we do our own pattern seeking against the new data (kind of
a home-grown Analysis Services).

Our batches can range in size from 1 row to 20,000 rows, depending on
the time of day, and the nature of the data. We host both a raw
database (involving a very verbose but simple OLTP structure) and the
data warehouse on the same box.

Some of the things we do to lessen the bottleneck on the server
include:

1. distribute the ETL process as much as possible. We have several
little bots that run on various servers that handle loading, analysis,
and grouping off the main server. We used to use DTS quite heavily;
we're transitioning away from that.

2. Use appropriate locking hints. We write all sorts of code involving
NOLOCK and temp tables to prefetch the data.

3. Make the most of our physical structure. We use filegroups to
separate indexes from tables, and have isolated our busiest databases
from each other on separate drive arrays.

4. We always cluster on monotonically-increasing values (such as dates
or date representation) so that page splits are minimal. We also use
partitioned views (although they are a bit of a bear to maintain).
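
A tiny sketch of points 2 and 4 above, with all names invented: cluster on an ever-increasing load date so inserts append at the end of the index, and use NOLOCK where a dirty read is acceptable:

```sql
-- Sketch (names invented): monotonic clustered key + NOLOCK prefetch read.
CREATE CLUSTERED INDEX CIX_RawEvents_LoadDate
    ON dbo.RawEvents (LoadDate);       -- ever-increasing key => few page splits

SELECT TOP (1000) *
FROM dbo.RawEvents WITH (NOLOCK)       -- dirty read, acceptable for prefetch
WHERE LoadDate >= DATEADD(hour, -1, GETDATE());
```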

HTH,
Stu

Apr 20 '06 #15

IanIpp (ia**********@gmail.com) writes:
> I'll give you a real life example. This is the most heavily used page
> on the site (about 60% of the volume of traffic) and thus 60% of the
> queries to the database:
>
> http://www.rentacoder.com/RentACoder...ngExpiration=1
> ...
> Some other interesting things. The biggest killer of time on that page
> is the fact that it involves paging. This means that:


So each time I request the next page, the query is rerun? Here is a tip
from a pure end-user perspective: spit out 100 entries at a time rather
than just 10. I don't know why web designers insist on giving me a small
piece at a time. Give me at least 100 items at a time. I've better things
to do all day than paging back and forth in a lousy web browser. On top of
that, if the query is rerun each time I page, I may get to see different
results.

From a more technical perspective, saving the search results in a process-
keyed table could be an option, although it means that each initial search
will require a write operation, and if users don't page very often, it
could just make matters worse. (Then again, here is an easy option to
scale out: the middle tier could receive the full result set, and then
write the search results to a different server.)
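
A minimal sketch of that process-keyed idea, with table and column names invented: run the expensive search once, save the numbered result set under a key, and let every subsequent page read a cheap slice of the saved set instead of rerunning the search:

```sql
-- Sketch of a process-keyed results table (all names invented).
-- Initial search: persist the full, numbered result once.
INSERT INTO dbo.SearchResults (SearchKey, rn, BidRequestId)
SELECT @SearchKey,
       ROW_NUMBER() OVER (ORDER BY PostedDate DESC),
       BidRequestId
FROM dbo.BidRequests
WHERE CategoryId = @CategoryId;

-- Each page request: a cheap, stable slice of the saved set.
SELECT BidRequestId
FROM dbo.SearchResults
WHERE SearchKey = @SearchKey
  AND rn BETWEEN 11 AND 20;   -- page 2 at 10 rows per page
```

Old keys have to be purged periodically, which is part of the write-cost trade-off mentioned above.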
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Apr 20 '06 #16

Thanks for the information guys.

Stu, thanks for the info on your DB. The Rent a Coder DB is OLTP,
which definitely presents different challenges than a data warehousing
system because of the inherent conflicts of optimizing and locking that
occur with updates/inserts versus SELECTs (you also do updates/inserts,
but only when refreshing the data).

-----------------------------------------------------------

By the way, I found some of the answers to my original question:
1) The Metaverse database scattering middleware product is NOT
available yet...still in Alpha. The owner actually recommended the 2nd
product.
2) The PCTI Corp middleware product (ICS-UDS) is available. The owner
is preparing references from several companies...two of which are
household names, so that is encouraging.

---------------------------------------------------------------

Here's an unrelated question for anyone:

The built in SQL Server 2005 tools for analyzing performance problems
are very limited when trying to diagnose active problems. Here are a
few tools that I found that make up some of the deficiencies:

1)
http://www.quest.com/quest_central_f...e_analysis.asp
(There is a nice video explaining the problems with the existing SQL
2005 tools here:
http://www.quest.com/Quest_Central_f...r/dba_tale.htm )
2) http://www.sqlpower.com/index.html (Pepsi uses this product)

I'm sure there are many more. Which add on tool do you use and why?

Thanks,
Ian Ippolito
RentACoder.com

Apr 21 '06 #17

Quest is definitely a top-notch way to go. With the technology they gleaned from Imceda, they definitely produce some great products. Their LiteSpeed is hard to beat for backups.

Symantec/Veritas, or whatever their company name is today, has a product called I3 that is really nice as well. We bought the product when it was called Precise (so was the company). It got bought out and the rest is history. The thing about I3--it does an enterprise picture as well--SQL Server is part of the picture, but it will also show you what the webserver or the application server was doing at the same time and correlate the results--nice functionality.

Also, one of our larger clients uses Idera Diagnostic Manager (used to be NetIQ--sold off). It's a nice product as well.

Brett
May 6 '06 #18

This discussion thread is closed

Replies have been disabled for this discussion.