Bytes IT Community

Scheduled job hangs the server

Sorry, re-posted as my email settings were wrong on the last post.

I wonder if anyone can help.

I have a scheduled job running overnight to delete old records from a
particular DB table.

The table contains more than half a million records, and the script
simply uses the date field to delete any records with a date older
than 7 days. My guess is that some 100,000 records need to be deleted.

The job takes ages to run, and while it is running SQL Server is
completely locked and the associated website is effectively offline
because of it.

I think I know where the problem lies but I don't know why.

Whoever created the DB table made a varchar(255) field the primary
key. The field is then filled with a 40-character string generated by
the webserver script to serve as a 'unique' value.

Those are the facts. The following is my interpretation.

I think the server is unable to cope with not only a character field
as the primary key but also one with so many characters in it. In such
a large table, presumably, in order to delete the old records it must
do some sort of ordered sort on the primary key, and it is this that
is causing the problem.

I would like to introduce a new field called 'id', make it an
autonumber and the primary key, and make the errant field non-primary.

So my question is this:

Is my analysis correct and, more importantly, why? Can anyone give me
clear reasoning for it?

Also is the solution sound?
Table looks like this:

clientID      int
refID         varchar(255)   Primary Key
fieldA        varchar(512)
creationDate  datetime
fieldB        varchar(255)
fieldC        varchar(32)

Job script:

DELETE FROM myTable WHERE creationDate < [7daysAgo]

Thanks in anticipation

Bill
Nov 23 '05 #1
2 Replies


Bill (wj*@blueyonder.co.uk) writes:
The table contains more than half a million records, and the script
simply uses the date field to delete any records with a date older
than 7 days. My guess is that some 100,000 records need to be deleted.

The job takes ages to run, and while it is running SQL Server is
completely locked and the associated website is effectively offline
because of it.
I doubt that SQL Server is completely locked. You should be able to
log in and do an sp_who.
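For reference, a quick way to see what is actually going on during the job is to run the standard system procedures from another connection (both are stock procedures in SQL Server 2000):

```sql
-- While the job runs, from a second connection:
EXEC sp_who2   -- one row per spid: status, command, and the BlkBy column
               -- shows which spid, if any, is blocking it
EXEC sp_lock   -- which spid holds which locks on which objects
```

If the website's spids show up in sp_who2 blocked by the job's spid, that confirms the DELETE is the culprit rather than the server being down.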

But if the table is essential to the function of the web site, I can
understand that the table is entirely inaccessible during this time.

Furthermore, if there are other tables referencing this table through
foreign-key constraints, and those tables are huge and the FK columns
are non-indexed, this can be a really aggravating factor. I don't
discuss this further below (it occurred to me as I was about to post),
but you should definitely investigate it.
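One way to check for such referencing tables is the catalog procedure sp_fkeys; the child table and index names below are purely illustrative, not taken from your schema:

```sql
-- List every foreign key that references myTable:
EXEC sp_fkeys @pktable_name = 'myTable'

-- For each referencing column reported above that carries no index,
-- something like this avoids a scan of the child table for every
-- row the DELETE removes (names are hypothetical):
CREATE INDEX ix_someChildTable_refID ON someChildTable (refID)
```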
Whoever created the DB table made a varchar(255) field the primary
key. The field is then filled with a 40-character string generated by
the webserver script to serve as a 'unique' value.
...
I think the server is unable to cope with not only a character field
as the primary key but also one with so many characters in it. In such
a large table, presumably, in order to delete the old records it must
do some sort of ordered sort on the primary key, and it is this that
is causing the problem.
That is not likely. However, if the clustered index is not on the date
field, SQL Server will have to scan the entire table. Furthermore, if
the PK is clustered, and its values are random, the deletion will cause
gaps and leave the table fragmented as Stu points out. Then again, if
the PK is indeed clustered, then new rows should fall in the gaps,
so the fragmentation would be reduced as new rows appear. Yet, then
again, if that generated string is not entirely random, it may be more
complicated.

You can examine fragmentation with DBCC SHOWCONTIG. The cure for
fragmentation is to run DBCC DBREINDEX.
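Concretely, using the table name from your post (the fill factor of 90 is just an example value):

```sql
-- Report scan density and logical fragmentation for the table:
DBCC SHOWCONTIG ('myTable')

-- Rebuild all indexes on the table; '' means every index,
-- 90 is the fill factor to rebuild with:
DBCC DBREINDEX ('myTable', '', 90)
```

A scan density well below 100% in the SHOWCONTIG output is the usual sign that a rebuild is worthwhile.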
I would like to introduce a new field called 'id', make it an
autonumber and the primary key, and make the errant field non-primary.

So my question is this:

Is my analysis correct and, more importantly, why? Can anyone give me
clear reasoning for it?

Also is the solution sound?


No, the solution will not solve all your problems. First of all, the PK
does not matter here - what matters is where you have the clustered
index on the table. If you add an autonumber column and add a clustered
index on that column (PK or not), you will not see fragmentation,
assuming that the date column is fully correlated with the autonumber.
But the DELETE operation will still lock the table.

There are two ways to go:
1) Add an autonumber column with a clustered index on it, and then
change the job so that it looks up the newest row to delete:

SELECT @id = MAX(id) FROM tbl WHERE datecol < @deletedate

And then use @id for deleting the rows. You would need a non-clustered
index on datecol for this to be optimal.
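A sketch of how the whole job might then look; names and the 7-day cutoff follow the thread, the boundary is taken as MAX(id) (the newest qualifying row) so the DELETE becomes a single id range, and it assumes id is an IDENTITY column carrying the clustered index:

```sql
DECLARE @id int, @deletedate datetime
SET @deletedate = DATEADD(day, -7, GETDATE())

-- Find the boundary: the newest row that qualifies for deletion.
-- The non-clustered index on datecol makes this lookup cheap.
SELECT @id = MAX(id) FROM tbl WHERE datecol < @deletedate

IF @id IS NOT NULL
    -- One contiguous range scan on the clustered index; only the
    -- rows in that range need to be locked.
    DELETE FROM tbl WHERE id <= @id
```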

2) Make the PK non-clustered and add a clustered index on the date column.

I would recommend the latter, unless you know that the current clustered
index is very good for something else.
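For option 2 the DDL might look roughly like this (SQL 2000 syntax; the constraint and index names are assumed, not taken from your schema):

```sql
-- Drop the existing clustered PK and re-add it non-clustered:
ALTER TABLE myTable DROP CONSTRAINT PK_myTable

ALTER TABLE myTable ADD CONSTRAINT PK_myTable
    PRIMARY KEY NONCLUSTERED (refID)

-- Cluster the table on the date column instead:
CREATE CLUSTERED INDEX ix_myTable_creationDate
    ON myTable (creationDate)
```

Rebuilding the clustered index on a table this size will itself take time and lock the table, so it is a job for a maintenance window.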

The point of aligning the DELETE with the clustered index is that the
DELETE operation then does not have to lock the entire table, only the
rows to be deleted.

By the way, just adding a non-clustered index on the date column is not
likely to help with the numbers you've indicated. If 20% of the rows are
hit by the condition, SQL Server is not going to use that index anyway.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Nov 23 '05 #2

Sorry it's taken a while for me to respond, Erland.

When I get time I'm going to choose your option 2. I've been doing a
bit of reading and now understand the problem much better.

Thank you for taking the time to respond; it is very much appreciated.
Once the correction is made I shall let you know how successful it has
been.

Thanks

Bill

Nov 24 '05 #3
