On 12 Jan 2005 17:59:04 -0800,
ka*******@gmail.com wrote:
> I need some advice on the best way to load data to a table while
> maintaining an index. I have a table that is very small at the moment
> but will have more than 70 million rows by the end of the year. Data
> is added to it several times a day, parsed from a text file. I'd like
> to create an index on the table now, when it is small, and maintain it
> as the table grows. What is the fastest way to both load the data and
> maintain the index? Should I drop the index, load the data and then
> reindex?
Hi kaelin358,
It depends.
If large numbers of rows are inserted in the table, it is often useful to
drop all indexes before the insert and recreate them afterwards. This goes
especially for nonclustered indexes; the gain for a clustered index is
less, because the cost of rebuilding the clustered index is higher than
the cost of rebuilding a nonclustered index.
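As an illustration of the drop-load-rebuild pattern, here is a minimal sketch in Python using the built-in sqlite3 module as a stand-in database; on SQL Server the same idea is expressed with T-SQL DROP INDEX and CREATE INDEX statements, and the table and index names here are made up for the example.

```python
import sqlite3

# In-memory database as a stand-in; on SQL Server you would issue the
# equivalent T-SQL DROP INDEX / CREATE INDEX statements instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE imports (id INTEGER, payload TEXT)")
conn.execute("CREATE INDEX ix_imports_id ON imports (id)")

rows = [(i, f"row {i}") for i in range(100_000)]

# Drop the nonclustered index, bulk insert, then rebuild it once
# instead of updating it row by row during the load.
conn.execute("DROP INDEX ix_imports_id")
conn.executemany("INSERT INTO imports VALUES (?, ?)", rows)
conn.execute("CREATE INDEX ix_imports_id ON imports (id)")
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM imports").fetchone()[0]
print(count)  # 100000
```

The point is that one rebuild over the final data is a single sorted pass, whereas maintaining the index during the insert pays a per-row maintenance cost.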
However, if the amount of data added is only a fraction of what's already
there (e.g. 60 million rows in the table and a mere 100,000 rows to be
added), then the cost of rebuilding indexes for 60 million rows might well
exceed the cost of updating the index with 100,000 new rows. In that case,
it's cheaper to just keep the index during the insert.
Another consideration is whether the insert is during down time or on a
live system. If you drop, then recreate the indexes, other processes
reading from or writing to the table will slow down. On the other hand, if
you keep the index, other processes writing to the table have a higher
chance of being blocked.
Since performance depends on lots of things (hardware, row size, size of
indexed columns, etc), the only way to know for sure what method is the
quickest is to test both and compare the results.
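To make "test both and compare" concrete, a small timing harness like the following (again sketched with sqlite3; the row count and row size are arbitrary assumptions, and real numbers will differ on your hardware and schema) runs each variant and reports elapsed time:

```python
import sqlite3
import time

def timed_load(drop_first: bool, n: int = 50_000) -> float:
    """Load n rows with the index either kept or dropped/rebuilt."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, payload TEXT)")
    conn.execute("CREATE INDEX ix_t_id ON t (id)")
    rows = [(i, "x" * 50) for i in range(n)]

    start = time.perf_counter()
    if drop_first:
        conn.execute("DROP INDEX ix_t_id")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    if drop_first:
        conn.execute("CREATE INDEX ix_t_id ON t (id)")
    conn.commit()
    elapsed = time.perf_counter() - start

    assert conn.execute("SELECT COUNT(*) FROM t").fetchone()[0] == n
    conn.close()
    return elapsed

keep = timed_load(drop_first=False)
drop = timed_load(drop_first=True)
print(f"keep index: {keep:.3f}s, drop/rebuild: {drop:.3f}s")
```

Run it with a realistic batch size relative to the existing table; which variant wins depends on exactly the factors listed above.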
> Is there any benefit to loading the data to a temporary file,
> indexing that file and then inserting the data into the master table
> which is indexed on the same field?
I would definitely recommend this if you have to do the insert on a live
system. The actual import process will probably be relatively slow; if you
use a staging table to import the data, clean it up (if needed) and only
then import it into the actual table, you'll keep the time that the actual
table is blocked by the import process as low as possible. You'll have to
experiment to find out what (if any) indexes on the staging table result
in the quickest copying to the actual table.
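A sketch of that staging-table flow, once more in Python/sqlite3 (the cleanup rules and sample rows are invented for the example; on SQL Server the final copy would be a single set-based INSERT ... SELECT):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master (id INTEGER, payload TEXT)")
conn.execute("CREATE INDEX ix_master_id ON master (id)")

# Staging table: land the raw parsed rows here first, so the master
# table is untouched while the (slow) file import runs.
conn.execute("CREATE TABLE staging (id INTEGER, payload TEXT)")
raw = [(1, " alpha "), (2, None), (3, "gamma")]  # hypothetical parsed file
conn.executemany("INSERT INTO staging VALUES (?, ?)", raw)

# Clean up in the staging table (here: drop NULLs, trim whitespace)...
conn.execute("DELETE FROM staging WHERE payload IS NULL")
conn.execute("UPDATE staging SET payload = TRIM(payload)")

# ...then copy into master in one short set-based statement, keeping
# the window in which master is busy as small as possible.
conn.execute("INSERT INTO master SELECT id, payload FROM staging")
conn.execute("DELETE FROM staging")
conn.commit()

final = conn.execute("SELECT COUNT(*) FROM master").fetchone()[0]
print(final)  # 2
```

Whether an index on the staging table speeds up that final copy is exactly the kind of thing to measure, as noted above.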
If you only do the insert during down time AND you're sure that you'll
never have to clean up bad data, then I wouldn't bother with a staging
table; just insert straight into the actual table.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)