Indexing question - Microsoft SQL Server

kaelin358

I need some advice on the best way to load data to a table while
maintaining an index. I have a table that is very small at the moment
but will have more than 70 million rows by the end of the year. Data
is added to it several times a day, parsed from a text file. I'd like
to create an index on the table now, when it is small, and maintain it
as the table grows. What is the fastest way to both load the data and
maintain the index? Should I drop the index, load the data and then
reindex? Is there any benefit to loading the data to a temporary file,
indexing that file and then inserting the data into the master table
which is indexed on the same field? Sorry if this is a really newbie
question but I'd appreciate any advice you can give.

Jul 23 '05 #1

Subscribe Reply

1189

Hugo Kornelis

On 12 Jan 2005 17:59:04 -0800, ka*******@gmail .com wrote:

I need some advice on the best way to load data to a table while
maintaining an index. I have a table that is very small at the moment
but will have more than 70 million rows by the end of the year. Data
is added to it several times a day, parsed from a text file. I'd like
to create an index on the table now, when it is small, and maintain it
as the table grows. What is the fastest way to both load the data and
maintain the index? Should I drop the index, load the data and then
reindex?
Hi kaelin358,

It depends.

If large numbers of rows are inserted in the table, it is often useful to
drop all indexes before the insert and recreate them afterwards. This goes
especially for nonclustered indexes; the gain for a clustered index is
less, because the cost of rebuilding the clustered index is higher than
the cost of rebuilding a nonclustered index.

However, if the amount of data added is only a fraction of what's already
there (e.g. 60 million rows in the table and a mere 100,000 rows to be
added), then the cost of rebuilding indexes for 60 million rows might well
exceed the cost of updating the index with 100,000 new rows. In that case,
it's cheaper to just keep the index during the insert.

Another consideration is whether the insert is during down time or on a
live system. If you drop, then recreate the indexes, other processes
reading from or writing to the table will slow down. On the other hand, if
you keep the index, other processes writing to the table have a higher
chance of being blocked.

Since performance depends on lots of things (hardware, row size, size of
indexed columns, etc), the only way to know for sure what method is the
quickest is to test both and compare the results.

Is there any benefit to loading the data to a temporary file,
indexing that file and then inserting the data into the master table
which is indexed on the same field?

I would definitely recommend this if you have to do the insert on a live
system. The actual import process will probably be relatively slow; if you
use a staging table to import the data, clean it up (if needed) and only
then import if to the actual table, you'll keep the time that the actual
table is blocked by the import process as low as possible. You'll have to
experiment to find out what (if any) indexes on the staging table result
in the quickest copying to the actual table.

If you only do the insert during down time AND you're sure that you'll
never have to cleanup bad data, then I wouldn't bother and simply insert
straight into the actual table.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)

Jul 23 '05 #2

kaelin358

Thanks Hugo. It looks like we need to do some experimentation to see
what works best. I'm thinking that the temp table procedure will be
the way to go.

Jul 23 '05 #3

Similar topics

4312

Indexing list of lists

by: Hilde Roth | last post by:

This may have been asked before but I can't find it. If I have a rectangular list of lists, say, l = ,,], is there a handy syntax for retrieving the ith item of every sublist? I know about for i in l] but I was hoping for something more like l. Hilde

Python

2741

NumArray array-indexing

by: Michael Drumheller | last post by:

(If you're not interested in NumArray, please skip this message.) I am new to NumArray and I wonder if someone can help me with array-indexing. Here's the basic situation: Given a rank-2 array (i.e., a matrix) it seems to be trivial, with array indexing, to extract a subset of its *columns*. But it does not seem to be trivial to extract a subset of its *rows*. The code snippet below describes the problem (if it really is a problem)...

Python

2187

Third party indexing solutions

by: Chuck Chekuri | last post by:

Hi, We have a 3TB oracle db. 24x7 operation with data constantly coming in to the DB. Users from all over the world use a web based applicaton to query and work on the data. Out transaction performance is quite good but our query performance sucks. looking up by ids and some combinations of search critera return ok but some never return. We indexed the db for a lot of combinations and we materialized some views and that helped a little....

Oracle Database

5179

Indexing to prevent timeout expired message

by: Danny | last post by:

sorry for posting a question that has probably been posted hundreds of times, but i can't seem to find the answer i need...We're using a Sql Server 7.0 database and recently started getting timeout expired errors on our views, which i'm assuming is due to the volume of records in our tables. Unfortunately i'm not too experiences w/ indexing tables, so my question is: 1) how do you determine which fields to index (does it have to be a...

Microsoft SQL Server

4818

Indexing Date/Time fields.

by: Thomas Bartkus | last post by:

If we have a date/time field and are doing frequent queries WHERE {date/time field} BETWEEN TimeA AND TimeB Does it make sense, query speed wise, to create an index on the date/time field? The reason I ask is that TimeA and TimeB are significant down to seconds. My *assumption* is that with a large amount of data scatter very few records will contain duplicate date/time points. And that under these circumstances, indexing will be of...

by: Bryan Olson | last post by:

The Python slice type has one method 'indices', and reportedly: This method takes a single integer argument /length/ and computes information about the extended slice that the slice object would describe if applied to a sequence of length items. It returns a tuple of three integers; respectively these are the /start/ and /stop/ indices and the /step/ or stride length of the slice. Missing or out-of-bounds indices are handled in a manner...

Python

1816

Indexing strategy

by: Ryan | last post by:

I have a bit of a problem with regards an indexing strategy. Well, basically there is no indexing strategy on a set of data I have at work. Now, I didn't create the design as I would have allowed for this. OK, so there is primary key (clustered) indexes (mainly composite keys), but no other indexes on the tables. As you would expect, the performance leaves a lot to be desired. A hell of a lot. We have several million rows in a lot of the...

Microsoft SQL Server

9548

Full Text File Search with Indexing Service on Windows (cont.)

by: Chung Leong | last post by:

Here's the rest of the tutorial I started earlier: Aside from text within a document, Indexing Service let you search on meta information stored in the files. For example, MusicArtist and MusicAlbum let you find MP3 and other music files based on the singer and album name; DocAuthor let you find Office documents created by a certain user; DocAppName let you find files of a particular program, and so on. Indexing Service uses plug-ins...

PHP

11601

How much slower is dict indexing vs. list indexing?

by: Emin | last post by:

Dear Experts, How much slower is dict indexing vs. list indexing (or indexing into a numpy array)? I realize that looking up a value in a dict should be constant time, but does anyone have a sense of what the overhead will be in doing a dict lookup vs. indexing into a list? Some ad hoc tests I've done indicate that the overhead is less than 15% (i.e., dict lookups seem to take no more than 15% longer than indexing into a list and there...

Python

8096

Indexing Service Server 2008 IIS 7.0 Adobe Ifilter

by: =?Utf-8?B?SmVycnkgQw==?= | last post by:

I have a server 2008 IIS 7.0 with indexing service installed. I have created the catalog and have a test page using these posts: http://kbalertz.com/954822/install-configure-Indexing-Service-Windows-Server-computer.aspx http://kbalertz.com/820105/Application-Query-Indexing-Service-Catalog-Using-Visual-Basic.aspx I can search some files .xls .doc and others but It will not find anything in .pdf

ASP.NET

8413

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...

General

8324

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...

Windows Server

8842

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...

C / C++

8740

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...

Online Marketing

8513

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...

Windows Server

6176

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...

Microsoft Access / VBA

5642

Couldn’t get equations in html when convert word .docx file to html file in C#.

by: conductexam | last post by:

I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...

C# / C Sharp

4173

Trying to create a lan-to-lan vpn between two differents networks

by: TSSRALBI | last post by:

Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...

Networking - Hardware / Configuration

4330

Windows Forms - .Net 8.0

by: adsilva | last post by:

A Windows Forms form does not have the event Unload, like VB6. What one acts like?

Visual Basic .NET