473,803 Members | 4,391 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

SQL Server for data processing

I'm working on a system right now where I have a database (two,
actually, but one is discarded halfway through), but it's created
and used as part of a process (reporting), rather than as the
actual production data repository. I may be keeping the database
permanantly, but it would be completely read-only; once the
process is complete, the database will not change again. This has
me wanting to do a few things that are rather foreign to my usual
experience, and I don't know how many of them are supported.

In several cases, I'm summarizing one table into another by several
fields, and then updating the original table with an ID for the
summary row each source row was summarized into (e.g., I summarize
PlaceAndProduct Summary into PlaceSummary, and then populate
PlaceSummaryID in PlaceAndProduct Summary). The update of the
source table is much faster if the summary table has a clustered
index on the summarized fields, but all later access will be faster
if the clustered index is on the identity column. I've been
including an ORDER BY the summarized fields in the original insert,
so the identity column is in the same order as the summarized fields,
but I don't know of any way to take advantage of that in the
indexing declarations.

As another approach to the above situation, if I change the
clustered index on a table, and the rows happen to be in the
same order by both indexes, will the table still get rebuilt?

I will never do a roll-back in the process; if an action fails, I
want to raise an error and halt (and I haven't lost any data).
Is there any way to completely turn off logging?

Will I gain anything by marking the database as single-user?

Any indexes that I am not using while I populate the tables, I'm
adding at the end with FillFactor 100, to keep any slack out.
Is there a way to remove all the slack from everything else, at
the end of the process? During a backup operation would be fine.

Thanks,
Bill

Jul 23 '05 #1
2 1439
On Mon, 07 Feb 2005 04:54:43 GMT, William Cleveland wrote:

(snip)
In several cases, I'm summarizing one table into another by several
fields, and then updating the original table with an ID for the
summary row each source row was summarized into (e.g., I summarize
PlaceAndProduc tSummary into PlaceSummary, and then populate
PlaceSummary ID in PlaceAndProduct Summary). The update of the
source table is much faster if the summary table has a clustered
index on the summarized fields, but all later access will be faster
if the clustered index is on the identity column. I've been
including an ORDER BY the summarized fields in the original insert,
so the identity column is in the same order as the summarized fields,
but I don't know of any way to take advantage of that in the
indexing declarations.
Hi William,

First, you should know that including ORDER BY in an INSERT .. SELECT
statement in meaningless. There is no way to guarantee that the identity
values get handed out in the order you specify. If you really must have
them in a specific order, the only safe way is to use a cursor - but since
IDENTITY is meant to be a **meaningless** unique code, this is a quite
questionable approach. Especially since there are more caveats to using an
identity for this goal. If the generated value has to have some logical
relation to the data, you'll have to calculate it, either on insert or on
retrieval.

I don't really understand the rest of this paragraph, though. It appears
that you are merely looking for maximum speed. If that's the case, there
might be other options to consider. Maybe, the update of the source table
will be equally fast (or even faster) if you use a covering nonclustered
index?

If you could show some more specifics about what exactly you are trying to
do, I might be able to give more specific comments.

As another approach to the above situation, if I change the
clustered index on a table, and the rows happen to be in the
same order by both indexes, will the table still get rebuilt?
My guess is that it is - it should! First because SQL Server doesn't know
that the rows are in order, so it will have to be checked anyway. Second
because a clustered index is more than just the table - the table is in
the leaf pages of the index; the root and the intermediate pages have to
be built as well.

To verify, I ran this code with the Show Execution Plan option turned on
(Ctrl-K in Query Analyzer). Both with or without the added nonclustered
index, the rebuilding of the clustered index had exactly the same plan as
the building of the first clustered index.

create table testit (a int not null, b int not null)
go
create unique clustered index i1 on testit(a)
-- create unique nonclustered index i2 on testit(b)
go
insert testit values (1, 1)
insert testit values (2, 2)
insert testit values (3, 3)
go
create unique clustered index i1 on testit(b) with drop_existing
go
drop table testit
go

I will never do a roll-back in the process; if an action fails, I
want to raise an error and halt (and I haven't lost any data).
Is there any way to completely turn off logging?
No, there is no way to turn off logging. It's not only used for rollback,
it's also used for trigger execution, for recovery after a system crash
and for backup and restore - and then I'm probably forgetting some.

Besides, *you* might never do a rollback, but for some errors, SQL Server
will issue a rollback.

Will I gain anything by marking the database as single-user?
Yes: you'll save the overhead of taking and releasing locks. The price you
pay is that you can't run multiple connections at once. Without the
single-user setting, you could still inspect some data in table A while a
long running query is updating table B. This is impossible in single-user
mode (in fact, it should have been called single-connection mode).

Any indexes that I am not using while I populate the tables, I'm
adding at the end with FillFactor 100, to keep any slack out.
Is there a way to remove all the slack from everything else, at
the end of the process? During a backup operation would be fine.


The only way I know is to rebuild all indexes with fillfactor 100.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Jul 23 '05 #2
Hugo Kornelis (hugo@pe_NO_rFa ct.in_SPAM_fo) writes:
First, you should know that including ORDER BY in an INSERT .. SELECT
statement in meaningless. There is no way to guarantee that the identity
values get handed out in the order you specify.


That is definitely not the same as saying that ORDER BY is meaningless.

If you say:

INSERT tbl (...)
SELECT ....
ORDER BY
OPTION (MAXDOP 1)

there is no guarantee that your IDENTITY values match - but your odds are
pretty good.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 23 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

5
2576
by: Matt | last post by:
I think this is the basic concept in ASP server-side development. My boss told me web application is NOT client-server application. I argued with him because browser is the client, and the server code put in server. Then web application should be a client-server application. My understanding is that a web application is an application that runs on a browser. But client-server application is not necessary a web application. Please...
7
3316
by: rdh | last post by:
Hi all, I am in process of developing a Server in C++ supporting multiple protocols. The server will be exposing various functionalities, and the clients can communicate over any of the protocols may be TCP, IPX, SAP, NETBEUI to access the server to access the functionalities exposed. The server doesnot know in advance which client is using what protocol. ALSO, ALL THE INTERACTION WILL BE MADE OVER XML. example my server has...
26
3833
by: David W. Fenton | last post by:
A client is panicking about their large Access application, which has been running smoothly with 100s of thousands of records for quite some time. They have a big project in the next year that will lead to a lot of use of the database and the adding of quite a lot of new data (though I can't conceive of them adding more than than 10s of thousands of records, which won't change the current performance profile at all). If there is a SQL...
5
3003
by: Ian | last post by:
Can anyone give me advice on migrating my Access backend to sequel server? I am sure there are discussion groups and documentation for this but don't know where to look. I am looking for answers to general questions such as: Will my VBA still work? What is the migration procedure?
8
1643
by: ml | last post by:
My employers currently use Access for processing large volumes of data for reporting and simple modelling, which involves a lot of make table/update queries etc. I have been asked to work on a retail forecasting model, but looking at the spec. the base data tables have approximately 6 million records. I know this can theoretically be done using Access (Access 97) but in reality it is extremely slow (even with the necessary tables...
29
2648
by: Jan | last post by:
Hi: I have an Access database that's been running (in one form or another) for a couple of different clients for a few years. Now a new client has requested that it be implemented with a SQL server back-end. I'm doing my best to learn about SQL server, and I plan to leave the front-end more or less as-is, just linking to the SQL server back end, but here's a basic question: The db has a front-end linked to two back-ends. One of the...
2
6971
by: Jobs | last post by:
Download the JAVA , .NET and SQL Server interview with answers Download the JAVA , .NET and SQL Server interview sheet and rate yourself. This will help you judge yourself are you really worth of attending interviews. If you own a company best way to judge if the candidate is worth of it. http://www.questpond.com/InterviewRatingSheet.zip
1
1216
by: gsalbertson | last post by:
am new to asp.net and have a question on just what is processing on the server and what is processing on the client. I have an asp.net page with html controls on it, say a couple of text boxes and a button. The user enters a number in one textbox, then the code behind does stuff -- let's say it validates the number, adds the number to 100, then displays the new number in the 2nd textbox. Once the user hits the button, isn't the...
7
1960
by: David | last post by:
i think i just realized i'm an idiot. again. (not syntactically correct code... just pieces to illustrate) class StateObject { members like socket, receiveBuffer, receiveBufferSize, StringBuilder etc.. }
0
10542
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
10289
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
10068
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
9119
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7600
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6840
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5496
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
1
4274
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3795
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.