(an******@gmail.com) writes:
I need some help understanding what's happening on a join when updating
tables. Assume 2 tables with identical structures:
UPDATE
Table1
SET
Field1 = Table2.Field1,
Field2 = Table2.Field2
FROM
Table2
WHERE
Table1.Field3 = Table2.Field3
AND Table1.Field4 = Table2.Field4
Indexes exist on Field3 and Field4 on both tables. So why does SQL
Server choose a hash join?
Are those indexes on (Field3, Field4) or indexes on the individual
fields?
In any case, one of the tables will have to be scanned. Say that would
be Table2. Now for each row, we should look for a matching row in Table1.
Now, assume that only a few rows match. In this case, using a nested loop
and look up the row in Table1 is a good idea.
But what if all rows match? In this case, the pages in Table1 would be
accessed many times, and that would be expensive. Better then to scan
Table1 once. If there is a clustered index on (Field3, Field4), SQL
Server should be able to do a merge join, and scan both tables in
parallel. But if the index is non-clustered, then it's not of much
use, so instead SQL Server builds the hash table.
--
Erland Sommarskog, SQL Server MVP,
es****@sommarskog.se
Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp