How do I compare these 2 tables ?

Join Date: Jan 2009
Posts: 278
#1: Jan 11 '10
Hi,

I have two comparisons that I need to do, and they are a bit
beyond my knowledge. I would appreciate some help. :)

I have two tables with an identical structure:
tableA is the incoming daily update, which I want to
compare to the history table, tableB.

Both structures look like this:
id, title, desc, data1, data2, ... data18

(1)
The first comparison is to check for new rows. I think I can
do this by using the LEFT JOIN command

Is this how I do it ?

    $query = "SELECT tableA.*, tableB.* FROM tableA LEFT JOIN tableB ON tableA.id = tableB.id";

    $result = mysql_query($query) or die(mysql_error());

(2)
My second comparison is to find all the rows that have the same ids
but differ in the other fields, i.e. not NEW records but CHANGED records.

Atli
Join Date: Nov 2006
Location: Iceland
Posts: 4,680
#2: Jan 12 '10

Hey.

Would I be right in guessing that the goal of these comparisons is to insert the data from the "input" table into the "history" table, adding new rows as new rows but updating old rows with the new data? (New and old rows being defined by whether their IDs preexist in the history table.)

If so, you don't have to do that manually. You can use the ON DUPLICATE KEY UPDATE clause of the INSERT statement to do this automatically.

For example, say I have "input_table" and "storage_table" tables that both share the exact same structure:

    CREATE TABLE `storage_table` (
        `id` Serial Primary Key,
        `title` VarChar(255) Not Null Default 'Untitled',
        `desc` VarChar(255) Not Null Default 'No description'
    );

    -- Same structure for the input table.
    CREATE TABLE `input_table` LIKE `storage_table`;

Given that they have the following data:

    INSERT INTO `storage_table`
        (`id`, `title`, `desc`)
    VALUES
        (1, 'First in storage', 'This is the first row defined in the storage table'),
        (2, 'Second in storage', 'This is the second row defined in the storage table'),
        (3, 'Third in storage', 'This is the third row defined in the storage table');

    INSERT INTO `input_table`
        (`id`, `title`, `desc`)
    VALUES
        (1, 'First from input', 'This is the first row, updated from the input table.'),
        (3, 'Third from input', 'This is the third row, updated from the input table.'),
        (4, 'Fourth from input', 'This is the fourth row, new from the input table.');

I could issue this command:

    -- Rows whose id already exists in storage_table are updated in place;
    -- all other rows are inserted as new rows.
    INSERT INTO `storage_table`
        (`id`, `title`, `desc`)
    SELECT
        `id`, `title`, `desc`
        FROM `input_table`
    ON DUPLICATE KEY UPDATE
        `id`    = VALUES(`id`),
        `title` = VALUES(`title`),
        `desc`  = VALUES(`desc`);

After which the data in the "storage_table" would become:

    +----+-------------------+------------------------------------------------------+
    | id | title             | desc                                                 |
    +----+-------------------+------------------------------------------------------+
    |  1 | First from input  | This is the first row, updated from the input table. |
    |  2 | Second in storage | This is the second row defined in the storage table  |
    |  3 | Third from input  | This is the third row, updated from the input table. |
    |  4 | Fourth from input | This is the fourth row, new from the input table.    |
    +----+-------------------+------------------------------------------------------+

See what I mean?
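
As a rough sanity check on what the statement did, MySQL's affected-rows count can be read back afterwards. For INSERT ... ON DUPLICATE KEY UPDATE, each newly inserted row counts as 1, each existing row that was actually changed counts as 2, and an existing row that already held the same values counts as 0. A minimal sketch, assuming the sample `storage_table` and `input_table` from this post and that both statements run on the same connection:

```sql
-- Same upsert as in the example, repeated so the sketch is self-contained.
INSERT INTO `storage_table`
    (`id`, `title`, `desc`)
SELECT
    `id`, `title`, `desc`
    FROM `input_table`
ON DUPLICATE KEY UPDATE
    `id`    = VALUES(`id`),
    `title` = VALUES(`title`),
    `desc`  = VALUES(`desc`);

-- Must be the very next statement on the same connection.
SELECT ROW_COUNT() AS affected_rows;
```

With the sample data that rule works out to 2 + 2 + 1 (rows 1 and 3 updated, row 4 inserted). The count cannot tell you which rows changed, so treat it as a check rather than a comparison.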

dgreenhouse
Join Date: May 2008
Location: San Francisco
Posts: 154
#3: Jan 13 '10

UPDATE:
Looking at Atli's post, I realized that (his?) recommendation works in one fell swoop.
I'd go with that, but you will need to specify all the column names.
You can leave out the column names in the insert and the sub-select, but you WILL need all of the column names after the 'on duplicate key update' clause.
i.e.

    insert into tableB (select * from tableA)
    on duplicate key update
    id = values(id),
    title = values(title),
    `desc` = values(`desc`),
    data1 = values(data1),
    ...
    data18 = values(data18);

Note the backticks ` around `desc`: desc is a reserved word in MySQL, ergo the need for the backticks.
As Atli showed, it's probably best to always use backticks to avoid query failures when using reserved words as column and/or table names.
But it's best to avoid reserved words altogether.

END-UPDATE:

For the first criterion (pulling new records in), this should work:

    insert into tableB (select * from tableA where id not in (select id from tableB));
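
If you only want to look at the new rows rather than insert them, the LEFT JOIN from the original question can be turned into an anti-join by keeping just the rows where no match was found in tableB. A minimal sketch, assuming id is the primary key in both tables, so tableB.id can only come back NULL when there is no matching row:

```sql
-- Rows present in tableA (daily update) but missing from tableB (history).
SELECT tableA.*
FROM tableA
LEFT JOIN tableB
    ON tableA.id = tableB.id
WHERE tableB.id IS NULL;
```

This should return the same ids that the NOT IN subquery picks up. Which form runs faster depends on the MySQL version and the indexes, so it is worth checking with EXPLAIN on real data.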

For the second criterion, you could use Atli's suggestion by using the "on duplicate key update" clause of an insert statement.

The one problem I see with this is when you get a lot of records in your history table (tableB). The other way requires a fairly complex select/insert statement.
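
If you do want to see the CHANGED rows on their own before touching tableB, the select side could look roughly like the sketch below. It assumes the column list from the first post (title, desc, data1 through data18) and uses MySQL's NULL-safe equality operator <=>, so that a NULL on only one side counts as a difference. The middle data columns are elided here, as they are in the thread, and would need to be written out.

```sql
-- Rows whose id exists in both tables but where at least one other column differs.
SELECT tableA.*
FROM tableA
INNER JOIN tableB
    ON tableA.id = tableB.id
WHERE NOT (tableA.title  <=> tableB.title)
   OR NOT (tableA.`desc` <=> tableB.`desc`)
   OR NOT (tableA.data1  <=> tableB.data1)
   -- ... one line per column for data2 through data17 ...
   OR NOT (tableA.data18 <=> tableB.data18);
```

String comparisons follow the column collation, so with a case-insensitive collation a change in letter case alone would not show up as a difference.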

But it's best to K.I.S.S.

Hope that helps...

By the way, unless there's a compelling reason to have the tables structured the way you have them, I can foresee problems later when you might want to expand the application.

(i.e. having that many fields named data1, data2, ..., data18)

But there's nothing inherently bad about the structure if that's indeed what you need.

(note: if indeed your structure is like data1,data2..., it will make the more complex select update command(s) easier.)

If you haven't already, look into "normalization". As is often the case, it may be too hard to go back and redesign an application that works for the most part, but DO look into normalization for future projects.

Atli
Join Date: Nov 2006
Location: Iceland
Posts: 4,680
#4: Jan 13 '10

@dgreenhouse
I agree, it is very important to keep the normalization rules in mind when designing your databases. Normalized databases are generally easier to maintain and upgrade.

You can check out Database Normalization and Table Structures in our MS Access forums. It describes these rules very nicely.

@dgreenhouse
Indeed :)