JStrummer wrote:
I just put together a PHP mailing list sign-up page that posts to a
mySQL DB. The PHP script prevents INSERTs when the email address is
already located in the database.
Problem: I need to import some flat-files that stored the signups,
prior to this new form. Email addresses weren't checked, so there are
a lot of records that have the same email address.
Once I import these into the table, how would I go about putting
together a SQL statement against that table that would locate
duplicate records (ie. records that share the email address field) and
delete all subsequent ones, but not the first record with that email
address?
One solution would be to update rather than insert, when a duplicate
exists. There seem to be two ways to do this with MySQL (using MySQL
extensions to standard SQL):
REPLACE, which works like INSERT but performs an UPDATE instead if the
data you are entering matches an existing record by primary key.
See
http://dev.mysql.com/doc/mysql/en/REPLACE.html
INSERT... ON DUPLICATE KEY UPDATE
See
http://dev.mysql.com/doc/mysql/en/INSERT.html
If this doesn't work in your case (for instance, if the email field
isn't the primary key), here's another solution: create a second table
with similar structure to the old email addresses table. Import your
flat file to this new table. Then do an outer join between the two
tables to find the newly imported addresses that _aren't_ in the old table:
select n.email
from new_emails n left outer join old_emails o on n.email = o.email
where o.email is null
The result is the set that you need to add to the table. You can insert
directly from a select query:
insert into old_emails select email ...
Regards,
Bill K.