473,398 Members | 2,403 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,398 software developers and data experts.

De-Duplicate SQL Script?

I just put together a PHP mailing list sign-up page that posts to a
mySQL DB. The PHP script prevents INSERTs when the email address is
already located in the database.

Problem: I need to import some flat-files that stored the signups,
prior to this new form. Email addresses weren't checked, so there are
a lot of records that have the same email address.

Once I import these into the table, how would I go about putting
together a SQL statement against that table that would locate
duplicate records (ie. records that share the email address field) and
delete all subsequent ones, but not the first record with that email
address?

Any help would be appreciated.
Jul 20 '05 #1
1 2942
JStrummer wrote:
I just put together a PHP mailing list sign-up page that posts to a
mySQL DB. The PHP script prevents INSERTs when the email address is
already located in the database.

Problem: I need to import some flat-files that stored the signups,
prior to this new form. Email addresses weren't checked, so there are
a lot of records that have the same email address.

Once I import these into the table, how would I go about putting
together a SQL statement against that table that would locate
duplicate records (ie. records that share the email address field) and
delete all subsequent ones, but not the first record with that email
address?


One solution would be to update rather than insert, when a duplicate
exists. There seem to be two ways to do this with MySQL (using MySQL
extensions to standard SQL):

REPLACE, which works like INSERT but performs an UPDATE instead if the
data you are entering matches an existing record by primary key.
See http://dev.mysql.com/doc/mysql/en/REPLACE.html

INSERT... ON DUPLICATE KEY UPDATE
See http://dev.mysql.com/doc/mysql/en/INSERT.html

If this doesn't work in your case (for instance, if the email field
isn't the primary key), here's another solution: create a second table
with similar structure to the old email addresses table. Import your
flat file to this new table. Then do an outer join between the two
tables to find the newly imported addresses that _aren't_ in the old table:
select n.email
from new_emails n left outer join old_emails o on n.email = o.email
where o.email is null

The result is the set that you need to add to the table. You can insert
directly from a select query:
insert into old_emails select email ...

Regards,
Bill K.
Jul 20 '05 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: grigoo | last post by:
bonjour a tous je me presente a vous::: greg dit le grigoo sur le web ,,etudiant en bioinformatique a montreal et jusqu au cou dans notre language prefere....java. et biojava.. et je suis en un...
0
by: Laurent Pointal | last post by:
Bienvenue sur la liste Python francophone, hébergée par l'AFUL, ou sur le newsgroup fr.comp.lang.python ("fclp"). Votre abonnement à cette liste de diffusion ou votre lecture de fclp montrent...
0
by: Michael Dyson | last post by:
M. Michael DYSON DIRECTEUR-ADJOINT SOCIÉTÉ DE SÉCURITÉ SARL. TEL.00229 20 21 80 Cotonou République du Bénin Email:michaeldyson2005@latinmail.com Bonjour . Je sais que mon message sera d’une...
0
by: Alejandro Scomparin | last post by:
FORMULACION, PREPARACION Y EVALUACION ECONOMICA Y FINANCIERA DE PROYECTOS DE TECNOLOGIA INFORMATICA (IT) en la UTN Inicia: 17 de Mayo. 19 hs Dirigido a Esta preparado para gerentes, jefes,...
0
by: Alejandro Scomparin | last post by:
FORMULACION, PREPARACION Y EVALUACION ECONOMICA Y FINANCIERA DE PROYECTOS DE TECNOLOGIA INFORMATICA (IT) en la UTN Inicia: 17 de Mayo. 19 hs Dirigido a Esta preparado para gerentes, jefes,...
0
by: gandalf | last post by:
Con motivo del enorme esfuerzo realizado para visitar la ciudad argentina de Córdoba, participar en la Reunión del MERCOSUR, en la clausura de la Cumbre de los Pueblos en la histórica Universidad...
1
by: crow | last post by:
http://www.pagina12.com.ar/diario/elpais/1-72984-2006-09-14.html Por Miguel Bonasso Desde La Habana Me había preparado para verlo, pero la realidad fue mucho más fuerte. Incluso le llevaba de...
1
by: gandalf | last post by:
CON LA PASION DE SIEMPRE HABLO DE CHAVEZ, DE LA MEDICINA CUBANA... Y DE SU PROPIA MUERTE Relato de la nueva gran batalla de Fidel El líder cubano mostró cómo evoluciona su recuperación en el...
1
by: Sebastien | last post by:
Bonjour, Je tient d'abort a m'excuser de poster cette demande ici, mais je ne vois pas tres bien ou fair cette demande, Je développe un logiciel de génération de code (Génération de Code VB et...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.