469,327 Members | 1,219 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,327 developers. It's quick & easy.

SELECT DISTINCT: remove duplicates when not all fields match, but display everything


I am pasting my SQL code at the end of this message.

I am trying to use SELECT DISTINCT in a query, but I am a beginner for using Access. Right now I have duplicate ID's (Indiv ID) in my table (All Illumina ITPR3), but the rest of the fields don't necessarily match (Haplotype, DR genotype).

If I run SELECT DISTINCT on "Indiv ID", I can't get the other fields to be displayed. If I run SELECT DISTINCT on all the fields I want (Indiv ID, Haplotype, DR genotype), then I still get duplicate ID's, because it wants to match on all three fields.

Is there a good way to have it not display the duplicate ID records (and not require that all fields match) but display all my other fields?

It would be awesome if I could get a result by Tuesday morning, so I could use this at work tomorrow, but this isn't extremely urgent.


SELECT [All Illumina ITPR3].[Indiv ID], [All Illumina ITPR3].Haplotype, [All Illumina ITPR3].[DR genotype]
FROM [All Illumina ITPR3], [DAISY Individuals DR and DP Typing];
Nov 27 '06 #1
6 10122
32,181 Expert Mod 16PB
I'm thinking about your question. If it makes sense it can be done - I'll get straight back to you.
In the mean time your query :
Expand|Select|Wrap|Line Numbers
  1. SELECT [All Illumina ITPR3].[Indiv ID], [All Illumina ITPR3].Haplotype, [All Illumina ITPR3].[DR genotype]
  2. FROM [All Illumina ITPR3], [DAISY Individuals DR and DP Typing];
only seems to get data from one table yet two are listed :s.
This will cause duplicates (assuming more than one record in the unused table.
It seems you can tidy the SQL up to :
Expand|Select|Wrap|Line Numbers
  1. SELECT [Indiv ID], Haplotype, [DR genotype]
  2. FROM [All Illumina ITPR3];
Nov 27 '06 #2
32,181 Expert Mod 16PB
Is there a good way to have it not display the duplicate ID records (and not require that all fields match) but display all my other fields?
I've thought about it and realised you're asking for the impossible. Not difficult you understand, actually nonsensical.
What could it display for the other fields if you request a unique ID?

Now, if you weren't requiring the actual values but were happy with any example value, then you could use GROUP BY with First(). In case this is acceptable the SQL you want is :
Expand|Select|Wrap|Line Numbers
  1. SELECT [Indiv ID],
  2.   First(Haplotype) AS 1stHaplotype,
  3.   First([DR genotype]) AS [1stDR genotype]
  4. FROM [All Illumina ITPR3]
  5. GROUP BY [Indiv ID];
Nov 27 '06 #3
Thanks-I will eventually draw data from two tables, which is why both are listed. The duplicate "records" are actually in the one table, "All Illumina ITPR3." In this table, I have more than one record per Indiv ID, but the information I need is listed in both records (hence wanting to only list it once and get rid of the duplicate). I can't just delete the records in my table, because there is some important data in both rows.

Thanks for checking into it for me!
Nov 27 '06 #4
32,181 Expert Mod 16PB
The important question isn't about handling duplicates (where all fields are the same), but how to handle records with duplicate ID but different data. I can't stress enough how important that concept is.
I hope I've helped anyway.
Nov 27 '06 #5
Thanks for the help-I just figured out what I needed to do. I was trying to make the query display data from the fields that varied between the two records with the same ID. However, if I just put in the fields that are invariant, using a SELECT DISTINCT, it works!

Thanks for clearing that up!


Here is my new code (with different, invariant fields):

Expand|Select|Wrap|Line Numbers
  1. ..SELECT DISTINCT [All Illumina ITPR3].[Indiv ID], [All Illumina ITPR3].[rs2229634 genotype], [DAISY Individuals DR and DP Typing].DPB_1
  2. FROM [All Illumina ITPR3] LEFT JOIN [DAISY Individuals DR and DP Typing] ON [All Illumina ITPR3].[Indiv ID] = [DAISY Individuals DR and DP Typing].[Indiv ID];..
Nov 27 '06 #6
32,181 Expert Mod 16PB
Sorry Erin. I'd picked up from your first post that you already understood that.
May I make a design concept suggestion.
Naming can be more important than people realise and long, involved names do cause problems.
Anything with embedded spaces cause []s to be mandatory. Long names cause SQL to be hard to understand easily when reading.
These things sound like style issues but they're really not. Usage and understanding will be much easier if short 'spaceless' names are used. Always use enough to relay what it's being used for though. Like most things in life it comes down to balance.

Anyway I'm glad you've got it sorted and understood now :).
Nov 27 '06 #7

Post your reply

Sign in to post your reply or Sign up for a free account.

Similar topics

4 posts views Thread by Marco Alting | last post: by
3 posts views Thread by Cindy | last post: by
9 posts views Thread by Kelvin | last post: by
reply views Thread by zhoujie | last post: by
reply views Thread by haryvincent176 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.