By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
438,575 Members | 1,968 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 438,575 IT Pros & Developers. It's quick & easy.

It's Me again!

P: 26
Okay,
I have a table with a LONGTEXT field. Some records have 1000+ characters in the suspect field. Since it's a LONGTEXT field, i can't figure out how to set a no-dupe index on it.

I get duplicate records where the first 100 characters are the same. If the first 100 are the same, the entire record is the same. Guaranteed. I have a field that includes the first 50 chars, but sometimes they are the same. (fld 1 "James d. Smith" fld 2= "James d. Smith has deployed" or "James d. Smith has been promoted"

I would like to eliminate existing duplicates. My thought is to create a TEXT field with 100 length, use a query to copy the first 100 chars over to it for each record. THEN, copy the structure, make that field UNIQUE then copy the records over, thus eliminating the duplicates.

Anyone have any better ideas? anyone have any code that does this?
Sep 15 '08 #1
Share this Question
Share on Google+
1 Reply


iburyak
Expert 100+
P: 1,017
Here is my little test to simulate what you have and remove duplicates. I worked with first 14 characters just to make it small.


Expand|Select|Wrap|Line Numbers
  1. 1. Create a table
  2. create table #a (name varchar(100))
  3. 2. Insert some records
  4. insert into #a values ('James d. Smith has deployed')
  5. insert into #a values ('James d. Smith has deployed')
  6. insert into #a values ('b James d. Smith')
  7. insert into #a values ('a James d. Smith has deployed')
  8. insert into #a values ('James d. Smith has deployed')
  9. insert into #a values ('James d. Smith has deployed')
  10. insert into #a values ('b James d. Smith')
  11. insert into #a values ('a James d. Smith has deployed')
  12.  
  13. 3. Run code: 
  14. set rowcount 1
  15.  
  16. select 1
  17. While @@Rowcount > 0
  18. BEGIN
  19. delete from #a where substring(name,1,14) = (select top 1 substring(name,1,14)
  20.                          from #a
  21.                          group by substring(name,1,14)
  22.                          having count(*) > 1)
  23. END
  24. set rowcount 0 
  25.  
  26. 4. Check result
  27. select * from #a order by 1 
Your result should be:

a James d. Smith has deployed
b James d. Smith
James d. Smith has deployed

Then you can add a column to the table with first 100 characters to be able to have unique records.

Good Luck.
Sep 16 '08 #2

Post your reply

Sign in to post your reply or Sign up for a free account.