473,883 Members | 2,607 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

remove duplicate text within a string

4 New Member
How do I remove text that was duplicated within a field. For example:
Series
---------
AC Delco AC Delco Taper Spoiler
Bendix Bendix High Performance Altinators
B&I B&I Headlamps
Bridgestone Firestone Birdgestone Firestone AllWeather Tires
Dec 2 '11 #1
21 13235
Rabbit
12,516 Recognized Expert Moderator MVP
You could split the string by the space character and then loop through the array looking for distinct words.
Dec 2 '11 #2
Bobbie Macy
4 New Member
How do I go about doing that in sql. I have a process in java script but I can't get it to work in Access.

Expand|Select|Wrap|Line Numbers
  1. Function removeDuplicates (field) {
  2.     var temp = field.value;
  3.     var array = temp.split(" ");
  4.     array.sort();
  5.     temp = array.join(" ");
  6.  
  7.     do {
  8.         var newTemp = temp;
  9.         var temp = newTemp.replace(/\s(\w+\s)\1/, " $1");
  10.     } while (temp.length != newTemp.length);
  11.  
  12.     temp = temp.replace(/^(\w+\s)\1/, "$1");
  13.     temp = temp.replace(/(\s\w+)\1$/, "$1");
  14.  
  15. var orig = field.value.split(" ");
  16.     var finalStr = "";
  17.     for (var i=0; i<orig.length; i++) {
  18.         if (temp.indexOf(" " + orig[i] + " ") != -1) {
  19.             finalStr += orig[i] + " ";
  20.             temp = temp.split(" " + orig[i] + " ").join(" ");
  21.         } else if ((temp.indexOf(orig[i]) != -1) && (temp.indexOf(" "
  22. + orig[i]) == (temp.length-orig[i].length-1))) {
  23.             finalStr += orig[i] + " ";
  24.             temp = temp.substring(0, (temp.length-orig[i].length-1));
  25.         } else if (temp.indexOf(orig[i] + " ") == 0) {
  26.             finalStr += orig[i] + " ";
  27.             temp = temp.substring(orig[i].length+1, temp.length);
  28.         } else if (temp == orig[i]) {
  29.             finalStr += orig[i];
  30.             temp = "";
  31.         }
  32.     }
  33.  
  34.     if (finalStr.substring(finalStr.length-1, finalStr.length) == " ") {
  35.         finalStr = finalStr.substring(0, finalStr.length-1);
  36.     }
  37.     field.value = finalStr;
  38. }
Dec 2 '11 #3
Mariostg
332 Contributor
This would be a starter. i.e. remove successive words from string:
Expand|Select|Wrap|Line Numbers
  1. Function removeSuccessingWordInString(val As String) As String
  2. Dim sVal, s As Variant
  3. Dim p As Integer
  4. Static cleanString As String
  5. sVal = Split(val, " ")
  6. For p = 0 To UBound(sVal) - 1
  7.     If sVal(p) = sVal(p + 1) Then
  8.         cleanString = Trim(Replace(val, sVal(p), "", 1, 1))
  9.         removeSuccessingWordInString cleanString
  10.     End If
  11. Next p
  12. removeSuccessingWordInString = cleanString
  13. End Function
Dec 2 '11 #4
dsatino
393 Contributor
Just as a comment:
This appears to be the concatenation of 3 seperate fields from 3 separate tables. If you have control over the concatentation, then it would be easier to rewrite that then try an remove duplication.

If you're getting this from an external source, you might try asking them to modify the concatenation.
Dec 2 '11 #5
NeoPa
32,584 Recognized Expert Moderator MVP
Bobbie, why not explain in words (preferably as part of the original question, but failing that now) what it is you want, rather than dumping some code (in a language other than VBA) for us to try to determine what you should have asked.
Dec 3 '11 #6
ADezii
8,834 Recognized Expert Expert
I see no easy Method to accomplish this. The best results that I can come up with is the following, using a very unorthodox approach. Notice the last 2 Records:
Expand|Select|Wrap|Line Numbers
  1. Series                                                              No_Dups
  2. AC Delco AC Delco Taper Spoiler                                     AC Delco Taper Spoiler
  3. Bendix Bendix High Performance Altinators                           Bendix High Performance tinators
  4. B&I B&I Headlamps                                                    B&I Headlamps
  5. Bridgestone Firestone Bridgestone Firestone                          Bridgestone Firestone
  6. AllWeather Tires                                                     AllWeather Tires
  7. BIG KMART BIG KMART BIG KMART BIG KMART                              BIG KMART
  8. TJ Max TJ Thompsom                                                   TJ Max Thompsom
  9. Bill Clinton Hillary Clinton George Bush Bill Clinton George Bush    Bill Clinton Hillary George Bush
  10.  
Dec 3 '11 #7
Bobbie Macy
4 New Member
@NeoPa
I have an Acces table of fields and one of the fields is concatenated and unfortunately one of the fields was concatenated twice. I don't have the data in from the separate fields otherwise I will simply fix the concatenated field. So I'm stuck with this field to fix. I can't use spaces or drop the first or second word because this problem of duplicate words isn't in all the fields.

So I was looking for a way using an access query to remove the duplicae words. The code I attached earlier I found on the internet and looks like it would work but it's in Java.
Dec 3 '11 #8
NeoPa
32,584 Recognized Expert Moderator MVP
I recognise that at least as an attempt to specify the problem Bobbie (which is a good step forward :-)). Specifications are often a problem to get right, so all we expect is an honest attempt.

Now let's see if we can tidy it up a little. Correct me if I go astray.
  1. You have a table within which is a single field which was previously put together by concatenating various separate data strings.
  2. Unfortunately, the first of these values was included twice in the resultant string.
  3. The string values were originally joined together using spaces to separate them.
  4. Any or all of the values may, but don't necessarily have to, contain embedded spaces within themselves.
  5. So, we have a string field in a table which has a duplication at the start which may consist of one or more words separated in all cases by spaces.
  6. The only indication we have that an original value may have ended and a new one begun (within the resultant data we're working with) is the occurrence of a space character.
  7. You need to determine some logic first, and later some code that implements that logic, to handle de-duplication of the first value added to the string.
Dec 3 '11 #9
ADezii
8,834 Recognized Expert Expert
Thanks to NeoPa's clarification, the following Function, when used in a Query (Calculated Field), will handle 'Start of Field Duplication' for a Maximum of 6 Words. It can easily be expanded if so desired. I ran it against Test Data, some of it yours, and I'll post the Code and results below:
Expand|Select|Wrap|Line Numbers
  1. Public Function fCheckForDupes(strSeries As String)
  2. Dim varDups As Variant
  3. Dim intNumOfWords As Integer
  4. Dim strRetVal As String
  5.  
  6. varDups = Split(strSeries, " ")
  7.  
  8. intNumOfWords = UBound(varDups) + 1
  9.  
  10. Select Case intNumOfWords
  11.   Case 1    '1 Word, return Word
  12.     strRetVal = varDups(0)
  13.   Case 2    '2 Words, check for equality
  14.     If varDups(0) = varDups(1) Then     'Equal, drop the 2nd Word
  15.       strRetVal = varDups(0)
  16.     Else    '2 Words, not equal, return original String
  17.       strRetVal = varDups(0) + " " & varDups(1)
  18.     End If
  19.   Case 3    'Is the 1st equal to the 2nd Word?
  20.     If varDups(0) = varDups(1) Then     '1st & 2nd equal, Drop 2nd
  21.       strRetVal = varDups(0) & " " & varDups(2)
  22.     Else    'No equality, return entire String
  23.       strRetVal = varDups(0) + " " & varDups(1) & " " & varDups(2)
  24.     End If
  25.   Case 4   'Is the 1st = the 2nd OR are the 1st and 2nd = 3rd and 4th?
  26.     If varDups(0) = varDups(1) Then     'Drop the 2nd
  27.       strRetVal = varDups(0) & " " & varDups(2) & " " & varDups(3)
  28.     ElseIf (varDups(0) = varDups(2)) And (varDups(1) = varDups(3)) Then
  29.       strRetVal = varDups(0) & " " & varDups(1)     'Drop 3rd and 4th
  30.     Else
  31.       strRetVal = strSeries
  32.     End If
  33.   Case 5
  34.     If varDups(0) = varDups(1) Then     'Drop the 2nd
  35.       strRetVal = varDups(0) & " " & varDups(2) & " " & varDups(3) & _
  36.                   " " & varDups(4)
  37.     ElseIf (varDups(0) = varDups(2)) And (varDups(1) = varDups(3)) Then
  38.       strRetVal = varDups(0) & " " & varDups(1) & " " & varDups(4)    '
  39.     Else
  40.       strRetVal = strSeries
  41.     End If
  42.   Case 6
  43.     If varDups(0) = varDups(1) Then
  44.       strRetVal = varDups(0) & " " & varDups(2) & " " & varDups(3) & _
  45.                   " " & varDups(4) & " " & varDups(5)
  46.     ElseIf (varDups(0) = varDups(2)) And (varDups(1) = varDups(3)) Then
  47.       strRetVal = varDups(0) & " " & varDups(1) & " " & varDups(4) & _
  48.                   " " & varDups(5)
  49.     ElseIf (varDups(0) = varDups(3)) And (varDups(1) = varDups(4)) And _
  50.            (varDups(2) = varDups(5)) Then
  51.       strRetVal = varDups(0) & " " & varDups(1) & " " & varDups(2)
  52.     Else
  53.       strRetVal = strSeries
  54.     End If
  55.   Case Else
  56.     'If all else fails, return the Original String
  57.     strRetVal = strSeries
  58. End Select
  59.  
  60. fCheckForDupes = strRetVal
  61. End Function
Expand|Select|Wrap|Line Numbers
  1. Series                                             New_Series
  2. AC Delco AC Delco Taper Spoiler                    AC Delco Taper Spoiler
  3. Bendix Bendix High Performance Altinators          Bendix High Performance Altinators
  4. B&I B&I Headlamps                                  B&I Headlamps
  5. Bridgestone Firestone Bridgestone Firestone        Bridgestone Firestone
  6. AllWeather Tires                                   AllWeather Tires
  7. Encyclopedia                                       Encyclopedia
  8. One Two Three                                      One Two Three
  9. YaDa YaDa                                          YaDa
  10. One One Three Four                                 One Three Four
  11. One Two One Two Five                               One Two Five
  12. One Two Three Four Five                            One Two Three Four Five
  13. One Two Three Four                                 One Two Three Four
  14. One One Three Four Five Six                        One Three Four Five Six
  15. One Two Three One Two Three                        One Two Three
  16. 1 2 3 4 5 6                                        1 2 3 4 5 6
Dec 3 '11 #10

Sign in to post your reply or Sign up for a free account.

Similar topics

2
13283
by: Damien Cobbs | last post by:
Is there a way to remove duplicate rows from a table that does not have a primary key leaving one copy of the row without comparing each column to each other as suggested by the code example below: delete from foo x where x.rowid <> ( select min(rowid) from foo y where y.primary key = x.primary key )
7
23941
by: Voetleuce en fênsievry | last post by:
Hello everyone. I'm not a JavaScript author myself, but I'm looking for a method to remove duplicate words from a piece of text. This text would presumably be pasted into a text box. I have, for example, a list of town names, but there are hundreds of duplicates, like: "Aberdeen Aberdeen Aberdeen Edinburg Edinburg Inverness etc."
1
2470
by: guoqi zheng | last post by:
I have a dataset with about 20-50 records on it. Some of has the same value for a key column, those duplicates need to be removed. How can I remove duplicate itmes from a dataset???? I know some poeple will tell me to use GROUP BY and DISTINCE to filter out duplicates in SQL, but in my case, I am not able to do this! Any help?
6
22892
by: Paul | last post by:
Hi just wondering if there is an easy way to remove spaces in a string, tried the trim method but think it is only for leading or trailing spaces. thanks -- Paul G Software engineer.
3
2616
by: dazzle | last post by:
I have an XML file and I would like to remove duplicate nodes within it but I can't get my head round on how to do this. Example XML file: <root> <plugin> <title>A9</title> <url>some url</url>
1
16614
by: JTreefrog | last post by:
Hello - I've read a ton of stuff about deleting duplicate values in an array. They are all very useful - they just haven't addressed an array of objects. Here's my array: var sDat = ; The array is produced from JSON sent to me from a database query. I can't do a "group by" in the sql query - i actually need the repeats in a different array. Anyway - I want to be able to remove duplicate sDat.sid entries in my sDat array - but I'm...
1
3475
by: tosachinji | last post by:
Hi I am new to xslt. Could you please tell me, how can we remove duplicate records from a xml file. Here is the xml file: <Row> <Cell><Data>Active</Data></Cell> <Cell><Data>D</Data></Cell> <Cell><Data>1</Data></Cell>
0
1929
by: TAB | last post by:
"none" <none@none.comskrev i meddelandet news:eaJIxCA7IHA.2260@TK2MSFTNGP03.phx.gbl... Use richtextbox selection properties. Like richTextBox.SelectedText or SelectedRtf if you would like to keep the rtf formatting, SelectionColor, SelectionBackColor, SelectionFont and so on. e.g. string str = richtextBox1.SelectedText;
2
9775
by: ismailc | last post by:
Good day, I need help. I have 2 problems, 1. I want to return the text within a string between : "New Description " - I want the value "1234" 2. I want to remove the "" from the string, the problem is that the string differs in legth so i can't just remove the last 6 char. "New Description " "New Description " I want to return "New Description"
2
4300
by: postman | last post by:
I need to remove all 8 or 9-digit numbers from memo fields in a table. They are at varying positions within the fields. I was able to construct a SELECT query to retreive all the records that contain these numbers, using "Like" and a pattern, but I can't figure out a query solution to replace these numbers with a mask, like "**********". Can probably construct a VBA solution to loop through all the characters of each memo field, but I'd...
0
11125
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10734
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10836
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
10407
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
9568
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7962
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5982
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
4607
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
3
3230
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.