Determining DISTINCT strings from a list

CJM
I have a bit of code which involves some looping...

In each iteration, we retrieve a string value... I want to build a list of
DISTINCT (in the SQL sense) string values.

Our example strings might be 'ABCD', 'ABCDXX', 'ABCDYY', 'ABCDZZ', 'ABCD',
'ABCDYY'
In which case, we want to end up with 'ABCD', 'ABCDXX', 'ABCDYY', 'ABCDZZ'

It's as simple as that.... I'm not bothered what structures we use...
It is easy to build a solution using loads of nested loops, but that seems a
bit inefficient, so I'm wanting to see if there are some good suggestions from
the floor...

Thanks in advance

Chris
Jul 19 '05 #1
Split the string into an array and sort it. Then, loop the array, caching
each value and throwing each new one found into a new string.

My .02.

--
William Morris
Product Development, Seritas LLC
Kansas City, Missouri

Jul 19 '05 #2
CJM
In plain English it seems straightforward, but if you try to code it I
think you will hit the same problem.

So you iterate through the loop, building up an array of all the values.

You say to then loop through the array, spotting 'each new one', which you
then place elsewhere (in a new array, perhaps).

But how exactly do you determine whether each value is already there, or
whether it is indeed new?

CJM

Jul 19 '05 #3
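The key is to sort the array first so that like values end up next to each
other. Then you code it something like this (written for speed, not accuracy
or syntax):

' arr is assumed to be sorted already
testValue = ""
newList = ""
for counter = 0 to ubound(arr)
    ' always keep the first element; after that, keep a value only when it changes
    if counter = 0 or testValue <> arr(counter) then
        testValue = arr(counter)
        newList = newList & arr(counter) & ","
    end if
next

' strip the trailing comma, if anything was collected
if len(newList) > 0 then newList = left(newList, len(newList) - 1)
response.write newList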
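
VBScript has no built-in array sort, so the sorted approach needs one before the loop above can work. A minimal sketch, assuming a plain insertion sort is fast enough for the list sizes involved; SortStrings and DistinctSorted are hypothetical helper names, not anything from this thread:

```vbscript
' Sorts a string array in place with an insertion sort (hypothetical helper).
Sub SortStrings(arr)
    Dim i, j, tmp
    For i = 1 To UBound(arr)
        tmp = arr(i)
        j = i - 1
        ' VBScript's And does not short-circuit, so the bounds check and the
        ' comparison are kept in separate statements.
        Do While j >= 0
            If arr(j) <= tmp Then Exit Do
            arr(j + 1) = arr(j)
            j = j - 1
        Loop
        arr(j + 1) = tmp
    Next
End Sub

' Returns a new array holding each value of an already sorted array once.
Function DistinctSorted(sortedArr)
    Dim result(), count, i
    If UBound(sortedArr) < 0 Then
        DistinctSorted = Array()
        Exit Function
    End If
    ReDim result(UBound(sortedArr))
    result(0) = sortedArr(0)
    count = 1
    For i = 1 To UBound(sortedArr)
        ' In a sorted array a duplicate always sits next to its twin.
        If sortedArr(i) <> sortedArr(i - 1) Then
            result(count) = sortedArr(i)
            count = count + 1
        End If
    Next
    ReDim Preserve result(count - 1)
    DistinctSorted = result
End Function

Dim values, distinctList
values = Array("ABCD", "ABCDXX", "ABCDYY", "ABCDZZ", "ABCD", "ABCDYY")
SortStrings values                 ' no parentheses, so the array is passed by reference
distinctList = Join(DistinctSorted(values), ",")
```

In an ASP page the last step would be Response.Write distinctList. The sort is the expensive part here; the duplicate check itself is a single pass.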

Jul 19 '05 #4
You can add your items to a Dictionary object to filter out duplicates by
assigning them as keys. Since a Dictionary cannot hold duplicate keys, Add
raises an error on each duplicate; wrap it in On Error Resume Next to skip
those errors and you are left with the unique values.

<%
theStrings = Array("ABCDXX", "ABCDYY", "ABCDZZ", "ABCD", "ABCDYY")

Set oDict = CreateObject("Scripting.Dictionary")
For i = 0 To UBound(theStrings)
    ' Add fails on a key that already exists; ignore just that error
    On Error Resume Next
    oDict.Add theStrings(i), theStrings(i)
    On Error Goto 0
Next

' Keys() returns the unique strings as an array
theFilteredStrings = oDict.Keys()
Set oDict = Nothing

Response.Write "The Filtered list:<br>"
For i = 0 To UBound(theFilteredStrings)
    Response.Write theFilteredStrings(i) & "<br>"
Next
%>
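
If you would rather not lean on error handling for control flow, the Dictionary's Exists method does the same job. A sketch only, with DistinctValues as a made-up helper name; it assumes the values are strings held in an ordinary zero-based array:

```vbscript
' Returns the distinct values of an array (hypothetical helper).
Function DistinctValues(values)
    Dim dict, i
    Set dict = CreateObject("Scripting.Dictionary")
    For i = 0 To UBound(values)
        ' Exists avoids the error that Add raises on a duplicate key.
        If Not dict.Exists(values(i)) Then
            dict.Add values(i), True   ' the stored value is irrelevant here
        End If
    Next
    DistinctValues = dict.Keys         ' Keys returns a plain array
    Set dict = Nothing
End Function

Dim filtered
filtered = DistinctValues(Array("ABCD", "ABCDXX", "ABCDYY", "ABCDZZ", "ABCD", "ABCDYY"))
```

No sort is needed with this approach, and in practice the keys come back in the order they were first added.
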
Ray at work
Jul 19 '05 #5
Assuming an array that is already sorted, you can spot repeats by keeping
track of the previous string. If the current one is the same, it's a duplicate.

<%
currentString = ""
x = array("a", "b", "b", "c")
for i = 0 to ubound(x)
    s = x(i)
    if i = 0 then
        ' first element: always keep it
        newString = s
        currentString = s
    elseif s <> currentString then
        ' the value changed, so this is a new distinct string
        currentString = s
        newString = newString & "," & s
    end if
next
Response.Write newString
%>

--
Aaron Bertrand
SQL Server MVP
http://www.aspfaq.com/

Jul 19 '05 #6
"Ray at <%=sLocation%>" <myfirstname at lane34 dot com> wrote in message
news:OE**************@tk2msftngp13.phx.gbl...
You can add your items to a dictionary object to filter out duplicates by
assigning them as keys. Since you cannot have duplicate keys, you can use an
On Error Resume Next to skip the errors (duplicates) and be left with uniques.

<%
theStrings = Array("ABCDXX", "ABCDYY", "ABCDZZ", "ABCD", "ABCDYY")

Set oDict = CreateObject("Scripting.Dictionary")
For i = 0 To UBound(theStrings)
    On Error Resume Next
    oDict.Add theStrings(i), theStrings(i)
    On Error Goto 0
Next

theFilteredStrings = oDict.Keys()
Set oDict = Nothing

Response.Write "The Filtered list:<br>"
For i = 0 To UBound(theFilteredStrings)
    Response.Write theFilteredStrings(i) & "<br>"
Next
%>
Ray at work


To add to Ray's solution, you can take advantage of the fact that
assigning to Dictionary.Item(key) creates the key if it does not
already exist, and simply overwrites the value if it does. As such:

<%
Const str = "'ABCD', 'ABCDXX', 'ABCDYY', 'ABCDZZ', 'ABCD','ABCDYY'"
Dim dct, arr, i, iMax
Set dct = CreateObject("Scripting.Dictionary")
arr = Split(str, ",")
iMax = UBound(arr, 1)
For i = 0 To iMax
    ' Trim removes the space left after each comma; the value is irrelevant
    dct.Item(Trim(arr(i))) = "Chris is great"
Next
Response.Write Join(dct.Keys, ",")
Set dct = Nothing
%>
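
One caveat if "DISTINCT in the SQL sense" is meant to ignore case, as it does under a case-insensitive database collation: Dictionary keys are compared case-sensitively by default. A sketch assuming case-insensitive matching is what is wanted; DistinctIgnoreCase is a hypothetical name:

```vbscript
' Returns the distinct values of an array, ignoring case (hypothetical helper).
Function DistinctIgnoreCase(values)
    Dim dct, i
    Set dct = CreateObject("Scripting.Dictionary")
    ' CompareMode can only be changed while the dictionary is still empty.
    dct.CompareMode = vbTextCompare
    For i = 0 To UBound(values)
        ' Assigning to Item creates the key only when no matching key exists,
        ' so the first spelling seen is the one that is kept.
        dct.Item(values(i)) = True
    Next
    DistinctIgnoreCase = dct.Keys
    Set dct = Nothing
End Function

Dim mixedCase
mixedCase = DistinctIgnoreCase(Array("ABCD", "abcd", "ABCDXX"))
```

Leave CompareMode alone if 'ABCD' and 'abcd' should count as different strings.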

-Chris Hohmann
Jul 19 '05 #7
Hello Chris,

Thanks for posting in the community.

Based on my understanding, the question is now: how do you remove duplicate
strings from a list of strings? Please correct me if I have misunderstood the
problem.

I can't think of a better way so far. In ASP.NET, we can use the collection
classes in the class library to check whether a string already exists in a
collection. For example, ArrayList.Contains determines whether an element is
in the ArrayList. I will discuss this with our ASP experts to see whether
there is an easy way to achieve it and reply here as soon as possible.
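
For what it is worth, the same ArrayList can usually be reached from classic ASP through COM. This is only a sketch and rests on an assumption: the .NET Framework must be installed on the server and must expose System.Collections.ArrayList to COM. DistinctViaArrayList is a hypothetical name:

```vbscript
' Returns the distinct values of an array using a .NET ArrayList (hypothetical helper).
Function DistinctViaArrayList(values)
    Dim list, i
    ' Assumption: the .NET Framework is installed and registered for COM use.
    Set list = CreateObject("System.Collections.ArrayList")
    For i = 0 To UBound(values)
        ' Contains scans the whole list, so this is still a loop inside a loop.
        If Not list.Contains(values(i)) Then
            list.Add values(i)
        End If
    Next
    DistinctViaArrayList = list.ToArray()
    Set list = Nothing
End Function
```

Because Contains is a linear search, this is tidier than hand-written nested loops but probably not faster for long lists.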

If there are any more questions, please feel free to post here.

Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Jul 19 '05 #8
CJM
Thanks to you all for your replies...

I'm afraid for some reason I can't see Ray's post, but fortunately his
solution is tagged on to the end of Chris's post!

I tried a similar thing with just a simple Collection, but it wasn't quite up
to the job. My next step was to have a collection of a UDT which, though
slightly long-winded, would do the job.

However, I like Ray's and Chris's solutions, so I'll try them first.
Thanks to all...
Chris

Jul 19 '05 #9
CJM
I've just looked into the Dictionary solution... I've never used or come
across the object before, but looking at it, it is an obvious solution! It's
effectively what I was hinting at with my proposed Collection/UDT solution.

Cheers Ray!

Chris
Jul 19 '05 #10
Hi All,

Ray and Chris's idea is creative. I hadn't thought of it before. :) What I
had in mind was to do this in compiled code (a COM object or ActiveX control)
to improve performance, and then use it in the page.

Thanks very much for sharing it in the community.

Also, Chris (CJM), thanks for participating in the community.

Have a good day.

Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Jul 19 '05 #11
