473,788 Members | 2,751 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Importing from delimited text file & removing dulpicates problem

Hi
I have a problem which I hope someone can help me with because I
really don't even know where to start with it.
I am using Access 2003. I have a delimited text file which contains
about 200,000 lines. There are only 2 fields, Role and ID. The role
field contains many duplicates, out of the 200,000 records I think
there are only 16,000 unique roles.

Role ID
AB_F2S_COST_PLA N_CLK EE1
AB_F2S_COST_PLA N_CLK LT10
AB_F2S_COST_PLA N_CLK LT18
AB_F2S_COST_PLA N_CLK LT3
AB_F2S_COST_PLA N_CLK LT7
AB_F2S_COST_PLA N_CLK LT8
AB_F2S_COST_PLA N_DSP TR8
AB_F2S_COST_PLA N_DSP TH2
AB_F2S_COST_PLA N_DSP BM23

I need to produce a single unique list of roles with the ID's related
to that role all concatenated together with comma's as separaters.

e.g
Role
ID
AB_F2S_COST_PLA N_CLK EE1,LT10,LT18,L T3,LT7,LT8
AB_F2S_COST_PLA N_DSP TR8,TH2,BM23

Is this possible and if so, how ?
Thanks in advance

Jun 14 '07 #1
5 2013
Hi

The following should help you...

Step 1: Import your delimited file into a table named "Roles" with
"Role" as 1st named Field and "ID" as 2nd named field

e.g.

Role ID
AB_F2S_COST_PLA N_CLK EE1
AB_F2S_COST_PLA N_CLK LT10
AB_F2S_COST_PLA N_CLK LT18
AB_F2S_COST_PLA N_CLK LT3
AB_F2S_COST_PLA N_CLK LT7
AB_F2S_COST_PLA N_CLK LT8
AB_F2S_COST_PLA N_DSP TR8
AB_F2S_COST_PLA N_DSP TH2
AB_F2S_COST_PLA N_DSP BM23
Step 2: Create a new module and add the following function to it

i.e.

Function ConcatDetail1(N um As String) As Variant
Dim rs As DAO.Recordset
Dim strOut As String
Dim strSQL As String
Dim lngLen As Long
Const strcSep = ","

strSQL = "SELECT ID FROM Test WHERE Role = '" & Num & "';"
Set rs = DBEngine(0)(0). OpenRecordset(s trSQL)
With rs
Do While Not .EOF
strOut = strOut & !ID & strcSep
.MoveNext
Loop
End With
rs.Close

lngLen = Len(strOut) - Len(strcSep)
If lngLen 0 Then
ConcatDetail1 = Left(strOut, lngLen)
Else
ConcatDetail1 = Null
End If
Set rs = Nothing
End Function
Step 3: Create a new query, add the "Roles" Table to it. Use the SQL
VIEW form of the Query and put the following in it...

i.e.
SELECT DISTINCT Test.Role, ConcatDetail1([Role]) AS ID
FROM Test;

Save the query.

Step 4. Run the query and you should get your required result.

i.e.

AB_F2S_COST_PLA N_CLK EE1,LT10,LT18,L T3,LT7,LT8
AB_F2S_COST_PLA N_DSP TR8,TH2,BM23
Step 5. You can now export the query.

If you need further clarification please ask...

Regards,

osmethod

(N.B. The "ConcatDeta il1" function was provided to me by Allen Browne
when I needed a similar resolution to yours..)

See
http://groups.google.ie/group/comp.d...d&rnum=1&hl=en

Jun 14 '07 #2
I have followed your instructions but when I ran the query an error
occured saying that the table 'Test' could not be found. I figured
that my table should be re-named as 'Test' and not 'Roles'. I then ran
the query again and left it for about 40 minutes. It still wasn't
finished so I aborted it. Should it really be taking this long to sort
200,000 records ?!

Jun 15 '07 #3
On Jun 15, 1:14 pm, tom.hepwo...@jt-int.com wrote:
I have followed your instructions but when I ran the query an error
occured saying that the table 'Test' could not be found. I figured
that my table should be re-named as 'Test' and not 'Roles'. I then ran
the query again and left it for about 40 minutes. It still wasn't
finished so I aborted it. Should it really be taking this long to sort
200,000 records ?!
You are correct - table name is test.

Hard to judge the length of time. Some things that may help...

1. Put an index(duplicate s ok) on each of the fields in the test
table.

2. Create a smaller file sample of 100 records from your live file and
import it first.Then run the query and see the effect first.

Regards,

osmethod

Jun 15 '07 #4
On Jun 14, 12:18 pm, tom.hepwo...@jt-int.com wrote:
Hi
I have a problem which I hope someone can help me with because I
really don't even know where to start with it.
I am using Access 2003. I have a delimited text file which contains
about 200,000 lines. There are only 2 fields, Role and ID. The role
field contains many duplicates, out of the 200,000 records I think
there are only 16,000 unique roles.

Role ID
AB_F2S_COST_PLA N_CLK EE1
AB_F2S_COST_PLA N_CLK LT10
AB_F2S_COST_PLA N_CLK LT18
AB_F2S_COST_PLA N_CLK LT3
AB_F2S_COST_PLA N_CLK LT7
AB_F2S_COST_PLA N_CLK LT8
AB_F2S_COST_PLA N_DSP TR8
AB_F2S_COST_PLA N_DSP TH2
AB_F2S_COST_PLA N_DSP BM23

I need to produce a single unique list of roles with the ID's related
to that role all concatenated together with comma's as separaters.

e.g
Role
ID
AB_F2S_COST_PLA N_CLK EE1,LT10,LT18,L T3,LT7,LT8
AB_F2S_COST_PLA N_DSP TR8,TH2,BM23

Is this possible and if so, how ?
Thanks in advance
See if this code-behind-form runs quickly enough:

Private Sub cmdProcess_Clic k()
Const ROLEMAX = 30000
Dim UniqueRole(ROLE MAX) As String
Dim IDList(ROLEMAX) As String
Dim lngI As Long
Dim lngJ As Long
Dim lngMax As Long
Dim lngRecords As Long
Dim MyDB As Database
Dim MyRS As Recordset
Dim OutRS As Recordset
Dim strSQL As String
Dim strID As String
Dim strRole As String
Dim boolRoleFound As Boolean
Dim boolIDFound As Boolean
Dim strTemp As String
Dim intComma As Integer
Dim strTest As String
Dim tdf As TableDef
Dim fld As Field
Dim boolOutputTable As Boolean

Set MyDB = CurrentDb
boolOutputTable = False
For Each tdf In MyDB.TableDefs
If tdf.Name = "tblOutput" Then
boolOutputTable = True
Exit For
End If
Next tdf
If Not boolOutputTable Then
Set tdf = MyDB.CreateTabl eDef("tblOutput ")
Set fld = tdf.CreateField ("Role", dbText, 255)
tdf.Fields.Appe nd fld
Set fld = tdf.CreateField ("IDList", dbMemo)
tdf.Fields.Appe nd fld
tdf.Fields.Refr esh
MyDB.TableDefs. Append tdf
MyDB.TableDefs. Refresh
Set tdf = Nothing
Set fld = Nothing
Else
strSQL = "DELETE tblOutput FROM tblOutput;"
MyDB.Execute strSQL, dbFailOnError
End If
For lngI = 1 To ROLEMAX
UniqueRole(lngI ) = ""
IDList(lngI) = ""
Next lngI
strSQL = "SELECT Role, ID FROM Roles;"
Set MyDB = CurrentDb
Set MyRS = MyDB.OpenRecord set(strSQL, dbOpenSnapshot)
MyRS.MoveLast
lngRecords = MyRS.RecordCoun t
MyRS.MoveFirst
lngMax = 0
For lngI = 1 To lngRecords
strID = MyRS("ID")
strRole = MyRS("Role")
'Search for Role in arrays
If lngMax 0 Then
boolRoleFound = False
For lngJ = 1 To lngMax
If UniqueRole(lngJ ) = strRole Then
boolRoleFound = True
'Update the ID list, empty string, single ID, multiple ID no
substring, multipleID substring
If IDList(lngJ) = "" Then
IDList(lngJ) = strID
ElseIf InStr(1, ",", IDList(lngI), vbTextCompare) = 0 And
IDList(lngJ) <strID Then
IDList(lngJ) = IDList(lngJ) & "," & strID
ElseIf InStr(1, strID, IDList(lngI), vbTextCompare) = 0 Then
IDList(lngJ) = IDList(lngJ) & "," & strID
Else
'Check each piece
boolIDFound = False
strTemp = IDList(lngJ)
Do While Len(strTemp) 0
intComma = InStr(1, ",", strTemp, vbTextCompare)
If intComma 0 Then
strTest = Left(strTemp, intComma - 1)
If strTest = strID Then
boolIDFound = True
Exit Do
Else
strTemp = Right(strTemp, Len(strTemp) - Len(strTest))
End If
Else
'only one piece
If strTemp = strID Then boolIDFound = True
strTemp = ""
End If
Loop
If boolIDFound = False Then
IDList(lngJ) = IDList(lngJ) & "," & strID
End If
End If
Exit For
End If
Next lngJ
If boolRoleFound = False Then
'Add a new Role and ID
lngMax = lngMax + 1
UniqueRole(lngM ax) = strRole
IDList(lngMax) = strID
End If
Else
UniqueRole(1) = strRole
IDList(1) = strID
lngMax = 1
End If
If lngI <lngRecords Then MyRS.MoveNext
Next lngI
MyRS.Close
Set MyRS = Nothing
'Append array values to table
strSQL = "SELECT * FROM tblOutput;"
Set OutRS = MyDB.OpenRecord set(strSQL, dbOpenDynaset)
For lngI = 1 To lngMax
OutRS.AddNew
OutRS("Role") = UniqueRole(lngI )
If IDList(lngI) <"" Then OutRS("IDList") = IDList(lngI)
OutRS.Update
Next lngI
MsgBox ("Done.")
End Sub

Note that by ordering the query (from "SELECT Role, ID FROM Roles;" to
"SELECT Role, ID FROM Roles ORDER BY Role, ID;") it is possible to
simplify greatly the checks for existing Roles and ID's. The main
idea is that a single pass is made through the table. Note that I
didn't put checks in for when lngMax exceeds ROLEMAX - 1. That should
be done also. This was coded in A97 so you may need 'DAO.' in a few
Dim statements (Recordset and possibly Database). This code can also
be done more elegantly with ADO or by using newer commands such as
Split and Join. Anyway, it should give you a good place to start.

James A. Fortune
CD********@Fort uneJames.com

Jun 15 '07 #5
On Jun 15, 5:15 pm, CDMAPos...@Fort uneJames.com wrote:
On Jun 14, 12:18 pm, tom.hepwo...@jt-int.com wrote:


Hi
I have a problem which I hope someone can help me with because I
really don't even know where to start with it.
I am using Access 2003. I have a delimited text file which contains
about 200,000 lines. There are only 2 fields, Role and ID. The role
field contains many duplicates, out of the 200,000 records I think
there are only 16,000 unique roles.
Role ID
AB_F2S_COST_PLA N_CLK EE1
AB_F2S_COST_PLA N_CLK LT10
AB_F2S_COST_PLA N_CLK LT18
AB_F2S_COST_PLA N_CLK LT3
AB_F2S_COST_PLA N_CLK LT7
AB_F2S_COST_PLA N_CLK LT8
AB_F2S_COST_PLA N_DSP TR8
AB_F2S_COST_PLA N_DSP TH2
AB_F2S_COST_PLA N_DSP BM23
I need to produce a single unique list of roles with the ID's related
to that role all concatenated together with comma's as separaters.
e.g
Role
ID
AB_F2S_COST_PLA N_CLK EE1,LT10,LT18,L T3,LT7,LT8
AB_F2S_COST_PLA N_DSP TR8,TH2,BM23
Is this possible and if so, how ?
Thanks in advance

See if this code-behind-form runs quickly enough:
...
Note that by ordering the query (from "SELECT Role, ID FROM Roles;" to
"SELECT Role, ID FROM Roles ORDER BY Role, ID;") it is possible to
simplify greatly the checks for existing Roles and ID's. The main
idea is that a single pass is made through the table. Note that I
didn't put checks in for when lngMax exceeds ROLEMAX - 1. That should
be done also. This was coded in A97 so you may need 'DAO.' in a few
Dim statements (Recordset and possibly Database). This code can also
be done more elegantly with ADO or by using newer commands such as
Split and Join. Anyway, it should give you a good place to start.

James A. Fortune
CDMAPos...@Fort uneJames.com
The good news:

I created a Roles table with 200,000 records and a lot of duplicates.
The code ran in about a minute on a slow machine. A version with an
ordered recordset and simplified logic should be run quite quickly.

The bad news:

It didn't prevent some duplicate ID's from ending up in
tblOutput.IDLis t. The simplified logic using an ordered recordset
should handle that problem nicely.

Post back if you need help with this for either the unordered or
ordered recordset cases.

James A. Fortune
CD********@Fort uneJames.com

Jun 17 '07 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
5433
by: Elmo Watson | last post by:
I've been asked to develop a semi-automated type situation where we have a database table (sql server) and periodically, there will be a comma delimited file from which we need to import the data, replacing the old. I naurally know that we can use to kill the other data, but does anyone have any examples of importing a comma delimited file into SQL Server with ASP?
1
3679
by: sparks | last post by:
I have never done this and wanted to ask people who have what is the best way. One person said import it to excel, then import it into access table. but since this will be done a lot, I am trying to avoid this extra step. can access read in this file directly into a table as append data? thank you very much for any pointers
8
2552
by: harry | last post by:
Hi Folks: I'm trying to do something that looks simple, but I can't make it work right. It's Access 2000 on a Win2000 computer. I create a database with 5 columbs. The data I need to import is text. The way it's formatted is one line each field, for the 5 fields, then to the next record. The text file looks like this: WPTJ410
1
2132
by: don | last post by:
I'm trying to import a comma delimited text file into MS Access 2002 version - the first column is set for an auto increment primary key and everything works right the first time I import a text file, however when I try to append to this new database table Access always messes up the import - as long as I ieave the auto increment column out it has no trouble appending to the database , and then I have to add the auto increment column...
10
2972
by: hendafe | last post by:
I have a problem of importing a text file into Access. The text file will be imported using a delimiter, say ( , ) for example. This should be imported into the database. An example is: John, is, a, boy Ada, is, a, girl I have tried using the DoCmd.TransferText but I am not getting results.
4
2218
by: Kathie via AccessMonster.com | last post by:
Hello, I have to import monthly, files that were once *.csv but due to commas in addresses, the interface program was changed to dump tab delimited. Now my code is not finding the files in the folder? The code is below - can anyone help? (The files still are named with the extension of *.csv) Function Import_Records() 'On Error GoTo Import_Records_Err 'Delete the all records in MONTHLY_IMPORTS table 'Call...
17
2032
by: OdAwG | last post by:
Just some questions regarding tables. I am new Access Database and need a little help. I have the following data listed below 01. I have a table called tbl_Customer with the following information listed below 02. In that table I have three columns that have numeric data in it (City, State, Marital_Status) 03. I need to convert the numeric data in table tbl_customer (City, State, Marital_Status) to alpha numeric data 04. using the...
5
5852
by: Andy | last post by:
Hello All: I am importing an Excel / CSV file. The problem I am having is: the columns are being defined for me as int32 (able to determine by using the ..GetFieldType method). As a result: when a row has a character, we are losing this data, because it is being interprested as "". How would I go about making the columns always text? Below is my code that I am using: System.Data.OleDb.OleDbConnection con = new...
2
4774
by: denisel | last post by:
Hi, We will be conducting surveys through SurveyMonkey online and will be importing the answers by tab delimited or comma delimited file into access. I was wondering if there is specific way to import delimited files into Access or Is there a specific way I should design the survey online so Access can recognize the file? Here is my table structure to give you an idea on where I'm coming from. Thank you so much in advance!!! ...
0
9656
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9498
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10366
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10173
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
8993
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
6750
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5536
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
3674
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2894
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.