473,686 Members | 3,266 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Combine multiple rows with one common field into one row & dedupe

2 New Member
Hi all - I've been browsing this forum for a while now, looking for help, and finally decided to join and ask my question, since I'm not terribly good at writing SQL queries in Access. Here is my dilemma:

Table U contains 1 field (Field F, primary key = no dupes)
Table N contains 50 fields (including Field F, same as in Table U)
* Field F in Table N contains multiple records, with dupe values

Would like to create a new Table (or just append to Table U), all values in Table N that match on Field F, but only with one record for each value of Field F. Here are sample layouts of the two Tables:

Table U
Field F
A
B
C
etc

Table N
Field F Field F1 Field 52 ..... Field 10x
A A1 A2 A5
A A2 A3 A3
A A3 A7 A8
B B1 B2 B3
C C1 C9 C7
C C1 C2 C4

Basically, I want to end up with the following:

New Table
Field F Field F1 Field 52 ..... Field 10x
A A1 A2 A5 A3 A7 A8
B B1 B2 B3
C C1 C9 C7 C2 C4

Where, there is ultimately 1 record per value in Field F, and each of the corresponding values from multiple records of Field F are added to the end of the one record, if that makes logical sense.

I have tried many different things, from INTERSECT, to CONCAT, to INNER JOIN, to OUTER JOIN, to LAST, but my SQL isn't good enough even to post here, as nothing really works. :)

Thank you so much in advance!
karin
Oct 28 '08 #1
3 5640
Stewart Ross
2,545 Recognized Expert Moderator Specialist
Hi Karin. I don't think It's your SQL skills that are the problem here; it's the way you are trying to represent the data, which would appear to be more freeform (spreadsheet-like) than entity-based (database-like) in character.

Database records within a table are generally vertical in structure - with each row representing a different item, and each column a property of that item.

What you are considering is something very different, where in effect you are pivoting what appears to be the same kind of entity in each case as a kind of sideways record of variable length attached to a primary key. In general that is not what databases do, and it is why you will not be able to use any variant of SQL to generate your pivoted records.

It is always possible to use bespoke programming to process bespoke data, using memo fields or the like to store custom lists, but then you will need custom processing to do anything with the results of such storage as well (even to read the lists once generated).

I would suggest taking a different approach, which is to consider whether or not a relational database is the right tool for the job. If it is, then you really need to reconsider what the table structure of your multi-record table is, so that it can be handled properly using standard 1-to-many relationships which, as mentioned, in a relational database are vertical and not horizontal in nature. The horizontal part of a table structure in a database models the fields of each table, not the records themselves. The fields in turn model the discrete properties of the entity being stored.

It is because there is currently a complete mismatch of what you have apparently got in your database with what relational databases are good at doing that you face such difficulties trying to get the end results you seek. This often happens if users are used to working with spreadsheet products rather than databases - I wonder if that is the case here?

It would also help us to help you if you could post example data which is closer to actual real-world data, rather than the use of dummy data values such as A1, A2 etc. We can't really see the patterns in your data from such heavily-disguised values, and it becomes more difficult to give you sound advice on how to progress this task when the data is not easy to understand.

-Stewart
Oct 28 '08 #2
klbachrodt
2 New Member
Hi Stewart...Thank s for the input...I think what I'm trying to get at is in fact a typical db structure, but maybe i'm mistaken. Here is some sample data:

F N na N1 r2 n3 r4
anastasia stacy
andrea andrew drea rea andrew drea rea
andrea andrew dea
andreas andy
andrew andy ansey drewandrea andy ansey
ansey drewandrea
andrew andy drew andrea andy
andrew drew
angela angel angelica angelina angeline angie jane
angel angelica angelina angeline angie jane angelica
angelina angeline angie jane angelina angeline angie
jane angeline angie jane angie jane
angela angelica angelina angeline angelica angelina angeline
angelina angeline
angelina lina
angeline angie
anita annette
anita nan
anita nana
anita nina
anita nita
ann hannah nan nanny nancy nana hannah
ann nan nanny nancy nana nan nanny
ann nancy nana nanny nancy nana nancy
ann nana nana
ann anna nettie anna nettie
ann anna
ann anne
ann annie

What I need is either:
* one record for each unique name in Field F, which contains all possible values (which might be in multiple rows)
OR
* one record for each combination of unique name (in Field F) and each possible corresponding value from any/all of the other fields?

Does that help?
Thanks!
kbr

Hi Karin. I don't think It's your SQL skills that are the problem here; it's the way you are trying to represent the data, which would appear to be more freeform (spreadsheet-like) than entity-based (database-like) in character.

Database records within a table are generally vertical in structure - with each row representing a different item, and each column a property of that item.

What you are considering is something very different, where in effect you are pivoting what appears to be the same kind of entity in each case as a kind of sideways record of variable length attached to a primary key. In general that is not what databases do, and it is why you will not be able to use any variant of SQL to generate your pivoted records.

It is always possible to use bespoke programming to process bespoke data, using memo fields or the like to store custom lists, but then you will need custom processing to do anything with the results of such storage as well (even to read the lists once generated).

I would suggest taking a different approach, which is to consider whether or not a relational database is the right tool for the job. If it is, then you really need to reconsider what the table structure of your multi-record table is, so that it can be handled properly using standard 1-to-many relationships which, as mentioned, in a relational database are vertical and not horizontal in nature. The horizontal part of a table structure in database models the fields of each table, not the records themselves. It is because there is currently a complete mismatch of what you have apparently got in your database with what relational databases are good at doing that you face such difficulties trying to get the end results you seek. This often happens if users are used to working with spreadsheet products rather than databases - I wonder if that is the case here?

It would also help us to help you if you could post example data which is closer to actual real-world data, rather than the use of dummy data values such as A1, A2 etc. We can't really see the patterns in your data from such heavily-disguised values, and it becomes more difficult to give you sound advice on how to progress this task when the data is not easy to understand.

-Stewart
Oct 28 '08 #3
Stewart Ross
2,545 Recognized Expert Moderator Specialist
Hi. Your data is not normalised in any way (see our HowTo article on Database Normalisation and Table Structures), so is decidedly non-standard. It is simply not possible to use the wide range of SQL techniques available for data summary and reporting on non-normalised data structures. It is not an if, a but or a maybe - non-normalised data is of no use in a relational database.

It would appear that you actually have just two separate fields in your table: a name, and a synonym for that name. The synonym is being repeated a number of times (which breaks the first rule of normalisation, to separate repeating groups into individual tables).

As an example of how your data SHOULD look in a database table I show an extract below using just the name-synonym structure:

ann hannah
ann nan
ann nanny
ann nancy
ann nana
ann anna
ann anne
ann annie
andrea andrew
andrea drea
andrea rea

The key field is repeated for each record, and followed by one synonym in each record - not multiple synonyms (unless these are in fact different, separately-identifiable elements of the underlying record with a particular meaning that must be recorded).

There can be no duplicate rows in a relational database table, so the repeated name-synonym combinations (as shown in your sample data) would be dealt with (simply by not being stored!).

Given this kind of vertical structure you CAN use SQL techniques to summarise the data and prepare lists in whatever way is appropriate; at present I cannot see any way to do so that can be applied systematically.

Ask yourself what the transformation rule you need to apply would be if you were to do the task you propose manually (forgetting SQL for the moment). If you would struggle to define the rules, covering all the cases you currently have, then you would find it even more of a struggle (if it can be done at all) to define a set of rules that can be implemented using a highly-structured processing language such as SQL.

I strongly advise you to look again at the structure of your data. Non-normalised data and relational databases do not mix.

-Stewart
Oct 29 '08 #4

Sign in to post your reply or Sign up for a free account.

Similar topics

2
6529
by: Knighterrant | last post by:
How to combine multiple xml files into a single one, or split an xml file into multiple files using xslt?
8
7094
by: mark | last post by:
Access2000 How do I write a query that combines the CTC field from each record below into one record? I need to concatenate the CTC field with a separator, like below: BattID VehicleID STDATE STTIME CTC LKO500HF 00000000 10/27/2003 4:13:51 AM 4 LKO500HF 00000000 10/27/2003 5:13:51 AM 5 LKO500HF 00000000 10/27/2003 10:13:51 AM 6 LKO500HF 00000000 10/27/2003 11:13:51 AM 4
2
4079
by: Ross | last post by:
i have less hair now than i used to...i have a database with many duplicate records in it with only one exception, there is one field with the date it was put in the database and that is different. i can eliminate duplicate records with ease but how do you specify in an sql statement to delete the one with the earlier date? please help me before i go bald... thanks
1
23340
by: mojo | last post by:
We have set up an Access database with tables x,y, & z where we would like to have multiple people entering data into a table x. Each person has an identical copy of the database on their PC's. When the person is done entering the data into table x, we want them to e-mail their database to one person who will combine the data from table x into a single database. What's the easiest method to combine the tables into a single table? ...
5
3002
by: jhutchings | last post by:
Hello everyone, I have a database where I collect shipment data from various tables. However, I have a problem. Whenever I want to see shipping data for orders that were set to ship on or before a certain date (in this case January 30th) the database will return 2 rows for an order as you can see below. Order ID: Line: Due Date: Qty: Ship Qty: Part #: Shipped: 141285 1 1/30/2006 31 10 S15F-55
7
43137
by: Mintyman | last post by:
Hi, I'm working on a system migration and I need to combine data from multiple rows (with the same ID) into one comma separated string. This is how the data is at the moment: Company_ID Material 0x00C00000000053B86 Lead 0x00C00000000053B86 Sulphur 0x00C00000000053B86 Concrete
1
1534
by: Bombus | last post by:
Hello. I have MS Access 2003 and Visual Basic 6.0. In the Access database and two tables tblA and tblB that have a common field fldPN (Personal Number). I use ADO connection. From tblA I have selected a few columns that I show in VB Form1, and a datagrid showing all rows with the same fldPN value in both tables. This works perfect, and was not to hard to code. However, instead of the datagrid (Or in the datagrid) I would like to...
8
11190
by: Cars | last post by:
Hi, Im trying to dedupe about 30,000 records in a access database, i have done the simple dedupe wizard which has worked, but as you know it only does exact matching dedupes, does anyone , know of any access add ins i can use to help like fuzzy logic, The records are just simple company name, address, account number, name etc fields, What would be the best way to dedupe this data, and i need to do this on a regular basis, This task has...
4
5280
by: IT Couple | last post by:
Hi I'm pretty new to SSIS and I wonder if you could help me. I download HTML files (thounsands) as text files using SSIS and then using foreach loop I load each file into table text data type field. Works great apart from that some files appear on multiple rows I presume SSIS thinks there is a new record but there isn't as I want to load the whole file into the table text data type field. So I have two questions: 1) How to import text...
0
8584
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
8516
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9054
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
8768
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8778
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
7599
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
6440
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5796
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
1
2945
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.