423,867 Members | 1,958 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 423,867 IT Pros & Developers. It's quick & easy.

Converting Excel to Normalised Database

PhilOfWalton
Expert 100+
P: 1,380
I am trying to create a normalised database from a linked Excel file as a general purpose database.

Suppose the Excel File is
Expand|Select|Wrap|Line Numbers
  1. Person Number, Name, Address Lines, City, State, Zip Code
I have already created the following tables.
Expand|Select|Wrap|Line Numbers
  1. TblCities
  2.     CityID               AutoNumber  PK
  3.     City                  Text
  4.     StateID            Long               FK
  5.  
  6. TblPersons
  7.     PersonID        AutoNumber  PK
  8.     PerdonNo            Long
  9.     PersonName            Text
  10.     AddressID        Long        FK
  11.  
  12. TblStates
  13.     StateID        AutoNumber  PK
  14.     State        Text
  15.  
  16. TblAddress
  17.     AddressID        AutoNumber  PK
  18.     AddressLines    Text
  19.     ZipCode        Text
  20.     CityID        Long        FK
Note the arbitrary order of the above tables.

I also have a table that gives the Excel field names that match the Access Table & Field Names.

Now in order to populate them, I must do it in the correct order:-
TblStates, TblCities, TblAddress, TblPersons

I had hoped that MsysRelationships would give me the correct information, but although I get the relevant table names & linking fields, the Table names seem to be arbitrarily in the szObject column or the szReferenceObject column.

Any thoughts please.

Phil
Oct 11 '18 #1
Share this Question
Share on Google+
12 Replies


twinnyfo
Expert Mod 2.5K+
P: 2,720
Hey Phili,

Could you explain this one a bit better? I “thought” that if you have created the tables yourself (and established the relationships yourself) that all would be well during the import. I personally don’t care what MsysRelationships says, because sometimes Access creates its own names—that can always be changed later on if we really need to.

Is the data not importing correctly?
Oct 11 '18 #2

PhilOfWalton
Expert 100+
P: 1,380
Hi Twinnyfo,

Sorry the explanation was not clear.

I am trying to create a general purpose Db that can be "incorporated" into a client's Db to Import / Link one or more Excel files, Scan those files and put the data into the Client's existing tables.

Now to do that, the tables have to be built up in a certain order, and the first ones to be populated must me those with no Foreign keys (end of a chain).

Hence in the example above, nothing depends on the TblStates, but the TblCities relies on knowing the State that it is in (StateID).

Likewise, the TblAddress needs the correct city, so the appropriate CityID must be established before the TblAddress can be populated.


Here is is an initial attempt, but please ignore the last 3 columns - I think I can pull this information from MsysRelationships.

Having defined the details of the Excel file in an Access table (Folder, File and Sheet), I link the Excel File and call it "TblExcelPlant". Also load the Date Last Modified. I also define the Excel Link Field in case there is more than 1 Excel file, and they all have a common unique field that potentially combines the Excel files.

In the subform, the first column (Excel Field Name) lists all the field names found in the Excel File.
The second column (Access Table Name) shows all the tables in the database (obtained from MsysObjects)
The third column (Access Field Name) shows all the fields in the table defined in the second column.

So we can say for example the field called "Project Manager" in the Excel table has to be loaded to the "LastName" in the TblEmployees.

Hope that is clearer

Phil




Hope
Attached Images
File Type: jpg Excel.jpg (271.4 KB, 124 views)
Oct 11 '18 #3

twinnyfo
Expert Mod 2.5K+
P: 2,720
So, now your project makes sense (which is amazing, as usual), but I am not exactly sure what the problem is, as things look as they should from your pic.

perhaps an example of:
the Table names seem to be arbitrarily in the szObject column or the szReferenceObject column.
That might help?
Oct 11 '18 #4

PhilOfWalton
Expert 100+
P: 1,380
Sorry for the delay in getting back. Getting old is a pain.

Here are two images



and



Can you see the logic?

Phil
Attached Images
File Type: jpg RelationshipDiagram.jpg (296.0 KB, 118 views)
File Type: jpg Relationships.jpg (343.9 KB, 116 views)
Oct 11 '18 #5

PhilOfWalton
Expert 100+
P: 1,380
Replying to my own question, I see it now

The szObject is the ∞ side of the relationship, and szReferencedObject is the 1 side.

So I have to process the 1 sides before the ∞ sides.

Take the example tblProjectEmployees which need 3 tables - tblYears, tblProjects and tblEmployees. tblYears depends on nothing, so that's not a problem. The other 2 tables have both the 1 side and the ∞ side in them, so I get back to the original question of "which order do I populate the tables? and how do I program that in?

Phil
Oct 11 '18 #6

Rabbit
Expert Mod 10K+
P: 12,280
Sounds like you need to recurse the relationships until you get to the bottom level nodes (all tables that that are referenced only and does not reference other tables) and then work back up.
Oct 11 '18 #7

PhilOfWalton
Expert 100+
P: 1,380
@Rabbit

Not quite sure what you have in mind here. The only experience I have had, is recursively going through a folder to find sub folders, sub sub folders etc, but in that case we have a linear tree structure.

In my case the twigs and branches get grafted back to the trunk or other twigs and branches. Quite incestuous.

Can you be more specific, please

Phil
Oct 12 '18 #8

Rabbit
Expert Mod 10K+
P: 12,280
In pseudocode, it would probably look something like this:

Expand|Select|Wrap|Line Numbers
  1. Sub RecurseRelationship(tableName)
  2.     arrayRelations = Get relations for tableName
  3.  
  4.     For Each item in arrayRelations
  5.         If item has a many on the tableName side and 1 on the otherTable
  6.             RecurseRelationship(otherTable)
  7.         End If
  8.     Next
  9.  
  10.     ' If it gets here, then it's found a bottom level table or a table that has no relationships
  11.     Call LoadTable(tableName)
  12. End Sub
The actual code will probably have to account for situations that I didn't think of or include in the pseudocode. For example, you may need to keep an array of parsed tables so you're not thrown into some infinite loop of references.
Oct 12 '18 #9

PhilOfWalton
Expert 100+
P: 1,380
Thanks, that That looks feasible.

To make things very simple, below is a relationship diagram of 4 tables, assume they all have primary keys and required Foreign keys.


Now we can see that we have to populate tables C & D first because it will not appear in the szObjects in MSysRelationships:- There is only the one side of the relationship.

So then if we find the table pointed to by the relationship from Table D (Table A) and try to populate that, we will fail.

First we have to find the table pointed to by C (Table B), populate that, then find the table pointed to by C (back to A) which, providing we know that D has been populated, we can now populate Table A.

So your Pseudo code will probably find the bottom level (which as mentioned is easy to find directly in MSysRelationships) and it might find the next level up, but as in this example, that may not be sufficient information.

I need to produce a query that looks something like this:-
Expand|Select|Wrap|Line Numbers
  1. Table      Sequence
  2.    D          1
  3.    C          2
  4.    B          3
  5.    A          4
Although C & D could both have the same value or be reversed.

Any ideas please?

Phil
Attached Images
File Type: png SimpleRelation.png (6.5 KB, 109 views)
Oct 12 '18 #10

Rabbit
Expert Mod 10K+
P: 12,280
Should work fine, the code should follow down the left path or right path first, work its way back up, then go down the other path.
Oct 13 '18 #11

Rabbit
Expert Mod 10K+
P: 12,280
How are you coming along?
Oct 16 '18 #12

PhilOfWalton
Expert 100+
P: 1,380
Hi Rabbit.

It's hard going, but I think it's coming along OK

This is where I am so far:-


I can select a level, and if I have got it right, it shows the fields in a table, where the data comes from, and where it is used. Level 1 is the base line tables, and in the case of this DB there are 6 levels (ignore the 7 - now corrected)

In this picture I am selecting the field "Cost Account - Level 2" from the Excel table "TblExcelCosts" as the equivalent of Field "CostDescription" in the Acess Table "tblCostTypes"

Other problems, but I am starting a new thread.

Thanks for your concern

Phil
Attached Images
File Type: jpg FrmLoadData.jpg (228.2 KB, 78 views)
Oct 16 '18 #13

Post your reply

Sign in to post your reply or Sign up for a free account.