Parsing XML file

zmbd
5,501 Expert Mod 4TB
Looking for tricks or methods of tracking the Node>Children>SubChildren in an XML traverse.

Loading the XML isn't an issue; I have working code for that. The problem is determining the level of recursion as I traverse the tree: branches, twigs, leaves. What I need to do is relate the leaves back to the twigs, and the twigs back to the branches
(yes, I know: Nodes, Child Nodes, Subchild nodes, etc...)

Tables I'm looking at (simplified)
[t_leaves]
[PK][FK_MM_T2B][Information from the leaf]

[MM_T2B]
[PK][t_Branches][t_twigs]

[t_twigs]
[PK][Information about the twig]

[t_branches]
[PK][information about the branch]

When I tried to do the simple External>XML-Import my poor database had some 185 tables based on the nodes in the tree... very sad indeed... and there were no relationships to link the data in one table to the next.

No, I do not know where the XML came from.
This isn't urgent, as right now I'm manipulating the data by hand in Excel - but it's painful to do.

HOWEVER, if I open the file in Excel it will place the data in the columns by level, etc...
So a group label in Col(A) spans the rows under that group, etc., so I know the XML parser can skim the data. I'm almost to the point of instancing Excel, opening the XML in Excel and then walking the rows! However, I know there is an elegant way of doing this if I can just figure out how to solve tracking the recursion...

This is a Very Simplified version of the XML I'm working with and unfortunately the information is confidential.
    <!-- AIR CODE example XML -->
    <Root>
      <Group>
        <UUID>1234</UUID>
        <Name>GroupName</Name>
        <!-- Blah Blah Blah -->
        <Group>
          <UUID>7890</UUID>
          <Name>CatagoryName</Name>
          <!-- Blah Blah Blah -->
          <Entry>
            <UUID>ABCD</UUID>
            <ItemID>InternalIdOfItem</ItemID>
            <ItemName>InternalNameOfItem</ItemName>
            <!-- Blah Blah Blah -->
          </Entry>
        </Group>
        <Group>
          <!-- Here we have another Root\Group\child that follows the first child, different UUID etc... -->
        </Group>
      </Group>
    </Root>

So when I parse the file, I'd like to distribute the information so that:

    [t_leaves]
    [AutoNum_PK][1][ABCD][InternalIdOfItem][InternalNameOfItem]
    [AutoNum_PK][2][EFGH][InternalIdOfItem2][InternalNameOfItem2]

    [MM_T2B]
    [1][1][20]
    [2][2][1]
    [3][15][null]
    [...]

    [t_Twigs]
    [AutoNum_PK=1][7890][CatagoryName]
    [AutoNum_PK=2][A123][CatagoryName2]

    [t_Branches]
    [AutoNumber_PK=1][0001][GroupName]
    [...]
    [AutoNumber_PK=20][1234][GroupName20]

(Yes, I thought about a pedigree-type self-link within [t_branches], and that may be how I end up merging the data; however, it doesn't help with tracking the parent to the child or subchild.)

I'd post my VBA; however, it's very dirty right now, with dozens of commented-out lines of code and notations.

This is my recursion call:

    Call SpiderChildren(zSpider.childNodes, zRecursion, zSpiderGroupLevel)

zRecursion As Long - it's passed into the sub, incremented by one within the recursion scope

zSpiderGroupLevel As Long - testing the <tag> to see if it's a group and if so then zSpiderGroupLevel = zRecursion

This is my print line to the text file:

    If PrintSpider Then _
      Print #zFreeFile, String(zRecursion, ".") & _
        " (" & " ParentNode: " & _
        zSpider.ParentNode.basename & _
        ") (Spider: " & zSpider.basename & _
        ")(text: " & zSpider.Text & ")"

I found a few example files to work with so attached are the results. These have really nothing to do with the actual XML data - just fodder for the recursion code while I develop the parser.

SiteXML.XML.txt <remove the .txt>

xmlParse.txt is the output file.
(the periods are added String(zRecursion, ".") ) the end of file is "<*>" just tracking for me so that I know that we actually finished parsing the big file.
Attached Files
File Type: txt xmlParse.txt (2.3 KB, 377 views)
File Type: txt SiteXml.Xml.txt (1.3 KB, 528 views)
Jul 26 '18 #1
PhilOfWalton
1,430 Expert 1GB
I know nothing of XML and the only thing I know about trees & leaves is that they block my view of the river & sea in summer.

That said, and I am probably way off track, I have a routine that loads a web page, then scans the page source for a particular piece of information. I use it mainly to extract a share price from differently formatted Web pages. It works perfectly on most pages, but on some pages, the information is buried a little deeper, and I guess that is where your SubNodes come in.

If that could be of any help, let me know.

Phil
Jul 27 '18 #2
zmbd
5,501 Expert Mod 4TB
Good Morning PhilOfWalton
Thank you for the offer.
Can the code you have scraping the webpage relate the child node back to an element in the calling node's structure?

I think the secret is that, as I pass the parameters on to the next recursion, I have to have already pulled the <UUID> and/or <Name> node and pass them along in the same way I've been tracking the recursion level.

My current thought is that the main loop might be (air code):
(btw - spider name from Web-Search Spiders)

    For Each zSpider In zSpiders
      If zSpider.hasChildNodes Then
        If zSpider.tagName = "Group" Then
          '? Flag to pull <UUID> and <NAME> passed to the next recursion to put in module level variable?
          '? Once in the module level variable how to pull out the values - would a TempVars or a global level collection be the way to go?
          'As it stands now when zRecursion is passed the value is at the recursive called scope
          Call SpiderChildren(zSpider.childNodes, zRecursion, zSpiderGroupLevel, zFlagToPullUuidName)
        End If
      End If
      Print #zFreeFile, String(zRecursion, ".") & " (" & " ParentNode: " & zSpider.ParentNode.basename & ") (Spider: " & zSpider.basename & ")(text: " & zSpider.Text & ")"
    Next zSpider

Jul 27 '18 #3
PhilOfWalton
1,430 Expert 1GB
Good Evening, zmbd. Sorry, as I said, I know nothing about nodes.

My method is to hunt for clues.
So here is a portion of a web page: https://www.londonstockexchange.com/...GBGBXSTMM.html

    <div class="commonTable table-responsive">
    <table width="100%" cellspacing="0" cellpadding="0" summary="Price data">
    <tbody>
    <tr class="even">
    <td class="name">Price&nbsp;(GBX)</td>
    <td>1,155.00</td>
    <td class="name">Var % (+/-)</td>
    <td>
    <span class="green">+1.23%</span>
    <span class="green">
    (<img src="/media-sme/img/icon/up.gif" alt="Up" />
    +14.00)

I have a table of "Clues" called Codes

This is a picture of the "clues" to find the price



So I scan the website page source until I find a sequence of
<td class="name">Price&nbsp;(GBX)</td>
<td>
</td>

The price I am looking for is designated by "Share Price" on the second line (though I don't think it matters that the words "Share Price" are used).

Sorry, I can't help you with the arboriculture

Phil
Jul 27 '18 #4
zmbd
5,501 Expert Mod 4TB
Looks like you're opening the source and then reading the text directly?

There are over 100 such patterns in the file I'm trying to parse...
In one instance I get something like

    <Group>
      <UUID></UUID>
      <Name></Name>
      <Comment />
      <DefaultAuthUserStatusReq />
      <Group>
        <!--Blah Blah Blah-->
        <DefaultAuthUserStatusReq>AsstMngr</DefaultAuthUserStatusReq>
      </Group>
    </Group>

Another instance would be

    <Group>
      <UUID></UUID>
      <Name></Name>
      <DefaultAuthUserStatusReq>Mngr</DefaultAuthUserStatusReq>
      <DefaultAuthUserStatusReq />
      <Group>
        <!--Blah Blah Blah-->
        <DefaultAuthUserStatusReq />
      </Group>
    </Group>

So take, for instance, <DefaultAuthUserStatusReq />.
In the first case, the default user status needed to pull items under the first group is not set, so anyone has default access to the items under this group; however, in one of the sub-groups the default is set so that at least an assistant manager (or supervisor) is required to authorize the

In the second case, the default is set to a manager, department lead, or higher, and the subgroup doesn't override that requirement.

Right now I'm just trying to pull the Group level UUID and Names - say something like KeePass
Group for internet
.SubGroup for Work
..Entries for Work
.SubGroup for Personal
..Entries for Personal

If you export that file you get a real mess with Root/Groups/Groups etc... Yet somehow, when you import this file into a new KeePass, it parses the XML and sets up the tree without any effort at all...

so maybe this will help
I'm using the MSXML object

    '>>>This is an Abstract of the Main loop<<<
    'No module level variables - yet
    '
    Sub ParseXmlToTables()
    '
      Dim zMsXml As Object
    '(...)
    'assume all of the other variables are properly declared.
    'their names should provide context for usage - if unclear ask
    'Error trapping is also setup
    '
    '(... setup for the file path etc...)
    ' I should move this to the module level... so that I don't have to keep closing the file and reopening... I'll change that after this post.
    '
      Set zMsXml = CreateObject("MSXML2.DOMDocument.6.0")
    '
    'set the parser parameters and load the document
    '(async = False so that Load has finished before parseError is checked)
      With zMsXml
        .resolveExternals = False
        .validateOnParse = False
        .async = False
        .Load zXmlSrcPath
      End With
    '
    'fail if the document didn't properly load and tell why
      If zMsXml.parseError.errorCode <> 0 Then
        Err.Raise Number:=zMsXml.parseError.errorCode, _
          Source:="XML Document Load Error", _
          Description:=zMsXml.parseError.reason
      End If
    '
    'set the xPath call
      zXpath = "//Root"
    '
    'Pull the nodes from the XMLDocument
      Set zNodeList = zMsXml.selectNodes(zXpath)
    '
    'some stuff for the text file here - FreeFile, Open, Print, Close
    '
    'Cannot pass the zNodeList directly - no children
    'just items so each item's child nodes have to be passed into the recursion
    '
      For zNodeLength = 0 To (zNodeList.length - 1)
        Call SpiderChildren(zNodeList.Item(zNodeLength).childNodes, 0, 0, zXmlParceFile)
        zFreeFile = FreeFile
        Open zXmlParceFile For Append As #zFreeFile
        Print #zFreeFile, vbCrLf & "<*> <*> <*> <*> <*> <*> <*> <*> <*> <*> <*> <*> <*> <*> <*> <*>" & vbCrLf
        Close #zFreeFile
      Next zNodeLength
    '
    'Cleanup and error trapping...
    '
    End Sub

  1. '>This is not air-code, this is the actual working code that produced the above text file.
  2. 'Here I'm going to just post the messy VBA
  3. 'It's a work in progress so the code isn't cleaned up in the slightest
  4. 'A rare insight as to how my mind processes information
  5. '
  6. Sub SpiderChildren(ByRef inNode As Object, ByVal zRecursion As Integer, ByVal zSpiderGroupLevel As Integer, zXmlParceFile As String)
  7.   Dim zSpiders As Object
  8.   Dim zSpider As Object
  9.   Dim zPrintSpider As Boolean
  10. '
  11. On Error GoTo zerrtrap
  12. '
  13.   zPrintSpider = True
  14. '
  15. '
  16.   Set zSpiders = inNode
  17.   For Each zSpider In zSpiders
  18.     'here's the recursion limit... currently set to stop at 100 deep
  19.     If zRecursion <= 100 Then
  20. ''
  21. ''<<I'll be moving this to a module level pointer
  22. ''<<so that I don't have to close and re-open
  23. ''<<the output text file with each recursion
  24. ''<<the file pointer was being lost
  25. ''<<mostly because this is *brain-storm* code
  26. ''
  27. ''output to a text file, find a free number, close any open files, open the text file for output
  28. Dim zFreeFile As Long
  29. Close
  30. zFreeFile = FreeFile
  31. Open zXmlParceFile For Append As zFreeFile
  32. '
  33.       If zSpider.haschildNodes Then
  34. Print #zFreeFile, String(zRecursion, ".") & " " & zSpider.tagName & "::"
  35.         'Debug.Print String(zRecursion, ".") & " " & zSpider.tagName & "::"
  36.         zRecursion = zRecursion + 1
  37.         'group level
  38.         If zSpider.tagName = "Group" Then
  39.           zSpiderGroupLevel = zRecursion
  40. Print #zFreeFile, ">Group level: " & zSpiderGroupLevel
  41.           'Debug.Print ">Group level: " & zSpiderGroupLevel
  42.         End If
  43.         ''pass the child nodes to the next recursion
  44.         Call SpiderChildren(zSpider.childNodes, zRecursion, zSpiderGroupLevel, zXmlParceFile)
  45.         '
  46.         'don't print the parent node's text
  47.         zPrintSpider = False
  48.       End If
  49.     Else
  50.       'let the values print as is until the recusion is under the limit... not sure what will happen here...
  51.       zPrintSpider = False
  52.     End If
  53.     If zPrintSpider Then
  54. Print #zFreeFile, String(zRecursion, ".") & " (" & " ParentNode: " & zSpider.ParentNode.basename & ") (Spider: " & zSpider.basename & ")(text: " & zSpider.Text & ")"
  55.       'Debug.Print String(zRecursion, ".") & " (" & " ParentNode: " & zSpider.ParentNode.basename & ") (Spider: " & zSpider.basename & ")(text: " & zSpider.Text & ")"
  56.     End If
  57. ''    Debug.Print String(zRecursion, ".") & " " & " ParentNode: " & zSpider.ParentNode.baseName & " -- " & zSpider.baseName & ": " & zSpider.Text
  58.     zPrintSpider = True
  59.   Next zSpider
  60. '
  61. zExitCleanUp:
  62. On Error Resume Next
  63.   Close zFreeFile
  64.   If zRecursion > 0 Then zRecursion = zRecursion - 1
  65.   If Not zSpider Is Nothing Then Set zSpider = Nothing
  66.   If Not zSpiders Is Nothing Then Set zSpiders = Nothing
  67. Exit Sub
  68. zerrtrap:
  69. Stop
  70.   MsgBox Prompt:=Err.Source & vbCrLf & Err.Number & vbCrLf & Err.Description, title:="Error Trap"
  71.   Resume zExitCleanUp
  72. End Sub
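
A side note on the depth counter in the routine above: zRecursion is incremented inside the For Each loop and is not wound back before the next sibling is visited, so every sibling that follows a node with children is reported one level deeper than the one before it. Below is a minimal sketch of one alternative, assuming the depth is handed to the recursive call as zDepth + 1 (ByVal) rather than changed in place. WalkDepth and DemoWalkDepth are hypothetical names, and the XML string is a made-up sample, not the real data.

```vba
'Hypothetical sketch: print each element indented by its depth in the tree.
'zDepth is passed ByVal as zDepth + 1, so all siblings share one depth value.
Sub WalkDepth(ByVal zNodes As Object, ByVal zDepth As Long)
    Dim zNode As Object
    For Each zNode In zNodes
        If zNode.nodeType = 1 Then              '1 = NODE_ELEMENT (skips text/comments)
            Debug.Print String(zDepth, ".") & " " & zNode.nodeName
            Call WalkDepth(zNode.childNodes, zDepth + 1)
        End If
    Next zNode
End Sub

Sub DemoWalkDepth()
    Dim zDoc As Object
    Set zDoc = CreateObject("MSXML2.DOMDocument.6.0")
    zDoc.async = False
    'Made-up sample: one group holding a UUID, a Name and a nested group
    If Not zDoc.loadXML("<Root><Group><UUID>1</UUID><Name>A</Name>" & _
                        "<Group><UUID>2</UUID></Group></Group></Root>") Then
        Debug.Print zDoc.parseError.reason
        Exit Sub
    End If
    Call WalkDepth(zDoc.documentElement.childNodes, 0)
End Sub
```

With this shape, the UUID, the Name and the nested Group of the outer group should all print at the same indent, because none of them alters the depth seen by its siblings.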
Jul 27 '18 #5
PhilOfWalton
1,430 Expert 1GB
Correct.

I have found by putting sufficient "Clues" in before the info I want, I get to the correct answer.

I probably should have pointed out that the "Clues" must follow each other as in the image, with no gaps or intervening statements.

I guess the code could be modified, so that having got the first result, continue reading the XML until the next set of "Clues" is solved.

Phil
Jul 27 '18 #6
Rabbit
12,516 Expert Mod 8TB
So if you're recursing and you're passing in the info of the level you're at, doesn't that give you what you need?
Jul 28 '18 #7
zmbd
5,501 Expert Mod 4TB
Rabbit,
That's the part I'm having issues with: the information isn't available until after the recursive call.
So UUID=123 and Name=ABC aren't available until the recursive call is made to pull the child nodes of the <Group> on line 1:

  1.   <Group>
  2.      <UUID>123</UUID>
  3.      <Name>ABC</Name>
  4.      <Comment />
  5.      <DefaultAuthUserStatusReq />
  6.      <Group>
  7.      <UUID>689</UUID>
  8.      <Name>FGH</Name>
  9.      <ItemDetail>
  10.         <UUID>ItemJKL</UUID>
  11.         <!-- Blah Blah Blah -->
  12.      </ItemDetail>
So when the <Group> on line 1 is parsed, look at line 33 of Sub SpiderChildren:
If zSpider.haschildNodes Then
That is the line that picks up the node.
It evaluates to True (UUID and Name are child nodes),
so the information held in the child nodes isn't available until the recursive call...

I think I need to study the parsed file a bit more. This is like relating Houston to Texas without knowing you're in Texas, then to the USA without knowing you're in Texas first, then to Earth and the Solar System without knowing you're in the Milky Way galaxy first.

I think I need to pull the XML down to something simple, maybe the <Root>/<Group>/<Name> and then a single child <Group>/<Name> instead of the larger data files...
Jul 28 '18 #8
zmbd
5,501 Expert Mod 4TB
"By Jove, Watson, I've got it! ... "

Or at least getting closer

    If zSpider.tagName = "Group" Then
      Dim GroupSpiderName As Object
      Set GroupSpiderName = zSpider.selectSingleNode("Name")
      'selectSingleNode returns Nothing when the group has no <Name> child
      If Not GroupSpiderName Is Nothing Then
        Debug.Print GroupSpiderName.Text
        Set GroupSpiderName = Nothing
      End If
    End If

In the Immediate pane I get just the <Group>\<Name> text on each recursion. So if I pass this through to the recursion call and append the name at each level, I can use UBound(Split()) + 1 to determine the level of the grouping...

Haven't implemented it yet. I didn't think selectSingleNode was specific to the current node/child node, but I was running out of ways of getting that information.

I should be able to reduce the number of recursions to just specific spider.tagname in each Group node...
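
On the point about selectSingleNode: the XPath expression is evaluated relative to the node the method is called on, so "Name" matches only a direct child <Name> of that node, whereas an expression starting with "//" searches the whole document. A small sketch under that reading, using a made-up two-level sample; DemoSelectSingleNode is a hypothetical name.

```vba
'Hypothetical demo: relative vs. document-wide XPath with selectSingleNode.
Sub DemoSelectSingleNode()
    Dim zDoc As Object, zOuter As Object, zInner As Object
    Set zDoc = CreateObject("MSXML2.DOMDocument.6.0")
    zDoc.async = False
    'Made-up sample: an outer group with one nested group, each with a Name
    zDoc.loadXML "<Root><Group><Name>Outer</Name>" & _
                 "<Group><Name>Inner</Name></Group></Group></Root>"

    'Relative paths start at the node the method is called on.
    Set zOuter = zDoc.selectSingleNode("/Root/Group")
    Set zInner = zOuter.selectSingleNode("Group")

    Debug.Print zOuter.selectSingleNode("Name").Text    'direct child of the outer group
    Debug.Print zInner.selectSingleNode("Name").Text    'direct child of the inner group
    Debug.Print zInner.selectSingleNode("//Name").Text  'first <Name> in the whole document
End Sub
```

The three lines should print Outer, Inner and Outer: the last call ignores where it was called from because "//" starts at the document root.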
Jul 28 '18 #9
Rabbit
12,516 Expert Mod 8TB
You should only need to recurse if there is another group node embedded inside a group node.
    For Each child node
       Select node name
          Case "UUID"
             iUUID = ...
          Case "Group"
             recurse
          Case "Name"
             strName = ...
       ......

Or something like that.
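
The outline above could be fleshed out roughly as follows. This is only a sketch, assuming each <Group> carries its own <UUID> and <Name> as direct children, as in the sample XML earlier in the thread. WalkGroup is a hypothetical name, and Debug.Print stands in for whatever writes to the tables.

```vba
'Hypothetical sketch: read one <Group>, recursing only for nested <Group> nodes.
Sub WalkGroup(ByVal zGroup As Object, ByVal zLevel As Long)
    Dim zChild As Object
    Dim zUuid As String, zName As String
    Dim zSubGroups As New Collection

    For Each zChild In zGroup.childNodes
        Select Case zChild.nodeName
            Case "UUID": zUuid = zChild.Text
            Case "Name": zName = zChild.Text
            Case "Group": zSubGroups.Add zChild   'defer until this group is fully read
        End Select
    Next zChild

    'At this point the group's own UUID and Name are known.
    Debug.Print String(zLevel, ".") & " " & zUuid & " / " & zName

    For Each zChild In zSubGroups
        Call WalkGroup(zChild, zLevel + 1)
    Next zChild
End Sub
```

It would be started once per top-level group, for example by looping over zDoc.selectNodes("/Root/Group") and calling WalkGroup with a level of 0. Collecting the nested groups first means the order of <UUID>, <Name> and <Group> inside a group does not matter.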
Jul 28 '18 #10
PhilOfWalton
1,430 Expert 1GB
If you would care to share, I would be very interested in seeing your final Db because, as I have mentioned, I can pick up multiple bits of information from the page source, but other things that appear under "Inspect Element" I can't find.

Phil
Jul 28 '18 #11
zmbd
5,501 Expert Mod 4TB
Phil - absolutely once I have things a bit more polished.
Jul 28 '18 #12
PhilOfWalton
1,430 Expert 1GB
Thanks, I really appreciate that

I have just written a db for pulling multiple values from a Web Page (or XML text file) but it is a modified bit of code from a much larger project, so has a lot of irrelevant code.

I will clean it up and send you a copy to put straight into your delete folder.

Phil
Jul 28 '18 #13
PhilOfWalton
1,430 Expert 1GB
@ zmbd

I don't know how well you got on with your trees & leaves, but I have just spent a few days in hospital (home now) and, to pass the time, I knocked up this. It is from various sources, so is messy in the extreme, but it extracts some data from websites and the data from your test file.

I have just chucked the results into the Debug window. In practice, I dare say they would be used to update a table.

Phil
Aug 2 '18 #14
zmbd
5,501 Expert Mod 4TB
I'll have to take a look, I've moved on to the Universe, Galaxies, StarSystems, Planets, and Countries... gave up on the trees :-)

I've a 3/4 parsed XML now; however, haven't had the time to sit back down and finish cleaning up the code. I'll post when done :)
Aug 2 '18 #15
PhilOfWalton
1,430 Expert 1GB
Which universe? I thought we were now in the Multi-Universe situation.

Come on zmbd widen your horizons a bit!!

Phil
Aug 2 '18 #16
zmbd
5,501 Expert Mod 4TB
I did...
[t_Universe]
[PK_U][UUuid][UName][UDateLastTouched]

[t_Galaxy]
[PK_G][FKUniverse][GUuid][GName][GDateLastTouched]

etc...

:)
String theory rules... String Theory for Kids (and Clever Adults)
Aug 2 '18 #17
zmbd
5,501 Expert Mod 4TB
Pulling stuff into tables now...
Lots of repeated code so I'm pulling these into their own procedures - in the old days I would have used a lot of GoSub routines for these sections.

As I pull the sections into their own procedures, I'm altering the code to use arrays to pass table field and XML node names between the calling recursive routine and its dependencies. This way I don't have to hardcode for each table.

I think I also have the Group level issue fixed... next stage once I have the universe test version running.
Aug 4 '18 #18
Hagran
2
Can you please show us the latest version of your XML?
I had a pretty similar problem myself; it is a real pain to manage all this data.
Aug 7 '18 #19
zmbd
5,501 Expert Mod 4TB
So it is WAY too late
and
I am WAY too tired to explain this file in its entirety.

I've made two XML files they should be fairly easy to read

The code is fairly well commented.

This is only the POC for my final XML data file...
My issue with my final file is that the same tag-name "<Group>" is used like a psychotic nested-if-then so I still have some work to do on passing the recursion levels up and down the tree.

but that is for another day.
Attached Files
File Type: zip ParseXmlFile.zip (62.8 KB, 97 views)
Aug 8 '18 #20
PhilOfWalton
1,430 Expert 1GB
Brilliant piece of work, zmbd.

I am a timorous wee beastie, and wonder if we can combine your method and my method, where I use the table Codes to load the Clues into the zSpiderTableFields array.

Then you could go on to study taxonomy of anything

Phil
Aug 9 '18 #21
zmbd
5,501 Expert Mod 4TB
Thank You PhilOfWalton.

I'll have to look at your database again; however, should be able to open a recordset against the "Codes" table and maybe feed the "Clues" directly thus, bypassing the array...

The next challenge is parsing the Big-XML
Where some psychotic nested the <Group> and other tags like a nested If-Then from hell (literally hundreds of these nestings), reusing the same tag names over and over again with each child node!
  1. <Root>
  2.     <Group>
  3.         <UUID></UUID>
  4.         <Times>
  5.              <LastUpDate></LastUpDate>
  6.         </Times>
  7.         <Item></Item>
  8.         -some entries-
  9.         <Group>
  10.             <UUID></UUID>
  11.             <Times>
  12.                 <LastUpDate></LastUpDate>
  13.             </Times>
  14.             <Item>
  15.                 -some entries-
  16.             </Item>
  17.             <Group>
  18.                 <UUID></UUID>
  19.                 <Times>
  20.                      <LastUpDate></LastUpDate>
  21.                 </Times>
  22.                 <Item>
  23.                      -some entries-
  24.                 </Item> 
  25.             </Group>
  26.             <Group>
  27.                  <UUID></UUID>
  28.                  <Times>
  29.                       <LastUpDate></LastUpDate>
  30.                   </Times>
  31.                  <Item>
  32.                       -some entries-
  33.                  </Item>
  34.             </Group>
  35.         </Group>
  36.     </Group>
  37. </Root>
Aug 9 '18 #22
Rabbit
12,516 Expert Mod 8TB
When you hit a nested group node, can't you recurse and pass in the parent's id?

Edit: Also, my preference would be to put the XML node -> table/field mappings into a metadata table so you can update the mappings more easily than if you had to do it through code.
Aug 9 '18 #23
zmbd
5,501 Expert Mod 4TB
Rabbit
Passing the parent was my initial thought too!
If the parent is a Group and the current tag is a Group, then we have a subgroup, so use that table!
Here's the rub: take line 8 above.
My thought was that the parent of <UUID> should be <Group>.
It doesn't parse that way; instead it is returning the node name.

AND it gets very funny: say <UUID> did return <Group>. Which <Group>? Is it the great-grandparent, grandparent, or parent? If it returns <Root>, then at least you know you're at the top level.

    Print #zFreeFile, String(zRecursion, ".") & _
        " (" & " ParentNode: " & zSpider.ParentNode.basename & _
           ") (Spider: " & zSpider.basename & _
               ")(text: " & zSpider.Text & ")"

In the text file:

    >Group level: 1
    <<< There's another Print op when the select-case is Groups that places this flush left for visual location<<
    Group::   <<Calling level
    . UUID::
    .. ( ParentNode: UUID) (Spider: )(text: Goup_100)
    .. Name::
    ... ( ParentNode: Name) (Spider: )(text: 100_Group)
    ... ( ParentNode: Group) (Spider: Notes)(text: )
    ... IconID::
    .... ( ParentNode: IconID) (Spider: )(text: 49)
    .... Times::
    ..... LastModificationTime::
    ...... ( ParentNode: LastModificationTime) (Spider: )(text: 2009-04-24T19:18:46Z)
    ...... CreationTime::
    ....... ( ParentNode: CreationTime) (Spider: )(text: 2007-04-18T09:28:58Z)
    ....... LastAccessTime::
    ........ ( ParentNode: LastAccessTime) (Spider: )(text: 2009-07-31T18:10:39Z)
    ........ ExpiryTime::
    ......... ( ParentNode: ExpiryTime) (Spider: )(text: 2007-04-18T09:28:58Z)
    ......... Expires::
    .......... ( ParentNode: Expires) (Spider: )(text: False)
    .......... UsageCount::
    ........... ( ParentNode: UsageCount) (Spider: )(text: 153)

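
One reading of the output above: the lines with an empty "Spider:" value come from text nodes, which have no baseName, and their ParentNode is the element that holds the text (UUID, Name, ...) rather than the enclosing <Group>. Below is a sketch, under that assumption, of reporting each element against its parent element by skipping the text nodes. ShowElementParents and DemoElementParents are hypothetical names, and the XML string is a made-up sample.

```vba
'Hypothetical sketch: list each element with the name of its parent element.
Sub ShowElementParents(ByVal zNodes As Object)
    Dim zNode As Object
    For Each zNode In zNodes
        If zNode.nodeType = 1 Then   '1 = NODE_ELEMENT; text nodes are type 3
            Debug.Print zNode.nodeName & " (parent: " & zNode.parentNode.nodeName & ")"
            Call ShowElementParents(zNode.childNodes)
        End If
    Next zNode
End Sub

Sub DemoElementParents()
    Dim zDoc As Object
    Set zDoc = CreateObject("MSXML2.DOMDocument.6.0")
    zDoc.async = False
    'Made-up sample with one group holding a UUID and a Name
    zDoc.loadXML "<Root><Group><UUID>Group_100</UUID><Name>100_Group</Name></Group></Root>"
    Call ShowElementParents(zDoc.documentElement.childNodes)
End Sub
```

Here <UUID> and <Name> should both report Group as their parent, and the text itself can still be read from the element through its Text property.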
What I have now is this:

    SpiderChildren(ByRef inChildNodeList As Object, _
        ByVal RecursionLevel As Long)
    Select Case spider.tag
     Case <tag=Group>
      'blah
       Select Case Recursion
       Case Is < 2
         'blah
         recursion = recursion + 1
         call SpiderChildren(SpiderChildren,Recursion)
         recursion = recursion + 1
       Case Is >= 2
         'blah
         call SpiderChildren(SpiderChildren,Recursion)
      End Select
     Case
    (...)
    End Select

I've flattened the Category/Sub-category structure they are using, as the sub/sub(/sub...)-categories are more of a description, so these are being moved into a related comments table.
This appears to be working: it is parsing the main and sub categories into the proper tables, and the code is checking an M:M join table. (I did have the 1:Category > M:SubCategory relationship set up and then ran into a mess where the same subcategory was used with multiple categories. Sigh. I tried to cheat on the normalization - bytes.me.ITRET !)
Truck - Green
Truck - Blue
Truck - Yellow
Car - Green
Car - Blue
Car - Yellow



As for the metadata table, I gave up on it.
For the example Database I posted the metadata table would be the way to go for ease of maintenance. I can see adding a record for say [>planet>Moons] and parsing to [table_moons].

What I have is very much as shown in the example XML in #22 which, while neutered, is essentially the format the file is in... How do you map the nested <Group> when there are no attributes such as <Group ID=000>?
So take line 2, Line 9, line 17 so we have Parent, child, grandchild, great-grandchild
Add to root another child - how do you even begin to map this?
Even with a pedigree table... I simply gave up.
    <Root>
    +
    -----<Group>
    +      +
    +      ---<Group>
    +           +
    +           ---<Group>
    +----<Group>

Missing, of course, are the <Item> entries under each group

I am very likely missing something about how to create the metadata table. If I could have figured out an XSLT for the import, it would have been so much nicer.
Aug 9 '18 #24
Rabbit
12,516 Expert Mod 8TB
Sorry, when I said pass the parent, I wasn't referring to the parent node itself. I just meant passing something with which you can identify the parent. In this case, UUID.

From the way the XML is structured, it looks like some sort of hierarchical data. A company reporting structure or chain of command. Hence, why you need to know the parent group of the group you're processing.

By the time your code reaches the embedded group, it will have parsed the UUID of the parent. So you can pass that into the recursion.

    Call SpiderChildren(SpiderChildren, Recursion, parentUUID)
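
A sketch of what passing the parent's UUID might look like, assuming every <Group> has a direct <UUID> child and that the rows are bound for a self-referencing table. WalkWithParent is a hypothetical name, and Debug.Print stands in for the actual insert.

```vba
'Hypothetical sketch: each group is reported together with its parent's UUID.
Sub WalkWithParent(ByVal zGroup As Object, ByVal zParentUuid As String)
    Dim zUuidNode As Object, zSubGroup As Object
    Dim zUuid As String

    'Read this group's own UUID before touching any nested group.
    Set zUuidNode = zGroup.selectSingleNode("UUID")
    If Not zUuidNode Is Nothing Then zUuid = zUuidNode.Text

    'Top-level groups arrive with an empty parent UUID.
    Debug.Print "Group " & zUuid & " -> parent [" & zParentUuid & "]"

    'Direct child groups only; each one recurses with this group's UUID.
    For Each zSubGroup In zGroup.selectNodes("Group")
        Call WalkWithParent(zSubGroup, zUuid)
    Next zSubGroup
End Sub
```

It would be called once for each node returned by zDoc.selectNodes("/Root/Group"), with an empty string as the parent. Because the parent's UUID travels down as an argument, it does not matter that every level reuses the same <Group> tag name.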
Aug 10 '18 #25
zmbd
5,501 Expert Mod 4TB
Finally,
I have data pulling in to the database and it looks correct!
I've written some code to instance the Excel file and start parsing through it to make sure the records match; the first 50 or so agreed.

as for the interface - my take on a "split-form" (I had to redact a few things; however, I think you will get the general idea)

Attached Images
File Type: jpg Capture.jpg (84.9 KB, 1984 views)
Aug 15 '18 #26
strive4peace
39 Expert 32bit
nice example, @zmbd. I tried using MSXML2.DOMDocument.6.0 to read the pages of my website but the HTML is too messy I guess. Things like <br> instead of <br /> apparently hang it up. My purpose is to get the pages of my site into Access so Access will eventually manage and generate the content. I ended up writing my own parser, and am still working on it. I don't know if how you did things will be helpful for my logic, but I plan to look again. Thanks!
Sep 14 '20 #27
SwissProgrammer
220 128KB
An expert asking questions and being guided to the answer by a team of other experts.

I humbly request that the final, working project, minus any sensitive data, be posted for future use.

Some day I might like to be able to do this and I do not want to have to struggle for years just to get to the level of the least of you.

Please.

Thanks.
Sep 24 '20 #28
twinnyfo
3,653 Expert Mod 2GB
@SwissProgrammer,

Believe it or not, I've probably learned much of what I know about MS Access from Bytes. I learned out of necessity because of the projects I've had to work with.

Just stick with it and you can do just about anything!
Sep 24 '20 #29
SwissProgrammer
220 128KB
twinnyfo,

How did you make that text red?
Sep 25 '20 #30
twinnyfo
3,653 Expert Mod 2GB
I used the [HIGHLIGHT] tag.

Kinda neat, isn't it?
Sep 26 '20 #31
SwissProgrammer
220 128KB
twinnyfo,

Thank you.

On page: https://bytes.com/misc.php?do=bbcode

this text is highlighted
Sep 26 '20 #32
