Does anybody know of a command-line tool (Linux or Windows) that can compare two XML files? There is one extra requirement that makes it a bit more
complicated: the order of attributes or elements within an element is of no importance, so the tool should still report that two XML files are equal even if the order differs.
Corno
Is the generic problem of comparing two unordered labeled trees
NP-complete? I think it has been proven to be NP-complete.
It would seem to me that it's not; you could first write them in (a)
canonical form and then compare them.
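A rough sketch of what such a canonical form could look like, in Python with the standard xml.etree.ElementTree module. The names canonical_key and unordered_equal are made up for illustration, and the sketch assumes that surrounding whitespace in text is insignificant and that comments and processing instructions can be ignored:

```python
import xml.etree.ElementTree as ET


def canonical_key(elem):
    """Return a nested tuple that ignores attribute order and child order."""
    # Attributes become a sorted tuple of (name, value) pairs.
    attrs = tuple(sorted(elem.attrib.items()))
    # Assumption: leading/trailing whitespace in text is not significant.
    text = (elem.text or "").strip()
    tail = (elem.tail or "").strip()
    # Canonicalize each child recursively, then sort the resulting keys so
    # that sibling order no longer matters.
    children = tuple(sorted(canonical_key(child) for child in elem))
    return (elem.tag, attrs, text, tail, children)


def unordered_equal(xml_a, xml_b):
    """Compare two XML strings, ignoring attribute and element order."""
    return canonical_key(ET.fromstring(xml_a)) == canonical_key(ET.fromstring(xml_b))
```

Two documents then compare equal exactly when their keys are equal, for example unordered_equal('<a x="1" y="2"/>', '<a y="2" x="1"/>'). The work is mostly the sorting done at each level of the tree.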
Corno
You said the order of ELEMENTS doesn't matter either.
Example:
<a>
  <c/>
  <b>
    <c/>
  </b>
  <b>
    <d/>
  </b>
</a>
and
<a>
  <b>
    <d/>
  </b>
  <c/>
  <b>
    <c/>
  </b>
</a>
are equal (per your first e-mail).
So simple canonicalization is not going to help you, as it doesn't
change the order of elements.
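To illustrate, here is a small check using Python's xml.etree.ElementTree.canonicalize (C14N 2.0, available from Python 3.8), taken here as a stand-in for "simple canonicalization". The variable names are only for illustration:

```python
from xml.etree.ElementTree import canonicalize  # Python 3.8+

# Same element, attributes written in a different order.
attrs_a = canonicalize('<a y="2" x="1"/>')
attrs_b = canonicalize('<a x="1" y="2"/>')

# Same children, written in a different order.
elems_a = canonicalize('<a><b/><c/></a>')
elems_b = canonicalize('<a><c/><b/></a>')

# Canonical XML sorts attributes, so these two outputs are identical.
attributes_match = attrs_a == attrs_b

# Canonical XML keeps document order for elements, so these still differ.
elements_match = elems_a == elems_b
```

Here attributes_match comes out True and elements_match comes out False, so the attribute half of the requirement is covered but the element half is not.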
On the other hand, to compare two elements that have the same name and
the same set of attributes but also have children, you run into the same
tree comparison problem (you have to compare the two subtrees of these
elements).
Anyway, I should have been clearer in my first post.
It has been proven that computing the edit distance for unordered
labeled trees is NP-complete.
Zhang, R. Statman, D. Shasha, "On the editing distance between
unordered labeled trees",
It seems you are interested only in matching two unordered labeled
trees (which amounts to checking that the edit distance is zero).
This task could be easier, but I'm not sure about that.