Hi,
I've got a simple console app that reads an XML file into a DataSet and
then prints out a description of each table in the DataSet, including
the column names and the row values for each column. I'm getting some
strange results depending on the input XML file I use. I was wondering
if somebody could help me understand what is going on or point me to a
good reference.
The code for my program looks like this:
using System;
using System.Data;
class Program
{
    static void Main(string[] args)
    {
        // Let ReadXml infer the schema from the XML file itself.
        DataSet ds = new DataSet();
        ds.ReadXml("test.xml");
        PrintDataSet(ds);
    }
    // Prints every table, then each column followed by that column's row values.
    public static void PrintDataSet(DataSet ds)
    {
        Console.WriteLine("DataSet name: " + ds.DataSetName);
        foreach (DataTable dt in ds.Tables)
        {
            int rowCount = dt.Rows.Count;
            Console.WriteLine("\nTable: " + dt.TableName + " (" + rowCount + " rows)");
            foreach (DataColumn dc in dt.Columns)
            {
                Console.WriteLine("Column: " + dc.ColumnName);
                for (int i = 0; i < rowCount; i++)
                {
                    Console.WriteLine("Row " + i + ": " + dt.Rows[i][dc].ToString());
                }
            }
        }
    }
}
When "test.xml" looks like this:
<?xml version="1.0" standalone="yes"?>
<test>
  <product>Product 1</product>
  <customer>
    <name>Bill</name>
    <company>Bill's Co.</company>
  </customer>
  <customer>
    <name>Sue</name>
    <company>Sue's Co.</company>
  </customer>
</test>
The output looks like this:
DataSet name: NewDataSet
Table: test (1 rows)
Column: product
Row 0: Product 1
Column: test_Id
Row 0: 0
Table: customer (2 rows)
Column: name
Row 0: Bill
Row 1: Sue
Column: company
Row 0: Bill's Co.
Row 1: Sue's Co.
Column: test_Id
Row 0: 0
Row 1: 0
However, when "test.xml" looks like this (NOTE: the only difference
is in the <product> element):
<?xml version="1.0" standalone="yes"?>
<test>
  <product>
    <name>Product 1</name>
  </product>
  <customer>
    <name>Bill</name>
    <company>Bill's Co.</company>
  </customer>
  <customer>
    <name>Sue</name>
    <company>Sue's Co.</company>
  </customer>
</test>
The output looks like this:
DataSet name: test
Table: product (1 rows)
Column: name
Row 0: Product 1
Table: customer (2 rows)
Column: name
Row 0: Bill
Row 1: Sue
Column: company
Row 0: Bill's Co.
Row 1: Sue's Co.
Questions:
1) I think I see why the test_Id column gets created "on the fly" for
the "customer" table in the first XML example. A "test" table got
created which might have multiple rows. So each row of the "customer"
table has to have some way of relating back to its parent table row,
which is what "test_Id" contains. But why does the "test" table
itself need a "test_Id" column?
2) In the first example a "test" table got created and the name of the
DataSet stayed at the default "NewDataSet". In the second example there
was no "test" table created. Instead the name of the DataSet is "test"
and there is a "product" table and "customer" table. Since there is no
"test" table there is no need for the "test_Id" column in any of the
tables that do exist. Why does it work this way? Why would changing just
that one element (i.e. <product>) make such a big difference in the way
the DataSet is constructed?
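In case it helps with diagnosing either question, here is a minimal sketch
of a helper that could be called right after PrintDataSet(ds) to show what
ReadXml actually inferred. The class name DataSetInspector and the method
name PrintInferredStructure are made up for illustration and are not part
of the program above. The sketch only assumes the standard System.Data
members DataSet.Relations and DataSet.WriteXmlSchema.

```csharp
using System;
using System.Data;

// Hypothetical helper (not part of the original program).
public static class DataSetInspector
{
    public static void PrintInferredStructure(DataSet ds)
    {
        // List each parent/child relation and the columns that link the tables.
        foreach (DataRelation rel in ds.Relations)
        {
            Console.WriteLine("Relation: " + rel.RelationName + " (nested: " + rel.Nested + ")");
            foreach (DataColumn pc in rel.ParentColumns)
            {
                Console.WriteLine("  Parent column: " + rel.ParentTable.TableName + "." + pc.ColumnName);
            }
            foreach (DataColumn cc in rel.ChildColumns)
            {
                Console.WriteLine("  Child column:  " + rel.ChildTable.TableName + "." + cc.ColumnName);
            }
        }

        // Dump the XSD schema that was inferred from the XML file.
        Console.WriteLine();
        ds.WriteXmlSchema(Console.Out);
    }
}
```

Running this against both versions of "test.xml" and comparing the two
schema dumps should make it easier to see where the "test" table and the
"test_Id" columns come from.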
Any help appreciated. And again if you know of some good reference books
or links where these issues are discussed that would be fantastic.
Thanks in advance.
Bill