XML to database?

I have a requirement to read element values (which can number in the thousands) from an
XML file, and eventually write them all to a SQL Server database.
I was thinking we could first write all the values from the XML to some
data structure (an ArrayList, etc.) and then write from that data structure to the
DB with a single database call.

Which would be the best data structure to achieve this? Any
pointers/suggestions are appreciated.
Nov 20 '07 #1
One easy way to approach this is to use the DataSet class's ReadXml method to
load your XML into a DataSet, find the respective tables that contain what
you need, and pass them into a SqlDataAdapter's Update method. Done!
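
Roughly like this (just a sketch - the "Items" table name, file path and
connection string are placeholders, and the table name ReadXml infers from
the XML has to line up with the real destination table):

// Load the XML into a DataSet, then push the rows to SQL Server via a
// SqlDataAdapter. File path, table name and connection string are placeholders.
using System.Data;
using System.Data.SqlClient;

string connectionString = "Server=.;Database=MyDb;Integrated Security=SSPI";

DataSet ds = new DataSet();
ds.ReadXml(@"C:\data\input.xml");   // rows come in with RowState = Added

using (SqlConnection conn = new SqlConnection(connectionString))
{
    // The SELECT defines the target schema; the command builder generates
    // the INSERT that Update() runs for each Added row.
    SqlDataAdapter da = new SqlDataAdapter("SELECT * FROM Items", conn);
    SqlCommandBuilder cb = new SqlCommandBuilder(da);

    da.Update(ds, "Items");         // one INSERT per row under the covers
}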

--Peter
"Inside every large program, there is a small program trying to get out."
http://www.eggheadcafe.com
http://petesbloggerama.blogspot.com
http://www.blogmetafinder.com

"Gary" wrote:
I have a requirement to read element values (can be in thousands) from an
xml file, and eventually write them all the SQL Server database.
I was thinking we could first write all the values from xml to some
datastructure (arraylist etc.) and then write from that datastructure to the
DB with a single database call.

Which would be the best possible datastructure to achieve this? Any
pointers/suggestions are appreciated.
Nov 20 '07 #2

See an example here:
http://groups.google.com/group/micro...d13f38f82bec77

But basically, creating a strongly typed DataSet... and then passing
ds.GetXml() in to the stored procedure will do the trick.

I do this a lot... as it gives me "set based" abilities, with well-defined
XML via a (strongly typed) DataSet.
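
Something along these lines (a sketch only - "uspImportItems" and the
element/column names are made up; the real ones come from your typed
DataSet's schema):

// Send the whole DataSet as one XML string to a stored procedure, which
// shreds it with OPENXML in a single set-based INSERT.
// Proc/table/column names and the connection string are placeholders.
using System.Data;
using System.Data.SqlClient;

string connectionString = "Server=.;Database=MyDb;Integrated Security=SSPI";
string xml = ds.GetXml();   // ds = the (strongly typed) DataSet already loaded from the file

using (SqlConnection conn = new SqlConnection(connectionString))
using (SqlCommand cmd = new SqlCommand("uspImportItems", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add("@xml", SqlDbType.NText).Value = xml;
    conn.Open();
    cmd.ExecuteNonQuery();
}

/* And the usp itself would look something like this (T-SQL sketch):

CREATE PROCEDURE dbo.uspImportItems @xml ntext
AS
BEGIN
    DECLARE @doc int;
    EXEC sp_xml_preparedocument @doc OUTPUT, @xml;

    INSERT INTO dbo.Items (Name, Value)
    SELECT Name, Value
    FROM OPENXML(@doc, '/NewDataSet/Item', 2)        -- element-centric mapping
         WITH (Name nvarchar(100), Value nvarchar(200));

    EXEC sp_xml_removedocument @doc;
END
*/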

"Gary" <ba****@gmail.comwrote in message
news:un**************@TK2MSFTNGP05.phx.gbl...
>I have a requirement to read element values (can be in thousands) from an
xml file, and eventually write them all the SQL Server database.
I was thinking we could first write all the values from xml to some
datastructure (arraylist etc.) and then write from that datastructure to
the DB with a single database call.

Which would be the best possible datastructure to achieve this? Any
pointers/suggestions are appreciated.


Nov 20 '07 #3
I think this is a good approach (I have used something similar in a bulk-data,
semi-structured scenario) - but just to highlight that SQL Server 2005
(and above) has the "xml" data-type, which allows you to handle xml
(shred, query, build, update, index, etc) /much/ more efficiently than
OPENXML can. Of course, if it is just storage/retrieval then go
"varbinary(max)" and have done with it (varbinary avoids codepage
issues associated with [n]varchar).

As an example, you can pass a single string over to an SP, cast it to
xml in the database and tear it apart with XPath; minimises round
trips.
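
For example, something like this (a sketch - the table/element names are
invented, and the document is passed here as a typed xml parameter rather
than cast from a string inside the proc; same net effect, one round trip):

// One round trip: the whole XML document goes over as a single xml-typed
// parameter and is shredded server-side with nodes()/value().
// Table, element and column names plus the connection string are placeholders.
using System.Data;
using System.Data.SqlClient;

string connectionString = "Server=.;Database=MyDb;Integrated Security=SSPI";
string xmlString = System.IO.File.ReadAllText(@"C:\data\input.xml");

const string sql = @"
    INSERT INTO dbo.Items (Name, Value)
    SELECT  x.r.value('(Name)[1]',  'nvarchar(100)'),
            x.r.value('(Value)[1]', 'nvarchar(200)')
    FROM    @xml.nodes('/NewDataSet/Item') AS x(r);";

using (SqlConnection conn = new SqlConnection(connectionString))
using (SqlCommand cmd = new SqlCommand(sql, conn))
{
    cmd.Parameters.Add("@xml", SqlDbType.Xml).Value = xmlString;  // the raw XML text
    conn.Open();
    cmd.ExecuteNonQuery();
}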

Marc
Nov 20 '07 #4

That will definitely work (Peter's idea).

The caveat here is that... it will end up being a "row by row" update....

OPENXML allows a one-db-hit, bulk (set based) approach (see my other
post).

The development goals need to be understood.

Does the OP want the fastest way to implement, or the best performing? (I
don't know the answer.)

...

One of the key things about the set-based approach over the row-by-row one is
that index rebuilding may be expensive.
The set-based approach lets the indexes update after one big set-based
insert or update, rather than after every row-by-row update.
The OP does mention "in the thousands", so a set-based/bulk insert would
be the way I would go here.
You can throttle the number of records that go into the dataset as well - say,
every 1000 records in the dataset get sent to the db.

I've handled 4 MB xml files with no issues using the OPENXML method; I have
not tested the upper limits, so I don't know beyond that.

But most of my stuff is under 1/3 MB... which works well with OPENXML,
IMHO.
You can also throw some BEGIN TRAN type stuff into the usp (stored
procedure) so you get a very nice "all or nothing" ability.
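
A rough sketch of the throttling idea (plain DataSet here for brevity - in
practice a strongly typed one - and all names/paths are placeholders):

// Throttle the batch: flush the DataSet to the usp every 1000 rows rather
// than buffering the whole file. Element/column names and paths are placeholders.
using System.Data;
using System.Xml.Linq;

const int batchSize = 1000;

DataSet batch = new DataSet("NewDataSet");
DataTable items = batch.Tables.Add("Item");
items.Columns.Add("Name", typeof(string));
items.Columns.Add("Value", typeof(string));

foreach (XElement e in XDocument.Load(@"C:\data\input.xml").Root.Elements("Item"))
{
    items.Rows.Add((string)e.Element("Name"), (string)e.Element("Value"));

    if (items.Rows.Count >= batchSize)
    {
        SendBatch(batch.GetXml());   // e.g. the OPENXML usp call shown earlier
        items.Clear();               // start the next batch
    }
}

if (items.Rows.Count > 0)
    SendBatch(batch.GetXml());       // flush the remainder

static void SendBatch(string xml) { /* wrap the SqlCommand call to the usp */ }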


"Peter Bromberg [C# MVP]" <pb*******@yahoo.NoSpamMaam.comwrote in message
news:AB**********************************@microsof t.com...
One easy way to approach this is to use the DataSet class' ReadXml method
to
load your Xml into a DataSet, find the respective tables that contain what
you need, and pass it into a SqlDataAdapter's Update method. Done!

--Peter
"Inside every large program, there is a small program trying to get out."
http://www.eggheadcafe.com
http://petesbloggerama.blogspot.com
http://www.blogmetafinder.com

"Gary" wrote:
>I have a requirement to read element values (can be in thousands) from an
xml file, and eventually write them all the SQL Server database.
I was thinking we could first write all the values from xml to some
datastructure (arraylist etc.) and then write from that datastructure to
the
DB with a single database call.

Which would be the best possible datastructure to achieve this? Any
pointers/suggestions are appreciated.

Nov 20 '07 #5
Yeah, my suggestion (in another post) will work with either 2000 or 2005.

If the OP gives the version of SQL Server, that would present a few more
options (to which you are alluding).


"Marc Gravell" <ma**********@gmail.comwrote in message
news:35**********************************@41g2000h sh.googlegroups.com...
>I think this is a good approach (I have used similar in a bulk-data,
semi-structure scenario) - but just to highlight that SQL Server 2005
(and above) has the "xml" data-type, which allows you to handle xml
(shred, query, build, update, index, etc) /much/ more efficiently than
OPENXML can. Of course, if it is just storage/retrieval then go
"varbinary(max)" and have done with it (varbinary avoids codepage
issues associated with [n]varchar).

As an example, you can pass a single string over to an SP, cast it to
xml in the database and tear it apart with XPath; minimises round
trips.

Marc

Nov 20 '07 #6
>I've handled 4 MB xml files with no issues using the OPENXML method, I have
>not tested the upper limits, so don't know beyond that.
Just for info; for big imports, I have successfully used a hybrid
XmlReader / IDataReader to present the first-level elements (which are
frequently the most common) as successive rows in a data-reader - i.e.

<xml><row>...</row><row>...</row><row>...</row></xml>

would get parsed as 3* rows in an IDataReader, but it works while /
streaming/ rather than having to load it all into memory first. In
particular you can pass your spoof IDataReader directly to an
SqlBulkCopy to pump it (again, streaming) into a staging table (with a
key to identify the batch), before finally invoking an SP to commit
the whole lot into the real tables. This allows you to limit any
transaction to just the final merge, avoiding long locks during the
import process (especially important as I was getting chunks from a
remote client). You could also stream into a single record using
streamed BLOB/CLOB methods, but the xml-reader approach allows you
more flexibility to apply a schema and manipulate the data at the same
time.
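
For illustration, the SqlBulkCopy half looks roughly like this ("XmlRowReader"
just stands in for the hybrid XmlReader/IDataReader, which isn't shown; the
staging table, proc and connection string are invented):

// Stream an IDataReader into a staging table with SqlBulkCopy, then commit
// the batch to the real tables with one proc call at the end.
using System;
using System.Data;
using System.Data.SqlClient;

string connectionString = "Server=.;Database=MyDb;Integrated Security=SSPI";
Guid batchKey = Guid.NewGuid();      // identifies this batch in the staging table

using (SqlConnection conn = new SqlConnection(connectionString))
{
    conn.Open();

    using (IDataReader reader = new XmlRowReader(@"C:\data\input.xml", batchKey))
    using (SqlBulkCopy bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "dbo.StagingItems";
        bulk.BatchSize = 1000;
        bulk.WriteToServer(reader);  // streams rows; the file is never fully in memory
    }

    // Single set-based call to merge the staged batch into the real tables,
    // keeping the transaction limited to this final step.
    using (SqlCommand cmd = new SqlCommand("dbo.uspCommitStagedItems", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@batchKey", SqlDbType.UniqueIdentifier).Value = batchKey;
        cmd.ExecuteNonQuery();
    }
}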

Just sharing for info... I have posted the hybrid reader previously
(this forum) if anybody wants it.

(*=obviously the real number was significantly larger!)

Marc

Nov 20 '07 #7

Yeah, I kinda do a similar thing.

I use this approach when I have business rules for the imported data.

I use an IDataReader on the source data.
If a row passes the business-rule test, then I add it to a (strongly typed) DataSet.

Every X number of rows (that pass the test), I send the ds.GetXml() to the
usp with the OPENXML method.

I then get a new (strongly typed) ds, and start over until the IDataReader is done
reading.

I usually set my number to 1000, but I have this configurable.

The key is the business rules. This allows me to write .NET code to
verify a row before shipping it to the (strongly typed) ds, which of course
cuts it off from getting to the db altogether.

So it's kind of the same concept, but not exactly. But both are memory-footprint
aware.

Sometimes I log the bad rows... sometimes I don't care about them.
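
Roughly (a sketch; the column names, IsValid and SendBatch are stand-ins for
the real business rules and the usp call):

// Business-rule filter in front of the batch: rows from the source
// IDataReader are validated in .NET before they ever reach the DataSet
// (and therefore the db).
using System.Data;

static class RuleFilteredImport
{
    public static void Run(IDataReader source, DataSet batch, int batchSize)
    {
        DataTable rows = batch.Tables[0];    // the single table in the (typed) DataSet

        while (source.Read())
        {
            string name  = source["Name"]  as string;
            string value = source["Value"] as string;

            if (!IsValid(name, value))
                continue;                    // failed rows never reach the db (log if needed)

            rows.Rows.Add(name, value);

            if (rows.Rows.Count >= batchSize)
            {
                SendBatch(batch.GetXml());   // the OPENXML usp call, as above
                rows.Clear();
            }
        }

        if (rows.Rows.Count > 0)
            SendBatch(batch.GetXml());       // flush whatever is left
    }

    static bool IsValid(string name, string value)
        => !string.IsNullOrEmpty(name);      // stand-in for the real business rules

    static void SendBatch(string xml) { /* wrap the stored-procedure call */ }
}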

"Marc Gravell" <ma**********@gmail.comwrote in message
news:24**********************************@p69g2000 hsa.googlegroups.com...
>I've handled 4 MB xml files with no issues using the OPENXML method, I
have
no tested the upper limits, so don't know beyond that.

Just for info; for big imports, I have successfully used a hybrid
XmlReader / IDataReader to present the first-level elements (which are
frequently the most common) as successive rows in a data-reader - i.e.

<xml><row>...</row><row>...</row><row>...</row></xml>

would get parsed as 3* rows in an IDataReader, but it works while /
streaming/ rather than having to load it all into memory first. In
particular you can pass your spoof IDataReader directly to an
SqlBulkCopy to pump it (again, streaming) into a staging table (with a
key to identify the batch), before finally invoking an SP to commit
the whole lot into the real tables. This allows you to limit any
transaction to just the final merge, avoiding long locks during the
import process (especially important as I was getting chunks from a
remote client). You could also stream into a single record using
streamed BLOB/CLOB methods, but the xml-reader approach allows you
more flexibility to apply a schema and manipulate the data at the same
time.

Just sharing for info... I have posted the hybrid reader previously
(this forum) if anybody wants it.

(*=obviously the real number was significantly larger!)

Marc

Nov 20 '07 #8

This thread has been closed and replies have been disabled. Please start a new discussion.
