Albert Leibbrandt wrote:
Hi
Just want to check which xml parser you guys have found to be the
quickest. I have xml documents with 250 000 records or more and the
processing of these documents is taking way too long. The validation is
the main problem. Any module names, non-validating would be fine too,
would help a lot.
It would help us help you if you posted samples of the target docs.
XML processing strategy often depends on the structure of the XML, just
as relational query optimization strategy often depends on the schema.
In general SAX or iterative tree-callback methods will give you the
best speed. Fredrik already mentioned ElementTree's iterparse.
Amara's pushbind and pushdom and 4Suite's Saxlette (which has some neat
callback features) are other options.
http://uche.ogbuji.net/tech/4suite/amara/
http://4suite.org/docs/CoreManual.xml#saxlette
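
To give a rough idea of the iterative approach, here is a minimal sketch using
ElementTree's iterparse from the Python standard library. Since I have not seen
your documents, the element name "record", the sample data and the helper name
count_records are all made up for illustration, and I am assuming the records
are direct children of the root element. It does no validation at all.

```python
import io
import xml.etree.ElementTree as ET


def count_records(source, tag="record"):
    """Stream through an XML document and count elements named `tag`.

    `tag` is a hypothetical element name; substitute whatever your
    documents actually use. Assumes records sit directly under the root.
    """
    count = 0
    context = ET.iterparse(source, events=("start", "end"))
    # The first event is the start of the root element; keep a handle on it
    # so processed children can be discarded as we go.
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            count += 1  # real per-record processing would go here
            root.clear()  # drop finished records so memory use stays flat
    return count


# Made-up sample document, standing in for a file with 250 000 records.
sample = io.BytesIO(
    b"<records><record id='1'/><record id='2'/><record id='3'/></records>"
)
print(count_records(sample))
```

Clearing the root after each record is what keeps memory from growing with the
size of the file. If your records are nested deeper, you would want to clear
the record element itself instead.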
--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com http://copia.ogbuji.net http://4Suite.org
Articles:
http://uche.ogbuji.net/tech/publications/