QUESTION:
Does anyone know how I can use v2.6 of the MSXML parser with .NET?
BACKGROUND:
I have a "Web to Print" process that allows our clients (newspapers) to export
their data and pass it thru a custom Xslt stylesheet we have created for
their print system. The idea of the whole process for them is they request
the export and then they get a text file they can import (copy / paste) into
their system with all their styles and layout according to their business
rules. This works great except one of our new clients uses an older system
which requires ASCII characters which are excluded by the W3C
recommendation. So we can't use a true, fully compliant Xslt 1.0 processing
engine.
CURRENT STATUS:
Since v2.6 allows all ASCII characters I can get the stylesheet to work
from the command line using the MSXML command line utility by specifying
the -u (version) parameter. So now all I need to do is find a way to declare
which version of MSXML I am using from .NET.
Thank you in advance for any input.
Use COM interop. From a .NET project, select References, click on the COM tab, and point to the MSXML DLL (the version you're talking about). You can then use an Imports MSXML statement in a class.
I use MSXML 4.0 through interop. You can check whether this works for v2.6.
Thanks,
It works great. I found a VB sample for working w/ v2.6 here: http://www.informit.com/articles/art...31449&seqNum=3
Here is what I did in C#:
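// Requires a COM reference to MSXML (interop namespace MSXML2) and "using System.Text;".
public string Transform(string sXML, string XSLTPath)
{
    StringBuilder sbXML = new StringBuilder();

    // Source document, stylesheet document, and template, all pinned to MSXML 2.6.
    MSXML2.DOMDocument26Class XmlDoc26 = new MSXML2.DOMDocument26Class();
    MSXML2.FreeThreadedDOMDocument26Class XsltDoc26 = new MSXML2.FreeThreadedDOMDocument26Class();
    MSXML2.XSLTemplate26Class Xslt26 = new MSXML2.XSLTemplate26Class();
    MSXML2.IXSLProcessor XslProc26;

    // Load synchronously so each document is fully parsed before it is used.
    XmlDoc26.async = false;
    XmlDoc26.loadXML(sXML);
    XsltDoc26.async = false;
    XsltDoc26.load(XSLTPath);

    // XSLTemplate requires a free-threaded DOM document as its stylesheet.
    Xslt26.stylesheet = XsltDoc26;
    XslProc26 = Xslt26.createProcessor();
    XslProc26.input = XmlDoc26;

    // transform() returns true once the transformation has completed.
    if (XslProc26.transform())
    {
        sbXML.Append(XslProc26.output);
    }
    return sbXML.ToString();
}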
Please be aware, all, what you describe is not supported. http://support.microsoft.com/default...b;en-us;815112
-Dino
Is there an MS website that lists all the COM interops that they DON'T recommend/support?
For example, I use MSXML 4.0 and the Web Browser control all the time through .NET interop services. Of course, I never bothered to check support issues, mostly because it worked.
tMan wrote: is there a MS website that lists all their COM interops that they DON'T recommend/support?
I think it's obvious that anything that bypasses native .NET functionality
isn't recommended.
for e.g. i use the MSXML 4.0 and the Web Browser control all the time using .NET interop services. ofcourse i never bothered to check support issues mostly because it worked.
I'm not sure about the Web Browser control, but using MSXML in .NET is
definitely a bit weird. And like anything weird, it's not recommended
(and usually not supported, which is a different matter).
--
Oleg Tkachenko [XML MVP] http://blog.tkachenko.com
Oleg,
Oleg Tkachenko [MVP] wrote: I'm not sure about Web Browser control, but using MSXML in .NET is definitely a bit weird. And as anything weird that's not recommended (and usually is not supported, what is a different matter).
From a pure usability standpoint, MSXML looks like a more mature product;
I don't think it is weird to use it in .NET instead of a much weaker
native counterpart. We at RenderX pick MSXML as the default
parser/transformer for our XSL-FO implementation under .NET.
The reasons are:
- the .NET component is terribly slow - about 3 times slower than MSXML
in our tests. It takes nearly the same time to transform as to generate
the PDF, a real nonsense for people with Java background :-);
- it is less compliant: we could not overcome problems with the document()
function in XSLT, and attribute whitespace normalization in XML parser.
You can test it yourself: the .NET version of our tool has an option to
switch between the two. Just feed it a big file to transform (e.g. the XSL
spec itself), and feel the difference :-).
Regards,
Nikolai Grigoriev
RenderX
Oleg Tkachenko [MVP] wrote: I think that's obvious that any neglecting native .NET functionality isn't recommended
i agree on the recommendation
but using MSXML in .NET is definitely a bit weird. And as anything weird that's not recommended
i disagree. the computing world is full of heterogeneous systems, data sources, services, frameworks and legacy applications
my primary reason to use msxml was
- if i have (read client) invested heavily in a MSXML-DOM processing codebase, or for e.g. an ADO middle tier codebase that works really well, i won't necessarily rewrite everything in .NET. in other words i built a new .NET project but i used existing logic or components that were already there and working
here's a comparison between MSXML and .NET XML http://www.ondotnet.com/pub/a/dotnet.../xsltperf.html
in light of the document that is shared and if i were to accept the results,
it would seem interesting that MS would be pushing everyone off MSXML.
Any additional thoughts?
ice
Ice wrote: in light of the document that is shared and if i were to accept the results, it would seem interesting that MS would be pushing everyone off MSXML.
What do you mean exactly?
--
Oleg Tkachenko [XML MVP] http://blog.tkachenko.com
tMan wrote: my primary reason to use msxml was: - if i have (read client) invested heavily in a MSXML-DOM processing codebase, or for e.g. an ADO middle tier codebase that works really well, i won't necessarily rewrite everything in .NET. in other words i built a new .NET project but i used existing logic or components that were already there and working.
Using legacy applications written using MSXML is one thing. Legacy is
legacy. Nobody would argue against using it as far as it works ok.
Using MSXML in new .NET applications is another thing, which is not
recommended and is not supported by Microsoft due to unexpected and
hard-to-debug interop problems.
here's a comparison between MSXML and .NET XML http://www.ondotnet.com/pub/a/dotnet.../xsltperf.html
Well, that's not a comparison I'd take seriously. The author didn't even
bother to use XPathDocument, which is the primary source for XSLT in .NET.
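For readers who have not used it, here is a minimal sketch of feeding an XPathDocument to XslTransform, assuming the .NET 1.1 API discussed in this thread. The class name, method name, and file arguments are hypothetical and chosen only for illustration.

```csharp
using System;
using System.IO;
using System.Xml.XPath;
using System.Xml.Xsl;

public class XPathDocumentTransform
{
    // sourcePath and stylesheetPath are hypothetical file names supplied by the caller.
    public static string Transform(string sourcePath, string stylesheetPath)
    {
        // XPathDocument is a read-only, XPath-oriented store for the source XML.
        XPathDocument source = new XPathDocument(sourcePath);

        // XslTransform is the .NET 1.x processor; later framework versions
        // mark it obsolete in favour of XslCompiledTransform.
        XslTransform xslt = new XslTransform();
        xslt.Load(stylesheetPath);

        using (StringWriter output = new StringWriter())
        {
            // Overload: (IXPathNavigable, XsltArgumentList, TextWriter, XmlResolver).
            // A null resolver means document() calls are not resolved at run time.
            xslt.Transform(source, null, output, null);
            return output.ToString();
        }
    }

    public static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.Error.WriteLine("usage: XPathDocumentTransform source.xml stylesheet.xsl");
            return;
        }
        Console.WriteLine(Transform(args[0], args[1]));
    }
}
```

Whether this closes the gap with MSXML depends on the stylesheet and the document, so it is worth timing on your own data.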
--
Oleg Tkachenko [XML MVP] http://blog.tkachenko.com
Nikolai Grigoriev wrote: - the .NET component is terribly slow - about 3 times slower than MSXML in our tests. It takes nearly the same time to transform as to generate the PDF, a real nonsense for people with Java background :-);
I don't think it's slower than Java's XSLT processors. At least I haven't
found it so. Of course I'm talking about .NET 1.1 and XPathDocument.
MSXML is faster, but not without the interop price.
- it is less compliant: we could not overcome problems with the document() function in XSLT, and attribute whitespace normalization in XML parser.
Hmmm, what's the problem with document() function?
--
Oleg Tkachenko [XML MVP] http://blog.tkachenko.com
Oleg,
Oleg Tkachenko [MVP] wrote: I don't think it's slower Java's XSLT processors. At least I didn't found it so. Of course I'm talking about .NET 1.1 and XPathDocument. MSXML is faster, but not without interop price.
It may depend on the task. Here's what I get on a typical task of ours,
namely applying an XSL-FO stylesheet to produce the XSL-FO
version of the XSL-FO spec itself:
C:\> msxsl -t -o spec.fo xslspec.xml xmlspec20.xsl
Microsoft (R) XSLT Processor Version 4.0
Stylesheet execution time: 1077 milliseconds
C:\> nxslt -t -o spec.fo xslspec.xml xmlspec20.xsl
.NET XSLT command line utility, version 1.4 (Running under .NET {0}.{1})
Stylesheet execution time: 14325.210 milliseconds
C:\> java com.icl.saxon.StyleSheet -ds -t -o spec.fo xslspec.xml xmlspec20.xsl
SAXON 6.5.3 from Michael Kay
Java version 1.4.1_01
Execution time: 4547 milliseconds
Hmmm, what's the problem with document() function?
It's in the resolution of the document('') function, with empty string
as the argument. Weirdly enough, it does not work unless the
stylesheet is loaded through an explicit URL. I mean that it works
in this case:
XslTransform xslt = new XslTransform();
xslt.Load("stylesheet.xsl");
but does not work if we load the stylesheet by some other method,
e.g. through XPathDocument:
XslTransform xslt = new XslTransform();
XPathDocument style = new XPathDocument("stylesheet.xsl");
xslt.Load(style, new XmlUrlResolver(), null);
or through an XmlReader:
XslTransform xslt = new XslTransform();
XmlReader reader =
new XmlValidatingReader (new XmlTextReader("stylesheet.xsl"));
xslt.Load(reader, new XmlUrlResolver(), null);
no matter which resolver you pass to the Transform() method. This
leaves us with no chance to load the stylesheet through a custom
XmlReader (highly desirable because no system reader implements
whitespace handling conformantly to XML 1.0 spec).
I am still unsure if the above is a bug, or a feature. It could be related
to the necessity of having a base URL to resolve the document()
function; but [1] both XPathNavigator and XmlReader have a BaseURI
property, [2] one hardly needs it to address the contents of the stylesheet
itself. I'm really clueless.
Regards,
Nikolai Grigoriev
RenderX
Nikolai Grigoriev wrote: C:\> nxslt -t -o spec.fo xslspec.xml xmlspec20.xsl .NET XSLT command line utility, version 1.4 (Running under .NET {0}.{1})
Oh, that "{0}.{1}" is a silly bug in nxslt 1.4 :(
Which .NET version are you using here? Most likely it has something to
do with the deadly slow xsl:key implementation, which AFAIR was fixed in
the latest .NET SP (not sure if it was released though).
--
Oleg Tkachenko [XML MVP] http://blog.tkachenko.com
Oleg Tkachenko [MVP] wrote: C:\> nxslt -t -o spec.fo xslspec.xml xmlspec20.xsl .NET XSLT command line utility, version 1.4 (Running under .NET {0}.{1})
Oh, that "{0}.{1}" is silly bug in nxslt 1.4 :( What's .NET version you are using here? Most likely it has something to do with deadly slow xsl:key imnplementation, which AFAIR was fixed in latest .NET SP (not sure if it was released though).
Here are the results on my Win2K box with .NET 1.1 (not sure which
.NET SP though):
D:\...\RenderX\XEP.NET 3.7\examples\xmlspec>nxslt -t -o spec.fo xml2e.xml xmlspec20.xsl
.NET XSLT command line utility, version 1.5 (Running under .NET 1.1)
Source document load time: 246.344 milliseconds
Stylesheet document load time: 1.172 milliseconds
Stylesheet compile time: 251.400 milliseconds
Stylesheet execution time: 5482.756 milliseconds
--
Oleg Tkachenko [XML MVP] http://blog.tkachenko.com
Nikolai Grigoriev wrote: It's in the resolution of the document('') function, with empty string as the argument. Weirdly enough, it does not work unless the stylesheet is loaded through an explicit URL. I mean that it works in this case:
XslTransform xslt = new XslTransform(); xslt.Load("stylesheet.xsl");
but does not work if we load the stylesheet by some other method, e.g. through XPathDocument:
XslTransform xslt = new XslTransform(); XPathDocument style = new XPathDocument("stylesheet.xsl"); xslt.Load(style, new XmlUrlResolver(), null);
or through an XMLReader:
XslTransform xslt = new XslTransform(); XmlReader reader = new XmlValidatingReader (new XmlTextReader("stylesheet.xsl")); xslt.Load(reader, new XmlUrlResolver(), null);
I think that's because of the null Evidence. In fact, MSDN says this about the
Evidence argument of the XslTransform.Load() method:
"If this is a null reference (Nothing in Visual Basic), script blocks
are not processed, the XSLT document() function is not supported, and
privileged extension objects are disallowed."
The following seems to be working:
xslt.Load(reader, new XmlUrlResolver(),
Assembly.GetExecutingAssembly().Evidence);
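For context, a fuller sketch of the same call, assuming .NET Framework 1.1 and its Evidence overloads of XslTransform.Load(). The class name, method name, and file names are hypothetical.

```csharp
using System;
using System.Reflection;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public class LoadWithEvidence
{
    // stylesheetPath is a hypothetical file name supplied by the caller.
    public static XslTransform LoadStylesheet(string stylesheetPath)
    {
        XslTransform xslt = new XslTransform();
        XmlTextReader reader = new XmlTextReader(stylesheetPath);
        try
        {
            // Passing the calling assembly's evidence instead of null is what
            // enables document(), script blocks, and extension objects.
            xslt.Load(reader, new XmlUrlResolver(),
                      Assembly.GetExecutingAssembly().Evidence);
        }
        finally
        {
            reader.Close();
        }
        return xslt;
    }

    public static void Main(string[] args)
    {
        // Hypothetical input files.
        XslTransform xslt = LoadStylesheet("stylesheet.xsl");
        XPathDocument source = new XPathDocument("input.xml");

        // The resolver passed here is the one used for document() at run time.
        xslt.Transform(source, null, Console.Out, new XmlUrlResolver());
    }
}
```

The evidence of the executing assembly is only one possible choice; an application that runs stylesheets from less trusted sources may want to supply more restrictive evidence.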
Also take a look at XmlResolver - you can implement any resolution logic
with it.
I am still unsure if the above is a bug, or a feature. It could be related to the necessity of having a base URL to resolve the document() function; but [1] both XPathNavigator and XmlReader have a BaseURI property, [2] one hardly needs it to address the contents of the stylesheet itself. I'm really clueless.
Btw, it's not always possible to support document('') at all. When a
stylesheet is compiled and can then be saved (e.g. as with Apache
XSLTC), the compiled stylesheet only holds the original URI of the
stylesheet document, which is obviously not always available from the
run-time environment.
A couple of years ago Mike Kay said that it's OK for an XSLT processor not to
support document('') in such cases.
--
Oleg Tkachenko [XML MVP] http://blog.tkachenko.com
Oleg,
Oleg Tkachenko [MVP] wrote: What's .NET version you are using here? Most likely it has something to do with deadly slow xsl:key imnplementation, which AFAIR was fixed in latest .NET SP (not sure if it was released though).
I did my tests with the retail version of .NET Framework 1.1, build 4322,
no extra SPs installed. I need this component to be part of an application,
so I cannot afford to be picky about Service Packs. A component is good
only if it performs decently on something that a user gets by default -
I cannot blame clients for not installing SPs.
D:\...\RenderX\XEP.NET 3.7\examples\xmlspec>nxslt -t -o spec.fo xml2e.xml xmlspec20.xsl
Actually, I was running the same stylesheet but on a different document :-): http://www.w3.org/TR/2001/REC-xsl-20011015/xslspec.xml
It's the XSL spec, 400+ pages in the final PDF, ca 1 MB of XML source.
On these documents, the MSXML performance starts making the
difference; and unfortunately, they're typical enough.
Regards,
Nikolai Grigoriev
RenderX
Oleg,
Oleg Tkachenko [MVP] wrote: I think that's because of null Evidence. In fact MSDN states that about Evidence argument of XslTransform.Load() method: "If this is a null reference (Nothing in Visual Basic), script blocks are not processed, the XSLT document() function is not supported, and privileged extension objects are disallowed."
Thank you for the hint - this actually did the trick! I am still a bit
puzzled why document() requires different permissions than
xsl:import, and why document('') requires any permission settings
at all - but at least it behaves according to the docs.
Btw, it's not always possible to support document('') at all. When stylesheet is compiled and then can be saved (e.g. as with Apache XSLTC), then compiled stylesheet only holds original URI of the stylesheet document, which is obviously not always available from the run-time environment. Couple of years ago Mike Kay said that's ok for XSLT processor not to support document('') in such cases.
My case is nowhere as special as XSLTC. I just need a transformer that
supports XSLT 1.0 in its entirety, when passed XML data through standard
interfaces. I really find it weird that I need two additional steps to
confirm that I do want it to behave according to the spec: first an
explicit XmlResolver to enable imports, and then an explicit Evidence
to enable document(). Wouldn't it be more natural to provide a full
support for XSLT by default, and require some extra effort
to disable its parts - if one really has reasons to do so?
(After the XmlTextReader half-parser nightmare, I should have
been prepared for something like that; but I am amazed nevertheless).
Regards,
Nikolai Grigoriev
RenderX
Nikolai Grigoriev wrote: Actually, I was running the same stylesheet but on a different document :-):
http://www.w3.org/TR/2001/REC-xsl-20011015/xslspec.xml
It's the XSL spec, 400+ pages in the final PDF, ca 1 MB of XML source. On these documents, the MSXML performance starts making the difference; and unfortunately, they're typical enough.
Oh, my mistake.
Hmm, with the XSL spec things look scary on my apparently rusty Dell box:
D:\...\RenderX\XEP.NET 3.7\examples\xmlspec>nxslt -t -o spec.fo xslspec.xml xmlspec20.xsl
.NET XSLT command line utility, version 1.5 build 1591
Running under .NET 1.1.4322.985
Source document load time: 1417.447 milliseconds
Stylesheet load/compile time: 476.787 milliseconds
Stylesheet execution time: 53503.800 milliseconds
D:\...\RenderX\XEP.NET 3.7\examples\xmlspec>java com.icl.saxon.StyleSheet -t -o spec.fo xslspec.xml xmlspec20.xsl
SAXON 6.5.2 from Michael Kay
Java version 1.4.2_03
Loading com.icl.saxon.sort.Compare_en
Preparation time: 2363 milliseconds
Processing file:/D:/Program Files/RenderX/XEP.NET 3.7/examples/xmlspec/xslspec.xml
Building tree for file:/D:/Program Files/RenderX/XEP.NET 3.7/examples/xmlspec/xslspec.xml
using class com.icl.saxon.tinytree.TinyBuilder
Tree built in 3415 milliseconds
Execution time: 77542 milliseconds
Well, benchmarking is such convoluted stuff. Forget it.
--
Oleg Tkachenko [XML MVP] http://blog.tkachenko.com
Nikolai Grigoriev wrote: (After the XmlTextReader half-parser nightmare, I should have been prepared to something like that; but I am amazed nevertheless).
Btw, have you tried turning on the XmlTextReader.Normalization property?
It's false by default.
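A minimal sketch of switching it on, assuming an in-memory document for illustration. The class name, method name, and sample XML are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public class NormalizationDemo
{
    // Reads the "b" attribute of the root element, with or without normalization.
    public static string ReadAttribute(string xml, bool normalize)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            // Must be set before reading; enables attribute-value and
            // end-of-line normalization.
            reader.Normalization = normalize;
            reader.MoveToContent();
            return reader.GetAttribute("b");
        }
        finally
        {
            reader.Close();
        }
    }

    public static void Main()
    {
        // Sample attribute value containing a literal tab character.
        string xml = "<a b=\"x\ty\"/>";
        Console.WriteLine("[{0}]", ReadAttribute(xml, false));
        Console.WriteLine("[{0}]", ReadAttribute(xml, true));
    }
}
```

Comparing the two printed values shows whether the reader rewrites whitespace inside attribute values on your framework version.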
--
Oleg Tkachenko [XML MVP] http://blog.tkachenko.com