Using XmlReader with a namespace in an HTML file

mickey0
Hello,
I must parse something like this:
    <html>
    <head></head>
    <body>
    <using xmlns:namespace="myfile" />
    myText
    <p><namespace:myComponent Attribs="val"> //it crashes on 'namespace:myCOmponent'
    </body>
    </html>
But my XML parser throws an error on 'namespace:myComponent'. Is there a way to overcome this? The parser is this:
    public Parser(string fileContent) {
        _settings.ConformanceLevel = ConformanceLevel.Fragment;
        _settings.IgnoreWhitespace = true;
        _settings.IgnoreComments = true;
        // Note: XmlReader.Create(string, ...) treats the string as a URI or file path, not as markup.
        _textReader = XmlReader.Create(fileContent, _settings);
        _textReader.Read();
        XmlNodeType nType = _textReader.NodeType;
        while (_textReader.Read()) {
            switch (_textReader.NodeType) {
                case XmlNodeType.Element: // The node is an element.
                    Console.Write("<" + _textReader.Name);
                    Console.WriteLine(">");
                    break;
                case XmlNodeType.Text: // Display the text in each element.
                    Console.WriteLine(_textReader.Value);
                    break;
                case XmlNodeType.EndElement: // Display the end of the element.
                    Console.Write("</" + _textReader.Name);
                    Console.WriteLine(">");
                    break;
            }
        }
    }
Thanks
Jun 22 '10 #1
mickey0
Hello again,
Does anyone have any ideas about this?
Jun 26 '10 #2
Joseph Martell
Can you be more specific about the error you are receiving? Also, is there a reason you are declaring the namespace in this fashion instead of on the root element, as in a normal XML/XHTML document:

    <html xmlns:namespace="myfile">
Jun 27 '10 #3
Presumably the HTML document in the example above isn't what you actually used, as it's incomplete.

Declaring the namespace on an element other than the root is perfectly valid XML, so that shouldn't cause any issue.
Jun 27 '10 #4
Joseph Martell
True, it is valid, but isn't the namespace scope limited to the element that it is declared in? So declaring it on a self-closing element means the namespace is not valid outside of the "<using..." tag?

If this is not correct then my apologies. I have limited experience with xml namespaces.
Jun 27 '10 #5
mickey0
@jbm1313
That is my own XML/HTML-like tag language. The exception arises when the XmlReader sees the tag "<namespace:myComponent", as it says that the namespace hasn't been declared before.
I know it's not proper XML, but I need to handle this "my-own-tag language"; is there any way to do that?
I was thinking of something like this:
    while (reader.Read()) {
        switch (reader.NodeType) {
            case XmlNodeType.Element:
                if (reader.Name == "using") {
                    // register "somewhere", as a namespace, the part that follows "xmlns" in the 'using' tag, e.g. 'namespace'
                }
                break;
            // case ...: (other node types)
        }
    }
Jun 28 '10 #6
simongh
Ha! I missed that bit. That's entirely the problem. The namespace is declared on a self-closing element, so it is scoped to only that element. When the XmlReader encounters the prefix further down, it throws a "namespace undeclared" exception. As it stands, what you've got is invalid XML, which no XML parser can read.

Change your XML to the version below, or move the namespace declaration to the myComponent element.

    <?xml version="1.0" encoding="utf-8" ?>
    <html>
        <head></head>
        <body>
            <using xmlns:namespace="myfile">
                myText
                <p>
                    <namespace:myComponent Attribs="val"> //it crashes on 'namespace:myCOmponent'</namespace:myComponent>
                </p>
            </using>
        </body>
    </html>
or this

    <?xml version="1.0" encoding="utf-8" ?>
    <html>
        <head></head>
        <body>
            <using/>
            myText
            <p>
                <namespace:myComponent xmlns:namespace="myfile" Attribs="val"> //it crashes on 'namespace:myCOmponent'</namespace:myComponent>
            </p>
        </body>
    </html>
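
To see the scoping rule in action, here is a minimal sketch that feeds two small strings to XmlReader. The class name `ScopeDemo`, the helper `Parses`, and the trimmed-down test strings are made up for illustration; only the `using` and `namespace:myComponent` tags come from this thread.

```csharp
using System;
using System.IO;
using System.Xml;

static class ScopeDemo
{
    // Returns true when XmlReader can read the whole string without an XmlException.
    public static bool Parses(string xml)
    {
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
            {
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Declaration on a self-closing element: the prefix is out of scope afterwards.
        string selfClosing =
            "<body><using xmlns:namespace=\"myfile\" />" +
            "<p><namespace:myComponent Attribs=\"val\" /></p></body>";

        // Declaration on an enclosing element: the prefix is visible to all descendants.
        string enclosing =
            "<body><using xmlns:namespace=\"myfile\">" +
            "<p><namespace:myComponent Attribs=\"val\" /></p></using></body>";

        Console.WriteLine(Parses(selfClosing)); // False
        Console.WriteLine(Parses(enclosing));   // True
    }
}
```

The first string fails because the prefix is only declared for the `using` element itself; the second succeeds because `namespace:myComponent` sits inside the element that carries the declaration.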
Jun 28 '10 #7
mickey0
Can you confirm that there is no way to keep my way of declaring the namespace and handle it somehow at the programming-language level?

Thanks
Jun 28 '10 #8
Joseph Martell
I don't believe that you can without resorting to writing your own custom XML classes from scratch. Even if you add the schema to your XML document object programmatically (which can be done), the namespace prefix will still throw an exception during parsing because it is not in scope in your example.

What you are talking about doing here diverges from the way the XML standard handles namespaces. That fundamentally breaks the .NET XML objects.

If I were you, I would take one of the changes suggested by simongh.
Jun 28 '10 #9
mickey0
Understood. Can you tell me what you mean by 'schema'? I mean, if I use the closing tag </using>, the XmlReader works perfectly; so what would I have to add (by 'schema', I mean)?

Regards.
Jun 28 '10 #10
Joseph Martell
@mickey0
Sorry, my mistake. I meant to say namespace, not schema.
Jun 28 '10 #11
mickey0
That basically doesn't change my question...
Jun 28 '10 #12
Joseph Martell
Thanks for asking that last question. It made me dig deeper into the .NET XML objects.

In order to get your original scenario to work you would have to read the XML as a string, find the "<using..." tags, and then add them to a separate namespace manager that you use for parsing the string. MSDN has an article that shows something similar to what you are talking about.

MSDN Article

This will result in an XmlReader that reads through your xml file without generating an error.
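
As a rough sketch of that approach, assuming the prefix and URI have already been pulled out of the "<using..." tag by some other means: the namespace manager is filled first and handed to the reader through an XmlParserContext. The class name `ContextDemo`, the helper `ReadNames`, and the sample string are illustrative only.

```csharp
using System;
using System.IO;
using System.Xml;

static class ContextDemo
{
    // Reads a fragment with one prefix pre-declared and returns the element names seen.
    public static string ReadNames(string xmlFrag, string prefix, string uri)
    {
        NameTable nt = new NameTable();
        XmlNamespaceManager nsmgr = new XmlNamespaceManager(nt);
        nsmgr.AddNamespace(prefix, uri); // must happen before the reader is created

        // The parser context carries the pre-declared namespaces into the reader.
        XmlParserContext context = new XmlParserContext(nt, nsmgr, null, XmlSpace.None);

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ConformanceLevel = ConformanceLevel.Fragment;
        settings.IgnoreWhitespace = true;

        string names = "";
        using (XmlReader reader = XmlReader.Create(new StringReader(xmlFrag), settings, context))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    names += reader.Name + ";";
                }
            }
        }
        return names;
    }

    public static void Main()
    {
        // Same shape as the original markup: the declaration sits on a self-closing tag.
        string xmlFrag =
            "<body><using xmlns:namespace=\"myfile\" />myText" +
            "<p><namespace:myComponent Attribs=\"val\" /></p></body>";

        Console.WriteLine(ReadNames(xmlFrag, "namespace", "myfile"));
    }
}
```

Because the prefix is already known to the reader through the context, the self-closing "<using..." tag no longer matters for scoping.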

Be aware that if you did something like this:

    XmlDocument myDoc = new XmlDocument();
    myDoc.Load(reader);
then myDoc.OuterXml will reflect the addition of the new namespaces and you will wind up with a document very similar to version 2 of simongh's suggested fixes.

Using the XmlNamespaceManager does provide a work-around, but I still think that simongh's suggestions would be more correct because you would start off with readable, portable, and correct XML.
Jun 28 '10 #13
mickey0
Yes, that would be great. But to repeat this for completeness: are you sure I can use what you describe without knowing the namespace prefix in advance? Can I embed what you describe here?
    while (reader.Read()) {
        switch (reader.NodeType) {
            case XmlNodeType.Element:
                if (reader.Name == "using") {
                    // register "somewhere", as a namespace, the part that follows "xmlns" in the 'using' tag, e.g. 'namespace'
                    // ONLY HERE do I know the name of the namespace I will encounter later, e.g. 'namespace'
                }
                break;
            // case ...: (other node types)
        }
    }
Jun 28 '10 #14
Joseph Martell
As far as I can tell, no, your example code would NOT work. You cannot add new namespaces once the reader object has been instantiated. When I tried that in my example code I received an exception.

You would have to pull out the "<using..." tags manually BEFORE you instantiated your reader object. Changes to the XmlNamespaceManager object do not affect the XmlReader after the XmlReader has been instantiated.
Jun 29 '10 #15
mickey0
@jbm1313
Thanks. Not to insist, but maybe there is a way to do that? (You've understood perfectly what I was talking about.)
Jun 29 '10 #16
mickey0
Can you tell me how to switch off the namespace checking?
Jun 29 '10 #17
Joseph Martell
Pulling out the "<using..." tag is not necessarily difficult, just troublesome and probably slow. You would essentially have to do something like this:


    NameTable nt = new NameTable();
    XmlNamespaceManager nsmgr = new XmlNamespaceManager(nt);

    int usingStartIndex = xmlFrag.IndexOf("<using", 0);

    while (usingStartIndex >= 0)
    {
        // The prefix runs from just after "xmlns:" up to the '=' sign.
        int usingPrefixIndex = xmlFrag.IndexOf(':', usingStartIndex) + 1;
        int equalsIndex = xmlFrag.IndexOf('=', usingPrefixIndex);
        string usingPrefix = xmlFrag.Substring(usingPrefixIndex, equalsIndex - usingPrefixIndex).Trim();

        // The URI is the text between the double quotes that follow the '='.
        int usingURIIndex = xmlFrag.IndexOf('"', equalsIndex) + 1;
        int usingURIEnd = xmlFrag.IndexOf('"', usingURIIndex);
        string usingURI = xmlFrag.Substring(usingURIIndex, usingURIEnd - usingURIIndex);

        // Add namespace info to the namespace manager.
        nsmgr.AddNamespace(usingPrefix, usingURI);

        // Continue searching after the current tag; searching from the same index would loop forever.
        usingStartIndex = xmlFrag.IndexOf("<using", usingURIEnd);
    }

This is untested, but it does compile.

To turn off the namespace checking, use:
    reader.Namespaces = false;

This line MUST occur before any call to the Read() method on your reader object.
Jun 29 '10 #18
mickey0
Sorry, I meant this (I'm quite new to C# as well):
http://msdn.microsoft.com/en-us/libr...amespaces.aspx

But your suggestion doesn't compile:
    Error 1 'System.Xml.XmlReader' does not contain a definition for 'Namespaces'
Jun 29 '10 #19
Joseph Martell
What version of the .Net framework are you using?
Jun 29 '10 #20
mickey0
I'm using .NET 2005. BTW, I have .NET Framework 2.0 SP2, 3.0 SP2, and 3.5 SP1 installed (why so many?).
Jun 30 '10 #21
Joseph Martell
I think that there are 2 options available to you. You can either construct an XmlTextReader object explicitly, OR you can look at the different versions of System.Xml.XmlReader.Create. There are several overloads that take different parameters. There are a couple of parameters that deal with how the XmlReader validates everything. I think that you should look at

http://msdn.microsoft.com/en-us/libr...t_members.aspx

and

http://msdn.microsoft.com/en-us/libr...rsettings.aspx
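
A minimal sketch of the first option, assuming the markup is available as a string. The Namespaces property belongs to XmlTextReader rather than to the abstract XmlReader class, which would explain the compile error reported above. The class name `NoNamespacesDemo`, the helper `ReadNames`, and the sample string are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

static class NoNamespacesDemo
{
    // Reads the markup with namespace support switched off and returns the element names seen.
    public static string ReadNames(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.Namespaces = false; // must be set before the first call to Read()
        reader.WhitespaceHandling = WhitespaceHandling.None;

        string names = "";
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element)
            {
                // With namespace support off, "namespace:myComponent" is just a name containing a colon.
                names += reader.Name + ";";
            }
        }
        reader.Close();
        return names;
    }

    public static void Main()
    {
        string xml =
            "<body><using xmlns:namespace=\"myfile\" />myText" +
            "<p><namespace:myComponent Attribs=\"val\" /></p></body>";

        Console.WriteLine(ReadNames(xml));
    }
}
```

With namespace support off, the reader no longer resolves prefixes at all, so the "<using..." declaration shows up as an ordinary attribute and any prefix handling is left to your own code.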
Jun 30 '10 #22
mickey0
I saw them, but I didn't find anything like 'disable namespaces'...
Jun 30 '10 #23
