
XML Parser Components?

Hi Folks,

This is my first post to this group, and I am not sure whether this is
the right group for my question. If it's not an appropriate question
for this group, please correct me and point me to the right place.

The thing is, I have been asked to design an XML parser in C. I have
done some study on XML so far, and I know that I should have a design
before I start coding.

Since I am new to parsers, I am confused about what the components of
my parser should be. All I know now is that I need a validating
component that validates the XML file, which should then pass the XML
file on to the parsing component for parsing.

My confusion is about the parsing component. I can't decide what the
sub-components of the parsing component should be.

Would some of you be kind enough to enlighten me on this issue?

Thanks in Advance.

Mahesh.

Sep 14 '06 #1
ma**************@gmail.com wrote:
> validating component that validates the XML file, which should then
> pass the XML file on to the parsing component for parsing.

It's usually done the other way around -- write a nonvalidating parser
to deal with the syntactic issues, then attach the validator to that.
(That isn't the only solution, or always the best solution, just the
easiest way to think about the problem.)

> My confusion lies on the parsing component. Its like I can't decide
> what should be the sub-components of the parsing component.

For a basic implementation, read any good book on parser design and/or
feed the XML grammar into any standard parser generator tool (e.g. the
YACC/LEX set).
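
To make the split into sub-components a bit more concrete, here is a
minimal hand-written sketch in C. It is only an illustration under heavy
assumptions, not a conforming XML parser: it ignores attributes,
comments, CDATA sections, entities, processing instructions and the XML
declaration. All names (parse_xml, event_fn, event_kind, count_events)
are made up for this example. The loop plays the role of the scanner, the
stack check is the well-formedness part of a nonvalidating parser, and the
callback is the place where a validator or a tree builder could be
attached afterwards.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical event kinds reported by the nonvalidating core. */
typedef enum { EV_START, EV_END, EV_TEXT } event_kind;

/* Handler layer: a validator or tree builder would plug in here. */
typedef void (*event_fn)(event_kind kind, const char *s, size_t len,
                         void *user);

#define MAX_DEPTH 64

/* Scans `xml`, reports events, and checks that tags nest properly.
   Returns 0 on success, -1 on a syntax or nesting error. */
int parse_xml(const char *xml, event_fn on_event, void *user)
{
    const char *stack[MAX_DEPTH];   /* names of the open elements */
    size_t stack_len[MAX_DEPTH];
    int depth = 0;
    const char *p = xml;

    while (*p) {
        if (*p != '<') {                        /* character data */
            const char *start = p;
            while (*p && *p != '<') p++;
            on_event(EV_TEXT, start, (size_t)(p - start), user);
            continue;
        }
        p++;                                    /* skip '<' */
        int closing = (*p == '/');
        if (closing) p++;
        const char *name = p;
        while (*p && *p != '>') p++;
        if (*p != '>' || p == name) return -1;  /* unterminated or empty tag */
        size_t len = (size_t)(p - name);
        p++;                                    /* skip '>' */

        if (!closing) {
            if (depth == MAX_DEPTH) return -1;  /* nesting too deep */
            stack[depth] = name;
            stack_len[depth] = len;
            depth++;
            on_event(EV_START, name, len, user);
        } else {
            if (depth == 0) return -1;          /* end tag without start tag */
            depth--;
            if (stack_len[depth] != len ||
                memcmp(stack[depth], name, len) != 0)
                return -1;                      /* mismatched end tag */
            on_event(EV_END, name, len, user);
        }
    }
    return depth == 0 ? 0 : -1;                 /* every element closed? */
}

/* Example handler: counts events. `user` must point to an int[3]
   indexed by event_kind. */
void count_events(event_kind kind, const char *s, size_t len, void *user)
{
    (void)s;
    (void)len;
    ((int *)user)[kind]++;
}
```

Calling parse_xml("<a><b>hi</b></a>", count_events, counts) with a
zeroed int counts[3] reports two start tags, two end tags and one text
run, while "<a><b></a></b>" is rejected because the end tags do not match
the open elements. A validator would be written as another event_fn that
checks each event against a DTD or schema.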

Strong suggestion that -- unless this is a class assignment or you
believe you have a new approach that has significant advantages -- you
consider instead using one of the many parsers already available. (And I
assume that if the latter applied, you wouldn't have posted this vague a
question.) Reinventing wheels is sometimes useful; reimplementing
existing wheels is generally a waste of resources.

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Sep 14 '06 #2
Joe Kesselman wrote:
> Strong suggestion that -- unless this is a class assignment or you
> believe you have a new approach that has significant advantages -- you
> consider instead using one of the many parsers already available. (And I

Joe is right. If you really think that you should
write your own parser, be prepared to deal with all
the details of Unicode. For example, have you ever
heard of the BOM at the beginning of an XML file?
Will your parser be able to deal with UTF-7 as well
as UTF-32?
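
For anyone unsure what the BOM check involves, here is a minimal sketch
in C of sniffing a byte order mark at the start of a buffer. The names
(encoding, detect_bom) are invented for this example, and it only looks
at the BOM itself; a real parser would also have to handle files without
a BOM, for instance by looking at the encoding declaration.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    ENC_UNKNOWN, ENC_UTF8, ENC_UTF16_BE, ENC_UTF16_LE,
    ENC_UTF32_BE, ENC_UTF32_LE
} encoding;

/* Inspects the first bytes of `buf` (n bytes available) for a byte
   order mark. Stores the BOM length in *bom_len (0 if none found). */
encoding detect_bom(const unsigned char *buf, size_t n, size_t *bom_len)
{
    *bom_len = 0;

    /* UTF-32 is tested before UTF-16 because the UTF-32 little-endian
       BOM (FF FE 00 00) begins with the UTF-16 little-endian BOM. */
    if (n >= 4 && memcmp(buf, "\x00\x00\xFE\xFF", 4) == 0) {
        *bom_len = 4;
        return ENC_UTF32_BE;
    }
    if (n >= 4 && memcmp(buf, "\xFF\xFE\x00\x00", 4) == 0) {
        *bom_len = 4;
        return ENC_UTF32_LE;
    }
    if (n >= 3 && memcmp(buf, "\xEF\xBB\xBF", 3) == 0) {
        *bom_len = 3;
        return ENC_UTF8;
    }
    if (n >= 2 && memcmp(buf, "\xFE\xFF", 2) == 0) {
        *bom_len = 2;
        return ENC_UTF16_BE;
    }
    if (n >= 2 && memcmp(buf, "\xFF\xFE", 2) == 0) {
        *bom_len = 2;
        return ENC_UTF16_LE;
    }
    return ENC_UNKNOWN;    /* no BOM: the caller has to guess or default */
}
```

The caller would skip bom_len bytes and then hand the rest of the buffer
to a decoder for the detected encoding.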

Use Expat or libxml:

http://expat.sourceforge.net/
http://xmlsoft.org/
Sep 14 '06 #3
Jürgen Kahrs wrote:
> Joe is right. If you really think that you should
> write your own parser, be prepared to deal with all
> the details of Unicode.

Well, one can start with an I/O library that handles Unicode; those
exist too. And sometimes it does make sense to have an implementation
that only supports a limited set of encodings, if you are certain that
those are all your application is ever going to see.

But there are lots of details in XML itself, especially if you want a
modern XML environment that supports namespaces, validation against
schemas, the standard XML APIs (DOM and/or SAX)...

A basic XML parser is a reasonable term project. A practical, efficient,
robust, validating XML parser is rather more. So unless this is a class
assignment (or equivalent), I'd definitely go back to whoever said "write
one" and ask them why they want you to do that.

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Sep 14 '06 #4

Jürgen Kahrs wrote:
> Joe Kesselman wrote:
>> Strong suggestion that -- unless this is a class assignment or you
>> believe you have a new approach that has significant advantages -- you
>> consider instead using one of the many parsers already available. (And I
>
> Joe is right. If you really think that you should
> write your own parser, be prepared to deal with all
> the details of Unicode. For example, have you ever
> heard of the BOM at the beginning of an XML file?
> Will your parser be able to deal with UTF-7 as well
> as UTF-32?

My parser needs to worry only about UTF-8, which I think is not as
difficult to deal with as what you were asking about (the other UTFs).

> Use Expat or libxml:
>
> http://expat.sourceforge.net/
> http://xmlsoft.org/
Sep 15 '06 #5
ma**************@gmail.com wrote:
>> Will your parser be able to deal with UTF-7 as well
>> as UTF-32?
>
> My parser need to worry only about UTF-8, which, i think, is not that
> difficult to deal as compared to what you were asking (the UTF's).

Even UTF-8 data may contain a Byte Order Mark (BOM).
Be prepared to read up to 4 bytes per "character".
Byte order itself only becomes an issue once you go
beyond UTF-8 to UTF-16 or UTF-32.
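
As a rough illustration of the "up to 4 bytes per character" point, here
is a minimal sketch in C that decodes one code point from UTF-8. The
function name utf8_decode is made up for this example. Note that in UTF-8
the BOM is always the fixed byte sequence EF BB BF, which decodes to
U+FEFF like any other three-byte character, so a UTF-8-only parser can
simply decode the first character and drop it if it is U+FEFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes one code point from `s` (n bytes available) into *cp.
   Returns the number of bytes consumed (1 to 4), or 0 if the input
   is malformed or truncated. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    /* Smallest code point that legitimately needs a given length;
       used to reject overlong encodings. Index 0 is unused. */
    static const uint32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};
    size_t len, i;
    uint32_t c;

    if (n == 0) return 0;

    if (s[0] < 0x80)                { len = 1; c = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
    else return 0;          /* stray continuation or invalid lead byte */

    if (n < len) return 0;  /* truncated sequence */

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }

    if (c < min_cp[len]) return 0;             /* overlong encoding */
    if (c > 0x10FFFF) return 0;                /* beyond Unicode range */
    if (c >= 0xD800 && c <= 0xDFFF) return 0;  /* surrogate code points */

    *cp = c;
    return len;
}
```

A parser's input layer would call this in a loop, advancing by the return
value and reporting an encoding error when it returns 0.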

But (as Joe suggested), there are libraries that
do the conversion for you. Use libiconv, which
implements the POSIX iconv interface (see "man iconv").
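
For reference, a minimal sketch in C of letting iconv do the conversion
to UTF-8. The wrapper name to_utf8 is made up for this example; the calls
iconv_open, iconv and iconv_close are the POSIX interface. Which encoding
names are accepted (for example "UTF-16LE" or "ISO-8859-1") depends on
the implementation, and on some systems you need to link with -liconv.
The sketch converts a whole buffer in one call and treats any error,
including an output buffer that is too small, as failure.

```c
#include <iconv.h>
#include <stddef.h>

/* Converts `in` (in_len bytes, encoded as `from`) to UTF-8 in `out`.
   Returns the number of bytes written, or (size_t)-1 on failure. */
size_t to_utf8(const char *from, const char *in, size_t in_len,
               char *out, size_t out_cap)
{
    iconv_t cd = iconv_open("UTF-8", from);   /* (to, from) order */
    if (cd == (iconv_t)-1)
        return (size_t)-1;                    /* unsupported encoding name */

    char *src = (char *)in;   /* POSIX iconv takes a non-const char ** */
    char *dst = out;
    size_t in_left = in_len;
    size_t out_left = out_cap;

    size_t rc = iconv(cd, &src, &in_left, &dst, &out_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;    /* invalid input or output buffer too small */
    return out_cap - out_left;
}
```

A parser that only understands UTF-8 internally could run its input
through a wrapper like this first, after working out the source encoding
from the BOM or the encoding declaration.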
Sep 15 '06 #6

Jürgen Kahrs wrote:
> ma**************@gmail.com wrote:
>>> Will your parser be able to deal with UTF-7 as well
>>> as UTF-32?
>> My parser need to worry only about UTF-8, which, i think, is not that
>> difficult to deal as compared to what you were asking (the UTF's).
>
> Even UTF-8 data may contain a Byte Order Mark (BOM).
> Be prepared to read up to 4 bytes per "character".
> Byte order itself only becomes an issue once you go
> beyond UTF-8 to UTF-16 or UTF-32.

I shall make sure to handle the BOM.

> But (as Joe suggested), there are libraries that
> do the conversion for you. Use libiconv, which
> implements the POSIX iconv interface (see "man iconv").

I will certainly look into libiconv. Thank you all for your
suggestions.

Sep 18 '06 #7
