XML vs SGML for Unicode support?

Hello,

Can anyone please give me a short, concise list of the pros and cons of
Unicode support in both SGML and XML?

Long story short, we are gonna port our legacy SGML files to XML, and
the new XML files will have foreign (CJK) and ASCII/English text in them.

XML would be better to store the text in because it has better Unicode
support than SGML, right? What advantages does XML have over SGML,
besides the fact that the default encoding for XML is Unicode and the
default for SGML is not?

We gotta back up our move to XML with some reasons for management.

Any thoughts on the pros and cons of SGML and XML when it comes to
Unicode support would be much appreciated!

thanks

krammer
Jul 20 '05 #1
In article <2c************ ************@po sting.google.co m>,
krammer <kr************ ***@yahoo.com> wrote:

% Can any one please give me a short but concise pros and cons list of
% Unicode support in both SGML and XML?

The only real advantage of XML in this regard is that you know Unicode
is going to be present in any parser you use, because it's a required
part of the language (every conforming XML processor must accept UTF-8
and UTF-16). With SGML, I'd think any recent parser will also give you
Unicode support, and once you've got a parser that works, the problem
is solved and you can get on with your life.
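
To make the point above concrete, here is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree parser. The sample document and the helper names serialise and read_titles are made up for illustration and are not from the original post. It writes the same CJK/English content out as UTF-8 and as UTF-16 and reads both back:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content mixing CJK and ASCII/English text.
BODY = '<doc><title lang="ja">日本語</title><title lang="en">Japanese</title></doc>'


def serialise(body, encoding):
    """Return the document as bytes in `encoding`, with a matching XML declaration."""
    declaration = f'<?xml version="1.0" encoding="{encoding}"?>'
    return (declaration + body).encode(encoding)


def read_titles(raw_bytes):
    """Parse raw XML bytes and return the text of every <title> element."""
    root = ET.fromstring(raw_bytes)
    return [title.text for title in root.findall("title")]


utf8_titles = read_titles(serialise(BODY, "utf-8"))
utf16_titles = read_titles(serialise(BODY, "utf-16"))  # Python's codec prepends a BOM

# Report whether both byte streams decoded to the same Unicode strings.
print(utf8_titles == utf16_titles)
```

The calling code never names an encoding when reading; the parser works it out from the byte order mark and the XML declaration. With SGML the equivalent behaviour depends on the particular parser and how it is configured.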

XML stripped a lot of flexibility and user-convenience features out of
SGML, which makes it easier to write correct parsers, as well as to
write tools which can process data without intimate regard for its
structure. This may be an argument for XML if you're spending a lot of
time developing special-purpose tools for each of your data formats, or
something like that.
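
As a rough illustration of a tool that processes data "without intimate regard for its structure", the sketch below (Python, standard library only) pulls the character data out of any well-formed XML document without a DTD. The function name text_chunks and both sample vocabularies are invented for this example. In SGML, features such as omitted tags mean a parser generally needs the DTD before it can even find the element boundaries.

```python
import xml.etree.ElementTree as ET


def text_chunks(xml_source):
    """Return every non-empty run of character data, in document order."""
    root = ET.fromstring(xml_source)
    return [chunk.strip() for chunk in root.itertext() if chunk.strip()]


# Two unrelated, made-up vocabularies; the function knows nothing about either.
memo = "<memo><to>管理部</to><body>Port to XML</body></memo>"
parts = '<parts><part id="7"><name>ボルト</name></part></parts>'

for document in (memo, parts):
    print(text_chunks(document))
```

The same few lines serve both formats, which is the kind of saving that matters if you currently maintain a separate special-purpose tool per format.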

I assume that the biggest problem you're having right now is dealing
with non-ASCII data, in which case I'd justify the change by showing
that you need to make a change in any case. Your current programs can't
handle your data, so you either have to replace your current SGML
programs with new SGML programs that can handle Unicode data, or you
need to replace them with a new data representation and programs that
handle it. You then show that the approach you want to take is the
lower-cost one, or the one that will give you greater flexibility,
or you discover that it's not the right approach.
--

Patrick TJ McPhee
East York Canada
pt**@interlog.c om
Jul 20 '05 #2
