replace illegal xml characters

hi!

I am working with XML exported from InDesign and parsing it in a Python
application. I learned here: http://boodebr.org/main/python/all-a...on-and-unicode
that there are sets of Unicode characters that are illegal in XML
(and hence for every compliant XML parser). I have already implemented
a regex solution to replace the characters in question, but I wonder
if there is an efficient, out-of-the-box solution somewhere out
there for this problem. Does anybody know?

thanks!
gabriel

Mar 21 '07 #1
In <11**********************@d57g2000hsg.googlegroups.com>, killkolor
wrote:
> I am working with XML exported from InDesign and parsing it in a Python
> application. I learned here: http://boodebr.org/main/python/all-a...on-and-unicode
> that there are sets of Unicode characters that are illegal in XML
> (and hence for every compliant XML parser). I have already implemented
> a regex solution to replace the characters in question, but I wonder
> if there is an efficient, out-of-the-box solution somewhere out
> there for this problem. Does anybody know?

Does InDesign export broken XML documents? What exactly is your problem?

Ciao,
Marc 'BlackJack' Rintsch
Mar 21 '07 #2
> Does InDesign export broken XML documents? What exactly is your problem?

Yes, unfortunately it does. It uses all possible Unicode characters,
though not all of them are allowed in valid XML (see the link in the first post).
In any case, my application should check whether the incoming XML
is valid and replace all non-valid characters. Is there
something out there to do this?

Mar 21 '07 #3
On Mar 21, 8:03 am, "killkolor" <gabriel.h...@gmail.com> wrote:
>> Does InDesign export broken XML documents? What exactly is your problem?
>
> Yes, unfortunately it does. It uses all possible Unicode characters,
> though not all of them are allowed in valid XML (see the link in the first post).
> In any case, my application should check whether the incoming XML
> is valid and replace all non-valid characters. Is there
> something out there to do this?

You might be able to use "Beautiful Soup":

http://www.crummy.com/software/BeautifulSoup/

There are also some good examples for parsing XML at
http://www.devarticles.com/c/a/XML/P...AX-and-Python/
and on the Dive Into Python site.

Mike

Mar 21 '07 #4
killkolor wrote:
>> Does InDesign export broken XML documents? What exactly is your problem?
>
> Yes, unfortunately it does. It uses all possible Unicode characters,
> though not all of them are allowed in valid XML (see the link in the first post).
> In any case, my application should check whether the incoming XML
> is valid and replace all non-valid characters. Is there
> something out there to do this?

I doubt it. Dealing with broken XML is not something standard modules should
have to cope with. The link you provided has all you need - why not just use it?

Diez
Mar 21 '07 #5
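
For reference, the regex approach discussed in this thread might look roughly like the sketch below. It assumes Python 3 and the XML 1.0 definition of legal characters; the function name `replace_illegal_xml_chars` and the `replacement` parameter are made up for illustration and are not taken from the linked article.

```python
import re

# Characters allowed by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The negated character class below matches everything outside those ranges.
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def replace_illegal_xml_chars(text, replacement=""):
    """Return text with every character that XML 1.0 forbids replaced.

    Hypothetical helper: by default the offending characters are dropped;
    pass e.g. replacement="?" to keep a visible marker instead.
    """
    return _ILLEGAL_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    # Made-up sample containing a backspace (U+0008), which XML 1.0 forbids.
    print(replace_illegal_xml_chars("broken\x08text", "?"))
```

This works on decoded text (`str`), so a file read as bytes would need decoding first. Whether dropping or replacing the characters is appropriate depends on the application.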
killkolor wrote:
>> Does InDesign export broken XML documents? What exactly is your problem?
>
> Yes, unfortunately it does. It uses all possible Unicode characters,
> though not all of them are allowed in valid XML (see the link in the first post).

Are you sure about this? Could you post a small example?

If this is true, don't forget to file a bug report with Adobe too.

--Irmen
Mar 21 '07 #6
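
A small reproduction of the kind asked for here could look like the following sketch. The sample strings are made up for illustration and are not actual InDesign output, and the helper name `is_well_formed` is hypothetical. It assumes Python 3 and uses the standard-library `xml.etree.ElementTree` parser, which rejects control characters such as U+0008.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Made-up samples: the second contains a raw backspace (U+0008),
# which is outside the character range XML 1.0 allows.
good = "<story>plain text</story>"
bad = "<story>broken\x08text</story>"

if __name__ == "__main__":
    print(is_well_formed(good))
    print(is_well_formed(bad))
```

A check like this only reports that parsing failed. It does not say which character caused the failure, so it is useful for confirming the problem rather than for fixing it.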
