processing html tags inside xml with fop

Hello,

I've got an object that's being converted into a SAXSource and then rendered to PDF with FOP. Some of the data, however, is HTML stored inside the XML elements, and it is being escaped (&gt;, etc.) before it is transformed.

I'd like these HTML tags to be processed as actual elements by the stylesheet, which would mean reading them in their unescaped form, but I can't figure out how to do this. I have "disable-output-escaping" set in the stylesheet, but I think the data in the XML has already been treated as escaped text by the time it reaches the stylesheet.

Here's the code for converting.


// Configure the user agent and force PDF rendering.
FOUserAgent foUserAgent = getUserAgent();

PDFRenderer pdfrenderer = new PDFRenderer();
pdfrenderer.setUserAgent(foUserAgent);
foUserAgent.setRendererOverride(pdfrenderer);

URIResolver resolver = myWebContext.getResolver();
foUserAgent.setURIResolver(resolver);

ByteArrayOutputStream out = new ByteArrayOutputStream();

byte[] b = null;
try {
    TransformerFactory factory = TransformerFactory.newInstance();

    // Tried on a Transformer instance, but it kicks back an "invalid property" error:
    // transformer.setOutputProperty("disable-output-escaping", "yes");

    Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);

    // Compile the stylesheet located through the resolver.
    Source xsl = resolver.resolve(xslFilename, null);
    Transformer transformer = factory.newTransformer(xsl);

    // Pipe the transformation output (XSL-FO) into FOP as SAX events.
    Result res = new SAXResult(fop.getDefaultHandler());
    transformer.transform(xmlSrc, res);

    b = out.toByteArray();
} // catch block not shown

Any help would be greatly appreciated.

Thanks!

-Jamie
Dec 7 '07 #1
After much futzing around with this, I ended up using JAXB to generate the XML I needed, enclosing the HTML data in CDATA sections.

I then switched the TransformerFactory implementation to Saxon to process the stylesheet:

System.setProperty("javax.xml.transform.TransformerFactory", "net.sf.saxon.TransformerFactoryImpl");
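
A possible alternative to the JVM-wide system property, sketched here as an assumption rather than what was actually used: since Java 6, JAXP can be asked for a specific factory class directly, which keeps the choice local to one call. The helper name `saxonOrDefault` is made up, and the fallback to the default factory is only there so the sketch still runs when Saxon is not on the classpath.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

final class FactoryChooser {

    /** Returns Saxon's factory if it can be loaded, otherwise the JAXP default. */
    static TransformerFactory saxonOrDefault() {
        try {
            // A null class loader means "use the current thread's context class loader".
            return TransformerFactory.newInstance(
                    "net.sf.saxon.TransformerFactoryImpl", null);
        } catch (TransformerFactoryConfigurationError e) {
            // Saxon is not available; extension functions such as saxon:parse() will not work.
            return TransformerFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        System.out.println(saxonOrDefault().getClass().getName());
    }
}
```

In a real setup you would probably want the missing-Saxon case to fail loudly instead of falling back, since the stylesheet depends on it.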

I then used Saxon's saxon:parse() function to further process the HTML inside the CDATA sections, since it's well-formed. That gave me an XML result for FOP that included the HTML elements I needed to produce the correct PDF formatting.
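
For anyone without Saxon, the same idea (turning a string of well-formed markup into real nodes) can be sketched in plain JAXP before the transform runs. This is not saxon:parse() itself, just an illustration of the step it performs; the class name and the `wrapper` and `description` element names are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class FragmentParser {

    /** Builds a document whose root holds the markup as parsed child nodes. */
    static Document buildWithParsedMarkup(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();

            // Wrap the fragment in a throwaway root so that mixed text and
            // several sibling tags still form one well-formed document.
            Document parsed = builder.parse(new InputSource(
                    new StringReader("<wrapper>" + markup + "</wrapper>")));

            Document doc = builder.newDocument();
            Element description = doc.createElement("description"); // hypothetical name
            doc.appendChild(description);

            // Copy every parsed child (elements and text) into the target document.
            for (Node child = parsed.getDocumentElement().getFirstChild();
                    child != null; child = child.getNextSibling()) {
                description.appendChild(doc.importNode(child, true));
            }
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Markup is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildWithParsedMarkup("<b>bold</b> text");
        System.out.println(doc.getElementsByTagName("b").getLength());
    }
}
```

Like saxon:parse(), this only works when the embedded HTML is well-formed; unclosed tags such as a bare `<br>` make the parser throw.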

It now works, although it's probably not the greatest of solutions.

I couldn't figure out how to generate CDATA sections with SAX and LexicalHandlers, so this was the next best alternative.
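
For reference, one way this might be done with the JDK alone: a TransformerHandler obtained from a SAXTransformerFactory is both a ContentHandler and a LexicalHandler, so wrapping the characters() call in startCDATA()/endCDATA() should make the serializer emit a CDATA section. This is a sketch I have only reasoned through against the JDK's default serializer; the class name and the `description` element are placeholders.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

final class CdataWriter {

    /** Emits <description> with the given markup inside a CDATA section. */
    static String writeWithCdata(String html) {
        try {
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            handler.getTransformer()
                    .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            handler.startDocument();
            handler.startElement("", "description", "description", new AttributesImpl());
            handler.startCDATA();   // LexicalHandler event: open the CDATA section
            handler.characters(html.toCharArray(), 0, html.length());
            handler.endCDATA();     // LexicalHandler event: close it
            handler.endElement("", "description", "description");
            handler.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not write XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeWithCdata("<b>bold</b>"));
    }
}
```

Note that CDATA only changes how the text is written out. A parser still hands it to the stylesheet as plain text, so a parsing step such as saxon:parse() is still needed afterwards.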

Cheers


-Jamie
Dec 12 '07 #2
