Processing HTML tags inside XML with FOP

Hello,

I've got an object that's being converted into a SAXSource and then rendered to PDF with FOP. Some of the data, however, is HTML inside the XML elements, and it is escaped (&gt;, etc.) before it is transformed.

I'd like these HTML tags to be parsed as actual elements by the stylesheet, which would mean reading the data in its unescaped form, but I can't figure out how to do this. I have "disable-output-escaping" set in the stylesheet, but I think the data in the XML has already been parsed as escaped text before it reaches the stylesheet.
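A minimal sketch of the behaviour described above, not taken from the original code: it uses only the JDK's DOM parser, and the `<note>` element and class name are hypothetical. It shows that the parser hands escaped markup over as plain character data with no child elements, so the stylesheet never sees any tags to match. Note also that "disable-output-escaping" is an attribute on xsl:value-of and xsl:text that only affects serialization, so it is not a valid Transformer output property, and it is unlikely to help when the result goes to FOP as SAX events rather than being serialized.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class EscapedMarkupDemo {
    // Hypothetical sample: <note> holds HTML that was escaped when the XML was written.
    static final String XML = "<note>&lt;b&gt;bold&lt;/b&gt; text</note>";

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    // Counts the direct children of e that are elements.
    static int countChildElements(Element e) {
        int count = 0;
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        Element note = parseRoot(XML);
        // The text comes back un-escaped, but it is still only text.
        System.out.println("text content:   " + note.getTextContent());
        System.out.println("child elements: " + countChildElements(note));
    }
}
```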

Here's the code for converting.

FOUserAgent foUserAgent = getUserAgent();

PDFRenderer pdfrenderer = new PDFRenderer();
pdfrenderer.setUserAgent(foUserAgent);
foUserAgent.setRendererOverride(pdfrenderer);

URIResolver resolver = myWebContext.getResolver();
foUserAgent.setURIResolver(resolver);

ByteArrayOutputStream out = new ByteArrayOutputStream();

byte[] b = null;
try {
    TransformerFactory factory = TransformerFactory.newInstance();

    // transformer.setOutputProperty("disable-output-escaping", "yes");
    // kicks back an error: invalid property

    Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);

    // Load the stylesheet and build the transformer from it.
    Source xsl = resolver.resolve(xslFilename, null);
    Transformer transformer = factory.newTransformer(xsl);

    // Pipe the transformation result (XSL-FO) straight into FOP as SAX events.
    Result res = new SAXResult(fop.getDefaultHandler());
    transformer.transform(xmlSrc, res);

    b = out.toByteArray();
} catch (Exception e) {
    // Error handling was not shown in the original post.
    throw new RuntimeException(e);
}

Any help would be greatly appreciated.

Thanks!

-Jamie

Dec 7 '07 #1


jamieg99

After much futzing around with this, I ended up using JAXB to generate the XML I needed and enclosing the HTML data inside CDATA sections.

I then switched the TransformerFactory implementation to Saxon to process the stylesheet:

System.setProperty("javax.xml.transform.TransformerFactory", "net.sf.saxon.TransformerFactoryImpl");

I then used Saxon's saxon:parse() extension function to parse the HTML inside the CDATA section, since it's well-formed. That gave me an XML result for FOP to handle, including the HTML elements I needed to produce the correct PDF formatting.
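For comparison, here is a rough JDK-only sketch of the same idea. It is not the Saxon approach described above: instead of calling saxon:parse() in the stylesheet, it re-parses the embedded markup in Java and splices the resulting nodes into the document before the transform. The class name, the `<report>`/`<body>` elements and the `<wrapper>` root are all hypothetical, and like saxon:parse() it assumes the embedded HTML is well-formed XML (HTML-only entities such as &nbsp; would make the parse fail).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class EmbeddedMarkupExpander {

    // Parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Replaces the text of target with the nodes obtained by parsing that text.
    static void expand(Element target) throws Exception {
        // The parser has already un-escaped the text (or unwrapped the CDATA section).
        String markup = target.getTextContent();
        // Wrap in a throwaway root so fragments with several top-level nodes still parse.
        Document fragment = parse("<wrapper>" + markup + "</wrapper>");

        Document owner = target.getOwnerDocument();
        while (target.hasChildNodes()) {
            target.removeChild(target.getFirstChild());
        }
        for (Node n = fragment.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            // importNode copies the node into the owning document; the original stays in place.
            target.appendChild(owner.importNode(n, true));
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<report><body><![CDATA[<p>Hello <b>world</b></p>]]></body></report>");
        Element body = (Element) doc.getElementsByTagName("body").item(0);
        expand(body);
        System.out.println("first child of body: " + ((Element) body.getFirstChild()).getTagName());
        System.out.println("b elements in body:  " + body.getElementsByTagName("b").getLength());
    }
}
```

The expanded document could then be passed to the transformer as a DOMSource in place of the original source, so that the stylesheet can match the HTML elements with ordinary templates.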

It now works, although it's probably not the greatest of solutions.

I couldn't figure out how to generate CDATA sections with SAX and LexicalHandlers, so this was the next best alternative.
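One way to get CDATA sections without writing a LexicalHandler, sketched here under assumptions: this is not the SAX route mentioned above but the standard "cdata-section-elements" output property, applied with a JDK identity transform. The class name and the `<report>`/`<body>` element names are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class CdataOutputDemo {

    // Builds <report><body>html</body></report> and serializes it with body's text as CDATA.
    static String serialize(String html) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element report = doc.createElement("report");
        Element body = doc.createElement("body");
        body.setTextContent(html);
        report.appendChild(body);
        doc.appendChild(report);

        // An identity transformer (no stylesheet) just serializes the input.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Text children of the listed elements are written as CDATA sections.
        identity.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "body");

        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("<p>Hello</p>"));
    }
}
```

Keep in mind that CDATA only changes how the text is written out. A parser reading it back still delivers plain text, so the content still has to be parsed a second time before a stylesheet can treat it as elements.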

Cheers

-Jamie

Dec 12 '07 #2
