Bytes | Software Development & Data Engineering Community

processing html tags inside xml with fop

Hello,

I've got an object that's being converted into a SAXSource and then into a PDF with FOP. However, some of the data inside the XML tags is HTML, and it is being escaped (&gt;, etc.) before it is transformed.

I'd like to have these HTML tags parsed as actual elements by the stylesheet, which would mean reading the data in its unescaped form, but I can't figure out how to do this. I have "disable-output-escaping" set in the stylesheet, but I think the data in the XML has already been parsed as escaped text before it reaches the stylesheet.
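
To make the symptom concrete, here is a minimal sketch that uses only the JDK's DOM parser. The class name EscapedMarkupDemo and the sample <note> element are made up for illustration and are not part of the original code. It suggests why the stylesheet never sees the HTML as elements: once the XML is parsed, escaped markup (and CDATA content as well) is plain character data with no child elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; not taken from the original post.
class EscapedMarkupDemo {

    /** Parses the XML string and counts the element children of its root element. */
    static int countChildElements(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Element root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        int count = 0;
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Escaped markup is a single text node: prints 0
        System.out.println(countChildElements("<note>&lt;b&gt;bold&lt;/b&gt;</note>"));
        // A CDATA section is also character data, not elements: prints 0
        System.out.println(countChildElements("<note><![CDATA[<b>bold</b>]]></note>"));
        // Only real markup produces an element: prints 1
        System.out.println(countChildElements("<note><b>bold</b></note>"));
    }
}
```

In other words, disable-output-escaping only affects how the result is written out; it does not change what the stylesheet receives as input.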

Here's the code for converting.


FOUserAgent foUserAgent = getUserAgent();

PDFRenderer pdfrenderer = new PDFRenderer();
pdfrenderer.setUserAgent(foUserAgent);
foUserAgent.setRendererOverride(pdfrenderer);

URIResolver resolver = myWebContext.getResolver();
foUserAgent.setURIResolver(resolver);

ByteArrayOutputStream out = new ByteArrayOutputStream();

byte[] b = null;
try {
    TransformerFactory factory = TransformerFactory.newInstance();

    Transformer transformer = factory.newTransformer();

    // "disable-output-escaping" is an attribute of xsl:value-of and xsl:text,
    // not a serialization output property, so this call is rejected.
    //transformer.setOutputProperty("disable-output-escaping", "yes");
    // kicks back error --- invalid property

    Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);

    // Load the stylesheet through the same resolver that FOP uses.
    Source xsl = resolver.resolve(xslFilename, null);
    transformer = factory.newTransformer(xsl);

    // Feed the XSL-FO produced by the stylesheet to FOP as SAX events.
    Result res = new SAXResult(fop.getDefaultHandler());
    transformer.transform(xmlSrc, res);

    b = out.toByteArray();
} catch (Exception e) {
    // The original post does not show its error handling; this is a placeholder.
    throw new RuntimeException("PDF generation failed", e);
}


Any help would be greatly appreciated.

Thanks!

-Jamie
Dec 7 '07 #1
jamieg99
After much futzing around with this, I ended up using JAXB to generate the XML I needed and enclosing the HTML data inside CDATA sections.

I then switched the TransformerFactory implementation to Saxon to process the stylesheet:

System.setProperty("javax.xml.transform.TransformerFactory", "net.sf.saxon.TransformerFactoryImpl");

I then used Saxon's saxon:parse() extension function to parse the HTML inside the CDATA sections, which works because the data is well-formed HTML. That gave me an XML result for FOP to handle, including the HTML elements I needed to produce the correct PDF formatting.
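
For anyone who can't switch to Saxon, a rough analogue of the same idea can be sketched with the JDK's DOM API alone. This is not saxon:parse() and not the code I actually used. The class name EmbeddedMarkupExpander, the expand() method, and the <doc>/<body> element names are all hypothetical. It re-parses the text content of a chosen element and swaps in the resulting nodes before the stylesheet runs, and like saxon:parse() it assumes the embedded HTML is well-formed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; a stdlib stand-in for the saxon:parse() approach.
class EmbeddedMarkupExpander {

    /** Replaces the text of every tagName element with the nodes obtained by parsing that text. */
    static Document expand(String xml, String tagName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // getElementsByTagName returns a live list, so snapshot it before editing the tree.
        NodeList live = doc.getElementsByTagName(tagName);
        List<Element> targets = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            targets.add((Element) live.item(i));
        }

        for (Element target : targets) {
            String markup = target.getTextContent();
            // Wrap the fragment so that several top-level nodes still form one document.
            Document fragment = builder.parse(
                    new InputSource(new StringReader("<wrapper>" + markup + "</wrapper>")));

            // Drop the old text or CDATA children.
            while (target.hasChildNodes()) {
                target.removeChild(target.getFirstChild());
            }
            // Copy the parsed nodes into the main document.
            for (Node n = fragment.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                target.appendChild(doc.importNode(n, true));
            }
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = expand(
                "<doc><body><![CDATA[<p>Hello <b>world</b></p>]]></body></doc>", "body");
        Node first = doc.getElementsByTagName("body").item(0).getFirstChild();
        System.out.println(first.getNodeName()); // prints "p"
    }
}
```

The expanded document could then be handed to the transformer as a javax.xml.transform.dom.DOMSource in place of the original SAXSource, so that the stylesheet can match the HTML elements with ordinary templates.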

It now works, although probably not the greatest of solutions.

I couldn't figure out how to generate CDATA sections with SAX and a LexicalHandler, so this was the next best alternative.
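
For reference, here is a minimal sketch of one way CDATA sections can be written from SAX events, using the standard JAXP TransformerHandler, which implements both ContentHandler and LexicalHandler. I have not tried it in the FOP pipeline above. The class name CdataSaxDemo and the <body> element are made up for illustration, and the exact serialized output may vary between TransformerFactory implementations.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical demo class; not part of the original code.
class CdataSaxDemo {

    /** Serializes <body> with the given markup wrapped in a CDATA section. */
    static String write(String html) throws Exception {
        // The JDK's default factory supports SAX; other factories may not allow this cast.
        SAXTransformerFactory factory = (SAXTransformerFactory) TransformerFactory.newInstance();

        // With no stylesheet, the handler performs an identity transformation.
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        handler.startDocument();
        handler.startElement("", "body", "body", new AttributesImpl());

        // LexicalHandler callbacks mark the enclosed characters as a CDATA section.
        handler.startCDATA();
        handler.characters(html.toCharArray(), 0, html.length());
        handler.endCDATA();

        handler.endElement("", "body", "body");
        handler.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write("<b>bold</b>"));
    }
}
```

Note that a CDATA section is only a different way of escaping text: the stylesheet still receives character data rather than elements, so a parsing step such as saxon:parse() is needed either way.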

Cheers


-Jamie
Dec 12 '07 #2
