Convert PDF to XML (XHTML)

Hi all.

Is it possible to convert a PDF file to XML format?
Using Cocoon, maybe?
I found how to do the opposite, i.e. to convert XML to PDF
(using XSL-FO and Cocoon), but not this.

Any help will be appreciated.

Thanks in advance

Anna
Jul 20 '05 #1
Anna wrote:
> Hi all.
>
> Is it possible to convert a PDF file to XML format?
> Using Cocoon, maybe?
> I found how to do the opposite, i.e. to convert XML to PDF
> (using XSL-FO and Cocoon), but not this.

Adobe have a "Save as XML" beta plug-in for Acrobat 5.0 for Windows:
http://www.adobe.com/support/downloa...atform=Windows

--
Bjorn Brox, CORENA Norge AS, http://www.corena.no/, ICQ 17872043
Industritunet, Dyrmyrgt. 35, N-3611 Kongsberg, NORWAY
Phone: +47 32717210, Fax: +47 32717201, Mobile: +47 92638590

Jul 20 '05 #2
Have a look at http://www.square1.nl/TGC-SITE/New-pdf2vector.htm

This converts PDF to SVG with a very high degree of accuracy. We've
been able to wrap their SVG in XSL-FO and re-create the PDF from it,
which is pretty cool.
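
A rough sketch of what wrapping SVG in XSL-FO can look like (an illustration only, not the code used in the reply above): XSL-FO lets foreign XML such as SVG sit inside an `fo:instream-foreign-object` element, and the resulting FO document can then be handed to an FO processor to produce a PDF. The function name `wrap_svg_in_fo`, the A4 page size, and the master name `page` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
SVG_NS = "http://www.w3.org/2000/svg"

# Use readable prefixes in the serialized output.
ET.register_namespace("fo", FO_NS)
ET.register_namespace("svg", SVG_NS)


def fo(tag):
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def wrap_svg_in_fo(svg_text, page_width="210mm", page_height="297mm"):
    """Embed one SVG document in a single-page XSL-FO document.

    Hypothetical helper: the page size and master name are
    illustrative defaults, not values taken from the thread.
    """
    svg_root = ET.fromstring(svg_text)

    root = ET.Element(fo("root"))
    layout = ET.SubElement(root, fo("layout-master-set"))
    master = ET.SubElement(
        layout,
        fo("simple-page-master"),
        {
            "master-name": "page",
            "page-width": page_width,
            "page-height": page_height,
        },
    )
    ET.SubElement(master, fo("region-body"))

    sequence = ET.SubElement(
        root, fo("page-sequence"), {"master-reference": "page"}
    )
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    block = ET.SubElement(flow, fo("block"))

    # The SVG is carried as foreign XML inside the FO block.
    holder = ET.SubElement(block, fo("instream-foreign-object"))
    holder.append(svg_root)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="50">'
        '<rect width="100" height="50" fill="none" stroke="black"/></svg>'
    )
    print(wrap_svg_in_fo(sample))
```

The printed FO still has to be rendered by an FO processor; the SVG produced by a PDF-to-SVG converter would take the place of `sample`.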

On 3 Nov 2003 04:26:30 -0800, an**@ubaccess.com (Anna) wrote:
> Hi all.
>
> Is it possible to convert a PDF file to XML format?
> Using Cocoon, maybe?
> I found how to do the opposite, i.e. to convert XML to PDF
> (using XSL-FO and Cocoon), but not this.
>
> Any help will be appreciated.
>
> Thanks in advance
>
> Anna


---
Rob Tweed
M/Gateway Developments Ltd

Global DOMination with eXtc : http://www.mgateway.tzo.com
---
Jul 20 '05 #3
