PDF to HTML conversion help...

P: n/a
Hello,

I posted the following a few weeks ago and never received any replies,
so I figured I'd try again.

I'm seeking suggestions for an interesting problem I have. I'm building a
web application that presents legal documents that users must accept in order
to proceed on the site. The documents are in PDF format. The legal
department requires that I convert the PDF docs to HTML and display them on a
web page. I can't just display the PDF document (which would be simple
enough) but rather need to convert the PDF to HTML. This will allow me to
include the standard "I agree..." checkbox and submit button at the bottom of
the page. Users will only be able to get to the submit button by scrolling
down the entire HTML page, and thus theoretically reading the entire
agreement. I found a tool called "ABC Amber PDF Converter" that converts PDF
files to HTML, but the output quality leaves a LOT to be desired, to the point
of being unusable. Plus, it requires shelling out to launch the
command-line exe. Do you have any other ideas, or know of any other tools
that might do the trick?

Thanks for your help,
Greg S.
Nov 18 '05 #1
7 Replies


P: n/a
I know Adobe Acrobat lets you embed forms in a PDF. You could
easily create a form that posts to your server and inserts the data.

Hope this helps,
Anthony

--
Anthony J Biondo Jr. - MCP
Senior Web Developer
Keystone Mercy Health Plan - Philadelphia, PA
www.keystonemercy.com

Nov 18 '05 #2

P: n/a
Hello Anthony,

Thanks for your suggestion. Actually, I'm using a tool named PDFKit.Net to
generate the PDF. Basically, I'm doing what you suggested in that I have a
PDF Form in which I dynamically insert data from the db. The only problem is
that the legal department wants me to email the customer the PDF version of
the document I generate, but display an HTML version of it on the web page.
Their requirement is that I display the whole thing as one long document
(with no scroll bars) so that the user is forced to scroll down the entire
page to get to the accept checkbox and submit button. They won't let me just
display the PDF document in a web page or in some sort of viewer control. So
I'm trying to find some way to convert the PDF document to HTML while keeping
most of the formatting intact. Hopefully there's something out there that
does this...

Thanks for your help,
Greg

Nov 18 '05 #3

P: n/a
You can buy Adobe Acrobat & convert them that way.

Or a better solution would probably be to embed the PDF into an HTML page
that has the "I Agree" button in it. That way you don't have to convert
anything at all. Search on "embed pdf html" -- here's a result I found
http://www.planetpdf.com/mainpage.asp?webpageid=1682

--
Ben Strackany
www.developmentnow.com

Nov 18 '05 #4

P: n/a
Ben,

Thanks for your suggestions. Unfortunately, because the server my web app
sits on is used for various other sites as well, and Adobe Acrobat tends to
be a bit of a resource hog, my bosses don't want to install it on the server.
Plus, since I will be generating the documents on the fly, I'm not sure how
Acrobat would handle simultaneous requests. I know MS doesn't recommend
using Office products in this fashion.

As for embedding the PDF in the HTML page, I thought of that one, too. The
problem is the legal department doesn't want it to be inside of a separate
control with scrollbars, as this conceivably would allow the user to scroll
down the page to get to the "I Agree" button without actually scrolling
through the legal agreement, so that one was a no go.

I really do appreciate your input, though.

Greg

Nov 18 '05 #5

P: n/a
webgreginsf wrote:
Hello Anthony,

Thanks for your suggestion. Actually, I'm using a tool named
PDFKit.Net to generate the PDF.


You are *generating* those PDFs yourself? Why not generate the
HTML version from the same data, instead of first going to PDF
and then to HTML?
If you want output in PDF format, use your existing code;
if you want output in HTML, generate that directly from the data,
exactly the way *you* want it.
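
A minimal sketch of that idea, in Python purely for illustration (the thread itself concerns a .NET application): the same set of user-specific values that fills the PDF form could also fill an HTML template, with the acceptance checkbox placed after the agreement text. The template wording, the field names, and the `/accept` URL are all hypothetical.

```python
from html import escape
from string import Template

# Hypothetical HTML template; the real agreement text would come from the legal department.
AGREEMENT_TEMPLATE = Template("""\
<html>
<body>
<h1>Service Agreement</h1>
<p>This agreement is made between $company_name and $user_name.</p>
<!-- ...remaining agreement text... -->
<form method="post" action="/accept">
<label><input type="checkbox" name="agree" value="yes"> I agree</label>
<input type="submit" value="Submit">
</form>
</body>
</html>
""")


def render_agreement(fields):
    """Fill the template with user data, escaping values so they cannot inject markup."""
    safe_fields = {name: escape(str(value)) for name, value in fields.items()}
    # substitute() raises KeyError if a placeholder has no value,
    # which catches the template and the data drifting out of sync.
    return AGREEMENT_TEMPLATE.substitute(safe_fields)


if __name__ == "__main__":
    print(render_agreement({"user_name": "Greg S.", "company_name": "Example & Co."}))
```

Because the form sits at the end of the same page as the agreement, there is no separate viewer control with its own scrollbars.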

Hans Kesting
Nov 18 '05 #6

P: n/a
Hans,

Thanks for your idea. That is actually a very good idea and might be what I
have to do. The only reason I was trying to steer clear of generating the
HTML is that these documents I'm generating/displaying are 30+ page monster
legal contracts that can change fairly often. The legal department creates
the originals in MS Word, which we then convert to a PDF form using Acrobat
Writer. This form serves as a "template" that I place out on the web server
and populate certain user-specific pieces of data like the user's name,
company info, etc. at run time. We were hoping to automate the process so
that the users can generate the PDF templates without any manual intervention
from my team. If I use the HTML method like you talked about, I'll probably
have to manually convert the monsters to HTML on a fairly regular basis.
They don't want MS Word on the server, so I can't just use the built-in Word
to HTML conversion to generate the HTML as the generated HTML uses
Word-specific tags that require Word to be on the web server.

I think you're right, though, and doing the HTML myself is probably the best
option, so that I'll have two templates for each doc (both PDF and HTML
versions). I can then dynamically fill in the same info into both of them at
run time. It requires some manual steps, but hey, something is better than
nothing, right?

Thanks for your help,
Greg

Nov 18 '05 #7

P: n/a
If you are using one of the latest versions of Office, save the document as
XML and transform that file into HTML on the server.
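
One way this could look, sketched in Python for illustration only. It assumes the document was saved in Word 2003's XML format (WordprocessingML), where paragraph text sits in `w:t` elements inside `w:p` elements, so a server-side script can extract the text and emit plain HTML paragraphs without Word installed. The sample document and the function name are hypothetical, and formatting such as bold, tables, and numbering is ignored; a full transform would more typically be written as an XSLT stylesheet.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace used by Word 2003 XML (WordprocessingML) documents.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"

# Tiny hypothetical sample standing in for a document saved as XML from Word.
SAMPLE = f"""<w:wordDocument xmlns:w="{W_NS}">
  <w:body>
    <w:p><w:r><w:t>1. Terms</w:t></w:r></w:p>
    <w:p><w:r><w:t>The customer agrees to </w:t></w:r><w:r><w:t>pay fees &amp; taxes.</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def wordml_to_html(xml_text):
    """Convert each Word paragraph (w:p) into an HTML <p>, keeping text only."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join every w:t inside it.
        text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        if text.strip():
            paragraphs.append(f"<p>{escape(text)}</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    print(wordml_to_html(SAMPLE))
```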
Nov 18 '05 #8
