Parse PDF Form fields in PHP

Hi.

I am trying to find out if it is possible to open a PDF file from
within PHP and parse its contents in order to extract all form
field names that might have been previously set up within the PDF
itself.

I want to find this out so that I can then generate an HTML form with
all required questions, which, when submitted, will generate an FDF /
XFDF file, using the techniques from the following tutorial:
http://koivi.com/fill-pdf-form-fields/tutorial.php

Ideally, though, I don't want to have to install any libraries to help
with this. Is it possible, and if so, how?

Kind regards,

Andy

Feb 20 '07 #1
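
The FDF generation step described in the question can be done in plain PHP without any library. What follows is a minimal sketch, not the code from the linked tutorial: the function names buildFdf and fdfEscape and the sample field names are assumptions made for illustration, and it only covers values that fit in a single-byte encoding.

```php
<?php
// Sketch only: builds a minimal FDF document from name => value pairs.
// buildFdf() and fdfEscape() are illustrative names, not an existing API.

/** Escape backslashes and parentheses for a PDF literal string. */
function fdfEscape(string $text): string
{
    return strtr($text, ['\\' => '\\\\', '(' => '\\(', ')' => '\\)']);
}

/** Build FDF text for $fields, naming $pdfUrl as the form to be filled. */
function buildFdf(array $fields, string $pdfUrl): string
{
    $entries = '';
    foreach ($fields as $name => $value) {
        // One field dictionary per pair: /T is the field name, /V its value.
        $entries .= '<< /T (' . fdfEscape((string) $name) . ') /V ('
            . fdfEscape((string) $value) . ") >>\n";
    }

    return "%FDF-1.2\n"
        . "1 0 obj\n"
        . "<< /FDF << /Fields [\n" . $entries
        . '] /F (' . fdfEscape($pdfUrl) . ") >> >>\n"
        . "endobj\n"
        . "trailer\n"
        . "<< /Root 1 0 R >>\n"
        . "%%EOF\n";
}
```

Sending the result with a Content-Type of application/vnd.fdf should let a viewer that supports FDF open the referenced PDF with the values filled in, for example buildFdf(['first_name' => 'Ann'], 'http://example.com/form.pdf'), where the URL is a placeholder.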
7 Replies


Google is your friend.

Pdftk, available from www.accesspdf.com.

See the example "Fill PDF Forms Using an HTML Front-End" at
http://www.accesspdf.com/index.php?topic=pdftk for a solution.

Cheers,

--
Bill Segraves
Feb 20 '07 #2

Hi Bill.

Thanks for your response.

Yes, Google did lead me to Pdftk in my research, but unfortunately I
cannot use it: it is licensed under the GPL, and my end solution is
for a system that I could not likewise release under the GPL. That
rules it out (hence my need for a less restrictively licensed library,
or indeed no library at all - DIY style :-/).

Thanks,

Andy

Feb 20 '07 #3

Andy, it appears your reasoning may be flawed, i.e., you want to parse a PDF
with PHP in order to discover the names of the form fields. You really don't
have to parse the PDF to determine the names of the form fields, as you can
"submit" the PDF form (as HTML, i.e. URL-encoded name=value pairs) to a script
that will do the parsing for you, transforming the form data into an HTML
form.

A script to parse the form data can be written in about one line of code,
exclusive of various declarations, in Perl. I don't know how many LOC it
would take in PHP.

Cheers,
--
Bill Segraves
Feb 20 '07 #4
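
The approach suggested above amounts to reading URL-encoded name=value pairs and emitting one HTML input per pair. A minimal PHP sketch follows, assuming the PDF form is configured to submit in HTML (URL-encoded) format; parseFormBody and buildHtmlForm are hypothetical names. The body is parsed by hand rather than read from $_POST because PHP rewrites dots and spaces in incoming field names to underscores, and PDF field names often contain dots.

```php
<?php
// Sketch only: turns a URL-encoded form body into an HTML form.
// parseFormBody() and buildHtmlForm() are illustrative names.

/** Split "a=1&b=2" into ['a' => '1', 'b' => '2'], keeping names unchanged. */
function parseFormBody(string $body): array
{
    $fields = [];
    foreach (explode('&', $body) as $pair) {
        if ($pair === '') {
            continue; // ignore empty segments such as a trailing "&"
        }
        $parts = explode('=', $pair, 2);
        $fields[urldecode($parts[0])] = urldecode($parts[1] ?? '');
    }
    return $fields;
}

/** Render one text input per field, pre-filled with the submitted value. */
function buildHtmlForm(array $fields, string $action): string
{
    $html = '<form method="post" action="'
        . htmlspecialchars($action, ENT_QUOTES) . "\">\n";
    foreach ($fields as $name => $value) {
        $n = htmlspecialchars((string) $name, ENT_QUOTES);
        $v = htmlspecialchars((string) $value, ENT_QUOTES);
        $html .= "  <label>$n <input type=\"text\" name=\"$n\" value=\"$v\"></label>\n";
    }
    return $html . "  <input type=\"submit\" value=\"Submit\">\n</form>\n";
}
```

In a real request handler the raw body would typically come from file_get_contents('php://input'). Every field is rendered as a text input here; the submitted pairs alone do not say whether a field was a checkbox, a list, or a text box.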

Hi Bill.

Thanks for taking the time to respond to me again.

Perhaps I should elaborate a little more on the background to my
project...

Basically, I want administrators to be able to upload to my system a
PDF form that they have created in various authoring tools and
formatted properly within Acrobat etc.

I then want to parse out the form fields dynamically from the PDF
source so that I can create a PHP / XHTML form representation of
that PDF form, which their website's users will then complete. Upon
completion of their web-based form, I will generate an FDF on the fly
(following the tutorial link that I originally posted), which the user
can then open to get a PDF representation of their completed form.
Fine if it all works, you might say!

From previous research I understand what you are saying: you can
configure the PDF form to have a submit button which does a POST of
the name=value pairs to a designated script / page. But this is a
complexity that I had hoped not to make the user go through. I just
hoped that they could create the PDF and upload it, and I would take
care of the rest, to create a nice, easy-to-use user experience.

Does that make more sense now at all as to why I am trying to go about
it in this way?

Kind regards,

Andy

Feb 20 '07 #5
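
For a sense of what the library-free route involves: in a PDF, each form field dictionary stores its (partial) name under the /T key, so a naive scan of the raw bytes can find names in simple files. The sketch below is an illustration under strong assumptions, not a PDF parser: findFieldNames is a hypothetical name, and it only sees names written as plain literal strings in uncompressed objects. It misses fields stored in compressed object streams, hex or UTF-16 encoded names, and names containing escaped characters; it does not join hierarchical names through /Parent; and it can pick up /T entries that belong to annotations rather than fields.

```php
<?php
// Sketch only: naive scan for "/T (name)" entries in raw PDF bytes.
// findFieldNames() is an illustrative name, not an existing API.

/** Return the distinct literal-string /T values found in $pdfBytes. */
function findFieldNames(string $pdfBytes): array
{
    // "/T", optional whitespace, then a literal string with no nested
    // parentheses or backslash escapes. "/TU (...)" and "/Tx" do not match.
    preg_match_all('#/T\s*\(([^()\\\\]*)\)#', $pdfBytes, $matches);

    return array_values(array_unique($matches[1]));
}
```

A call such as findFieldNames(file_get_contents($uploadedPath)), where $uploadedPath is a placeholder, returns whatever names the scan could see; an empty result does not prove that the PDF has no form fields.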

I might be misunderstanding, but to me it seems like the user-created
PDF POSTing to a PHP form would be a simpler UI than uploading a PDF.
This way the user just creates their PDF form and sets it to POST to a
predefined URL (like http://yourdomain.com/createform/). Your PHP
page could then do all the work behind the scenes and when it is done
send the user an email with the URL to the new HTML form.

Cheers

Feb 21 '07 #6

"Perks" <an*******@gmail.com> wrote in message
news:11**********************@v45g2000cwv.googlegroups.com...
<snip>
> Hi Bill.
>
> Thanks for taking the time to respond to me again.

You're very welcome.

> Perhaps I should elaborate a little more on the background to my
> project...
>
> Basically, I want for administrators to be able to upload a PDF Form,
> that they have created in various authoring tools, and formatted
> properly within acrobat etc to my system.
>
> I then want to parse out the form fields dynamically from the pdf
> source

You could use iText, available free from www.lowagie.com/iText/, to retrieve
the form fields from the PDF (see Chapter 16 of Bruno Lowagie's book, _iText
in Action_, available from www.manning.com/lowagie, for details). For
licensing details, see the FAQ.

> so that I can then create a PHP / XHTML Form representation of
> that pdf form, which their websites users will then complete. Upon
> completion of their web-based form, I will generate an fdf on the fly
> (following the tutorial link that I originally posted), which the user
> can then open to get a pdf representation of their completed form.
> Fine if it all works you might say!

Of course, with iText, you could merge the PDF and FDF on the server side,
serving the filled PDF to the client.

> From previous research I understand what you are saying in terms of
> you can configure the pdf form to have a submit button which does a
> POST of the value pairs to a designated script / page, but this is a
> complexity that I had hoped to not make the user go through, I just
> hoped that they could create the pdf and upload it, and I take care of
> the rest to create a nice, easy to use, user experience.

You can do this with iText.

> Does that make more sense now at all as to why I am trying to go about
> it in this way?

Yes.

Cheers,

--
Bill Segraves
Feb 21 '07 #7

Thanks a lot for your help Bill.

I managed to get the iText system installed, and have written a little
Java to get the fields, which I am running from within PHP.

I need to properly test the stability of doing it this way, but it is
indeed a solution that appears to be working pretty effectively at
present. I would have preferred to keep the solution wholly PHP, but
in the absence of something more suitable, this seems to be working a
treat.

Thanks again.

Andy
Feb 24 '07 #8
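
The thread does not show the Java helper or how it is invoked, so the following is only a sketch of one way the PHP side could call such a helper. It assumes a hypothetical Java class named ListFields that prints one field name per line and exits with status 0 on success, and a hypothetical jar name itext.jar; neither name comes from the thread.

```php
<?php
// Sketch only. "ListFields" and "itext.jar" are hypothetical names for a
// small Java helper and its library; adjust them to the real setup.

/** Build the shell command, quoting every value that reaches the shell. */
function buildFieldCommand(string $pdfPath, string $classPath = 'itext.jar:.'): string
{
    // The ":" classpath separator is for Unix-like systems (Windows uses ";").
    return 'java -cp ' . escapeshellarg($classPath)
        . ' ListFields ' . escapeshellarg($pdfPath);
}

/** Run the helper and return the non-empty output lines as field names. */
function listPdfFields(string $pdfPath): array
{
    exec(buildFieldCommand($pdfPath), $lines, $status);
    if ($status !== 0) {
        throw new RuntimeException("Field extraction failed with exit status $status");
    }
    return array_values(array_filter(array_map('trim', $lines), 'strlen'));
}
```

Quoting the path with escapeshellarg matters here because the PDF is uploaded by an administrator, so its file name should not be trusted as shell input. Starting a JVM per request is also slow, which is worth measuring when testing the stability mentioned above.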
