Parse PDF Form fields in PHP

Hi.

I am trying to find out whether it is possible to open a PDF file from
within PHP and parse its contents in order to extract all the form
field names that have previously been set up within the PDF itself.

I want to find this out so that I can then generate an HTML form with
all required questions, which when submitted will generate an FDF /
XFDF file, using the techniques from the following tutorial:
http://koivi.com/fill-pdf-form-fields/tutorial.php

Ideally, I don't want to have to install any libraries to help with
this, though. Is it possible, and if so, how?
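
For a rough idea of what the no-library route involves: each AcroForm
field dictionary stores its partial name under the /T key, so a naive scan
of the raw bytes can pull names out of simple files. The sketch below is in
Python purely for illustration (the same regular expression should work with
PHP's preg_match_all); `extract_field_names` and the sample bytes are made
up for this example. It assumes the field dictionaries are stored
uncompressed and that names are plain literal strings. Files that pack
objects into compressed object streams, or that use hex or UTF-16 names,
will defeat it, and /T also appears in some non-field dictionaries, so
false positives are possible.

```python
import re

# Matches a partial field name stored as a literal string, e.g. "/T (surname)".
# Assumption: names contain no nested or escaped parentheses.
FIELD_NAME = re.compile(rb"/T\s*\(([^()\\]*)\)")


def extract_field_names(pdf_bytes: bytes) -> list[str]:
    """Return the distinct /T values found in the raw bytes, in first-seen order."""
    names = []
    for match in FIELD_NAME.finditer(pdf_bytes):
        name = match.group(1).decode("latin-1")
        if name not in names:
            names.append(name)
    return names


# A hand-written fragment standing in for an uploaded file (not a complete PDF).
sample = (
    b"4 0 obj\n<< /FT /Tx /T (first_name) /V () >>\nendobj\n"
    b"5 0 obj\n<< /FT /Tx /T (email) /V () >>\nendobj\n"
)
print(extract_field_names(sample))
```

For nested fields, /T holds only the partial name; the full name is the
chain of parent names joined with periods, which this sketch does not
reconstruct.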

Kind regards,

Andy

Feb 20 '07 #1
Google is your friend.

Pdftk, available from www.accesspdf.com.

See the example "Fill PDF Forms Using an HTML Front-End" at
http://www.accesspdf.com/index.php?topic=pdftk for a solution.

Cheers,

--
Bill Segraves
Feb 20 '07 #2
Hi Bill.

Thanks for your response.

Yes, Google did lead me to Pdftk during my research, but
unfortunately I cannot use it: it is licensed under the GPL, and my end
solution is for a system that I could not likewise release under the
GPL. That rules it out (hence my need for a less restrictively licensed
library, or indeed no library at all - DIY style :-/).

Thanks,

Andy

Feb 20 '07 #3
Andy, it appears your reasoning may be flawed: you want to parse a PDF
with PHP in order to discover the names of the form fields, but you really
don't have to parse the PDF to do that. You can "submit" the PDF form
(as HTML, i.e. URL-encoded name=value pairs) to a script that will do the
parsing for you, transforming the form data into an HTML form.

A script to parse the form data can be written in about one line of code,
exclusive of various declarations, in Perl. I don't know how many LOC it
would take in PHP.
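
A sketch of the approach described above, assuming the PDF's submit
button exports in HTML form format so that the data arrives as an ordinary
URL-encoded body. It is written in Python for illustration only;
`form_from_submission` and the `/fill.php` action are hypothetical names,
and every field is rendered as a plain text input because the submitted
data carries no type information.

```python
from html import escape
from urllib.parse import parse_qsl


def form_from_submission(body: str, action: str) -> str:
    """Build a plain HTML form with one text input per submitted field name."""
    # keep_blank_values=True so that fields submitted empty still appear.
    pairs = parse_qsl(body, keep_blank_values=True)
    lines = [f'<form method="post" action="{escape(action)}">']
    for name, value in pairs:
        n, v = escape(name), escape(value)
        lines.append(f'  <label>{n} <input type="text" name="{n}" value="{v}"></label>')
    lines.append('  <input type="submit" value="Send">')
    lines.append("</form>")
    return "\n".join(lines)


print(form_from_submission("first_name=&email=&comments=", "/fill.php"))
```

In PHP the same pairs would arrive in $_POST. Note that PHP rewrites dots
and spaces in incoming field names to underscores, which matters for
hierarchical PDF field names.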

Cheers,
--
Bill Segraves
Feb 20 '07 #4
Hi Bill.

Thanks for taking the time to respond to me again.

Perhaps I should elaborate a little more on the background to my
project...

Basically, I want administrators to be able to upload to my system a
PDF form that they have created in various authoring tools and
formatted properly within Acrobat etc.

I then want to parse the form fields out of the PDF source dynamically,
so that I can create a PHP / XHTML form representation of that PDF
form, which their website's users will then complete. Upon completion
of the web-based form, I will generate an FDF on the fly (following the
tutorial link that I originally posted), which the user can then open
to get a PDF representation of their completed form.
Fine if it all works, you might say!

From previous research I understand what you are saying: the PDF form
can be configured with a submit button that POSTs the name=value pairs
to a designated script / page. But that is a complexity I had hoped not
to put the user through. I had hoped that they could simply create the
PDF and upload it, and that I would take care of the rest to create a
nice, easy-to-use experience.
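
The last step of the pipeline described above, generating the FDF on the
fly, can be sketched as follows. This is a minimal illustration in Python,
not the tutorial's code: `build_fdf`, `fdf_escape` and the `pdf_url`
parameter are names invented here, and it assumes all names and values fit
in Latin-1.

```python
def fdf_escape(text: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_fdf(values: dict[str, str], pdf_url: str) -> bytes:
    """Return FDF bytes that fill `values` into the form located at `pdf_url`."""
    fields = "\n".join(
        f"<< /T ({fdf_escape(name)}) /V ({fdf_escape(value)}) >>"
        for name, value in values.items()
    )
    fdf = (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n"
        f"{fields}\n"
        f"] /F ({fdf_escape(pdf_url)}) >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )
    # Latin-1 keeps the sketch simple; other characters would need UTF-16BE strings.
    return fdf.encode("latin-1")


print(build_fdf({"first_name": "Ann", "email": "ann@example.com"}, "form.pdf").decode("latin-1"))
```

The result would normally be sent with the content type application/vnd.fdf
so that the viewer opens the PDF named in /F and fills in the values.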

Does that make more sense now at all as to why I am trying to go about
it in this way?

Kind regards,

Andy

Feb 20 '07 #5

I might be misunderstanding, but to me it seems like the user-created
PDF POSTing to a PHP form would be a simpler UI than uploading a PDF.
This way the user just creates their PDF form and sets it to POST to a
predefined URL (like http://yourdomain.com/createform/). Your PHP
page could then do all the work behind the scenes and when it is done
send the user an email with the URL to the new HTML form.

Cheers

Feb 21 '07 #6
"Perks" <an*******@gmail.com> wrote in message
news:11**********************@v45g2000cwv.googlegroups.com...
<snip>
> Hi Bill.
>
> Thanks for taking the time to respond to me again.

You're very welcome.

> Perhaps I should elaborate a little more on the background to my
> project...
>
> Basically, I want for administrators to be able to upload a PDF Form,
> that they have created in various authoring tools, and formatted
> properly within acrobat etc to my system.
>
> I then want to parse out the form fields dynamically from the pdf
> source

You could use iText, available free from www.lowagie.com/iText/, to retrieve
the form fields from the PDF (see Chapter 16 of Bruno Lowagie's book, _iText
in Action_, available from www.manning.com/lowagie, for details). For
licensing details, see the FAQ.

> so that I can then create a PHP / XHTML Form representation of
> that pdf form, which their website's users will then complete. Upon
> completion of their web-based form, I will generate an fdf on the fly
> (following the tutorial link that I originally posted), which the user
> can then open to get a pdf representation of their completed form.
> Fine if it all works you might say!

Of course, with iText, you could merge the PDF and FDF on the server side,
serving the filled PDF to the client.

> From previous research I understand what you are saying in terms of
> you can configure the pdf form to have a submit button which does a
> POST of the value pairs to a designated script / page, but this is a
> complexity that I had hoped to not make the user go through, I just
> hoped that they could create the pdf and upload it, and I take care of
> the rest to create a nice, easy to use, user experience.

You can do this with iText.

> Does that make more sense now at all as to why I am trying to go about
> it in this way?

Yes.

Cheers,

--
Bill Segraves
Feb 21 '07 #7
Thanks a lot for your help Bill.

I managed to get the iText system installed, and have written a little
Java to get the fields, which I am running from within PHP.
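
The post does not show the Java or how it is invoked, so the following is
only a guess at the general shape of such a bridge: run an external
program that prints one field name per line and collect its output. It is
sketched in Python for illustration; `list_fields` and the jar and class
names in the comment are hypothetical. In PHP the equivalent would likely
use exec() with escapeshellarg() around the uploaded file's path.

```python
import subprocess
import sys


def list_fields(command: list[str]) -> list[str]:
    """Run `command` and return its non-empty output lines as field names."""
    # check=True raises if the helper exits with an error; timeout guards against hangs.
    result = subprocess.run(
        command, capture_output=True, text=True, check=True, timeout=30
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


# Stand-in for the real call, which might look like
# ["java", "-cp", "itext.jar:.", "ListFields", "upload.pdf"] (hypothetical names).
demo = [sys.executable, "-c", "print('first_name'); print('email')"]
print(list_fields(demo))
```

Passing the command as a list avoids shell quoting problems with
user-supplied file names.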

I need to properly test the stability of doing it this way, but it is
indeed a solution that appears to be working pretty effectively at
present. I would have preferred to keep the solution wholly PHP, but
in the absence of something more suitable, this seems to be working a
treat.

Thanks again.

Andy
Feb 24 '07 #8
