Parse PDF Form fields in PHP

Hi.

I am trying to find out if it is possible to open a pdf file from
within PHP, and parse its contents in order to extract all form
field names that might have been previously set up within the pdf
itself.

I want to find this out so that I can then generate an HTML form with
all required questions, which when submitted, will generate an fdf /
xfdf file, using the techniques from the following tutorial
http://koivi.com/fill-pdf-form-fields/tutorial.php

Ideally, I don't want to have to install any libraries to help with
this though. Is it possible, and if so how?

Kind regards,

Andy

Feb 20 '07 #1
"Perks" <an*******@gmail.com> wrote in message
news:11**********************@t69g2000cwt.googlegroups.com...
> Hi.
>
> I am trying to find out if it is possible to open a pdf file from
> within PHP, and parse its contents in order to extract all form
> field names that might have been previously set up within the pdf
> itself.
>
> I want to find this out so that I can then generate an HTML form with
> all required questions, which when submitted, will generate an fdf /
> xfdf file, using the techniques from the following tutorial
> http://koivi.com/fill-pdf-form-fields/tutorial.php
>
> Ideally, I don't want to have to install any libraries to help with
> this though. Is it possible, and if so how?

Google is your friend.

Pdftk, available from www.accesspdf.com.

See the example "Fill PDF Forms Using an HTML Front-End" at
http://www.accesspdf.com/index.php?topic=pdftk for a solution.

Cheers,

--
Bill Segraves
Feb 20 '07 #2
On Feb 20, 6:25 pm, "Bill Segraves" <segraves_...@mindspring.com>
wrote:
> Google is your friend.
>
> Pdftk, available from www.accesspdf.com.
>
> See the example "Fill PDF Forms Using an HTML Front-End" at
> http://www.accesspdf.com/index.php?topic=pdftk for a solution.
>
> Cheers,
>
> --
> Bill Segraves

Hi Bill.

Thanks for your response.

Yes, indeed Google helped me come across Pdftk in my research, but
unfortunately I cannot use it, as it falls under the GPL and my end
solution would be for a system that I could not likewise release
under the GPL. That counts it out of the running (hence my need for a
lesser licensed library or indeed no library at all - DIY style :-/).

Thanks,

Andy

Feb 20 '07 #3
"Perks" <an*******@gmail.com> wrote in message
news:11**********************@t69g2000cwt.googlegroups.com...
<snip>
> Hi Bill.
>
> Thanks for your response.
>
> Yes, indeed Google helped me come across Pdftk in my research, but
> unfortunately I cannot use it, as it falls under the GPL and my end
> solution would be for a system that I could not likewise release
> under the GPL. That counts it out of the running (hence my need for a
> lesser licensed library or indeed no library at all - DIY style :-/).

Andy, it appears your reasoning may be flawed, i.e., you want to parse a PDF
with PHP in order to discover the names of the form fields. You really don't
have to parse the PDF to determine the names of the form fields, as you can
"submit" (as HTML - URLencoded name=value pairs) the PDF form to a script
that will do the parsing for you, transforming the form data into an HTML
form.

A script to parse the form data can be written in about one line of code,
exclusive of various declarations, in Perl. I don't know how many LOC it
would take in PHP.

Cheers,
--
Bill Segraves
Feb 20 '07 #4
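
A minimal PHP sketch of the approach Bill describes, assuming the PDF's submit button is configured to send its data in HTML (URL-encoded) format. The helper names parse_pairs() and render_html_form() and the target script generate-fdf.php are hypothetical, not from the thread. The raw request body is read instead of $_POST because PHP rewrites dots and spaces in incoming field names to underscores, and PDF field names often contain dots.

```php
<?php
// Sketch: turn a URL-encoded request body into field name => value pairs
// and render a plain HTML form from them. Helper names are hypothetical.

function parse_pairs(string $body): array
{
    $fields = [];
    foreach (explode('&', $body) as $pair) {
        if ($pair === '') {
            continue;
        }
        // Split on the first "=" only; a missing "=" means an empty value.
        $parts = explode('=', $pair, 2);
        $name  = urldecode($parts[0]);
        $value = isset($parts[1]) ? urldecode($parts[1]) : '';
        $fields[$name] = $value;
    }
    return $fields;
}

function render_html_form(array $fields, string $action): string
{
    $html = '<form method="post" action="'
          . htmlspecialchars($action, ENT_QUOTES) . '">' . "\n";
    foreach ($fields as $name => $value) {
        // Escape both names and values so they are safe inside attributes.
        $n = htmlspecialchars((string) $name, ENT_QUOTES);
        $v = htmlspecialchars((string) $value, ENT_QUOTES);
        $html .= "  <label>$n <input type=\"text\" name=\"$n\" value=\"$v\"></label>\n";
    }
    return $html . "</form>\n";
}

// Under a web server, build the form from the raw POST body.
// generate-fdf.php is a placeholder for whatever handles the HTML form.
if (PHP_SAPI !== 'cli') {
    $fields = parse_pairs(file_get_contents('php://input'));
    echo render_html_form($fields, 'generate-fdf.php');
}
```

Every field is rendered as a text input here; a submitted name=value pair does not say whether the original PDF field was a check box, a list, or a text field, so that information would have to come from somewhere else.
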
On Feb 20, 9:32 pm, "Bill Segraves" <segraves_...@mindspring.com>
wrote:
> Andy, it appears your reasoning may be flawed, i.e., you want to parse a PDF
> with PHP in order to discover the names of the form fields. You really don't
> have to parse the PDF to determine the names of the form fields, as you can
> "submit" (as HTML - URLencoded name=value pairs) the PDF form to a script
> that will do the parsing for you, transforming the form data into an HTML
> form.
>
> A script to parse the form data can be written in about one line of code,
> exclusive of various declarations, in Perl. I don't know how many LOC it
> would take in PHP.
>
> Cheers,
> --
> Bill Segraves

Hi Bill.

Thanks for taking the time to respond to me again.

Perhaps I should elaborate a little more on the background to my
project...

Basically, I want administrators to be able to upload to my system a
PDF form that they have created in various authoring tools and
formatted properly within Acrobat etc.

I then want to parse out the form fields dynamically from the pdf
source so that I can then create a PHP / XHTML Form representation of
that pdf form, which their website's users will then complete. Upon
completion of their web-based form, I will generate an fdf on the fly
(following the tutorial link that I originally posted), which the user
can then open to get a pdf representation of their completed form.
Fine if it all works, you might say!

From previous research I understand what you are saying, in that
you can configure the pdf form to have a submit button which does a
POST of the value pairs to a designated script / page, but this is a
complexity that I had hoped not to put the user through. I just
hoped that they could create the pdf and upload it, and I would take
care of the rest to create a nice, easy-to-use user experience.

Does that make more sense now at all as to why I am trying to go about
it in this way?

Kind regards,

Andy

Feb 20 '07 #5
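
For the "generate an fdf on the fly" step, here is a hedged sketch of what a minimal FDF file looks like when built in PHP. build_fdf() and fdf_escape() are hypothetical helpers, not the tutorial's code. The sketch assumes plain text fields with Latin-1 values; check boxes, radio buttons, and non-Latin text need additional handling.

```php
<?php
// Sketch: build a minimal FDF document from field name => value pairs.
// Assumes text fields only; helper names are hypothetical.

function fdf_escape(string $text): string
{
    // Backslashes and parentheses must be escaped inside PDF literal strings.
    return strtr($text, ['\\' => '\\\\', '(' => '\\(', ')' => '\\)']);
}

function build_fdf(array $fields, string $pdfUrl): string
{
    $entries = '';
    foreach ($fields as $name => $value) {
        // /T is the field name, /V the value to fill in.
        $entries .= '<< /T (' . fdf_escape((string) $name) . ') /V ('
                  . fdf_escape((string) $value) . ") >>\n";
    }

    // /F points at the PDF form the data belongs to.
    return "%FDF-1.2\n"
         . "1 0 obj\n"
         . "<< /FDF << /Fields [\n" . $entries
         . "] /F (" . fdf_escape($pdfUrl) . ") >> >>\n"
         . "endobj\n"
         . "trailer\n"
         . "<< /Root 1 0 R >>\n"
         . "%%EOF\n";
}
```

A script could send the result with header('Content-Type: application/vnd.fdf') followed by echo build_fdf($values, 'http://example.com/form.pdf'), where the URL is a placeholder for the uploaded form's location.
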
On Feb 21, 8:09 am, "Perks" <andype...@gmail.com> wrote:
> On Feb 20, 9:32 pm, "Bill Segraves" <segraves_...@mindspring.com>
> wrote:
> > Andy, it appears your reasoning may be flawed, i.e., you want to parse a PDF
> > with PHP in order to discover the names of the form fields. You really don't
> > have to parse the PDF to determine the names of the form fields, as you can
> > "submit" (as HTML - URLencoded name=value pairs) the PDF form to a script
> > that will do the parsing for you, transforming the form data into an HTML
> > form.
> >
> > A script to parse the form data can be written in about one line of code,
> > exclusive of various declarations, in Perl. I don't know how many LOC it
> > would take in PHP.
> >
> > Cheers,
> > --
> > Bill Segraves
>
> Hi Bill.
>
> Thanks for taking the time to respond to me again.
>
> Perhaps I should elaborate a little more on the background to my
> project...
>
> Basically, I want administrators to be able to upload to my system a
> PDF form that they have created in various authoring tools and
> formatted properly within Acrobat etc.
>
> I then want to parse out the form fields dynamically from the pdf
> source so that I can then create a PHP / XHTML Form representation of
> that pdf form, which their website's users will then complete. Upon
> completion of their web-based form, I will generate an fdf on the fly
> (following the tutorial link that I originally posted), which the user
> can then open to get a pdf representation of their completed form.
> Fine if it all works, you might say!
>
> From previous research I understand what you are saying, in that
> you can configure the pdf form to have a submit button which does a
> POST of the value pairs to a designated script / page, but this is a
> complexity that I had hoped not to put the user through. I just
> hoped that they could create the pdf and upload it, and I would take
> care of the rest to create a nice, easy-to-use user experience.
>
> Does that make more sense now at all as to why I am trying to go about
> it in this way?
>
> Kind regards,
>
> Andy

I might be misunderstanding, but to me it seems like the user-created
PDF POSTing to a PHP form would be a simpler UI than uploading a PDF.
This way the user just creates their PDF form and sets it to POST to a
predefined URL (like http://yourdomain.com/createform/). Your PHP
page could then do all the work behind the scenes and when it is done
send the user an email with the URL to the new HTML form.

Cheers

Feb 21 '07 #6
"Perks" <an*******@gmail.com> wrote in message
news:11**********************@v45g2000cwv.googlegroups.com...
<snip>
> Hi Bill.
>
> Thanks for taking the time to respond to me again.

You're very welcome.

> Perhaps I should elaborate a little more on the background to my
> project...
>
> Basically, I want administrators to be able to upload to my system a
> PDF form that they have created in various authoring tools and
> formatted properly within Acrobat etc.
>
> I then want to parse out the form fields dynamically from the pdf
> source

You could use iText, available free from www.lowagie.com/iText/, to retrieve
the form fields from the PDF (See Chapter 16 of Bruno Lowagie's book, _iText
in Action_, available from www.manning.com/lowagie, for details). For
licensing details, see the FAQ.

> so that I can then create a PHP / XHTML Form representation of
> that pdf form, which their website's users will then complete. Upon
> completion of their web-based form, I will generate an fdf on the fly
> (following the tutorial link that I originally posted), which the user
> can then open to get a pdf representation of their completed form.
> Fine if it all works, you might say!

Of course, with iText, you could merge the PDF and FDF on the server side,
serving the filled PDF to the client.

> From previous research I understand what you are saying, in that
> you can configure the pdf form to have a submit button which does a
> POST of the value pairs to a designated script / page, but this is a
> complexity that I had hoped not to put the user through. I just
> hoped that they could create the pdf and upload it, and I would take
> care of the rest to create a nice, easy-to-use user experience.

You can do this with iText.

> Does that make more sense now at all as to why I am trying to go about
> it in this way?

Yes.

Cheers,

--
Bill Segraves
Feb 21 '07 #7
On Feb 21, 4:41 pm, "Bill Segraves" <segraves_...@mindspring.com>
wrote:
> "Perks" <andype...@gmail.com> wrote in message
> news:11**********************@v45g2000cwv.googlegroups.com...
> <snip>
> > Hi Bill.
> >
> > Thanks for taking the time to respond to me again.
>
> You're very welcome.
>
> > Perhaps I should elaborate a little more on the background to my
> > project...
> >
> > Basically, I want administrators to be able to upload to my system a
> > PDF form that they have created in various authoring tools and
> > formatted properly within Acrobat etc.
> >
> > I then want to parse out the form fields dynamically from the pdf
> > source
>
> You could use iText, available free from www.lowagie.com/iText/, to retrieve
> the form fields from the PDF (See Chapter 16 of Bruno Lowagie's book, _iText
> in Action_, available from www.manning.com/lowagie, for details). For
> licensing details, see the FAQ.
>
> > so that I can then create a PHP / XHTML Form representation of
> > that pdf form, which their website's users will then complete. Upon
> > completion of their web-based form, I will generate an fdf on the fly
> > (following the tutorial link that I originally posted), which the user
> > can then open to get a pdf representation of their completed form.
> > Fine if it all works, you might say!
>
> Of course, with iText, you could merge the PDF and FDF on the server side,
> serving the filled PDF to the client.
>
> > From previous research I understand what you are saying, in that
> > you can configure the pdf form to have a submit button which does a
> > POST of the value pairs to a designated script / page, but this is a
> > complexity that I had hoped not to put the user through. I just
> > hoped that they could create the pdf and upload it, and I would take
> > care of the rest to create a nice, easy-to-use user experience.
>
> You can do this with iText.
>
> > Does that make more sense now at all as to why I am trying to go about
> > it in this way?
>
> Yes.
>
> Cheers,
>
> --
> Bill Segraves

Thanks a lot for your help Bill.

I managed to get the iText system installed, and have written a little
Java to get the fields, which I am running from within PHP.

I need to properly test the stability of doing it this way, but it is
indeed a solution that appears to be working pretty effectively at
present. I would have preferred to keep the solution wholly PHP, but
in the absence of something more suitable, this seems to be working a
treat.

Thanks again.

Andy
Feb 24 '07 #8
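
Calling a small Java program from PHP, as Andy describes, might look roughly like this. The sketch assumes a hypothetical command-line class ListFields (not shown, and not Andy's actual code) that prints one field name per line, and that exec() is enabled on the server; the classpath and jar name in the default command are placeholders.

```php
<?php
// Sketch: run an external program that prints one PDF field name per line
// and collect its output. ListFields is a hypothetical Java class built on
// iText; the default command is a placeholder.

function list_pdf_fields(
    string $pdfPath,
    string $command = 'java -cp .:itext.jar ListFields'
): array {
    $lines  = [];
    $status = 0;

    // escapeshellarg() keeps an uploaded file name from being read as shell syntax.
    exec($command . ' ' . escapeshellarg($pdfPath), $lines, $status);

    if ($status !== 0) {
        throw new RuntimeException("Field extraction failed with exit status $status");
    }

    // Trim each line and drop empty ones.
    $names = [];
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line !== '') {
            $names[] = $line;
        }
    }
    return $names;
}
```

Checking the exit status matters here because a failed Java run would otherwise look the same as a PDF with no form fields.
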
