
PDF Parser

Farhan
#1: Jul 17 '05
I am trying to write a PDF parser in PHP that can extract
data from PDF files. Basically, I want to convert PDF files to XML
data. Any idea where I could start?

Shawn Wilson
#2: Jul 17 '05

Farhan wrote:
>
> I am trying to write a PDF parser in PHP that can extract
> data from PDF files. Basically, I want to convert PDF files to XML
> data. Any idea where I could start?

http://ca2.php.net/manual/en/ref.pdf.php
You may find the links in the comments helpful.
I've never done anything with PDFs/PHP, but some of the tutorials looked
promising.

Regards,
Shawn
--
Shawn Wilson
shawn@glassgiant.com
http://www.glassgiant.com

I have a spam filter. Please include "PHP" in the
subject line to ensure I'll get your message.
Farhan
#3: Jul 17 '05

Thanks, Shawn, for your reply. But PDFlib is not really what I am
looking for. I need to extract data from PDF files. Someone in #php on
freenode told me that PDFlib with PDI will be able to do that. But I
don't think PDI comes with PHP; we would need to buy it.

farhan
Chung Leong
#4: Jul 17 '05

See my PDF highlighting code:

http://www.conradish.net/pdfhi.php.txt

Pay attention to lines 451 to 462.

User "Farhan" <god_father52@hotmail.com> wrote in message
news:b68af333.0401150556.8083345@posting.google.com...
> Thanks, Shawn, for your reply. But PDFlib is not really what I am
> looking for. I need to extract data from PDF files. Someone in #php on
> freenode told me that PDFlib with PDI will be able to do that. But I
> don't think PDI comes with PHP; we would need to buy it.
>
> farhan


Farhan
#5: Jul 17 '05

Chung Leong wrote:
> See my PDF highlighting code:
>
> http://www.conradish.net/pdfhi.php.txt
>
> Pay attention to lines 451 to 462.

Thanks, Chung. I will try looking at the code some other time, because
the server you are pointing me to seems to be down right now. But just
a question: do you just search the binary data, or do you follow the
Adobe PDF Specification? If you follow the specification, is it too
complicated? Thanks again.

farhan
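
For anyone landing on this thread with the same question, here is a rough sketch of what the "just search the binary data" approach could look like in PHP. It is not the code linked above (that file is not reproduced in this thread), and it does not follow the PDF specification's object structure. The function names extractPdfText() and textToXml() are made up for illustration. It assumes the zlib and dom extensions are loaded, that content streams use FlateDecode or no compression, and that text is stored as simple literal strings. Fonts with custom encodings, hex strings, nested parentheses and object streams are all ignored, so real-world files will often need a proper parser. The sample PDF fragment is built in memory so the script runs without any input file.

```php
<?php
// Naive sketch only: scans raw bytes instead of parsing the PDF object
// structure. extractPdfText() and textToXml() are hypothetical helpers.

function extractPdfText(string $pdf): array
{
    $blocks = [];
    // Everything between the "stream" and "endstream" keywords.
    preg_match_all('/stream\r?\n(.*?)\nendstream/s', $pdf, $streams);
    foreach ($streams[1] as $raw) {
        // Assume FlateDecode (zlib); keep the raw bytes if inflating fails.
        $data = @gzuncompress($raw);
        if ($data === false) {
            $data = $raw;
        }
        // Text is drawn between the BT (begin text) and ET (end text) operators.
        preg_match_all('/\bBT\b(.*?)\bET\b/s', $data, $textObjects);
        foreach ($textObjects[1] as $body) {
            // Literal strings look like (text); a backslash escapes the next byte.
            preg_match_all('/\(((?:\\\\.|[^\\\\()])*)\)/s', $body, $strings);
            $joined = implode('', $strings[1]);
            // Undo only the \( \) and \\ escapes; others are left untouched.
            $joined = preg_replace('/\\\\([()\\\\])/', '$1', $joined);
            if ($joined !== '') {
                $blocks[] = $joined;
            }
        }
    }
    return $blocks;
}

function textToXml(array $blocks): string
{
    // One <text> element per BT...ET block; DOM takes care of XML escaping.
    $doc = new DOMDocument('1.0', 'UTF-8');
    $root = $doc->appendChild($doc->createElement('document'));
    foreach ($blocks as $block) {
        $el = $doc->createElement('text');
        $el->appendChild($doc->createTextNode($block));
        $root->appendChild($el);
    }
    return $doc->saveXML();
}

// Build a tiny PDF-like fragment with one compressed content stream.
$content = "BT /F1 12 Tf 72 720 Td (Hello \\(PDF\\) world) Tj ET\n"
         . "BT 72 700 Td [(Second) -20 ( line)] TJ ET";
$compressed = gzcompress($content);
$pdf = "%PDF-1.4\n1 0 obj\n<< /Length " . strlen($compressed)
     . " /Filter /FlateDecode >>\nstream\n" . $compressed
     . "\nendstream\nendobj\n";

echo textToXml(extractPdfText($pdf));
```

On a real file you would pass file_get_contents() of the PDF to extractPdfText() instead of the in-memory sample. If the output is empty or garbled, that usually means the file uses features this byte scan does not understand, which is the point at which following the specification (or using a library) becomes necessary.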