extracting text from pdf files

Can anyone help me with how to extract text from pdf files using PHP or
ColdFusion? Thanks for any help.

Oct 31 '06 #1
Hi,

Try the Xpdf project. Run the pdftotext command in the shell to produce
the text.

http://www.foolabs.com/xpdf/download.html

There are more tips at php.net/pdf.
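
For illustration, here is a minimal sketch of calling pdftotext from a script and capturing its output. It is written in Python rather than PHP, and it assumes pdftotext is installed and on the PATH. The function names pdftotext_command and extract_text are hypothetical helpers, not part of Xpdf. It relies on two pdftotext options: a text-file argument of "-" sends the text to standard output, and -enc selects the output encoding.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, layout=False):
    """Build the argument list for one pdftotext call that writes to stdout."""
    cmd = ["pdftotext", "-enc", "UTF-8"]  # ask for UTF-8 output explicitly
    if layout:
        cmd.append("-layout")  # try to keep the page's physical layout
    # A text-file argument of "-" tells pdftotext to write to standard output.
    cmd += [pdf_path, "-"]
    return cmd


def extract_text(pdf_path, layout=False):
    """Run pdftotext on one file and return the extracted text as a string."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    result = subprocess.run(
        pdftotext_command(pdf_path, layout),
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("report.pdf") would return the text of a (hypothetical) report.pdf, provided the file exists and pdftotext can read it.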

ru*****@fastmail.fm wrote:
> Can anyone help me with how to extract text from pdf files using PHP or
> ColdFusion? Thanks for any help.
Nov 1 '06 #2
pe*******@gmail.com wrote:
> Hi,
>
> Try the Xpdf project. Run the pdftotext command in the shell to produce
> the text.
>
> http://www.foolabs.com/xpdf/download.html
>
> There are more tips at php.net/pdf.
>
> ru*****@fastmail.fm wrote:
>> Can anyone help me with how to extract text from pdf files using PHP or
>> ColdFusion? Thanks for any help.

I really appreciate this lead, thanks, but can I do this all
programmatically without having to manually use a command line? I need
to process hundreds of pdf files to text and then extract what I need
from them.

Nov 1 '06 #3
runner7 wrote:
> I really appreciate this lead, thanks, but can I do this all
> programmatically without having to manually use a command line? I need
> to process hundreds of pdf files to text and then extract what I need
> from them.

Use PHP's system() function to run pdftotext from your script.
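
The same idea, launching pdftotext from a script instead of typing it by hand, can be sketched for a whole folder. This is a rough Python illustration, not PHP, and it assumes pdftotext is on the PATH. The names convert_folder and find_matches are hypothetical, and the pattern you search for depends entirely on what you need from the files.

```python
import re
import subprocess
from pathlib import Path


def convert_folder(pdf_dir, txt_dir, run=subprocess.run):
    """Convert every *.pdf in pdf_dir to a .txt file in txt_dir.

    Returns two lists of PDF paths: (converted, failed).
    """
    pdf_dir, txt_dir = Path(pdf_dir), Path(txt_dir)
    txt_dir.mkdir(parents=True, exist_ok=True)
    converted, failed = [], []
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        out = txt_dir / (pdf.stem + ".txt")
        # Passing a list (not a shell string) avoids quoting problems
        # with spaces or odd characters in file names.
        result = run(["pdftotext", "-enc", "UTF-8", str(pdf), str(out)])
        (converted if result.returncode == 0 else failed).append(pdf)
    return converted, failed


def find_matches(txt_dir, pattern):
    """Return {file name: [matches]} for each .txt file that has any match."""
    regex = re.compile(pattern)
    hits = {}
    for txt in sorted(Path(txt_dir).glob("*.txt")):
        text = txt.read_text(encoding="utf-8", errors="replace")
        found = regex.findall(text)
        if found:
            hits[txt.name] = found
    return hits
```

A typical run would call convert_folder("pdfs", "text") once and then find_matches("text", r"Invoice (\d+)"), where the folder names and the pattern are placeholders. Collecting the failed files in a separate list lets a batch of hundreds continue past a damaged PDF instead of stopping at the first error.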

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

Nov 1 '06 #4
ru*****@fastmail.fm wrote:
> Can anyone help me with how to extract text from pdf files using PHP or
> ColdFusion? Thanks for any help.

Our TET product extracts the text from PDF. It contains a programming
interface for PHP (and other languages); you can directly
fetch the text (and coordinates, font, etc.) from your PHP
script. Free evaluation version on our Web site.

Thomas

_______________________________________________________________
Thomas Merz tm@pdflib.com http://www.pdflib.com
PDFlib 7: Create PDF/A for archiving, format tables, and more!
_______PDFlib - a library for generating PDF on the fly________
Nov 1 '06 #5
