
Search inside PDF or MS Word with PHP, possible?

erwinschrijver@gmail.com
#1: Nov 22 '05
Is it possible to search inside a PDF file using PHP?
Same question for MS Word documents.
I searched for it but could not find anything, and there are no
PEAR packages available either...
Anyone?


Erwin Moller
#2: Nov 22 '05


erwinschrijver@gmail.com wrote:
> Is it possible to search inside a PDF file using PHP?
> Same question for MS Word documents.
> I searched for it but could not find anything, and there are no
> PEAR packages available either...
> Anyone?

Hi,

Partial answer, for PDF:
http://nl3.php.net/manual/en/ref.pdf.php

Read the user comments there, especially the one by:
jorromer at uchile dot cl -- Krash
07-Jun-2005 07:51
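
The linked comment is not reproduced here. As a rough, independent sketch of how a PDF text search could work in plain PHP: inflate each Flate-compressed stream and collect the literal strings drawn with the Tj operator. The function names pdf_extract_text() and pdf_contains() are made up for this example, the zlib extension is assumed to be loaded, and the approach will miss text in PDFs that use other filters, hex strings, TJ arrays, escaped parentheses or custom font encodings.

```php
<?php
// Naive sketch, not a real PDF parser: pulls "(text) Tj" strings out of
// zlib-compressed (FlateDecode) streams. Function names are hypothetical.
function pdf_extract_text($data)
{
    $text = "";
    // Grab the raw bytes between each "stream" and "endstream" keyword.
    if (!preg_match_all('/stream\r?\n(.*?)\r?\nendstream/s', $data, $matches)) {
        return $text;
    }
    foreach ($matches[1] as $raw) {
        // Skip streams that are not zlib data (images, other filters, ...).
        $plain = @gzuncompress($raw);
        if ($plain === false) {
            continue;
        }
        // Literal strings shown with the Tj operator look like "(some text) Tj".
        if (preg_match_all('/\((.*?)\)\s*Tj/s', $plain, $strings)) {
            $text .= implode(" ", $strings[1]) . "\n";
        }
    }
    return $text;
}

// Case-insensitive search in whatever text could be extracted.
function pdf_contains($data, $needle)
{
    return stripos(pdf_extract_text($data), $needle) !== false;
}
```

Usage would be something like pdf_contains(file_get_contents("some.pdf"), "invoice"), where "some.pdf" is a placeholder. For anything beyond simple files, an external converter or a dedicated PDF library is probably the safer route.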

Regards,
Erwin Moller

erwinschrijver@gmail.com
#3: Nov 22 '05


Thanks man,
Does anyone have an answer for the MS Word part?

Erwin Moller
#4: Nov 22 '05


erwinschrijver@gmail.com wrote:
> Thanks man,
> Does anyone have an answer for the MS Word part?

Hi,

I remembered a package at phpclasses.org.

Here is the link:
http://phpclasses.chimit.nl/browse/package/1352.html

That package can produce HTML from MS Word documents.
I expect that if you browse through the source code you will find all the
relevant tricks to do this yourself.
I didn't study it at all, so I may be wrong. Check for yourself. :-)


summary:
----------------------

This class can be used to convert a Microsoft Word document to HTML, RTF or
plain text using COM objects.

The input document formats can be Microsoft Word DOC, RTF and plain text.

The class can also clean the generated HTML to remove unnecessary markup
that Microsoft Word adds.

Of course, you need MS Word installed on the server, and a Windows OS.

It doesn't work? Check the following:

1- your server must be running Win32
2- Microsoft Word must be installed on the server (I tested with Word2000)
3- readfile() is not available before PHP 4.3. With PHP < 4.3 you can use the
following code to replace it:
if (version_compare(phpversion(), "4.3.0", "<") && !function_exists("readFile"))
{
    // Fallback: return the whole file as one string.
    function readFile( $f ) {
        // file() keeps each line's trailing newline, so join the lines as they are.
        return implode("", file($f));
    }
}
4- try not to open a file on the network (i.e. \\server\doc...) unless you
fully understand the authentication process
------------------------------------
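
For searching rather than converting, the COM route described in the summary could be sketched roughly as below. This is not the package's code: word_document_text() and find_matching_lines() are hypothetical names, and the sketch assumes a Windows server with PHP's COM extension enabled, Microsoft Word installed, and a PHP version that supports try/finally. Only the second function runs without Word.

```php
<?php
// Windows-only: asks Word (via COM) for the plain text of a document.
// Hypothetical helper; $path should be an absolute local path.
function word_document_text($path)
{
    $word = new COM("Word.Application");
    $word->Visible = false;
    try {
        $doc  = $word->Documents->Open($path);
        $text = (string) $doc->Content->Text; // whole document as one string
        $doc->Close(false);                   // close without saving changes
    } finally {
        $word->Quit();                        // do not leave Word running
    }
    return $text;
}

// Returns the lines (paragraphs) of $text that contain $needle,
// compared case-insensitively. Word separates paragraphs with "\r".
function find_matching_lines($text, $needle)
{
    $hits = array();
    if ($needle === "") {
        return $hits;
    }
    foreach (preg_split('/\r\n|\r|\n/', $text) as $line) {
        if (stripos($line, $needle) !== false) {
            $hits[] = trim($line);
        }
    }
    return $hits;
}
```

A call might look like find_matching_lines(word_document_text("C:\\docs\\report.doc"), "budget"), with the path being a placeholder. Starting Word for every request is slow, so caching the extracted text would probably make sense.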

Regards,
Erwin Moller