Bytes | Software Development & Data Engineering Community

Extracting data from PDF files

I used file_get_contents() to read a PDF into a string and then tried
to extract the encoded data between the "stream" and "endstream" keywords
using strpos() and substr(). (I could not get preg_match() to work.)
substr() pulled the data out, but it ran 12 characters past the length
I passed in, so the result also included "endstream en". Besides
that minor problem, I tried gzuncompress() on the extracted string,
which only produced a data error. Can anyone help me extract
data properly from a PDF file?

Oct 31 '06 #1
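
For anyone landing on this question, here is a minimal sketch of the approach described above. It rests on a few assumptions: the streams use the /FlateDecode filter (zlib-wrapped data, which is what gzuncompress() expects), the PDF is not encrypted, and the word "stream" does not occur anywhere except as the stream keyword. The helper names extractStreams() and inflateStream() are made up for illustration. Two details may account for the symptoms in the post: the third argument of substr() is a length rather than an end offset, and the end-of-line marker that follows the "stream" keyword is not part of the data, so leaving it in is one likely way to get a data error.

```php
<?php
// Sketch only: extractStreams() and inflateStream() are hypothetical helpers,
// not part of PHP or of any PDF library.

/**
 * Return the raw bytes found between each "stream" / "endstream" pair.
 */
function extractStreams(string $pdf): array
{
    $streams = [];
    $offset = 0;
    while (($kw = strpos($pdf, "stream", $offset)) !== false) {
        $dataStart = $kw + strlen("stream");
        // The keyword is followed by CRLF or LF; that marker is not stream data.
        if (substr($pdf, $dataStart, 2) === "\r\n") {
            $dataStart += 2;
        } elseif (substr($pdf, $dataStart, 1) === "\n") {
            $dataStart += 1;
        }
        $dataEnd = strpos($pdf, "endstream", $dataStart);
        if ($dataEnd === false) {
            break; // unterminated stream, stop scanning
        }
        // substr() takes a LENGTH as its third argument, not an end offset.
        $data = substr($pdf, $dataStart, $dataEnd - $dataStart);
        // Heuristic: drop the single end-of-line marker that usually precedes
        // "endstream". Reading the /Length entry would be more robust.
        if (substr($data, -2) === "\r\n") {
            $data = substr($data, 0, -2);
        } elseif (substr($data, -1) === "\n" || substr($data, -1) === "\r") {
            $data = substr($data, 0, -1);
        }
        $streams[] = $data;
        // Resume after "endstream" so its embedded "stream" is not matched again.
        $offset = $dataEnd + strlen("endstream");
    }
    return $streams;
}

/**
 * Inflate one FlateDecode stream; returns null if the data is not valid zlib.
 */
function inflateStream(string $raw): ?string
{
    $out = @gzuncompress($raw);
    return $out === false ? null : $out;
}

// Possible usage (example.pdf is a placeholder file name):
// foreach (extractStreams(file_get_contents('example.pdf')) as $i => $raw) {
//     $text = inflateStream($raw);
//     echo $i, ': ', $text === null ? 'not FlateDecode data' : strlen($text) . ' bytes', "\n";
// }
```

Streams that use other filters (image data, for example) or that belong to an encrypted PDF will still come back as null from inflateStream(), so checking the /Filter entry in the dictionary before each stream is worth doing before treating a failure as a bug.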

