PDF with CCITTFax encoded image streams - anyone have any experience?

I need to bang out a quick application to extract CCITT compressed TIF
images from a ton of PDFs. I've used PDFSharp in the past to work with PDFs
but it doesn't have support for the PDF /CCITTFaxDecode filter.

I've googled for the obvious terms to try to find some code samples or
information about how to accomplish what I want but haven't found anything
at all.
If anyone here has experience working with PDFs and extracting TIFFs,
could you possibly help with the following questions? (Some are very newbie
questions.)

1) Does PDF store image data in a special PDF format or wrapped in any
other objects? In other words, I thought I would just be able to write the
image stream from the PDF to disk and it would result in a TIFF image (hah!),
but this isn't the case. Is the image data wrapped in an additional format?

2) Anyone know of any (free) libraries that can decompress the CCITT codec?
I'm not sure if it's group 3 or 4, I imagine that is in the header of the
image data?

Any info greatly appreciated,
Steve
Jun 27 '08 #1
I've found and read the relevant sections of the PDF specification.
It appears that the data is NOT wrapped in any additional structures, that
it's basically a stream of CCITT G3/G4 encoded data.
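
For reference, the Group 3 vs. Group 4 question is settled by the stream's /DecodeParms dictionary, not by any header inside the image data. Per the PDF specification, a negative /K means Group 4, /K = 0 means Group 3 one-dimensional, and a positive /K means Group 3 mixed one- and two-dimensional. Below is a minimal Python sketch, assuming the dictionary has already been parsed into a plain dict; the function name and the dict shape are hypothetical and not taken from PDFSharp or any other library.

```python
def describe_ccitt_params(decode_parms):
    """Interpret a /CCITTFaxDecode /DecodeParms dictionary.

    `decode_parms` is assumed to be a plain dict keyed by the PDF
    names without the leading slash (a hypothetical shape).
    Missing keys fall back to the defaults in the PDF specification.
    """
    k = decode_parms.get("K", 0)                  # default 0
    columns = decode_parms.get("Columns", 1728)   # default 1728
    rows = decode_parms.get("Rows", 0)            # 0 means "not specified"
    black_is_1 = decode_parms.get("BlackIs1", False)

    if k < 0:
        scheme = "Group 4"
    elif k == 0:
        scheme = "Group 3, 1-D"
    else:
        scheme = "Group 3, 2-D (mixed)"

    return {
        "scheme": scheme,
        "columns": columns,
        "rows": rows,
        "black_is_1": black_is_1,
    }


if __name__ == "__main__":
    print(describe_ccitt_params({"K": -1, "Columns": 2550, "Rows": 3300}))
```

When /Rows is absent, the image dictionary's own /Width and /Height entries give the pixel dimensions.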

My challenge now is how to handle that encoded data. I'm wondering if I can
create an Image object from a MemoryStream (stream from PDF) then save with
the proper encoding?
If anyone has experience decoding CCITT-coded data, I would still really
appreciate any tips or help.
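
One approach that avoids decoding altogether is to prepend a small TIFF header to the raw stream, since TIFF can carry CCITT Group 3/4 data as-is (Compression tag values 3 and 4). The following is a rough Python sketch of that idea, not a tested extraction tool. The function name is made up for illustration, the width and height are assumed to come from the PDF image dictionary, and only the single-strip case is handled. Group 3 two-dimensional data (positive /K) would also need a T4Options tag, which is omitted here.

```python
import struct


def wrap_ccitt_as_tiff(data, width, height, group=4, photometric=0):
    """Return TIFF bytes: a minimal little-endian header plus `data`.

    `data` is the raw /CCITTFaxDecode stream, left untouched.
    `group` is 3 or 4 (the TIFF Compression tag value).
    `photometric` is 0 (WhiteIsZero) or 1 (BlackIsZero); if the
    result looks inverted, try the other value.
    """
    if group not in (3, 4):
        raise ValueError("group must be 3 or 4")

    n_entries = 8
    # 8-byte file header + 2-byte entry count + 12 bytes per entry
    # + 4-byte offset of the next IFD (0 = none).
    header_len = 8 + 2 + 12 * n_entries + 4

    SHORT, LONG = 3, 4
    entries = [  # must be sorted by tag number
        (256, LONG, 1, width),         # ImageWidth
        (257, LONG, 1, height),        # ImageLength
        (258, SHORT, 1, 1),            # BitsPerSample
        (259, SHORT, 1, group),        # Compression
        (262, SHORT, 1, photometric),  # PhotometricInterpretation
        (273, LONG, 1, header_len),    # StripOffsets
        (278, LONG, 1, height),        # RowsPerStrip
        (279, LONG, 1, len(data)),     # StripByteCounts
    ]

    out = struct.pack("<2sHI", b"II", 42, 8)  # byte order, magic, IFD offset
    out += struct.pack("<H", n_entries)
    for tag, field_type, count, value in entries:
        out += struct.pack("<HHII", tag, field_type, count, value)
    out += struct.pack("<I", 0)  # no further IFDs
    return out + data


if __name__ == "__main__":
    fake_stream = b"\x00\x01\x02"  # placeholder bytes, not real fax data
    print(len(wrap_ccitt_as_tiff(fake_stream, 1728, 2200)))
```

Writing the returned bytes to a .tif file should give something a TIFF reader with CCITT support can open, provided the stream really is plain G3/G4 data with no extra filters applied on top.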

I will post back if I can get this working.
Jun 27 '08 #2
