What is causing PDF-to-image extraction to output a blurry image?

I am working on PDF-to-XML conversion. I got PDF-to-text extraction working by using the PDF Clown samples, and I can now extract both the text and the images from a PDF.

The problem is that the saved image is blurry.


else if (content is it.stefanochizzolini.clown.documents.contents.objects.GraphicsObject)
{
    /*=============================================================================
     *  TO EXTRACT THE GRAPHICAL OBJECT WITHIN THE PDF
     =============================================================================*/
    ContentScanner.GraphicsObjectWrapper objectWrapper = level.CurrentWrapper;
    if (objectWrapper == null)
    {
        continue;
    }

    /*
      NOTE: Images can be represented on a page either as
      external objects (XObject) or inline objects.
    */
    SizeF? imageSize = null; // Image native size.
    /*if (objectWrapper is PdfDataObject)
    {
     ContentScanner.GraphicsObjectWrapper gobjectwrapper=(ContentScanner.GraphicsObjectWrapper)gobjectwrapper;
        it.stefanochizzolini.clown.objects.PdfDataObject pdobjt=gobjectwrapper
    }*/
    //if(objectWrapper is Image)
    //{
    //    }
    if (objectWrapper is ContentScanner.XObjectWrapper)
    {
        ContentScanner.XObjectWrapper xObjectWrapper = (ContentScanner.XObjectWrapper)objectWrapper;
        it.stefanochizzolini.clown.documents.contents.xObjects.XObject Xobject = xObjectWrapper.XObject;
        // Is the external object an image?
        if (Xobject is it.stefanochizzolini.clown.documents.contents.xObjects.ImageXObject)
        {
            // Log the image key and its indirect reference.
            txtOutput.Text = txtOutput.Text + Environment.NewLine +
              "External Image '" + xObjectWrapper.Name + "' (" + Xobject.BaseObject + ")";

            imageSize = Xobject.Size; // Image native size.

            PdfDataObject dataObject = Xobject.BaseDataObject;
            PdfDictionary header = ((PdfStream)dataObject).Header;
            if (header.ContainsKey(PdfName.Type) && header[PdfName.Type].Equals(PdfName.XObject) && header[PdfName.Subtype].Equals(PdfName.Image))
            {
                // Guard against streams without a Filter entry before comparing.
                if (header.ContainsKey(PdfName.Filter) && header[PdfName.Filter].Equals(PdfName.DCTDecode)) // JPEG image.
                {
                    // Get the image data (keeping it encoded)!
                    IBuffer body1 = ((PdfStream)dataObject).GetBody(false);

                    // Export the image!
                    // The body is still JPEG-encoded, so it is saved as .jpg rather than .bmp.
                    ExportImage(
                      body1,
                      txtOutputPath.Text + System.IO.Path.DirectorySeparatorChar + "Image_" + (index++) + ".jpg"
                      );
                }
            }
        }
    }
}

private void ExportImage(IBuffer data, string outputPath)
{
    FileStream outputStream;
    try
    { outputStream = new FileStream(outputPath, FileMode.CreateNew); }
    catch (Exception e)
    { throw new Exception(outputPath + " file couldn't be created.", e); }

    try
    {
        // Dispose the writer and the stream even if the write fails.
        using (outputStream)
        using (BinaryWriter writer = new BinaryWriter(outputStream))
        {
            writer.Write(data.ToByteArray());
        }
    }
    catch (Exception e)
    { throw new Exception(outputPath + " file writing has failed.", e); }

    Console.WriteLine("Output: " + outputPath);
}
Jan 27 '11 #1
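
Because the DCTDecode branch above keeps the stream encoded, the bytes handed to ExportImage are a JPEG file whatever extension the output path carries. A minimal sketch of picking the extension from the leading signature bytes instead of hard-coding it; `ImageSignature` and `GuessExtension` are hypothetical helper names for illustration and are not part of PDF Clown:

```csharp
using System;

public static class ImageSignature
{
    // Guesses a file extension from the first bytes of the data.
    // Returns ".bin" when the signature is not one of the three checked here.
    public static string GuessExtension(byte[] data)
    {
        if (data == null) throw new ArgumentNullException("data");

        // JPEG files start with the SOI marker followed by another marker: FF D8 FF.
        if (data.Length >= 3 && data[0] == 0xFF && data[1] == 0xD8 && data[2] == 0xFF)
            return ".jpg";

        // BMP files start with the ASCII characters "BM".
        if (data.Length >= 2 && data[0] == 0x42 && data[1] == 0x4D)
            return ".bmp";

        // PNG files start with 89 50 4E 47.
        if (data.Length >= 4 && data[0] == 0x89 && data[1] == 0x50 &&
            data[2] == 0x4E && data[3] == 0x47)
            return ".png";

        return ".bin";
    }
}
```

In the extraction code this could be called as `ImageSignature.GuessExtension(body1.ToByteArray())` when building the output path. It only labels the file correctly; it does not re-encode the image or change its resolution.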
The extracted images now come out as negatives. They actually display correctly on Mac OS, but on Windows they show up as negative images. Please help me find a solution.
Jan 27 '11 #2
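
One possibility worth checking, offered as a guess rather than a diagnosis of the files in this thread: a JPEG that looks negative in some viewers and normal in others is often a four-component (CMYK or YCCK) file carrying an Adobe APP14 marker, and viewers differ in how they handle it. The sketch below reads only the JPEG segment headers to report the component count and whether that marker is present. `JpegInfo` and `JpegInspector` are hypothetical names, and the code does not depend on PDF Clown:

```csharp
using System;

public sealed class JpegInfo
{
    public int Width;
    public int Height;
    public int Components;          // 1 = gray, 3 = RGB/YCbCr, 4 = CMYK/YCCK
    public bool HasAdobeMarker;     // APP14 segment starting with "Adobe"
    public int AdobeTransform = -1; // 0 = none, 1 = YCbCr, 2 = YCCK; -1 = no marker
}

public static class JpegInspector
{
    public static JpegInfo Inspect(byte[] d)
    {
        if (d == null || d.Length < 4 || d[0] != 0xFF || d[1] != 0xD8)
            throw new ArgumentException("Not a JPEG stream (missing SOI marker).");

        JpegInfo info = new JpegInfo();
        int pos = 2;
        while (pos + 4 <= d.Length)
        {
            if (d[pos] != 0xFF) throw new FormatException("Expected a marker at offset " + pos);
            int marker = d[pos + 1];
            if (marker == 0xFF) { pos++; continue; }        // fill byte before a marker
            if (marker == 0xDA || marker == 0xD9) break;    // SOS or EOI: headers are over
            if (marker == 0x01 || (marker >= 0xD0 && marker <= 0xD7)) { pos += 2; continue; } // no length field

            int length = (d[pos + 2] << 8) | d[pos + 3];    // big-endian, includes its own 2 bytes
            if (length < 2 || pos + 2 + length > d.Length)
                throw new FormatException("Truncated segment at offset " + pos);

            int payload = pos + 4;
            int payloadLength = length - 2;

            if (marker == 0xEE && payloadLength >= 12 &&
                d[payload] == 'A' && d[payload + 1] == 'd' && d[payload + 2] == 'o' &&
                d[payload + 3] == 'b' && d[payload + 4] == 'e')
            {
                info.HasAdobeMarker = true;
                info.AdobeTransform = d[payload + 11];      // after version and two flag words
            }
            else if (IsStartOfFrame(marker) && payloadLength >= 6)
            {
                info.Height = (d[payload + 1] << 8) | d[payload + 2];
                info.Width = (d[payload + 3] << 8) | d[payload + 4];
                info.Components = d[payload + 5];
            }
            pos += 2 + length;
        }
        return info;
    }

    // SOF0..SOF15, excluding DHT (C4), JPG (C8) and DAC (CC).
    private static bool IsStartOfFrame(int marker)
    {
        return marker >= 0xC0 && marker <= 0xCF &&
               marker != 0xC4 && marker != 0xC8 && marker != 0xCC;
    }
}
```

If `Inspect` reports `Components == 4` together with `HasAdobeMarker == true`, the negative appearance probably comes from how the viewer decodes the file rather than from the extraction step, and converting the image to RGB with a decoder that understands Adobe CMYK JPEGs would be the next thing to try.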
