Read pdf file

Hi All

How can I read the content of an existing PDF file using ASP.NET?

Thanks
Anitha
Sep 24 '09 #1
ssnaik84
What exactly do you want to do?

Take a look at iTextSharp.
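
For anyone who wants a starting point, here is a minimal sketch of reading text with iTextSharp. It assumes a 5.x release, where the `iTextSharp.text.pdf.parser` namespace provides `PdfTextExtractor`; older 4.x releases do not have it. The class name `PdfTextReader` and the `pdfPath` parameter are made up for illustration.

```csharp
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class PdfTextReader
{
    // Returns the text of every page, one page after another.
    // pdfPath is a hypothetical parameter: pass any readable PDF path.
    public static string ReadAllText(string pdfPath)
    {
        PdfReader reader = new PdfReader(pdfPath);
        try
        {
            StringBuilder text = new StringBuilder();
            // iTextSharp page numbers start at 1
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page));
            }
            return text.ToString();
        }
        finally
        {
            // Release the file handle even if extraction throws
            reader.Close();
        }
    }
}
```

In an ASP.NET page you could call it with something like `PdfTextReader.ReadAllText(Server.MapPath("~/App_Data/sample.pdf"))`, where the path is only an example. Scanned PDFs that contain images rather than text will come back empty.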
Sep 24 '09 #2
Frinavale
Read? What do you mean by "read"? Do you mean reading the file into your code, or letting the user view the PDF file?
Sep 24 '09 #3
Use iTextSharp and the following code to read the contents of a PDF in your code:


using iTextSharp.text.pdf;

// Number of trailing characters kept for operator matching.
// This field was not shown in the original post; 15 is an assumed value
// (CheckToken needs at least 4).
private const int _numberOfCharsToKeep = 15;

// Reads every page of the PDF and returns the extracted text.
private string openPDF()
{
    string str = "";
    string newFile = "c:\\New Document.pdf";

    PdfReader reader = new PdfReader(newFile);
    try
    {
        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            byte[] bt = reader.GetPageContent(i);
            str += ExtractTextFromPDFBytes(bt);
        }
    }
    finally
    {
        // Release the file even if extraction fails
        reader.Close();
    }
    return str;
}

private string ExtractTextFromPDFBytes(byte[] input)
{
    if (input == null || input.Length == 0) return "";

    try
    {
        string resultString = "";

        // True while we are inside a text object (between BT and ET)
        bool inTextObject = false;

        // True when the next character is escaped,
        // e.g. '\\' to get a '\' character or '\(' to get '('
        bool nextLiteral = false;

        // () bracket nesting level. Text appears inside ()
        int bracketDepth = 0;

        // Keep the most recent characters so operators can be matched
        char[] previousCharacters = new char[_numberOfCharsToKeep];
        for (int j = 0; j < _numberOfCharsToKeep; j++) previousCharacters[j] = ' ';

        for (int i = 0; i < input.Length; i++)
        {
            char c = (char)input[i];

            if (inTextObject)
            {
                // Text positioning operators become line breaks or spaces
                if (bracketDepth == 0)
                {
                    if (CheckToken(new string[] { "TD", "Td" }, previousCharacters))
                    {
                        resultString += "\r\n";
                    }
                    else if (CheckToken(new string[] { "'", "T*", "\"" }, previousCharacters))
                    {
                        resultString += "\n";
                    }
                    else if (CheckToken(new string[] { "Tj" }, previousCharacters))
                    {
                        resultString += " ";
                    }
                }

                if (bracketDepth == 0 &&
                    CheckToken(new string[] { "ET" }, previousCharacters))
                {
                    // End of a text object: emit a separating space
                    inTextObject = false;
                    resultString += " ";
                }
                else if ((c == '(') && (bracketDepth == 0) && (!nextLiteral))
                {
                    // Start outputting text
                    bracketDepth = 1;
                }
                else if ((c == ')') && (bracketDepth == 1) && (!nextLiteral))
                {
                    // Stop outputting text
                    bracketDepth = 0;
                }
                else if (bracketDepth == 1)
                {
                    if (c == '\\' && !nextLiteral)
                    {
                        // A backslash escapes the next character
                        nextLiteral = true;
                    }
                    else
                    {
                        // Keep printable ASCII and characters 128 to 254;
                        // the character is output as-is, not interpreted
                        if (((c >= ' ') && (c <= '~')) ||
                            ((c >= 128) && (c < 255)))
                        {
                            resultString += c.ToString();
                        }

                        nextLiteral = false;
                    }
                }
            }

            // Shift the buffer left and store the current character
            // so the checks above can look back at recent input
            for (int j = 0; j < _numberOfCharsToKeep - 1; j++)
            {
                previousCharacters[j] = previousCharacters[j + 1];
            }
            previousCharacters[_numberOfCharsToKeep - 1] = c;

            // Start of a text object
            if (!inTextObject && CheckToken(new string[] { "BT" }, previousCharacters))
            {
                inTextObject = true;
            }
        }
        return resultString;
    }
    catch
    {
        // Any parsing failure yields an empty result for this page
        return "";
    }
}

// Returns true if the buffer ends with one of the tokens, with whitespace
// on both sides. Handles one- and two-character tokens; indexing token[1]
// unconditionally would throw on one-character tokens such as "'".
private bool CheckToken(string[] tokens, char[] recent)
{
    int last = _numberOfCharsToKeep - 1;
    foreach (string token in tokens)
    {
        // Index of the token's first character in the buffer
        int start = last - token.Length;
        if (!IsSeparator(recent[last]) || !IsSeparator(recent[start - 1]))
        {
            continue;
        }

        bool match = true;
        for (int k = 0; k < token.Length; k++)
        {
            if (recent[start + k] != token[k])
            {
                match = false;
                break;
            }
        }
        if (match) return true;
    }
    return false;
}

// Space, carriage return or line feed
private bool IsSeparator(char ch)
{
    return ch == ' ' || ch == 0x0d || ch == 0x0a;
}
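
To see what the parser above is scanning for, it may help to try the idea on a hand-written content stream. The sketch below is a simplified stand-in, not the code from this post: it ignores the BT/ET and positioning operators and only collects the literal strings between `(` and `)`. It handles nested parentheses and keeps the character after a backslash as-is, so escapes such as `\n` are not decoded. The names `ContentStreamDemo` and `ExtractLiteralStrings` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ContentStreamDemo
{
    // Collects the text found between '(' and ')' in a PDF content stream.
    public static List<string> ExtractLiteralStrings(string content)
    {
        List<string> strings = new List<string>();
        StringBuilder current = new StringBuilder();
        int depth = 0;         // parenthesis nesting level
        bool escaped = false;  // true right after an unescaped backslash

        foreach (char c in content)
        {
            if (depth == 0)
            {
                // Outside a string: skip operators and operands
                if (c == '(') depth = 1;
                continue;
            }

            if (escaped)
            {
                current.Append(c);  // keep the escaped character unchanged
                escaped = false;
            }
            else if (c == '\\')
            {
                escaped = true;
            }
            else if (c == '(')
            {
                depth++;            // nested, balanced parenthesis
                current.Append(c);
            }
            else if (c == ')')
            {
                depth--;
                if (depth == 0)
                {
                    // Closing parenthesis of the string itself
                    strings.Add(current.ToString());
                    current.Length = 0;
                }
                else
                {
                    current.Append(c);
                }
            }
            else
            {
                current.Append(c);
            }
        }
        return strings;
    }
}
```

For example, passing the stream `BT /F1 12 Tf 72 720 Td (Hello \(PDF\)) Tj 0 -14 Td (second line) Tj ET` returns two strings: `Hello (PDF)` and `second line`. Real pages often use compressed streams, hex strings or font-specific encodings, which neither this sketch nor a simple byte scanner will decode.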
Sep 25 '09 #4
