Is it possible to pull data (text contents and file attributes, like the filename) from a PDF file and store it in SQL?
...using C#.
I have a web app with 100+ PDF files that I need keyword search capability for. It would produce results with link(s) to the corresponding PDF file. Not sure if this is possible.
Thanks!
Yes, it is possible.
Do you also want to read the text contained inside the PDF files?
You would have to create a separate program (it could be a console or Windows application, or ASP.NET if you want) that uses the DirectoryInfo class to fetch all the files in the directory as FileInfo objects.
Once you have all the files you want, you can insert them into your database table.
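A minimal sketch of the directory scan described above, using only DirectoryInfo and FileInfo from System.IO. The names PdfFileRecord, PdfCatalog and Scan are hypothetical, chosen for illustration; the record's properties are just one guess at the columns such a table might hold.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical record type: one row per PDF, mirroring the columns a
// database table might hold (name, path, size, last-modified time).
public sealed class PdfFileRecord
{
    public string FileName { get; set; }
    public string FullPath { get; set; }
    public long SizeBytes { get; set; }
    public DateTime LastModifiedUtc { get; set; }
}

public static class PdfCatalog
{
    // Walks rootFolder (including subfolders) and returns one record per PDF file.
    public static List<PdfFileRecord> Scan(string rootFolder)
    {
        var records = new List<PdfFileRecord>();
        var root = new DirectoryInfo(rootFolder);

        foreach (FileInfo file in root.GetFiles("*.pdf", SearchOption.AllDirectories))
        {
            // On .NET Framework a three-character extension pattern can also
            // match longer extensions, so re-check the extension explicitly.
            if (!string.Equals(file.Extension, ".pdf", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            records.Add(new PdfFileRecord
            {
                FileName = file.Name,
                FullPath = file.FullName,
                SizeBytes = file.Length,
                LastModifiedUtc = file.LastWriteTimeUtc
            });
        }

        return records;
    }
}
```

Each returned record could then be written to the table with a parameterized INSERT; the database part is left out here because it depends on your schema and connection setup.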
Yes, I would like to read the actual text inside the PDF. I found some info on how to convert it to a text file. Perhaps I can do that and import the result into SQL. I would just need a filename column that corresponds to the text exported from the PDF. What do you think?
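A rough sketch of the filename-plus-text idea, assuming each row pairs a file name with the text exported from that PDF. It uses an in-memory list as a stand-in for the SQL table, so it shows the shape of the lookup rather than real database code; PdfTextRow, PdfKeywordSearch, FindLinks and the baseUrl parameter are all hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row shape: the file name plus the text exported from that PDF.
public sealed class PdfTextRow
{
    public string FileName { get; set; }
    public string Text { get; set; }
}

public static class PdfKeywordSearch
{
    // Returns a link for every row whose text contains the keyword (case-insensitive).
    // baseUrl is the assumed folder URL under which the web app serves the PDFs.
    public static List<string> FindLinks(IEnumerable<PdfTextRow> rows, string keyword, string baseUrl)
    {
        var links = new List<string>();

        foreach (PdfTextRow row in rows)
        {
            bool matches = row.Text != null
                && row.Text.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0;

            if (matches)
            {
                // Escape the file name so spaces and other characters are URL-safe.
                links.Add(baseUrl.TrimEnd('/') + "/" + Uri.EscapeDataString(row.FileName));
            }
        }

        return links;
    }
}
```

Against a real table, the same lookup would be a parameterized query on the text column that returns the filename column.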
Hi,
Since you have a lot of PDF files, and they will contain a significant amount of text, I think that storing all the text in the database and searching within it will take a lot of time.
Have you looked at any of the desktop search APIs?
Google provides one, but I haven't looked into it and am not sure how you would integrate it. It would be an easier way out, though (you would have to keep all the PDF files within the same folder, or at least under the same root path).
Hi Shashi, thank you for the replies. I'll take a look at those APIs.
Merry Christmas!
You can easily get text from PDF files using the PDFBox library.
Use Google to find out how to use it in .NET 2.0, because natively it's a Java library.
You will also need IKVM.GNU.
Try this: how to use PDFBox with C#.
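For reference, text extraction with the .NET build of PDFBox usually looks roughly like the sketch below. It assumes the project references the PDFBox and IKVM assemblies and that the namespaces are org.pdfbox.* as in the old 0.7.x .NET build (later Apache releases use org.apache.pdfbox.* instead). PdfTextExtractor and ExtractText are hypothetical names for illustration.

```csharp
using System;
// Assumption: namespaces from the old PDFBox .NET build (0.7.x), which is
// the Java library compiled for .NET with IKVM. Adjust for newer versions.
using org.pdfbox.pdmodel;
using org.pdfbox.util;

public static class PdfTextExtractor
{
    // Loads the PDF at pdfPath and returns its text content as one string.
    public static string ExtractText(string pdfPath)
    {
        PDDocument document = PDDocument.load(pdfPath);
        try
        {
            PDFTextStripper stripper = new PDFTextStripper();
            return stripper.getText(document);
        }
        finally
        {
            // Release the file handle even if extraction throws.
            document.close();
        }
    }
}
```

The returned string can then be stored alongside the file name. Scanned PDFs that contain only images will likely come back empty, since this reads embedded text and does no OCR.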
Diego, thank you very much! Very helpful.