
Scan (OCR) into MS Access (form)?

Hi all,

I'm very new to this forum. I searched for related postings but found none.

Question:

Does anyone know of an application that could convert/import scanned documents (PDFs) into specified fields of an Access database?

I constantly deal with paperwork from clients, such as purchase orders, and data entry into Access is very time-consuming and costly.

What I'm looking for is a program that would take my scanned PDFs and OCR them into an Access form, importing the data into the specific fields I've created and producing a temp table whose data I could then manipulate.

Not sure if I worded it correctly, but I'm hoping such software exists.

Please post.

Thx.

H
Jul 2 '08 #1
Stewart Ross
2,545 Expert Mod 2GB
Although there are commercial systems which can scan unstructured documents for indexing and retrieval purposes, I am not aware of any that are aimed at scanning structured documents into database tables.

Commercial OCR systems used for high-volume scanning tend to be bought for particular purposes (e.g. scanning opinion survey results, attendance sheets, voting papers etc.) and rely on bespoke forms that the scanner can read easily (hence the typical light red print on many such forms, which fades out when scanned leaving the responses clear).

A quick Google search does not show anything like what you need available at present; sorry. Other respondents might know differently, but I think you will not find a solution at present.

-Stewart

PS: Doing another search, I found a commercial PDF-to-Excel table converter. I haven't tried it myself, so I cannot vouch for it or give any recommendation about its strengths or weaknesses.

Following it up might help in your quest, although I remain doubtful. Here is the web link to it: http://www.snapfiles.com/get/pdf2xl.html.
Jul 3 '08 #2
Hi Stewart,

Thanks for your reply.

I've been googling too, hoping to find such software, but have been unsuccessful as well.

What about a simplified process?

I.e., with all documents correctly scanned into Word documents, is there a way in Access to create a template on a form that indicates the location of the fields, and then a process for converting each Word document into that Access template?

Peers mentioned the use of OLE embedding in Access?

Any ideas? My knowledge of Access is quite limited, so any input is much appreciated.

regards,

H
Jul 16 '08 #3
Stewart Ross
2,545 Expert Mod 2GB
Hi hayadooen. Can't think of any way to tackle even the simpler suggestion you've made.

The problem is that you need to define the structure of the scanned document. To use an analogous situation, it is like importing a text file containing fixed-length strings of characters, where each line of text is one complete record. If you don't have a field list to know where one field ends and the next begins, it is virtually impossible to import such text automatically.
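To make the fixed-length analogy concrete, here is a minimal Python sketch of what a "field list" buys you. The layout, field names and sample record are all hypothetical, invented purely for illustration; a real layout would have to be worked out from the actual documents.

```python
from __future__ import annotations

from typing import NamedTuple


class Field(NamedTuple):
    name: str
    start: int  # zero-based offset of the field's first character
    width: int  # number of characters the field occupies


# Hypothetical field list for one purchase-order record (an assumption,
# not taken from any real document).
PO_LAYOUT = [
    Field("po_number", 0, 8),
    Field("customer", 8, 20),
    Field("quantity", 28, 5),
]


def parse_fixed_width(line: str, layout: list[Field]) -> dict[str, str]:
    """Slice one fixed-width record into named fields using the field list."""
    return {f.name: line[f.start:f.start + f.width].strip() for f in layout}


# Build a sample record that follows the layout exactly (8 + 20 + 5 characters).
sample = f"{'PO001234':<8}{'Acme Widgets Ltd':<20}{12:>5}"
record = parse_fixed_width(sample, PO_LAYOUT)
print(record)
```

Without `PO_LAYOUT` the same line is just 33 characters with no indication of where one field stops and the next starts, which is the position a raw scan leaves you in.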

The Excel table converter I mentioned was the closest I could find to what you needed. Importing to Word using character recognition software would not help, as even if the OCR software was very consistent you would need to define bookmarks in Word to identify the field structure before you could get the Word document into Access.

For OLE in Access all you would be doing is calling the scanner to embed an image of the document in a table as a binary file of some kind (a BLOB or Binary Large Object file). You would still need to OCR this into a regular Access table, and this in turn means being able to identify and impose a field structure on an unstructured scan - which by the lack of solutions available you can see is not a simple task.

Sorry!

-Stewart
Jul 17 '08 #4
youmike
69
I've done a bit of experimenting in this area and I think Stewart hit the nail on the head when he used the word "unstructured". The fact is that the volume of coding that would be needed to deal with all the possible alternatives simply is not worth it. The applications that I've developed all recognised the problem: they create documents which can later be processed using a bar code ID, which retrieves most of the processing data from the appropriate tables and prompts the user to add only those elements not so available. Even so, there is a significant capturing overhead.

When it comes to third party documents, this approach becomes unworkable. The only other thing that might work is a series of prompts to read parts of a document using a hand held scanner, but I'd say that the labour would be no less than more conventional means of capture.

Sorry to be so negative.
Jul 18 '08 #5
You can use regular expressions to look for certain patterns in your document and return the information.

Might be a little difficult to do it without field names, but definitely not impossible.
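As a rough sketch of the idea, assuming the OCR step has already produced plain text, something like the following Python could pull out a few fields. The patterns, field names and sample text are hypothetical and would need tuning to the wording of the real documents.

```python
from __future__ import annotations

import re

# Hypothetical patterns, one per field; each captures the value in group 1.
PATTERNS = {
    "po_number": re.compile(r"PO\s*(?:No\.?|Number|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE),
    "order_date": re.compile(r"Date\s*:?\s*(\d{1,2}/\d{1,2}/\d{4})", re.IGNORECASE),
    "total": re.compile(r"Total\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}


def extract_fields(text: str) -> dict[str, str | None]:
    """Return the first match for each pattern, or None if a field is missing."""
    result: dict[str, str | None] = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


# Made-up OCR output for a purchase order.
ocr_text = """ACME WIDGETS LTD
PO No: 4471-B
Date: 02/07/2008
Widget, blue   x 12   $ 4.50
Total: $54.00
"""

fields = extract_fields(ocr_text)
print(fields)
```

This only works where the documents carry recognisable labels such as "PO No" or "Total"; fields that appear with no label at all would still need positional rules or manual entry.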
Nov 1 '10 #6
I had the same problem. You can try using a PDF converter that converts the PDF into XRBL (Excel) format first, then import that into Access directly into the datasheet, bypassing the form. This works well as long as the PDFs don't have any handwritten characters (the converter has a hard time recognizing them).
Mar 20 '11 #7
