Looking for people who have built document scanning/retrieval into an Access database

Bob Alston

I am looking for others who have built systems to scan documents, index
them and then make them accessible from an Access database. My
environment is a nonprofit with about 20-25 case workers who use
laptops. They have Access databases on their laptops and the data is
replicated.

The idea is that each case worker would scan their own documents,
either remotely or back at the office.

And NO I am not planning to store the scanned images in the Access
database. I already know not to do that. The Access database would
only have a record with an index of the document and its file name.

Conceptual Approach
--------------------
Use a document scanner that can put the documents in a directory with a
sequential number affixed. Something like c:\ScannedDocs

Then I would plan to have a program - probably Access/VBA that goes
through all documents (in sequential file number order) in the directory
and brings up the scanned image. At this point case worker will
identify the document by client/consumer and type of document.

Then I propose to copy the document to another location, something like
c:\IndexedDocs
And rename the doc to include the client/consumer #, document type and
scan date and a sequential number in the document name
xxxxxxx-TTTTTTTTTT-yyyy-mm-dd-ssss

I would delete the source document from the scanned-in documents folder.

At the same time I would add a record to the Access database that links
to the client/consumer, identify the document type and scan date and
would have the file name of the indexed document.
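
A minimal sketch of the rename-and-file step described above, written in Python for brevity rather than the Access/VBA the post has in mind. The function names (`indexed_name`, `file_scanned_doc`), the zero-padding widths, and the returned dictionary standing in for the Access record are all assumptions for illustration, not part of the original plan.

```python
import shutil
from datetime import date
from pathlib import Path


def indexed_name(client_id: int, doc_type: str, scan_date: date,
                 seq: int, ext: str) -> str:
    """Build a name shaped like xxxxxxx-TTTTTTTTTT-yyyy-mm-dd-ssss.

    The 7-digit client number and 4-digit sequence widths are assumed.
    """
    return f"{client_id:07d}-{doc_type}-{scan_date.isoformat()}-{seq:04d}{ext}"


def file_scanned_doc(src: Path, indexed_dir: Path, client_id: int,
                     doc_type: str, scan_date: date, seq: int) -> dict:
    """Move one scanned file into the indexed folder under its new name.

    Returns the fields that would go into the database record.
    """
    indexed_dir.mkdir(parents=True, exist_ok=True)
    dest = indexed_dir / indexed_name(client_id, doc_type, scan_date,
                                      seq, src.suffix)
    if dest.exists():
        # Refuse to overwrite an already indexed document.
        raise FileExistsError(f"{dest} already exists; use a new sequence number")
    # A move covers both "copy to IndexedDocs" and "delete the source".
    shutil.move(str(src), str(dest))
    return {"client_id": client_id, "doc_type": doc_type,
            "scan_date": scan_date.isoformat(), "file_name": dest.name}
```

Storing only the file name (not the full path) in the record keeps it valid on both the laptop and the central copy, as long as each side knows its own base folder.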

When viewing the document within Access, I would plan to use the method
of retrieving it, and inserting it into a blob within an Access form for
display only. I would NOT store the image in the Access database.

The Access database is already planned to be replicated, so this
approach allows the information on scanned documents to be available to
central office personnel as well. I am planning to have a central file
of scanned images, so each time the user would come into the office and
the Access database would be replicated, all new scanned and indexed
documents would be uploaded to a central repository.
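
The upload step could be sketched roughly as below, again in Python rather than VBA. It assumes the central repository is a plain shared folder and that indexed file names are unique across laptops; `upload_new_docs` and both folder arguments are hypothetical names.

```python
import shutil
from pathlib import Path


def upload_new_docs(local_dir: Path, central_dir: Path) -> list:
    """Copy files that exist locally but not centrally; return their names."""
    central_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(local_dir.iterdir()):
        dest = central_dir / src.name
        # Skip subfolders and anything already present in the central folder.
        if src.is_file() and not dest.exists():
            shutil.copy2(src, dest)  # copy2 also preserves timestamps
            copied.append(src.name)
    return copied
```

Running this right after the database replication would keep the central file store in step with the replicated index records.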

The laptop users would only have scanned documents on their own
clients/consumers. Generally about 100-150 clients/consumers at a time.
I am guessing that initially the system would record 10-20 scanned
images per consumer. However over time, I would anticipate that more
and more documents would be scanned. The central database would have a
copy of all scanned documents.

ISSUES

Anyone done this on a distributed, laptop oriented basis before?
If so, guidance would be appreciated

Any suggestions on scanners to use?

Anyone have experience in having multiple users scan in their own images
and run indexing processes vs. having a central scanning and indexing
function?

Should I try combining multiple images together? Most documents are
single page but a few are 2-3 pages and one is a whopping 18 pages.
Paperport software says it has features to combine multiple scanned images.
Should I try to combine multiple scanned images together or keep
separate and just use page numbers?

What document format and resolution is needed? I am assuming that I
would use JPG but need suggestions on resolution.

Anyone have examples of doing this they would be willing to share?

Any comments on or suggestions for improving the overall approach.

Thanks

Bob

bobalston9 AT yahoo DOT com

Aug 13 '06 #1

Lyle Fairfield

Any comments on or suggestions for improving the overall approach.

As I understand your post you will store the Image rather than have
Optical Character Recognition read the text?

I haven't used OCR much but it's never failed me, although the
formatting and location of text can often be disappointing, and TTBOMK
it's included in most scanning software.

***
Have you considered an Indexing Service Application? Properly set up,
it will maintain catalogs of whatever folders and documents you
instruct it to, and its search capabilities (for documents) are many
times more powerful than one is likely even to dream about for
Access/Jet. Indexing Service is not really so well-documented, seems to
be infrequently used, and may appear arcane and difficult. Once you
become slightly familiar with it, well ... you can fall in love with
it!
It's fully accessible through ADO. Theoretically at least, by using an
unconnected (as opposed to disconnected - never connected rather than
once connected) ADP which will have zip to do with SQL-Server, one
could create ADO recordsets and use them for both forms and reports.

This would be my way ... but for whatever reason it does not seem to be
the way of many others.

Aug 13 '06 #2

Bob Alston

Thanks for the suggestions. I will look into that.

And yes, it is the image I want to capture, not OCR text, because they
need the image of the client signature.

Bob

Aug 13 '06 #3

alfred

Download smart-it accounting from www.smartit.co.za
Go to Customers Maintenance and then click on the scan tab. Scan in
something. If that is what you want to do then I will send the code or you
can use the program.
Alfred -- email to ad*******@gmail.com

Aug 13 '06 #4

Bob Alston

Thanks, downloading it now.

Is your app built in Access? What language is the code in?

Right now I don't have a scanner hooked up but hopefully I will get the
gist from the user interface.

Bob

Aug 13 '06 #5

Bob Alston

Yes, that looks right on. I would appreciate a copy of the code.

Thank you.

Bob

bobalston9 AT yahoo DOT com

Aug 13 '06 #6

salad

I know that my client uses a scanner. It's a good one. It scans both
sides, and somehow you can stick in a stack of documents and it knows when
one stack completes a file and another begins.

One thing they do is have a stamp. They stamp the document and enter
the "id" information on it. Then when they scan the document they know
the type and the identifier for the database association.

I too used the ScannedDocuments folder. When they scan, since I have
various types of documents I want to catalog, I have subfolders under
scanned docs...like AR, Project, Customer, etc.

Because there can be thousands of documents, I check the current date. I
then check to see if a folder for that year exists. Ex:
\ScannedDocuments\Ar\2006.

I use ScannedDocuments\Ar as my holding folder for AR files. Once the
file's been tagged to an Access record, I move it to 2006. This
compartmentalizes the files and keeps the number of files in each folder down.
There's nothing like going to a folder that has 20000+ documents in it
and waiting for explorer to present the list. It's a yawner.
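
For illustration, the move from the holding folder into a per-year subfolder might look something like this in Python (the actual program is not shown in this thread, so this is only a sketch). `move_to_year_folder` is a hypothetical name, and the sketch does no collision handling.

```python
import shutil
from datetime import date
from pathlib import Path


def move_to_year_folder(src: Path, holding_dir: Path, today: date) -> Path:
    """Move a tagged file from the holding folder into <holding>/<year>."""
    year_dir = holding_dir / str(today.year)
    # Create the year folder the first time a file is filed in that year.
    year_dir.mkdir(exist_ok=True)
    dest = year_dir / src.name
    shutil.move(str(src), str(dest))
    return dest
```

For example, filing `ScannedDocuments\Ar\inv001.tif` on a 2006 date would place it in `ScannedDocuments\Ar\2006`.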

If memory serves me correctly, their scanning software creates the
filename. So for the most part I can assume the filename is unique.
However, a user might have a file on his hard drive he wants accessible
to all, and I allow him to select the file via a File/Open. In this
case, he might have a filename that already exists in the 2006 folder.
If he copied it to 2006, it'd overwrite it unless I made an adjustment.
So I append a counter to it. The original might be Test.Txt. The
next will be Test1.Txt and so on.
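
A rough Python sketch of that counter idea, assuming the counter is simply appended to the name before the extension (Test.Txt, then Test1.Txt, then Test2.Txt). `unique_destination` is a made-up name for illustration.

```python
from pathlib import Path


def unique_destination(folder: Path, filename: str) -> Path:
    """Return a path in folder that does not collide with an existing file."""
    candidate = folder / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    # Keep bumping the counter until the name is free.
    while candidate.exists():
        candidate = folder / f"{stem}{counter}{suffix}"
        counter += 1
    return candidate
```

The caller would copy the user's file to the returned path and store that final name in the database record.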

If your scanner allows you to script filenames, that can make life
simple...simply parse the filename to figure out which records to
associate it with. If not, I allow the user to view the doc and with
the stamped info on it they can tag it to the document quickly.
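
As a sketch of the parsing idea, here is what it could look like in Python if the file names followed the xxxxxxx-TTTTTTTTTT-yyyy-mm-dd-ssss pattern proposed earlier in the thread. That pattern, the field names, and `parse_indexed_name` are assumptions; a real scanner's naming scheme would need its own expression.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional

# client number, document type, ISO scan date, sequence number
_NAME = re.compile(
    r"^(?P<client>\d+)-(?P<doc_type>.+)-(?P<date>\d{4}-\d{2}-\d{2})-(?P<seq>\d+)$"
)


def parse_indexed_name(filename: str) -> Optional[dict]:
    """Split a file name into its index fields, or return None if it does not fit."""
    m = _NAME.match(Path(filename).stem)
    if m is None:
        return None
    try:
        scan_date = date.fromisoformat(m["date"])
    except ValueError:
        # Digits in the right places but not a real calendar date.
        return None
    return {"client_id": int(m["client"]), "doc_type": m["doc_type"],
            "scan_date": scan_date, "seq": int(m["seq"])}
```

Files that return None would fall back to the manual route of viewing the document and tagging it by hand.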

It's funny, I wrote the program but haven't been involved in the process.
I think most documents that are scanned come in as TIFs. I know the
documents that are multi-page remain multi-page...as in 1 file. I
wish I knew the type of scanner they use...it's a good one.

Aug 13 '06 #7
