Extracting text from .png images

Hi group!

I need to extract some text (well, numbers actually) from a bunch of
similar-looking .png images. After extraction, the numbers will be fed to a
Python script for further processing. Any good ideas on how to go about
this? I have no idea whatsoever how to extract the numbers from the
images...

Thanks in advance,

Henrik
Jul 18 '05 #1
"Henrik Berg Nielsen" <hb*@imada.sdu.dk> writes:
> I need to extract some text (well, numbers actually) from a bunch of
> similar-looking .png images. After extraction, the numbers will be fed to a
> Python script for further processing. Any good ideas on how to go about
> this? I have no idea whatsoever how to extract the numbers from the
> images...


OCR is the TLA you're looking for ("Optical Character Recognition").

Dunno if there are any good free OCR engines. With these sorts of
hard algorithms, you tend to get what you pay for.
John
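
Whichever OCR engine ends up doing the recognition, the Python side of the job is mostly a matter of pulling the numbers out of the text it produces. A minimal sketch, assuming the engine's output is already available as a plain string; `NUMBER_RE` and `extract_numbers` are hypothetical names made up for illustration and are not part of any OCR package:

```python
import re

# Optional sign, one or more digits, optional decimal part (e.g. "42", "-3.5").
NUMBER_RE = re.compile(r"[-+]?\d+(?:\.\d+)?")


def extract_numbers(ocr_text):
    """Return every number found in a string of OCR output, as floats."""
    return [float(match) for match in NUMBER_RE.findall(ocr_text)]


if __name__ == "__main__":
    # Stand-in for text that an OCR engine might return for one image.
    sample = "Reading 1: 42  Reading 2: 3.5"
    print(extract_numbers(sample))
```

OCR output tends to contain misread characters (a letter O for a zero, for instance), so the results are worth sanity-checking before further processing.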
Jul 18 '05 #2
Henrik Berg Nielsen <hb*@imada.sdu.dk> spake thusly:

> I need to extract some text (well, numbers actually) from a bunch of
> similar-looking .png images. After extraction, the numbers will be fed
> to a Python script for further processing. Any good ideas on how to go
> about this? I have no idea whatsoever how to extract the
> numbers from the images...

This might help you out...
http://www.pricelessware.org/2003/PL...tm#Convert-OCR

I'm not sure if it handles PNG; you might have to convert the files to TIFF
or BMP or something first.
--
Audio Bible Online:
http://www.audio-bible.com/
Jul 18 '05 #3
In article <wb*****************@news.get2net.dk>, Henrik Berg Nielsen wrote:
> Hi group!
>
> I need to extract some text (well, numbers actually) from a bunch of
> similar-looking .png images. After extraction, the numbers will be fed to a
> Python script for further processing. Any good ideas on how to go about
> this? I have no idea whatsoever how to extract the numbers from the
> images...

http://www.claraocr.org/

Jul 18 '05 #4
John> OCR is the TLA you're looking for ("Optical Character Recognition").

John> Dunno if there are any good free OCR engines. With these sorts of
John> hard algorithms, you tend to get what you pay for.

Which often means there's a piece of free software out there which works
better than the most expensive commercial solutions. <wink>

A little googling suggests this might be a candidate:

http://www.claraocr.org/

I have no idea if there's an exported library and/or a Python wrapper, but
it's probably worth a look.

Skip

Jul 18 '05 #5
"Henrik Berg Nielsen" <hb*@imada.sdu.dk> wrote:

> I need to extract some text (well, numbers actually) from a bunch of
> similar-looking .png images. After extraction, the numbers will be fed to a
> Python script for further processing. Any good ideas on how to go about
> this? I have no idea whatsoever how to extract the numbers from the
> images...


Are you hoping to extract the "password" characters from the pictures
presented by the whois checks? If so, you should give up now, because
those images are SPECIFICALLY designed to make them almost impervious to
automated recognition.
--
- Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Jul 18 '05 #6
Henrik Berg Nielsen wrote:
> Hi group!
>
> I need to extract some text (well, numbers actually) from a bunch of
> similar-looking .png images. After extraction, the numbers will be fed to a
> Python script for further processing. Any good ideas on how to go about
> this? I have no idea whatsoever how to extract the numbers from the
> images...


Hi,
I'm dealing with a similar problem now. My pictures are very complicated
(construction drawings). I am trying to use Gamera
(http://dkc.jhu.edu/gamera/) for OCR, and it seems very promising.

--
-- Lukas
Jul 18 '05 #7
On Wed, 01 Oct 2003 20:25:45 -0700, Tim Roberts <ti**@probo.com> wrote:
> "Henrik Berg Nielsen" <hb*@imada.sdu.dk> wrote:
>
>> I need to extract some text (well, numbers actually) from a bunch of
>> similar-looking .png images. After extraction, the numbers will be fed to a
>> Python script for further processing. Any good ideas on how to go about
>> this? I have no idea whatsoever how to extract the numbers from the
>> images...
>
> Are you hoping to extract the "password" characters from the pictures
> presented by the whois checks? If so, you should give up now, because
> those images are SPECIFICALLY designed to make them almost impervious to
> automated recognition.

Sounds interesting as a problem, but I wouldn't want to create a skeleton key
for any bad guys ;-)

Regards,
Bengt Richter
Jul 18 '05 #8
