Extracting text from .png images

Hi group!

I need to extract some text (well numbers actually) from a bunch of
similarly looking .png images. After extraction the numbers will be fed to a
Python script for further processing. Any good ideas on how to go about with
this? I have no idea whatsoever about how to extract the numbers out of the
images...

Thanks in advance,

Henrik
Jul 18 '05 #1
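The last step described above (feeding the numbers to a Python script) does not depend on which tool ends up reading the images. A minimal sketch, assuming the recognition step yields plain text; `extract_numbers` and the sample string are made up for illustration and are not from the original post:

```python
import re

# Matches optionally signed integers and decimals such as 42, -7 or 3.14.
NUMBER_RE = re.compile(r"[-+]?\d+(?:\.\d+)?")


def extract_numbers(text):
    """Return every number found in the text as a float, in order of appearance."""
    return [float(match) for match in NUMBER_RE.findall(text)]
```

For example, `extract_numbers("Total: 42 items, 3.5 kg")` returns `[42.0, 3.5]`. Recognition errors (say, a letter O read in place of a zero) would still need handling before the values are trusted.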
"Henrik Berg Nielsen" <hb*@imada.sdu.dk> writes:
> I need to extract some text (well numbers actually) from a bunch of
> similarly looking .png images. After extraction the numbers will be fed to a
> Python script for further processing. Any good ideas on how to go about with
> this? I have no idea whatsoever about how to extract the numbers out of the
> images...


OCR is the TLA you're looking for ("Optical Character Recognition").

Dunno if there are any good free OCR engines. With these sorts of
hard algorithms, you tend to get what you pay for.
John
Jul 18 '05 #2
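Once an OCR engine has been picked, calling it from Python can be as simple as shelling out to it. A minimal sketch, assuming the engine is a command-line tool that prints the recognized text to standard output. The default command uses Tesseract's `tesseract IMAGE stdout` form purely as an illustration (Tesseract is not mentioned in this thread and must be installed separately); `ocr_image` and the `{image}` placeholder are hypothetical names:

```python
import subprocess
import sys

# Assumption: an OCR command-line tool is installed and on PATH.
# "tesseract IMAGE stdout" prints the recognized text to standard output;
# any other tool that writes text to stdout can be passed in via `command`.
DEFAULT_COMMAND = ("tesseract", "{image}", "stdout")


def ocr_image(image_path, command=DEFAULT_COMMAND):
    """Run an external OCR command on one image and return its text output."""
    # Substitute the image path into every argument that contains the placeholder.
    argv = [part.replace("{image}", str(image_path)) for part in command]
    # check=True raises CalledProcessError if the tool exits with an error.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout
```

Keeping the command as a parameter means the engine can be swapped without touching the rest of the script, which seems useful while it is still unclear which free engine works well enough.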
Henrik Berg Nielsen <hb*@imada.sdu.dk> spake thusly:

> I need to extract some text (well numbers actually) from a bunch of
> similarly looking .png images. After extraction the numbers will be fed
> to a Python script for further processing. Any good ideas on how to go
> about with this? I have no idea whatsoever about how to extract the
> numbers out of the images...

This might help you out...
http://www.pricelessware.org/2003/PL...tm#Convert-OCR

I'm not sure if it handles PNG; you might have to convert the files to
TIFF or BMP or something first.
--
Audio Bible Online:
http://www.audio-bible.com/
Jul 18 '05 #3
In article <wb*****************@news.get2net.dk>, Henrik Berg Nielsen wrote:
> Hi group!
>
> I need to extract some text (well numbers actually) from a bunch of
> similarly looking .png images. After extraction the numbers will be fed to a
> Python script for further processing. Any good ideas on how to go about with
> this? I have no idea whatsoever about how to extract the numbers out of the
> images...

http://www.claraocr.org/

Jul 18 '05 #4
John> OCR is the TLA you're looking for ("Optical Character Recognition").

John> Dunno if there are any good free OCR engines. With these sorts of
John> hard algorithms, you tend to get what you pay for.

Which often means there's a piece of free software out there which works
better than the most expensive commercial solutions. <wink>

A little googling suggests this might be a candidate:

http://www.claraocr.org/

I have no idea if there's an exported library and/or a Python wrapper, but
it's probably worth a look.

Skip

Jul 18 '05 #5
"Henrik Berg Nielsen" <hb*@imada.sdu.dk> wrote:

> I need to extract some text (well numbers actually) from a bunch of
> similarly looking .png images. After extraction the numbers will be fed to a
> Python script for further processing. Any good ideas on how to go about with
> this? I have no idea whatsoever about how to extract the numbers out of the
> images...


Are you hoping to extract the "password" characters from the pictures
presented by the whois checks? If so, you should give up now, because
those images are SPECIFICALLY designed to make them almost impervious to
automated recognition.
--
- Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Jul 18 '05 #6
Henrik Berg Nielsen wrote:
> Hi group!
>
> I need to extract some text (well numbers actually) from a bunch of
> similarly looking .png images. After extraction the numbers will be fed to a
> Python script for further processing. Any good ideas on how to go about with
> this? I have no idea whatsoever about how to extract the numbers out of the
> images...


Hi,
I'm dealing with a similar problem now. My pictures are very complicated
(construction drawings). I am trying to use Gamera
(http://dkc.jhu.edu/gamera/) for OCR and it seems very promising.

--
-- Lukas
Jul 18 '05 #7
On Wed, 01 Oct 2003 20:25:45 -0700, Tim Roberts <ti**@probo.com> wrote:
"Henrik Berg Nielsen" <hb*@imada.sdu.dk> wrote:

>> I need to extract some text (well numbers actually) from a bunch of
>> similarly looking .png images. After extraction the numbers will be fed to a
>> Python script for further processing. Any good ideas on how to go about with
>> this? I have no idea whatsoever about how to extract the numbers out of the
>> images...

> Are you hoping to extract the "password" characters from the pictures
> presented by the whois checks? If so, you should give up now, because
> those images are SPECIFICALLY designed to make them almost impervious to
> automated recognition.

Sounds interesting as a problem, but I wouldn't want to create a skeleton key
for any bad guys ;-)

Regards,
Bengt Richter
Jul 18 '05 #8
