Rendering text question (context is MSWin UI Automation)

Hello,

I am trying to use UI Automation to drive an MS Windows app (with pywinauto).

I need to scrape the app's window contents and use some form of OCR to get at
the texts (pywinauto can't get at them).

As an alternative to integrating an OCR engine, and since I know the fonts and
sizes used to write on the app's windows, I reasoned that I could base a simple
text recognition module on the capability to drive MS Windows text rendering, e.g.
to generate pixmaps of the texts I expect to find in the driven app's windows,
exact to the pixel.

The advantage of that approach would be exactitude and self-containment.

I've verified manually inside an Idle window, that indeed I could produce
pixmaps of expected app texts, exact to the pixel (with Tkinter+screen capture
at least).

I could use help to turn this into a programmable capability: is there a simple
way (with Tkinter or otherwise) to wrap access to the MS Windows UI text
rendering engine as a function that would return a picture of rendered text,
given a string, a font, a size and colors?

And ideally, without interfering with screen contents?

Thanks in advance for any guidance,

Boris Borcic
Jan 23 '07 #1
12 Replies


There are actually several different text rendering methods (and 2 or
more totally different engines) and they will give different results,
so if you want a fully generic solution that could be quite difficult.
However, it sounds like this is for a specific purpose.

Using the pywin32 modules to directly access the appropriate Windows
API calls will be the most accurate. It will be fairly complicated and
you'll need some knowledge of the win32 API to do it. You could also use
wxPython, which uses what will probably be the right API and will take
less code than win32 will. I'd suggest this if you aren't familiar
with the win32 API.

PyQt uses its own text rendering engine, as far as I know, so it is
less likely to generate correct bitmaps. I'm not sure at what level
Tkinter's text drawing is done.

Using either win32 or wxPython you will be able to produce bitmaps
directly, without needing to create a visible window.
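Some quick & dirty wxPython code:

    import wx

    def getTextBitmap(text, font, fgcolor, bgcolor):
        # A wx.App must already exist before any DC or bitmap is created.
        dc = wx.MemoryDC()
        dc.SetFont(font)
        width, height = dc.GetTextExtent(text)
        bmp = wx.EmptyBitmap(width, height)
        dc.SelectObject(bmp)
        # Fill the bitmap with the background colour, then draw the text.
        dc.SetBackground(wx.Brush(bgcolor))
        dc.Clear()
        dc.SetTextBackground(bgcolor)
        dc.SetTextForeground(fgcolor)
        dc.DrawText(text, 0, 0)
        # Deselect the bitmap so it can be used outside the DC.
        dc.SelectObject(wx.NullBitmap)
        return bmp
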
Raw win32 code will look similar but will be much more verbose.
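
Once a reference bitmap of the expected text exists, recognising that text comes down to looking for the bitmap inside a screen capture. Here is a rough pure-Python sketch of that search. It assumes both images have already been converted to lists of rows of (r, g, b) tuples (the conversion from a wx bitmap or a screenshot is not shown). The name `find_template` and the tiny test images are made up for illustration.

```python
def find_template(screen, template):
    """Return the (x, y) positions where template matches screen pixel for pixel.

    Both arguments are lists of rows; each row is a list of (r, g, b) tuples.
    """
    th, tw = len(template), len(template[0])
    sh, sw = len(screen), len(screen[0])
    hits = []
    for y in range(sh - th + 1):
        for x in range(sw - tw + 1):
            # Compare the template row by row against the window at (x, y).
            if all(screen[y + dy][x:x + tw] == template[dy] for dy in range(th)):
                hits.append((x, y))
    return hits


WHITE, BLACK = (255, 255, 255), (0, 0, 0)

# Stand-in for a rendered text bitmap: a 2x2 block with one black pixel.
template = [[BLACK, WHITE],
            [WHITE, WHITE]]

# Stand-in for a screen capture: a 4x3 white area with the block at (2, 1).
screen = [[WHITE] * 4 for _ in range(3)]
screen[1][2] = BLACK

print(find_template(screen, template))
```

This brute-force scan is only meant to show the idea. It will be slow on full-screen captures, and it only finds matches that are exact to the pixel.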
Jan 23 '07 #2

I was looking for (and still am searching for) similar functionality.
Specifically, I would like to be able to capture a small area of the
screen (a number or a code) and convert this to text that can be used
in my application.

When I asked my question, I was directed to the Microsoft Accessibility
tool kit.
Search on this list for the post titled:
"Reading text labels from a Win32 window"

I work with wxPython and Win32 applications exclusively.

So if I can be of any help or assistance, please let me know.

Geoff.

Jan 23 '07 #3

The OP stated that pywinauto couldn't get at the text, so it's
probably drawn directly with GDI methods rather than being a static
text control. The accessibility toolkit only works if it's a static
text control or the application goes to some lengths to expose the
text to screen readers.
Jan 23 '07 #4

imageguy wrote:
> I was looking for (and still am searching for) similar functionality.
> Specifically I would like to be able to capture a small area of the
> screen (a number or a code) and convert this to text that can be used
> in my application.

There is a Windows executable version of gnu ocr at
http://jocr.sourceforge.net/download.html that (in combination with the screen
capture capability that pywinauto distributes) can sort of do that. One issue is
that it's not exceedingly accurate; for instance, it recognizes "2" as "1" (in
the font that, er, counts for me). I could probably manage such imprecisions, but
I would rather have an exact solution.

> I work with wxPython and Win32 applications exclusively.
>
> So if I can be of any help or assistance, please let me know.
>
> Geoff.

Thanks for the offer, I will keep it in mind,

Boris Borcic

Jan 24 '07 #5

Chris Mellon wrote:
> On 1/23/07, Boris Borcic <bb*****@gmail.com> wrote:
>> ...A simple -
>> with Tkinter or otherwise - way to wrap access to the MS Windows UI text
>> rendering engine, as a function that would return a picture of rendered text,
>> given a string, a font, a size and colors ?
>
> There are actually several different text rendering methods (and 2 or
> more totally different engines) and they will give different results,
> so if you want a fully generic solution that could be quite difficult.
> However, it sounds like this is for a specific purpose.

Indeed.

> ...You could also use
> wxPython, which uses what will probably be the right API and will take
> less code than win32 will. I'd suggest this if you aren't familiar
> with the win32 API.

Thanks for your guidance and quick code, I am going to try that.

Boris Borcic

Jan 24 '07 #6

Thx again for this base.

Quickly testing this, it appears that the result is rendered half a pixel off in
the x-direction. Does this make sense? Is it possible to position text with
subpixel accuracy?

Regards, Boris Borcic

Jan 24 '07 #7

The GDI text API, which is what wx is wrapping here, only provides
pixel accuracy. You are probably seeing a kerning effect from your
chosen font and perhaps the effects of ClearType.
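
To illustrate why subpixel rendering can look like a fractional-pixel shift, here is a toy model in plain Python. It is only a sketch under simplifying assumptions: each pixel is treated as three independent subpixels in R, G, B order, and no filtering is applied, which is not how ClearType actually processes glyphs. The function name `render_stroke` is made up for this example.

```python
def render_stroke(width_px, start_subpixel, stroke_subpixels=3):
    """Render one row holding a black stroke on white, at subpixel resolution.

    Each pixel is modelled as three subpixels in R, G, B order; the stroke
    switches off `stroke_subpixels` consecutive subpixels starting at
    index `start_subpixel`.
    """
    subpixels = [255] * (width_px * 3)
    for i in range(start_subpixel, start_subpixel + stroke_subpixels):
        subpixels[i] = 0
    # Regroup the flat subpixel row into (r, g, b) pixels.
    return [tuple(subpixels[i:i + 3]) for i in range(0, len(subpixels), 3)]


# Stroke aligned with the pixel grid: it covers exactly pixel 1.
aligned = render_stroke(4, start_subpixel=3)
# Same stroke moved right by one subpixel (a third of a pixel).
shifted = render_stroke(4, start_subpixel=4)

print(aligned)
print(shifted)
```

In this model the aligned stroke is a single black pixel, while the shifted stroke turns into a red pixel followed by a cyan one, even though every coordinate handed to the drawing call is a whole pixel.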
Jan 24 '07 #8

I am not. Turning antialiasing off (as a desktop setting) changes the rendering,
but wx._gdi_ still insists that horizontal coordinates fall between pixels (unlike
vertical coordinates). This means thin black vertical lines are rendered as two
pixel columns, the left one red, the right one cyan. Non-aliased text rotated by
90 degrees is still smeared left-to-right on the screen, which is top-to-bottom
relative to the text. Setting the scales to 0.5 and drawing the text one pixel
off (to express a half-pixel shift) doesn't work. A long, almost vertical thin
black line that's one pixel off from top to bottom results in two parallel
vertical, uniformly colored red and cyan pixel columns, broken in the middle.

In short, wx._gdi_ fights quite hard to enforce what I am trying to avoid :( I
might admire its consistency if it extended to treating both axes similarly...

Regards, Boris Borcic

Jan 25 '07 #9

I have not recently had a need to examine drawn text output this
closely, but I am familiar with the C++ code that implements the
drawing and it's a direct wrapping of win32 GDI calls. If it's not
matching your source text, then the source may be drawn using a
different method or using one of the alternate engines, like GDI+.
Jan 25 '07 #10

Maybe. In any case, color separation solves my (sub)problem: the blue layer
from the wx-generated model matches the green layer from the app's window, pixel
for pixel (at least with antialiasing and ClearType on, while writing black on
white).

Best, Boris Borcic
Jan 25 '07 #11

That's... extremely interesting. If it works for you, go for it! If
you're interested in some other things to try, wx.EmptyBitmap takes a
depth parameter you can use to eliminate color.
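
The depth parameter is the wx-side route. As a rough after-the-fact analogue (not what wx does internally), colour can also be dropped from pixel data that has already been captured by thresholding each pixel on its average brightness. This sketch assumes the image is a list of rows of (r, g, b) tuples. The name `to_monochrome` and the threshold of 128 are arbitrary choices for illustration.

```python
def to_monochrome(pixels, threshold=128):
    """Map each (r, g, b) pixel to 1 (white) or 0 (black) by average brightness."""
    return [[1 if (r + g + b) / 3 >= threshold else 0 for (r, g, b) in row]
            for row in pixels]


# One row: white, a red fringe, a cyan fringe, black.
row = [[(255, 255, 255), (255, 0, 0), (0, 255, 255), (0, 0, 0)]]
print(to_monochrome(row))
```

Whether the resulting black-and-white image lines up with the app's window depends on how the coloured fringes fall on either side of the threshold, so this is something to experiment with rather than a guaranteed fix.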
Jan 25 '07 #12

Chris Mellon wrote:
>> Maybe. In any case, color separation solves my (sub)problem : the blue layer
>> from the wx generated model matches the green layer from the app's window, pixel
>> for pixel (at least with antialiasing and cleartype on, while writing black on
>> white).
>
> That's... extremely interesting.

Difficult to believe, you mean :) Well, you are right, somehow I mixed up layer
colors; in the end I compare just the blue layers and it does what I wanted.
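
A minimal sketch of that kind of comparison, assuming both the wx-generated model and the captured window region are already available as lists of rows of (r, g, b) tuples. The helper names and the pixel values below are made up for illustration.

```python
def blue_layer(pixels):
    """Keep only the blue component of each (r, g, b) pixel."""
    return [[b for (_r, _g, b) in row] for row in pixels]


def same_blue_layer(a, b):
    """True if the two images agree pixel for pixel in their blue layers."""
    return blue_layer(a) == blue_layer(b)


# Two one-row images that differ in red and green but share the blue layer.
model = [[(255, 255, 255), (255, 0, 0), (0, 255, 255)]]
capture = [[(255, 255, 255), (0, 0, 0), (255, 255, 255)]]

print(model == capture)
print(same_blue_layer(model, capture))
```

Comparing a single layer ignores the colour fringes that make the full RGB images differ, which is why it can succeed where a straight pixel comparison fails.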

Cheers, BB

Jan 26 '07 #13
