Using VBA to determine Adobe Check Box state

twinnyfo
Expert Mod 2GB
Hello Friends,

Here's one that might have a very simple answer of "No," or it could be a relatively simple solution and I just don't know what to look for, or it could be complex. So, here is the question:

Is it possible to use VBA to determine the state of a check box in an Adobe Acrobat Form?

Further details: We use our Database to extract data from standardized PDF forms. I use virtual keys to simulate tabbing through the form and copying and pasting data from the form and saving it to the database. This is not my favorite way of doing this, but it works, although with occasional breakages which I am able to manage mostly through error handling.

The following restrictions apply, making any other options impossible (that I know of):
  • The PDFs are "SECURE" PDFs
  • We do not "own" the forms, so we can't modify the forms to suit our needs.
The "SECURE" PDFs prevent us from downloading the data to a CSV file. This would work perfectly for us, and would certainly be the preferred method for us, because we have to import several hundred of these forms at a time, and the CSV file would contain the file name and all data. However, the SECURE forms only export the file name to the CSV file and nothing else.

SECURE PDFs also prevent us from opening the PDF as an object within the DB. I've checked with the Adobe Acrobat SDK and played around with this over the years and every time we open a SECURE PDF, the object methods fail, as the form itself prevents us from having access to the data.

Using our method of virtual keys, we can gather just about everything we need from these forms and import it into our DB. The only thing we are missing is a few check boxes on the forms.

Using the virtual keys, it is possible to manipulate these check boxes. For example, I can simulate checking and un-checking these check boxes all day long. However, since I am unable to check the current state of the check box, I can never glean the value of the check box as it is sent to us.

I am able to add data to the clipboard for all the other data fields on the form and I am able to clear the clipboard. As an example of what I have tried so far, I have cleared the clipboard, navigated to the check box and tried to "add" data to the clipboard. Then, I check to see if there is anything in the clipboard, and there is not. So, obviously, there is no "data" behind the current location of the field on the form.

But, is it possible to somehow "test" or determine the current state of a check box on a PDF if that check box currently has the focus on the form?

As mentioned above, the answer might be a simple, "No!"

I could also be missing something simple that I haven't tried because I don't know about it.

It could be something rather more involved, but if it works, that might suit my needs.

I did not attach any forms as they are proprietary and contain personal data. And, this question would apply to any Adobe Acrobat PDF Form (AFAIK).

I'm glad to answer any other questions regarding this.
1 Week Ago #1

✓ answered by Rabbit

Finally found some free time. You can use this to capture a screenshot of a window with a given title. The next step is to write code that generates similar data by reading a bitmap file; these files will store the checkbox template images you're looking for in the screenshot.

Option Compare Database
Option Explicit


Private Type BITMAPINFOHEADER
    biSize As Long
    biWidth As Long
    biHeight As Long
    biPlanes As Integer
    biBitCount As Integer
    biCompression As Long
    biSizeImage As Long
    biXPelsPerMeter As Long
    biYPelsPerMeter As Long
    biClrUsed As Long
    biClrImportant As Long
End Type


' Mirrors the 4-byte Win32 RGBQUAD layout (blue, green, red, reserved)
Private Type COLORQUAD
    B As Byte
    G As Byte
    R As Byte
    A As Byte
End Type


Private Type BITMAPINFO
    bmiHeader As BITMAPINFOHEADER
    bmiColors As COLORQUAD
End Type


Private Type BITMAPDATA
    width As Long
    height As Long
    bmiData() As Byte
End Type


Private Type RECT
    Left As Long
    Top As Long
    Right As Long
    Bottom As Long
End Type


Private Const BI_RGB = 0
Private Const SRCCOPY = &HCC0020
Private Const DIB_RGB_COLORS = 0
Private Const CF_BITMAP = 2


' Win32 BOOL results are declared As Long (0 means failure); handles are LongPtr
Private Declare PtrSafe Function CloseClipboard Lib "user32" () As Long
Private Declare PtrSafe Function EmptyClipboard Lib "user32" () As Long
Private Declare PtrSafe Function FindWindowA Lib "user32" (ByVal lpClassName As String, ByVal lpWindowName As String) As LongPtr
Private Declare PtrSafe Function GetClientRect Lib "user32" (ByVal hWnd As LongPtr, lpRect As RECT) As Long
Private Declare PtrSafe Function GetDC Lib "user32" (ByVal hWnd As LongPtr) As LongPtr
Private Declare PtrSafe Function GetDesktopWindow Lib "user32" () As LongPtr
Private Declare PtrSafe Function OpenClipboard Lib "user32" (ByVal hWndNewOwner As LongPtr) As Long
Private Declare PtrSafe Function ReleaseDC Lib "user32" (ByVal hWnd As LongPtr, ByVal hdc As LongPtr) As Long
Private Declare PtrSafe Function SetClipboardData Lib "user32" (ByVal wFormat As Long, ByVal hMem As LongPtr) As LongPtr

Private Declare PtrSafe Function BitBlt Lib "gdi32" (ByVal hdc As LongPtr, ByVal x As Long, ByVal y As Long, ByVal cx As Long, ByVal cy As Long, ByVal hdcSrc As LongPtr, ByVal x1 As Long, ByVal y1 As Long, ByVal rop As Long) As Long
Private Declare PtrSafe Function CreateCompatibleBitmap Lib "gdi32" (ByVal hdc As LongPtr, ByVal cx As Long, ByVal cy As Long) As LongPtr
Private Declare PtrSafe Function CreateCompatibleDC Lib "gdi32" (ByVal hdc As LongPtr) As LongPtr
Private Declare PtrSafe Function DeleteDC Lib "gdi32" (ByVal hdc As LongPtr) As Long
Private Declare PtrSafe Function DeleteObject Lib "gdi32" (ByVal hObject As LongPtr) As Long
Private Declare PtrSafe Function GetDIBits Lib "gdi32" (ByVal hdc As LongPtr, ByVal hBitmap As LongPtr, ByVal nStartScan As Long, ByVal nNumScans As Long, lpBits As Any, lpBI As BITMAPINFO, ByVal wUsage As Long) As Long
Private Declare PtrSafe Function SelectObject Lib "gdi32" (ByVal hdc As LongPtr, ByVal hObject As LongPtr) As LongPtr


Sub Test()
    Dim acrobatHwnd As LongPtr
    Dim lpBitmap As BITMAPDATA

    acrobatHwnd = FindWindowA(vbNullString, "SAS INSTITUTE TRADEMARKS (SECURED) - Adobe Acrobat Reader DC")
    If acrobatHwnd = 0 Then
        Debug.Print "Window not found, check the title passed to FindWindowA"
        Exit Sub
    End If

    lpBitmap = Screenshot(acrobatHwnd)
    Debug.Print "Done, do a paste into mspaint to see what was screenshotted"
End Sub


' Private because it returns a Private Type
Private Function Screenshot(ByVal hWnd As LongPtr) As BITMAPDATA
    Dim result As BITMAPDATA
    Dim hdcWindow As LongPtr
    Dim rcClient As RECT
    Dim hdcMemDC As LongPtr
    Dim hbmScreen As LongPtr
    Dim hbmOld As LongPtr
    Dim windowWidth As Long, windowHeight As Long
    Dim bi As BITMAPINFO
    Dim dwBmpSize As Long

    ' Get window info
    hdcWindow = GetDC(hWnd)
    GetClientRect hWnd, rcClient
    windowWidth = rcClient.Right - rcClient.Left
    windowHeight = rcClient.Bottom - rcClient.Top

    ' Nothing to capture, e.g. the window is minimized
    If windowWidth <= 0 Or windowHeight <= 0 Then
        ReleaseDC hWnd, hdcWindow
        Exit Function
    End If

    ' Create a compatible drawing context and bitmap
    hdcMemDC = CreateCompatibleDC(hdcWindow)
    hbmScreen = CreateCompatibleBitmap(hdcWindow, windowWidth, windowHeight)
    hbmOld = SelectObject(hdcMemDC, hbmScreen)

    ' Create bitmap info (a positive height means the rows come back bottom-up)
    bi.bmiHeader.biSize = 40
    bi.bmiHeader.biWidth = windowWidth
    bi.bmiHeader.biHeight = windowHeight
    bi.bmiHeader.biPlanes = 1
    bi.bmiHeader.biBitCount = 32
    bi.bmiHeader.biCompression = BI_RGB
    bi.bmiHeader.biSizeImage = 0
    bi.bmiHeader.biXPelsPerMeter = 0
    bi.bmiHeader.biYPelsPerMeter = 0
    bi.bmiHeader.biClrUsed = 0
    bi.bmiHeader.biClrImportant = 0

    ' Bit block transfer into our compatible memory DC
    BitBlt hdcMemDC, 0, 0, windowWidth, windowHeight, hdcWindow, 0, 0, SRCCOPY

    ' GetDIBits expects the bitmap not to be selected into a DC
    SelectObject hdcMemDC, hbmOld

    ' Get the "bits" from the bitmap and copy them into a byte array
    dwBmpSize = ((windowWidth * bi.bmiHeader.biBitCount + 31) \ 32) * 4 * windowHeight
    result.width = windowWidth
    result.height = windowHeight
    ReDim result.bmiData(0 To dwBmpSize - 1)
    GetDIBits hdcWindow, hbmScreen, 0, windowHeight, result.bmiData(0), bi, DIB_RGB_COLORS

    ' Copy screenshot to clipboard for verification purposes, not needed in production.
    ' On success the clipboard owns the bitmap, so only delete it ourselves on failure.
    If Not copyBitmapToClipboard(hbmScreen) Then DeleteObject hbmScreen

    ' Clean up (a memory DC is released with DeleteDC, not DeleteObject)
    DeleteDC hdcMemDC
    ReleaseDC hWnd, hdcWindow

    Screenshot = result
End Function


' Returns True when the clipboard accepted (and now owns) the bitmap
Private Function copyBitmapToClipboard(ByVal hBitmap As LongPtr) As Boolean
    If OpenClipboard(0) = 0 Then Exit Function
    EmptyClipboard
    copyBitmapToClipboard = (SetClipboardData(CF_BITMAP, hBitmap) <> 0)
    CloseClipboard
End Function
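For the template side, a minimal sketch of reading a .bmp file into the same kind of byte array might look like the following. LoadBmp32 is a made-up name, not part of the code above, and it assumes the template was saved as an uncompressed 24-bit or 32-bit bitmap (Paint's "24-bit Bitmap" option is one way to get that). It pads 24-bit pixels out to 4 bytes so the result lines up with the 32-bit screenshot data.

```vba
Option Explicit

' Hypothetical helper: loads an uncompressed 24-bit or 32-bit .bmp file into a
' 32-bit pixel array (4 bytes per pixel, rows in the order stored in the file,
' which is normally bottom-up, the same order GetDIBits returns above).
Public Sub LoadBmp32(ByVal path As String, ByRef pixels() As Byte, _
                     ByRef bmpWidth As Long, ByRef bmpHeight As Long)
    Dim f As Integer
    Dim offBits As Long, rawHeight As Long, compression As Long
    Dim bitCount As Integer
    Dim raw() As Byte
    Dim srcStride As Long, bytesPerPixel As Long
    Dim x As Long, y As Long, s As Long, d As Long

    If Dir(path) = "" Then Err.Raise 53, "LoadBmp32", "File not found: " & path

    f = FreeFile
    Open path For Binary Access Read As #f
    Get #f, 11, offBits          ' bfOffBits: file offset of the pixel data
    Get #f, 19, bmpWidth         ' biWidth
    Get #f, 23, rawHeight        ' biHeight (negative would mean top-down rows)
    Get #f, 29, bitCount         ' biBitCount
    Get #f, 31, compression      ' biCompression (0 = BI_RGB)

    If compression <> 0 Or (bitCount <> 24 And bitCount <> 32) Then
        Close #f
        Err.Raise vbObjectError + 513, "LoadBmp32", _
                  "Only uncompressed 24-bit or 32-bit bitmaps are handled"
    End If

    bmpHeight = Abs(rawHeight)
    bytesPerPixel = bitCount \ 8
    srcStride = ((bmpWidth * bitCount + 31) \ 32) * 4   ' file rows are padded to 4 bytes

    ReDim raw(0 To srcStride * bmpHeight - 1)
    Get #f, offBits + 1, raw     ' Get positions are 1-based
    Close #f

    ' Copy blue, green, red for each pixel; the 4th byte is left at 0
    ReDim pixels(0 To bmpWidth * bmpHeight * 4 - 1)
    For y = 0 To bmpHeight - 1
        For x = 0 To bmpWidth - 1
            s = y * srcStride + x * bytesPerPixel
            d = (y * bmpWidth + x) * 4
            pixels(d) = raw(s)
            pixels(d + 1) = raw(s + 1)
            pixels(d + 2) = raw(s + 2)
        Next x
    Next y
End Sub
```

The 4th byte of each pixel is not meaningful in either array, so any comparison should look at the first three bytes only.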
Once we have the two sets of data, we can write the code to compare the screenshot against the template data. Basically, you iterate over every pixel position in the screenshot and, over the same area as the template image, sum the squared differences in the pixel values to generate a match score; the lower the score, the closer the match. In pseudocode, something like this:
' Slide the template over every position where it fits entirely inside the screenshot
For xi = 0 To screenshotWidth - templateWidth
   For yi = 0 To screenshotHeight - templateHeight
      matchValue = 0

      ' Sum of squared differences: 0 is a perfect match, larger is worse
      For xt = 0 To templateWidth - 1
         For yt = 0 To templateHeight - 1
            matchValue = matchValue + (templateData(xt, yt) - screenshotData(xi + xt, yi + yt)) ^ 2
         Next
      Next

      If matchValue <= thresholdValue Then
         ' Match found at (xi, yi), do something
      End If
   Next
Next
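As a rough VBA version of the pseudocode above, the sketch below works directly on flat byte arrays. FindTemplate and RegionScore are hypothetical names, and the code assumes both arrays hold 4 bytes per pixel (blue, green, red, unused) with rows stored in the same order, as a 32-bit GetDIBits capture does. It keeps the lowest-scoring position and reports a match only if that score is at or below a caller-supplied limit.

```vba
Option Explicit

' Hypothetical helper: sum of squared differences between the template and the
' screenshot region whose first stored pixel is at column xi, row yi.
Private Function RegionScore(shot() As Byte, ByVal shotW As Long, _
                             tmpl() As Byte, ByVal tmplW As Long, ByVal tmplH As Long, _
                             ByVal xi As Long, ByVal yi As Long) As Double
    Dim xt As Long, yt As Long, c As Long
    Dim si As Long, ti As Long, d As Long
    Dim score As Double

    For yt = 0 To tmplH - 1
        For xt = 0 To tmplW - 1
            si = ((yi + yt) * shotW + (xi + xt)) * 4
            ti = (yt * tmplW + xt) * 4
            For c = 0 To 2      ' blue, green, red; the 4th byte is ignored
                ' CLng avoids a Byte overflow when the difference is negative
                d = CLng(shot(si + c)) - CLng(tmpl(ti + c))
                score = score + d * d
            Next c
        Next xt
    Next yt
    RegionScore = score
End Function

' Hypothetical search: returns True and the best position when the lowest score
' found is at or below maxScore (0 means every compared byte is identical).
Public Function FindTemplate(shot() As Byte, ByVal shotW As Long, ByVal shotH As Long, _
                             tmpl() As Byte, ByVal tmplW As Long, ByVal tmplH As Long, _
                             ByVal maxScore As Double, _
                             ByRef foundX As Long, ByRef foundY As Long) As Boolean
    Dim xi As Long, yi As Long
    Dim score As Double, best As Double

    best = -1                   ' -1 means nothing has been scored yet
    For yi = 0 To shotH - tmplH
        For xi = 0 To shotW - tmplW
            score = RegionScore(shot, shotW, tmpl, tmplW, tmplH, xi, yi)
            If best < 0 Or score < best Then
                best = score
                foundX = xi
                foundY = yi
            End If
        Next xi
    Next yi
    FindTemplate = (best >= 0 And best <= maxScore)
End Function
```

A brute-force scan like this visits every position, so on a full-window capture it is likely to be slow in VBA; limiting the scan to the part of the capture where the check boxes are expected would cut the work considerably. Because the rows are bottom-up, foundY counts stored rows from the bottom of the image.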

  • viewed: 2383
47 Replies
isladogs
Expert 64KB
Hi @twinnyfo
I have absolutely no idea as I've never needed to do this....
However, I wonder if you've seen this article (and example apps) by Access MVP, theDBGuy : Fillable PDF Forms. Possibly useful?
1 Week Ago #2
twinnyfo
Expert Mod 2GB
That was one of the few sites I had not seen yet. He covers a different approach to filling PDFs, but not extracting data--and does not touch on check boxes. I did contact the author to see if he has any insight into my quandary.

Thanks for the reference. If I find anything out, I will certainly post it here.
1 Week Ago #3
Rabbit
Expert Mod 8TB
I haven't worked programmatically with PDFs so what follows might just be naive rambling.

Presumably, the Adobe SDK or API allows one to create secure PDFs by use of some sort of password, which should also mean it allows one to remove that security and/or open a secure PDF by providing that password.

However, since that PDF does not belong to you, I'm guessing you don't have the password for that level of access to the PDF. This leaves you hamstrung in the sense that you don't have "proper" programmatic access to the info you need.

I don't know the file spec of a PDF but if the security was properly implemented, you won't be able to bypass it using binary level manipulations. For example, an excel workbook has both document level protection and worksheet level protection. An excel workbook is really just a zip file of a bunch of XML documents, you can open them in a zip program if you wanted. The document level protection, for xlsx files anyways, uses AES encryption, there's no getting around that.

The worksheet level protection, however, is not implemented using encryption. It's just an attribute on an XML node. What this means is you can remove worksheet level protection by unzipping the excel workbook, finding the XML file for that worksheet, removing the protected attribute from the XML node, and rezipping the file.

Assuming all the above, that is, you don't have the password to bypass the secure portion of the PDF and that the security of the PDF is properly implemented, that leaves you with screen scraping as a possible solution.

What follows is a very high level overview of a very convoluted process in which to accomplish this. Let me know if you would like to pursue it further.
  1. Use whatever method you prefer to capture a screenshot
  2. Feed this screenshot along with a template image of a checked checkbox to the OpenCV library
  3. Use OpenCV to run a template match to find said template image within the screenshot, if it exists within the screenshot

This isn't the only way of accomplishing the task, but it's the way I can envision it working by piecing together techniques I've used for other tasks.

For example, there's a program called autohotkey that allows you to script interactions with the GUI such that you can use it to accomplish the steps above. I haven't used autohotkey before but if you were to learn it, it might be easier than attempting the steps I outlined.

There's also a Windows UI Automation DLL that you could potentially use to access the checkbox directly by making windows API calls. I've never done this but I could see it working as long as the checkboxes in a PDF are windows GUI checkboxes.
1 Week Ago #4
twinnyfo
Expert Mod 2GB
Rabbit,

Thanks for the thoughts--as usual. Your screenshot idea sounds interesting, but with our super-slow and highly security saturated systems, that may be a non-starter. And, as you say, highly convoluted. Not sure the effort would be worth the benefit for what we need this for.

Let me do some exploring into the UI DLL.
1 Week Ago #5
Rabbit
Expert Mod 8TB
A good way to see if the UI automation DLL will be able to identify the checkbox is to run the Narrator accessibility tool, I believe it uses that DLL to access the window elements. If Narrator can identify and read the value of the checkbox, then that could point to the UI automation API as a potentially good solution.

Is the objection to the other method an objection to taking a screenshot, or to the use of a third-party program or library to read the screenshot? I can't speak to AutoHotkey, but as far as OpenCV is concerned, you could call it as a local JavaScript library if that helps ease any concerns.
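If Narrator does read the check box, a minimal sketch of asking UI Automation for the state of whatever currently has keyboard focus might look like this. It needs a reference to the UIAutomationClient library, FocusedToggleState is a made-up name, and it assumes the focused form field exposes the Toggle pattern, which is not guaranteed for fields inside an Acrobat form.

```vba
Option Explicit

' Hypothetical helper. Returns "On", "Off" or "Indeterminate", or an empty
' string when the focused element does not expose the Toggle pattern.
Public Function FocusedToggleState() As String
    Dim oCUI As New CUIAutomation
    Dim oFocused As IUIAutomationElement
    Dim oToggle As IUIAutomationTogglePattern

    ' The element that currently has keyboard focus, in any application
    Set oFocused = oCUI.GetFocusedElement
    If oFocused Is Nothing Then Exit Function

    ' Nothing comes back when the element has no Toggle pattern
    Set oToggle = oFocused.GetCurrentPattern(UIA_TogglePatternId)
    If oToggle Is Nothing Then Exit Function

    Select Case oToggle.CurrentToggleState
        Case ToggleState_On
            FocusedToggleState = "On"
        Case ToggleState_Off
            FocusedToggleState = "Off"
        Case Else
            FocusedToggleState = "Indeterminate"
    End Select
End Function
```

It would have to be called while the PDF window still has focus, for example right after the virtual-key code has tabbed to the check box, because running it from the VBA editor reports on the editor itself.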
1 Week Ago #6
twinnyfo
Expert Mod 2GB
Rabbit! Buddy! Friend!

Using DLLs is already at the fringe of my programming universe! The other stuff you're talking about is the LUNATIC FRINGE!

I still feel like a total NOOB around you guys!

:-)
1 Week Ago #7
Rabbit
Expert Mod 8TB
You can give yourself a little more credit than that. If your security group won't allow it, that's one thing. But if there's no concern, I'm more than willing to help you through it. But put that on the backburner.

Here's some code I threw together that should get you started. You'll have to add a reference to the UIAutomationClient library. It dumps out all windows from the desktop and if there's a window with the name in the If clause, it recurses that window for all subwindows and dumps out that information.

Open up notepad if you want to see the information it dumps out for that.

Option Compare Database
Option Explicit


'testUIA just dumps all windows of the desktop to the debug window
Sub testUIA()
    Dim oCUI As New CUIAutomation
    Dim oDesktop As IUIAutomationElement
    Set oDesktop = oCUI.GetRootElement

    Dim oCondition As IUIAutomationCondition
    Dim allElementArray As IUIAutomationElementArray
    Dim oElement As IUIAutomationElement

    Dim i As Long


    'Filter on true which means get them all
    Set oCondition = oCUI.CreateTrueCondition
    Set allElementArray = oDesktop.FindAll(TreeScope_Children, oCondition)


    For i = 0 To allElementArray.Length - 1
        Set oElement = allElementArray.GetElement(i)
        Debug.Print oElement.CurrentClassName & " | " & oElement.CurrentName & " | " & oElement.CurrentControlType

        If oElement.CurrentClassName = "Notepad" Then
            RecurseElements oCUI, oElement, 0
        End If
    Next
End Sub


'Prints every descendant of oElement, indented by depth
Sub RecurseElements(oCUI As CUIAutomation, oElement As IUIAutomationElement, ByVal level As Long)
    Dim oCondition As IUIAutomationCondition
    Dim allElementArray As IUIAutomationElementArray
    Dim subElement As IUIAutomationElement
    Dim i As Long

    Set oCondition = oCUI.CreateTrueCondition
    Set allElementArray = oElement.FindAll(TreeScope_Children, oCondition)


    For i = 0 To allElementArray.Length - 1
        Set subElement = allElementArray.GetElement(i)
        'Print the child itself, not its parent, before descending into it
        Debug.Print String(level, "-") & subElement.CurrentClassName & " | " & subElement.CurrentName & " | " & subElement.CurrentControlType

        RecurseElements oCUI, subElement, level + 1
    Next
End Sub
1 Week Ago #8
NeoPa
Expert Mod 16PB
Rabbit:
You can give yourself a little more credit than that.
TwinnyFo:
Rabbit! Buddy! Friend!

Using DLLs is already at the fringe of my programming universe! The other stuff you're talking about is the LUNATIC FRINGE!
I have to agree with Rabbit on this one I'm afraid. He really is very clever you know, so trust him to know that you undervalue yourself.

I can sympathise with feeling some of the techniques he suggests are a little bit out there though. I felt that about some of the SQL ideas he's suggested in the past. It was a great learning opportunity for me.
1 Week Ago #9
twinnyfo
Expert Mod 2GB
Rabbit,

OK, brother. I'm probably going to need some extended hepp on this one. So, I've made a few minor mods to the above code to make it specific to the Adobe Acrobat issue.

  1. Public Sub testUIA()
  2.     Dim oCUI            As New CUIAutomation
  3.     Dim oDesktop        As IUIAutomationElement
  4.     Dim oCondition      As IUIAutomationCondition
  5.     Dim allElementArray As IUIAutomationElementArray
  6.     Dim oElement        As IUIAutomationElement
  7.     Dim i               As Long
  8.  
  9.     AppActivate "Adobe Acrobat Pro DC", False
  10.  
  11.     'Filter on true which means get them all
  12.     Set oDesktop = oCUI.GetRootElement
  13.     Set oCondition = oCUI.CreateTrueCondition
  14.     Set allElementArray = oDesktop.FindAll(TreeScope_Children, oCondition)
  15.  
  16.     For i = 0 To allElementArray.Length - 1
  17.         Set oElement = allElementArray.GetElement(i)
  18.         If oElement.CurrentClassName = "AcrobatSDIWindow" Then
  19.             RecurseElements oCUI, oElement, 0
  20.         End If
  21.     Next
  22.  
  23.     AppActivate "Microsoft Visual Basic for Applications", False
  24.  
  25. End Sub
  26.  
  27. Public Sub RecurseElements( _
  28.     oCUI As CUIAutomation, _
  29.     oElement As IUIAutomationElement, _
  30.     level)
  31.     Dim oCondition      As IUIAutomationCondition
  32.     Dim allElementArray As IUIAutomationElementArray
  33.     Dim subElement      As IUIAutomationElement
  34.     Dim i As Long
  35.  
  36.     Set oCondition = oCUI.CreateTrueCondition
  37.     Set allElementArray = oElement.FindAll(TreeScope_Children, oCondition)
  38.  
  39.     For i = 0 To allElementArray.Length - 1
  40.         Set subElement = allElementArray.GetElement(i)
  41.         Debug.Print _
  42.             String(level, "-") & _
  43.             subElement.CurrentClassName & " | " & _
  44.             subElement.CurrentName & " | " & _
  45.             subElement.CurrentControlType & " -- " & _
  46.             subElement.CurrentHasKeyboardFocus
  47.  
  48.         RecurseElements oCUI, subElement, level + 1
  49.     Next
  50.  
  51. End Sub
Now, when I switch to the home tab in Acrobat, the recursion displays all my recent documents and also indicates that the check boxes displayed next to those files are unchecked. I have tested, and this will identify when those items are checked or not. So, this, initially, is very promising.

I added an additional bit to your RecurseElements sub, and that is the .CurrentHasKeyboardFocus flag. When I run the code, none of the elements described is identified as having the keyboard focus. However, when I switch to the Adobe Home Tab, run the code and switch to Adobe very quickly, it can and will identify the highlighted item as having the focus by displaying the value of "1" instead of "0". Again, initially very promising.

Experimenting further, I open one of my PDFs in question, highlight a particular field and run the code, switch quickly back to Adobe before the code gets too far, and this time, no fields are identified as having focus.

Lines 9 and 23 were then added to make sure the system switched properly to Adobe and still, no identified elements with the focus.

Any ideas on why this form's elements aren't identified? Or, does this system just identify elements of the application itself?

Thanks again for the hepp!
1 Week Ago #10
Rabbit
Expert Mod 8TB
Unfortunately, I have as much experience with this API as you do at this point. It is my understanding that it identifies all Windows UI elements used by the application. If they implemented custom UI elements, it won't identify them.

For example, below is a subset of the output of my notepad++ preferences window. It identifies dialogs, list boxes, and combo boxes, because the notepad++ application is using standard windows UI elements.

If the output for Adobe Acrobat isn't identifying the UI elements you're looking for, then that probably means they created their own custom UI elements, and you won't be able to use this API for what you're trying to do. Which puts us back at image template matching as a possible solution. Let me know if you would like to pursue that.

Notepad++ | Notepad++ | 50032 | window | 1
-#32770 | Preferences | 50032 | dialog | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
--ListBox |  | 50008 | list | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
--ComboBox | Localization | 50003 | combo box | 1
--ComboBox | Localization | 50003 | combo box | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-#32770 | Preferences | 50032 | dialog | 1
-- |  | 50037 | title bar | 1
1 Week Ago #11
twinnyfo
Expert Mod 2GB
I think our similar results, and your explanation, mean that the UIAutomation DLL only describes the UI. And, in the case of an Adobe Form, the form itself is not part of the UI. This explains why the code gave me tons of controls on the home page and few on the Form itself.

OK - Plan B (I think I'm much farther along than that! The first seven preparations, let's call them A through G, were total failures. But this final iteration, let's call it Preparation H, on the whole, feels good!)

I've embedded an image of what I am dealing with.


There are five check boxes I need to deal with. As far as I can tell, whenever I open the forms, they are in the same location as the forms always "appear" the same.

In the "Promotion Zone" section one or the other of the two check boxes could be checked, but never both and sometimes none. I would need to check the status of each of those check boxes.

Likewise for the "Overall Recommendation" section, one of the three check boxes could be checked, but never two and sometimes none. I would need to check the status of those check boxes also.
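Whatever ends up detecting the individual boxes, the "one or none, never two" rule for each group can be checked in a few lines. This is only a sketch: SingleChecked is a hypothetical helper, and the Boolean arguments stand in for whatever the detection step reports for each box.

```vba
Option Explicit

' Hypothetical helper: takes the checked state of each box in a group and
' returns the 1-based position of the checked box, or 0 when none is checked.
' Raises an error when more than one box is checked.
Public Function SingleChecked(ParamArray states() As Variant) As Long
    Dim i As Long
    Dim found As Long

    For i = LBound(states) To UBound(states)
        If CBool(states(i)) Then
            If found <> 0 Then
                Err.Raise vbObjectError + 514, "SingleChecked", _
                          "More than one box in the group is checked"
            End If
            found = i - LBound(states) + 1
        End If
    Next i
    SingleChecked = found
End Function
```

For example, SingleChecked(False, True, False) returns 2 and SingleChecked(False, False) returns 0, so one Long per group is enough to store in the database, and a form that trips the error can be set aside for manual review.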

Some challenges that might come up with this:
  • Adobe is not maximized
  • Adobe is on a left or right screen in dual monitor mode
  • Different screen resolutions
  • There are some similar, older version forms (which we reject anyway) that may have slight location differences--but assuming all the same version of forms, we should be OK.
I will admit that I think I only barely understand "what" you are describing--that is, that we will use the system to "compare images" (for lack of a better term) to determine if the check box is checked. Is that correct?

I am all for learning new stuff. But at this point it "appears" beyond my level of comprehension. I will give it that valiant effort, though--NeoPa can attest to beginnings and growings....

Thank you so much for working through this with me!
Attached Images
File Type: png CheckBox.png (7.5 KB, 137 views)
1 Week Ago #12
NeoPa
Expert Mod 16PB
TwinnyFo:
NeoPa can attest to beginnings and growings
I can certainly attest to being impressed. This isn't trivial and requires a particular, and rare, attitude on the part of the student. In this case you clearly have that. I would add that it becomes more difficult the older we get so knowing you aren't green out of school makes it more impressive in a way.

I also wanted to let you know I'm continuing to monitor this thread. Not because I have any expectation that I will be able to help, but I'm very interested to see if you manage to make a success of it. I wish you both very well of course :-)

If I see anything I can help with I'll jump in of course, but I don't expect to. My expertise with Windows programming is from the very early days. I could create a .COM for you without a compiler from Intel machine code but have no experience with current APIs etc. In my Windows programming days the way to call on the OS was to trigger an interrupt ;-)
1 Week Ago #13
Rabbit
Expert Mod 8TB
I threw together an image matching example on my github pages: https://vincitego.github.io/OpenCV/opencv.html

It was a rush job, so the code works, but isn't ideal, more on that later. As you can see from the example, it identifies and outlines in red a selected template image within another selected image. You can select local image files to search and local image files to use as the template. So you can give it a whirl with what you have on hand.

Some caveats before you will actually be able to fold this into your process:
  • The code itself is pretty small and so feel free to look at the source code. The only thing is that it relies on the open source javascript OpenCV library.
  • The code runs locally, that is, your computer is actually crunching those numbers to produce the results. None of the data is sent to any server to process. However, I ran into an issue with cross origin errors with the preloaded image data when I tried to open the HTML file locally. So the webpage will have to be hosted somewhere, probably on your intranet if you have one. Either that or you will have to script selecting the file from the input, which I don't know if that's possible for browser security reasons.
  • The template match currently finds and returns the highest percent match, even if that match is very low. The code will have to be modified to filter out low percentage matches.
  • It doesn't work on Internet Explorer so you will have to automate a different browser to interact with it.
  • The code uses timed delays to make sure the images are loaded before running the template match. Ideally this would be done with callback functions instead.

As for your other concerns:
  • Adobe is not maximized - Shouldn't be an issue as long as you can take a screenshot of it.
  • Adobe is on a left or right screen in dual monitor mode - See above.
  • Different screen resolutions - A small sticking point. As long as you can set the zoom of the PDF at runtime, you should be able to create standard size image grabs. And even if you can't, you can scale the image with OpenCV.
  • There are some similar, older version forms (which we reject anyway) that may have slight location differences--but assuming all the same version of forms, we should be OK. - Location won't be an issue. As long as it looks like the template image. You can also run multiple template images if you wanted.

You might be thinking to yourself that it's a very convoluted approach. And yes, yes it is. While it is conceivable that you could port some of the OpenCV functionality into native VBA, that would be a bear of a project in and of itself.
1 Week Ago #14
Expert 1GB
This is a VERY educational thread. Thank you all!

Jim
1 Week Ago #15
Rabbit
Expert Mod 8TB
Glad to hear it. Even if no final solution is arrived at, hopefully someone finds something they can use
1 Week Ago #16
twinnyfo
Expert Mod 2GB
Rabbit,

Thanks for your efforts on this. I truly appreciate any time you've spent on this.

The concept of this is truly fascinating to me, but I have a few concerns.

First, my HTML skills are extremely limited--I typically don't work at all with websites/webpages. My experience with HTML is limited to tagging in e-mails to affect appearance--very limited.

I have zero experience with Java, so I am not sure how to invoke any of this. I'm willing to learn, though.

Finally, this has to be run, ultimately, from a self-contained VBA application. If I have some template files, that's OK. We can store those in predefined locations.

So, to make a long story short, I have no idea where to start with this.... Also, keep in mind that we have hundreds of these imports to perform. Right now, each form takes about 6 seconds to import--but we are missing those check boxes. How much will this add-on increase that import time? We won't know that until we try it out.

Again, willing to work through some trials with this. But, I do feel like I am starting on the ground floor with this.

Thanks again!
1 Week Ago #17
Rabbit
Expert Mod 8TB
twinnyfo:
First, my HTML skills are extremely limited--I typically don't work at all with websites/webpages. My experience with HTML is limited to tagging in e-mails to affect appearance--very limited.

I have zero experience with Java, so I am not sure how to invoke any of this. I'm willing to learn, though.
I can help you with the JavaScript portion.

twinnyfo:
Finally, this has to be run, ultimately, from a self-contained VBA application. If I have some template files, that's OK. We can store those in predefined locations.
I guess that depends on what you mean by self-contained. The solution will rely on opening a browser to the web page that does the template match. And the web page itself relies on JavaScript and a JavaScript library.

twinnyfo:
So, to make a long story short, I have no idea where to start with this.... Also, keep in mind that we have hundreds of these imports to perform. Right now, each form takes about 6 seconds to import--but we are missing those check boxes. How much extra time will this add-on increase that import time? We won't know that until we try it out.
This will add roughly 2-3 seconds per import.

As for where to begin. The first thing is to make sure you will be able to host the webpage. You'll need to verify 2 things on your end. One, that your team has an intranet site where you can host web pages for staff to use. And two, that you have the permissions to copy files to that location so that you can move the screenshot image there at runtime.

If those requirements aren't met, we won't be able to proceed with this particular path. As an alternative path, I have been looking at the OpenCV documentation for their template matching algorithm and it doesn't look too complicated actually. I think it's very doable to port into VBA but it would require more effort and may be slower than using the optimized javascript library.
6 Days Ago #18
twinnyfo
Expert Mod 2GB
This may be a non-starter, because I'm not sure what you mean by hosting a web page. If that is as simple as having a shared location in which I can throw an HTML file, then yes. If this has to do with webpage hosting, then, no, we can't do that.

As a trial, I saved the web page you linked to above and then tried to open it again, and it does "open" although I still have no clue what I am supposed to do with it. I am hoping this begins to resolve itself once we move forward with it.

I am still very much feeling like my avatar.
6 Days Ago #19
Rabbit
Expert Mod 8TB
Many larger organizations have internal hosting accessible only to those within their network. What we call an intranet.

When you saved and opened the code, did it display the images along with running the template match code? That is, did it create an output image with the red rectangle outlining the match that it found? Similar to what you get when you visit the page from the internet.

What we want to determine is if you have a website available only to employees and whether or not you have permission to put files on there.
6 Days Ago #20
twinnyfo
Expert Mod 2GB
When I opened the HTML, it only displayed the first image. The button was active, but the second image was missing, along with the red rectangle.

The closest our office could come to a webpage (that we can manage) is SharePoint, and I'm not certain of our capabilities there, either. Certainly, our network share drive houses all our DB stuff, and I've used that for template files for Access.

But, no, we do not have a true intranet.
6 Days Ago #21
Rabbit
Expert Mod 8TB
What is the default home page that is loaded when a standard employee opens their web browser? If your organization has an intranet, it will typically open to that portal. Usually listing department wide memos and helpful employee links.

It's also possible that it will open to Sharepoint, in which case, I don't know what the capabilities of Sharepoint are either.
6 Days Ago #22
twinnyfo
Expert Mod 2GB
We are in the Gub'ment. Their master portal is managed by others.

Methinks this may not be doable from our systems.

I still wish it was possible for me to at least experiment with this. I'd like to understand exactly "how" this is supposed to work. I think I understand the "what".....
6 Days Ago #23
Rabbit
Expert Mod 8TB
Oh, we're not done yet.

We are now on plan C, replicate template matching in VBA. It's the more correct method, if not the easiest.

First, we'll need to use windows API calls to return an array of bytes representing the image data of a screenshot of the window.

Here's some javascript code I have to take a screenshot of an application with a specified title, in this case, the windows calculator. You'll be making several API calls in addition to creating some custom data types to replicate the structs that are used. See if you can make any headway with porting this to VBA. It's also entirely possible that someone has already done this and you just need to find it.

    const calcHwnd = user32.FindWindowA(null, 'Calculator');
    console.log(screenshot(calcHwnd));

    function screenshot(hWnd) {
        let hdcWindow = null;
        let hdcMemDC = null;
        let hbmScreen = null;

        // Retrieve the handle to a display device context for the client area of the window.
        hdcWindow = user32.GetDC(hWnd);
        const rcClient = new win32_structs.RECT();
        user32.GetClientRect(hWnd, rcClient);
        const windowWidth = rcClient.right - rcClient.left;
        const windowHeight = rcClient.bottom - rcClient.top;

        // Create a compatible DC and bitmap
        hdcMemDC = gdi32.CreateCompatibleDC(hdcWindow);
        hbmScreen = gdi32.CreateCompatibleBitmap(hdcWindow, windowWidth, windowHeight);
        gdi32.SelectObject(hdcMemDC, hbmScreen);

        // Create bitmap info
        const bi = new win32_structs.BITMAPINFOHEADER();
        bi.biSize = 40;
        bi.biWidth = windowWidth;
        bi.biHeight = windowHeight;
        bi.biPlanes = 1;
        bi.biBitCount = 32;
        bi.biCompression = apiConstants.BI_RGB;
        bi.biSizeImage = 0;
        bi.biXPelsPerMeter = 0;
        bi.biYPelsPerMeter = 0;
        bi.biClrUsed = 0;
        bi.biClrImportant = 0;

        // Each row is rounded up to a multiple of 4 bytes before multiplying by the height
        const dwBmpSize = Math.floor((windowWidth * bi.biBitCount + 31) / 32) * 4 * windowHeight;
        const lpBitmap = Buffer.alloc(dwBmpSize);

        // Bit block transfer into our compatible memory DC.
        gdi32.BitBlt(hdcMemDC, 0, 0, windowWidth, windowHeight, hdcWindow, 0, 0, apiConstants.SRCCOPY);

        // Gets the "bits" from the bitmap and copies them into buffer lpBitmap
        gdi32.GetDIBits(hdcWindow, hbmScreen, 0, windowHeight, lpBitmap, bi, apiConstants.DIB_RGB_COLORS);

        // Clean up: bitmaps are freed with DeleteObject, memory DCs with DeleteDC
        if (hbmScreen != null) gdi32.DeleteObject(hbmScreen);
        if (hdcMemDC != null) gdi32.DeleteDC(hdcMemDC);
        if (hdcWindow != null) user32.ReleaseDC(hWnd, hdcWindow);

        return lpBitmap;
    }
6 Days Ago #24
Rabbit
Expert Mod 8TB
How's this coming along? I should have some free time in a few hours to whip something up
6 Days Ago #25
Rabbit
Expert Mod 8TB
Finally found some free time. You can use this to capture a screenshot of a window with a given title. The next step is to write code to generate similar data from reading a bitmap file; these files will store the checkbox template images you're looking for in the screenshot.

    Option Compare Database
    Option Explicit


    Private Type BITMAPINFOHEADER
        biSize As Long
        biWidth As Long
        biHeight As Long
        biPlanes As Integer
        biBitCount As Integer
        biCompression As Long
        biSizeImage As Long
        biXPelsPerMeter As Long
        biYPelsPerMeter As Long
        biClrUsed As Long
        biClrImportant As Long
    End Type


    ' Mirrors the 4-byte Win32 RGBQUAD struct: blue, green, red, reserved
    Private Type COLORQUAD
        B As Byte
        G As Byte
        R As Byte
        A As Byte
    End Type


    Private Type BITMAPINFO
        bmiHeader As BITMAPINFOHEADER
        bmiColors As COLORQUAD
    End Type


    Private Type BITMAPDATA
        width As Long
        height As Long
        bmiData() As Byte
    End Type


    Private Type RECT
        Left As Long
        Top As Long
        Right As Long
        Bottom As Long
    End Type


    Private Const BI_RGB = 0
    Private Const SRCCOPY = &HCC0020
    Private Const DIB_RGB_COLORS = 0
    Private Const CF_BITMAP = 2


    ' Win32 BOOL is a 4-byte integer, so BOOL results are declared As Long
    Private Declare PtrSafe Function CloseClipboard Lib "user32" () As Long
    Private Declare PtrSafe Function EmptyClipboard Lib "user32" () As Long
    Private Declare PtrSafe Function FindWindowA Lib "user32" (ByVal lpClassName As String, ByVal lpWindowName As String) As LongPtr
    Private Declare PtrSafe Function GetClientRect Lib "user32" (ByVal hWnd As LongPtr, lpRect As RECT) As Long
    Private Declare PtrSafe Function GetDC Lib "user32" (ByVal hWnd As LongPtr) As LongPtr
    Private Declare PtrSafe Function GetDesktopWindow Lib "user32" () As LongPtr
    Private Declare PtrSafe Function OpenClipboard Lib "user32" (ByVal hWndNewOwner As LongPtr) As Long
    Private Declare PtrSafe Function ReleaseDC Lib "user32" (ByVal hWnd As LongPtr, ByVal hdc As LongPtr) As Long
    Private Declare PtrSafe Function SetClipboardData Lib "user32" (ByVal format As Long, ByVal hMem As LongPtr) As LongPtr

    Private Declare PtrSafe Function BitBlt Lib "gdi32" (ByVal hdc As LongPtr, ByVal x As Long, ByVal y As Long, ByVal cx As Long, ByVal cy As Long, ByVal hdcSrc As LongPtr, ByVal x1 As Long, ByVal y1 As Long, ByVal rop As Long) As Long
    Private Declare PtrSafe Function CreateCompatibleBitmap Lib "gdi32" (ByVal hdc As LongPtr, ByVal cx As Long, ByVal cy As Long) As LongPtr
    Private Declare PtrSafe Function CreateCompatibleDC Lib "gdi32" (ByVal hdc As LongPtr) As LongPtr
    Private Declare PtrSafe Function DeleteDC Lib "gdi32" (ByVal hdc As LongPtr) As Long
    Private Declare PtrSafe Function DeleteObject Lib "gdi32" (ByVal hObject As LongPtr) As Long
    Private Declare PtrSafe Function GetDIBits Lib "gdi32" (ByVal hdc As LongPtr, ByVal hBitmap As LongPtr, ByVal nStartScan As Long, ByVal nNumScans As Long, lpBits As Any, lpBI As BITMAPINFO, ByVal wUsage As Long) As Long
    Private Declare PtrSafe Function SelectObject Lib "gdi32" (ByVal hdc As LongPtr, ByVal hObject As LongPtr) As LongPtr


    Sub Test()
        Dim calcHwnd As LongPtr
        Dim lpBitmap As BITMAPDATA

        calcHwnd = FindWindowA(vbNullString, "SAS INSTITUTE TRADEMARKS (SECURED) - Adobe Acrobat Reader DC")
        If calcHwnd = 0 Then
            Debug.Print "Window not found, check the title"
            Exit Sub
        End If

        lpBitmap = Screenshot(calcHwnd)
        Debug.Print "Done, do a paste into mspaint to see what was screenshotted"
    End Sub


    Function Screenshot(ByVal hWnd As LongPtr) As BITMAPDATA
        ' Handles are LongPtr so the code also compiles in 64-bit Office
        Dim hdcWindow As LongPtr
        Dim rcClient As RECT
        Dim hdcMemDC As LongPtr
        Dim hbmScreen As LongPtr
        Dim hbmOld As LongPtr
        Dim windowWidth As Long, windowHeight As Long
        Dim bi As BITMAPINFO
        Dim dwBmpSize As Long

        ' Get window info
        hdcWindow = GetDC(hWnd)
        GetClientRect hWnd, rcClient
        windowWidth = rcClient.Right - rcClient.Left
        windowHeight = rcClient.Bottom - rcClient.Top

        ' Create a compatible drawing context and bitmap
        hdcMemDC = CreateCompatibleDC(hdcWindow)
        hbmScreen = CreateCompatibleBitmap(hdcWindow, windowWidth, windowHeight)
        hbmOld = SelectObject(hdcMemDC, hbmScreen)

        ' Create bitmap info
        bi.bmiHeader.biSize = 40
        bi.bmiHeader.biWidth = windowWidth
        bi.bmiHeader.biHeight = windowHeight
        bi.bmiHeader.biPlanes = 1
        bi.bmiHeader.biBitCount = 32
        bi.bmiHeader.biCompression = BI_RGB
        bi.bmiHeader.biSizeImage = 0
        bi.bmiHeader.biXPelsPerMeter = 0
        bi.bmiHeader.biYPelsPerMeter = 0
        bi.bmiHeader.biClrUsed = 0
        bi.bmiHeader.biClrImportant = 0

        ' Bit block transfer into our compatible memory DC
        BitBlt hdcMemDC, 0, 0, windowWidth, windowHeight, hdcWindow, 0, 0, SRCCOPY

        ' Deselect the bitmap; GetDIBits expects it not to be selected into a DC
        SelectObject hdcMemDC, hbmOld

        ' Get the "bits" from the bitmap and copy them into the byte array.
        ' Each row is rounded up to a multiple of 4 bytes (\ is integer division).
        dwBmpSize = ((windowWidth * bi.bmiHeader.biBitCount + 31) \ 32) * 4 * windowHeight
        Screenshot.width = windowWidth
        Screenshot.height = windowHeight
        ReDim Screenshot.bmiData(dwBmpSize)
        GetDIBits hdcWindow, hbmScreen, 0, windowHeight, Screenshot.bmiData(0), bi, DIB_RGB_COLORS

        ' Copy screenshot to clipboard for verification purposes, not needed in production.
        ' Note: after SetClipboardData the system owns the bitmap handle, so remove
        ' this call (not the DeleteObject below) once you have verified the capture.
        copyBitmapToClipboard hbmScreen

        ' Clean up: bitmaps are freed with DeleteObject, memory DCs with DeleteDC
        DeleteObject hbmScreen
        DeleteDC hdcMemDC
        ReleaseDC hWnd, hdcWindow
    End Function


    Sub copyBitmapToClipboard(ByVal hBitmap As LongPtr)
        OpenClipboard 0&
        EmptyClipboard
        SetClipboardData CF_BITMAP, hBitmap
        CloseClipboard
    End Sub
Once we have the 2 sets of data, we can write the code to compare the screenshot against the template data. Basically, you'll iterate over every position in the screenshot where the template fits and, over the area covered by the template image, sum the squared differences in the pixel values to generate a match score. A lower score means a closer match. In pseudocode, something like this:

    For xi = 0 To screenshotWidth - templateWidth
       For yi = 0 To screenshotHeight - templateHeight
          matchValue = 0

          For xt = 0 To templateWidth - 1
             For yt = 0 To templateHeight - 1
                matchValue = matchValue + (templateData(xt, yt) - screenshotData(xi + xt, yi + yt)) ^ 2
             Next
          Next

          ' Sum of squared differences: 0 is a perfect match
          If matchValue < thresholdValue Then
             ' Match Found, do something
          End If
       Next
    Next
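If it helps to see the pseudocode above as something runnable before porting it to VBA, here is a minimal JavaScript sketch. It assumes single-channel 2D arrays indexed as image[y][x]; the function name matchTemplate and the tiny sample arrays are made up for illustration, and it returns the best position rather than testing against a threshold.

```javascript
// Sum-of-squared-differences template match (illustrative sketch).
// image[y][x] and template[y][x] hold numbers 0-255; this layout is an assumption.
function matchTemplate(image, template) {
  const imgH = image.length, imgW = image[0].length;
  const tplH = template.length, tplW = template[0].length;
  let best = { x: -1, y: -1, score: Infinity };

  // Only visit positions where the template fits entirely inside the image.
  for (let y = 0; y <= imgH - tplH; y++) {
    for (let x = 0; x <= imgW - tplW; x++) {
      let score = 0;
      // Stop adding rows as soon as this position is already worse than the best so far.
      for (let ty = 0; ty < tplH && score < best.score; ty++) {
        for (let tx = 0; tx < tplW; tx++) {
          const d = template[ty][tx] - image[y + ty][x + tx];
          score += d * d;
        }
      }
      if (score < best.score) best = { x, y, score };
    }
  }
  return best;
}

// Hypothetical 4x3 image containing the 2x2 template at (1, 1).
const image = [
  [255, 255, 255, 255],
  [255,   0,   0, 255],
  [255,   0, 222, 255],
];
const template = [
  [0,   0],
  [0, 222],
];
const found = matchTemplate(image, template);
console.log(found); // { x: 1, y: 1, score: 0 }
```

A score of 0 means an exact match; in practice you would accept anything below a threshold you tune against real screenshots.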
6 Days Ago #26
twinnyfo
Expert Mod 2GB
Hey Friend,

Have not had a chance at all to look at this. The first code was all Sanskrit to me (I can't say it looked Greek to me, because I know Greek). The second set of code looks more familiar, and I will try to set aside time to take a look at this today.

Just on the surface, it looks doable, yet convoluted. We will have to see how it works speed-wise. I am good taking a few extra seconds to get everything we need. More to follow.

Again, thank you for taking the time to work through this with me.
6 Days Ago #27
twinnyfo
Expert Mod 2GB
OK - for the most part, I can follow what's going on here.

However, once we grab that screenshot, do we need to save it as a bitmap before we can extract the pixels?

For example, once I have the screenshot, how would I examine the pixels for the range (593, 780) through (614, 802)?

Certainly the preferred method (and probably the faster method) is to do the comparison live, rather than saving a bitmap and reopening it (or extracting data from that file).

Additionally, I am also in serious doubt as to the functionality of this with other users. If I am always the one importing files and all settings are identical with Adobe each time I import, this might work. And, we sometimes get slightly outdated forms which have slightly different locations for controls. But, this needs to be functional for other users--different resolutions, Adobe settings, all iterations of forms, etc.

I am interested in working this through to expand my toolkit. But in the big scheme of things, I probably will not be able to implement it in production.

I also understand if you are fine just cutting your losses and setting this to the side.....
6 Days Ago #28
Rabbit
Expert Mod 8TB
  • No need to save the screenshot as a file. When I said we now need code to read a bitmap file, I meant you should save the template image you want to search for as a bitmap file and load them into memory at runtime to use for comparison against the screenshot.
  • (593, 780) would be array item 780 * width + 593.
  • Location of the image doesn't matter. Template matching examines every location to find the best match.
  • Different iterations of the form don't matter as long as what you're searching for looks like the template image. And if it doesn't, then you can run multiple template images against the screenshot.
  • Different resolutions shouldn't matter as long as you set the document zoom when the PDF opens. You should be able to set the zoom with a flag when invoking Adobe.
6 Days Ago #29
twinnyfo
Expert Mod 2GB
I think now I am confused, or I haven't explained very well....
  • No need to save the screenshot as a file. When I said we now need code to read a bitmap file, I meant you should save the template image you want to search for as a bitmap file and load them into memory at runtime to use for comparison against the screenshot.
    I think this much makes sense.
  • (593, 780) would be array item 780 * width + 593.
    This is where I am struggling--I can't figure out how to access that data. I am attempting
        Debug.Print Screenshot.bmiData(790 * Screenshot.Width + 593)
    but that gives me an argument not optional error. Obviously, I'm not using this correctly.
  • Location of the image doesn't matter. Template matching examines every location to find the best match.
    Here is where I think my explanation must not have been clear. It does not help me to know if there is an existence of "any" check box, but the existence (or non-existence) of specific check boxes in specific locations. It appears that what you are saying is the former, rather than the latter. Again, maybe I simply am not grasping how this is "supposed" to work for my particular needs.

OR -- is the intent that this code ultimately will compare, starting at the screenshot, location (0, 0), and search each successive pixel to determine if it is the starting point of the template image? Also then realizing that I would have to perform that action up to five times in my case, as we are looking for the existence/non-existence of five different check boxes. Not to mention that I still don't know the second half of this as to check the pixels of an existing BMP--not a skillset I've worked on in the past.

If that is the case, I think I am at least starting to crawl out of the mud. This may be much too convoluted to be practical (for my purposes).
6 Days Ago #30
Rabbit
Expert Mod 8TB
Refer back to the Sub Test(). Screenshot takes in, as an argument, the handle ID of a window and returns the width, height, and bitmap byte data of said window.

    lpBitmap = Screenshot(calcHwnd)
    Debug.Print "Red: " & lpBitmap.bmiData((790 * lpBitmap.Width + 593) * 4)
*Note: forgot the times 4 the first time around. Each pixel is represented by 4 bytes: red, green, blue, alpha.

I think you just misunderstand what the template image is going to look like. It's not just the checkbox, the template image will contain the text preceding the checkbox. See example template image below.



As for checking the byte data of a bitmap, forget that you're working on image data. You're just checking how closely the numbers in one smaller array match the numbers in a larger array.
Attached Images
File Type: bmp Template.bmp (20.0 KB, 97 views)
5 Days Ago #31
twinnyfo
Expert Mod 2GB
I think my last paragraph grasped the idea of searching for the "template" in the actual screenshot. Thus, I will need to make five distinct template BMPs that contain enough space to make them each unique. Got it!

When I add the "* 4", I initially got an out of range error. Deleted the "* 4" and all worked fine. Changed to "* 3" and all worked fine. Changed back to "* 4" and all works fine. Go figure.....

Beating this dead horse now. How can I tell the difference between the R,G,B factors of the pixel? Is this related to the "* 4"? Obviously my basic understanding of how the data is actually saved in this array is completely faulty. Any pixel I choose (according to the code) and any multiplication factor between 1 and 4 comes out as an integer between 0-255, so we must be doing something right.
5 Days Ago #32
Rabbit
Expert Mod 8TB
Bitmap data is an array of numbers from 0-255. Each pixel is represented by 4 bytes: red, green, blue, alpha (you can ignore the alpha, screenshots don't have an alpha value, they're all 0). The pixels go from left to right, top to bottom.

This array contains 4 pixels. This could be the data for an image 1 pixel in length by 4 pixels in height. Or 4x1, or 2x2. The array for any of these sizes of images would look the same.
    (127, 10, 8, 0, 255, 255, 255, 0, 25, 74, 24, 0, 37, 14, 68, 0)
Pixel 1 is (127, 10, 8, 0) where red = 127, green = 10, blue = 8, alpha = 0.
Pixel 2 is (255, 255, 255, 0) where red = 255, green = 255, blue = 255, alpha = 0.
And so on. The exact location of these pixels in a 2d image is defined outside of the array by a separately stored width and height.
5 Days Ago #33
twinnyfo
Expert Mod 2GB
OK - that hepps!

Let me play around with my screenshot for a while and then I'll come back and bother you for some more hepp!
5 Days Ago #34
Rabbit
Expert Mod 8TB
Sure thing, let me know.

Ultimately, what we're trying to do is, given a master image:
    (127, 10, 8, 0, 255, 255, 255, 0, 25, 74, 24, 0, 37, 14, 68, 0)
And given a template image:
    (250, 250, 250, 0, 50, 75, 25, 0)
Which subarray of the master image most closely resembles the template image?
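To make that question concrete, here is a small JavaScript sketch that scores every possible position using the two arrays above. It assumes both are single-row images (4x1 and 2x1 pixels), 4 bytes per pixel, with the alpha byte ignored; the helper name scoreAt is hypothetical.

```javascript
// Worked example: slide the 2-pixel template across the 4-pixel master image.
const master = [127, 10, 8, 0, 255, 255, 255, 0, 25, 74, 24, 0, 37, 14, 68, 0];
const template = [250, 250, 250, 0, 50, 75, 25, 0];
const BYTES_PER_PIXEL = 4;

// Sum of squared differences with the template placed at a given pixel offset.
function scoreAt(pixelOffset) {
  let score = 0;
  for (let i = 0; i < template.length; i++) {
    if (i % BYTES_PER_PIXEL === 3) continue; // skip the alpha byte
    const d = template[i] - master[pixelOffset * BYTES_PER_PIXEL + i];
    score += d * d;
  }
  return score;
}

const pixelsInMaster = master.length / BYTES_PER_PIXEL;
const pixelsInTemplate = template.length / BYTES_PER_PIXEL;
const scores = [];
for (let p = 0; p <= pixelsInMaster - pixelsInTemplate; p++) {
  scores.push(scoreAt(p));
}
const bestOffset = scores.indexOf(Math.min(...scores));

console.log(scores);     // [ 258618, 702, 138416 ]
console.log(bestOffset); // 1
```

The lowest score is at pixel offset 1, so the subarray starting at the second pixel, (255, 255, 255, 0, 25, 74, 24, 0), is the closest match.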
5 Days Ago #35
Rabbit
Expert Mod 8TB
It might also make it easier to work with and understand if you converted the 1 dimensional array to a 3d (width, height, color) array or 3 separate 2d (width, height) arrays, one for each color. (Drop the alpha since we won't be using it.)
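A rough JavaScript sketch of the "3 separate 2d arrays" option, in case it is useful as a reference when porting. The function name splitChannels and the layout parameter are made up for illustration. The default assumes the byte order and row order described earlier in this thread (red first, rows top to bottom); if the real buffer turns out to be ordered differently, only the layout argument changes.

```javascript
// Split a flat 4-bytes-per-pixel array into three 2D planes indexed [y][x].
// layout.r/g/b are byte offsets within a pixel; bottomUp flips the row order.
function splitChannels(data, width, height,
                       layout = { r: 0, g: 1, b: 2, bottomUp: false }) {
  const makePlane = () =>
    Array.from({ length: height }, () => new Array(width).fill(0));
  const red = makePlane(), green = makePlane(), blue = makePlane();

  for (let row = 0; row < height; row++) {
    // Destination row: flipped when the source stores the bottom row first.
    const y = layout.bottomUp ? height - 1 - row : row;
    for (let x = 0; x < width; x++) {
      const i = (row * width + x) * 4; // first byte of this pixel
      red[y][x] = data[i + layout.r];
      green[y][x] = data[i + layout.g];
      blue[y][x] = data[i + layout.b];
    }
  }
  return { red, green, blue };
}

// The 4-pixel sample array from above, interpreted as a 2x2 image.
const flat = [127, 10, 8, 0, 255, 255, 255, 0, 25, 74, 24, 0, 37, 14, 68, 0];
const planes = splitChannels(flat, 2, 2);
console.log(planes.red);   // [ [ 127, 255 ], [ 25, 37 ] ]
console.log(planes.green); // [ [ 10, 255 ], [ 74, 14 ] ]
console.log(planes.blue);  // [ [ 8, 255 ], [ 24, 68 ] ]
```

Keeping the layout as a parameter means the same loop can be reused once the actual byte order of the screenshot data has been confirmed.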
5 Days Ago #36
twinnyfo
Expert Mod 2GB
  1. When we build the bitmap array, it fills in the lines from the bottom. It took me a while to confirm, but this is the case. It fills bottom to top, and still left to right.
  2. It also fills in the colors as "B, G, R" instead of "R, G, B."
  3. Once I figured all that out, I could work with the data and use your advice from post #36. I created a new array, based upon how we typically think about it, top to bottom, left to right, R, G, B. Maybe I'm not quite as think as I dumb I am!

    Public Sub Test()
        Dim calcHwnd    As Long
        Dim lpBitmap    As BITMAPDATA
        Dim intHeight   As Integer
        Dim intWidth    As Integer
        Dim lngPixel    As Long
        Dim arrBMP()    As Integer

        calcHwnd = FindWindowA( _
            vbNullString, _
            "PDFNAME.pdf (SECURED) - Adobe Acrobat Pro DC")
        lpBitmap = Screenshot(calcHwnd)

        With lpBitmap
            ' arrBMP(row, column, channel) with channel 1 = R, 2 = G, 3 = B
            ReDim arrBMP(.Height - 1, .Width - 1, 3)
            For intHeight = 0 To .Height - 1
                For intWidth = 0 To .Width - 1
                    ' Offset of the first byte (blue) of this pixel in the source data
                    lngPixel = _
                        ((intHeight * .Width) * 4) + _
                        (intWidth * 4)
                    ' Source rows run bottom to top, so flip the row index;
                    ' source bytes are B, G, R, so reverse them into R, G, B
                    arrBMP(.Height - intHeight - 1, intWidth, 1) = _
                        .bmiData(lngPixel + 2)
                    arrBMP(.Height - intHeight - 1, intWidth, 2) = _
                        .bmiData(lngPixel + 1)
                    arrBMP(.Height - intHeight - 1, intWidth, 3) = _
                        .bmiData(lngPixel)
                Next intWidth
            Next intHeight
        End With

        Debug.Print _
            "Pixel 940, 1045: RGB(" & _
            arrBMP(940, 1045, 1) & ", " & _
            arrBMP(940, 1045, 2) & ", " & _
            arrBMP(940, 1045, 3) & ")"

    End Sub
The final Debug.Print statement confirms the RGB values for a particular pixel (one which I KNOW has a very strange and unique color). Perfect match!

One step closer....

Now, anyone know how to read pixels from an existing BMP?
5 Days Ago #37
Rabbit
Expert Mod 8TB
Excellent! I think I'm just too used to working in RGB colors, didn't realize it was BGR. But I definitely thought it was top to bottom. But I've never actually worked with bitmap images at the byte level before, I usually just rely on existing libraries in javascript.

As for the bitmap files, I'm 90% sure someone has already done that work for you in the past. But if you can't find any examples, bitmap files consist of a 14 byte file header, followed by a 40 byte bitmap info header, followed by the pixel data array (on bitmaps created by Windows anyways). The bitmap info header portion of the file should contain the width and height of the image.

One thing you'll have to account for in a bitmap file though is that a row of pixels is padded out to a multiple of 4 bytes. So you'll have to skip those bytes at the end of each "row" of pixels. Also, a bitmap file doesn't typically have the alpha byte, so there's no need to drop it.

Actually, now that I think about it, the screenshot bitmap might also contain padding on the row? You might want to double check that.

And now that I re-rethink about it, the screenshot bitmap shouldn't have any padding because it has the alpha which would ensure the row always comes out to a multiple of 4 anyways.
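Since I normally do this sort of thing in javascript, here is a rough sketch of reading a template bitmap along the lines described above, which should be straightforward to port. It only handles uncompressed 24-bit files, takes the pixel data offset from the file header rather than assuming 54, and treats a positive height as bottom-up rows. The function name readBmp24 and the tiny in-memory test file are made up for illustration.

```javascript
// Parse an uncompressed 24-bit BMP held in a Uint8Array into top-down RGB rows.
function readBmp24(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (bytes[0] !== 0x42 || bytes[1] !== 0x4d) throw new Error('Not a BMP file');

  const dataOffset = view.getUint32(10, true);  // file header: start of pixel data
  const width = view.getInt32(18, true);        // info header: biWidth
  const rawHeight = view.getInt32(22, true);    // info header: biHeight
  const bitCount = view.getUint16(28, true);    // info header: biBitCount
  const compression = view.getUint32(30, true); // info header: biCompression
  if (bitCount !== 24 || compression !== 0) {
    throw new Error('This sketch only handles uncompressed 24-bit BMPs');
  }

  const height = Math.abs(rawHeight);
  const bottomUp = rawHeight > 0;
  // Each row is padded out to a multiple of 4 bytes.
  const stride = Math.floor((width * 24 + 31) / 32) * 4;

  const pixels = [];
  for (let y = 0; y < height; y++) {
    const fileRow = bottomUp ? height - 1 - y : y;
    const rowStart = dataOffset + fileRow * stride;
    const row = [];
    for (let x = 0; x < width; x++) {
      const i = rowStart + x * 3; // bytes are stored blue, green, red
      row.push([bytes[i + 2], bytes[i + 1], bytes[i]]);
    }
    pixels.push(row);
  }
  return { width, height, pixels };
}

// Build a hypothetical 1x2 BMP in memory: red pixel on top, blue pixel below.
const demo = new Uint8Array(62);
const dv = new DataView(demo.buffer);
demo[0] = 0x42; demo[1] = 0x4d;   // 'BM'
dv.setUint32(2, 62, true);        // file size
dv.setUint32(10, 54, true);       // pixel data offset (14 + 40)
dv.setUint32(14, 40, true);       // info header size
dv.setInt32(18, 1, true);         // width
dv.setInt32(22, 2, true);         // height (positive means bottom-up)
dv.setUint16(26, 1, true);        // planes
dv.setUint16(28, 24, true);       // bits per pixel
demo.set([255, 0, 0, 0], 54);     // bottom row: blue pixel (B, G, R) + 1 pad byte
demo.set([0, 0, 255, 0], 58);     // top row: red pixel + 1 pad byte

const bmp = readBmp24(demo);
console.log(JSON.stringify(bmp.pixels)); // [[[255,0,0]],[[0,0,255]]]
```

With a width of 1, each row holds 3 bytes of pixel data plus 1 byte of padding, which is the case the padding note above is warning about.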
5 Days Ago #38
Rabbit
Expert Mod 8TB
Another thing you'll want to confirm is that invoking Adobe with a specified zoom produces consistent results on a few different systems. The size of the screenshot doesn't have to be the same on all the systems. But the height and width of the text and the checkbox within those screenshots should be roughly the same.
5 Days Ago #39
twinnyfo
Expert Mod 2GB
So, whilst I've been waiting for my database to compile and transfer over a really slow VPN, I've had a chance to think about this for a while. Let me throw this idea off you and let me know what you think (keep in mind that at the present moment I have not tested anything like this as of yet....)

Looking at the form itself, I know several things:
  • The check boxes (including the outline) are always either 22 x 22 pixels or 22 x 23 pixels.
  • The check boxes are always in the same general location--this means I can limit my search to an area, instead of the entire bitmap.
  • Now that I have the pixels arrayed, that becomes much easier!
  • The Check boxes are the only items on the form that have the following description:
    • The top left pixel has a white pixel above and a white pixel to the left.
    • The top left pixel has black pixels to the right and below.
    • Final validation of the location of the top left pixel can be done by extending the validation of the pixels to the right and down.
    • These pixels go out no less than 22 pixels for each Check box--see above.
    • Such a validation would confirm the location of at least one check box.
  • The two upper check boxes are aligned horizontally--if I can find one of those, I can find the other.
  • The three lower check boxes are aligned vertically--if I can find one, I can find the other two.
  • Prior to the form being digitally signed, the background of the check boxes is light blue. After signature, they are pure white.
All this put together, gives me the logic necessary to:
  1. Find all the Check Boxes
  2. Determine if they are UN-checked

Rather than check for similarity betwixt one bitmap and another bitmap, when I find the check box, I examine the 20 x 20 pixel field to the bottom and right of the top left most pixel of the check box. This field will ALWAYS be one of two colors--and every pixel will be the same (white or light blue). A quick 20 x 20 array scan of that field will determine if ANY of those pixels have ANY other values than pure white or light blue. Any other values indicates that the box has been checked!

I am done at work for the day--and my brain is really starting to hurt with this.

But............... I am really looking forward to work tomorrow and trying to put this together!

Your hepp--as usual--has been incalculable in working through this. This might not be a waste of my time!!!

I will post my final findings hopefully tomorrow!

:-)
5 Days Ago #40
Rabbit
Expert Mod 8TB
If that is consistently the case, then yes, building a custom search function tailored to your situation will run more quickly than a template match that searches for a supplied template image across the entire master image.
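As a rough illustration of the interior scan you describe, here is how I would sketch it in javascript. It assumes the pixels are already arranged top-down as pixels[y][x] = [r, g, b] and that the top-left corner of the box has been located. The 20 x 20 interior size comes from your post; the name isChecked and the light blue value are placeholders, not values taken from the form.

```javascript
const INTERIOR = 20; // interior size in pixels, as described in the post above

// Returns true if any interior pixel differs from every allowed background color.
function isChecked(pixels, boxLeft, boxTop, backgrounds) {
  for (let dy = 1; dy <= INTERIOR; dy++) {
    for (let dx = 1; dx <= INTERIOR; dx++) {
      const [r, g, b] = pixels[boxTop + dy][boxLeft + dx];
      const isBackground = backgrounds.some(
        ([br, bg, bb]) => r === br && g === bg && b === bb);
      if (!isBackground) return true; // found part of a check mark
    }
  }
  return false;
}

const WHITE = [255, 255, 255];
const LIGHT_BLUE = [222, 235, 255]; // placeholder, measure the real value on the form

// Build a hypothetical 22x22 check box: black outline, uniform interior.
function makeBox(fill) {
  const size = INTERIOR + 2;
  return Array.from({ length: size }, (_, y) =>
    Array.from({ length: size }, (_, x) =>
      (x === 0 || y === 0 || x === size - 1 || y === size - 1) ? [0, 0, 0] : fill));
}

const emptyBox = makeBox(WHITE);
const tickedBox = makeBox(LIGHT_BLUE);
tickedBox[10][10] = [0, 0, 0]; // one dark pixel stands in for the check mark

console.log(isChecked(emptyBox, 0, 0, [WHITE, LIGHT_BLUE]));  // false
console.log(isChecked(tickedBox, 0, 0, [WHITE, LIGHT_BLUE])); // true
```

Returning as soon as the first non-background pixel is found keeps the scan cheap, which matters when you repeat it for five boxes on hundreds of forms.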

Happy to help! Let us know how you get along.
5 Days Ago #41
twinnyfo
Expert Mod 2GB
Absolutely one of the coolest things I've ever worked through! Here is my final code:

  1. Public Sub Test()
  2.     Dim calcHwnd    As Long
  3.     Dim lpBitmap    As BITMAPDATA
  4.     Dim intX        As Integer
  5.     Dim intY        As Integer
  6.     Dim intX2       As Integer
  7.     Dim intY2       As Integer
  8.     Dim lngPixel    As Long
  9.     Dim arrBMP()    As Integer
  10.     Dim fChecked    As Boolean
  11.  
  12.     calcHwnd = _
  13.         FindWindowA( _
  14.             vbNullString, _
  15.             "TEST SIGNED.pdf (SECURED) - Adobe Acrobat Pro DC")
  16.     lpBitmap = Screenshot(calcHwnd)
  17.  
  18.     With lpBitmap
  19.         ReDim arrBMP(.Width - 1, .Height - 1, 3)
  20.         For intY = 0 To .Height - 1
  21.             For intX = 0 To .Width - 1
  22.                 lngPixel = _
  23.                     ((intY * .Width) * 4) + _
  24.                     (intX * 4)
  25.                 arrBMP(intX, .Height - intY - 1, 1) = _
  26.                     .bmiData(lngPixel + 2)
  27.                 arrBMP(intX, .Height - intY - 1, 2) = _
  28.                     .bmiData(lngPixel + 1)
  29.                 arrBMP(intX, .Height - intY - 1, 3) = _
  30.                     .bmiData(lngPixel)
  31.             Next intX
  32.         Next intY
  33.     End With
  34.  
  35.     For intY = 1 To lpBitmap.Height - 23
  36.         For intX = 1 To lpBitmap.Width - 23
  37.             'Check for UN-signed Form
  38.             If (arrBMP(intX, intY, 1) = 77 And _
  39.                 arrBMP(intX, intY, 2) = 80 And _
  40.                 arrBMP(intX, intY, 3) = 89) Then
  41.  
  42.                 'Check if Top-Left Corner
  43.                 If Not (arrBMP(intX - 1, intY, 1) = 255 And _
  44.                     arrBMP(intX, intY - 1, 2) = 255) Then _
  45.                     GoTo NotCheckBox
  46.  
  47.                 'Check Horizontal Line
  48.                 For intX2 = intX To intX + 21
  49.                     If Not arrBMP(intX2, intY, 1) = 77 Then _
  50.                         GoTo NotCheckBox
  51.                 Next intX2
  52.  
  53.                 'Check Vertical Line
  54.                 For intY2 = intY To intY + 21
  55.                     If Not arrBMP(intX, intY2, 1) = 77 Then _
  56.                         GoTo NotCheckBox
  57.                 Next intY2
  58.  
  59.                 'Check open Pixel
  60.                 If arrBMP(intX + 11, intY + 1, 1) = 222 Then
  61.                     'This is a Check Box!
  62.                     fChecked = False
  63.                     For intX2 = intX + 1 To intX + 19
  64.                         For intY2 = intY + 1 To intY + 19
  65.                             If Not arrBMP(intX2, intY2, 1) = 222 Then
  66.                                 fChecked = True
  67.                                 intX2 = intX + 19
  68.                                 intY2 = intY + 19
  69.                             End If
  70.                         Next intY2
  71.                     Next intX2
  72.                     Debug.Print _
  73.                         "UN-Signed Form: Pixel " & _
  74.                         intX & ", " & _
  75.                         intY & "; Checked: " & _
  76.                         fChecked
  77.                 End If
  78.  
  79.             'Check for Signed Form
  80.             ElseIf (arrBMP(intX, intY, 1) = 0 And _
  81.                 arrBMP(intX, intY, 2) = 0 And _
  82.                 arrBMP(intX, intY, 3) = 0) Then
  83.  
  84.                 'Check if Top-Left Corner
  85.                 If Not (arrBMP(intX - 1, intY, 1) = 255 And _
  86.                     arrBMP(intX, intY - 1, 2) = 255) Then _
  87.                     GoTo NotCheckBox
  88.  
  89.                 'Check Horizontal Line
  90.                 For intX2 = intX To intX + 21
  91.                     If Not arrBMP(intX2, intY, 1) = 0 Then _
  92.                         GoTo NotCheckBox
  93.                 Next intX2
  94.  
  95.                 'Check Vertical Line
  96.                 For intY2 = intY To intY + 21
  97.                     If Not arrBMP(intX, intY2, 1) = 0 Then _
  98.                         GoTo NotCheckBox
  99.                 Next intY2
  100.  
  101.                 'Check open Pixel
  102.                 If arrBMP(intX + 11, intY + 1, 1) = 255 Then
  103.                     'This is a Check Box!
  104.                     fChecked = False
  105.                     For intX2 = intX + 1 To intX + 19
  106.                         For intY2 = intY + 1 To intY + 19
  107.                             If Not arrBMP(intX2, intY2, 1) = 255 Then
  108.                                 fChecked = True
  109.                                 intX2 = intX + 19
  110.                                 intY2 = intY + 19
  111.                             End If
  112.                         Next intY2
  113.                     Next intX2
  114.                     Debug.Print _
  115.                         "Signed Form: Pixel " & _
  116.                         intX & ", " & _
  117.                         intY & "; Checked: " & _
  118.                         fChecked
  119.                 End If
  120.             End If
  121. NotCheckBox:
  122.         Next intX
  123.     Next intY
  124.  
  125. End Sub
A few notes: Even though we traverse the BMP top to bottom, left to right (line first, then pixel in row), I can't help but "think" across first and then up and down (like X and Y coordinates in math). Paint, which I use to verify pixels, also lists positions as X, Y coordinates, and the constant conversion was makin' me beat up grass. So I converted the array to that format and switched to X, Y variable names for consistency.

I also added lines 59-61 and 101-103 because, all other conditions being met, this is a sure-fire way to validate that we have a check box field.

When I run this code on either a signed or an un-signed form, the results are always the same: I get five pixels listed, and for each one the output identifies whether the form is signed and whether the check box is checked. Because this code always goes top to bottom, left to right, the order of the check boxes will always be the same, in my case: BPZ, I/APZ, DP, P, DNP. This order can be used to set variables for those values, which can then be used elsewhere in the database.
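As a minimal sketch of that last point, assuming the five results are first collected in scan order into a Boolean array (the array `afChecked`, the helper name `LabelCheckBoxes`, and the use of a Collection are all illustrative and not part of the code above):

```vba
' Hypothetical helper: pairs the check box states, in the order the scan
' finds them, with the field names listed above.
' afChecked is assumed to hold exactly five Booleans in scan order.
Public Function LabelCheckBoxes(afChecked() As Boolean) As Collection
    Dim varNames As Variant
    Dim colResult As Collection
    Dim lngCount As Long
    Dim i As Long

    ' Same order as the top-to-bottom, left-to-right scan
    varNames = Array("BPZ", "I/APZ", "DP", "P", "DNP")

    lngCount = UBound(afChecked) - LBound(afChecked) + 1
    If lngCount <> UBound(varNames) - LBound(varNames) + 1 Then
        Err.Raise vbObjectError + 1, "LabelCheckBoxes", _
            "Expected 5 check boxes, found " & lngCount
    End If

    Set colResult = New Collection
    For i = 0 To lngCount - 1
        ' Key each state by its field name, e.g. colResult("DP")
        colResult.Add afChecked(LBound(afChecked) + i), _
                      CStr(varNames(LBound(varNames) + i))
    Next i

    Set LabelCheckBoxes = colResult
End Function
```

The count check is there because a scan that finds four or six boxes would silently shift every value onto the wrong field name.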

My main concern was speed and how much this was going to slow down my Form importation. When I allow this code to run the entire BMP--not limiting to specific areas--it literally takes less than a second. I "should" be able to incorporate this into my production database.

I also have a really nifty gadget in my toolkit. I hope others will benefit from this!

Thanks Rabbit. You are one of the reasons I look for hepp on Bytes!
5 Days Ago #42
NeoPa
Expert Mod 16PB
Fantastic news.

I know I'm not the only other member following this with interest but I have to say that was a steep learning curve and you simply nailed it.

Also, it really does help to have Rabbit around the place for sure. I couldn't have helped you with that. Totally outside of my wheelhouse.
5 Days Ago #43
Rabbit
Expert Mod 8TB
That's excellent! Glad we were able to arrive at a workable solution.

Here are some design considerations for moving forward:
  • Though not necessary, you can consider combining the separate color channels into a single number. Red * 256 * 256 + Green * 256 + Blue. This allows you to test for color by referencing a single numeric or hex value.
  • You can also move the data restructuring into the screenshot function so you don't need to restructure the bitmap data after calling the screenshot function.
  • It's great that it runs in under a second, but that time can add up over hundreds of forms. If you find that it adds too much time, then you can consider testing not every pixel but every n-th. When you find a potential matching pixel, you just have to go left until you find the end of that black line.
  • If setting zoom at runtime doesn't produce consistent results, then instead of looking for a fixed line length, you could loop until you hit a non-black pixel, deriving a line length from that. This should allow you to account for different size squares on different systems.
  • Similarly, if the zoom doesn't produce consistent results, you may need to account for varying line widths.
  • I noticed that in some of your color checks, you only check one of the color channels, which is probably fine for your situation, though it might be safer to check the full color value.
  • You start at row 1, column 1 which is probably fine but shouldn't you start at 0, 0?
  • You can combine your unsigned and signed versions by using that initial color check to set the color values to look for in the subsequent code.
4 Days Ago #44
twinnyfo
Expert Mod 2GB
Though not necessary, you can consider combining the separate color channels into a single number. Red * 256 * 256 + Green * 256 + Blue. This allows you to test for color by referencing a single numeric or hex value.
I had considered doing this. For right now, it is easier for me to check colors with RGB values, but in the long run, this will probably be the best way forward.
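For anyone wanting to try the single-value approach, here is a minimal sketch. It assumes channels 1, 2, and 3 of `arrBMP` hold red, green, and blue in that order, which this part of the thread doesn't confirm; `PackColor`, `PixelColor`, and the constant names are made up for illustration.

```vba
' Packed values for the colors used in the checks above
' (assumes channel order 1 = red, 2 = green, 3 = blue).
Public Const clrGrayOutline As Long = &H4D5059   ' 77, 80, 89
Public Const clrGrayFill As Long = &HDEDEDE      ' 222, 222, 222
Public Const clrBlack As Long = &H0&             ' 0, 0, 0
Public Const clrWhite As Long = &HFFFFFF         ' 255, 255, 255

' Hypothetical helper: Red * 65536 + Green * 256 + Blue.
' The Long parameters keep the arithmetic out of Integer range.
Public Function PackColor(ByVal lngRed As Long, ByVal lngGreen As Long, _
                          ByVal lngBlue As Long) As Long
    PackColor = lngRed * 65536 + lngGreen * 256 + lngBlue
End Function

' Hypothetical helper: packed color of one pixel in an (x, y, channel) array.
Public Function PixelColor(arrBMP As Variant, ByVal lngX As Long, _
                           ByVal lngY As Long) As Long
    PixelColor = PackColor(arrBMP(lngX, lngY, 1), _
                           arrBMP(lngX, lngY, 2), _
                           arrBMP(lngX, lngY, 3))
End Function
```

With these in place, the three-line test for the un-signed outline could shrink to `If PixelColor(arrBMP, intX, intY) = clrGrayOutline Then`.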

You can also move the data restructuring into the screenshot function so you don't need to restructure the bitmap data after calling the screenshot function.
It took me a couple re-reads to understand this, but yes--that may be a good move going forward.

It's great that it runs in under a second, but that time can add up over hundreds of forms. If you find that it adds too much time, then you can consider testing not every pixel but every n-th. When you find a potential matching pixel, you just have to go left until you find the end of that black line.
I think I would revert to starting farther down the BMP first. In my case, there are straight black lines all over the form. My key is finding a top left corner of a square that is one pixel wide (there are others which are two pixels wide). I am literally looking for single points.

  • If setting zoom at runtime doesn't produce consistent results, then instead of looking for a fixed line length, you could loop until you hit a non-black pixel, deriving a line length from that. This should allow you to account for different size squares on different systems.
  • Similarly, if the zoom doesn't produce consistent results, you may need to account for varying line widths.
So far, it appears Adobe is opening the forms maximized at zoom 100%. If we run into "fuzzy boxes" on other systems, then we will have to re-address this issue. But right now, things are looking good.
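If that day comes, the "loop until you hit a non-black pixel" idea could look something like the sketch below. It assumes `arrBMP` is indexed (x, y, channel) as in the posted code and compares only channel 1, as the existing line checks do; the function name `MeasureRunRight` is made up.

```vba
' Hypothetical helper: counts consecutive pixels whose channel 1 equals
' lngLineColor, starting at (lngX, lngY) and moving right, so the box
' width is measured instead of assumed.
Public Function MeasureRunRight(arrBMP As Variant, ByVal lngX As Long, _
                                ByVal lngY As Long, _
                                ByVal lngLineColor As Long) As Long
    Dim lngLength As Long
    Dim lngMaxX As Long

    lngMaxX = UBound(arrBMP, 1)   ' stop at the right edge of the bitmap

    Do While lngX + lngLength <= lngMaxX
        If arrBMP(lngX + lngLength, lngY, 1) <> lngLineColor Then Exit Do
        lngLength = lngLength + 1
    Loop

    MeasureRunRight = lngLength
End Function
```

The measured length could then replace the fixed 21 in the horizontal check, with the same value reused for the vertical check and the interior scan, since the boxes are square.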

I noticed that in some of your color checks, you only check one of the color channels, which is probably fine for your situation, though it might be safer to check the full color value.
That is correct. The fields inside the check boxes are consistently one color. Whether there is a check mark or fuzzy text, any variation from the one value indicates something other than a blank field.

You start at row 1, column 1 which is probably fine but shouldn't you start at 0, 0?
This was intentional. I need to check the pixels above the current vertical and left of the current horizontal. Checking pixel (-1, -1) would cause an out of range error. Again, this process lends itself toward beginning farther down and to the right of the BMP. Eliminating 75% of the BMP would save 3/4 of a second!

You can combine your unsigned and signed versions by using that initial color check to set the color values to look for in the subsequent code.
I tried to think of ways to do this. Unfortunately, we don't know if a form is signed until we open it and import the data. The first time that we KNOW that a form is UNSIGNED is when it finds the first check box, as the outline is gray, not black. As it stands now, it is checking for two distinct colors. This does, however, lend itself toward your first point. Having one value to search for, rather than three, will run things more quickly.

Great fodder to chew on! I'll take a look at these things and do a little tweaking!

Thanks for the ideas!
4 Days Ago #45
Rabbit
Expert Mod 8TB
It's great that it runs in under a second, but that time can add up over hundreds of forms. If you find that it adds too much time, then you can consider testing not every pixel but every n-th. When you find a potential matching pixel, you just have to go left until you find the end of that black line.
I think I would revert to starting farther down the BMP first. In my case, there are straight black lines all over the form. My key is finding a top left corner of a square that is one pixel wide (there are others which are two pixels wide). I am literally looking for single points.
You can always threshold which black lines are important by how long the line is. But I digress; the suggestion is an optional speed-up that is ultimately not worth the added effort.

I noticed that in some of your color checks, you only check one of the color channels, which is probably fine for your situation, though it might be safer to check the full color value.
That is correct. The fields inside the check boxes are consistently one color. Whether there is a check mark or fuzzy text, any variation from the one value indicates something other than a blank field.
To clarify, I didn't mean that you were checking one color. I meant you were checking one color channel: in some of your color checks, you only look at the red value or the blue value or the green value.

Unfortunately, we don't know if a form is signed until we open it and import the data. The first time that we KNOW that a form is UNSIGNED, is when it finds the first check box, as the outline is gray, not black.
You don't have to know beforehand to collapse the If. I meant something like this.

For intY = 1 To lpBitmap.Height - 23
    For intX = 1 To lpBitmap.Width - 23

        potentialLineFound = False

        'UN-signed Form: gray outline, gray interior
        If (arrBMP(intX, intY, 1) = 77 And _
            arrBMP(intX, intY, 2) = 80 And _
            arrBMP(intX, intY, 3) = 89) Then

            potentialLineFound = True
            lineColor = 77
            openPixelColor = 222
            isSigned = False

        'Signed Form: black outline, white interior
        ElseIf (arrBMP(intX, intY, 1) = 0 And _
            arrBMP(intX, intY, 2) = 0 And _
            arrBMP(intX, intY, 3) = 0) Then

            potentialLineFound = True
            lineColor = 0
            openPixelColor = 255
            isSigned = True

        End If

        If potentialLineFound Then
            'Check if Top-Left Corner
            If Not (arrBMP(intX - 1, intY, 1) = 255 And _
                arrBMP(intX, intY - 1, 2) = 255) Then _
                GoTo NotCheckBox

            'Check Horizontal Line
            For intX2 = intX To intX + 21
                If Not arrBMP(intX2, intY, 1) = lineColor Then _
                    GoTo NotCheckBox
            Next intX2

            'Check Vertical Line
            For intY2 = intY To intY + 21
                If Not arrBMP(intX, intY2, 1) = lineColor Then _
                    GoTo NotCheckBox
            Next intY2

            'Check open Pixel
            If arrBMP(intX + 11, intY + 1, 1) = openPixelColor Then
                'This is a Check Box!
                fChecked = False
                For intX2 = intX + 1 To intX + 19
                    For intY2 = intY + 1 To intY + 19
                        If Not arrBMP(intX2, intY2, 1) = openPixelColor Then
                            'Any off-color pixel means checked; force both loops to end
                            fChecked = True
                            intX2 = intX + 19
                            intY2 = intY + 19
                        End If
                    Next intY2
                Next intX2
                Debug.Print _
                    "Form: Pixel " & _
                    intX & ", " & _
                    intY & "; Checked: " & _
                    fChecked
            End If
        End If
NotCheckBox:
    Next intX
Next intY
4 Days Ago #46
twinnyfo
Expert Mod 2GB
Rabbit,

I see now what you are after with the If ... ElseIf. That now makes sense. It's still doing the same checks but in a more efficient manner.

And yes, I understood about the color channels. If one channel has the same value in the supposedly consistent field, then all channels will be the same. I am cutting down on calculations.

I've been unable to get this to work with either Hex values or Long Integers assigned from RGB(). I am sure I just rushed through it and did not do it right. For now, I am happy with how it works and how quickly it works. Maybe I'll re-look at it next week with fresh eyes.
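Two VBA behaviors might explain this, though that is only a guess without seeing the attempt. First, the built-in RGB() function stores red in the low byte (Red + Green * 256 + Blue * 65536), which is the reverse of the Red * 256 * 256 + Green * 256 + Blue formula suggested earlier. Second, multiplying an Integer by 256 twice overflows. A small sketch to run in the Immediate window (the procedure name is made up):

```vba
Public Sub ShowColorValuePitfalls()
    Dim intRed As Integer
    Dim lngPacked As Long

    ' 1) RGB() puts red in the LOW byte, so its hex digits read
    '    blue-green-red, not red-green-blue.
    Debug.Print Hex$(RGB(77, 80, 89))               ' 59504D
    Debug.Print Hex$(77 * 65536 + 80 * 256 + 89)    ' 4D5059

    ' Colors with equal channels (black, white, 222-gray) give the same
    ' number either way, so only the 77/80/89 outline exposes the mismatch.
    Debug.Print RGB(222, 222, 222) = &HDEDEDE       ' True

    ' 2) Integer * Integer stays Integer, so the second * 256 overflows.
    intRed = 77
    On Error Resume Next
    lngPacked = intRed * 256 * 256
    Debug.Print "Error number: " & Err.Number       ' 6 (Overflow)
    On Error GoTo 0

    ' Converting to Long first avoids the overflow.
    lngPacked = CLng(intRed) * 65536
    Debug.Print lngPacked                           ' 5046272
End Sub
```

If the array elements are Variants rather than Integers, the overflow may not occur, so the byte-order mismatch is the more likely of the two.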
4 Days Ago #47
Rabbit
Expert Mod 8TB
Let us know if it makes it into production!
3 Days Ago #48
