Extracting Rich Text data formats from win32clipboard

Hi,

I'm trying to use Mark Hammond's win32clipboard module to extract more
complex data than just plain ASCII text from the Windows clipboard.
For instance, when you select all the content on a web page, you can
paste it into an app like Frontpage, or something Rich Text-aware, and
it will preserve all the formatting, HTML, etc. I'd like to include
that behavior in the application I'm writing.

In the interactive session below, before I run the clipboard_grab()
function, I've selected all of the www.google.com homepage in IE and
hit Control-C. The function cycles through all the formats stored on
the clipboard and loads up a data list with each type it finds.

Here's where it gets interesting: while data[2] is the textual data
that I would expect to see if I pasted the clipboard in a Notepad
file, data[0] and data[1] are in a weird, non-ASCII (binary?) format.
Are these pointers to (or metadata for) the actual HTML or rich text?
How do I use this data? Is there a reference I can use that will help
me decipher this information? Any help would be greatly appreciated.

Thanks!

----

Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, traceback
>>> import win32clipboard
>>>
>>> def clipboard_grab():
...     global format, formats, data
...     win32clipboard.OpenClipboard()
...     # Note: passing 0 to EnumClipboardFormats starts at the first format;
...     # starting from 1 (CF_TEXT) skips CF_TEXT and anything enumerated
...     # before it.
...     format = 1
...     formats = []
...     data = []
...     while 1:
...         format = win32clipboard.EnumClipboardFormats(format)
...         print "FORMAT:", format
...         if not format:
...             break
...         try:
...             datum = win32clipboard.GetClipboardData(format)
...             formats.append(format)
...             data.append(datum)
...         except:
...             print format, traceback.format_exception(sys.exc_type,
...                 sys.exc_value, sys.exc_traceback)
...     win32clipboard.EmptyClipboard()
...     win32clipboard.CloseClipboard()
...

>>> clipboard_grab()
FORMAT: 49171
FORMAT: 16
FORMAT: 7
FORMAT: 0
>>> len(data)
3
>>> data[0]
'\x00\x00\x00\x00\x18\x01\x00\x00\x01\x00\x00\x00\x06\x00\x00\x00\x00\x00\x00\x0
0\x00\x00\x00\x00\xe3\xc0\xc2w\x00\x00\x00\x00\x01\x00\x00\x00\xff\xff\xff\xff\x
01\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xa2\xc0\xe9\x02\x
00\x00\x00\x00\x01\x00\x00\x00\xff\xff\xff\xff\x01\x00\x00\x00\x01\x00\x00\x00\x
00\x00\x00\x00\x00\x00\x00\x00K\xc1\xc2w\x00\x00\x00\x00\x01\x00\x00\x00\xff\xff
\xff\xff\x01\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00L\xc1\xc
2w\x00\x00\x00\x00\x01\x00\x00\x00\xff\xff\xff\xff\x01\x00\x00\x00\x01\x00\x00\x
00\x00\x00\x00\x00\x00\x00\x00\x00\r\x00\xc2w\x00\x00\x00\x00\x01\x00\x00\x00\xf
f\xff\xff\xff\x01\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0
1\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\xff\xff\xff\xff\x01\x00\x00\x00\x0
1\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0
0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0
0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0
0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> data[1]
'\t\x04\x00\x00'
>>> data[2]
'\r\n \tWeb\t \tImages\t \tGroups\t \tDirectory\t \tNews\t \r\n\r\n\t\r\n\t \x07 Advanced Search\r\n \x07 Preferences\r\n \x07 Language Tools\r\n\r\n\r\nAdvertise with Us - Business Solutions - Services & Tools - Jobs, Press, & Help\r\n\r\nc2003 Google - Searching 3,307,998,701 web pages'

Jul 18 '05 #1
> >>> clipboard_grab()
> FORMAT: 49171
> FORMAT: 16
> FORMAT: 7


7 = CF_OEMTEXT
16 = CF_LOCALE
49171 = 0xC013 = apparently OLE private data
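
To make the lookup above repeatable, here is a minimal sketch in plain Python. The table is a partial copy of the predefined format ids from the Win32 headers, and `describe_format` is a hypothetical helper name, not part of win32clipboard. It only labels ids; it does not touch the clipboard.

```python
# Partial table of predefined clipboard format ids (Win32 header values).
STANDARD_FORMATS = {
    1: "CF_TEXT",
    2: "CF_BITMAP",
    7: "CF_OEMTEXT",
    8: "CF_DIB",
    13: "CF_UNICODETEXT",
    15: "CF_HDROP",
    16: "CF_LOCALE",
}


def describe_format(fmt):
    """Return a short label for a clipboard format id (hypothetical helper)."""
    if fmt in STANDARD_FORMATS:
        return STANDARD_FORMATS[fmt]
    if 0xC000 <= fmt <= 0xFFFF:
        # Registered at run time: the real name has to be asked of Windows,
        # for example with win32clipboard.GetClipboardFormatName(fmt).
        return "registered format 0x%04X" % fmt
    return "unknown format %d" % fmt


for fmt in (49171, 16, 7):
    print(fmt, "=", describe_format(fmt))
```

Ids in the registered range can differ between machines and sessions, so it is safer to look them up by name than to hard-code a number like 49171.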

That should help you with some searches. Basically the CF_OEMTEXT is the
only one that's going to be useful for you, unless you can figure out what
to do with the OLE private data.
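
As a side note on the CF_LOCALE entry: its payload is a locale identifier (LCID), so the four bytes shown as data[1] in the original post can be read as a little-endian 32-bit integer. A small sketch, assuming the data arrives as a bytes object (`decode_cf_locale` is a made-up name for illustration):

```python
import struct


def decode_cf_locale(raw):
    """Interpret CF_LOCALE clipboard bytes as a little-endian 32-bit LCID."""
    (lcid,) = struct.unpack("<I", raw[:4])
    return lcid


# data[1] from the session in the first post: '\t\x04\x00\x00'
lcid = decode_cf_locale(b"\t\x04\x00\x00")
print(hex(lcid))
```

The value 0x0409 is the LCID for English (United States), which tells a consumer which code page the accompanying text was written in.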

-Mike
Jul 18 '05 #2
Thanks for your help, Neil! Your example code gave me an idea of what I
should be seeing when the HTML/RTF stuff is working properly. I'd
been using a non-IE browser (Firebird) for testing, and it wasn't
giving me those results. Thanks for getting me on track!

Trader

"Neil Hodgson" <nh******@bigpond.net.au> wrote in message news:<rl*******************@news-server.bigpond.net.au>...
Trader:
> >>> clipboard_grab()
> FORMAT: 49171
> FORMAT: 16
> FORMAT: 7
> FORMAT: 0


Now add in:

for f in formats:
    if f >= 0xC000:
        print win32clipboard.GetClipboardFormatName(f)

Formats from 0xC000 through 0xFFFF are dynamically registered clipboard
formats. I get:

FORMAT: 13
FORMAT: 49278
FORMAT: 49245
FORMAT: 49171
FORMAT: 16
FORMAT: 7
FORMAT: 0

HTML Format
Rich Text Format
Ole Private Data
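
One likely reason a full listing like this can come up short is the starting value of the enumeration: the Win32 EnumClipboardFormats call returns the format that follows the one passed in, and 0 asks for the first. The sketch below shows that loop in isolation. `iter_formats` and `fake_enum` are hypothetical names, and the fake stands in for the real call so the loop can be tried without a Windows clipboard.

```python
def iter_formats(enum_next):
    """Yield clipboard format ids in enumeration order.

    enum_next(previous) must behave like EnumClipboardFormats: it returns
    the format that follows `previous`, or 0 when there are no more.
    """
    fmt = enum_next(0)  # 0 asks for the first format
    while fmt:
        yield fmt
        fmt = enum_next(fmt)


def fake_enum(order):
    """Build a stand-in for EnumClipboardFormats over a fixed list (demo only)."""
    def enum_next(previous):
        if previous == 0:
            return order[0] if order else 0
        i = order.index(previous) + 1
        return order[i] if i < len(order) else 0
    return enum_next


formats = list(iter_formats(fake_enum([13, 49278, 49245, 49171, 16, 7])))
registered = [f for f in formats if f >= 0xC000]
print(formats)
print(registered)
```

On Windows with pywin32 installed, the same loop should work by passing win32clipboard.EnumClipboardFormats as `enum_next` between OpenClipboard() and CloseClipboard().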

The HTML has a prologue and then some HTML:

Version:1.0
StartHTML:000000195
EndHTML:000001891
StartFragment:000001597
EndFragment:000001710
StartSelection:000001597
EndSelection:000001710
SourceURL:http://sydney.citysearch.com.au/
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD><TITLE>CitySearch.com.au Australia - Your guide to the city of
Sydney</TITLE>
...
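
The numbers in that prologue are byte offsets from the start of the clipboard data, so the copied fragment can be sliced out directly. Here is a rough sketch, assuming the data is available as bytes. `parse_cf_html` and `build_cf_html` are hypothetical helpers, and the second one exists only to produce a small sample payload to try the parser on.

```python
def parse_cf_html(raw):
    """Split "HTML Format" clipboard bytes into (header dict, fragment bytes)."""
    header = {}
    for line in raw.splitlines():
        key, sep, value = line.partition(b":")
        if not sep or line.startswith(b"<"):
            break  # the first line that is not "Key:Value" ends the prologue
        header[key.decode("ascii")] = value.decode("ascii", "replace")
    start = int(header["StartFragment"])
    end = int(header["EndFragment"])
    return header, raw[start:end]


def build_cf_html(fragment):
    """Wrap an HTML fragment in a minimal payload (demo helper only)."""
    prefix = b"<html><body><!--StartFragment-->"
    suffix = b"<!--EndFragment--></body></html>"
    template = ("Version:1.0\r\nStartHTML:%09d\r\nEndHTML:%09d\r\n"
                "StartFragment:%09d\r\nEndFragment:%09d\r\n")
    # Fixed-width fields keep the header length independent of the offsets.
    start_html = len(template % (0, 0, 0, 0))
    start_frag = start_html + len(prefix)
    end_frag = start_frag + len(fragment)
    end_html = end_frag + len(suffix)
    header = template % (start_html, end_html, start_frag, end_frag)
    return header.encode("ascii") + prefix + fragment + suffix


payload = build_cf_html(b"<b>hello</b>")
header, fragment = parse_cf_html(payload)
print(header["StartFragment"], header["EndFragment"])
print(fragment)
```

With pywin32, the id for this format can be obtained by name with win32clipboard.RegisterClipboardFormat("HTML Format") and then passed to GetClipboardData, which avoids relying on a number such as 49278.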

The RTF looks normal:

{\rtf1\ansi\ansicpg-1\deff0\deflang3081{\fonttbl{\f0\froman\fcharset0 Times
New Roman;}{\f1\ftech\fcharset0 Symbol;}{\f2\fswiss\fcharset0
Arial;}{\f3\fswiss\fcharset0 Courier New;}{\f4\ftech\fcharset0
Wingdings;}}{\colortbl\red0\green0\blue0;\red0\green0\blue255;\red0\green255
\blue255;\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\r
ed255\green255\blue0;\red255\green255\blue255;\
...

Neil

Jul 18 '05 #3
