Hex editor display - can this be more pythonic?

CC
Hi:

I'm building a hex line editor as a first real Python programming exercise.

Yesterday I posted about how to print the hex bytes of a string. There
are two decent options:

ln = '\x00\x01\xFF 456\x0889abcde~'
import sys
for c in ln:
    sys.stdout.write( '%.2X ' % ord(c) )

or this:

sys.stdout.write( ' '.join( ['%.2X' % ord(c) for c in ln] ) + ' ' )

Either of these produces the desired output:

00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E

I find the former more readable and simpler. The latter however has a
slight advantage in not putting a space at the end unless I really want
it. But which is more pythonic?

The next step consists of printing out the ASCII printable characters.
I have devised the following silliness:

printable = ' 1!2@3#4$5%6^7&8*9(0)aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ\
`~-_=+\\|[{]};:\'",<.>/?'
for c in ln:
    if c in printable: sys.stdout.write(c)
    else: sys.stdout.write('.')

print

Which when following the list comprehension based code above, produces
the desired output:

00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E ... 456.89abcde~

I had considered using the .translate() method of strings, however this
would require a larger translation table than my printable string. I
was also using the .find() method of the printable string before
realizing I could use 'in' here as well.

I'd like to display the non-printable characters differently, since they
can't be distinguished from genuine period '.' characters. Thus, I may
use ANSI escape sequences like:

for c in ln:
    if c in printable: sys.stdout.write(c)
    else:
        sys.stdout.write('\x1B[31m.')
        sys.stdout.write('\x1B[0m')

print

I'm also toying with the idea of showing hex bytes together with their
ASCII representations, since I've often found it a chore to figure out
which hex byte to change if I wanted to edit a certain ASCII char.
Thus, I might display data something like this:

00(\0) 01() FF() 20( ) 34(4) 35(5) 36(6) 08(\b) 38(8) 39(9) 61(a) 62(b)
63(c) 64(d) 65(e) 7E(~)

Where printing chars are shown in parentheses, characters with Python
escape sequences will be shown as their escapes in parens, while
non-printing chars with no escapes will be shown with nothing in parens.

Or perhaps a two-line output with offset addresses under the data. So
many possibilities!
Thanks for input!


--
_____________________
Christopher R. Carlen
cr***@bogus-remove-me.sbcglobal.net
SuSE 9.1 Linux 2.6.5
Jul 29 '07 #1
On Sun, 29 Jul 2007 12:24:56 -0700, CC wrote:
> ln = '\x00\x01\xFF 456\x0889abcde~'
> import sys
> for c in ln:
>     sys.stdout.write( '%.2X ' % ord(c) )
>
> or this:
>
> sys.stdout.write( ' '.join( ['%.2X' % ord(c) for c in ln] ) + ' ' )
>
> Either of these produces the desired output:
>
> 00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E
>
> I find the former more readable and simpler. The latter however has a
> slight advantage in not putting a space at the end unless I really want
> it. But which is more pythonic?

I would use the second with fewer spaces, a longer name for `ln`, and, in
recent Python versions, a generator expression instead of the list
comprehension:

sys.stdout.write(' '.join('%02X' % ord(c) for c in line))

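
For readers on Python 3, where iterating over a bytes object yields integers rather than one-character strings, the join-based approach might look like the sketch below. The helper name `hex_bytes` is made up for illustration and does not appear in the thread.

```python
def hex_bytes(data):
    """Return the bytes of `data` as two-digit uppercase hex, space-separated."""
    # Iterating over bytes yields ints in Python 3, so no ord() is needed.
    return ' '.join('%02X' % b for b in data)


line = b'\x00\x01\xFF 456\x0889abcde~'
print(hex_bytes(line))
```

Because join() only puts the separator between items, there is no trailing space unless one is appended on purpose.
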
> The next step consists of printing out the ASCII printable characters.
> I have devised the following silliness:
>
> printable = ' 1!2@3#4$5%6^7&8*9(0)aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ\
> `~-_=+\\|[{]};:\'",<.>/?'

I'd use `string.printable` and remove the "invisible" characters like '\n'
or '\t'.

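
A minimal Python 3 sketch of that idea, assuming the data is a bytes object. The names `VISIBLE` and `ascii_column` are hypothetical, chosen only for this example.

```python
import string

# string.printable also contains tab, newline, carriage return, vertical tab
# and form feed, which would break a one-line display, so drop those five.
VISIBLE = set(string.printable) - set('\t\n\r\x0b\x0c')


def ascii_column(data, placeholder='.'):
    """Map each byte to its character if visible, else to `placeholder`."""
    return ''.join(chr(b) if chr(b) in VISIBLE else placeholder for b in data)


print(ascii_column(b'\x00\x01\xFF 456\x0889abcde~'))
```

The plain space stays in the set, so it is shown as a space rather than a dot.
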
> for c in ln:
>     if c in printable: sys.stdout.write(c)
>     else: sys.stdout.write('.')
>
> print
>
> Which when following the list comprehension based code above, produces
> the desired output:
>
> 00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E ... 456.89abcde~
>
> I had considered using the .translate() method of strings, however this
> would require a larger translation table than my printable string.

The translation table can be created once and should be faster.

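
One way the build-once table could look in Python 3, where bytes.translate() takes a 256-byte table. This is only a sketch: `DOT_TABLE` and `translate_column` are names invented here, and the choice of which bytes count as visible is an assumption carried over from the `string.printable` suggestion.

```python
import string

# Characters to show as-is: string.printable minus the layout-disturbing
# whitespace (tab, newline, carriage return, vertical tab, form feed).
_visible = set(string.printable) - set('\t\n\r\x0b\x0c')

# Build the 256-entry table once; each byte maps to itself or to '.'.
DOT_TABLE = bytes(b if chr(b) in _visible else ord('.') for b in range(256))


def translate_column(data):
    """Replace every non-visible byte with '.' in a single translate() call."""
    return data.translate(DOT_TABLE).decode('ascii')


print(translate_column(b'\x00\x01\xFF 456\x0889abcde~'))
```

The table costs 256 bytes regardless of how many characters are considered printable, and the per-byte work becomes an index lookup instead of a search.
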
> I'd like to display the non-printable characters differently, since they
> can't be distinguished from genuine period '.' characters. Thus, I may
> use ANSI escape sequences like:
>
> for c in ln:
>     if c in printable: sys.stdout.write(c)
>     else:
>         sys.stdout.write('\x1B[31m.')
>         sys.stdout.write('\x1B[0m')
>
> print

`re.sub()` might be an option here.

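
A hedged Python 3 sketch of how `re.sub()` could produce the coloured display. The helper name `highlight` and the decision to treat everything outside 0x20-0x7E as non-printable are assumptions made for this example. The function builds a new display string and leaves the input bytes untouched.

```python
import re

RED = b'\x1b[31m'
RESET = b'\x1b[0m'

# Any run of bytes outside the visible ASCII range 0x20-0x7E.
NON_PRINTABLE = re.compile(rb'[^\x20-\x7e]+')


def highlight(data):
    """Return display text with each non-printable run shown as red dots."""
    def red_dots(match):
        # One dot per hidden byte, wrapped in a single colour on/off pair.
        return RED + b'.' * len(match.group()) + RESET
    return NON_PRINTABLE.sub(red_dots, data).decode('ascii')


print(highlight(b'\x00\x01\xFF 456\x0889abcde~'))
```

Passing a function as the replacement avoids any backslash processing of the replacement text, and a whole run of non-printable bytes needs only one pair of escape sequences.
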
> I'm also toying with the idea of showing hex bytes together with their
> ASCII representations, since I've often found it a chore to figure out
> which hex byte to change if I wanted to edit a certain ASCII char. Thus,
> I might display data something like this:
>
> 00(\0) 01() FF() 20( ) 34(4) 35(5) 36(6) 08(\b) 38(8) 39(9) 61(a) 62(b)
> 63(c) 64(d) 65(e) 7E(~)
>
> Where printing chars are shown in parentheses, characters with Python
> escape sequences will be shown as their escapes in parens, while
> non-printing chars with no escapes will be shown with nothing in parens.

For escaping:

In [90]: '\n'.encode('string-escape')
Out[90]: '\\n'
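
As far as I know the `string-escape` codec is specific to Python 2, so the sketch below sidesteps it with a small lookup table and targets Python 3 bytes. The names `ESCAPES` and `annotate` are hypothetical; the output format follows the "hex(char)" layout proposed in the original post.

```python
# Short escapes that Python string literals understand, keyed by byte value.
# '\0' is included because the example display in the thread shows it.
ESCAPES = {0x00: '\\0', 0x07: '\\a', 0x08: '\\b', 0x09: '\\t',
           0x0A: '\\n', 0x0B: '\\v', 0x0C: '\\f', 0x0D: '\\r'}


def annotate(data):
    """Format each byte as HEX(label); the label may be empty."""
    parts = []
    for b in data:
        if 0x20 <= b <= 0x7E:
            label = chr(b)               # visible ASCII, shown as-is
        else:
            label = ESCAPES.get(b, '')   # known escape, or nothing
        parts.append('%02X(%s)' % (b, label))
    return ' '.join(parts)


print(annotate(b'\x00\x01\xFF 456\x0889abcde~'))
```
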

Ciao,
Marc 'BlackJack' Rintsch
Jul 29 '07 #2
CC
Marc 'BlackJack' Rintsch wrote:
> On Sun, 29 Jul 2007 12:24:56 -0700, CC wrote:
>> The next step consists of printing out the ASCII printable characters.
>> I have devised the following silliness:
>>
>> printable = ' 1!2@3#4$5%6^7&8*9(0)aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ\
>> `~-_=+\\|[{]};:\'",<.>/?'
>
> I'd use `string.printable` and remove the "invisible" characters like '\n'
> or '\t'.

What is `string.printable`? There is no printable method to strings,
though I had hoped there would be. I don't yet know how to make one.

>> for c in ln:
>>     if c in printable: sys.stdout.write(c)
>>     else: sys.stdout.write('.')
>
> The translation table can be created once and should be faster.

I suppose the way I'm doing it requires a search through `printable` for
each c, right? Whereas the translation would just be a lookup
operation? If so then perhaps the translation would be better.

>> I'd like to display the non-printable characters differently, since they
>> can't be distinguished from genuine period '.' characters. Thus, I may
>> use ANSI escape sequences like:
>>
>> for c in ln:
>>     if c in printable: sys.stdout.write(c)
>>     else:
>>         sys.stdout.write('\x1B[31m.')
>>         sys.stdout.write('\x1B[0m')
>>
>> print
>
> `re.sub()` might be an option here.

Yeah, that is an interesting option. Since I don't wish to modify the
block of data unless the user specifically edits it, I might prefer
the simple display operation.

> For escaping:
>
> In [90]: '\n'.encode('string-escape')
> Out[90]: '\\n'

Hmm, I see there's an encoder that can do my hex display too.

Thanks for the input!

Jul 30 '07 #3
CC
Dennis Lee Bieber wrote:
> On Sun, 29 Jul 2007 12:24:56 -0700, CC <cr***@BOGUS.sbcglobal.net>
> declaimed the following in comp.lang.python:
>> for c in ln:
>>     if c in printable: sys.stdout.write(c)
>>     else:
>>         sys.stdout.write('\x1B[31m.')
>>         sys.stdout.write('\x1B[0m')
>
> Be aware that this does require having a terminal that understands
> the escape sequences (which, to my understanding, means unusable on a
> WinXP console window)

Yeah, with this I'm not that concerned about Windows. Though, can WinXP
still load the ansi.sys driver?

>> Thus, I might display data something like this:
>>
>> 00(\0) 01() FF() 20( ) 34(4) 35(5) 36(6) 08(\b) 38(8) 39(9) 61(a) 62(b)
>> 63(c) 64(d) 65(e) 7E(~)
>
> UGH!

:-D Lovely isn't it?
> If the original "hex bytes dotted ASCII" side by side isn't
> workable, I'd suggest going double line...
>
> 00  01  FF  20  34  35  36  08  38  39  61  62  63  64  65  7E
> nul soh xFF sp  4   5   6   bs  8   9   a   b   c   d   e   ~

Yeah, something like that is probably nicer.
> Use the standard "name" for the control codes (though I shortened
> "space" to "sp"), and maybe just duplicate the hex for non-named,
> non-printable codes (mostly those in the x80-xFF range, unless you are
> NOT using ASCII but something like ISO-Latin-1)

I've got a lot to learn about this encoding business.
> Allowing for the names means using a field width of four. Using a
> line width of 16 data bytes makes for an edit window width of 64, and
> you could fit a hex offset at the left of each line to indicate what
> part of the file is being worked on.

Right.
Thanks for the reply!
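
The two-line layout discussed above (field width of four, 16 data bytes per row, hex offset at the left) could be sketched in Python 3 roughly as follows. The function names `label` and `two_line_dump`, the 8-digit offset, and the use of `del` for 0x7F are assumptions made for this example, not something settled in the thread.

```python
# Standard ASCII control-code names for 0x00-0x1F, in lowercase to match
# the example in the quoted post. Space and DEL are handled in label().
NAMES = ('nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si '
         'dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us').split()


def label(b):
    """Return a short display name for one byte value."""
    if b < 0x20:
        return NAMES[b]
    if b == 0x20:
        return 'sp'
    if b == 0x7F:
        return 'del'
    if b < 0x7F:
        return chr(b)
    return 'x%02X' % b          # no standard name: repeat the hex


def two_line_dump(data, width=16):
    """Return lines: offset plus hex row, then an aligned row of names."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_row = ''.join('%-4s' % ('%02X' % b) for b in chunk)
        name_row = ''.join('%-4s' % label(b) for b in chunk)
        lines.append('%08X  %s' % (offset, hex_row.rstrip()))
        lines.append('%8s  %s' % ('', name_row.rstrip()))
    return lines


for row in two_line_dump(b'\x00\x01\xFF 456\x0889abcde~'):
    print(row)
```

With four columns per byte, each name sits directly under its hex value, which makes it easy to see which byte to change.
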
Jul 30 '07 #4
On Sun, 29 Jul 2007 18:27:25 -0700, CC wrote:
> Marc 'BlackJack' Rintsch wrote:
>> I'd use `string.printable` and remove the "invisible" characters like '\n'
>> or '\t'.
>
> What is `string.printable`? There is no printable method to strings,
> though I had hoped there would be. I don't yet know how to make one.

In [8]: import string

In [9]: string.printable
Out[9]: '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'

>>> for c in ln:
>>>     if c in printable: sys.stdout.write(c)
>>>     else: sys.stdout.write('.')
>> The translation table can be created once and should be faster.
>
> I suppose the way I'm doing it requires a search through `printable` for
> each c, right? Whereas the translation would just be a lookup
> operation?

Correct. And it is written in C.

Ciao,
Marc 'BlackJack' Rintsch
Jul 30 '07 #5
On 2007-07-30, Dennis Lee Bieber <wl*****@ix.netcom.com> wrote:
> On Sun, 29 Jul 2007 18:30:22 -0700, CC <cr***@BOGUS.sbcglobal.net>
> declaimed the following in comp.lang.python:
>>
>> Yeah, with this I'm not that concerned about Windows. Though, can WinXP
>> still load the ansi.sys driver?
>
> I'm actually not sure...
>
> I think if one uses the 16-bit command parser it is available, but
> not the 32-bit parser...
>
> command.com vs cmd.exe

Yes. You can load the ansi.sys driver in command.com on Windows
2000 and XP, and it will work with simple batch files. But it
doesn't work with Python, for reasons I don't know enough about
Windows console programs to understand.

--
Neil Cerutti
The audience is asked to remain seated until the end of the recession.
--Church Bulletin Blooper
Jul 30 '07 #6
