yEnc implementation in Python, bit slow

Hi,

I posted a while ago for some help with my word finder program, which is now
quite a lot faster than I could manage. Thanks to all who helped :)

This time, I've written a basic batch binary usenet poster in Python, but
encoding the data into yEnc format is fairly slow. Is it possible to improve
the routine any, WITHOUT using non-standard libraries? I don't want to have
to rely on something strange ;)

yEncode1 tends to be slightly faster here for me on my K6/2 500:

$ python2.3 testyenc.py
yEncode1 401563 1.82
yEncode1 401563 1.83
yEncode2 401562 1.83
yEncode2 401562 1.83

Any help would be greatly appreciated :)

Freddie
import struct
import time
from zlib import crc32

def timing(f, n, a):
    print f.__name__,
    r = range(n)
    t1 = time.clock()
    for i in r:
        #f(a); f(a); f(a); f(a); f(a); f(a); f(a); f(a); f(a); f(a)
        f(a)
    t2 = time.clock()
    print round(t2-t1, 3)

def yEncSetup():
    global YENC
    YENC = [''] * 256

    for i in range(256):
        o = (i + 42) % 256
        if o in (0, 10, 13, 61):
            # The spec says to take this modulo 256 too, but the largest
            # value here is 61 + 64 = 125, so it would be a no-op.
            o += 64
            YENC[i] = '=%c' % o
        else:
            YENC[i] = '%c' % o

def yEncode1(data):
    global YENC
    yenc = YENC

    encoded = []
    datalen = len(data)
    n = 0
    while n < datalen:
        chunk = data[n:n+256]
        n += len(chunk)
        encoded.extend([yenc[ord(c)] for c in chunk])
        encoded.append('\n')

    print len(''.join(encoded)),

def yEncode2(data):
    global YENC
    yenc = YENC

    lines = []
    datalen = len(data)

    # Split off the full 256-byte lines. Slicing at the computed offset
    # (instead of data[:-rest]) also works when the remainder is 0.
    full, rest = divmod(datalen, 256)
    format = '256s' * full
    parts = struct.unpack(format, data[:full*256])
    for part in parts:
        lines.append(''.join([yenc[ord(c)] for c in part]))

    if rest:
        lines.append(''.join([yenc[ord(c)] for c in data[full*256:]]))
    print len('\n'.join(lines) + '\n'),

yEncSetup()

teststr1 = 'a' * 400000
teststr2 = 'b' * 400000

for meth in (yEncode1, yEncode2):
    timing(meth, 1, teststr1)
    timing(meth, 1, teststr2)
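
A note for anyone running this on a current interpreter: time.clock() was removed in Python 3.8, and print is a function in Python 3. Below is a minimal sketch of the same kind of measurement using the standard timeit module. The helper name time_call and the stand-in workload are assumptions made for illustration; they are not part of the posted script.

```python
import timeit


def time_call(func, data, repeat=3):
    """Return the best wall-clock time (seconds) of func(data) over `repeat` runs."""
    timer = timeit.Timer(lambda: func(data))
    # Each run calls func(data) exactly once; the minimum is the least noisy figure.
    return min(timer.repeat(repeat=repeat, number=1))


sample = b"a" * 400000
# bytes.upper is only a placeholder workload so the sketch runs on its own;
# an encoder function would be passed here instead.
print("upper", round(time_call(bytes.upper, sample), 3))
```

Taking the minimum of a few runs tends to be more stable than a single measurement, which matters when the two candidates differ by only 0.01 s as above.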

--
Remove the oinks!
Jul 18 '05 #1
On Tue, Aug 05, 2003 at 12:50:58AM +1000, Freddie wrote:
> Hi,
>
> I posted a while ago for some help with my word finder program, which is now
> quite a lot faster than I could manage. Thanks to all who helped :)
>
> This time, I've written a basic batch binary usenet poster in Python, but
> encoding the data into yEnc format is fairly slow. Is it possible to improve
> the routine any, WITHOUT using non-standard libraries? I don't want to have
> to rely on something strange ;)


Python is pretty quick as long as you avoid loops that operate character
by character. Try to use functions that operate on longer strings.

Suggestions:

For the (x+42)%256 build a translation table and use str.translate.
To encode characters as escape sequences use str.replace or re.sub.
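
A rough sketch of these two suggestions, written for Python 3 bytes rather than the Python 2 strings used elsewhere in this thread. The names SHIFT and yenc_bytes are made up for the example, and line splitting is deliberately left out.

```python
# Translation table for the (x + 42) % 256 shift: entry i holds the shifted byte.
SHIFT = bytes((i + 42) % 256 for i in range(256))


def yenc_bytes(data):
    """Shift every byte, then escape the four critical output bytes."""
    out = data.translate(SHIFT)
    # '=' (61) has to be escaped first; otherwise the '=' markers added by
    # the later replacements would themselves be escaped a second time.
    for critical in (61, 0, 10, 13):
        out = out.replace(bytes([critical]), bytes([61, critical + 64]))
    return out
```

For instance, yenc_bytes(bytes([214])) gives b"=@", because 214 + 42 wraps to 0 and NUL is escaped as '=' followed by 0 + 64. Both translate and replace run in C over the whole buffer, which is where the speedup over a per-character loop comes from.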

Oren

Jul 18 '05 #2
Oren Tirosh <or*******@hishome.net> wrote in
news:ma**********************************@python.org:
> Suggestions:
>
> For the (x+42)%256 build a translation table and use str.translate.
> To encode characters as escape sequences use str.replace or re.sub.
>
> Oren


Aahh. I couldn't work out how to use translate() at 4am this morning, but I
worked it out now :) This version is a whoooole lot faster, and actually
meets the yEnc line splitting spec. Bonus!

$ python2.3 testyenc.py
yEncode1 407682 1.98
yEncode2 407707 0.18

I'm not sure how to use re.sub to escape the characters; I assume it would
also be 4 separate replaces? Also, it needs a slightly more random input
string than 'a' * 400000, so here we go.

test = []
for i in xrange(256):
    test.append(chr(i))
teststr = ''.join(test*1562)

def yEncode2(data):
    trans = ''
    for i in range(256):
        trans += chr((i+42)%256)

    translated = data.translate(trans)

    # escape =, NUL, LF, CR ('=' first, so the escapes added for the
    # other three are not escaped again)
    for i in (61, 0, 10, 13):
        j = '=%c' % (i + 64)
        translated = translated.replace(chr(i), j)

    encoded = []
    n = 0
    for i in range(0, len(translated), 256):
        chunk = translated[n+i:n+i+256]
        if chunk[-1] == '=':
            # off by one: the +1 is dropped in the follow-up post
            chunk += translated[n+i+256+1]
            n += 1
        encoded.append(chunk)
        encoded.append('\n')

    result = ''.join(encoded)

    print len(result),
    return result
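
On the re.sub question above: it does not have to be four separate passes. A single character-class pattern with a replacement function can handle all four bytes at once. The sketch below assumes Python 3 bytes input that has already been shifted by 42; the names CRITICAL and escape_critical are hypothetical.

```python
import re

# One character class covering '=', NUL, LF and CR in the shifted data.
CRITICAL = re.compile(rb"[=\x00\n\r]")


def escape_critical(shifted):
    """Escape the four critical bytes in one pass over already-shifted data."""
    # m.group() is a one-byte bytes object; indexing it yields the integer value.
    return CRITICAL.sub(lambda m: bytes([61, m.group()[0] + 64]), shifted)
```

Whether this beats four str.replace calls is not obvious, since the replacement function is called from Python for every match; it would need to be timed on realistic data.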

--
-----------------------------------------------------------
Remove the oinks!
Jul 18 '05 #3
Freddie <oi*********@oinkshlick.oinknet> wrote in
news:Xn**********************************@218.100.3.9:

Arr. There's an error here: the [n+i+256+1] shouldn't have the +1. I always get
that wrong :) The posted files actually decode now, and the yEncode()
overhead is a lot lower.

<snip>
    encoded = []
    n = 0
    for i in range(0, len(translated), 256):
        chunk = translated[n+i:n+i+256]
        if chunk[-1] == '=':
            chunk += translated[n+i+256]    # <<< this line
            n += 1
        encoded.append(chunk)
        encoded.append('\n')
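
For comparison, here is a sketch of the same line-splitting idea in Python 3, tracking a single position instead of a loop index plus an offset. This also avoids indexing an empty chunk near the end of the data. It relies on the fact that every '=' in the escaped stream is an escape marker, so a line that would end on '=' can simply take one more byte. All names (SHIFT, escape, split_lines, decode) are assumptions for illustration, and the decoder is only there to check the round trip.

```python
SHIFT = bytes((i + 42) % 256 for i in range(256))


def escape(data):
    """Shift by 42 and escape '=', NUL, LF, CR ('=' first)."""
    out = data.translate(SHIFT)
    for critical in (61, 0, 10, 13):
        out = out.replace(bytes([critical]), bytes([61, critical + 64]))
    return out


def split_lines(encoded, width=256):
    """Cut escaped data into lines of `width` bytes, letting a line run one
    byte long rather than separating '=' from the byte it escapes."""
    lines = []
    pos = 0
    while pos < len(encoded):
        end = pos + width
        # If the line would end on the escape marker, take its partner too.
        if encoded[end - 1:end] == b"=":
            end += 1
        lines.append(encoded[pos:end])
        pos = end
    return b"".join(line + b"\n" for line in lines)


def decode(text):
    """Reverse escape() and split_lines(); used here only as a check."""
    out = bytearray()
    escaped = False
    # Raw LF never occurs inside the escaped data, so dropping it is safe.
    for b in text.replace(b"\n", b""):
        if escaped:
            out.append((b - 64 - 42) % 256)
            escaped = False
        elif b == 61:
            escaped = True
        else:
            out.append((b - 42) % 256)
    return bytes(out)


payload = bytes(range(256)) * 4
assert decode(split_lines(escape(payload))) == payload
```

The final assertion is the useful part: a round trip over every byte value catches exactly the kind of off-by-one discussed here.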


--
Remove the oinks!
Jul 18 '05 #4
