NNTP binary attachment downloader with asyncore and generators

Howdy,

I'm in the middle of a personal project. Eventually it will download
multipart binary attachments and look for missing parts on other
servers. And so far I've got it to walk a newsgroup and download and
decode single part binaries.

I thought I'd post the code and see what people think. I'd appreciate
any feedback. It's my first program with generators and I'm worried
I'm making this twice as hard as it needs to be.

Thanks,
David Fisher

Oh yeah, email is fake since I'm deathly afraid of spam. Please post
replies here.

Code was working when I posted it. Just change the server name,
username, password, group info at the bottom to something less
virtual. :)
#!/usr/bin/env python2.3
#
import asyncore
import socket
import os
import uu
import email
#
class Newser(asyncore.dispatcher):

    def __init__(self, host,port,user,password,group):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect( (host,port) )
        self.buffer = ''
        self.user = user
        self.password = password
        self.group = group
        self.n = 0
        self.head = ''
        self.body = ''
        self.inbuffer = ''
        self.dataline = ''
        self.handleline = self.handleline_gen()

    def handle_connect(self):
        pass

    def handle_close(self):
        pass

    def writable(self):
        return (len(self.buffer) > 0)

    def handle_write(self):
        print 'sending: ' + self.buffer.strip()
        sent = self.send(self.buffer)
        self.buffer = self.buffer[sent:]
        if self.buffer:
            print 'didnt send whole line' #does this ever happen?
            print 'didnt send whole line' #just getting my attention
            print 'didnt send whole line' #in case it does

    def handle_read(self):
        self.inbuffer += self.recv(8192)
        while 1:
            n = self.inbuffer.find('\r\n')
            if n > -1:
                self.dataline = self.inbuffer[:n+2]
                self.inbuffer = self.inbuffer[n+2:]
                try:
                    result = self.handleline.next()
                    if result == 'OK':
                        pass # everything is groovy
                    elif result == 'DONE':
                        self.del_channel() # group walk is finished
                        break
                    else:
                        print 'something has gone wrong!'
                        print result
                        print self.dataline
                        self.del_channel()
                        break
                except StopIteration:
                    print 'should never be here'
                    print 'why did my generator run out?'
                    print 'why god? why?!'
                    print self.dataline
                    self.del_channel()
                    break
            else:
                break

    def handleline_gen(self):
        #
        # handshakey stuff
        # welcome username password group
        # after this is set we'll start the message walk
        #
        if self.dataline[:3] == '200': # welcome, post ok
            print self.dataline.strip()
            self.buffer = 'authinfo user ' + self.user + '\r\n'
            yield 'OK'
        else:
            yield 'WTF?! fail welcome? god hates me!'
        #
        if self.dataline[:3] == '381': # more auth needed
            print self.dataline.strip()
            self.buffer = 'authinfo pass ' + self.password + '\r\n'
            yield 'OK'
        else:
            yield 'WTF?! fail authinfo user'
        #
        if self.dataline[:3] == '281': # auth ok, go to town!
            print self.dataline.strip()
            self.buffer = 'group ' + self.group + '\r\n'
            yield 'OK'
        else:
            yield 'WTF?! fail authinfo pass'
        #
        if self.dataline[:3] == '211': # group
            print self.dataline.strip()
            self.buffer = 'next\r\n'
            yield 'OK'
        else:
            yield 'WTF?! fail group'
        #
        # main state loop
        # walk from one message to the next
        # issuing HEAD and BODY for each
        # never reenter here after we receive '421', no next article
        # so we should never raise StopIteration
        #
        while 1:
            #
            if self.dataline[:3] == '223': # next
                print self.dataline.strip()
                self.buffer = 'head\r\n'
                yield 'OK'
            elif self.dataline[:3] == '421': # err, no next article
                yield 'DONE'
            else:
                yield 'WTF?! fail next'
            #
            if self.dataline[:3] == '221': # head
                print self.dataline.strip()
                self.head = ''
                yield 'OK'
                # XXX what am I going to do if the server explodes
                while self.dataline <> '.\r\n':
                    self.head += self.dataline
                    yield 'OK'
                # XXX parse headers here
                # XXX decide whether we want body
                self.buffer = 'body\r\n'
                yield 'OK'
            else:
                yield 'WTF?! fail head'
            #
            if self.dataline[:3] == '222': # body
                print self.dataline.strip()
                self.body = ''
                yield 'OK'
                # XXX what am I going to do if the server explodes
                while self.dataline <> '.\r\n':
                    # XXX line-by-line decode here (someday)
                    self.body += self.dataline
                    yield 'OK'
                self.decode()
                self.buffer = 'next\r\n'
                yield 'OK'
            else:
                yield 'WTF?! fail body'

    def decode(self):
        """decode message body.
        try UU first, just decode body
        then mime, decode head+body
        save in tempfile if fail"""
        tempname = 'temp' + `self.n` + '.txt'
        self.n += 1
        file(tempname,'wb').write(self.body)
        f = file(tempname)
        try:
            uu.decode(f)
        except Exception,v:
            print 'uu failed code: ',v
            print 'trying MIME'
            file(tempname,'wb').write(self.head+self.body)
            f = file(tempname)
            message = email.message_from_file(f)
            for part in message.walk():
                print part.get_content_type()
                filename = part.get_filename()
                if filename:
                    if not os.path.isfile(filename):
                        file(filename,'wb').write(part.get_payload(decode=True))
                        print 'yay! MIME!'
                        os.remove(tempname)
                    else:
                        print "oops, we've already got one"
        else:
            print 'yay! UU!'
            os.remove(tempname)

def main():
    mynews = Newser('news.server',119,'fishboy','pass','alt.binaries')
    try:
        asyncore.loop()
    except KeyboardInterrupt:
        mynews.del_channel()
        print 'yay! I quit!'

if __name__ == '__main__':
    main()


Jul 18 '05 #1
Fishboy,

Good start.

> self.buffer = ''
> self.inbuffer = ''
> self.dataline = ''

Depending on how much data you are looking to move, straight strings and
string concatenation are not always the fastest way to deal with things.

For incoming buffers...

self.inbuffer = []
#as we receive data
self.inbuffer.append(data)
#when we have as much as we need...
data = ''.join(self.inbuffer)
self.inbuffer = []
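
The append-then-join idea above can be sketched as a small helper. This is only an illustration, written for a current Python 3 (where sockets deliver bytes) rather than the Python 2.3 used in the thread; the class name ChunkBuffer and its methods are made up for this example and are not part of asyncore.

```python
class ChunkBuffer:
    """Collect received chunks in a list and join them only on demand."""

    def __init__(self):
        self.chunks = []   # pieces exactly as they arrived
        self.size = 0      # total number of buffered bytes

    def append(self, data):
        """Store one received piece without copying what is already buffered."""
        if data:
            self.chunks.append(data)
            self.size += len(data)

    def take_all(self):
        """Join everything buffered so far and reset the buffer."""
        data = b''.join(self.chunks)
        self.chunks = []
        self.size = 0
        return data


buf = ChunkBuffer()
for piece in (b'211 42 1 ', b'42 alt.test', b'\r\n'):
    buf.append(piece)      # each piece stands in for one self.recv(8192)
line = buf.take_all()      # a single join once enough data has arrived
```

Appending to a list keeps each recv cheap; the cost of building one large string is paid once, at the join.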

> def handle_close(self):
>     pass

Um... you are going to want to actually handle that close, with a
'self.close()'.

> def handle_write(self):
>     print 'sending: ' + self.buffer.strip()
>     sent = self.send(self.buffer)
>     self.buffer = self.buffer[sent:]
>     if self.buffer:
>         print 'didnt send whole line' #does this ever happen?
>         print 'didnt send whole line' #just getting my attention
>         print 'didnt send whole line' #in case it does

Try sending a few megabytes to it, you'll find the upper end of the
amount you can send at any one time.

Generally, it all depends on both the configuration of your TCP/IP
implementation (sending window size), as well as the actual throughput
and latencies of the connection to the other machine.

What asynchat does (and many other libraries do) is to pre-partition the
data into small chunks. Asynchat sticks with 512 bytes (a little small
IMO), but one would even be conservative at 1024 bytes (Ethernet frames
are ~1500 bytes, so there is even some headroom). Tune it based on what
your connection is doing over time; this is Python.
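
As a rough sketch of the pre-partitioning described above (not asynchat's actual code), the outgoing data can be cut into fixed-size pieces before any of it is handed to send(). The function name partition and the 1024-byte default are assumptions taken from the figures in this post; the example is written for Python 3.

```python
def partition(data, chunk_size=1024):
    """Split data into pieces of at most chunk_size bytes, in order."""
    if chunk_size <= 0:
        raise ValueError('chunk_size must be positive')
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


payload = b'x' * 2500
chunks = partition(payload)   # two full 1024-byte pieces and one shorter tail
```

Each piece is small enough that a single send() call is likely, though never guaranteed, to take all of it, so the return value of send() still has to be checked.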

Also, use a real FIFO class for buffering. You can modify Raymond's
fastest fifo implementation listed here:
http://aspn.activestate.com/ASPN/Coo...n/Recipe/68436
to insert those blocks that are accidentally not sent completely.
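
The recipe linked above is not reproduced here. As a stand-in, this is a minimal sketch of the same idea built on collections.deque, which only arrived in Python 2.4 and so was not available to the python2.3 script in this thread. OutgoingQueue, fake_send and capturing_send are hypothetical names for this example; fake_send merely imitates a socket that accepts a few bytes per call.

```python
from collections import deque


class OutgoingQueue:
    """FIFO of outgoing blocks; a partly sent block goes back on the front."""

    def __init__(self):
        self.blocks = deque()

    def push(self, data):
        """Queue one block for sending."""
        if data:
            self.blocks.append(data)

    def __len__(self):
        return len(self.blocks)

    def send_next(self, send):
        """Send the oldest block through send(), which returns bytes accepted."""
        if not self.blocks:
            return 0
        block = self.blocks.popleft()
        sent = send(block)
        if sent < len(block):
            # Partial send: requeue the remainder ahead of everything else.
            self.blocks.appendleft(block[sent:])
        return sent


def fake_send(data, limit=4):
    """Hypothetical stand-in for socket.send(): accepts at most limit bytes."""
    return min(len(data), limit)


wire = []   # what the fake socket accepted, in order


def capturing_send(data):
    accepted = fake_send(data)
    wire.append(data[:accepted])
    return accepted


queue = OutgoingQueue()
queue.push(b'head\r\n')
queue.push(b'body\r\n')
while queue:
    queue.send_next(capturing_send)
```

In a dispatcher, writable() would return whether the queue is non-empty and handle_write() would call send_next(self.send).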

> def handle_read(self):

Check out the handle_read() method of asynchat.async_chat. It does some
really good things to make line-based protocols work well, and there are
even some optimizations that can be done for your specific protocol.

> self.inbuffer += self.recv(8192)

Don't do the above. Not a good idea.
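
One possible alternative, loosely modelled on the terminator handling that asynchat offers, is sketched below. It does not use asynchat itself: LineCollector, feed and on_line are names invented for this example, and the code targets Python 3. Unlike the original handle_read(), the terminator is stripped before a line is handed over.

```python
class LineCollector:
    """Collect chunks until a terminator appears, then hand over whole lines."""

    def __init__(self, on_line, terminator=b'\r\n'):
        self.on_line = on_line        # callback invoked once per complete line
        self.terminator = terminator
        self.pending = []             # chunks of the current, incomplete line

    def feed(self, data):
        """Accept one recv() worth of data and emit every completed line."""
        self.pending.append(data)
        buffered = b''.join(self.pending)
        # One split per call instead of a find-and-slice for every line.
        lines = buffered.split(self.terminator)
        # The last element follows the final terminator (possibly empty);
        # it stays buffered until more data arrives.
        tail = lines.pop()
        self.pending = [tail] if tail else []
        for line in lines:
            self.on_line(line)


received = []
collector = LineCollector(received.append)
collector.feed(b'200 ok\r')              # terminator split across two chunks
collector.feed(b'\n381 more\r\n22')
collector.feed(b'1 head\r\n')
```

Because the unfinished tail is carried over, a '\r\n' that straddles two recv() calls is still recognised.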

The generator method looks reasonable, but I admit, I didn't read too
deeply (I really should be going to bed).
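
For readers on a later Python: generators gained a send() method in Python 2.5, so each server reply can be passed straight into the generator instead of going through a shared attribute like self.dataline. The following is only a sketch of the handshake part under that assumption, in Python 3; the function name handshake is made up, while the status codes are the ones the posted code checks for.

```python
def handshake(user, password, group):
    """Yield the next command to send; receive each server reply via send()."""
    reply = yield None   # nothing to send until the greeting arrives
    steps = [
        ('200', 'authinfo user ' + user),      # welcome, post ok
        ('381', 'authinfo pass ' + password),  # more auth needed
        ('281', 'group ' + group),             # auth ok
    ]
    for expected, command in steps:
        if not reply.startswith(expected):
            raise ValueError('unexpected reply: ' + reply)
        reply = yield command
    if not reply.startswith('211'):            # group selected
        raise ValueError('unexpected reply: ' + reply)
    yield 'next'


replies = ['200 welcome', '381 more auth needed', '281 auth ok',
           '211 group selected']
machine = handshake('fishboy', 'pass', 'alt.test')
next(machine)   # run up to the first yield, which waits for the greeting
commands = [machine.send(reply) for reply in replies]
```

An unexpected reply raises an exception here instead of yielding an error string, which is a design choice of this sketch, not something the original code does.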

Keep working with sockets, they can be fun (if not frustrating at
times). If at any point you contemplate going towards a threaded
server, just remember:
1. Asyncore (modified properly) can handle hundreds of connections (I'm
currently limited by the number of file handles Python is compiled with)
and saturate 100 Mbit with ease (I have written clients that saturate
1 Gbit for certain tasks).
2. Standard Python threads get laggy, and actually reduce bandwidth as
you approach 10-20 threads (word is that Stackless' tasklets are damn fast).
- Josiah
Jul 18 '05 #2
