For an application of mine, I need to fetch messages from Usenet
and group them by thread. My starting Python code looks
as follows.
<--- Startup code to read messages from a newsgroup -->
import nntplib, cStringIO, rfc822, sys

SRVR = '<my_news_server>'  # Your news server
newsgroup = 'comp.lang.c'  # Group of your choice

def inpdflt(s, d):
    resp = raw_input("%s [%s]: " % (s, d))
    return resp or d

news = nntplib.NNTP(SRVR)
resp, estimate, first, last, name = news.group(newsgroup)
if estimate == '0':
    sys.exit("No messages in " + newsgroup)

#
# Get (article number, subject, poster, date, id, references, size, lines)
# for each of the articles between first and last
#
xover = news.xover(first, last)

# loop through articles, extracting headers
for x in xover[1]:
    # x == (article number, subject, poster, date, id, references, size, lines)
    try:
        hdrs = news.head(x[0])[3]
        mesg = rfc822.Message(cStringIO.StringIO("\r\n".join(hdrs)))
        print '%s\n+++%s' % (mesg.getheader("from"),
                             mesg.getheader("subject"))
    except nntplib.NNTPError:
        pass
news.quit()
<-- End newsgroup -->
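For readers on Python 3, where the rfc822 and cStringIO modules no longer exist, the header-parsing step above might look roughly like the sketch below. It uses the standard email package instead. The sample header lines and the helper name parse_headers are made up for illustration, and no news server is contacted.

```python
from email import message_from_string

def parse_headers(header_lines):
    """Parse a list of raw header lines (as a HEAD command would return them)."""
    # A blank line terminates the header section.
    return message_from_string("\r\n".join(header_lines) + "\r\n\r\n")

# Hypothetical header lines standing in for a server response.
sample = [
    "From: alice@example.invalid",
    "Subject: Re: Topic 1",
    "Message-ID: <2@example.invalid>",
    "References: <1@example.invalid>",
]

msg = parse_headers(sample)
summary = "%s\n+++%s" % (msg["From"], msg["Subject"])
# References holds whitespace-separated message IDs, oldest first.
reference_ids = (msg["References"] or "").split()

print(summary)
print(reference_ids)
```

The References header is the piece that matters for threading: it lists the ancestors of a message, so it can be used to link replies to their parents.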
This gets me all the messages of the newsgroup that are stored on the
news server.
What I want is to *group the messages belonging to each thread*.
How would I do that?
E.g.:
Topic 1
|
-- Re: Topic 1
-- Re: Topic 1
   |
   -- Re: Re: Topic 1
Topic 2
|
-- Re: Topic 2
The total number of messages is 6, but the number of threads is 2.
I want to get an abstraction similar to this.
Rakesh wrote: What I want is to *group the messages belonging to each thread*.
Hello,
Why not sort by Message-ID and References?
Caution: this is a newbie solution.
import nntplib

hamster = nntplib.NNTP('127.0.0.1', 119, 'user', 'pass')
resp, count, first, last, name = hamster.group('comp.lang.python')
resp, items = hamster.xover(first, last)

start_dic = {}  # thread starters: subject -> (author, message_id)
re_dic = {}     # replies: running number -> (subject, author, references)
numb = 1
for id, subject, author, date, message_id, references, size, lines in items:
    if 'Re:' not in subject:
        start_dic[subject] = (author, message_id)
    else:
        re_dic[numb] = (subject, author, references)
        numb += 1
resp = hamster.quit()

# print each starter, followed by every reply that references it
for a in start_dic:
    print a
    print start_dic[a][0]
    for b in re_dic:
        if start_dic[a][1] in re_dic[b][2]:
            print '|'
            print ' ->', re_dic[b][0]
            print ' ', re_dic[b][1]
    print
--
Werner Amann
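A variation on the idea above, as a rough sketch: instead of testing the subject for 'Re:', each message can be filed under the first ID in its References field, which normally names the thread starter. The rows below are invented sample data in the same field order as the xover result, the function name group_by_root is hypothetical, and the code is written for Python 3.

```python
from collections import defaultdict

# Hypothetical overview rows:
# (number, subject, author, date, message_id, references, size, lines)
items = [
    ("1", "Topic 1", "ann", "d", "<1@x>", "", "0", "0"),
    ("2", "Re: Topic 1", "bob", "d", "<2@x>", "<1@x>", "0", "0"),
    ("3", "Topic 2", "cy", "d", "<3@x>", "", "0", "0"),
    ("4", "Re: Re: Topic 1", "ann", "d", "<4@x>", "<1@x> <2@x>", "0", "0"),
    ("5", "Re: Topic 2", "dee", "d", "<5@x>", "<3@x>", "0", "0"),
]

def group_by_root(rows):
    """Map each thread's root message ID to the messages in that thread."""
    threads = defaultdict(list)
    for _num, subject, author, _date, message_id, references, _size, _lines in rows:
        # Accept either a space-separated string or an already split list.
        refs = references.split() if isinstance(references, str) else list(references)
        # The first reference is the oldest ancestor; a message with no
        # references starts its own thread.
        root = refs[0] if refs else message_id
        threads[root].append((message_id, subject, author))
    return dict(threads)

threads = group_by_root(items)
for root, members in threads.items():
    print(root, len(members))
```

Keying on the message ID rather than the subject avoids two threads with the same subject overwriting each other, and it needs only one pass over the rows. It still gives a flat grouping, not a reply tree.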
Werner Amann wrote: Why not sort by Message-ID and References? Caution: this is a newbie solution.
Better still, do a Google search on "mail threading algorithm",
implement the algorithm described in http://www.jwz.org/doc/threading.html
and post your implementation back to the newsgroup :-)
regards
Steve
--
Steve Holden http://www.holdenweb.com/
Python Web Programming http://pydish.holdenweb.com/
Holden Web LLC +1 703 861 4237 +1 800 494 3119
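As a starting point only, here is a much-simplified sketch of References-based tree building. It is not the full algorithm from the page linked above, which also deals with missing messages, pruning of empty placeholders, and grouping by subject. This version assumes every message carries a correct References list and treats the last entry as the direct parent. The sample messages and the names build_tree and render are invented for illustration, and the code is written for Python 3.

```python
def build_tree(messages):
    """messages: iterable of (message_id, subject, references_list)."""
    subjects = {}
    children = {}
    parent_of = {}
    for message_id, subject, refs in messages:
        subjects[message_id] = subject
        children.setdefault(message_id, [])
        if refs:
            # Assumption: the last reference is the direct parent.
            parent_of[message_id] = refs[-1]
    roots = []
    for message_id in subjects:
        parent = parent_of.get(message_id)
        if parent in subjects:
            children[parent].append(message_id)
        else:
            # No parent, or the parent was not fetched: treat as a thread start.
            roots.append(message_id)
    return roots, children, subjects

def render(roots, children, subjects, depth=0):
    """Return the tree as a list of indented text lines."""
    lines = []
    for message_id in roots:
        prefix = "   " * depth + ("-- " if depth else "")
        lines.append(prefix + subjects[message_id])
        lines.extend(render(children[message_id], children, subjects, depth + 1))
    return lines

# Hypothetical messages matching the example in the question.
messages = [
    ("<1@x>", "Topic 1", []),
    ("<2@x>", "Re: Topic 1", ["<1@x>"]),
    ("<3@x>", "Re: Topic 1", ["<1@x>"]),
    ("<4@x>", "Re: Re: Topic 1", ["<1@x>", "<3@x>"]),
    ("<5@x>", "Topic 2", []),
    ("<6@x>", "Re: Topic 2", ["<5@x>"]),
]

roots, children, subjects = build_tree(messages)
tree_lines = render(roots, children, subjects)
print("\n".join(tree_lines))
print("messages: %d, threads: %d" % (len(subjects), len(roots)))
```

With the sample data this yields two roots and six messages, matching the layout asked for in the original question. Real newsgroup data is messier, which is what the extra steps of the full algorithm are for.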
Steve Holden wrote: Better still, do a Google search on "mail threading algorithm", implement the algorithm described in
http://www.jwz.org/doc/threading.html
Thanks a lot for the link.
Steve Holden wrote: and post your implementation back to the newsgroup :-)
Sure, I will.
I am a Python newbie and am reading the NNTP spec (RFC) right now.
Once I get a working version, I will definitely post it.
Steve Holden wrote: Better still, do a Google search on "mail threading algorithm", implement the algorithm described in
http://www.jwz.org/doc/threading.html
and post your implementation back to the newsgroup :-)
<http://www.google.com/search?q=jwz+python&btnI=I'm+Feeling+Lucky>
<http://www.amk.ca/python/code/jwz>
--
JanC
"Be strict when sending and tolerant when receiving."
RFC 1958 - Architectural Principles of the Internet - section 3.9