
Parsing long 'To' and 'Cc' headers from email

Hi all. I'm trying to develop yet another email filter, just for fun for
now. I'm having a little trouble parsing long 'To' and 'Cc' headers.
Sometimes, for example, the 'To' header comes like

'a*******@site.com, ac*****@site2.com'
other times it comes like
'"My self" <my********@self.net>, "My brother" <br*************@self.net>',
and other times a \r\t appears inside the 'To' header. Any combination of
the above (and surely more) can occur.

The email.* package doesn't seem to parse these kinds of headers
'correctly'. What I want is a list of all the email addresses in
the 'To' header.

Does anyone know if there is a more sophisticated parser for doing this?

Thanks!
Gerardo
Sep 6 '07 #1
On Sep 6, 1:25 pm, Gerardo Herzig <gher...@fmed.uba.ar> wrote:
> The email.* package doesn't seem to parse these kinds of headers
> 'correctly'. What I want is a list of all the email addresses in
> the 'To' header.
>
> Does anyone know if there is a more sophisticated parser for doing this?

If you're not interested in parsing the entire email message, you may
just want to run a regex on the message itself, looking for the "to"
header.

Here's a good start:

r"^to:\s*(.*)$"

You'll want to use the multi-line and case-insensitive options when
you use it.
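
For what it's worth, here is a rough sketch of how that regex could be put to work. It is only one possible approach. The pattern below extends the one above so that it also swallows folded continuation lines (the \r\t case from the question), and the captured text is then handed to email.utils.getaddresses from the standard library to split it into individual addresses. The function name to_addresses and the sample message are made up for illustration.

```python
import re
from email.utils import getaddresses

# "To:" at the start of a line, plus any folded continuation lines
# (continuation lines begin with a space or a tab).
TO_HEADER = re.compile(
    r"^to:[ \t]*(.*(?:\r?\n[ \t]+.*)*)$",
    re.MULTILINE | re.IGNORECASE,
)


def to_addresses(raw_message):
    """Return the bare addresses found in the 'To' header(s) of a raw message."""
    # Look only at the header block, i.e. everything before the first
    # blank line, so a "To:" in the body is not picked up by mistake.
    headers = re.split(r"\r?\n\r?\n", raw_message, maxsplit=1)[0]

    values = []
    for match in TO_HEADER.finditer(headers):
        # Unfold: replace each line break plus leading whitespace with a space.
        unfolded = re.sub(r"\r?\n[ \t]+", " ", match.group(1))
        values.append(unfolded.strip())

    # getaddresses returns (display name, address) pairs; keep the address.
    return [addr for _name, addr in getaddresses(values) if addr]


# Hypothetical sample message with a folded 'To' header.
sample = (
    "From: me@example.com\r\n"
    'To: "My self" <myself@example.net>,\r\n'
    '\t"My brother" <brother@example.net>\r\n'
    "Subject: hi\r\n"
    "\r\n"
    "to: not-a-header@example.org\r\n"
)

print(to_addresses(sample))
```

Swapping "to" for "cc" in the pattern should handle the 'Cc' header the same way. If parsing the whole message is acceptable after all, email.message_from_string(raw).get_all("To") followed by getaddresses is another route worth trying.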


Sep 6 '07 #2
