
RE: ka-ping yee tokenizer.py

Hi Fredrik,

This is exactly what I need. Thank you.
I would like to add one more capability. I am not using the tokenizer to
parse Python code; it just happens to work very well for my application.
However, I would like either or both of the following variants:
1) add two other characters as comment designators
2) write a module that can readline, modify the line as required, and
then be used as the readline argument for the tokenizer:

def modifyLine( fileHandle ):
    # readline and modify this string if required
    ...

for token in tokenize.generate_tokens( modifyLine( myFileHandle ) ):
    print token
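
A minimal sketch of that wrapper, assuming Python 2 and the standard
tokenize module (the names modified_readline/myFileHandle and the extra
comment characters ';' and '!' are illustrative assumptions, not from
this thread):

    import tokenize

    EXTRA_COMMENT_CHARS = (';', '!')  # hypothetical extra designators

    def modified_readline(fileHandle):
        # Build a readline-style callable that rewrites each line
        # before the tokenizer sees it: the extra comment characters
        # are mapped to '#', so tokenize reports the rest of the line
        # as a COMMENT token.
        def readline():
            line = fileHandle.readline()
            for ch in EXTRA_COMMENT_CHARS:
                line = line.replace(ch, '#')
            return line
        return readline

    # usage:
    # for token in tokenize.generate_tokens(modified_readline(myFileHandle)):
    #     print token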

Anxiously looking forward to your thoughts.
karl

-----Original Message-----
From: py*************************************@python.org
[mailto:py*************************************@python.org] On Behalf Of
Fredrik Lundh
Sent: Monday, September 15, 2008 2:04 PM
To: py*********@python.org
Subject: Re: ka-ping yee tokenizer.py

Karl Kobata wrote:
I have enjoyed using ka-ping yee's tokenizer.py. I would like to
replace the readline parameter input with my own and pass a list of
strings to the tokenizer. I understand it must be a callable object and
iterable, but it is obvious from the errors I am getting that this is
not the only requirement.

not sure I can decipher your detailed requirements, but to use Python's
standard "tokenize" module (written by ping) on a list, you can simply
do as follows:

    import tokenize

    program = [ ... program given as list ... ]

    for token in tokenize.generate_tokens(iter(program).next):
        print token
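
(This works because generate_tokens simply calls its argument the way it
would call a file's readline; when the list iterator is exhausted, the
StopIteration it raises just ends the token generator under Python 2's
pre-PEP 479 generator semantics.)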

another approach is to turn the list back into a string, and wrap that
in a StringIO object:

    import tokenize
    import StringIO

    program = [ ... program given as list ... ]

    program_buffer = StringIO.StringIO("".join(program))

    for token in tokenize.generate_tokens(program_buffer.readline):
        print token
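
Note that "".join(program) reconstructs the source only if each list
element already ends with a newline; if the lines are bare strings, join
them with explicit newlines instead, e.g.:

    program_buffer = StringIO.StringIO("\n".join(program) + "\n")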

</F>

--
http://mail.python.org/mailman/listinfo/python-list

Sep 16 '08 #1


On Sep 16, 2:48 pm, "Karl Kobata" <karl.kob...@syncira.com> wrote:
<snip>

This is an interesting construction:

>>> a = [ 'a', 'b', 'c' ]
>>> def moditer( mod, nextfun ):
...     while 1:
...         yield mod( nextfun( ) )
...
>>> list( moditer( ord, iter( a ).next ) )
[97, 98, 99]

Here's my point:

>>> a = [ 'print a', 'print b', 'print c' ]
>>> tokenize.generate_tokens( iter( a ).next )
<generator object at 0x009FF440>
>>> tokenize.generate_tokens( moditer( lambda s: s + '#', iter( a ).next ).next )

It adds a '#' to the end of every line, then tokenizes.
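
Putting the pieces together, a self-contained Python 2 sketch of that
idea (the sample lines and the ' # marked' text are illustrative; the
lambda keeps each trailing newline so the comment lands before the line
break):

    import tokenize

    def moditer(mod, nextfun):
        # apply mod() to every line pulled from nextfun()
        while 1:
            yield mod(nextfun())

    a = ['print a\n', 'print b\n', 'print c\n']

    # append a marker comment to each line, then tokenize the result
    mods = moditer(lambda s: s.rstrip('\n') + ' # marked\n', iter(a).next)

    for token in tokenize.generate_tokens(mods.next):
        print token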
Sep 17 '08 #2
