generic text read function

Hi,
MATLAB has a useful function called "textread" which I am trying to
reproduce in Python.

Two inputs: filename, format (%s for strings, %d for integers, etc., and
arbitrary delimiters).

Variable number of outputs (to correspond to the format given as
input).

So suppose your file looked like this:

str1 5 2.12
str1 3 0.11

etc., with tab-delimited columns. Then you would call it as

c1,c2,c3=textread(filename, '%s\t%d\t%f')

Unfortunately, I do not know how to read a line from a file
using a line format like the one above. Any help would be much
appreciated.
les

Jul 18 '05 #1


>>>>> "les" == les ander <le*******@yahoo.com> writes:

les> Hi, matlab has a useful function called "textread" which I am
les> trying to reproduce in python.

les> two inputs: filename, format (%s for string, %d for integers,
les> etc and arbitrary delimiters)

les> variable number of outputs (to correspond to the format given
les> as input);

les> So suppose your file looked like this str1 5 2.12 str1 3 0.11
les> etc with tab delimited columns. then you would call it as

les> c1,c2,c3=textread(filename, '%s\t%d\t%f')

les> Unfortunately I do not know how to read a line from a file
les> using the line format given as above. Any help would be much
les> appreciated les

Not an answer to your question, but I use a different approach to
solve this problem. Here is a simple example:

converters = (str, int, float)
results = []
for line in open(filename):
    line = line.strip()
    if not len(line): continue # skip blank lines
    values = line.split('\t')
    if len(values) != len(converters):
        raise ValueError('Illegal line')
    results.append([func(val) for func, val in zip(converters, values)])

c1, c2, c3 = zip(*results)

If you really need to emulate the MATLAB command, perhaps this example
will give you an idea about how to get started. E.g., set up a dict
mapping format strings to converter functions:

d = {'%s' : str,
     '%d' : int,
     '%f' : float,
     }

and then parse the format string to set up your converters and split function.
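
A minimal sketch of that idea, assuming Python 3 and assuming the text between specifiers should be matched literally, as in the original question. The names SPECS and compile_format, the regex fragment chosen for each specifier, and the choice to return the columns as tuples are all assumptions made for illustration; none of them come from MATLAB's textread.

```python
import re

# Hypothetical table: format specifier -> (regex fragment, converter).
SPECS = {
    "%s": (r"(.+?)", str),
    "%d": (r"([-+]?\d+)", int),
    "%f": (r"([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)", float),
}

def compile_format(fmt):
    """Turn a format such as '%s\\t%d\\t%f' into (line regex, converters)."""
    # The capturing group makes re.split keep the specifiers, so the result
    # alternates: literal, specifier, literal, specifier, ..., literal.
    parts = re.split(r"(%[a-zA-Z])", fmt)
    pattern, converters = [], []
    for i, token in enumerate(parts):
        if i % 2 == 0:
            pattern.append(re.escape(token))  # delimiter text, matched literally
        elif token in SPECS:
            fragment, func = SPECS[token]
            pattern.append(fragment)
            converters.append(func)
        else:
            raise ValueError("Unrecognized format: %s" % token)
    return re.compile("".join(pattern)), converters

def textread(lines, fmt):
    """Read any iterable of lines and return one tuple per column."""
    regex, converters = compile_format(fmt)
    rows = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        match = regex.fullmatch(line)
        if match is None:
            raise ValueError("Illegal line: %r" % line)
        rows.append([func(val) for func, val in zip(converters, match.groups())])
    return tuple(zip(*rows))

sample = ["str1\t5\t2.12\n", "str1\t3\t0.11\n"]
c1, c2, c3 = textread(sample, "%s\t%d\t%f")
print(c1)  # ('str1', 'str1')
print(c2)  # (5, 3)
```

Because the whole line is matched against one regular expression built from the format string, each delimiter is tied to its position in the format rather than applied to the whole line. An open file can be passed in place of the list.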

If you succeed in implementing this function, please consider sending
it to me as a contribution to matplotlib -- http://matplotlib.sf.net

Cheers,
JDH
Jul 18 '05 #2

John Hunter wrote:
>>"les" == les ander <le*******@yahoo.com> writes:

les> Hi, matlab has a useful function called "textread" which I am
les> trying to reproduce in python.

les> two inputs: filename, format (%s for string, %d for integers,
les> etc and arbitrary delimiters)

Building on John's solution, this is still not quite what you're looking for (the
delimiter preference is set for the whole line as a separate argument), but it's
one step closer, and may give you some ideas:

import re

dispatcher = {'%s' : str,
              '%d' : int,
              '%f' : float,
              }
parser = re.compile("|".join(dispatcher))

def textread(iterable, formats, delimiter = None):

    # Splits on any combination of one or more chars in delimiter
    # or whitespace by default
    splitter = re.compile("[%s]+" % (delimiter or r"\s"))

    # Parse the format string into a list of converters
    # Note that white space in the format string is ignored
    # unlike the spec which calls for significant delimiters
    try:
        converters = [dispatcher[format] for format in parser.findall(formats)]
    except KeyError as err:
        raise KeyError("Unrecognized format: %s" % err)

    format_length = len(converters)

    iterator = iter(iterable)

    # Use any line-based iterable - like file
    for line in iterator:
        # Drop the line terminator so it does not end up in the last column
        cols = splitter.split(line.rstrip("\r\n"))
        if len(cols) != format_length:
            raise ValueError("Illegal line: %s" % cols)
        yield [func(val) for func, val in zip(converters, cols)]

# Example Usage:

>>> source1 = """Item 5 8.0
... Item2 6 9.0"""
>>> source2 = """Item 1 \t42
... Item 2\t43"""
>>> for i in textread(source1.splitlines(),"%s %d %f"): print(i)
...
['Item', 5, 8.0]
['Item2', 6, 9.0]
>>> for i in textread(source2.splitlines(),"%s %f", "\t"): print(i)
...
['Item 1 ', 42.0]
['Item 2', 43.0]
>>> for item, value in textread(source2.splitlines(),"%s %f", "\t"): print(item, value)
...
Item 1  42.0
Item 2 43.0
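
For the plain tab-delimited case in the original question, the standard library's csv module can do the splitting instead of a regular expression. This is a different technique from the generator above, sketched under the assumption of Python 3; read_columns is a hypothetical helper name, and the converters are passed directly rather than parsed from a format string.

```python
import csv
import io

def read_columns(stream, converters, delimiter="\t"):
    """Split each line with csv.reader, convert each field, return columns."""
    rows = []
    for values in csv.reader(stream, delimiter=delimiter):
        if not values:
            continue  # csv.reader yields an empty list for a blank line
        if len(values) != len(converters):
            raise ValueError("Illegal line: %r" % values)
        rows.append([func(val) for func, val in zip(converters, values)])
    # Transpose the rows so that each output is one column
    return tuple(zip(*rows))

# io.StringIO stands in for a file here; for a real file, the csv
# documentation recommends opening it with newline="".
sample = io.StringIO("str1\t5\t2.12\nstr1\t3\t0.11\n")
c1, c2, c3 = read_columns(sample, (str, int, float))
print(c1)  # ('str1', 'str1')
print(c3)  # (2.12, 0.11)
```

Unpacking into c1, c2, c3 mirrors the multiple outputs of the MATLAB call, at the cost of reading the whole input into memory first.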


Jul 18 '05 #3
