String Format Conversion

mcg
Investigating python day 1:

Data in file:
x y
1 2
3 4
5 6
Want to read file into an array of pairs.

In C: scanf("%d %d", &x, &y), store x and y in an array, then loop.

How do I do this in Python?
In the actual application, the pairs are floating point, e.g. -1.003

Jul 18 '05 #1
On 26 Jan 2005 20:53:02 -0800, mcg <mg******@garrett-technologies.com> wrote:
Investigating python day 1:

Data in file:
x y
1 2
3 4
5 6

Want to read file into an array of pairs.

in c: scanf("%d %d",&x,&y)---store x y in array, loop.

How do I do this in python??
In the actual application, the pairs are floating pt i.e. -1.003


f = file('input', 'r')
labels = f.readline() # consume the first line of the file.

Easy Option:
for line in f.readlines():
    x, y = line.split()
    x = float(x)
    y = float(y)

Or, more concisely:
for line in f.readlines():
    x, y = map(float, line.split())

Regards,
Stephen Thorne
Jul 18 '05 #2
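For readers on current Python, here is a minimal sketch of the same loop that also stores the pairs, which is what the original question asked for. It assumes Python 3, where `file()` no longer exists and `open()` takes its place. The `read_pairs` helper and the in-memory `io.StringIO` sample are illustrative assumptions, not code from the thread.

```python
import io

# Stand-in for the file on disk; with a real file, use open("input") instead.
sample = io.StringIO("x y\n1 2\n3 4\n5 6\n")

def read_pairs(f):
    """Return a list of (x, y) float tuples, skipping the header line."""
    next(f)  # consume the "x y" label line
    pairs = []
    for line in f:
        if not line.strip():
            continue  # tolerate blank lines (an assumption about the data)
        x, y = line.split()
        pairs.append((float(x), float(y)))
    return pairs

pairs = read_pairs(sample)
print(pairs)
```

Because `float()` accepts text such as "-1.003" directly, the same helper should cover the floating-point data mentioned in the question.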
Stephen Thorne wrote:
f = file('input', 'r')
labels = f.readline() # consume the first line of the file.

Easy Option:
for line in f.readlines():
    x, y = line.split()
    x = float(x)
    y = float(y)

Or, more concisely:
for line in f.readlines():
    x, y = map(float, line.split())


Somewhat more memory efficient:

lines_iter = iter(file('input'))
labels = lines_iter.next()
for line in lines_iter:
    x, y = [float(f) for f in line.split()]

By using the iterator instead of readlines, I read only one line from
the file into memory at once, instead of all of them. This may or may
not matter depending on the size of your files, but using iterators is
generally more scalable, though of course it's not always possible.

I also opted to use a list comprehension instead of map, but this is
totally a matter of personal preference -- the performance differences
are probably negligible.

Steve
Jul 18 '05 #3
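The one-line-at-a-time idea can be sketched as a generator, so that a caller never holds more than one parsed pair unless it chooses to. This is a hedged illustration in Python 3; `iter_pairs` is a hypothetical name, and the sample data is an in-memory stand-in for the real file.

```python
import io

def iter_pairs(lines):
    """Yield (x, y) float tuples one at a time from an iterable of lines."""
    lines = iter(lines)
    next(lines, None)  # skip the header; the default avoids StopIteration on empty input
    for line in lines:
        fields = line.split()
        if fields:  # ignore blank lines
            yield float(fields[0]), float(fields[1])

source = io.StringIO("x y\n1 2\n3 4\n5 6\n")

# Consume the pairs lazily: only one pair exists at any moment.
total_x = sum(x for x, _ in iter_pairs(source))
print(total_x)
```

Whether this saves memory in practice depends on file size and buffering, as the thread goes on to discuss.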
mcg wrote:
Investigating python day 1:

Data in file:
x y
1 2
3 4
5 6
Want to read file into an array of pairs.

in c: scanf("%d %d",&x,&y)---store x y in array, loop.

How do I do this in python??
In the actual application, the pairs are floating pt i.e. -1.003


Either do what the other posters wrote, or if you really like scanf
try the following Python module:

Scanf --- a pure Python scanf-like module
http://hkn.eecs.berkeley.edu/~dyoo/python/scanf/

Bye,
Dennis
Jul 18 '05 #4
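If a scanf-like feel is what you want without an extra dependency, a regular expression can play a similar role. The sketch below is not the API of the module linked above; `scan_pair` and the pattern are hypothetical, written only to show the idea in Python 3.

```python
import re

# Matches an optionally signed decimal number such as -1.003, 2, or 6.02e23.
_FLOAT = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
# Two numbers separated by whitespace, with optional surrounding whitespace.
_PAIR = re.compile(r"\s*(" + _FLOAT + r")\s+(" + _FLOAT + r")\s*$")

def scan_pair(line):
    """Parse 'x y' from a line, roughly like scanf("%f %f", &x, &y)."""
    m = _PAIR.match(line)
    if m is None:
        raise ValueError("expected two numbers, got %r" % line)
    return float(m.group(1)), float(m.group(2))

print(scan_pair("-1.003 2\n"))
```

Unlike `line.split()`, this rejects lines that contain anything other than exactly two numbers, which may or may not be what you want.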
On Thu, 27 Jan 2005 00:02:45 -0700, Steven Bethard
<st************@gmail.com> wrote:
Stephen Thorne wrote:
f = file('input', 'r')
labels = f.readline() # consume the first line of the file.

Easy Option:
for line in f.readlines():
    x, y = line.split()
    x = float(x)
    y = float(y)

Or, more concisely:
for line in f.readlines():
    x, y = map(float, line.split())


Somewhat more memory efficient:

lines_iter = iter(file('input'))
labels = lines_iter.next()
for line in lines_iter:
    x, y = [float(f) for f in line.split()]

By using the iterator instead of readlines, I read only one line from
the file into memory at once, instead of all of them. This may or may
not matter depending on the size of your files, but using iterators is
generally more scalable, though of course it's not always possible.


I just did a teensy test. All three options used exactly the same
amount of total memory.

I did all I did in the name of clarity, considering the OP was on his
first day with python. How I would actually write it would be:

inputfile = file('input','r')
inputfile.readline()
data = [map(float, line.split()) for line in inputfile]

Notice how you don't have to call iter() on it, you can treat it as an
iterable to begin with.

Stephen.
Jul 18 '05 #5
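A note for anyone trying the comprehension above on Python 3: there `map()` returns a lazy iterator rather than a list, so the result has to be materialised explicitly. A minimal sketch, assuming Python 3 and using an in-memory stand-in for the file:

```python
import io

inputfile = io.StringIO("x y\n1 2\n3 4\n5 6\n")  # stand-in for open("input")
inputfile.readline()  # discard the header line

# tuple(...) forces each lazy map object into a concrete (x, y) pair.
data = [tuple(map(float, line.split())) for line in inputfile]
print(data)
```

Using tuples rather than lists for the pairs is a design choice made here, not something the thread prescribes.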
Stephen Thorne wrote:
I did all I did in the name of clarity, considering the OP was on his
first day with python. How I would actually write it would be:

inputfile = file('input','r')
inputfile.readline()
data = [map(float, line.split()) for line in inputfile]

Notice how you don't have to call iter() on it, you can treat it as an
iterable to begin with.


Beware of mixing iterator methods and readline:

http://docs.python.org/lib/bltin-file-objects.html

next( )
...In order to make a for loop the most efficient way of looping
over the lines of a file (a very common operation), the next() method
uses a hidden read-ahead buffer. As a consequence of using a read-ahead
buffer, combining next() with other file methods (like readline()) does
not work right.

I haven't tested your code in particular, but this warning was enough to
make me generally avoid mixing iter methods and other methods.

Steve
Jul 18 '05 #6
Steven Bethard <st************@gmail.com> wrote:
...
Beware of mixing iterator methods and readline:
_mixing_, yes. But -- starting the iteration after some other kind of
reading (readline, or read(N), etc) -- is OK...

http://docs.python.org/lib/bltin-file-objects.html

next( )
...In order to make a for loop the most efficient way of looping
over the lines of a file (a very common operation), the next() method
uses a hidden read-ahead buffer. As a consequence of using a read-ahead
buffer, combining next() with other file methods (like readline()) does
not work right.

I haven't tested your code in particular, but this warning was enough to
make me generally avoid mixing iter methods and other methods.


Yeah, I know... it's hard to explain exactly what IS a problem and what
isn't -- not to mention that this IS to some extent a matter of the file
object's implementation and the docs can't/don't want to constrain the
implementer's future freedom, should it turn out to matter. Sigh.

In the Nutshell (2nd ed), which is not normative and thus gives me a tad
more freedom, I have tried to be a tiny bit more specific, taking
advantage, also, of the fact that I'm now addressing the 2.3 and 2.4
implementations, only. Quoting from my current draft (pardon the XML
markup...):

"""
interrupting such a loop prematurely (e.g., with <c>break</c>), or
calling <r>f</r><c>.next()</c> instead of <r>f</r><c>.readline()</c>,
leaves the file's current position at an arbitrary value. If you want
to switch from using <r>f</r> as an iterator to calling other reading
methods on <r>f</r>, be sure to set the file's current position to a
known value by appropriately calling <r>f</r><c>.seek</c>.
"""

I hope this concisely indicates that the problem (in today's current
implementations) is only with switching FROM iteration TO other
approaches to reading, and (if the file is seekable) there's nothing so
problematic here that a good old 'seek' won't cure...
Alex
Jul 18 '05 #7
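The seek advice can be sketched as follows. This is an illustration in Python 3 rather than the 2.3/2.4 implementations discussed above, and the temporary file is only there to make the example self-contained.

```python
import os
import tempfile

# Create a small sample file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("x y\n1 2\n3 4\n5 6\n")
    path = tmp.name

try:
    with open(path) as f:
        header = next(f)           # use the file as an iterator
        f.seek(0)                  # reset to a known position before switching methods
        first_line = f.readline()  # reads from the start again
finally:
    os.remove(path)

print(repr(header), repr(first_line))
```

Seeking to offset 0 is the simplest known position; any other offset would have to be one previously obtained from the file itself.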
Alex Martelli wrote:
Steven Bethard <st************@gmail.com> wrote:
...
Beware of mixing iterator methods and readline:

[snip]
I hope this concisely indicates that the problem (in today's current
implementations) is only with switching FROM iteration TO other
approaches to reading, and (if the file is seekable) there's nothing so
problematic here that a good old 'seek' won't cure...


Thanks for the clarification!

Steve
Jul 18 '05 #8
Stephen Thorne wrote:
On Thu, 27 Jan 2005 00:02:45 -0700, Steven Bethard
<st************@gmail.com> wrote:
By using the iterator instead of readlines, I read only one line from
the file into memory at once, instead of all of them. This may or may
not matter depending on the size of your files, but using iterators is
generally more scalable, though of course it's not always possible.


I just did a teensy test. All three options used exactly the same
amount of total memory.


I would presume that, for a small file, the entire contents of the
file will be sucked into the read buffer implemented by the underlying
C file library. An iterator will only really save memory consumption
when the file size is greater than that buffer's size.

Actually, now that I think of it, there's probably another copy of the
data at Python level. For readlines(), that copy is the list object
itself. For iter and iter.next(), it's in the iterator's read-ahead
buffer. So perhaps memory savings will occur when *that* buffer size
is exceeded. It's also quite possible that both buffers are the same
size...

Anyhow, I'm sure that the fact that they use the same size for your
test is a reflection of buffering. The next question is, which
provides the most *conceptual* simplicity? (The answer to that one, I
think, depends on how your brain happens to see things...)

Jeff Shannon
Technician/Programmer
Credit International

Jul 18 '05 #9
Do you really need to use the iter function here? As far as I can
tell, a file object is already an iterator. The file object
documentation says that, "[a] file object is its own iterator, for
example iter(f) returns f (unless f is closed)." It doesn't look like
it makes a difference one way or the other, I'm just curious.

Jul 18 '05 #10
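The documentation's claim is easy to check interactively. A small sketch, assuming Python 3 and using `io.StringIO` as a stand-in for an open text file:

```python
import io

f = io.StringIO("x y\n1 2\n")

same_object = iter(f) is f  # the file object is its own iterator
header = next(f)            # so next() works on it directly
rest = list(f)              # and iteration continues where next() left off

print(same_object, repr(header), rest)
```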
enigma wrote:
Do you really need to use the iter function here? As far as I can
tell, a file object is already an iterator. The file object
documentation says that, "[a] file object is its own iterator, for
example iter(f) returns f (unless f is closed)." It doesn't look like
it makes a difference one way or the other, I'm just curious.


Nope, you're right -- that's just my obsessive-compulsive disorder
kicking in. ;) A lot of objects aren't their own iterators, so I tend
to ask for an iterator with iter() when I know I want one. But for
files, this definitely isn't necessary:

py> file('temp.txt', 'w').write("""\
... x y
... 1 2
... 3 4
... 5 6
... """)
py> f = file('temp.txt')
py> f.next()
'x y\n'
py> for line in f:
...     print [float(f) for f in line.split()]
...
[1.0, 2.0]
[3.0, 4.0]
[5.0, 6.0]

And to illustrate Alex Martelli's point that using readline, etc. before
using the file as an iterator is fine:

py> f = file('temp.txt')
py> f.readline()
'x y\n'
py> for line in f:
...     print [float(f) for f in line.split()]
...
[1.0, 2.0]
[3.0, 4.0]
[5.0, 6.0]

But using readline, etc. after using the file as an iterator is *not*
fine, generally:

py> f = file('temp.txt')
py> f.next()
'x y\n'
py> f.readline()
''

In this case, if my understanding's right, the entire file contents have
been read into the iterator buffer, so readline thinks the entire file's
been read in and gives you '' to indicate this.

Steve
Jul 18 '05 #11
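The empty string in the last session is specific to the Python 2 file object and its separate read-ahead buffer. As far as I know, Python 3's `io` module uses one shared buffer for iteration and `readline()`, so the same sequence carries on with the next line. A hedged sketch, assuming Python 3; the loop variable is named `v` so that it does not reuse the name `f`:

```python
import os
import tempfile

# Recreate the sample file from the session above.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("x y\n1 2\n3 4\n5 6\n")
    path = tmp.name

try:
    with open(path) as f:
        header = next(f)       # iterator-style read
        second = f.readline()  # method-style read continues from the same position
        remaining = [[float(v) for v in line.split()] for line in f]
finally:
    os.remove(path)

print(repr(header), repr(second), remaining)
```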
