String Format Conversion

mcg

Investigating python day 1:

Data in file:
x y
1 2
3 4
5 6

I want to read the file into an array of pairs.

In C: scanf("%d %d", &x, &y), store x and y in an array, and loop.

How do I do this in Python?
In the actual application, the pairs are floating point, e.g. -1.003.

Jul 18 '05 #1

Stephen Thorne

On 26 Jan 2005 20:53:02 -0800, mcg <mg******@garrett-technologies.com> wrote:

> Investigating python day 1:
>
> Data in file:
> x y
> 1 2
> 3 4
> 5 6
>
> I want to read the file into an array of pairs.
>
> In C: scanf("%d %d", &x, &y), store x and y in an array, and loop.
>
> How do I do this in Python?
> In the actual application, the pairs are floating point, e.g. -1.003.

f = file('input', 'r')
labels = f.readline() # consume the first line of the file.

Easy Option:
for line in f.readlines():
    x, y = line.split()
    x = float(x)
    y = float(y)

Or, more concisely:
for line in f.readlines():
    x, y = map(float, line.split())
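
The loops above parse each line but do not store the result, and `file()` no longer exists in Python 3. As a minimal sketch, assuming Python 3 and a whitespace-separated file with one header line, the pairs could be collected into a list like this. The function name `read_pairs` and the temporary file are illustrative and not part of the original post.

```python
import os
import tempfile


def read_pairs(path):
    """Return a list of (x, y) float tuples, skipping the header line."""
    pairs = []
    with open(path) as f:       # open() replaces the Python 2 file() builtin
        f.readline()            # consume the "x y" label line
        for line in f:
            if not line.strip():
                continue        # tolerate blank lines
            x, y = map(float, line.split())
            pairs.append((x, y))
    return pairs


# Write the sample data from the question to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("x y\n1 2\n3 4\n5 6\n")

pairs = read_pairs(tmp.name)
os.remove(tmp.name)
print(pairs)  # [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
```

Because `float()` is used, values such as -1.003 are handled the same way as the integer sample data.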

Regards,
Stephen Thorne

Jul 18 '05 #2

Steven Bethard

Stephen Thorne wrote:

> f = file('input', 'r')
> labels = f.readline() # consume the first line of the file.
>
> Easy Option:
> for line in f.readlines():
>     x, y = line.split()
>     x = float(x)
>     y = float(y)
>
> Or, more concisely:
> for line in f.readlines():
>     x, y = map(float, line.split())

Somewhat more memory efficient:

lines_iter = iter(file('input'))
labels = lines_iter.next()
for line in lines_iter:
    x, y = [float(f) for f in line.split()]

By using the iterator instead of readlines, I read only one line from
the file into memory at once, instead of all of them. This may or may
not matter depending on the size of your files, but using iterators is
generally more scalable, though of course it's not always possible.

I also opted to use a list comprehension instead of map, but this is
totally a matter of personal preference -- the performance differences
are probably negligible.
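
The same streaming idea can be packaged as a generator. This is a sketch for Python 3, where `lines_iter.next()` is spelled `next(lines_iter)`; the name `iter_pairs` and the in-memory `io.StringIO` sample are assumptions made for illustration.

```python
import io


def iter_pairs(lines):
    """Lazily yield (x, y) float tuples from an iterable of text lines."""
    lines = iter(lines)
    next(lines, None)           # skip the label line; no error on empty input
    for line in lines:
        fields = line.split()
        if fields:              # ignore blank lines
            yield tuple(float(field) for field in fields)


sample = io.StringIO("x y\n-1.003 2.5\n3 4\n")
result = list(iter_pairs(sample))
print(result)  # [(-1.003, 2.5), (3.0, 4.0)]
```

Only one line is held at a time until the caller decides to build a list from the generator.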

Steve

Jul 18 '05 #3

Dennis Benzinger

mcg wrote:

> Investigating python day 1:
>
> Data in file:
> x y
> 1 2
> 3 4
> 5 6
>
> I want to read the file into an array of pairs.
>
> In C: scanf("%d %d", &x, &y), store x and y in an array, and loop.
>
> How do I do this in Python?
> In the actual application, the pairs are floating point, e.g. -1.003.

Either do what the other posters wrote, or if you really like scanf
try the following Python module:

Scanf --- a pure Python scanf-like module
http://hkn.eecs.berkeley.edu/~dyoo/python/scanf/
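
For a rough idea of what a scanf-style approach looks like without an extra module, the standard `re` module can approximate `scanf("%f %f")`. This sketch is not the API of the module linked above; `FLOAT`, `PAIR` and `scan_pair` are hypothetical names chosen for illustration.

```python
import re

# Rough regex counterpart of the scanf "%f" conversion.
FLOAT = r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?"
# Two floats separated by whitespace, roughly scanf("%f %f").
PAIR = re.compile(r"\s*(%s)\s+(%s)\s*$" % (FLOAT, FLOAT))


def scan_pair(line):
    """Return (x, y) as floats, or None if the line does not hold two numbers."""
    match = PAIR.match(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


print(scan_pair("-1.003 2\n"))  # (-1.003, 2.0)
print(scan_pair("x y\n"))       # None
```

Returning None for a non-matching line makes it easy to skip the label line without special-casing it.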

Bye,
Dennis

Jul 18 '05 #4

Stephen Thorne

On Thu, 27 Jan 2005 00:02:45 -0700, Steven Bethard
<st************@gmail.com> wrote:

> Stephen Thorne wrote:
>> f = file('input', 'r')
>> labels = f.readline() # consume the first line of the file.
>>
>> Easy Option:
>> for line in f.readlines():
>>     x, y = line.split()
>>     x = float(x)
>>     y = float(y)
>>
>> Or, more concisely:
>> for line in f.readlines():
>>     x, y = map(float, line.split())
>
> Somewhat more memory efficient:
>
> lines_iter = iter(file('input'))
> labels = lines_iter.next()
> for line in lines_iter:
>     x, y = [float(f) for f in line.split()]
>
> By using the iterator instead of readlines, I read only one line from
> the file into memory at once, instead of all of them. This may or may
> not matter depending on the size of your files, but using iterators is
> generally more scalable, though of course it's not always possible.

I just did a teensy test. All three options used exactly the same
amount of total memory.

I did all I did in the name of clarity, considering the OP was on his
first day with python. How I would actually write it would be:

inputfile = file('input','r')
inputfile.readline()
data = [map(float, line.split()) for line in inputfile]

Notice how you don't have to call iter() on it, you can treat it as an
iterable to begin with.
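
One caveat if this is run on Python 3 (an assumption; the thread predates it): `map` there returns a lazy iterator, not a list, so each row needs an explicit `tuple()` or `list()`. A sketch using an in-memory `io.StringIO` as a stand-in for the real file:

```python
import io

inputfile = io.StringIO("x y\n1 2\n3 4\n5 6\n")  # stand-in for open('input')
inputfile.readline()                             # discard the label line
# tuple() forces each lazy map object into a concrete pair.
data = [tuple(map(float, line.split())) for line in inputfile]
print(data)  # [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
```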

Stephen.

Jul 18 '05 #5

Steven Bethard

Stephen Thorne wrote:

> I did all I did in the name of clarity, considering the OP was on his
> first day with python. How I would actually write it would be:
>
> inputfile = file('input','r')
> inputfile.readline()
> data = [map(float, line.split()) for line in inputfile]
>
> Notice how you don't have to call iter() on it, you can treat it as an
> iterable to begin with.

Beware of mixing iterator methods and readline:

http://docs.python.org/lib/bltin-file-objects.html

next( )
...In order to make a for loop the most efficient way of looping
over the lines of a file (a very common operation), the next() method
uses a hidden read-ahead buffer. As a consequence of using a read-ahead
buffer, combining next() with other file methods (like readline()) does
not work right.

I haven't tested your code in particular, but this warning was enough to
make me generally avoid mixing iter methods and other methods.

Steve

Jul 18 '05 #6

Alex Martelli

Steven Bethard <st************@gmail.com> wrote:
...

> Beware of mixing iterator methods and readline:

_mixing_, yes. But -- starting the iteration after some other kind of
reading (readline, or read(N), etc) -- is OK...

> http://docs.python.org/lib/bltin-file-objects.html
>
> next( )
> ...In order to make a for loop the most efficient way of looping
> over the lines of a file (a very common operation), the next() method
> uses a hidden read-ahead buffer. As a consequence of using a read-ahead
> buffer, combining next() with other file methods (like readline()) does
> not work right.
>
> I haven't tested your code in particular, but this warning was enough to
> make me generally avoid mixing iter methods and other methods.

Yeah, I know... it's hard to explain exactly what IS a problem and what
isn't -- not to mention that this IS to some extent a matter of the file
object's implementation and the docs can't/don't want to constrain the
implementer's future freedom, should it turn out to matter. Sigh.

In the Nutshell (2nd ed), which is not normative and thus gives me a tad
more freedom, I have tried to be a tiny bit more specific, taking
advantage, also, of the fact that I'm now addressing the 2.3 and 2.4
implementations, only. Quoting from my current draft (pardon the XML
markup...):

"""
interrupting such a loop prematurely (e.g., with <c>break</c>), or
calling <r>f</r><c>.next()</c> instead of <r>f</r><c>.readline()</c>,
leaves the file's current position at an arbitrary value. If you want
to switch from using <r>f</r> as an iterator to calling other reading
methods on <r>f</r>, be sure to set the file's current position to a
known value by appropriately calling <r>f</r><c>.seek</c>.
"""

I hope this concisely indicates that the problem (in today's current
implementations) is only with switching FROM iteration TO other
approaches to reading, and (if the file is seekable) there's nothing so
problematic here that a good old 'seek' won't cure...
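
A small sketch of the `seek` remedy described above, written for Python 3 with a throwaway temporary file (both assumptions). The loop is abandoned after one line, and `f.seek(0)` puts the file back at a known position before `readline` is used.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("x y\n1 2\n3 4\n5 6\n")

with open(tmp.name) as f:
    for line in f:
        break                   # abandon the iteration after the label line
    f.seek(0)                   # reset to a known position before switching methods
    header = f.readline()
    first_row = f.readline()

os.remove(tmp.name)
print(repr(header), repr(first_row))  # 'x y\n' '1 2\n'
```

Seeking to the start is the simplest known position; it only helps when the file is seekable, as noted above.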
Alex

Jul 18 '05 #7

Steven Bethard

Alex Martelli wrote:

> Steven Bethard <st************@gmail.com> wrote:
> ...
>> Beware of mixing iterator methods and readline:

[snip]

> I hope this concisely indicates that the problem (in today's current
> implementations) is only with switching FROM iteration TO other
> approaches to reading, and (if the file is seekable) there's nothing so
> problematic here that a good old 'seek' won't cure...

Thanks for the clarification!

Steve

Jul 18 '05 #8

Jeff Shannon

Stephen Thorne wrote:

> On Thu, 27 Jan 2005 00:02:45 -0700, Steven Bethard
> <st************@gmail.com> wrote:
>> By using the iterator instead of readlines, I read only one line from
>> the file into memory at once, instead of all of them. This may or may
>> not matter depending on the size of your files, but using iterators is
>> generally more scalable, though of course it's not always possible.
>
> I just did a teensy test. All three options used exactly the same
> amount of total memory.

I would presume that, for a small file, the entire contents of the
file will be sucked into the read buffer implemented by the underlying
C file library. An iterator will only really save memory consumption
when the file size is greater than that buffer's size.

Actually, now that I think of it, there's probably another copy of the
data at Python level. For readlines(), that copy is the list object
itself. For iter and iter.next(), it's in the iterator's read-ahead
buffer. So perhaps memory savings will occur when *that* buffer size
is exceeded. It's also quite possible that both buffers are the same
size...

Anyhow, I'm sure that the fact that they use the same size for your
test is a reflection of buffering. The next question is, which
provides the most *conceptual* simplicity? (The answer to that one, I
think, depends on how your brain happens to see things...)
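
One way to check the buffering guess is to repeat the test with a file much larger than any read buffer. The sketch below assumes Python 3 and its `tracemalloc` module, which only counts memory allocated through the interpreter; the function names and the 50,000-line file are illustrative. On a file of that size, the `readlines` peak would be expected to come out clearly higher than the iteration peak.

```python
import os
import tempfile
import tracemalloc


def parse_with_readlines(path):
    with open(path) as f:
        for line in f.readlines():      # builds the full list of lines first
            x, y = map(float, line.split())


def parse_with_iteration(path):
    with open(path) as f:
        for line in f:                  # holds one line plus the read buffer
            x, y = map(float, line.split())


def peak_bytes(func, path):
    """Peak memory traced by tracemalloc while func(path) runs."""
    tracemalloc.start()
    func(path)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


# Generate a data file large enough that buffering alone cannot hide the difference.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines("%d %d\n" % (i, i + 1) for i in range(50000))

peak_readlines = peak_bytes(parse_with_readlines, tmp.name)
peak_iteration = peak_bytes(parse_with_iteration, tmp.name)
os.remove(tmp.name)
print(peak_readlines, peak_iteration)
```

The absolute numbers depend on the interpreter and platform, so only the comparison between the two is meaningful.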

Jeff Shannon
Technician/Programmer
Credit International

Jul 18 '05 #9

enigma

Do you really need to use the iter function here? As far as I can
tell, a file object is already an iterator. The file object
documentation says that, "[a] file object is its own iterator, for
example iter(f) returns f (unless f is closed)." It doesn't look like
it makes a difference one way or the other, I'm just curious.
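
This is easy to check directly. A sketch for Python 3 (an assumption; there `f.next()` is spelled `next(f)`), using a temporary file:

```python
import os
import tempfile

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("x y\n1 2\n")

with open(tmp.name) as f:
    same_object = iter(f) is f  # a file object is its own iterator
    header = next(f)            # so next() works without calling iter()

os.remove(tmp.name)
print(same_object, repr(header))  # True 'x y\n'
```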

Jul 18 '05 #10

Steven Bethard

enigma wrote:

> Do you really need to use the iter function here? As far as I can
> tell, a file object is already an iterator. The file object
> documentation says that, "[a] file object is its own iterator, for
> example iter(f) returns f (unless f is closed)." It doesn't look like
> it makes a difference one way or the other, I'm just curious.

Nope, you're right -- that's just my obsessive-compulsive disorder
kicking in. ;) A lot of objects aren't their own iterators, so I tend
to ask for an iterator with iter() when I know I want one. But for
files, this definitely isn't necessary:

py> file('temp.txt', 'w').write("""\
... x y
... 1 2
... 3 4
... 5 6
... """)
py> f = file('temp.txt')
py> f.next()
'x y\n'
py> for line in f:
...     print [float(f) for f in line.split()]
...
[1.0, 2.0]
[3.0, 4.0]
[5.0, 6.0]

And to illustrate Alex Martelli's point that using readline, etc. before
using the file as an iterator is fine:

py> f = file('temp.txt')
py> f.readline()
'x y\n'
py> for line in f:
...     print [float(f) for f in line.split()]
...
[1.0, 2.0]
[3.0, 4.0]
[5.0, 6.0]

But using readline, etc. after using the file as an iterator is *not*
fine, generally:

py> f = file('temp.txt')
py> f.next()
'x y\n'
py> f.readline()
''

In this case, if my understanding's right, the entire file contents have
been read into the iterator buffer, so readline thinks the entire file's
been read in and gives you '' to indicate this.
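
For comparison, Python 3 text files appear not to have this pitfall: iteration and `readline` share one buffer there, so the call after `next` returns the following line. A sketch, assuming CPython 3 and a temporary file:

```python
import os
import tempfile

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("x y\n1 2\n3 4\n5 6\n")

with open(tmp.name) as f:
    header = next(f)            # iterator protocol
    following = f.readline()    # ordinary file method, right after next()

os.remove(tmp.name)
print(repr(header), repr(following))  # 'x y\n' '1 2\n'
```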

Steve

Jul 18 '05 #11
