Record separator for readlines()


I know this has been asked before (I already consulted the Google
Groups archive), but I have not seen a definitive answer. Is there a
way to change the record separator in readlines()? The documentation
does not mention any way to do this. I know way back in 1998, Guido
said he would consider adding it, but apparently that didn't happen.
Is there some way to do this?

--
"First they ignore you, then they laugh at you, then they fight you,
then you win."
-- Mohandas Gandhi
Sep 2 '05 #1
3 Replies


universal newlines?
http://www.python.org/doc/2.3.3/whatsnew/node7.html
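
For reference, here is a small sketch of what universal newlines does, written for Python 3, where it is selected with the `newline` argument rather than the 'U' open mode described in the 2.3 notes linked above. It only recognises "\n", "\r" and "\r\n" as line endings, so it does not by itself give an arbitrary record separator. The sample bytes are made up for illustration.

```python
import io

# Bytes mixing all three conventional line endings.
raw = b"one\r\ntwo\rthree\nfour"

# newline=None turns on universal newlines: "\r\n", "\r" and "\n" all
# end a line, and each one is translated to "\n" on input.
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None)
lines = stream.readlines()
print(lines)
```

A separator such as 'xx' or '\0' is outside what this mode handles, which is why the question still stands.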

Angelic Devil wrote:
I know this has been asked before (I already consulted the Google
Groups archive), but I have not seen a definitive answer. Is there a
way to change the record separator in readlines()? The documentation
does not mention any way to do this. I know way back in 1998, Guido
said he would consider adding it, but apparently that didn't happen.
Is there some way to do this?

--
"First they ignore you, then they laugh at you, then they fight you,
then you win."
-- Mohandas Gandhi


Sep 2 '05 #2

I think you still have to roll your own.

Here's a start:
    def ireadlines(f, s='\n', bs=4096):
        if not s: raise ValueError, "separator must not be empty"
        r = []
        while 1:
            b = f.read(bs)
            if not b: break
            ofs = 0
            while 1:
                next = b.find(s, ofs)
                if next == -1: break
                next += len(s)
                yield ''.join(r) + b[ofs:next]
                del r[:]
                ofs = next
            r.append(b[ofs:])
        yield ''.join(r)
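
When the whole file fits in memory, a simpler alternative to the block-reading generator is to read everything and split once. The following is only a minimal Python 3 sketch: `split_records` is a hypothetical helper, not part of the standard library, and it keeps the separator on each record in the same way readlines() keeps the newline.

```python
def split_records(text, sep="\n"):
    """Split text into records, keeping sep at the end of each record."""
    if not sep:
        raise ValueError("separator must not be empty")
    parts = text.split(sep)
    # Every piece except the last was followed by sep in the input.
    records = [part + sep for part in parts[:-1]]
    # A non-empty last piece is a final record without a trailing sep.
    if parts[-1]:
        records.append(parts[-1])
    return records
```

Typical use would be something like `split_records(f.read(), 'xx')`. For large inputs the generator approach is still preferable, since this version holds the full text and the full list in memory.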


Sep 3 '05 #3

On Fri, 2 Sep 2005 22:10:18 -0500, je****@unpythonic.net wrote:

I think you still have to roll your own.

Here's a start:
    def ireadlines(f, s='\n', bs=4096):
        if not s: raise ValueError, "separator must not be empty"
        r = []
        while 1:
            b = f.read(bs)
            if not b: break
            ofs = 0
            while 1:
                next = b.find(s, ofs)
                if next == -1: break
                next += len(s)
                yield ''.join(r) + b[ofs:next]
                del r[:]
                ofs = next
            r.append(b[ofs:])
        yield ''.join(r)

What if len(s)>1 and read(bs) reads a partial s?

I posted a file splitter some time back which, unless I goofed, handles that
(still not tested beyond the shown examples, so caveat utor(??) ;-)

http://groups.google.com/group/comp....33f8b2e2fcdc49

Thought I might be missing something, but

    >>> def ireadlines(f, s='\n', bs=4096):
    ...     if not s: raise ValueError, "separator must not be empty"
    ...     r = []
    ...     while 1:
    ...         b = f.read(bs)
    ...         if not b: break
    ...         ofs = 0
    ...         while 1:
    ...             next = b.find(s, ofs)
    ...             if next == -1: break
    ...             next += len(s)
    ...             yield ''.join(r) + b[ofs:next]
    ...             del r[:]
    ...             ofs = next
    ...         r.append(b[ofs:])
    ...     yield ''.join(r)
    ...
    >>> from StringIO import StringIO as SIO
    >>> f = SIO('123xx678xxCxx_and so forth')
    >>> for s in ireadlines(f,'xx',4): print repr(s),
    ...
    '123xx678xx' 'Cxx_and so forth'
    >>> for s in ireadlines(f,'xx',5): print repr(s),
    ...
    ''
    >>> f.seek(0)  # oops
    >>> for s in ireadlines(f,'xx',5): print repr(s),
    ...
    '123xx' '678xx' 'Cxx' '_and so forth'
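
The bs=4 run above returns '123xx678xx' as one record because each 'xx' straddles a block boundary and is never seen whole. One possible way to handle that, shown here as a minimal Python 3 sketch and not as the file splitter linked earlier, is to carry the unconsumed tail of each block into the next search. The name `iter_records` and its parameters are made up for illustration, and the sketch assumes a text-mode file object whose read() returns str.

```python
import io


def iter_records(f, sep="\n", block_size=4096):
    """Yield records from f; each ends with sep, except possibly the last."""
    if not sep:
        raise ValueError("separator must not be empty")
    buf = ""
    while True:
        block = f.read(block_size)
        if not block:
            break
        # Prepend the leftover tail so a separator split across two
        # reads is seen in one piece.
        buf += block
        start = 0
        while True:
            end = buf.find(sep, start)
            if end == -1:
                break
            end += len(sep)
            yield buf[start:end]
            start = end
        # Keep whatever has not been emitted; it may end in a partial sep.
        buf = buf[start:]
    # Emit a final record only if something is left after the last sep.
    if buf:
        yield buf
```

With the same test string, `list(iter_records(io.StringIO('123xx678xxCxx_and so forth'), 'xx', 4))` should give the four records regardless of block size. Unlike the posted generator, this sketch does not yield a trailing empty string, and it rescans the carried-over tail on every read, so very long records cost more than strictly necessary.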

Regards,
Bengt Richter
Sep 3 '05 #4
