
How to refer to part of a list? Slicing is too slow

I'm a Python newbie. It seems that slicing a list makes a copy.
For example:
>>> a = [1,2,3,4,5,6,7,8,9,0]
>>> b = a[7:]
>>> b
[8, 9, 0]
>>> a.remove(9)
>>> a
[1, 2, 3, 4, 5, 6, 7, 8, 0]
>>> b
[8, 9, 0]

If the list has many elements, slicing it will take a lot of time.
For instance, I have a long string named S that is more than 100K in
size. I want to parse it piece by piece: first I process the first 100
bytes, then I pass the remainder to the next parser function. I pass
S[100:] as the argument to the next parser function, but this copies a
large number of bytes. Is there any way to pass just a reference to the
remainder of the string instead of a copy?
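
To make the copy-versus-reference distinction concrete, here is a minimal sketch. It assumes a current Python 3 and that the data is a bytes object rather than a text string; the sample data is made up for illustration.

```python
# Sketch only: assumes Python 3 and bytes data (memoryview does not accept str).
data = bytes(range(256)) * 400       # about 100K of made-up sample data

view = memoryview(data)              # wraps the existing buffer, no copy
rest = view[100:]                    # slicing a memoryview does not copy either

print(len(rest))                     # 102300
print(rest.obj is data)              # True: still backed by the original object
print(bytes(rest[:4]))               # b'defg', copies only these 4 bytes
```

A parser that receives `rest` can index and slice it like the original bytes, and only the pieces it explicitly converts with `bytes(...)` are copied.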

May 11 '07 #1
3 Replies


Here is a sample to explain it more clearly:

s = " ..... - this is a large string data - ......."

def parser1(data):
    # do some parsing
    ...
    # pass the remainder to the next parser
    parser2(data[100:])

def parser2(data):
    # do some parsing
    ...
    # pass the remainder to the next parser
    parser3(data[100:])

def parser3(data):
    # do some parsing
    ...
    # pass the remainder to the next parser
    parser4(data[100:])

....
May 11 '07 #2

人言落日是天涯,望极天涯不见家 wrote:
> I'm a Python newbie. It seems that slicing a list makes a copy.
> For example:
> >>> a = [1,2,3,4,5,6,7,8,9,0]
> >>> b = a[7:]
> >>> b
> [8, 9, 0]
> >>> a.remove(9)
> >>> a
> [1, 2, 3, 4, 5, 6, 7, 8, 0]
> >>> b
> [8, 9, 0]
>
> If the list has many elements, slicing it will take a lot of time.
> For instance, I have a long string named S that is more than 100K in
> size. I want to parse it piece by piece: first I process the first 100
> bytes, then I pass the remainder to the next parser function. I pass
> S[100:] as the argument to the next parser function, but this copies a
> large number of bytes. Is there any way to pass just a reference to the
> remainder of the string instead of a copy?

You can use itertools.islice:

py> import itertools
py> a = [1,2,3,4,5,6,7,8,9,0]
py> b = itertools.islice(a, 7)
py> b
<itertools.islice object at 0xb7d9c34c>
py> b.next()
1
py> b.next()
2
py> b.next()
3
py> b.next()
4
py> b.next()
5
py> b.next()
6
py> b.next()
7
py> b.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
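
Note that islice(a, 7) yields the first seven items, whereas the slice a[7:] in the question is the tail. A minimal sketch of the tail variant follows, written for a current Python 3 (where b.next() is spelled next(b)); the variable names are only illustrative.

```python
import itertools

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]

# islice(a, 7) stops after 7 items; islice(a, 7, None) starts at index 7
# and runs to the end, like a[7:] but without building a new list.
tail = itertools.islice(a, 7, None)

first = next(tail)       # Python 3 spelling of tail.next()
rest = list(tail)        # whatever is left in the iterator
print(first, rest)       # 8 [9, 0]
```

No list is copied here, but islice still has to step over the skipped items one by one, so skipping far into a long sequence is not free.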

HTH,
Martin
May 11 '07 #3

In <11*********************@o5g2000hsb.googlegroups.com>,
人言落日是天涯,望极天涯不见家 wrote:
> Here is a sample to explain it more clearly:
>
> s = " ..... - this is a large string data - ......."
>
> def parser1(data):
>     # do some parsing
>     ...
>     # pass the remainder to the next parser
>     parser2(data[100:])
>
> def parser2(data):
>     # do some parsing
>     ...
>     # pass the remainder to the next parser
>     parser3(data[100:])
>
> def parser3(data):
>     # do some parsing
>     ...
>     # pass the remainder to the next parser
>     parser4(data[100:])
>
> ...

Do you need the remainder within the parser functions? If not, you could
split the data into chunks of 100 bytes and pass an iterator from function
to function. Untested:

def iter_chunks(data, chunksize):
    offset = 0  # start at the beginning so the first chunk is not skipped
    while True:
        result = data[offset:offset + chunksize]
        if not result:
            break
        yield result
        offset += chunksize  # advance, otherwise the loop never ends

def parser1(data):
    chunk = next(data)
    # ...
    parser2(data)

def parser2(data):
    chunk = next(data)
    # ...
    parser3(data)

# ...

def main():
    # Read or create data.
    # ...
    parser1(iter_chunks(data, 100))
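
If the parsers do need access to the remainder, another option is to hand the whole string to each function together with a start offset, so that only the 100-character piece being parsed is ever copied. This is only a sketch: the pos parameter and the returned lists of chunk sizes are assumptions added for illustration, not part of the code above.

```python
def parser1(data, pos=0):
    chunk = data[pos:pos + 100]      # copies only this 100-character piece
    # ... parse chunk ...
    # pass the same string plus the new offset instead of data[100:]
    return [len(chunk)] + parser2(data, pos + 100)


def parser2(data, pos):
    chunk = data[pos:pos + 100]
    # ... parse chunk, then hand (data, pos + 100) to the next parser ...
    return [len(chunk)]


s = "x" * 150                        # stand-in for the large string
sizes = parser1(s)
print(sizes)                         # [100, 50]
```

Passing data itself never copies the string, since function arguments are references; only the explicit slices create new objects.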

Ciao,
Marc 'BlackJack' Rintsch
May 11 '07 #4
