How to increase the speed of this program?

I want to join two mono wave files into a stereo wave file using only
the standard Python modules.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at the lines marked #1, #2, and #3.

import wave
import array
lfile = wave.open(lfilename)
rfile = wave.open(rfilename)
ofile = wave.open(ofilename, "w")
lformat = lfile.getparams()
rformat = rfile.getparams()
lframes = lfile.readframes(lformat[3])
rframes = rfile.readframes(rformat[3])
lfile.close()
rfile.close()
larray = array.array("h", lframes)
rarray = array.array("h", rframes)
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1
oarray[0::2] = larray #2
oarray[1::2] = rarray #3
ofile.setnchannels(2)
ofile.setsampwidth(2)
ofile.setframerate(lformat[2])
ofile.setnframes(len(larray))
ofile.writeframes(oarray.tostring())
ofile.close()

Nov 28 '06 #1
"HYRY" <zh*****@feng.co.jp> wrote in message
news:11*********************@j44g2000cwa.googlegroups.com...
I want to join two mono wave file to a stereo wave file by only using
the default python module.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at line #1, #2, #3.
I'm not overly familiar with the array module, but one place you may be
paying a penalty is in allocating the list of 0's, and then interleaving the
larray and rarray lists.

What if you replace lines 1-3 with:

import itertools

def takeOneAtATime(tupleiter):
    # Flatten (left, right) pairs into left, right, left, right, ...
    for i in tupleiter:
        yield i[0]
        yield i[1]

oarray = array.array("h", takeOneAtATime(itertools.izip(larray, rarray)))

Or, in place of calling takeOneAtATime, use itertools.chain:

oarray = array.array("h", itertools.chain(*itertools.izip(larray, rarray)))

Use itertools.izip (you have to import itertools somewhere up top) to take
left and right values in pairs, then use takeOneAtATime to yield these values
one at a time. The key, though, is that you aren't building a list ahead of
time, but feeding array.array from a generator. On the other hand, array.array
may just be building an internal list anyway, so this may be a wash.
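
For readers on Python 3, here is a minimal sketch of the same interleaving idea. It assumes the built-in zip (which is lazy in Python 3) in place of itertools.izip, and the helper name interleave is made up for illustration. It makes no claim about being faster than slice assignment.

```python
import array
import itertools


def interleave(left, right):
    """Return a new array alternating left, right, left, right, ..."""
    # zip() yields (left_sample, right_sample) pairs lazily in Python 3.
    pairs = zip(left, right)
    # chain.from_iterable flattens the pairs without building a list first.
    return array.array(left.typecode, itertools.chain.from_iterable(pairs))


left = array.array("h", [1, 2, 3])
right = array.array("h", [10, 20, 30])
stereo = interleave(left, right)
```

Because the samples pass through the iterator protocol one at a time, this version still does its work at the Python level.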

Also, try psyco, if you can, especially with this version. Or pyrex to
optimize this data-interleaving.

HTH,
-- Paul
Nov 28 '06 #2
HYRY wrote:
I want to join two mono wave file to a stereo wave file by only using
the default python module.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at line #1, #2, #3.
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1
ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h").fromstring("\0" * size)

may be a bit faster.

Peter
Nov 28 '06 #3
HYRY wrote:
I want to join two mono wave file to a stereo wave file by only using
the default python module.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at line #1, #2, #3.
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1
ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.
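
A note for Python 3 readers: array.fromstring() and array.tostring() were removed in Python 3.9, and frombytes() and tobytes() are the replacements. The sketch below shows the same zero-fill trick under that assumption. The helper name zeroed_array is hypothetical, and the sample values are invented.

```python
import array


def zeroed_array(typecode, count):
    """Create an array of `count` zero items from a block of zero bytes."""
    out = array.array(typecode)
    # bytes(n) is n zero bytes, so no Python-level list of ints is built.
    out.frombytes(bytes(out.itemsize * count))
    return out


larray = array.array("h", [1, 2, 3])
rarray = array.array("h", [-1, -2, -3])

oarray = zeroed_array("h", len(larray) + len(rarray))
oarray[0::2] = larray  # left channel goes to the even positions
oarray[1::2] = rarray  # right channel goes to the odd positions
```

Using out.itemsize instead of a hard-coded ITEMSIZE = 2 keeps the byte count right on platforms where the typecode has a different width.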

Peter
Nov 28 '06 #4
I think
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1
oarray[0::2] = larray #2
oarray[1::2] = rarray #3
will be executed at C level, but if I use itertools, the program is
executed at Python level. So the itertools version is actually slower
than the original program.
I tested #1, #2, and #3. The speed of #2 and #3 is OK, but #1 is slow.
So my question is: is there a way to create a huge array
without an initializer?

Nov 28 '06 #5

Peter Otten wrote:
HYRY wrote:
I want to join two mono wave file to a stereo wave file by only using
the default python module.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at line #1, #2, #3.
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Peter
Thank you very much, that is just what I want.

Nov 28 '06 #6
Peter Otten wrote:
HYRY wrote:
I want to join two mono wave file to a stereo wave file by only using
the default python module.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at line #1, #2, #3.
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.
Confirmed:

$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h"); a.fromstring("\0"*(2*N))'
100 loops, best of 3: 9.68 msec per loop
$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h", [0]*N);'
10 loops, best of 3: 199 msec per loop
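
The same comparison can be run from inside a script with the timeit module. This is a sketch for Python 3, so it uses frombytes() rather than fromstring(). The function names and the smaller N are choices made for illustration, and the numbers it prints will differ from the ones quoted above.

```python
import timeit
from array import array

N = 10**5  # smaller than the 10**6 used above so the sketch runs quickly


def from_list():
    # Builds a list of N int objects, then converts each one.
    return array("h", [0] * N)


def from_zero_bytes():
    # Copies a block of zero bytes straight into the array buffer.
    a = array("h")
    a.frombytes(bytes(a.itemsize * N))
    return a


def from_repeat():
    # Repeats a one-item array N times.
    return array("h", [0]) * N


CALLS = 10
timings = {
    f.__name__: min(timeit.repeat(f, number=CALLS, repeat=3))
    for f in (from_list, from_zero_bytes, from_repeat)
}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds / CALLS * 1e3:.3f} ms per call")
```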

Peter

Nov 28 '06 #7

Peter Otten wrote:
Peter Otten wrote:
HYRY wrote:
I want to join two mono wave file to a stereo wave file by only using
the default python module.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at line #1, #2, #3.
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1
ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Confirmed:

$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a =
array("h"); a.fromstring("\0"*(2*N))'
100 loops, best of 3: 9.68 msec per loop
$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h",
[0]*N);'
10 loops, best of 3: 199 msec per loop
Funny thing is that using a huge temporary string is faster than
multiplying a small array:

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h','\0\0'); a*N"
10 loops, best of 3: 28.4 msec per loop

Perhaps if array multiplication were as smart as string multiplication,
then the array multiplication version would be the fastest.

-- Leo

Nov 28 '06 #8
Leo Kislov wrote:
Peter Otten wrote:
Peter Otten wrote:
HYRY wrote:

I want to join two mono wave file to a stereo wave file by only using
the default python module.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at line #1, #2, #3.

oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Confirmed:

$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a =
array("h"); a.fromstring("\0"*(2*N))'
100 loops, best of 3: 9.68 msec per loop
$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a =
array("h",
[0]*N);'
10 loops, best of 3: 199 msec per loop

Funny thing is that using huge temporary string is faster that
multiplying small array:

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a
=array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a
= array('h','\0\0'); a*N"
10 loops, best of 3: 28.4 msec per loop

Perhaps if array multiplication was as smart as string multiplication
then array multiplication version would be the fastest.
That will not suffice:

$ python2.5 -m timeit -s'from array import array; from itertools import repeat; N = 10**6; init = [0]*N' 'array("h", init)'
10 loops, best of 3: 130 msec per loop

$ python2.5 -m timeit -s'from array import array; from itertools import repeat; N = 10**6; init = "\n"*(2*N)' 'array("h").fromstring(init)'
100 loops, best of 3: 5 msec per loop

A big chunk of the time is probably consumed by "casting" the list items.
Perhaps an array.fill(value, repeat) method would be useful.

Peter

Nov 28 '06 #9
Peter Otten wrote:
Leo Kislov wrote:
Peter Otten wrote:
Peter Otten wrote:

HYRY wrote:

I want to join two mono wave file to a stereo wave file by only using
the default python module.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at line #1, #2, #3.

oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Confirmed:

$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a =
array("h"); a.fromstring("\0"*(2*N))'
100 loops, best of 3: 9.68 msec per loop
$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a =
array("h",
[0]*N);'
10 loops, best of 3: 199 msec per loop

Funny thing is that using huge temporary string is faster that
multiplying small array:

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a
=array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a
= array('h','\0\0'); a*N"
10 loops, best of 3: 28.4 msec per loop

Perhaps if array multiplication was as smart as string multiplication
then array multiplication version would be the fastest.
Oops, I have to work on my reading skills. You're right, of course...
That will not suffice:

$ python2.5 -m timeit -s'from array import array; from itertools import
repeat; N = 10**6; init = [0]*N' 'array("h", init)'
10 loops, best of 3: 130 msec per loop

$ python2.5 -m timeit -s'from array import array; from itertools import
repeat; N = 10**6; init = "\n"*(2*N)' 'array("h").fromstring(init)'
100 loops, best of 3: 5 msec per loop

A big chunk of the time is probably consumed by "casting" the list items.
Perhaps an array.fill(value, repeat) method would be useful.
... and that could be spelled array.__mul__ as you suggest.

Peter

Nov 28 '06 #10
HYRY wrote:
Peter Otten wrote:
HYRY wrote:
I want to join two mono wave file to a stereo wave file by only using
the default python module.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at line #1, #2, #3.
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1
ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Peter

Thank you very much, that is just what I want.
Even faster: oarray = larray + rarray

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6; b = array('h', [0])*(N/2); c = b[:]" "a = b + c"
100 loops, best of 3: 5.7 msec per loop
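
Applied to the original program, the concatenation trick could look like the sketch below. It assumes both channels have the same typecode and the same number of samples. The helper name interleave_via_concat is made up for illustration.

```python
from array import array


def interleave_via_concat(left, right):
    """Interleave two equal-length arrays, using left + right as the buffer."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    # Concatenation allocates len(left) + len(right) items in one step;
    # the contents are overwritten by the two slice assignments below.
    out = left + right
    out[0::2] = left
    out[1::2] = right
    return out


stereo = interleave_via_concat(array("h", [1, 2, 3]), array("h", [4, 5, 6]))
```

The concatenated contents are thrown away. The point is only to get an output array of the right size without building an initializer list.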

-- Leo

Nov 28 '06 #11

Peter Otten wrote:
Peter Otten wrote:
Leo Kislov wrote:
Peter Otten wrote:
Peter Otten wrote:

HYRY wrote:

I want to join two mono wave file to a stereo wave file by only using
the default python module.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at line #1, #2, #3.

oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Confirmed:

$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a =
array("h"); a.fromstring("\0"*(2*N))'
100 loops, best of 3: 9.68 msec per loop
$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a =
array("h",
[0]*N);'
10 loops, best of 3: 199 msec per loop

Funny thing is that using huge temporary string is faster that
multiplying small array:

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a
=array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a
= array('h','\0\0'); a*N"
10 loops, best of 3: 28.4 msec per loop

Perhaps if array multiplication was as smart as string multiplication
then array multiplication version would be the fastest.

Oops, I have to work on my reading skills. You're right, of course...
That will not suffice:

$ python2.5 -m timeit -s'from array import array; from itertools import
repeat; N = 10**6; init = [0]*N' 'array("h", init)'
10 loops, best of 3: 130 msec per loop

$ python2.5 -m timeit -s'from array import array; from itertools import
repeat; N = 10**6; init = "\n"*(2*N)' 'array("h").fromstring(init)'
100 loops, best of 3: 5 msec per loop

A big chunk of the time is probably consumed by "casting" the list items.
Perhaps an array.fill(value, repeat) method would be useful.

... and that could be spelled array.__mul__ as you suggest.
I'm extremely agnostic about the spelling :-) IOW I'd be very glad of
any way [pure Python; e.g. maintaining my own version of the array
module doesn't qualify] to simply and rapidly create an array.array
instance with typecode t and number of elements n with each element
initialised to value v (default to be the zero appropriate to the
typecode).

Cheers,
John

Nov 28 '06 #12
John Machin wrote:
I'm extremely agnostic about the spelling :-) IOW I'd be very glad of
any way [pure Python; e.g. maintaining my own version of the array
module doesn't qualify] to simply and rapidly create an array.array
instance with typecode t and number of elements n with each element
initialised to value v (default to be the zero appropriate to the
typecode).
array(t, [v])*n

</F>

Nov 28 '06 #13
Fredrik Lundh wrote:
John Machin wrote:
I'm extremely agnostic about the spelling :-) IOW I'd be very glad of
any way [pure Python; e.g. maintaining my own version of the array
module doesn't qualify] to simply and rapidly create an array.array
instance with typecode t and number of elements n with each element
initialised to value v (default to be the zero appropriate to the
typecode).

array(t, [v])*n
Of course Leo was already there before I messed it up again.

$ python2.5 -m timeit -s'from array import array; s = "abc"' 'a = array("c", s); a*1000000'
10 loops, best of 3: 53.5 msec per loop

$ python2.5 -m timeit -s'from array import array; s = "abc"' 'a = array("c", s); s*1000000'
100 loops, best of 3: 7.63 msec per loop

So str * N is significantly faster than array * N even if the same amount of
data is copied.

Peter

Nov 28 '06 #14
Fredrik Lundh wrote:
John Machin wrote:
I'm extremely agnostic about the spelling :-) IOW I'd be very glad of
any way [pure Python; e.g. maintaining my own version of the array
module doesn't qualify] to simply and rapidly create an array.array
instance with typecode t and number of elements n with each element
initialised to value v (default to be the zero appropriate to the
typecode).

array(t, [v])*n

</F>
Thanks, that's indeed faster than array(t, [v]*n) but what I had in
mind was something like an additional constructor:

array.filledarray(typecode, repeat_value, repeat_count)

which I speculate should be even faster. Looks like I'd better get a
copy of arraymodule.c and start fiddling.

Anyone who could use this? Suggestions on name? Argument order?

Functionality: same as array.array(typecode, [repeat_value]) *
repeat_count. So it would cope with array.filledarray('c', "foo", 10)

I'm presuming an additional constructor would be better than doubling
up on the existing one:

array.array(typecode[, initializer])
and
array.array(typecode[, repeat_value, repeat_count])
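
Until something like that exists in the array module, the proposed behaviour can be approximated in pure Python. The sketch below is only a stand-in built on array(t, [v]) * n. The function filledarray is hypothetical and not part of the standard library, and its argument handling is an assumption. It is written for Python 3, where the 'c' typecode no longer exists.

```python
from array import array


def filledarray(typecode, repeat_value, repeat_count=1):
    """Hypothetical helper: array(typecode, [repeat_value]) * repeat_count.

    `repeat_value` may be a single item or a list/tuple of items to repeat.
    """
    if isinstance(repeat_value, (list, tuple)):
        seed = array(typecode, repeat_value)
    else:
        seed = array(typecode, [repeat_value])
    # The repetition itself happens in C inside array.__mul__.
    return seed * repeat_count


zeros = filledarray("h", 0, 5)
pattern = filledarray("b", [1, 2], 3)
```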

Cheers,
John

Nov 28 '06 #15
John Machin wrote:
Thanks, that's indeed faster than array(t, [v]*n) but what I had in
mind was something like an additional constructor:

array.filledarray(typecode, repeat_value, repeat_count)

which I speculate should be even faster.
Before you add a new API, you should probably start by borrowing the
repeat code from Objects/stringobject.c and see if the speedup is good
enough.

</F>

Nov 28 '06 #16
John Machin wrote:
Thanks, that's indeed faster than array(t, [v]*n) but what I had in
mind was something like an additional constructor:

array.filledarray(typecode, repeat_value, repeat_count)

which I speculate should be even faster. Looks like I'd better get a
copy of arraymodule.c and start fiddling.

Anyone who could use this? Suggestions on name? Argument order?

Functionality: same as array.array(typecode, [repeat_value]) *
repeat_count. So it would cope with array.filledarray('c', "foo", 10)
Why not just optimize array.__mul__? The difference is clearly in the
repeated memcpy() in arraymodule.c:683. Pseudo-unrolling the loop in
Python demonstrates a speedup:

[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c',['\0'])*100000"
100 loops, best of 3: 3.14 msec per loop
[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c',['\0','\0','\0','\0'])*25000"
1000 loops, best of 3: 732 usec per loop
[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c','\0'*20)*5000"
10000 loops, best of 3: 148 usec per loop

Which is quite close to your fromstring solution:

[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c').fromstring('\0'*100000)"
10000 loops, best of 3: 137 usec per loop

In fact, you can make it about 4x faster by balancing:

[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c','\0'*200)*500"
10000 loops, best of 3: 32.4 usec per loop

For the record:

[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c','\0'*100000)"
10000 loops, best of 3: 140 usec per loop
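
The "balancing" idea can be wrapped in a small helper: build a medium-sized seed first, then repeat the seed. This is a sketch for Python 3. The name balanced_zeros and the default chunk of 200 items are assumptions taken from the timing above, and whether it still pays off depends on the interpreter version.

```python
from array import array


def balanced_zeros(typecode, count, chunk=200):
    """Build `count` zero items by repeating a seed of up to `chunk` items."""
    seed = array(typecode, [0]) * min(chunk, count)
    if not seed:
        return seed  # count == 0: nothing to repeat
    repeats, remainder = divmod(count, len(seed))
    # Fewer, larger copies for the bulk, then a partial seed for the rest.
    return seed * repeats + seed[:remainder]


big = balanced_zeros("h", 100000)
odd = balanced_zeros("h", 450)
```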

-Mike

Nov 28 '06 #17

Klaas wrote:
In fact, you can make it about 4x faster by balancing:

[klaas@worbo ~]$ python -m timeit -s "from array import array"
"array('c','\0'*200)*500"
10000 loops, best of 3: 32.4 usec per loop
This is an unclean minimally-tested patch which achieves reasonable
performance (about 10x faster than unpatched python):

$ ./python -m timeit -s "from array import array" "array('c',
'\0')*100000"
10000 loops, best of 3: 71.6 usec per loop

You have my permission to use this code if you want to submit a patch
to sourceforge (it needs proper benchmarking, testing, and tidying).

-Mike

Index: Modules/arraymodule.c
===================================================================
--- Modules/arraymodule.c (revision 52849)
+++ Modules/arraymodule.c (working copy)
@@ -680,10 +680,29 @@
         return NULL;
     p = np->ob_item;
     nbytes = a->ob_size * a->ob_descr->itemsize;
-    for (i = 0; i < n; i++) {
-        memcpy(p, a->ob_item, nbytes);
-        p += nbytes;
-    }
+
+    if (n) {
+        Py_ssize_t chunk_size = nbytes;
+        Py_ssize_t copied = 0;
+        char *src = np->ob_item;
+
+        /* copy first element */
+        memcpy(p, a->ob_item, nbytes);
+        copied += nbytes;
+
+        /* copy exponentially-increasing chunks */
+        while(chunk_size < (size - copied)) {
+            memcpy(p + copied, src, chunk_size);
+            copied += chunk_size;
+            if(chunk_size < size/10)
+                chunk_size *= 2;
+        }
+        /* copy remainder */
+        while (copied < size) {
+            memcpy(p + copied, src, nbytes);
+            copied += nbytes;
+        }
+    }
     return (PyObject *) np;
 }
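
The general technique the patch is reaching for, filling a buffer by copying chunks that grow exponentially, can be illustrated in Python. This is not a translation of the patch above and not CPython's implementation. It is a simplified doubling copy on a bytearray, and the function name repeat_bytes is made up.

```python
def repeat_bytes(unit: bytes, n: int) -> bytearray:
    """Repeat `unit` n times by copying an already-filled, growing prefix."""
    total = len(unit) * n
    out = bytearray(total)
    if total == 0:
        return out

    # Seed the buffer with a single copy of the unit.
    out[:len(unit)] = unit
    copied = len(unit)

    # Each pass copies the filled prefix onto the next free region, so the
    # filled part doubles until less than one prefix of space remains.
    while copied < total:
        step = min(copied, total - copied)
        out[copied:copied + step] = out[:step]
        copied += step
    return out


stereo_silence = repeat_bytes(b"\x00\x00", 1000)
```

The number of copy operations grows with the logarithm of n rather than with n itself, which is where the speedup over a one-item-at-a-time loop would come from.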

Nov 28 '06 #18

Klaas wrote:
Klaas wrote:
In fact, you can make it about 4x faster by balancing:

[klaas@worbo ~]$ python -m timeit -s "from array import array"
"array('c','\0'*200)*500"
10000 loops, best of 3: 32.4 usec per loop

This is an unclean minimally-tested patch which achieves reasonable
performance (about 10x faster than unpatched python):
<snip>

Never mind, that patch is bogus. An updated patch is here:
http://sourceforge.net/tracker/index...70&atid=305470

-Mike

Nov 29 '06 #19
