473,382 Members | 1,743 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,382 software developers and data experts.

pySQLite Insert speed

I hav read on this forum that SQL coding (A) below is preferred over
(B), but I find (B) is much faster (20-40% faster)

(A)

sqla= 'INSERT INTO DTABLE1 VALUES (%d, %d, %d, %f)' % values
curs.execute(sqla)

(B)
pf= '?, ?, ?, ?'
sqlxb= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
curs.execute( sqlxb, values )

Any intution on why (A) is slower?
Feb 29 '08 #1
11 2035
On Thu, 28 Feb 2008 19:35:03 -0800 (PST), mdboldin wrote
I hav read on this forum that SQL coding (A) below is preferred over
(B), but I find (B) is much faster (20-40% faster)

(A)

sqla= 'INSERT INTO DTABLE1 VALUES (%d, %d, %d, %f)' % values
curs.execute(sqla)

(B)
pf= '?, ?, ?, ?'
sqlxb= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
curs.execute( sqlxb, values )

Any intution on why (A) is slower?
My only problem with (B) is that it should really be this:

sqlxb= 'INSERT INTO DTABLE2 VALUES (?, ?, ?, ?)'
curs.execute( sqlxb, values )

Apart from that, (B) is better than (A). The parameter binding employed in (B)
is not only faster on many databases, but more secure. See, for example,
http://informixdb.blogspot.com/2007/...in-blanks.html for some
in-depth explanations of why parameter binding is better than string
formatting for performing SQL queries with variable values.

HTH,

--
Carsten Haese
http://informixdb.sourceforge.net
Feb 29 '08 #2
(B) is better than (A). The parameter binding employed in (B)
is not only faster on many databases, but more secure.
See, for example,http://informixdb.blogspot.com/2007/07/filling-in-
blanks.html

Thx. The link was helpful, and I think I have read similar things
before-- that B is faster.
So ... I just rewrote the test code from scratch and B is faster. I
must have had something wrong in my original timing.
Feb 29 '08 #3
md******@gmail.com wrote:
>
I hav read on this forum that SQL coding (A) below is preferred over
(B), but I find (B) is much faster (20-40% faster)

(A)

sqla= 'INSERT INTO DTABLE1 VALUES (%d, %d, %d, %f)' % values
curs.execute(sqla)

(B)
pf= '?, ?, ?, ?'
sqlxb= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
curs.execute( sqlxb, values )

Any intution on why (A) is slower?
I think you misunderstood. (B) is *ALWAYS* the proper way of doing
parameterized SQL queries. Unconditionally. The (A) style is way too
vulnerable to SQL injection attacks.
--
Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Mar 1 '08 #4
Steve, I want to make sure I understand. My test code is below, where
ph serves as a placeholder. I am preparing for a case where the number
of ? will be driven by the length of the insert record (dx)

dtable= 'DTABLE3'
print 'Insert data into table %s, version #3' % dtable
ph= '?, ?, ?, ?'
sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)
t0a=time.time()
for dx in d1:
curs1.execute(sqlx,dx)
print (time.time()-t0a)
print curs1.lastrowid
conn1.commit()

I think you are saying that sqlx is re-evaluated in each loop, i.e.
not the same as pure hard coding of
sqlx= 'INSERT INTO DTABLE3 VALUES ( ?, ?, ?, ? ) '
Is that right? Hence (if I understand python convention), this can be
solved by adding
sqlx= copy.copy(sqlx)
before the looping. And in tests adding this step saved about 5-10% in
time.

And yes, I can see why (B) is always better from a security
standpoint. The python solutions for problems such as there are a
great help for people like me, in the sense that the most secure way
does not have a speed penalty (and in this case is 3-4x faster).
Mar 1 '08 #5
md******@gmail.com wrote:
Steve, I want to make sure I understand. My test code is below, where
ph serves as a placeholder. I am preparing for a case where the number
of ? will be driven by the length of the insert record (dx)

dtable= 'DTABLE3'
print 'Insert data into table %s, version #3' % dtable
ph= '?, ?, ?, ?'
sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)
t0a=time.time()
for dx in d1:
curs1.execute(sqlx,dx)
print (time.time()-t0a)
print curs1.lastrowid
conn1.commit()

I think you are saying that sqlx is re-evaluated in each loop, i.e.
not the same as pure hard coding of
sqlx= 'INSERT INTO DTABLE3 VALUES ( ?, ?, ?, ? ) '
Is that right?
Yes. If the sql is constant then you would be performing an unnecessary
computation inside the loop. Not a biggie, but it all takes time. Is the
loop above your original code? If so I was wrong about the loop.
Hence (if I understand python convention), this can be
solved by adding
sqlx= copy.copy(sqlx)
before the looping. And in tests adding this step saved about 5-10% in
time.
Now this I don;t really understand at all. What's the point of trying to
replace sqlx with a copy of itself? Perhaps if you explained what you
hope this will achieve I could comment more intelligently.
And yes, I can see why (B) is always better from a security
standpoint. The python solutions for problems such as there are a
great help for people like me, in the sense that the most secure way
does not have a speed penalty (and in this case is 3-4x faster).
Yes, it's a real win-win. Since both the table and the number of
arguments appear to be variable one possible solution is to build a dict
that would allow you to look up the right SQL using the table name. So,
suppose you have the following tables and number of arguments:

tables = (("table1", 3),
("table2", 5),
("table3", 2)
)

you could create a suitable dict as (untested):

tdict = {}
for tbl, ct in tables:
tdict[tbl] = "INSERT INTO %s VALUES (%s)" % \
(tbl, ", ".join(["?"] * ct))

Then you can use the table to look up the right SQL, quite a fast
operation compared with actually building it.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Mar 1 '08 #6
mmm
Hence (if I understand python convention), this can be
solved by adding
sqlx= copy.copy(sqlx)
before the looping. And in tests adding this step saved about 5-10% in
time.

Now this I don;t really understand at all. What's the point of trying to
replace sqlx with a copy of itself? Perhaps if you explained what you
hope this will achieve I could comment more intelligently.
I am/was attempting to convert

sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)

to code that did to need to be re-evaluated. i.e. to insert the
dtable and ph values as if they were hard coded.

copy.copy -- A shallow copy constructs a new compound object and
then (to the extent possible) inserts references into it to the
objects found in the original.
Mar 4 '08 #7
mmm wrote:
>>Hence (if I understand python convention), this can be
solved by adding
sqlx= copy.copy(sqlx)
before the looping. And in tests adding this step saved about 5-10% in
time.
Now this I don;t really understand at all. What's the point of trying to
replace sqlx with a copy of itself? Perhaps if you explained what you
hope this will achieve I could comment more intelligently.

I am/was attempting to convert

sqlx= 'INSERT INTO %s VALUES ( %s ) ' % (dtable,ph)

to code that did to need to be re-evaluated. i.e. to insert the
dtable and ph values as if they were hard coded.

copy.copy -- A shallow copy constructs a new compound object and
then (to the extent possible) inserts references into it to the
objects found in the original.
Unfortunately you weren't dealing with a compound object object here, so
all you are doing is creating a copy of the string you've just created
and replacing the original with it. Copy.copy() is meant for creating
(say) lists, tuples and dicts where the elements are references to the
same objects that the elements of the original structure referred to.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Mar 4 '08 #8
mmm

Steve, I think you were right the first time is saying
it should really be this:
sqlxb= 'INSERT INTO DTABLE2 VALUES (?, ?, ?, ?)'
my copy.copy() has the equivalent effect.

Running this test code produces the output below

import copy

print 'Test 1'
pf= '?,?,?,?'
sqlx1= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
print sqlx1

print
print 'Test 2'
sqlx2= copy.copy(sqlx1)
sqlx3= sqlx1
pf= '?,?,?, ****'
sqlx1= 'INSERT INTO DTABLE2 VALUES ( %s ) ' % pf
print 'sqlx1= ', sqlx1
print 'sqlx2= ', sqlx2
print 'sqlx3= ', sqlx2

== output
Test group 1
INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

Test group 2
sqlx1= INSERT INTO DTABLE2 VALUES ( ?,?,?, **** )
sqlx2= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )
sqlx3= INSERT INTO DTABLE2 VALUES ( ?,?,?,? )

I interpret this to mean that sqlx1 is not a simple string
Mar 4 '08 #9
Steve Holden wrote:
What I will repeat, however, is that while there is a *slight*
difference is semantics between

s = "some string"
s1 = s

and

s = "some string"
s1 = copy.copy(s)

that difference is only to ensure that s and s1 point to different
copies of the same string in the latter case, whereas in the former case
s and s1 point to the same string.
No, both "point" to the same string:
>>import copy
s = "some string"
s1 = s
s1 is s
True
>>s2 = copy.copy(s)
s2 is s
True

copy.copy() is just an expensive no-op here.

Peter
Mar 4 '08 #10
Peter Otten wrote:
Steve Holden wrote:
>What I will repeat, however, is that while there is a *slight*
difference is semantics between

s = "some string"
s1 = s

and

s = "some string"
s1 = copy.copy(s)

that difference is only to ensure that s and s1 point to different
copies of the same string in the latter case, whereas in the former case
s and s1 point to the same string.

No, both "point" to the same string:
>>>import copy
s = "some string"
s1 = s
s1 is s
True
>>>s2 = copy.copy(s)
s2 is s
True

copy.copy() is just an expensive no-op here.
I suppose wiht strings being immutable there is no need for copy.copy()
to actually return anything other than its argument for a string. Thanks
for pointing that out.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Mar 4 '08 #11
Peter Otten wrote:
Steve Holden wrote:
>What I will repeat, however, is that while there is a *slight*
difference is semantics between

s = "some string"
s1 = s

and

s = "some string"
s1 = copy.copy(s)

that difference is only to ensure that s and s1 point to different
copies of the same string in the latter case, whereas in the former case
s and s1 point to the same string.

No, both "point" to the same string:
>>>import copy
s = "some string"
s1 = s
s1 is s
True
>>>s2 = copy.copy(s)
s2 is s
True

copy.copy() is just an expensive no-op here.
I suppose wiht strings being immutable there is no need for copy.copy()
to actually return anything other than its argument for a string. Thanks
for pointing that out.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
Mar 4 '08 #12

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

10
by: Chris | last post by:
Hi I am inserting 7000 rows in 9 columns, is 11 seconds slow or can that be improved. I am inserting to SQL Server on a server on the network Thanks
11
by: Sezai YILMAZ | last post by:
Hello I need high throughput while inserting into PostgreSQL. Because of that I did some PostgreSQL insert performance tests. ------------------------------------------------------------ --...
8
by: Josué Maldonado | last post by:
Hello List, I'm importing some data from Foxpro to Postgres, there is atable wich contains aprox 4.8 million rows and it size about 830MB. I uploaded it to Postgres using dbf2pg and worked fine,...
5
by: asdf | last post by:
I have a program that reads sorted data from a database and inserts it element by element into a set. If the data is sorted is there a faster way to insert ? Meaning is there a way to tell the...
2
by: lior3790 | last post by:
Hello, I'm working on a project which carrying a large scale of database (over 5 millions records). I'm looking for a way to improve the communication speed with the SQL server in any way that i...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.