By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
429,214 Members | 2,066 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 429,214 IT Pros & Developers. It's quick & easy.

Need help on best way to insert many items quickly

P: n/a
Hi,

My client has a product database od around 20000 items. And it is
updated every few days when he gets a catalog from the supplier.

It's supposed to work like this: if there already is a product in the
database with that ID (primary key), just UPDATE the price, and if
there is no product with that ID it should do an INSERT for that item.

Now I've tried it in several ways and I can't seem to get it to execute
in under 20 seconds (max execution time is 30s, so just to be sure).

I've tried doing an update for every item (UPDATE product SET
price=$price where ID=$id) and if I get a mysql_affected_rows()==0 then
I do an insert. This takes too long for 20000 items.

I've also tried collecting all IDs in one array and then doing an
in_array() check for every ID value, and if it exists in the array I do
an update in MySQL, and if not I do an insert. This also takes too long
(although less than the first method).

Of course I can have the script redirect to itself after 20 seconds and
continue from where it left off, but this seems kind of lame.
Since the client wants to enter extra data for products and not lose it
when he does a price update, I can't just replace everything on every
update, just the price.

Changing the hosting is out of the question (he paid 2 years in advance
and has like 100000 emails set up in the control panel).

I'm sorry if this does not belong in comp.lang.php, sorry to take your
time.


TIA

Jul 17 '05 #1
Share this Question
Share on Google+
6 Replies


P: n/a
a-**@consultant.com wrote:
My client has a product database od around 20000 items. And it is
updated every few days when he gets a catalog from the supplier.

It's supposed to work like this: if there already is a product in the
database with that ID (primary key), just UPDATE the price, and if
there is no product with that ID it should do an INSERT for that item.

Now I've tried it in several ways and I can't seem to get it to
execute in under 20 seconds (max execution time is 30s, so just to be
sure).

I've tried doing an update for every item (UPDATE product SET
price=$price where ID=$id) and if I get a mysql_affected_rows()==0
then I do an insert. This takes too long for 20000 items.

I've also tried collecting all IDs in one array and then doing an
in_array() check for every ID value, and if it exists in the array I
do an update in MySQL, and if not I do an insert. This also takes too
long (although less than the first method).

Of course I can have the script redirect to itself after 20 seconds
and continue from where it left off, but this seems kind of lame.
Since the client wants to enter extra data for products and not lose
it when he does a price update, I can't just replace everything on
every update, just the price.

Changing the hosting is out of the question (he paid 2 years in
advance and has like 100000 emails set up in the control panel).


Try using set_time_limit(0) to see if that allows you to have an
unlimited execustion time. Although the default maximum execution time
is 30 seconds, not all hosts prevent you from extending it.

Read the manual page for more info: http://www.php.net/set_time_limit

--
Chris Hope - The Electric Toolbox - http://www.electrictoolbox.com/
Jul 17 '05 #2

P: n/a
On 18 Dec 2004 11:57:46 -0800, a-**@consultant.com wrote:
My client has a product database od around 20000 items. And it is
updated every few days when he gets a catalog from the supplier.

It's supposed to work like this: if there already is a product in the
database with that ID (primary key), just UPDATE the price, and if
there is no product with that ID it should do an INSERT for that item.

Now I've tried it in several ways and I can't seem to get it to execute
in under 20 seconds (max execution time is 30s, so just to be sure).

I've tried doing an update for every item (UPDATE product SET
price=$price where ID=$id) and if I get a mysql_affected_rows()==0 then
I do an insert. This takes too long for 20000 items.
Is there an index on the ID column?

I just knocked up a quick test to see how long it takes here; created a table
as follows:

mysql> desc product;
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| id | int(11) | | PRI | NULL | auto_increment |
| product_name | varchar(255) | YES | | NULL | |
| price | int(11) | YES | | NULL | |
+--------------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

Filled it with 20000 rows, then did a PHP loop to update the price on them
all. This is on a creakingly ancient 200MHz machine, it took 38 seconds. I'd
hope your server has a little more horsepower than this very old machine.

I then dropped the primary key (primary keys are automatically backed by
indexes) and repeated the test. As expected it took an enormously larger amount
of time; actually I gave up waiting after several minutes of MySQL stuck at
100% CPU usage.
I've also tried collecting all IDs in one array and then doing an
in_array() check for every ID value, and if it exists in the array I do
an update in MySQL, and if not I do an insert. This also takes too long
(although less than the first method).

Of course I can have the script redirect to itself after 20 seconds and
continue from where it left off, but this seems kind of lame.

Since the client wants to enter extra data for products and not lose it
when he does a price update, I can't just replace everything on every
update, just the price.

Changing the hosting is out of the question (he paid 2 years in advance
and has like 100000 emails set up in the control panel).

I'm sorry if this does not belong in comp.lang.php, sorry to take your
time.


Your question is OK here; I believe there's a comp.databases.mysql newsgroup
in the process of being created, but since you're using PHP to do this, it's
on-topic here according to the charter. Perhaps when the MySQL group is created
similar questions could belong there but it's fine for now.

--
Andy Hassall / <an**@andyh.co.uk> / <http://www.andyh.co.uk>
<http://www.andyhsoftware.co.uk/space> Space: disk usage analysis tool
Jul 17 '05 #3

P: n/a
What version of MySQL is your client on?

After 4.1 you can do "ON DUPLICATE KEY UPDATE", which first tries to insert,
then updates if the record exists.

You can also do multiple values sets like this:

INSERT INTO table (ID,price) VALUES (1000,4.95),(1001,5.95),etc... ON
DUPLICATE KEY UPDATE price=VALUES(price);

Even if the client isn't on a recent MySQL version, you can still do
multiple VALUES sets in a single query. So you could INSERT all new values
in a single statement (be sure to check MAX_ALLOWED_PACKET size to make sure
your query isn't too long). You must check that all your inserts are unique
first (which I believe you could do with a single SELECT statement).

- Kevin

<a-**@consultant.com> wrote in message
news:11**********************@c13g2000cwb.googlegr oups.com...
Hi,

My client has a product database od around 20000 items. And it is
updated every few days when he gets a catalog from the supplier.

It's supposed to work like this: if there already is a product in the
database with that ID (primary key), just UPDATE the price, and if
there is no product with that ID it should do an INSERT for that item.

Now I've tried it in several ways and I can't seem to get it to execute
in under 20 seconds (max execution time is 30s, so just to be sure).

I've tried doing an update for every item (UPDATE product SET
price=$price where ID=$id) and if I get a mysql_affected_rows()==0 then
I do an insert. This takes too long for 20000 items.

I've also tried collecting all IDs in one array and then doing an
in_array() check for every ID value, and if it exists in the array I do
an update in MySQL, and if not I do an insert. This also takes too long
(although less than the first method).

Of course I can have the script redirect to itself after 20 seconds and
continue from where it left off, but this seems kind of lame.
Since the client wants to enter extra data for products and not lose it
when he does a price update, I can't just replace everything on every
update, just the price.

Changing the hosting is out of the question (he paid 2 years in advance
and has like 100000 emails set up in the control panel).

I'm sorry if this does not belong in comp.lang.php, sorry to take your
time.


TIA

Jul 17 '05 #4

P: n/a
How about using REPLACE INTO instead of INSERT INTO. This will work if the
records currently exist in the database or not.

--
Tony Marston

http://www.tonymarston.net

"Kevin" <ke***@wxREMOVE4SPAM3.com> wrote in message
news:BO0xd.211004$5K2.162710@attbi_s03...
What version of MySQL is your client on?

After 4.1 you can do "ON DUPLICATE KEY UPDATE", which first tries to
insert,
then updates if the record exists.

You can also do multiple values sets like this:

INSERT INTO table (ID,price) VALUES (1000,4.95),(1001,5.95),etc... ON
DUPLICATE KEY UPDATE price=VALUES(price);

Even if the client isn't on a recent MySQL version, you can still do
multiple VALUES sets in a single query. So you could INSERT all new
values
in a single statement (be sure to check MAX_ALLOWED_PACKET size to make
sure
your query isn't too long). You must check that all your inserts are
unique
first (which I believe you could do with a single SELECT statement).

- Kevin

<a-**@consultant.com> wrote in message
news:11**********************@c13g2000cwb.googlegr oups.com...
Hi,

My client has a product database od around 20000 items. And it is
updated every few days when he gets a catalog from the supplier.

It's supposed to work like this: if there already is a product in the
database with that ID (primary key), just UPDATE the price, and if
there is no product with that ID it should do an INSERT for that item.

Now I've tried it in several ways and I can't seem to get it to execute
in under 20 seconds (max execution time is 30s, so just to be sure).

I've tried doing an update for every item (UPDATE product SET
price=$price where ID=$id) and if I get a mysql_affected_rows()==0 then
I do an insert. This takes too long for 20000 items.

I've also tried collecting all IDs in one array and then doing an
in_array() check for every ID value, and if it exists in the array I do
an update in MySQL, and if not I do an insert. This also takes too long
(although less than the first method).

Of course I can have the script redirect to itself after 20 seconds and
continue from where it left off, but this seems kind of lame.
Since the client wants to enter extra data for products and not lose it
when he does a price update, I can't just replace everything on every
update, just the price.

Changing the hosting is out of the question (he paid 2 years in advance
and has like 100000 emails set up in the control panel).

I'm sorry if this does not belong in comp.lang.php, sorry to take your
time.


TIA


Jul 17 '05 #5

P: n/a
Tony Marston wrote:
How about using REPLACE INTO instead of INSERT INTO. This will work if the
records currently exist in the database or not.


Not quite, since REPLACE first deletes the existing row, before inserting
the new, and as he said, he didn't want to delete the existing data, only
update parts of the row.

--
Tommy

Jul 17 '05 #6

P: n/a
Sorry if this appears twice, I posted this yesterday and it didn't show up:

What version of MySQL is your client on?

After 4.1 you can do "ON DUPLICATE KEY UPDATE", which first tries to insert,
then updates if the record exists.

You can also do multiple values sets like this:

INSERT INTO table (ID,price) VALUES (1000,4.95),(1001,5.95),etc... ON
DUPLICATE KEY UPDATE price=VALUES(price);

Even if the client isn't on a recent MySQL version, you can still do
multiple VALUES sets in a single query. So you could INSERT all new values
in a single statement (be sure to check MAX_ALLOWED_PACKET size to make sure
your query isn't too long). You must check that all your inserts are unique
first (which you could do with a single SELECT * WHERE ID IN () statement).

- Kevin

<a-**@consultant.com> wrote in message
news:11**********************@c13g2000cwb.googlegr oups.com...
Hi,

My client has a product database od around 20000 items. And it is
updated every few days when he gets a catalog from the supplier.

It's supposed to work like this: if there already is a product in the
database with that ID (primary key), just UPDATE the price, and if
there is no product with that ID it should do an INSERT for that item.

Now I've tried it in several ways and I can't seem to get it to execute
in under 20 seconds (max execution time is 30s, so just to be sure).

I've tried doing an update for every item (UPDATE product SET
price=$price where ID=$id) and if I get a mysql_affected_rows()==0 then
I do an insert. This takes too long for 20000 items.

I've also tried collecting all IDs in one array and then doing an
in_array() check for every ID value, and if it exists in the array I do
an update in MySQL, and if not I do an insert. This also takes too long
(although less than the first method).

Of course I can have the script redirect to itself after 20 seconds and
continue from where it left off, but this seems kind of lame.
Since the client wants to enter extra data for products and not lose it
when he does a price update, I can't just replace everything on every
update, just the price.

Changing the hosting is out of the question (he paid 2 years in advance
and has like 100000 emails set up in the control panel).

I'm sorry if this does not belong in comp.lang.php, sorry to take your
time.


TIA

Jul 17 '05 #7

This discussion thread is closed

Replies have been disabled for this discussion.