By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
432,257 Members | 928 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 432,257 IT Pros & Developers. It's quick & easy.

How to do a query between two mysql instances

P: 1
I will appreciate any help you can provide me.

In the company where i work we have a project which creates about 4-5M records daily of stats. We're currently storing this data in a db named 'summary' which contains all the stats data for up to 15 days. After that we begin to move the data in a daily basis to a single 'archived like' unique table in another fast performance mysql instance. Why we do this ? to respect the 80%-20% rule: 80% of our stats queries will use data from the last 30 days.

This way we save all the stats history for many years in the fast performance mysql instance. There we will have (haven't reached to that point yet) all the old data which could eventually be retrieved by a user which tries to get it. And slso, this way we deal with just 15 days of data (about 200 MB) for the most common queries in the 'summary' db. Data is moved daily.

Our app uses PHP do perform all the user interaction. And everything works ok so far but we have a problem: if i want to list data for a date range which spans data in both servers (like a 60 days query) we have to do two queries: one in the 'archived' db and other in the summary db. Should it be the same mysql instance we'd do a join and that's all. But no. Remember that they're different mysql instances (and possibly although not by now, different servers) so we can't do a join among them and this is mandatory because data could (and very possible will) exists in both instances. So i have to make the join 'virtually' with php.

The problem with that is that obviously it demands lots of memory. We could create a single temp table with the data aggregated from both db's but the output is the same: we'd have to read at least one of the tables ion memory in order to create the temp table.

So my question is, besides federated tables (which are more to be used like a data warehouse and which have the problem is sending all the records through the network connection for the join) and besides replication (that's a performance solution and this is not a performance solution (and actually one of the mysql instances doesn't accepts replication) what solution could i have to deal with this scenario ? how does big companies with lots of traffic do to retrieve stats for many years ? i'm sure they should use many db's/servers but how ? sharding ? mysql table partitioning ?

In fact my problem is to be able to make a join/sum of data of identical tables but splitted (by date) between two mysql instances.

Thanks for your help.
Sep 21 '09 #1
Share this question for a faster answer!
Share on Google+

Post your reply

Sign in to post your reply or Sign up for a free account.