Quote:
Originally Posted by TheServant
First of all, the game is not going to be *that* server intensive. The database will only be updated when the user submits a form to the server. So there will be cron jobs running every 10 minutes, but only updating about 5 or so variables. The user(s) information is not updated periodically but just called on every page.
If you update variables with cron-jobs, then you must somehow notify all sessions that use these tables to clear their cache and reload these tables.
You can achieve this in different ways. I will give you here a maybe not very elegant, but simple way:
You make a table with one row that stores only a version number, let's say "3". To be fast, you should use prepared statements and use a mySQL table engine that is memory-mapped and has no rollback. like myISAM or MEMORY. Now your session reads this version number along with all the table data and caches it in internal session variables. Every time a user submits a new form and his data needs to be processed, you will make an SQL first to check for this version number and compare it with the one stored inside the session. If it's the same, do nothing, else reload the tables. The version number is changed only by the cron-jobs. They increase the number by one when they make a change to the tables.
Quote:
Originally Posted by TheServant
So I guess a further question is updating x variables in a small table basically (obviously not exactly, but basically) the same as updating x variables in a big table?
If you update a single table, you have only one SQL to send over the network (or internal port if it's on the same machine). Also the SQL needs to be parsed only once. The result data is grabbed from the harddisk in one chunk.
But if you have multiple small tables instead, it will be slower because you need to send over an SQL for every table (or make a complicated join-query instead), you need to parse all SQLs and your haddisk must move its head to different location to grab all the needed data, which can be thousand times slower.
So reading/updating in a big table is faster than from many small tables. On the other side, if all the table data can be hold inside the memory of the database, then it's much faster but still not as fast as a big table, because of the parsing/sending overhead of the queries.
So if performance matters, you should use a big table and regularly defragment it. But if perfromance does not matter, it's better to hold all data in small tables, so changes and extensions can be done easier. And you have a better overview in case you need to debug or search manually, to export partial data etc.
Quote:
Originally Posted by TheServant
The tables are not joined, but all use the user_name or user_id as the primary key, so that they can be called in a similar way.
if you use user_id as primary key in one table which matches the user_id in another table, that's exactly called joined data. So they are joined.
if you want to fetch the data, you must execute either:
- select * from table1, table2 where table1.user_id = ? and table2.user_id=?
or call both statements:
[HTML]select * from table1 where user_id = ?;
select * from table2 where user_id = ?; [/HTML]
whereas a big table would let you get all dat with a single statement:
[HTML]select * from big_table where user_id = ? [/HTML]
that's a good example for the explanation in the chapter above. The assumption is that you need data from both tables.
But if you need only data from one table, then calling only
- select soldier_name, rank from table1 where user_id = ?
is slightly (and maybe not noticable) faster than calling
- select soldier_name, rank from big_table where user_id = ?
.
Quote:
Originally Posted by TheServant
Each form will only update a certain number of variables, so say a maximum of 10 variables. For example, if a player trains some soldiers the following groups of data (currently all the same table, but I am thinking about separating them):
{user resources - money, ore, wood}
{user army - swordsmen, archers, cavalry}
So the current setup will include the following data groups (slightly simplified):
- {user resources - money, ore, wood, food, stone}
- {user army - swordsmen, archers, cavalry}
- {user weapons - level 1 weapon, level 2 weapon, level 3 weapon}
- {user alliance information}
- {user attack log and history}
- {user messages}
Can't think of anymore examples but that will do. Now if I purchase weapons, it will only change {user resources} and {user weapons} but the page will require {user resources}, {user army} and {user weapons}. At the moment, it calls the big table (only opens one table) will all the user information, but I am wondering if it more efficient to separate the data into 6 smaller tables and so for this case, only 3 tables will be called (half the original size).
The proposed structure how you divide the data is very good.
I hope if you call the big table, you only return the needed fields (I mean not using "select *"). If you do otherwise, then you would return unneeded data which is costly.
Also do a proper normalization, which forces you to hold data in separate tables.
Never ever store the same data twice! Only store it in one place and reference to it everywhere. For example a user can send many chat messages and one name. Then avoid solutions a) and b), but do solution c):
Solution a)
one big table, stores user_name, chat_message:
"The Servant", "hello how are You?"
"The Servant", "I am fine, too !"
"The Servant", "I don't think so !"
Solution b)
one big table, stores user_name, chat_message1, chat_message2, ... chat_message100:
"The Servant", "hello how are You?", "I am fine, too !", "I don't think so !", null, ...
Solution c)
two tables,
c1) user_id, name
1, "The Servant"
c2) user_id, chat_message
1, "hello how are You?"
1, "I am fine, too !"
1, "I don't think so !"
On the other side, avoid splitting up data that should be logically together. For example you need to store the map position of a soldier: Don't do solution a) but do solution b):
a) two tables:
a1) soldier_id, x_coordinate
1, 800
a2) soldier_id, y_coordinate
1, 600
b) one table: soldier_id, x_coordinate, y_coordinate
1, 800, 600
first, it's faster and second, it's more secure. You probably would define all table columns "not null".
But in case a), if a table is corrupt or only half the data got saved because of a server crash it's possible to have the x-coordinate, but not the y-coordinate. This may lead to null-pointer exceptions in your program when you try to access a position on the map with these coordinates.
In case of b), all data is lost or all data is there, so no crashes.
Quote:
Originally Posted by TheServant
So my concern is that opening several smaller tables is more resource intensive than opening one big one? I hope you understand.
yes it is. And retrieval of data in case of corruption is harder (checking all the broken joins). But you will lose less data and can retrieve more.
If you want an overview of all user's data,the SQL is more complicated, but the overview over partial data is easier.
But on the other side, extension of additional data is very easy, and the performace loss may not be that bad or relevant.
If you have an index on the user_id (or primary key), that means you don't need to make a full table scan, then calling the data from the big table is faster. But if it doesn't go over the network, only local, the performance gain shrinks.
All tables where you need to make a full table scan should be divided into smaller ones. For example you are storing the user chat messages, and you want to know which user uses a bad, dirty word, I mean which can occur everywhere inside the stored user input string, then searching for this word is only possible with a full table scan (assuming you don't use mySqls text search facilities and you just have it stored as a normal string and you seach with "like" or "instr"). So in this case it's better to store it inside a small table, because the database most likely then can hold all the table data inside the memory and doesn't need to access the slow disk, or at least can read it in quick, whole disk buffer chunks. Reading a whole small file is faster than reading a whole big file or jumping with the head around a big file.
Quote:
Originally Posted by TheServant
I use PHP as my language.
I have one server and so no information exchanges occur.
All information I am referring to is user-specific, the game-specific ones are either in the page or called once during login so should not impact anything asked here.
Again, I sincerely thank you for your time, and I look forward to hearing back from you.
sounds good, but what if your game is a big success and one server is not enough anymore? What if your data grows so big that retrieval becomes a performance issue, and you decide to split up the big table into 2 smaller ones, and you store one table on one server, the other table on the second server, and then you inquire both servers in parallel to retrieve the data in double speed as if you would have retrieved from one server only?
Then you have to scap your whole design and rewrite all your SQL's. A big work and coding nightmare. If you have many small tables from the beginning, you don't need to do that. You can handle the relocation on database level (remote access, views etc.)
My advice:
It's much better to split the tables from the beginning on (as what you have proposed, or even more), and if you want to tune the performance of your game, you can put data together later on and only in cases of bottlenecks.