473,765 Members | 2,034 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Going for a LARGE Table: Any Tips?

Hi there

I'm developing a large web application. Part of this web application will
be storing numerical chart data in a MySQL table - these numbers will be
already calculated, and are just being stored for reference.

In this particular table, the stored data will never be deleted or
changed. The only actions performed will be SELECTs and INSERTs. There
will never be any DELETEs or UPDATEs.

The end-result charts themselves have many rows. Let's say about 40-100
rows per chart. I will be storing thousands of charts. I am definitely
looking at the possibility of there being millions (upon millions?) of
rows at some time in the future. I am very aware of relational database
concepts, and one large table truly makes the most sense in this
particular situation. Believe me, there are foreign keys up the wahoo.

A general question to the experts out there (or just people with
experience ;))... what kind of things should I 'look out for', or
prepare for, when designing/using a table that is so large? My previous
MySQL tables in all my years of programming have never really had more
than 40,000 rows. This table will of course be the largest table in a
database that holds many more tables (about 20 or 30).

Here are things I am concerned/wondering about. Any advice on them would
be very much appreciated.

1) Table type: all the tables in the database that depend on data being
inserted/updated to other tables are all InnoDB using transactions. This
table, as mentioned, will strictly be INSERTed to and SELECTed from.
Should I use MyISAM or InnoDB? Just about all columns will be storing
numbers, possibly a varchar column or two, and no blobs/text columns.
However, these columns will need to be 'searched' sometimes.

2) Backing up / Restoring: What should I look out for in terms of
dumping/restoring the table? Anything other than the traditional
mysqldump?

3) If I do go with an InnoDB table... I imagine I will probably have to
change that 'ibdata' setting, or at least get familiar/involved with it,
since this table will be so large. I've seen the term 'multiple
tablespaces' in reference to InnoDB tables. Is this something I should
be investigating for a table of this size? What exactly does it mean?

4) Do I need to think about replication of any sort? Or slave/master
things? Or can I just use the database as is? There would never be more
than 50 or so people using the database at once (if that matters).

The system is a P4 (2.8), running Linux RedHat, MySQL 4.1.7 (though I'm
pretty sure I will have it upgraded to MySQL 5 before putting this into
production or even building it - or do I even need to), 1 GB of RAM....
basically our own box being managed/hosted somewhere across the
continent.

I know this is a lot of advice to ask. I'm self-taught, and have been
re-reading some MySQL books in preperation for this project, which will
be my 'largest' to date, and will be quite complex overall (this is a
small but still critical part of the overall project). The books teach
well, but they don't offer to much advice to specific situations ;)

Thanks.

Nov 11 '05 #1
1 6198
Hi,

very interesting question ;-).

What the really best solution is depends strongly on how frequently you
query the differents parts of the data. Is there something like 'historical'
data (you need to query less often and only with SELECTs) and current data
(which you need all the time)? Then it might be an option to set up an
archive table. Assuming that you're using MySQL 5, you can create views to
query historical and current data together, if necessary (although I would
rather query a single table if possible than the view, even if it might seem
more convinient to always use the view).

For the "current" table I would use InnoDB, because locking is done at the
row level, so you'll probably get a better performance when doing SELECTs
and INSERTs simulaniously. For the historical tables, if you do not
necessarily need foreign key constraints for them (otherwise you would be
bound to InnoDB), I would rather choose MyISAM. With MyISAM, you can also
create compressed read only tables (if you're sure that there will be no
further INSERTs) that consume less disk space and can be queried faster.
With MyISAM, you could e.g. create one archive table for every month and
merge them together - and use this merged table in the view to get a
complete data set. This way, the data is split up into smaller pieces, but
you can still query them all together, if needed, and take advantage from
techniques that are preferable for dynamic data and for static data at the
same time.

This approach would need some maintainance work. You would need to create a
new archive table every new month, move the old data to this new archive
table and re-create the merge table to include the new table. A Stored
Procedure could help you to automate this process. But you would also gain
advantage in doing backups, because you would only have to backup the
archive tables once (they won't change anymore) and you can create smaller
and more frequent backups on the current data.

Replication should be an option, if the performance is still not suffient
(although I think that it will work for 50 users without replication). If
you use replication, you'd do best to use the master server to do the
INSERTs and the slave server to do the SELECTs. In that case, you can set
different indexes for the master table and for the slave table. As indexes
can slow down INSERTs, you could set the indexes only on the slave server's
table (except any unique indexes that you need to ensure data integrity, of
course) to speed up the SELECT statements, but not to slow down the INSERTs.
However, if the server's at a *very* high load (which is very unlikely for
50 users), the slave server could be a bit behind the master server. But in
most cases, the replication slave is almost as up to date as the master
server (maybe 1/100 of a second behind or so ;-)).

Well, so I guess, I gave you something to think about and maybe some other
newsgroup users a basis for further discussion ;-).

Markus
Nov 11 '05 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
23626
by: Robert | last post by:
I am having performance issues on a SQL query in Access. My query is accessing and joining several tables (one very large one). The tables are linked ODBC. The client submits the query to the server, separated by several states. It appears the query is retrieving gigs of data from the table and processing the joins on the client. Is there away to perform more of the work on the server there by minimizing the amount of extraneous table...
6
1827
by: BT | last post by:
I'm trying to format a three column table and I can't get things to line up the way I need to. The first column is text of variable length that might wrap onto multiple lines. It is within <p></p> tags to right/left justify it. The second column is a column of ditto marks, centered left/right in the cell and aligned with the bottom line of text of column 1. The third column is a column of numbers also aligned with the bottom line of...
1
3283
by: Ian | last post by:
I want to open a form at a particular record, but I think I'm running into problems because the recordsource query is executing asynchronously. In the form's open event I use recordsetclone.findfirst to get a bookmark and then set the form's bookmark to this value. normally that works fine, but the form I am using has a very large number of records and isn't being set to the correct record, even though it is found successfully in the...
11
2165
by: Brian Henry | last post by:
Well here is the problem, I have a data set with about 9,000 to 20,000 people in it in the data table "people"... I am then reading it into a list view one at a time row by row... adding each person to the list view, well this works fine up to about 1,000 people then it start's to get really show putting the people in (a lot of process time used up)... does anyone know of any optimizations i could do to make this processess work a lot...
2
6642
by: Paul Morrison | last post by:
Hi, I am new to Javascript and am having a bit of difficulty. On my site, in order for a member to unsubscribe from an article, they go to the'Unsubscribe' page where they get a table of all of the articles that they are subscribed to. There is a table with each article, and a checkbox next to each article. For each checked checkbox it should send the value of the checkbox to stop_subscription.php. I have used the following...
10
2697
by: sasquatch | last post by:
X-No-Archive: Are there any good books that provide tips about how to understand and "play with" large C++ programs? It seems like a very important skill, and yet its hardly taught in the universities. Basic programming skills and knowledge of language constructs don't seem enough for understanding larger programs.
4
3189
by: Emin | last post by:
Dear Experts, What is the best way to do a large insert WITHOUT having direct access to the machine SQL Server is running on? For example, imagine I want to insert something like 20,000 records. If I were to have access to the server, I could BULK INSERT into a temp table and then insert into the destination table. But if I can't create a file on the server to use for BULK INSERT, what is the next best alternative to doing lots of 1...
25
20564
by: tekctrl | last post by:
Anyone: I have a simple MSAccess DB which was created from an old ASCII flatfile. It works fine except for something that just started happening. I'll enter info in a record, save the record, and try to move to another record and get an Access error "Record is too large". The record is only half filled, with many empty fields. If I remove the added data or delete some older data, then it saves ok and works fine again. Whenever I'm...
6
1636
mideastgirl
by: mideastgirl | last post by:
YIKES! I am having problems going from mideasthonors.org/addmember.php to either the error page because the form has not been filled out accurately, or to the thank you page!!!! Here is my script, and it just keeps saying that the page cannot be found because it is coming up as www.mideasthonors.org/mideasthonors.org/addmember_db.php! What the heck is going on! <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"...
0
9568
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9399
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
1
9957
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8832
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7379
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6649
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5276
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5423
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
3532
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.