I am struggling with how I should store this data.
I get an xlm update every day which contains about 20,000 rows of data.
There are some new rows, maybe 150 a week, and some of the existing rows data changes.
Approximately 97 % of all the data is the same each day with just 3% changing.
I am using Mysql database tables.
So I was thinking that the best way to store the data would be: Base_Data_Table - 20,000 rows (static)
Structure:
id, title, desc, data1, data2, ... data18 Current_Data_Ta ble - 20,000 rows ( replaced daily)
Structure:
id, title, desc, data1, data2, ... data18 Transaction_Dat a_Table - 60 rows per day ( grows daily )
Structure:
date, id, record, new_data
--- record is the record that is changed
--- data is the new data fro that record
Although the above is nice and efficient for data storage, it is going to be
a nightmare to extract the data.
Everyday I need to be able to display the current data
( no problem with the Current Data table) but I also want to be able to calculate:
For every row ( could be done dynamically when needed)
a 2-day average for today
a 2-day average for 7 days ago
a 2-day average for 30 days ago
and grab the last 120 days data points for two records ( data8 and data9)
so that I can chart them.
The easiest way to extract the data would be to just write a new table
for each day, but that would mean duplicating 97% of data - big waste of disc space.
I think that there must be a good compormise - but I am not sure what to do.
Does anyone have any suggestions on how to organize the data.
Thanks.
6 1931 dlite922 1,584
Recognized Expert Top Contributor
20,000 is nothing for MySQL to update, even daily. I don't understand why you have 3 tables and over complicating it.
Just have another TimeStamp field in your table that gets updated when the row is inserted/updated.
Data_Table
id, title, data1,...data18 ,last_modified_ date
Now, let's understand your reporting needs.
What do you mean by a "2-day average for today"? Do you mean, get the number of records that have changed in the last 1 day, 7 days, and 30 days?
The 120 day query is easy. Just query for last_modified_d ata < 120 days ago and Select data8, and data9.
Dan
Thanks for your reply.
I just looked at the size of my daily table.
A daily file takes up 2.4 Mb
That means a year is 876 Mb - isn't that quite a lot ?
I must admit that it would make the reporting and charting
much easier if I only have the one table.
It just seems odd that 95% of that data is redundant..
But maybe it doesn't matter ?
My server has over 200Gb and I've only used 17Gb. BUT - doesn't a big file take a lot longer to search through?
Or maybe its quicker to search one large table rather than do several searches on smaller tables ??
Any experience with this ?
Thanks
Frinavale 9,735
Recognized Expert Moderator Expert
I've moved this question to the MySql forum since it's more of a database question than a PhP question. I think you'll get more help with your questions from the experts there.
Happy coding,
-Frinny
Thanks,
Would really appreciate any input from experienced database
coders.
RedSon 5,000
Recognized Expert Expert
Bigger files do not necessarily take longer to search through for a database, no. A database creates indexes, indexes are designed to be small and easy to use to lookup stuff.
I would expect mysql to easily and quickly perform it's tasks on files that are in the 10GB range as well so don't worry about that.
You should try to limit redundant data. Are any of the suggestions dlite made valuable to you?
MMcCarthy 14,534
Recognized Expert Moderator MVP
OK my understanding of what you are asking is this. You have a "Flat" file of data (By this I mean you haven't organised your data into relational tables as per a standard database. This article will give you more information on data structures and normalisation of data.
Now I understand you want to keep a transaction log of all updates to the data in your table. Exactly what do you need to store in that log. - Do you need to keep a complete history of all changes or just the last change made to the data.
- Do you need to record all changes to a record (i.e. all the new data values) or just record the fact that a change was made and the date of that change.
Sign in to post your reply or Sign up for a free account.
Similar topics |
by: Rob Tweed |
last post by:
I have a customer who is having problems when their Windows
2000/IIS/PHP-based system begins to experience a level of loading that
isn't, in my view, unreasonably high. I'm wondering what others
think, specifically:
- are these volumes, for this kind of configuration, at or beyond the
levels at which PHP should be able to cope? I hope not !
- what might be done to alleviate these problems? One of our
suspicions is that Windows...
|
by: Andreas Lauffer |
last post by:
I changed from Access97 to AccessXP and I have immense performance
problems.
Details:
- Access XP MDB with Jet 4.0 ( no ADP-Project )
- Linked Tables to SQL-Server 2000 over ODBC
I used the SQL Profile to watch the T-SQL-Command which Access ( who
creates the commands?) creates and noticed:
|
by: Randall Sell |
last post by:
Hello all,
I am migrating a Paradox application to SQL Server. My problem is that the
existing Paradox table structure is limited. Correcting it will mean a
re-write of the application (Delphi app). Although the record and column
limits are higher in SQL Server, it will eventually hit the wall. So I was
wondering if anyone could offer advise on how we might do this best....
The application is essentially an evaluation tool. It is very...
|
by: William Roberts |
last post by:
I am desiging an invoicing program that will do recurring billing as
well as charging for any additional service at a given time. All
charges will be service oriented. I am particularly concerned with
being able to seperate uninvoiced charges and invoiced charges and
would appreciate advice on do's or dont's, or any Ideas for a better
way to do it..
I have the usual tables ,,
,...
|
by: deko |
last post by:
DoCmd.CopyObject copies data, but I only need structure.
I'm trying to clone several tables in my Access 2003 mdb. The goal is to
link to a series of Excel spreadsheets and then run various action queries
against the data in the linked tables (to massage and import it). Since I
can't run action queries against linked Excel tables, I need to create temp
tables that are clones of the required existing tables in the database then
append...
| |
by: Alfonso Morra |
last post by:
Hi,
I am writing a messaging library which will allow me to send a generic
message structure with custom "payloads".
In many cases, a message must store a non-linear data structure (i.e.
"payload") using pointers. Examples of these are binary trees, hash
tables etc. Thus, the message itself contains only a pointer to the
actual data. When the message is sent to the same processor, these
pointers point to the original locations, which...
|
by: Jimakos Bilakis |
last post by:
Hi guys!
I'm using the C++ Builder 6 Enterprise Edition where I create some
tables in Paradox and with the help of a structure i pass my data from
the form (Edit boxes) to the Paradox table with a speedbutton. (I am
also using several data modules where i keep all my tables and
queries).
First of all i need some "mechanism" to read the data from the tables
and also to be able to change these already stored values. I tried with
|
by: Christoph Haas |
last post by:
Hi, list...
I have written an application in Perl some time ago (I was young and needed
the money) that parses multiple large text files containing nested data
structures and allows the user to run quick queries on the data.
(For the firewall admins among you: it's a parser and web-based query tool
for CheckPoint firewall rulebases. The user can search for source and
destination IPs and get the matching rules.)
The current application...
|
by: SteveT |
last post by:
Can someone point me in the right direction? Somewhere I read that you
reference a strongly typed dataset as if it were a class structure. For
example,
<SomeTests>
<TestsGroups>
<Group>
<TestName>TestA</TestNmae>
<Test>Run1</Test>
<Test>Run2</Test>
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look !
Part I. Meaning of...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it.
First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
| |
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth.
The Art of Business Website Design
Your website is...
|
by: Hystou |
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own....
Now, this would greatly impact the work of software developers. The idea...
|
by: conductexam |
last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one.
At the time of converting from word file to html my equations which are in the word document file was convert into image.
Globals.ThisAddIn.Application.ActiveDocument.Select();...
|
by: TSSRALBI |
last post by:
Hello
I'm a network technician in training and I need your help.
I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs.
The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols.
I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
|
by: 6302768590 |
last post by:
Hai team
i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
| |
by: muto222 |
last post by:
How can i add a mobile payment intergratation into php mysql website.
| |