Dealing with large amounts of data

We are looking to store a large amount of user data that will be
changed and accessed daily by a large number of people. We expect
around 6-8 million subscribers to our service with each record being
approximately 2000-2500 bytes. The system needs to be running 24/7
and therefore cannot be shut down. What is the best way to implement
this? We were thinking of setting up a cluster of servers to hold the
information and another cluster to back up the information. Is this
practical?
Also, what software is available that can distribute query
calls across different servers and manage large numbers of query
requests?

Thank you in advance.

Ben
Jul 20 '05 #1
nib
Digety wrote:
> We are looking to store a large amount of user data that will be
> changed and accessed daily by a large number of people. We expect
> around 6-8 million subscribers to our service with each record being
> approximately 2000-2500 bytes. The system needs to be running 24/7
> and therefore cannot be shut down. What is the best way to implement
> this? We were thinking of setting up a cluster of servers to hold the
> information and another cluster to back up the information. Is this
> practical?
> Also, what software is available out there that can distribute query
> calls across different servers and to manage large amounts of query
> requests?
>
> Thank you in advance.
>
> Ben


You need to consult with a professional firm that has experience with this
kind of thing. I wouldn't trust us yahoos! :D

Zach
Jul 20 '05 #2
This question is really too big for a newsgroup. The obvious point is
that if you want to build and run a 24/7 server farm for 8 million
subscribers then you'll need a team of architecture, development and
operations staff with many years of experience in high-volume,
high-availability environments. So why not ask them this question, since
they'll be in a better position to understand your requirements?

SQL Server's clustering support is Failover Clustering, which addresses
high availability rather than load balancing, so it may be part of your
availability solution. To spread query load across servers, SQL Server
offers distributed queries over partitioned views.
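
To illustrate the idea behind a partitioned layout, here is a minimal sketch in Python. It is not SQL Server's partitioned view syntax, only the routing concept: each server holds one contiguous range of subscriber IDs, and a lookup goes to the server that owns the ID. The server names and ID ranges are made up for the example.

```python
from bisect import bisect_right

# Hypothetical layout: three member servers, each owning one contiguous
# range of subscriber IDs. Upper bounds are exclusive.
RANGE_UPPER_BOUNDS = [3_000_000, 6_000_000, 9_000_000]
SERVERS = ["server_a", "server_b", "server_c"]  # made-up names


def server_for(subscriber_id: int) -> str:
    """Return the name of the server whose range contains subscriber_id."""
    if not 0 <= subscriber_id < RANGE_UPPER_BOUNDS[-1]:
        raise ValueError(f"subscriber_id {subscriber_id} is outside all ranges")
    # bisect_right counts how many upper bounds are <= the ID, which is
    # exactly the index of the range that owns it.
    return SERVERS[bisect_right(RANGE_UPPER_BOUNDS, subscriber_id)]


if __name__ == "__main__":
    for sid in (0, 2_999_999, 3_000_000, 8_999_999):
        print(sid, "->", server_for(sid))
```

In a real partitioned view the database does this routing itself, based on constraints declared on each member table; the sketch only shows why a query on a single subscriber ID needs to touch just one server.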

You may find some useful information on Microsoft's scalability site:
http://www.microsoft.com/sql/techinf...calability.asp

--
David Portas
SQL Server MVP
--
Jul 20 '05 #3

"Digety" <be******@hotmail.com> wrote in message
news:82**************************@posting.google.com...
> We are looking to store a large amount of user data that will be
> changed and accessed daily by a large number of people. We expect
> around 6-8 million subscribers to our service with each record being
> approximately 2000-2500 bytes. The system needs to be running 24/7
> and therefore cannot be shut down. What is the best way to implement
> this? We were thinking of setting up a cluster of servers to hold the
> information and another cluster to back up the information. Is this
> practical?

Practical? What's your budget? What are your response time requirements?

Clustering ain't cheap.

6-8 million rows isn't all that much btw. What's more important is how much
it changes.
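
For a rough back-of-envelope check using the figures from the original post, here is a small Python sketch. It assumes raw record bytes only; indexes, page overhead, logs and backups are ignored, so treat the result as a lower bound rather than a real capacity plan.

```python
# Back-of-envelope sizing from the figures in the original post.
# Assumption: raw record bytes only; indexes, page overhead, transaction
# logs and backups are not counted.
GIB = 1024 ** 3


def raw_size_gib(subscribers: int, bytes_per_record: int) -> float:
    """Raw data size in GiB for a given row count and record size."""
    return subscribers * bytes_per_record / GIB


low = raw_size_gib(6_000_000, 2000)   # smallest case in the post
high = raw_size_gib(8_000_000, 2500)  # largest case in the post

if __name__ == "__main__":
    print(f"raw data: {low:.1f} GiB to {high:.1f} GiB")
```

That works out to somewhere between roughly 11 and 19 GiB of raw data, which is why the row count by itself is not the worrying part.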

> Also, what software is available out there that can distribute query
> calls across different servers and to manage large amounts of query
> requests?

If you're just doing queries, my guess is you won't need this.

To give you an example, I've got a quad-CPU Xeon box (700 MHz) that runs at
about 50% CPU these days (some new code just helped). It inserts, I think, 17
million rows a day (which then get moved to another server overnight).
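
To put the 17 million rows a day in perspective, a quick Python sketch of the average rate follows. It assumes the inserts are spread evenly over 24 hours, which real workloads rarely are, so peak rates would be higher.

```python
# Average insert rate implied by "17 million rows a day".
# Assumption: load is spread evenly across the day; peaks will be higher.
SECONDS_PER_DAY = 24 * 60 * 60


def average_rows_per_second(rows_per_day: int) -> float:
    """Average rows per second for a given daily row count."""
    return rows_per_day / SECONDS_PER_DAY


rate = average_rows_per_second(17_000_000)

if __name__ == "__main__":
    print(f"about {rate:.0f} rows per second on average")
```

So the box is averaging a bit under 200 inserts per second.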


> Thank you in advance.
>
> Ben

Jul 20 '05 #4
Obviously you can tell that I don't know much about this subject, so
I'm sorry for my ignorance. What types of firms out there can handle
something like this? If you could give me some examples, that would
be great. Thank you so much.

[Ben]
Jul 20 '05 #5
Have you already chosen SQL Server as your database platform? If so then
contact Microsoft Sales, explain your requirements and ask them to suggest a
vendor in your area. I suggest you also hire someone with experience in
database systems implementation to liaise with the vendor.

If you don't have any given constraints as to what hardware and software
platform to use then you may want to take some advice from an independent IT
consultant to help you decide on the right technology before you talk to
vendors.

--
David Portas
SQL Server MVP
--
Jul 20 '05 #6
David Portas scratched out in the sand
> Have you already chosen SQL Server as your database platform? If so then
> contact Microsoft Sales, explain your requirements and ask them to suggest
> a vendor in your area. I suggest you also hire someone with experience in
> database systems implementation to liaise with the vendor.
>
> If you don't have any given constraints as to what hardware and software
> platform to use then you may want to take some advice from an independent
> IT consultant to help you decide on the right technology before you talk
> to vendors.


As someone else mentioned, your volume really isn't that bad. SQL Server (or
Oracle, MySQL, PostgreSQL) could handle it on one server very easily. I've
done this many times over on MSSQL and MySQL. Both handle the volume on one
server without a hitch.

If you're worried about 24/7 operations, however, I'd shy away from
MS-based solutions, as they get pricy quickly and are difficult to
maintain. Though Windows is a decent departmental-level solution, I've seen
too many cases where $$$ was thrown at it to make it 24/7 and it still
failed.

I'd lean more towards a Unix-based system, simply because they're better
designed for datacenter operations.
--
kai - kai at 3gproductions dot com
www.gamephreakz.com || www.filesite.org
"friends don't let friends use windows xp"
Jul 20 '05 #7

"filesiteguy" <ab***@127.0.0.1> wrote in message
news:10*************@corp.supernews.com...

> As someone else mentioned, your volume really isn't that bad. SQL Server (or
> Oracle, MySQL, PostgreSQL) could handle it on one server very easily. I've
> done this many times over on MSSQL and MySQL. Both handle the volume on one
> server without a hitch.

I'd agree with this.

> If you're worried about 24/7 operations, however, I'd shy away from
> MS-based solutions, as they get pricy quickly and are difficult to
> maintain. Though Windows is a decent departmental level solution, I've seen
> too many cases where $$$ was thrown at it to make it 24/7 and still had it
> fail.

But not this. Our main production SQL Server has had 100% uptime over the
past few years (except for a planned move and a few planned maintenance
windows).

A lot of 24/7 really goes into planning, Unix, Windows or otherwise.

> I'd lean more towards a Unix-based system, simply because they're better
> designed for datacenter operations.
> --
> kai - kai at 3gproductions dot com
> www.gamephreakz.com || www.filesite.org
> "friends don't let friends use windows xp"

Jul 20 '05 #8
Greg D. Moore (Strider) scratched out in the sand

>> If you're worried about 24/7 operations, however, I'd shy away from
>> MS-based solutions, as they get pricy quickly and are difficult to
>> maintain. Though Windows is a decent departmental level solution, I've seen
>> too many cases where $$$ was thrown at it to make it 24/7 and still had it
>> fail.
>
> But not this. Our main production SQL Server had over the course of the
> past few years a 100% uptime (except for a planned move and a few planned
> maintenances).
>
> A lot of 24/7 really goes into planning, Unix, Windows or otherwise.


If you were down due to maintenance, then it isn't 100% uptime. Yes,
planning is VERY important. I have just seen that Windows is better
planned as a small departmental solution and doesn't fit as an enterprise
system.

Eventually you'll learn. You'll also learn not to use Outlook Express,
someday.

--
kai - kai at 3gproductions dot com
www.gamephreakz.com || www.filesite.org
"friends don't let friends use windows xp"
Jul 20 '05 #9
> I just have seen that Windows is better being
> planned as a small departmental solution and doesn't fit as an enterprise
> system.


That tells us more about your experience than about Windows. Your
generalization is belied by the reality of thousands of major enterprises
whose experience apparently differs from yours.

--
David Portas
SQL Server MVP
--
Jul 20 '05 #10

"filesiteguy" <ab***@127.0.0.1> wrote in message
news:10*************@corp.supernews.com...
> Greg D. Moore (Strider) scratched out in the sand
>
>>> If you're worried about 24/7 operations, however, I'd shy away from
>>> MS-based solutions, as they get pricy quickly and are difficult to
>>> maintain. Though Windows is a decent departmental level solution, I've seen
>>> too many cases where $$$ was thrown at it to make it 24/7 and still had it
>>> fail.
>>
>> But not this. Our main production SQL Server had over the course of the
>> past few years a 100% uptime (except for a planned move and a few planned
>> maintenances).
>>
>> A lot of 24/7 really goes into planning, Unix, Windows or otherwise.
>
> If you were down due to maintenance, then it isn't 100% uptime.


Depends on your definition. And your budget (i.e. whether you have the
budget for the hardware and software solutions that allow zero-downtime
maintenance, same as in a Unix shop).
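
To show how much the definition matters, here is a small Python sketch that computes availability two ways: counting planned maintenance as downtime, and excluding it from the scheduled service hours. The 8 hours of planned maintenance per year is a made-up figure for illustration, not a number from this thread.

```python
# Two common ways to compute availability for the same year of operation.
# Assumption: the downtime figures below are hypothetical examples.
HOURS_PER_YEAR = 365 * 24


def availability(scheduled_hours: float, downtime_hours: float) -> float:
    """Availability as a percentage of scheduled service hours."""
    return 100.0 * (scheduled_hours - downtime_hours) / scheduled_hours


planned_hours = 8.0    # hypothetical planned maintenance per year
unplanned_hours = 0.0  # hypothetical: no unplanned outages

# Strict view: any downtime, planned or not, counts against uptime.
strict = availability(HOURS_PER_YEAR, planned_hours + unplanned_hours)

# Contract-style view: planned windows are not scheduled service time.
excluding_planned = availability(HOURS_PER_YEAR - planned_hours, unplanned_hours)

if __name__ == "__main__":
    print(f"strict: {strict:.3f}%  excluding planned: {excluding_planned:.3f}%")
```

Under the strict definition the same system comes in just below 100%, while excluding planned windows gives exactly 100%, so both posters can be right depending on which definition is in use.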

And let's put it this way, the Windows solution has had far better uptime
than the Unix/Oracle solution used in a different division.

> Yes, planning is VERY important. I just have seen that Windows is better
> being planned as a small departmental solution and doesn't fit as an
> enterprise system.
>
> Eventually you'll learn. You'll also learn not to use Outlook Express,
> someday.

And perhaps someday you'll learn to not be so condescending.


> --
> kai - kai at 3gproductions dot com
> www.gamephreakz.com || www.filesite.org
> "friends don't let friends use windows xp"

Jul 20 '05 #11
