STD of multiple columns

Dear Mysql-ians,

I want to calculate the standard deviation of data that are in multiple
columns. I know how to calculate the STD of 1 column (e.g. X1 of
table_X) using:

SELECT STD(X1) FROM table_X;

but I want to calculate now the STD of the union of data of columns
(e.g. X1, X2, ..., X100 of table_X).

Does anyone have any suggestions on how to do that? I hoped something like
SELECT STD(X1,X2,...,X100) FROM table_X existed, but apparently it does
not.

Thanks for your suggestions in advance!

Kind regards,
Stef

Apr 14 '06 #1
Stef,

I don't *think* it's possible from your current design.

Perhaps a redesign is in order. Having 100 columns containing basically the
same information is not a good design.

For instance, in the case of student test scores - you could do something like:

(table) studentid name test1score test2score test3score test4score

(Of course there would be more info)

A better design would be:

(table 1) studentid name

(table 2) studentid testid score

Such a design is more versatile - and cures your problem along the way.
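
To illustrate why the two-table design "cures" the problem, here is a minimal sketch of a per-student standard deviation over the normalized table. It is written in Python against an in-memory SQLite database purely so it is self-contained; the table name `scores`, the sample values, and the `PopulationStd` helper are assumptions for illustration. SQLite has no `STD()`, so the helper stands in for MySQL's `STD()`, which returns the population standard deviation. In MySQL itself only the final `SELECT` would be needed.

```python
import sqlite3
import statistics


class PopulationStd:
    """Hypothetical stand-in for MySQL's STD() (population standard deviation)."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Like SQL aggregates in general, skip NULLs.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        return statistics.pstdev(self.values) if self.values else None


conn = sqlite3.connect(":memory:")
conn.create_aggregate("STD", 1, PopulationStd)

# Normalized layout: one row per (student, test) pair. Sample data is made up.
conn.execute("CREATE TABLE scores (studentid INTEGER, testid INTEGER, score REAL)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [(1, 1, 2.0), (1, 2, 4.0), (1, 3, 6.0),
     (2, 1, 5.0), (2, 2, 5.0), (2, 3, 5.0)],
)

# One ordinary single-column aggregate, grouped per student.
per_student = dict(
    conn.execute(
        "SELECT studentid, STD(score) FROM scores GROUP BY studentid"
    ).fetchall()
)
print(per_student)
```

With one score per row, the question "STD over many columns" becomes a plain `GROUP BY`, regardless of how many tests there are.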
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Apr 14 '06 #2
Thanks for your help!

If I look at my design, it looks like:
(table) id, obs_time1, obs_time2, ..., obs_time100
where:
obs_timeX = observation at "time X"

I have over 2 million records (with different unique id's) for this
table, and I want to create the STD of all observations of 1 id through
time.

If I understand your design correctly, you suggest restructuring it as:
(table_1) studentid + other info
(table_2) studentid obs_time obs_value
where:
obs_time = "time X" of obs_timeX
obs_value = value of obs_timeX with corresponding "time X"

If I am correct, I will get a very long table_2, since I create 2
million (ids) * 100 records (one for every obs_time). Don't I create much
more redundant information that way (right now I have only 1 row of
obs_timeX values per id, and the ids are unique)?

I hope I made myself clear.

Thanks again for your help!

Regards
Stef

Apr 14 '06 #3
Stef,

Yep, that's exactly what I'm suggesting. Do some reading up on "Database
Normalization" - it can help you understand why this is potentially a better
solution.

And yes, the new table will be quite long. But your existing table is quite
wide! 200M rows (the max you could have) isn't that different from what you
have now: 2M rows with over 100 columns in each row.

Also, after you normalize your tables, you may end up storing more data if the
majority of the fields are filled. But you may also store less, if only a small
number are filled. And normalizing your tables makes things more flexible.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Apr 14 '06 #4
Thanks for your suggestion! I will try to reorganize my data.

I was thinking of making a query to reorganize my data.
E.g.:
(SELECT id, "name(obs_time1)" AS obs_time, obs_time1 AS obs_value FROM
table_1)
UNION
(SELECT id, "name(obs_time2)" AS obs_time, obs_time2 AS obs_value FROM
table_1)
UNION
.......
UNION
(SELECT id, "name(obs_time100)" AS obs_time, obs_time100 AS obs_value
FROM table_1)
ORDER BY id

with:
"name(obs_timeX)"= the name of my columns I now use to extract
obs_timeX

Hopefully that will work!
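
One detail worth noting about the query above: plain UNION removes duplicate rows, which costs extra work and is not needed here because every branch carries a different obs_time label. UNION ALL keeps all rows without that step. The following is a minimal sketch of the same reshaping with three columns instead of 100, run against an in-memory SQLite database from Python so it is self-contained. The sample data and the use of the column number as the obs_time label are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_1 "
    "(id INTEGER PRIMARY KEY, obs_time1 REAL, obs_time2 REAL, obs_time3 REAL)"
)
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?, ?)",
    [(1, 2.0, 4.0, 6.0), (2, 5.0, 5.0, 5.0)],
)

n_columns = 3  # 100 in the real table

# Build one SELECT per wide column; the column number serves as the time label.
branches = [
    f"SELECT id, {i} AS obs_time, obs_time{i} AS obs_value FROM table_1"
    for i in range(1, n_columns + 1)
]

# UNION ALL skips the duplicate-elimination step that plain UNION performs.
query = " UNION ALL ".join(branches) + " ORDER BY id, obs_time"
long_rows = conn.execute(query).fetchall()
print(long_rows)
```

Generating the branches in a loop also avoids typing 100 nearly identical SELECT statements by hand.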

Regards,
Stef

Apr 14 '06 #5

Stef,

do you really want to run a union of 100 selects on a table with 2
million records?!

Actually, I don't see where the problem is. Why don't you just apply the
function STD to each single column: SELECT STD(v1), STD(v2), ...? Why
do you feel you need a multivariate function?
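
For context, STD(v1) aggregates one column across all ids, whereas the original question asks for the spread of one id's values across the columns, which is a different number. If the wide table is kept, a per-id population standard deviation can be built from ordinary arithmetic: the mean of the squares minus the square of the mean. Below is a minimal sketch of that idea with three columns, using Python and an in-memory SQLite database so it runs on its own. The sample data is made up, and it assumes no NULL values. The square root is taken in Python here only because SQLite may lack SQRT(); in MySQL the whole expression could stay in SQL.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_X (id INTEGER PRIMARY KEY, v1 REAL, v2 REAL, v3 REAL)"
)
conn.executemany(
    "INSERT INTO table_X VALUES (?, ?, ?, ?)",
    [(1, 2.0, 4.0, 6.0), (2, 5.0, 5.0, 5.0)],
)

# Per row: mean of the squared values and mean of the values.
rows = conn.execute(
    "SELECT id, "
    "(v1*v1 + v2*v2 + v3*v3) / 3.0 AS mean_sq, "
    "(v1 + v2 + v3) / 3.0 AS mean "
    "FROM table_X ORDER BY id"
).fetchall()

# Population variance = mean_sq - mean^2; clamp at 0 to absorb rounding error.
per_id_std = {
    row_id: math.sqrt(max(mean_sq - mean * mean, 0.0))
    for row_id, mean_sq, mean in rows
}
print(per_id_std)
```

This formula is less numerically robust than a built-in aggregate and becomes unwieldy with 100 columns and NULLs, which is part of the case for the normalized design.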

-tom

Apr 14 '06 #6
Stef,

Actually, I think I'd do it in PHP or some other language and let it loop, i.e.

(Assuming you're using a version which can insert from a select statement)

for ($i = 1; $i <= 100; $i++) {
    // Copy column obs_time$i of the wide table into the long table.
    // INSERT ... SELECT takes the SELECT directly, without a VALUES clause.
    $query = "INSERT INTO newtable (studentid, obs_time, obs_value) " .
             "SELECT id, $i, obs_time" . $i . " FROM oldtable";
    $result = mysql_query($query);
    if (!$result) {
        echo "MySQL Error: " . mysql_error();
        break;
    }
}

Also, if not all of the times have values and you don't need to insert the
empty ones, you can do this in two steps: select the value, and if it's NULL
(or blank), skip inserting it into the new table.
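
As an alternative to the select-then-check approach, the empty values can be filtered inside the INSERT ... SELECT itself with a WHERE clause. Here is a minimal sketch of that variant, written in Python with an in-memory SQLite database so it is self-contained. The three-column table, the sample data, and the use of `id` as the key column of `oldtable` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE oldtable "
    "(id INTEGER PRIMARY KEY, obs_time1 REAL, obs_time2 REAL, obs_time3 REAL)"
)
conn.execute(
    "CREATE TABLE newtable (studentid INTEGER, obs_time INTEGER, obs_value REAL)"
)
conn.executemany(
    "INSERT INTO oldtable VALUES (?, ?, ?, ?)",
    [(1, 2.0, None, 6.0), (2, None, None, 5.0)],
)

n_columns = 3  # 100 in the real table

for i in range(1, n_columns + 1):
    column = f"obs_time{i}"  # column names cannot be bound as parameters
    # Copy only the rows where this time slot actually holds a value.
    conn.execute(
        "INSERT INTO newtable (studentid, obs_time, obs_value) "
        f"SELECT id, ?, {column} FROM oldtable WHERE {column} IS NOT NULL",
        (i,),
    )
conn.commit()

rows = conn.execute(
    "SELECT studentid, obs_time, obs_value FROM newtable "
    "ORDER BY studentid, obs_time"
).fetchall()
print(rows)
```

Each pass stays a single statement, and the new table only grows by the number of observations that exist.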
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Apr 15 '06 #7
Thanx Jerry. I followed your advice and it worked wonderfully!

Apr 25 '06 #8
