Bytes IT Community

Finding optimal values using vba/queries

Hi,

I am currently trying to automate a process whose manual results I have
been studying for a while. The process identifies events that meet
certain criteria, and at the moment it then requires human interaction
to decide what to do next (whether or not a separate process is run). I
have set the process up to make this decision automatically based on a
couple of calculated variables, but I am having a few problems
optimising over the data to find those variables.

In order to increase the 'correctness' of the system I need to run an
optimisation weekly to take into account any new data collected. The
collected data takes the following form:
 x   Value   Result
 6    37       A
 6    19       A
 7    25       A
 8     9       N
 9    15       A
11    18       A

What I need to do is find a formula that calculates a threshold value
for each x, above which the secondary process is run (A) and below
which it isn't (N). This isn't as simple as fitting a best-fit line:
in this example the best-fit power curve is y = 134.63x^-0.9706, which
would classify only 50% of the rows correctly, whereas y = x^1.125
classifies 100% of them correctly. The key point is that only the side
of the line matters: a point 0.1 above the line is actioned just the
same as one 50 above it, and likewise a point is not actioned no
matter how far below the line it falls.
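The decision rule described above can be sketched as follows (Python here
for brevity, though the actual setup uses VBA and Access queries; the
m = 1, c = 1.125 values come from the y = x^1.125 curve mentioned above):

```python
def predict(x, value, m, c):
    """Classify a row: 'A' (run the secondary process) when the value lies
    on or above the curve y = m * x ** c, otherwise 'N'. Only which side
    of the curve the point falls on matters, not its distance from it."""
    return "A" if value >= m * x ** c else "N"

# Sample data from the post: (x, value, result)
rows = [(6, 37, "A"), (6, 19, "A"), (7, 25, "A"),
        (8, 9, "N"), (9, 15, "A"), (11, 18, "A")]

# With m = 1, c = 1.125 every sample row is predicted correctly.
print(all(predict(x, v, 1, 1.125) == r for x, v, r in rows))  # True
```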

At the moment I run a series of queries that take each row, combine it
with a selection of m and c values (y = mx^c), and decide whether that
row would be correctly predicted for that pair of values. When it has
done this for every row (about 100x100 combinations) it looks for the
combination that is most accurate. This is obviously time consuming
and, as there are upwards of 90 datasets, it creates loads of 'ghost'
tables that bloat the database and can only be removed by compacting.
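One way to avoid the ghost tables is to pull the rows into memory once and
run the whole grid search in code rather than in queries. A minimal sketch
(Python for illustration; the same loop structure works in VBA over a
recordset, and the grid ranges here are made-up stand-ins for the real ones):

```python
import itertools

# Sample rows from the post: (x, value, result)
rows = [(6, 37, "A"), (6, 19, "A"), (7, 25, "A"),
        (8, 9, "N"), (9, 15, "A"), (11, 18, "A")]

def best_power_curve(rows, m_values, c_values):
    """Brute-force grid search over (m, c) for the curve y = m * x ** c,
    scoring each pair by how many rows it classifies correctly (action 'A'
    when the value is on or above the curve). Returns (m, c, score)."""
    def score(m, c):
        return sum((v >= m * x ** c) == (r == "A") for x, v, r in rows)
    m, c = max(itertools.product(m_values, c_values),
               key=lambda mc: score(*mc))
    return m, c, score(m, c)

# 100 x 30 combinations, all evaluated in memory -- no intermediate tables.
m, c, hits = best_power_curve(rows,
                              [i / 10 for i in range(1, 101)],
                              [i / 10 for i in range(1, 31)])
```

For the sample data this finds a pair that classifies all six rows
correctly; only the winning (m, c) ever needs to be written back to the
database.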

What I am hoping is that someone has done something similar before or
has an idea how I could speed up the process without losing accuracy
or resolution (indeed I hope to increase both). I have tried exporting
the data to Excel to use Solver, but it can't deal with the
non-linearity, and I can't afford Premium Solver (although, having
used the trial version, it does work). I have also tried looping in
VBA code, but I can't help creating temp tables.

If anything written isn't clear or you need more data please ask.

Mike

ps. I'm using Office 2003 on Windows XP Professional

Feb 7 '07 #1


On Feb 7, 6:30 am, "Mike Heywood" <michael.heyw...@easyjet.com> wrote:
[quoted text snipped]
I might be able to help. Let me give my perspective about it so that
you can help me understand the problem. I had a little trouble
following you in a few spots.

Equations of the form Y = m X ^ c were called power curves in my
engineering classes. Finding m and c is a problem that comes up in
engineering quite often because many empirical relationships have this
form. If the data forms a straight line on a log-log plot, a
least-squares best-fit line can be applied after transforming:

Y = m X ^ c

log Y = log m + c * log X (log can be any convenient base; the X and Y
values must be positive)

Thus the logarithms of the data points (Xi, Yi) can be used with the
basic least-squares equations to find m and c, using log Y in place of
Y and log X in place of X; the fitted intercept is log m, so
exponentiate it to recover m. Note that a correlation coefficient
should be used to determine how well the curve fits. Then for a given
value you can use:

Sgn(Ys - m Xs ^ c)

to determine whether a newly sampled point (Xs, Ys) is above (1), on
(0) or below the best fit curve (-1).
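In code, the log-log fit and the sign test look roughly like this (a
sketch in Python rather than the poster's VBA; translating it is
mechanical, and the example points are invented for illustration):

```python
import math

def fit_power_curve(points):
    """Least-squares fit of Y = m * X ** c via the log-log transform
    log Y = log m + c * log X, which is linear in (log X, log Y).
    All X and Y values must be positive."""
    lx = [math.log(x) for x, _ in points]
    ly = [math.log(y) for _, y in points]
    n = len(points)
    sx, sy = sum(lx), sum(ly)
    sxx = sum(v * v for v in lx)
    sxy = sum(a * b for a, b in zip(lx, ly))
    c = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope of the fit
    m = math.exp((sy - c * sx) / n)                 # intercept is log m
    return m, c

def side(point, m, c):
    """The Sgn(Ys - m * Xs ^ c) test: 1 if the sampled point (Xs, Ys)
    is above the fitted curve, 0 if on it, -1 if below."""
    xs, ys = point
    diff = ys - m * xs ** c
    return (diff > 0) - (diff < 0)

# Points lying exactly on y = 2x are recovered as m = 2, c = 1.
m, c = fit_power_curve([(1, 2), (2, 4), (4, 8)])
```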

I show the basic equations here:

http://groups.google.com/group/comp....f7af5f9a498f39

James A. Fortune
CD********@FortuneJames.com

Feb 7 '07 #2
