My app contains utility meter usage. One of the things we have to deal with
is when a usage is clearly incorrect. Perhaps someone wrote the meter
reading down incorrectly or made a factor of 10 error when entering the
reading, etc. At other times the usage is zero or somehow was entered as a
negative number.
So I'm thinking about adding functionality to search for such anomalies. For
instance, show months where the meter reading is 25% higher than the average
for the prior 12 months. Or show months for a particular meter where there
is a difference of 20% between adjacent monthly usage. Here's a data example
Meter 5678
Jan-06 100
Feb-06 105
Mar-06 75
Apr-06 90
May-06 101
Jun-06 900
Jul-06 89
So you can see from this data that 900 is clearly incorrect and probably
should be 90. The 75 usage in Mar-06 would show up on a search where there
is a difference between adjacent months of 25% or more. We'll probably also
code the functionality to search for zero usage and negative usage.
Bear in mind that we have several thousand meters and around a 100,000
monthly meter usages spanning several years.
I'm looking for an approach to implement this functionality. Searching row
by row through the tables would probably take a very long time. Is there a
clever way to handle this through SQL alone or mostly through SQL? Or does
anyone have any other suggestions? It would seem that this could be a very
slow process.
Thanks.
--
Message posted via AccessMonster.c om http://www.accessmonster.com/Uwe/For...ccess/200610/1 10 2099
On Wed, 04 Oct 2006 14:23:13 GMT, "rdemyan via AccessMonster.c om"
<u6836@uwewrote :
I would compare the readings against a scaled version of the common
trend. The trend would be an average over all meters, showing for
example that the usage in winter months is higher than in summer
months. The scaling is to account for a larger home putting up higher
numbers than a smaller one.
I would not worry about speed until it's proven to be an issue.
-Tom.
>My app contains utility meter usage. One of the things we have to deal with is when a usage is clearly incorrect. Perhaps someone wrote the meter reading down incorrectly or made a factor of 10 error when entering the reading, etc. At other times the usage is zero or somehow was entered as a negative number.
So I'm thinking about adding functionality to search for such anomalies. For instance, show months where the meter reading is 25% higher than the average for the prior 12 months. Or show months for a particular meter where there is a difference of 20% between adjacent monthly usage. Here's a data example
Meter 5678
Jan-06 100 Feb-06 105 Mar-06 75 Apr-06 90 May-06 101 Jun-06 900 Jul-06 89
So you can see from this data that 900 is clearly incorrect and probably should be 90. The 75 usage in Mar-06 would show up on a search where there is a difference between adjacent months of 25% or more. We'll probably also code the functionality to search for zero usage and negative usage.
Bear in mind that we have several thousand meters and around a 100,000 monthly meter usages spanning several years.
I'm looking for an approach to implement this functionality. Searching row by row through the tables would probably take a very long time. Is there a clever way to handle this through SQL alone or mostly through SQL? Or does anyone have any other suggestions? It would seem that this could be a very slow process.
Thanks.
"rdemyan via AccessMonster.c om" <u6836@uwewro te in
news:6743ad47a3 f7b@uwe:
My app contains utility meter usage. One of the things we have to
deal with is when a usage is clearly incorrect. Perhaps someone wrote
the meter reading down incorrectly or made a factor of 10 error when
entering the reading, etc. At other times the usage is zero or
somehow was entered as a negative number.
So I'm thinking about adding functionality to search for such
anomalies. For instance, show months where the meter reading is 25%
higher than the average for the prior 12 months. Or show months for
a particular meter where there is a difference of 20% between adjacent
monthly usage. Here's a data example
Meter 5678
Jan-06 100
Feb-06 105
Mar-06 75
Apr-06 90
May-06 101
Jun-06 900
Jul-06 89
So you can see from this data that 900 is clearly incorrect and
probably should be 90. The 75 usage in Mar-06 would show up on a
search where there is a difference between adjacent months of 25% or
more. We'll probably also code the functionality to search for zero
usage and negative usage.
Bear in mind that we have several thousand meters and around a 100,000
monthly meter usages spanning several years.
I'm looking for an approach to implement this functionality.
Searching row by row through the tables would probably take a very
long time. Is there a clever way to handle this through SQL alone or
mostly through SQL? Or does anyone have any other suggestions? It
would seem that this could be a very slow process.
Thanks.
OTTOMH
SELECT m.Reading, (m.Reading-sq.Average)/sq.StDev AS ZScore FROM Meter m
LEFT JOIN
[SELECT Avg(Meter.Readi ng) AS Average, StDev(Meter.Rea ding) AS StDev
FROM Meter]. sq
ON m.Reading*1000 <sq.Average
WHERE ((m.Reading-sq.Average)/sq.StDev)>=2
ORDER BY (m.Reading-sq.Average)/sq.StDev
You, of course, would have to modify this for your own situation. I have
suggested that a Score >= 2 would be suspect but your own experience
would be the best guide here.
No, I don't really expect that you will be able to use this, but hope
springs eternal.
--
Lyle Fairfield
Interesting, Lyle. I'll see what I can do with this and report back. You
show 2 but that can be easily changed by the user on the form (however, I'll
have to think about what that really means in terms us mere mortals can
understand).
One definate thing I will want to add is the ability to select a specific
time frame.
Lyle Fairfield wrote:
>My app contains utility meter usage. One of the things we have to deal with is when a usage is clearly incorrect. Perhaps someone wrote
[quoted text clipped - 34 lines]
>> Thanks.
OTTOMH
SELECT m.Reading, (m.Reading-sq.Average)/sq.StDev AS ZScore FROM Meter m LEFT JOIN [SELECT Avg(Meter.Readi ng) AS Average, StDev(Meter.Rea ding) AS StDev FROM Meter]. sq ON m.Reading*1000 <sq.Average WHERE ((m.Reading-sq.Average)/sq.StDev)>=2 ORDER BY (m.Reading-sq.Average)/sq.StDev
You, of course, would have to modify this for your own situation. I have suggested that a Score >= 2 would be suspect but your own experience would be the best guide here.
No, I don't really expect that you will be able to use this, but hope springs eternal.
--
Message posted via AccessMonster.c om http://www.accessmonster.com/Uwe/For...ccess/200610/1
Tom:
It's a good point about summer and winter months. This is also a function of
geographic area. In Seattle, electricity usage is fairly constant throughout
the year (no summer air conditioning). Water is also fairly constant (not
much irrigation needed in the Pacific Northwest). Heating, though will vary
substnatially.
In Hawaii, cooling occurs year round but will vary with the cooling degree
days. No heating degree days there so heating is not an issue.
In writing this, I realize that I may want to incorporate weather data in
determining what an "anomaly" is for those utilities that show variance due
to weather in the particular geographical area. My data tables contain all
the necessary weather data so this should be doable.
Tom van Stiphout wrote:
>I would compare the readings against a scaled version of the common trend. The trend would be an average over all meters, showing for example that the usage in winter months is higher than in summer months. The scaling is to account for a larger home putting up higher numbers than a smaller one.
I would not worry about speed until it's proven to be an issue.
-Tom.
>>My app contains utility meter usage. One of the things we have to deal with is when a usage is clearly incorrect. Perhaps someone wrote the meter
[quoted text clipped - 32 lines]
>> Thanks.
--
Message posted via AccessMonster.c om http://www.accessmonster.com/Uwe/For...ccess/200610/1
On Wed, 04 Oct 2006 15:43:31 GMT, Lyle Fairfield
<ly***********@ aim.comwrote:
I'll have to study this some more. The way I'm calculating z-scores
for a project is quite a bit more involved.
I think Abs(Score) >= 2 is worth another look.
-Tom.
<clip>
> OTTOMH
SELECT m.Reading, (m.Reading-sq.Average)/sq.StDev AS ZScore FROM Meter m LEFT JOIN [SELECT Avg(Meter.Readi ng) AS Average, StDev(Meter.Rea ding) AS StDev FROM Meter]. sq ON m.Reading*1000 <sq.Average WHERE ((m.Reading-sq.Average)/sq.StDev)>=2 ORDER BY (m.Reading-sq.Average)/sq.StDev
You, of course, would have to modify this for your own situation. I have suggested that a Score >= 2 would be suspect but your own experience would be the best guide here.
No, I don't really expect that you will be able to use this, but hope springs eternal.
Tom van Stiphout <no************ *@cox.netwrote in
news:vf******** *************** *********@4ax.c om:
I think Abs(Score) >= 2 is worth another look.
-Tom.
I think you are right. Abs() is a good idea.
--
Lyle Fairfield
Lyle:
I got the following to produce results, but I need to test it further.
SELECT m.USAGE, (m.USAGE-sq.Average)/sq.StDev AS ZScore
FROM [MONTHLY_METER_U SAGE] AS m LEFT JOIN (SELECT Avg(USAGE) AS Average,
StDev(USAGE) AS StDev
FROM [MONTHLY_METER_U SAGE]
WHERE METER_ID = '000001'
AND USAGE_END_DATE >= #03/01/2005# AND USAGE_END_DATE <= #02/28/2006#) AS sq
ON m.USAGE <sq.Average
WHERE ((m.USAGE-sq.Average)/sq.StDev)>=2
AND m.METER_ID = '000001'
AND USAGE_END_DATE >= #03/01/2005# AND USAGE_END_DATE <= #02/28/2006#
ORDER BY (m.USAGE-sq.Average)/sq.StDev;
Lyle Fairfield wrote:
>I think Abs(Score) >= 2 is worth another look.
-Tom.
I think you are right. Abs() is a good idea.
--
Message posted via AccessMonster.c om http://www.accessmonster.com/Uwe/For...ccess/200610/1
rdemyan via AccessMonster.c om wrote:
Lyle:
I got the following to produce results, but I need to test it further.
SELECT m.USAGE, (m.USAGE-sq.Average)/sq.StDev AS ZScore
FROM [MONTHLY_METER_U SAGE] AS m LEFT JOIN (SELECT Avg(USAGE) AS Average,
StDev(USAGE) AS StDev
FROM [MONTHLY_METER_U SAGE]
WHERE METER_ID = '000001'
AND USAGE_END_DATE >= #03/01/2005# AND USAGE_END_DATE <= #02/28/2006#) AS sq
ON m.USAGE <sq.Average
WHERE ((m.USAGE-sq.Average)/sq.StDev)>=2
AND m.METER_ID = '000001'
AND USAGE_END_DATE >= #03/01/2005# AND USAGE_END_DATE <= #02/28/2006#
ORDER BY (m.USAGE-sq.Average)/sq.StDev;
If you use Tom's revision:
WHERE Abs(((m.USAGE-sq.Average)/sq.StDev))>=2
you will identify scores that are unusually low as well as scores that
are unusually high,
Got this to work nicely but had to add a Having clause because there is the
possibility that StDev can be zero and dividing by zero, of course, leads to
an error.
Lyle Fairfield wrote:
>Lyle:
[quoted text clipped - 11 lines]
>AND USAGE_END_DATE >= #03/01/2005# AND USAGE_END_DATE <= #02/28/2006# ORDER BY (m.USAGE-sq.Average)/sq.StDev;
If you use Tom's revision: WHERE Abs(((m.USAGE-sq.Average)/sq.StDev))>=2 you will identify scores that are unusually low as well as scores that are unusually high,
--
Message posted via http://www.accessmonster.com This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: Peter Hansen |
last post by:
Greetings.
Im trying to write a program that can be run from the command line.
If I want to search for example after a file with the ending .pdf, I should
be able to write in the command line:
python name of my program / the libary to search and what kind of file it
is example a .pdf file
So if my program name was test.py and the library name was library1 and the
test type i wanted to find was, a .pdf file
I should write python...
|
by: davidw |
last post by:
I want to check chose performance related parameters in my code.
Thanks!
|
by: Bryan Olson |
last post by:
The Python slice type has one method 'indices', and reportedly:
This method takes a single integer argument /length/ and
computes information about the extended slice that the slice
object would describe if applied to a sequence of length
items. It returns a tuple of three integers; respectively
these are the /start/ and /stop/ indices and the /step/ or
stride length of the slice. Missing or out-of-bounds indices
are handled in a manner...
|
by: sabarish |
last post by:
Hi to all. find out the biggest among two numbers without using any
conditional statements and any relational operators.
|
by: Nadav |
last post by:
Hi,
1. I am writing some kind of a CLI Linker
2. I am using the unmanaged meta-data API.
I wonder browsing "cor.h" I have encountered the 'ICeeGen' this interface is retrieved by the 'ICorModule' interface, I didn't find any API that provide me the means to get an 'ICorModule' interface, how should I get this interface? What object should i create? what method should I call?
Any comments remarks or pointers will be appreciated.
--...
| |
by: thebjorn |
last post by:
For the purpose of finding someone's age I was looking for a way to
find how the difference in years between two dates, so I could do
something like:
age = (date.today() - born).year
but that didn't work (the timedelta class doesn't have a year
accessor).
I looked in the docs and the cookbook, but I couldn't find anything, so
|
by: isha123 |
last post by:
how can i find the cpu usage of my windows XP SP2 machine from command line? And also how can i execute this command with Perl? or is there anyother way to find CPU usage for XP machine using Perl?
|
by: dittytwo |
last post by:
Hi all
I have been looking around the web
and can't seem to find a solution the solution that i have found and manipulated seems to bring back the whole list of currently running process's which hogs a lot of CPU time in its self..
basic idea of my program is i have multiple threads doing things and i have a thread that gathers data within this thread i can currently get all sorts of information using
...
|
by: sh.vipin |
last post by:
is there any way to find out number of bytes freed on a particular
free() call in C
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it.
First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
|
by: Oralloy |
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed.
This is as boiled down as I can make it.
Here is my compilation command:
g++-12 -std=c++20 -Wnarrowing bit_field.cpp
Here is the code in...
| |
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth.
The Art of Business Website Design
Your website is...
|
by: Hystou |
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own....
Now, this would greatly impact the work of software developers. The idea...
|
by: isladogs |
last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM).
In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules.
He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms.
Adolph will...
|
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
|
by: muto222 |
last post by:
How can i add a mobile payment intergratation into php mysql website.
| |
by: bsmnconsultancy |
last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...
| |