Identifying the start of good data in a list

I have a list that starts with zeros, has sporadic data, and then has
good data. I define the point at which the data turns good to be the
first index with a non-zero entry that is followed by at least 4
consecutive non-zero data items (i.e. a week's worth of non-zero
data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
9], I would define the point at which data turns good to be 4 (1
followed by 2, 3, 4, 5).

I have a simple algorithm to identify this changepoint, but it looks
crude: is there a cleaner, more elegant way to do this?

flag = True
i = -1
j = 0
while flag and i < len(retHist)-1:
    i += 1
    if retHist[i] == 0:
        j = 0
    else:
        j += 1
        if j == 5:
            flag = False

del retHist[:i-4]

Thanks in advance for your help

Thomas Philips
Aug 26 '08
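
For reference, the rule described above can also be written as a single pass that tracks the length of the current run of non-zero items. This is only a sketch and not a reply from the thread: the name `first_good_index` and the `run` parameter are made up for illustration, and it is written for Python 3.

```python
def first_good_index(data, run=5):
    """Return the index of the first item that starts `run` consecutive
    non-zero items, or None if there is no such run."""
    streak = 0  # length of the current run of non-zero items
    for i, x in enumerate(data):
        streak = streak + 1 if x != 0 else 0
        if streak == run:
            # i is the last item of the run; step back to its first item
            return i - run + 1
    return None


retHist = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
start = first_good_index(retHist)
if start is not None:
    del retHist[:start]  # drop everything before the good data
```

With the example list from the question, `first_good_index` returns 4. Unlike the loop in the question, it returns None when no such run exists rather than computing a slice anyway.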
On Aug 27, 3:42 pm, George Sakkis <george.sak...@gmail.com> wrote:
> Below are two more versions that pass all the doctests: the first
> works only for lists and modifies them in place and the second works
> for arbitrary iterables:
>
> def clean_inplace(seq, good_ones=4):
>     start = 0
>     n = len(seq)
>     while start < n:
>         try: end = seq.index(0, start)
>         except ValueError: end = n
>         if end-start >= good_ones:
>             break
>         start = end+1
>     del seq[:start]
>
> def clean_iter(iterable, good_ones=4):
>     from itertools import chain, islice, takewhile, dropwhile
>     iterator = iter(iterable)
>     is_zero = float(0).__eq__
>     while True:
>         # consume all zeros up to the next non-zero
>         iterator = dropwhile(is_zero, iterator)
>         # take up to `good_ones` non-zeros
>         good = list(islice(takewhile(bool, iterator), good_ones))
>         if not good: # iterator exhausted
>             return iterator
>         if len(good) == good_ones:
>             # found `good_ones` consecutive non-zeros;
>             # chain them to the rest items and return them
>             return chain(good, iterator)
>
> HTH,
> George
You gave me an idea: maybe an arbitrary 'lookahead' iterable could be
useful. I haven't seen them that much on the newsgroup, but more than
once. IOW a buffered consumer: something that lets you check a fixed
number of upcoming elements. You might implement it as an iterator with
a __getitem__ method.

Example, unproduced:
>>> import itertools
>>> a = itertools.count()
>>> a.next()
0
>>> a.next()
1
>>> a[3]
5
>>> a.next()
2
>>> a[3]
6

Does this make sense at all?
Aug 28 '08 #21
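
The lookahead idea in post #21 can be sketched as a thin wrapper that buffers items it has peeked at but not yet handed out. This is a guess at what was meant, not code from the thread: the class name `Lookahead` is hypothetical, it uses Python 3's `next(a)` rather than `a.next()`, and it relies on `collections.deque` for the buffer.

```python
from collections import deque
from itertools import count


class Lookahead:
    """Iterator wrapper that allows peeking at upcoming items by index."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buf = deque()  # items fetched from the source but not yet consumed

    def __iter__(self):
        return self

    def __next__(self):
        # Hand out buffered items first, then fall back to the source.
        if self._buf:
            return self._buf.popleft()
        return next(self._it)

    def __getitem__(self, n):
        # a[0] is the item the next call to next(a) would return.
        if n < 0:
            raise IndexError("negative lookahead is not supported")
        while len(self._buf) <= n:
            try:
                self._buf.append(next(self._it))
            except StopIteration:
                raise IndexError("lookahead past end of iterator") from None
        return self._buf[n]


a = Lookahead(count())
first, second = next(a), next(a)  # 0 and 1
peek = a[3]                       # 5, without consuming anything
third = next(a)                   # 2
```

The buffer only grows as far as the deepest peek, so a fixed lookahead distance keeps memory use bounded even on an infinite source such as `itertools.count()`.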
On Aug 27, 11:50 am, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> On Tue, 26 Aug 2008 17:04:19 -0700, tdmj wrote:
>> On Aug 26, 5:49 pm, tkp...@hotmail.com wrote:
>>> I have a list that starts with zeros, has sporadic data, and then has
>>> good data. I define the point at which the data turns good to be the
>>> first index with a non-zero entry that is followed by at least 4
>>> consecutive non-zero data items (i.e. a week's worth of non-zero data).
>>> For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], I
>>> would define the point at which data turns good to be 4 (1 followed by
>>> 2, 3, 4, 5).
>>
>> ...
>> With regular expressions:
>
> Good grief. If you're suggesting that as a serious proposal, and not just
> to prove it can be done, that's surely an example of "when all you have
> is a hammer, everything looks like a nail" thinking.
>
> In this particular case, your regex "solution" gives the wrong result,
> indicating that you didn't test your code before posting. Hint:
>
> re.search(r'[1-9]{5, }', "123456")
>
> returns None.
>
> The obvious fix for that specific bug is to use r'[1-9]{5,5}', but even
> that will fail. Hint: what happens if an item has more than one digit?
>
> Before posting another regex solution, make sure it does the right thing
> with this:
>
> [0, 0, 101, 0, 1002, 203, 3050, 4105, 5110, 623, 777]
>
> --
> Steven
Hey, it's clearer than a lot of the other proposals here. Too bad it
doesn't work. This is why you don't post after 8 p.m. after being at
work all day. I was seeing what I now recall as incorrect answers, but
at the time I was in the midst of a brainfart and for some reason took
them to be right. It can be made to work by removing the space in
"{5, }", inserting some kind of marker between the numbers, and using
the right regular expression to recognize nonzero numbers between the
markers, but I think I've already said too much in this thread.

Tommy McDaniel
Aug 28 '08 #22
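
The repair outlined in post #22 (no space inside the quantifier, a marker between the numbers, and a pattern that recognizes a whole non-zero number) might look roughly like the following. It is a sketch under stated assumptions rather than the regex originally posted: the function name `regex_good_index` is hypothetical, the marker is a comma, and the items are assumed to be non-negative integers.

```python
import re


def regex_good_index(items, run=5):
    """Return the index of the first item that starts `run` consecutive
    non-zero items, or None. Assumes non-negative integers."""
    # Terminate every item with a comma: [0, 101, 7] -> "0,101,7,"
    text = "".join("%d," % x for x in items)
    # A non-zero item is a digit 1-9 followed by any digits, then the marker.
    # The leading group forces the match to begin at an item boundary.
    pattern = r"(?:^|(?<=,))(?:[1-9]\d*,){%d}" % run
    match = re.search(pattern, text)
    if match is None:
        return None
    # Each comma before the match closes one item, so the count is the index.
    return text.count(",", 0, match.start())


index = regex_good_index([0, 0, 101, 0, 1002, 203, 3050, 4105, 5110, 623, 777])
```

For the multi-digit test list from the quoted post this gives 4, the position of 1002. Mapping the match offset back to a list index is the part that a plain digit-per-item encoding gets for free and that the marker version has to do by counting separators.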
On 27 Aug 2008 15:50:14 GMT, Steven D'Aprano <st***@REMOVE-THIS-cybersource.com.au> wrote:
> On Tue, 26 Aug 2008 17:04:19 -0700, tdmj wrote:
>> On Aug 26, 5:49 pm, tkp...@hotmail.com wrote:
>>> I have a list that starts with zeros, has sporadic data, and then has
>>> good data. I define the point at which the data turns good to be the
>>> first index with a non-zero entry that is followed by at least 4
>>> consecutive non-zero data items (i.e. a week's worth of non-zero data).
>>> For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], I
>>> would define the point at which data turns good to be 4 (1 followed by
>>> 2, 3, 4, 5).
>>
>> ...
>> With regular expressions:
>
> Good grief. If you're suggesting that as a serious proposal, and not just
> to prove it can be done, that's surely an example of "when all you have
> is a hammer, everything looks like a nail" thinking.

Maybe I'm stumbling into a "REs are evil" flamewar here. Anyway:

He has a point though: this *can* be seen as a regex problem. Only a
solution which builds a string first is only good for laughs or
(possibly) quick hacks. What's missing is an RE library for lists of
objects, rather than just strings and Unicode strings.

Not sure such a library would be worth implementing -- problems like
this one are rare, I think.

/Jorgen

--
// Jorgen Grahn <grahn@ Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se R'lyeh wgah'nagl fhtagn!
Aug 29 '08 #23
On Aug 29, 9:43 am, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
> On 27 Aug 2008 15:50:14 GMT, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
>> On Tue, 26 Aug 2008 17:04:19 -0700, tdmj wrote:
>>> On Aug 26, 5:49 pm, tkp...@hotmail.com wrote:
>>>> I have a list that starts with zeros, has sporadic data, and then has
>>>> good data. I define the point at which the data turns good to be the
>>>> first index with a non-zero entry that is followed by at least 4
>>>> consecutive non-zero data items (i.e. a week's worth of non-zero data).
>>>> For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], I
>>>> would define the point at which data turns good to be 4 (1 followed by
>>>> 2, 3, 4, 5).
>
> He has a point though: this *can* be seen as a regex problem. Only a
> solution which builds a string first is only good for laughs or
> (possibly) quick hacks. What's missing is an RE library for lists of
> objects, rather than just strings and Unicode strings.
>
> Not sure such a library would be worth implementing -- problems like
> this one are rare, I think.

Every now and then, you see a proposal or a package for a finite state
machine. But how would you encode the comparison of values into a
string, if the things you're comparing aren't strings?
Aug 29 '08 #24
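
One possible answer to the question in post #24 is to avoid encoding the values at all and encode only the result of a test on each value, one character per item, then run an ordinary regex over those characters. The sketch below is an illustration of that idea and not something proposed in the thread: `find_run`, `is_good`, and the letters "g" and "b" are all made-up names.

```python
import re


def find_run(items, is_good, run=5):
    """Return the index where `run` consecutive items satisfying `is_good`
    first begin, or None. Works for any objects, not just numbers."""
    # One character per item, so string offsets equal list indices.
    classes = "".join("g" if is_good(x) else "b" for x in items)
    match = re.search("g{%d}" % run, classes)
    return match.start() if match else None


data = [0, 0, 101, 0, 1002, 203, 3050, 4105, 5110, 623, 777]
start = find_run(data, lambda x: x != 0)
```

Because the predicate does the comparing, multi-digit numbers, floats, or arbitrary objects all reduce to the same two-letter alphabet. It still builds a string first, so the objection quoted above applies, but the string no longer depends on how the values print.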
