Identifying the start of good data in a list

tkpmep

I have a list that starts with zeros, has sporadic data, and then has
good data. I define the point at which the data turns good to be the
first index with a non-zero entry that is followed by at least 4
consecutive non-zero data items (i.e. a week's worth of non-zero
data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
9], I would define the point at which data turns good to be 4 (1
followed by 2, 3, 4, 5).

I have a simple algorithm to identify this changepoint, but it looks
crude: is there a cleaner, more elegant way to do this?

flag = True
i = -1
j = 0
while flag and i < len(retHist)-1:
    i += 1
    if retHist[i] == 0:
        j = 0
    else:
        j += 1
        if j == 5:
            flag = False

del retHist[:i-4]

Thanks in advance for your help

Thomas Philips

Aug 26 '08 #1
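
One thing the loop above leaves implicit is what happens when no qualifying run exists: it falls off the end, the `del` still runs, and a list of five or more entries is cut down to its last five. A minimal sketch of the same counting idea, wrapped in a function that reports that case explicitly, might look like this. It is written for Python 3, and the name `find_good_start` and the `run` parameter are illustrative assumptions, not part of the original post.

```python
def find_good_start(data, run=5):
    """Return the index where the first run of `run` non-zero items begins,
    or None if there is no such run. (Hypothetical helper name.)"""
    streak = 0
    for i, value in enumerate(data):
        if value == 0:
            streak = 0
        else:
            streak += 1
            if streak == run:
                # The run started run - 1 positions before the current index.
                return i - run + 1
    return None


ret_hist = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
start = find_good_start(ret_hist)
if start is not None:
    del ret_hist[:start]   # trim only when good data was actually found
print(start, ret_hist)     # 4 [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Returning None rather than an index lets the caller decide whether to keep, discard, or flag a series that never turns good.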

bearophileHUGS

First solutions I have found, not much tested besides the few doctests:

from itertools import islice

def start_good1(alist, good_ones=4):
    """
    Maybe more efficient for Python

    >>> start_good = start_good1
    >>> start_good([0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    4
    >>> start_good([])
    -1
    >>> start_good([0, 0])
    -1
    >>> start_good([0, 0, 0])
    -1
    >>> start_good([0, 0, 0, 0, 1])
    -1
    >>> start_good([0, 0, 1, 0, 1, 2, 3])
    -1
    >>> start_good([0, 0, 1, 0, 1, 2, 3, 4])
    4
    >>> start_good([0, 0, 1, 0, 1, 2, 3, 4, 5])
    4
    >>> start_good([1, 2, 3, 4])
    0
    >>> start_good([1, 2, 3])
    -1
    >>> start_good([0, 0, 1, 0, 1, 2, 0, 4])
    -1
    """
    # Slide a window of good_ones items along the list and return the
    # first position where every item in the window is non-zero.
    for i in xrange(len(alist) - good_ones + 1):
        if all(islice(alist, i, i+good_ones)):
            return i
    return -1

def start_good2(alist, good_ones=4):
    """
    Maybe more efficient for Psyco

    >>> start_good = start_good2
    >>> start_good([0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    4
    >>> start_good([])
    -1
    >>> start_good([0, 0])
    -1
    >>> start_good([0, 0, 0])
    -1
    >>> start_good([0, 0, 0, 0, 1])
    -1
    >>> start_good([0, 0, 1, 0, 1, 2, 3])
    -1
    >>> start_good([0, 0, 1, 0, 1, 2, 3, 4])
    4
    >>> start_good([0, 0, 1, 0, 1, 2, 3, 4, 5])
    4
    >>> start_good([1, 2, 3, 4])
    0
    >>> start_good([1, 2, 3])
    -1
    >>> start_good([0, 0, 1, 0, 1, 2, 0, 4])
    -1
    """
    n_good = 0
    for i, el in enumerate(alist):
        if alist[i]:
            # Count first, then test, so that a run immediately followed
            # by a zero (e.g. [1, 2, 3, 4, 0]) is still reported.
            n_good += 1
            if n_good == good_ones:
                return i - good_ones + 1
        else:
            n_good = 0
    return -1

if __name__ == "__main__":
    import doctest
    doctest.testmod()
    print "Doctests done\n"

Bye,
bearophile

Aug 26 '08 #2
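
The doctests above never exercise a run of good items that is immediately followed by a zero, for example [1, 2, 3, 4, 0]. One way to gain confidence in two implementations of the same rule is to compare them exhaustively on small inputs. The sketch below does that for a sliding-window version and a running-counter version; it is written for Python 3, and the names `first_run_window` and `first_run_counter` are made up for this illustration rather than taken from the post.

```python
from itertools import product


def first_run_window(values, run=4):
    """Sliding window: test every slice of length `run`."""
    for i in range(len(values) - run + 1):
        if all(values[i:i + run]):
            return i
    return -1


def first_run_counter(values, run=4):
    """Single pass: count consecutive non-zero items."""
    streak = 0
    for i, value in enumerate(values):
        if value:
            streak += 1
            if streak == run:
                return i - run + 1
        else:
            streak = 0
    return -1


# Exhaustively compare the two on every 0/1 sequence of length 0 through 9.
for length in range(10):
    for values in product((0, 1), repeat=length):
        assert first_run_window(values) == first_run_counter(values), values

print(first_run_counter([1, 2, 3, 4, 0]))  # 0: the run ends just before a zero
```

Because only zero versus non-zero matters to the rule, sequences of 0s and 1s are enough to cover every distinct case up to the chosen length.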

bearophileHUGS

Sorry, in the Psyco version replace this line:
for i, el in enumerate(alist):

With:
for i in xrange(len(alist)):

because Psyco doesn't digest enumerate well.

Bye,
bearophile

Aug 26 '08 #3

Matthew Fitzgibbons

Maybe this will do?

reHist = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
count = 0
for i, d in enumerate(reHist):
    if d == 0:
        count = 0
    else:
        count += 1
    if count == 5:
        break
else:
    raise Exception("No data found")
reHist = reHist[i-4:]
print reHist

-Matt

Aug 26 '08 #4

Mensanator

Here's my attempt:

LL = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

i = 0

while (i<len(LL)) and (0 in LL[i:i+5]):
    i += 1

print i, LL[i:i+5]

##
## 4 [1, 2, 3, 4, 5]
##

Aug 26 '08 #5

Emile van Sebille

>>> for ii, dummy in enumerate(retHist):
...     if 0 not in retHist[ii:ii+5]:
...         break
...
>>> del retHist[:ii]

Well, to the extent short and sweet is elegant...

Emile

Aug 26 '08 #6
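
A slice taken near the end of a list can hold fewer than five items, so a test such as `0 not in retHist[ii:ii+5]` can succeed on a short tail like `[1, 2]`. A sketch of the same slicing idea with the window kept to full width follows; it is written for Python 3, and `first_full_window` and its `width` parameter are hypothetical names chosen for this example.

```python
def first_full_window(data, width=5):
    """Return the first index whose slice of exactly `width` items
    contains no zero, or None if there is no such index."""
    # Stopping at len(data) - width guarantees every slice is complete.
    for i in range(len(data) - width + 1):
        if 0 not in data[i:i + width]:
            return i
    return None


print(first_full_window([0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))  # 4
print(first_full_window([0, 0, 1, 2]))                             # None
```

Bounding the loop, rather than checking `len()` of each slice, keeps the body as short as the one-liner it is modelled on.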

tdmj

With regular expressions:

import re

hist = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
hist_str = ''.join(str(i) for i in hist)
match = re.search(r'[1-9]{5,}', hist_str)
hist = hist[-5:] if match is None else hist[match.start():]

Or slightly more concise:

import re

hist = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
match = re.search(r'[1-9]{5,}', ''.join(str(i) for i in hist))
hist = hist[-5:] if match is None else hist[match.start():]

Tommy McDaniel

Aug 27 '08 #7
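
Joining `str(i)` for each entry assumes every value is a single digit: an entry such as 10 contributes two characters, one of them '0', so match offsets no longer line up with list indices. A sketch that sidesteps this by encoding each entry as one flag character is shown below. It is written for Python 3, and `good_start_regex` is a hypothetical name, not something from the post.

```python
import re


def good_start_regex(hist, run=5):
    """Return the index of the first run of `run` non-zero entries, or None."""
    # One character per entry keeps string offsets equal to list indices.
    flags = ''.join('1' if value else '0' for value in hist)
    match = re.search('1{%d,}' % run, flags)
    return None if match is None else match.start()


print(good_start_regex([0, 0, 10, 0, 10, 20, 30, 40, 50, 60]))  # 4
```

The same encoding also copes with negative numbers and floats, since only truthiness is inspected.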

tkpmep

On Aug 26, 7:23 pm, Emile van Sebille <em...@fenx.com> wrote:

>>> for ii, dummy in enumerate(retHist):
...     if 0 not in retHist[ii:ii+5]:
...         break
...
>>> del retHist[:ii]

Well, to the extent short and sweet is elegant...

Emile

This is just what the doctor ordered. Thank you, everyone, for the
help.

Sincerely

Thomas Philips

Aug 27 '08 #8

Terry Reedy

Matthew Fitzgibbons wrote:

reHist = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
count = 0
for i, d in enumerate(reHist):
    if d == 0:
        count = 0
    else:
        count += 1
    if count == 5:
        break
else:
    raise Exception("No data found")
reHist = reHist[i-4:]
print reHist

This is what I would have suggested, except that the 'if count' test
should be left under the else clause, as in the original, so I consider
it the best of the responses ;-)

I thought of the repeated slicing alternative, but it would be slightly
slower. However, for occasional runs, the difference would be trivial.

Worrying about what Psyco does for this problem is rather premature
optimization.

My quarter's worth....

tjr

Aug 27 '08 #9
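
The speed difference between repeated slicing and a single counting pass can be measured rather than guessed. The rough sketch below uses `timeit` from the standard library and keeps the length test under the else branch, as suggested above. It is written for Python 3; the function names and the sample list are invented for illustration, and timings vary by machine, so none are shown.

```python
import timeit


def by_counting(data, run=5):
    """Single pass; the length test sits under the else branch."""
    count = 0
    for i, d in enumerate(data):
        if d == 0:
            count = 0
        else:
            count += 1
            if count == run:
                return i - run + 1
    return None


def by_slicing(data, run=5):
    """Build a fresh slice at every index."""
    for i in range(len(data) - run + 1):
        if 0 not in data[i:i + run]:
            return i
    return None


# Hypothetical input: 4000 items of sporadic data, then a good run at the end.
sample = [0, 1, 1, 0] * 1000 + [1, 2, 3, 4, 5]
assert by_counting(sample) == by_slicing(sample) == 4000

for func in (by_counting, by_slicing):
    seconds = timeit.timeit(lambda: func(sample), number=100)
    print(func.__name__, seconds)
```

For a list that is trimmed once per run of a script, either approach is likely to be fast enough that readability should decide.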

Scott David Daniels

Here is one that can go iterator-to-iterator:

import itertools

def started(source):
    src = iter(source)
    lead = []   # non-zero items seen since the last zero
    for x in src:
        if x:
            lead.append(x)
            if len(lead) == 5:
                # Re-attach the buffered items to the rest of the stream.
                return itertools.chain(lead, src)
        else:
            lead = []

print list(started([0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))
--Scott David Daniels
Sc***********@Acm.Org

Aug 27 '08 #10
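
When the source never contains five good items in a row, a function like the one above falls off the end of its loop and returns None, which `list()` cannot consume. A sketch of a generator variant that simply yields nothing in that case is given below. It is written for Python 3, and `good_tail` and its `run` parameter are hypothetical names introduced for this example.

```python
def good_tail(source, run=5):
    """Yield items from the first run of `run` non-zero values onward."""
    src = iter(source)
    lead = []
    for x in src:
        if x:
            lead.append(x)
            if len(lead) == run:
                yield from lead   # the buffered start of the run
                yield from src    # then the rest of the stream
                return
        else:
            lead = []


print(list(good_tail([0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list(good_tail([0, 1, 0, 2])))                             # []
```

Like the original, this buffers at most `run` items, so it works on any iterable, including ones too large to hold in memory.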
