I have a list that starts with zeros, has sporadic data, and then has
good data. I define the point at which the data turns good to be the
first index with a non-zero entry that is followed by at least 4
consecutive non-zero data items (i.e. a week's worth of non-zero
data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
9], I would define the point at which data turns good to be 4 (1
followed by 2, 3, 4, 5).
I have a simple algorithm to identify this changepoint, but it looks
crude: is there a cleaner, more elegant way to do this?
flag = True
i=-1
j=0
while flag and i < len(retHist)-1:
    i += 1
    if retHist[i] == 0:
        j = 0
    else:
        j += 1
        if j == 5:
            flag = False
del retHist[:i-4]
Thanks in advance for your help
Thomas Philips
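
The definition above translates fairly directly into a single pass that counts the current streak of non-zero entries. What follows is a minimal sketch in Python 3 (the thread itself predates it). The names `first_good_index`, `run_length` and `ret_hist` are chosen here for illustration, and returning `None` when no streak is long enough is an assumption, since the post does not say what should happen in that case.

```python
def first_good_index(values, run_length=5):
    """Return the index where the first run of `run_length` non-zero items starts.

    Returns None when the list contains no such run (an assumption; the
    original post does not define this case).
    """
    run = 0  # length of the current streak of non-zero items
    for i, value in enumerate(values):
        run = run + 1 if value != 0 else 0
        if run == run_length:
            return i - run_length + 1
    return None


ret_hist = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
start = first_good_index(ret_hist)
if start is not None:
    del ret_hist[:start]
print(ret_hist)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Keeping the search separate from the `del` means a caller can decide what to do with a list that never turns good.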
Aug 26 '08
On Tue, 26 Aug 2008 17:04:19 -0700, tdmj wrote:
> On Aug 26, 5:49 pm, tkp...@hotmail.com wrote:
>> I have a list that starts with zeros, has sporadic data, and then has
>> good data. I define the point at which the data turns good to be the
>> first index with a non-zero entry that is followed by at least 4
>> consecutive non-zero data items (i.e. a week's worth of non-zero
>> data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
>> 9], I would define the point at which data turns good to be 4 (1
>> followed by 2, 3, 4, 5).
> ....
> With regular expressions:
Good grief. If you're suggesting that as a serious proposal, and not just
to prove it can be done, that's surely an example of "when all you have
is a hammer, everything looks like a nail" thinking.
In this particular case, your regex "solution" gives the wrong result,
indicating that you didn't test your code before posting. Hint:
re.search(r'[1-9]{5, }', "123456")
returns None.
The obvious fix for that specific bug is to use r'[1-9]{5,5}', but even
that will fail. Hint: what happens if an item has more than one digit?
Before posting another regex solution, make sure it does the right thing
with this:
[0, 0, 101, 0, 1002, 203, 3050, 4105, 5110, 623, 777]
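
To make the two hints concrete, here is a small Python 3 sketch. The regex post being answered is elided above, so the step that joins the items into one string of digits is only an assumption about how such an approach would feed a list to `re`; `joined` and `match` are illustrative names.

```python
import re

# A space inside the braces stops `{5, }` from being parsed as a quantifier,
# so the pattern looks for the literal text "{5, }" and finds nothing.
print(re.search(r'[1-9]{5, }', "123456"))          # None

# Without the space the quantifier works on a string of single digits.
print(re.search(r'[1-9]{5,}', "123456").group())   # 123456

# Assumption: the list is turned into one string of digits before matching.
data = [0, 0, 101, 0, 1002, 203, 3050, 4105, 5110, 623, 777]
joined = "".join(str(item) for item in data)
match = re.search(r'[1-9]{5,}', joined)

# Zeros inside multi-digit items break up the runs, and offsets in the
# string no longer correspond to indices in the list.
print(joined)
print(match.group(), match.start())
```

On this input the only match is built from the digits of the last two items, although the good data starts at list index 4.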
--
Steven

tk****@hotmail.com wrote:
> I have a list that starts with zeros, has sporadic data, and then has
> good data. I define the point at which the data turns good to be the
> first index with a non-zero entry that is followed by at least 4
> consecutive non-zero data items (i.e. a week's worth of non-zero
> data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
> 9], I would define the point at which data turns good to be 4 (1
> followed by 2, 3, 4, 5).
> I have a simple algorithm to identify this changepoint, but it looks
> crude: is there a cleaner, more elegant way to do this?
> flag = True
> i=-1
> j=0
> while flag and i < len(retHist)-1:
>     i += 1
>     if retHist[i] == 0:
>         j = 0
>     else:
>         j += 1
>         if j == 5:
>             flag = False
> del retHist[:i-4]
> Thanks in advance for your help
> Thomas Philips
data = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

def itergood(indata):
    indata = iter(indata)
    buf = []
    while len(buf) < 4:
        buf.append(indata.next())
        if buf[-1] == 0:
            buf[:] = []
    for x in buf:
        yield x
    for x in indata:
        yield x

for d in itergood(data):
    print d
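
For readers on Python 3, a rough port of the generator above could look like the sketch below. Two things change: `indata.next()` becomes `next(indata)`, and the `StopIteration` it raises has to be caught, because since Python 3.7 (PEP 479) a `StopIteration` escaping a generator body is turned into a `RuntimeError`. The `good_ones` parameter is an addition made here for illustration; its default of 4 mirrors the hard-coded threshold in the post.

```python
def itergood(indata, good_ones=4):
    """Yield items starting at the first run of `good_ones` non-zero values."""
    indata = iter(indata)
    buf = []
    while len(buf) < good_ones:
        try:
            item = next(indata)
        except StopIteration:
            return  # input ended before a long enough run was found
        if item:
            buf.append(item)
        else:
            buf.clear()
    yield from buf      # the run that met the threshold
    yield from indata   # everything after it, zeros included


data = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list(itergood(data)))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```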
On Aug 26, 10:39 pm, tkp...@hotmail.com wrote:
> On Aug 26, 7:23 pm, Emile van Sebille <em...@fenx.com> wrote:
>> tkp...@hotmail.com wrote:
>>> I have a list that starts with zeros, has sporadic data, and then has
>>> good data. I define the point at which the data turns good to be the
>>> first index with a non-zero entry that is followed by at least 4
>>> consecutive non-zero data items (i.e. a week's worth of non-zero
>>> data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
>>> 9], I would define the point at which data turns good to be 4 (1
>>> followed by 2, 3, 4, 5).
>>> I have a simple algorithm to identify this changepoint, but it looks
>>> crude: is there a cleaner, more elegant way to do this?
>> >>> for ii, dummy in enumerate(retHist):
>> ...     if 0 not in retHist[ii:ii+5]:
>> ...         break
>> >>> del retHist[:ii]
>> Well, to the extent short and sweet is elegant...
>> Emile
> This is just what the doctor ordered. Thank you, everyone, for the
> help.
Note that the version above (as well as most of the others posted) fails for
boundary cases; check out bearophile's doctest to see some of them.
Below are two more versions that pass all the doctests: the first
works only for lists and modifies them in place, and the second works
for arbitrary iterables:
def clean_inplace(seq, good_ones=4):
    start = 0
    n = len(seq)
    while start < n:
        try: end = seq.index(0, start)
        except ValueError: end = n
        if end-start >= good_ones:
            break
        start = end+1
    del seq[:start]

def clean_iter(iterable, good_ones=4):
    from itertools import chain, islice, takewhile, dropwhile
    iterator = iter(iterable)
    is_zero = float(0).__eq__
    while True:
        # consume all zeros up to the next non-zero
        iterator = dropwhile(is_zero, iterator)
        # take up to `good_ones` non-zeros
        good = list(islice(takewhile(bool, iterator), good_ones))
        if not good: # iterator exhausted
            return iterator
        if len(good) == good_ones:
            # found `good_ones` consecutive non-zeros;
            # chain them to the rest items and return them
            return chain(good, iterator)
HTH,
George
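
The boundary cases are easy to hit with the slice test quoted above: near the end of the list `retHist[ii:ii+5]` is shorter than five items, so a short zero-free tail passes, and with an empty list the loop never binds `ii`. One way to keep the slicing idea while avoiding both, sketched in Python 3 under the assumption that "no good run" should give an empty list (which is what `clean_inplace` leaves behind); `trim_to_good` is a name made up for this note.

```python
def trim_to_good(values, good_ones=4):
    """Return the tail of `values` from the first run of `good_ones` non-zeros.

    Returns an empty list when there is no such run.
    """
    # Stop early enough that every window has exactly `good_ones` items.
    for start in range(len(values) - good_ones + 1):
        if 0 not in values[start:start + good_ones]:
            return values[start:]
    return []


print(trim_to_good([0, 0, 1, 0, 1, 2, 3, 4, 5]))  # [1, 2, 3, 4, 5]
print(trim_to_good([0, 0, 1, 0, 1, 2, 3]))        # []
print(trim_to_good([]))                           # []
```

Each window is rebuilt and scanned from scratch, so this still costs on the order of n * good_ones comparisons, unlike the counting versions in the thread.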
On Aug 27, 3:00 pm, Gerard flanagan <grflana...@gmail.com> wrote:
> tkp...@hotmail.com wrote:
>> I have a list that starts with zeros, has sporadic data, and then has
>> good data. I define the point at which the data turns good to be the
>> first index with a non-zero entry that is followed by at least 4
>> consecutive non-zero data items (i.e. a week's worth of non-zero
>> data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
>> 9], I would define the point at which data turns good to be 4 (1
>> followed by 2, 3, 4, 5).
>> I have a simple algorithm to identify this changepoint, but it looks
>> crude: is there a cleaner, more elegant way to do this?
>> flag = True
>> i=-1
>> j=0
>> while flag and i < len(retHist)-1:
>>     i += 1
>>     if retHist[i] == 0:
>>         j = 0
>>     else:
>>         j += 1
>>         if j == 5:
>>             flag = False
>> del retHist[:i-4]
>> Thanks in advance for your help
>> Thomas Philips
> data = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
> def itergood(indata):
>     indata = iter(indata)
>     buf = []
>     while len(buf) < 4:
>         buf.append(indata.next())
>         if buf[-1] == 0:
>             buf[:] = []
>     for x in buf:
>         yield x
>     for x in indata:
>         yield x
> for d in itergood(data):
>     print d
This seems the most efficient so far for arbitrary iterables. With a
few micro-optimizations it becomes:
from itertools import chain

def itergood(indata, good_ones=4):
    indata = iter(indata); get_next = indata.next
    buf = []; append = buf.append
    while len(buf) < good_ones:
        next = get_next()
        if next: append(next)
        else: del buf[:]
    return chain(buf, indata)

$ python -m timeit -s "x = 1000*[0, 0, 0, 1, 2, 3] + [1,2,3,4]; from itergood import itergood" "list(itergood(x))"
100 loops, best of 3: 3.09 msec per loop

And with Psyco enabled:

$ python -m timeit -s "x = 1000*[0, 0, 0, 1, 2, 3] + [1,2,3,4]; from itergood import itergood" "list(itergood(x))"
1000 loops, best of 3: 466 usec per loop
George
George Sakkis:
> This seems the most efficient so far for arbitrary iterables.

This one probably scores well with Psyco ;-)

def start_good3(seq, good_ones=4):
    """
    >>> start_good = start_good3
    >>> start_good([0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    4
    >>> start_good([])
    -1
    >>> start_good([0, 0])
    -1
    >>> start_good([0, 0, 0])
    -1
    >>> start_good([0, 0, 0, 0, 1])
    -1
    >>> start_good([0, 0, 1, 0, 1, 2, 3])
    -1
    >>> start_good([0, 0, 1, 0, 1, 2, 3, 4])
    4
    >>> start_good([0, 0, 1, 0, 1, 2, 3, 4, 5])
    4
    >>> start_good([1, 2, 3, 4])
    0
    >>> start_good([1, 2, 3])
    -1
    >>> start_good([0, 0, 1, 0, 1, 2, 0, 4])
    -1
    """
    n_good = 0
    pos = 0
    for el in seq:
        if el:
            if n_good == good_ones:
                return pos - good_ones
            else:
                n_good += 1
        elif n_good:
                n_good = 0
        pos += 1
    if n_good == good_ones:
        return pos - good_ones
    else:
        return -1
Bye,
bearophile
On Aug 27, 4:34 pm, bearophileH...@lycos.com wrote:
> George Sakkis:
>> This seems the most efficient so far for arbitrary iterables.
> This one probably scores well with Psyco ;-)
> def start_good3(seq, good_ones=4):
>     n_good = 0
>     pos = 0
>     for el in seq:
>         if el:
>             if n_good == good_ones:
>                 return pos - good_ones
>             else:
>                 n_good += 1
>         elif n_good:
>                 n_good = 0
>         pos += 1
>     if n_good == good_ones:
>         return pos - good_ones
>     else:
>         return -1
> Bye,
> bearophile
There, that's the regular machine for it. Too much thinking in
objects, and you can't even write a linked list anymore, right?
On Aug 27, 5:34 pm, bearophileH...@lycos.com wrote:
> George Sakkis:
>> This seems the most efficient so far for arbitrary iterables.
> This one probably scores well with Psyco ;-)
I think if you update this so that it returns the "good" iterable
instead of the starting index, it is equivalent to Gerard's solution.
George
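
As a rough illustration of that remark, here is the same counting loop rearranged so that it hands back the items rather than an index. This is a Python 3 sketch, not code from the thread; `iter_from_good` is a made-up name, and the run is kept in a small buffer because a one-shot iterable cannot be rewound once the run has been seen.

```python
def iter_from_good(seq, good_ones=4):
    """Yield the items of `seq` from the first run of `good_ones` non-zeros.

    Yields nothing when no such run exists.
    """
    iterator = iter(seq)
    run = []  # the current streak of non-zero items
    for el in iterator:
        if el:
            run.append(el)
            if len(run) == good_ones:
                yield from run       # the streak itself
                yield from iterator  # whatever follows it
                return
        else:
            run = []


print(list(iter_from_good([0, 0, 1, 0, 1, 2, 3, 4, 5, 6])))  # [1, 2, 3, 4, 5, 6]
print(list(iter_from_good([0, 0, 1, 0, 1, 2, 3])))           # []
```

Because the loop is a plain `for`, running out of input simply ends the generator.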
On Aug 27, 5:48 pm, castironpi <castiro...@gmail.com> wrote:
> On Aug 27, 4:34 pm, bearophileH...@lycos.com wrote:
>> George Sakkis:
>>> This seems the most efficient so far for arbitrary iterables.
>> This one probably scores well with Psyco ;-)
>> def start_good3(seq, good_ones=4):
>>     n_good = 0
>>     pos = 0
>>     for el in seq:
>>         if el:
>>             if n_good == good_ones:
>>                 return pos - good_ones
>>             else:
>>                 n_good += 1
>>         elif n_good:
>>                 n_good = 0
>>         pos += 1
>>     if n_good == good_ones:
>>         return pos - good_ones
>>     else:
>>         return -1
>> Bye,
>> bearophile
> There, that's the regular machine for it. Too much thinking in
> objects, and you can't even write a linked list anymore, right?
And you're still wondering why do people killfile you or think you're
a failed AI project...
On Aug 27, 6:14 pm, George Sakkis <george.sak...@gmail.com> wrote:
> On Aug 27, 5:48 pm, castironpi <castiro...@gmail.com> wrote:
>> On Aug 27, 4:34 pm, bearophileH...@lycos.com wrote:
>>> George Sakkis:
>>>> This seems the most efficient so far for arbitrary iterables.
>>> This one probably scores well with Psyco ;-)
>>> def start_good3(seq, good_ones=4):
>>>     n_good = 0
>>>     pos = 0
>>>     for el in seq:
>>>         if el:
>>>             if n_good == good_ones:
>>>                 return pos - good_ones
>>>             else:
>>>                 n_good += 1
>>>         elif n_good:
>>>                 n_good = 0
>>>         pos += 1
>>>     if n_good == good_ones:
>>>         return pos - good_ones
>>>     else:
>>>         return -1
>>> Bye,
>>> bearophile
>> There, that's the regular machine for it. Too much thinking in
>> objects, and you can't even write a linked list anymore, right?
> And you're still wondering why do people killfile you or think you're
> a failed AI project...
Just jumping on the bandwagon, George. And you see, everyone else's
passed the doctests perfectly. Were all the running times O( n* k )?
George Sakkis wrote:
> On Aug 27, 3:00 pm, Gerard flanagan <grflana...@gmail.com> wrote:
>> tkp...@hotmail.com wrote:
>>> I have a list that starts with zeros, has sporadic data, and then has
>>> good data. I define the point at which the data turns good to be the
>>> first index with a non-zero entry that is followed by at least 4
>>> consecutive non-zero data items (i.e. a week's worth of non-zero
>>> data). For example, if my list is [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8,
>>> 9], I would define the point at which data turns good to be 4 (1
>>> followed by 2, 3, 4, 5).
>>> I have a simple algorithm to identify this changepoint, but it looks
>>> crude: is there a cleaner, more elegant way to do this?
>>> flag = True
>>> i=-1
>>> j=0
>>> while flag and i < len(retHist)-1:
>>>     i += 1
>>>     if retHist[i] == 0:
>>>         j = 0
>>>     else:
>>>         j += 1
>>>         if j == 5:
>>>             flag = False
>>> del retHist[:i-4]
>>> Thanks in advance for your help
>>> Thomas Philips
>> data = [0, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>> def itergood(indata):
>>     indata = iter(indata)
>>     buf = []
>>     while len(buf) < 4:
>>         buf.append(indata.next())
>>         if buf[-1] == 0:
>>             buf[:] = []
>>     for x in buf:
>>         yield x
>>     for x in indata:
>>         yield x
>> for d in itergood(data):
>>     print d
> This seems the most efficient so far for arbitrary iterables. With a
> few micro-optimizations it becomes:
> from itertools import chain
> def itergood(indata, good_ones=4):
>     indata = iter(indata); get_next = indata.next
>     buf = []; append = buf.append
>     while len(buf) < good_ones:
>         next = get_next()
>         if next: append(next)
>         else: del buf[:]
>     return chain(buf, indata)
> $ python -m timeit -s "x = 1000*[0, 0, 0, 1, 2, 3] + [1,2,3,4]; from itergood import itergood" "list(itergood(x))"
> 100 loops, best of 3: 3.09 msec per loop
> And with Psyco enabled:
> $ python -m timeit -s "x = 1000*[0, 0, 0, 1, 2, 3] + [1,2,3,4]; from itergood import itergood" "list(itergood(x))"
> 1000 loops, best of 3: 466 usec per loop
> George
--
I always forget the 'del slice' method for clearing a list, thanks.
I think that returning a `chain` means that the function is not itself a
generator. And so if the indata has length less than or equal
to the threshold (good_ones), an unhandled StopIteration is raised
before the return statement is reached.
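
The point is easy to reproduce. Below is a Python 3 rendering of the chain-returning function (with `next(indata)` in place of the Python 2 `indata.next`), followed by one possible way to contain the exception. More generally, the `StopIteration` escapes whenever the input runs out before a long enough run has been collected. `itergood_safe` is a hypothetical wrapper added for this note, and returning an empty iterator in that case is an assumption about the desired behaviour.

```python
from itertools import chain


def itergood(indata, good_ones=4):
    """Python 3 rendering of the chain-returning version (not a generator)."""
    indata = iter(indata)
    buf = []
    while len(buf) < good_ones:
        item = next(indata)  # raises StopIteration when the input runs out
        if item:
            buf.append(item)
        else:
            del buf[:]
    return chain(buf, indata)


def itergood_safe(indata, good_ones=4):
    """Same result, but a too-short input gives an empty iterator."""
    try:
        return itergood(indata, good_ones)
    except StopIteration:
        return iter(())


try:
    itergood([1, 2, 3])
except StopIteration:
    print("StopIteration reached the caller")

print(list(itergood_safe([1, 2, 3])))           # []
print(list(itergood_safe([0, 1, 2, 3, 4, 5])))  # [1, 2, 3, 4, 5]
```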
G.