P: n/a
|
Hi all,
I am a Python novice, and I have run into a problem in a project I am
working on, which boils down to identifying the patterns in a sequence
of integers, for example
..... 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 ...
I want to process this such that I get out two patterns, like:
(9, 3, 3, 0, 3, 3, 0, 3, 3, 0, 3, 3, 0)
and
(10, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1)
I am pretty sure I can figure out how to do that, but I would like to
have some guidance on the most pythonic approach to this.
Two paths I have considered is:
1. Convert the sequence of integers to a hex string, i.e., "...
16616616616616619330330330330A66166..." and use the re module to find
the patterns. Use the string positions to go back to the sequence
2. Put them in a list or an array and manually look for the patterns
by iterating and filtering the elements compare with sets.
I am not looking for a "solution" to this specific problem, just some
guidance
The rules for the sequence is:
1. The sequence may start in the middle of a pattern
2. There are one or two patterns, Pattern A and Pattern B in the
sequence
3. Pattern A only consists of the numbers 0, 3, and 9. 3, 3 is always
followed by 0
4. Pattern B only consists of the numbers 1, 6, and 10. 6, 6, is
always followed by 1
5. There may be other numbers interspersed within the sequence, but
they can be ignored
6. The relative position of 9 or 10 in the patterns varies from case
to case, but is consistent throughout a sequence.
7. There is always one 9 or one 10 in a pattern
7. The beginning of a pattern is marked by the transision from oner
pattern to the other.
8. If there is only one pattern in the sequence, the pattern beginning
is marked by the first occurance of either 9 or 10
9. The pattern is repetitive in the sequence,
e.g., ...ABABABAB..., ...AAA..., or ...BBB...
Thank you,
-- Slaunger | |
Share this Question
P: n/a
|
Slaunger wrote:
Hi all,
I am a Python novice, and I have run into a problem in a project I am
working on, which boils down to identifying the patterns in a sequence
of integers, for example
.... 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 ...
I want to process this such that I get out two patterns, like:
(9, 3, 3, 0, 3, 3, 0, 3, 3, 0, 3, 3, 0)
and
(10, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1)
Maybe:
#-----------------------------------------------------------------
data = '''
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1'''
data = [int(x) for x in data.split()]
from itertools import groupby
S1 = [0, 3, 9]
s = set()
for k, g in groupby(data, lambda x: x in S1):
seq = tuple(g)
# maybe the next line should be 'if 9 in seq or 10 in seq'?
if seq[0] in [9, 10]:
s.add(seq)
print s
#------------------------------------------------------------------
set(
[(9, 3, 3, 0, 3, 3, 0, 3, 3, 0, 3, 3, 0),
(10, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1)])
hth
G. | |
P: n/a
|
Slaunger <Sl******@gmail.comwrites:
I am a Python novice, and I have run into a problem in a project I am
working on, which boils down to identifying the patterns in a sequence
of integers, for example
.... 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 ...
I want to process this such that I get out two patterns, like:
(9, 3, 3, 0, 3, 3, 0, 3, 3, 0, 3, 3, 0)
and
(10, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1)
I am pretty sure I can figure out how to do that, but I would like to
have some guidance on the most pythonic approach to this.
Then it would be a good starting point to write some code. Then you
could post it and ask how it can be made more 'pythonic'.
HTH
--
Arnaud | |
P: n/a
|
On Nov 21, 9:13*am, Slaunger <Slaun...@gmail.comwrote:
Hi all,
I am a Python novice, and I have run into a problem in a project I am
working on, which boils down to identifying the patterns in a sequence
of integers, for example
.... 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 ...
I want to process this such that I get out two patterns, like:
(9, 3, 3, 0, 3, 3, 0, 3, 3, 0, 3, 3, 0)
and
(10, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1)
I am pretty sure I can figure out how to do that, but I would like to
have some guidance on the most pythonic approach to this.
Two paths I have considered is:
1. Convert the sequence of integers to a hex string, i.e., "...
16616616616616619330330330330A66166..." and use the re module to find
the patterns. Use the string positions to go back to the sequence
2. Put them in a list or an array and manually look for the patterns
by iterating and filtering the elements compare with sets.
I am not looking for a "solution" to this specific problem, just some
guidance
Your rules appear to be incomplete and inconsistent.
>
The rules for the sequence is:
1. The sequence may start in the middle of a pattern
2. There are one or two patterns, Pattern A and Pattern B in the
sequence
3. Pattern A only consists of the numbers 0, 3, and 9. 3, 3 is always
followed by 0
But does a 3 always follow a 3? Can you have 3, 0, 3, 0?
Can 0's occur without 3's, such as 0, 0, 0?
4. Pattern B only consists of the numbers 1, 6, and 10. 6, 6, is
always followed by 1
5. There may be other numbers interspersed within the sequence, but
they can be ignored
So, I can have 3, 3, 0, 7, 3, 3, 0?
What if the 7 occurs after the pair of 3's? Is the number following
the 7 forced to be 0, i.e., is 3, 3, 7, 3, 3, 0 legal?
6. The relative position of 9 or 10 in the patterns varies from case
to case, but is consistent throughout a sequence.
7. There is always one 9 or one 10 in a pattern
7. The beginning of a pattern is marked by the transision from oner
pattern to the other.
Can there be an ignored number between the patterns? Is
9,3,3,0,7,10,6,6,1
legal? If NO, you violate Rule 5. If YES, you violate the second Rule
7.
8. If there is only one pattern in the sequence, the pattern beginning
is marked by the first occurance of either 9 or 10
9. The pattern is repetitive in the sequence,
e.g., ...ABABABAB..., ...AAA..., or ...BBB...
Thank you,
-- Slaunger
| |
P: n/a
|
On Fri, 21 Nov 2008 18:10:02 +0100
Gerard flanagan <gr********@gmail.comwrote:
data = '''
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1'''
data = [int(x) for x in data.split()]
from itertools import groupby
But groupby needs sorted data?
Suppose the rules do not conflict or overlap and between them divide
all the values, then maybe this would work:
class StateMachine:
def __init__(self,*rules):
self.rules = rules
self.state = len(rules) #deliberately unreachable
self.first = True
def change(self,x):
#check and/or change state
for i,rule in enumerate(self.rules):
if rule(x):
if i == self.state: #no state change
return False
else: #maybe state change
self.state = i
if self.first: #set initial state, no change
self.first = False
return False
else:
return True #state is changed
raise ValueError
def test():
data = '''
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10
6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10
6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10
6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1'''
data = map(int, data.split())
def rule1(x):
return x in set((0, 3, 9))
def rule2(x):
return x in set((6, 1, 10))
state = StateMachine(rule1,rule2)
L = []
res = []
for x in data:
if state.change(x):
res.append(list(L))
L =[]
L.append(x)
res.append(list(L))
print res
if __name__=='__main__':
test() | |
P: n/a
|
>
I am pretty sure I can figure out how to do that, but I would like to
have some guidance on the most pythonic approach to this.
Then it would be a good starting point to write some code. Then you
could post it and ask how it can be made more 'pythonic'.
That is actually a good point. I will do that.
-- ~~~~ | |
P: n/a
|
On 21 Nov., 23:36, Mensanator <mensana...@aol.comwrote:
Your rules appear to be incomplete and inconsistent.
OK. Let me try to clarify then...
3. Pattern A only consists of the numbers 0, 3, and 9. 3, 3 is always
followed by 0
But does a 3 always follow a 3? Can you have 3, 0, 3, 0?
Can 0's occur without 3's, such as 0, 0, 0?
Yes, 3s always comes in pairs. So, 3, 0, 3, 0 is not allowed.
And of the numbers 0, 3, and 9; 0 will always be the first after the
pair of 3s
>
4. Pattern B only consists of the numbers 1, 6, and 10. 6, 6, is
always followed by 1
5. There may be other numbers interspersed within the sequence, but
they can be ignored
So, I can have 3, 3, 0, 7, 3, 3, 0?
Yes, there is a point I did not mention propery in my first
description:
The number 7 for instance could appear in that position, but it would
not be repetitive;
as a matter of fact these other numbers can be filtered away before
looking for the pattern,
so let us just forgot about those.
>
What if the 7 occurs after the pair of 3's? Is the number following
the 7 forced to be 0, i.e., is 3, 3, 7, 3, 3, 0 legal?
No, it would have to be 3, 3, 0, 7, 3, 3, 0, it is sequeezed in - but
as mentioned they can be prefiltered out of the problem
>
7. The beginning of a pattern is marked by the transition from oner
pattern to the other.
Can there be an ignored number between the patterns? Is
9,3,3,0,7,10,6,6,1
legal? If NO, you violate Rule 5. If YES, you violate the second Rule
7.
Yes you are right. This complication is again eliminated by
prefiltering "other" numbers out
-- Slaunger | |
P: n/a
|
On 21 Nov., 18:10, Gerard flanagan <grflana...@gmail.comwrote:
Slaunger wrote:
Hi all,
I am a Python novice, and I have run into a problem in a project I am
working on, which boils down to identifying the patterns in a sequence
of integers, for example
.... 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 ...
I want to process this such that I get out two patterns, like:
(9, 3, 3, 0, 3, 3, 0, 3, 3, 0, 3, 3, 0)
and
(10, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1)
Maybe:
#-----------------------------------------------------------------
data = '''
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1'''
data = [int(x) for x in data.split()]
from itertools import groupby
S1 = [0, 3, 9]
s = set()
for k, g in groupby(data, lambda x: x in S1):
* * *seq = tuple(g)
* * *# maybe the next line should be 'if 9 in seq or 10 in seq'?
* * *if seq[0] in [9, 10]:
* * * * *s.add(seq)
print s
#------------------------------------------------------------------
set(
[(9, 3, 3, 0, 3, 3, 0, 3, 3, 0, 3, 3, 0),
(10, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1)])
hth
G.
Hi Gerard,
This definitely looks like a path to walk along, and I think your code
does the trick, although I have to play a little around with the
groupby method, of which I had no prior knowledge. I think I will
write some unit test cases to stress test you concept (on Monday, when
I am back at work). I appreciate your almost full implementation - it
would have sufficed to point me to the itertools module, and then I
think I would have figured out.
-- ~~~~ | |
P: n/a
|
So I think you just need to find the first two complete sequences of
1,6,10 and 0,3,9, remove any repetitions and then you're done.
data = '''
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 7 3 0 3 3 0 3 3 0 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 10 6 6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6
6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 9 3 3 0 3 3 0 3 3 0 3 3 0 10 6
6
1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1 6 6 1'''
data = [int(x) for x in data.split()]
s1 = frozenset([1,6,10])
s2 = frozenset([0,3,9])
diter = iter(data)
i = diter.next()
curset = (s1,s2)[i in s2]
otherset = lambda : (s1,s2)[curset is s1]
seq = { s1 : [], s2 : [] }
# read until there is the first change in state - discard
# these, since we may have started in the middle of a sequence
other = otherset()
while i not in other:
i = diter.next()
# read in 2 sequences
for _ in range(2):
other = curset
curset = otherset()
tmp = []
while i not in other:
if i in curset:
tmp.append(i)
i = diter.next()
seq[curset] = tmp[:]
# look for repeats in a seq, truncate
def truncateReps(s,sentinel):
if s.count(sentinel) 1:
loc1 = s.index(sentinel)
loc2 = s.index(sentinel,loc1+1)
s[:] = s[:loc2-loc1]
truncateReps(seq[s1],10)
truncateReps(seq[s2],9)
# the answer
print seq[s1]
print seq[s2]
Prints:
[10, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1, 6, 6, 1]
[9, 3, 3, 0, 3, 3, 0, 3, 3, 0, 3, 3, 0]
Your original sample was only the nominal, most friendly case, so it
is hard to know if any submitted solutions will work will all of your
other conditions. Please try this with more challenging data, such as
starting a sequence in the middle, numbers not in the set
(0,1,3,6,9,10), repeated patterns, sequences that don't start with 9
or 10, etc.
-- Paul | | This discussion thread is closed Replies have been disabled for this discussion. | | Question stats - viewed: 2145
- replies: 8
- date asked: Nov 21 '08
|