473,385 Members | 1,834 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

word shifts

Hello,

I made a function that takes a word list (one word per line, text file)
and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans

def fileshift(x):
fin = open(x)
d = {}
for line in fin:
d[line.strip()] = [1]
for i in range(1, 26):
ite = shift(line.strip(), i)
if ite in d:
print ite
Any tips/suggestions/critiques greatly appreciated.. I'm trying to
teach myself Python (and still beginning) and would love any helpful
info.

thanks!

Jun 27 '08 #1
10 1739
En Sun, 04 May 2008 02:17:07 -0300, dave <sq*************@invalid.comescribió:
Hello,

I made a function that takes a word list (one word per line, text file)
and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans

def fileshift(x):
fin = open(x)
d = {}
for line in fin:
d[line.strip()] = [1]
for i in range(1, 26):
ite = shift(line.strip(), i)
if ite in d:
print ite
Any tips/suggestions/critiques greatly appreciated.. I'm trying to
teach myself Python (and still beginning) and would love any helpful
info.
First, looking at the code, you're evaluating line.strip() a lot of times; I'd avoid it.
Looks like you're using a dictionary just for the keys - the set type is more adequate here (see http://docs.python.org/lib/types-set.html ). In any case, I'd use a value like None instead of [1]
But I'd use a different algorithm. Instead of generating all posible shifts for a given word, I'd substract the newly read word from each previous words (here, "substract two words" means substract the character code for corresponding positions, modulo 26). Shifted words, when substracted, have the same number on all positions.

--
Gabriel Genellina

Jun 27 '08 #2
On May 4, 2:04*am, "Gabriel Genellina" <gagsl-...@yahoo.com.arwrote:
En Sun, 04 May 2008 02:17:07 -0300, dave <squareswallo...@invalid.comescribió:
Hello,
I made a function that takes a word list (one word per line, text file)
and searches for all the words in the list that are 'shifts' of
eachother. *'abc' shifted 1 is 'bcd'
Please take a look and tell me if this is a viable solution.
*def shift(word, amt):
* *ans = ''
* *for letter in word:
* * * * * *ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
* *return ans
def fileshift(x):
* *fin = open(x)
* *d = {}
* *for line in fin:
* * * * * *d[line.strip()] = [1]
* * * * * *for i in range(1, 26):
* * * * * * * * * *ite = shift(line.strip(), i)
* * * * * * * * * *if ite in d:
* * * * * * * * * * * * * *print ite
Any tips/suggestions/critiques greatly appreciated.. I'm trying to
teach myself Python (and still beginning) and would love any helpful
info.

First, looking at the code, you're evaluating line.strip() a lot of times;I'd avoid it.
Looks like you're using a dictionary just for the keys - the set type is more adequate here (seehttp://docs.python.org/lib/types-set.html). In any case, I'd use a value like None instead of [1]
But I'd use a different algorithm. Instead of generating all posible shifts for a given word, I'd substract the newly read word from each previous words (here, "substract two words" means substract the character code for corresponding positions, modulo 26). Shifted words, when substracted, have the same number on all positions.

A faster algorithm is to create a 'key' for each word, defined as the
tuple of ord differences (modulo 26) of consecutive characters. E.g.
the key of 'acf' is (2,3); 'c' is 2 positions after 'a' and 'f' 3
positions after 'c'. Shifted words (and only these) have the same key.
Here's a straightforward implementation that generates all the lists
of equally-shifted words:

from collections import defaultdict

def iter_shifted(words):
key2shifted = defaultdict(list)
for word in words:
ords = map(ord,word)
key = tuple((ords[i]-ords[i-1]) % 26
for i in xrange(1,len(ords)))
key2shifted[key].append(word)
for shifted in key2shifted.itervalues():
if len(shifted) 1:
yield shifted

if __name__ == '__main__':
words = 'abc bef jas cba cde zab azy hkl'.split()
for shifted in iter_shifted(words):
print shifted

# *** output ***
#['bef', 'hkl']
#['abc', 'cde', 'zab']
#['cba', 'azy']
Jun 27 '08 #3
dave wrote:
I made a function that takes a word list (one word per line, text file)
and searches for all the words in the list that are 'shifts' of
eachother. Â*'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

Â*def shift(word, amt):
Â*Â*Â*Â*Â*Â*Â*Â*ans = ''
Â*Â*Â*Â*Â*Â*Â*Â*for letter in word:
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*ans = ans + chr((ord(letter) - ord('a') + amt) % 26 +
ord('a'))
Â*Â*Â*Â*Â*Â*Â*Â*return ans

def fileshift(x):
Â*Â*Â*Â*Â*Â*Â*Â*fin = open(x)
Â*Â*Â*Â*Â*Â*Â*Â*d = {}
Â*Â*Â*Â*Â*Â*Â*Â*for line in fin:
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*d[line.strip()] = [1]
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*for i in range(1, 26):
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*it e = shift(line.strip(), i)
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*if ite in d:
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â* Â*Â*Â*Â*Â*Â*Â*print ite
Any tips/suggestions/critiques greatly appreciated.. I'm trying to
teach myself Python (and still beginning) and would love any helpful
info.
My ideas:

Move the open() call out of fileshift() and have it working with arbitrary
iterables.
Store a normalized version of the word in the dictionary. if every word in
the dict is guaranteed to start with an "a" you can reduce the number of
tests to one.
If you have many words you can build a fast shift() function based on
str.translate().

Invoke the following script both with and without a filename argument:

from string import ascii_lowercase as chars, maketrans

tables = [maketrans(chars, chars[i:] + chars[:i]) for i in range(26)]

def shift(word, n):
return word.translate(tables[n%26])

def normalize(word):
a = word[0]
return shift(word, ord("a") - ord(a))

def findshifted(words):
d = {}
for word in words:
d.setdefault(normalize(word), []).append(word)
return d
if __name__ == "__main__":
import sys
args = sys.argv[1:]
if args:
words = (line.strip() for line in open(args[0]))
else:
import random
sample = """\
alpha
beta
gamma
delta
""".split()

words = sample + [shift(random.choice(sample), random.randrange(26))
for _ in range(10)]

items = findshifted(words).items()
items.sort()
for k, v in items:
print k, "-->", v

Peter
Jun 27 '08 #4
dave <sq*************@invalid.comwrites:
Hello,

I made a function that takes a word list (one word per line, text
file) and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans
In Python, if you want to build a string from lots of parts you can
use ''.join(parts). I think it is considered more efficient.

>
def fileshift(x):
fin = open(x)
d = {}
for line in fin:
d[line.strip()] = [1]
for i in range(1, 26):
ite = shift(line.strip(), i)
if ite in d:
print ite

You could write:

line = line.strip()

after your for line in fin: statement, rather than strip the line twice.

Let me suggest a different solution base on the str.translate() method

from string import lowercase, maketrans

shifted_lowercase = lowercase[1:] +lowercase[0]
table = string.maketrans(lowercase, shifted_lowercase)

def shift1(line):
return line.translate(table)

def fileshift(path):
lines = set()
for line in open(path):
lines.add(line)
for i in xrange(25):
line = shift1(line)
if line in lines:
print line

(untested)

--
Arnaud
Jun 27 '08 #5
George Sakkis:
A faster algorithm is to create a 'key' for each word, defined as the
tuple of ord differences (modulo 26) of consecutive characters.
Very nice solution, it uses the same strategy used to find anagrams,
where keys are
"".join(sorted(word))
Such general strategy to look for a possible invariant key for the
subsets of the required solutions is quite useful.

Bye,
bearophile
Jun 27 '08 #6
be************@lycos.com writes:
George Sakkis:
>A faster algorithm is to create a 'key' for each word, defined as the
tuple of ord differences (modulo 26) of consecutive characters.

Very nice solution, it uses the same strategy used to find anagrams,
where keys are
"".join(sorted(word))
Such general strategy to look for a possible invariant key for the
subsets of the required solutions is quite useful.
Or even:

def get_key(word):
initial = ord(word[0])
return tuple((ord(x) - initial) % 26 for x in word[1:])

--
Arnaud
Jun 27 '08 #7
En Sun, 04 May 2008 03:35:05 -0300, George Sakkis <ge***********@gmail.comescribió:
On May 4, 2:04*am, "Gabriel Genellina" <gagsl-...@yahoo.com.arwrote:
>En Sun, 04 May 2008 02:17:07 -0300, dave <squareswallo...@invalid.comescribió:
I made a function that takes a word list (one word per line, text file)
and searches for all the words in the list that are 'shifts' of
eachother. *'abc' shifted 1 is 'bcd'

But I'd use a different algorithm. Instead of generating all posible shifts for a given word, I'd substract the newly read word from each previous words (here, "substract two words" means substract the character code for corresponding positions, modulo 26). Shifted words, when substracted, have the same number on all positions.

A faster algorithm is to create a 'key' for each word, defined as the
tuple of ord differences (modulo 26) of consecutive characters. E.g.
the key of 'acf' is (2,3); 'c' is 2 positions after 'a' and 'f' 3
positions after 'c'. Shifted words (and only these) have the same key.
Much better! I like it.

--
Gabriel Genellina

Jun 27 '08 #8
On 2008-05-04 01:10:40 -0600, Arnaud Delobelle <ar*****@googlemail.comsaid:
dave <sq*************@invalid.comwrites:
>Hello,

I made a function that takes a word list (one word per line, text
file) and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans

In Python, if you want to build a string from lots of parts you can
use ''.join(parts). I think it is considered more efficient.

what would be the best way to write a "ans = ans + chr" into a
''.join(parts) ??
Jun 27 '08 #9
On May 5, 11:02 pm, dave <squareswallo...@yahoo.comwrote:
On 2008-05-04 01:10:40 -0600, Arnaud Delobelle <arno...@googlemail.comsaid:
dave <squareswallo...@invalid.comwrites:
Hello,
I made a function that takes a word list (one word per line, text
file) and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'
Please take a look and tell me if this is a viable solution.
def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans
In Python, if you want to build a string from lots of parts you can
use ''.join(parts). I think it is considered more efficient.

what would be the best way to write a "ans = ans + chr" into a
''.join(parts) ??
Well if you do it once, that would be simply "ans += chr". Arnaud was
referring to the case where you do it in a loop, like in your snippet.
A direct translation to use ''.join would be:

ans = ''.join(chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
for letter in word)

Of course it is simpler and more efficient if you factor out of the
loop the subexpressions that don't need to be recomputed:

ord_a = ord('a')
shift = ord_a - amt
ans = ''.join(chr((ord(letter) - shift) % 26 + ord_a)
for letter in word)

George
Jun 27 '08 #10
George Sakkis <ge***********@gmail.comwrites:
On May 5, 11:02 pm, dave <squareswallo...@yahoo.comwrote:
>On 2008-05-04 01:10:40 -0600, Arnaud Delobelle <arno...@googlemail.comsaid:
dave <squareswallo...@invalid.comwrites:
>Hello,
>I made a function that takes a word list (one word per line, text
file) and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'
>Please take a look and tell me if this is a viable solution.
>def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans
In Python, if you want to build a string from lots of parts you can
use ''.join(parts). I think it is considered more efficient.

what would be the best way to write a "ans = ans + chr" into a
''.join(parts) ??

Well if you do it once, that would be simply "ans += chr". Arnaud was
referring to the case where you do it in a loop, like in your snippet.
A direct translation to use ''.join would be:

ans = ''.join(chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
for letter in word)

Of course it is simpler and more efficient if you factor out of the
loop the subexpressions that don't need to be recomputed:

ord_a = ord('a')
shift = ord_a - amt
ans = ''.join(chr((ord(letter) - shift) % 26 + ord_a)
for letter in word)

George
Or keep the same layout as your original effort, but build a list of
chars instead of a string, then join them all at the end.

def shift(word, amt):
shifted = ''
ord_a = ord('a')
shift = ord_a - amt
for letter in word:
shifted.append(chr((ord(letter) - shift) % 26 + ord_a))
return ''.join(shifted)

(Which exactly the same as George's snippet, except that the generator
expression is unwound)

--
Arnaud
Jun 27 '08 #11

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: Glen Able | last post by:
The behaviour of << and >> arithmetic operators (under VC++6, x86) seems rather odd to me. Shifts at runtime seem to only use the bottom 5 bits of the shift amount (an x86 issue I suppose), but...
24
by: shafique | last post by:
Hello, Can anybody have idea as how to find it there exist consective one's in a word/dword using bitwise operatiors in C language. Obviously this can be done by shifting and looping one by...
19
by: Ross A. Finlayson | last post by:
Hi, I hope you can help me understand the varargs facility. Say I am programming in ISO C including stdarg.h and I declare a function as so: void log_printf(const char* logfilename, const...
4
by: Dave | last post by:
Hello - Say I have a 32 bit value in data (v:4, r:4, l:8, p:16) unsigned char data; Now, for portabilities sake, I want to use shifts and masks to access these fields, instead of accessing...
3
by: Dave | last post by:
I need to rewrite the following using shifts and masks for portability sakes, and to get rid of the BYTE_ORDER issue. /* CODE */ struct test { #if BYTE_ORDER == BIG_ENDIAN unsigned char ...
3
by: larzeb2000 | last post by:
I have constructed a css stylesheet which has a centered layout. The wrapper div is defined as 1000px. It contains a header div for the entire wrapper width, and a left sidebar div and a content...
3
by: geraldshastri | last post by:
Hi, I am designing xhtml templates using table-less design with div's and CSS positioning. But the problem is that certain pages tend to shift to the left despite the fact that the underlying...
15
by: Christopher Layne | last post by:
So I recently ran into a situation where I invoked UB without specifically knowing I did it. Yes, human, I know. What exactly is/was the rationale for not allowing shifts to be the same width of...
2
by: skijor | last post by:
For a page that display's a catalogue of items in table format. If the number of rows extends below the view, in some browsers (safari, firefox) the page shifts to the left a little bit. It's...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.