word shifts - Python

dave

Hello,

I made a function that takes a word list (one word per line, text file)
and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans

def fileshift(x):
fin = open(x)
d = {}
for line in fin:
d[line.strip()] = [1]
for i in range(1, 26):
ite = shift(line.strip(), i)
if ite in d:
print ite
Any tips/suggestions/critiques greatly appreciated.. I'm trying to
teach myself Python (and still beginning) and would love any helpful
info.

thanks!

Jun 27 '08 #1

Subscribe Post Reply

1739

Gabriel Genellina

En Sun, 04 May 2008 02:17:07 -0300, dave <sq*************@invalid.comescribió:

Hello,

I made a function that takes a word list (one word per line, text file)
and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans

def fileshift(x):
fin = open(x)
d = {}
for line in fin:
d[line.strip()] = [1]
for i in range(1, 26):
ite = shift(line.strip(), i)
if ite in d:
print ite
Any tips/suggestions/critiques greatly appreciated.. I'm trying to
teach myself Python (and still beginning) and would love any helpful
info.

First, looking at the code, you're evaluating line.strip() a lot of times; I'd avoid it.
Looks like you're using a dictionary just for the keys - the set type is more adequate here (see http://docs.python.org/lib/types-set.html ). In any case, I'd use a value like None instead of [1]
But I'd use a different algorithm. Instead of generating all posible shifts for a given word, I'd substract the newly read word from each previous words (here, "substract two words" means substract the character code for corresponding positions, modulo 26). Shifted words, when substracted, have the same number on all positions.

--
Gabriel Genellina

Jun 27 '08 #2

George Sakkis

On May 4, 2:04*am, "Gabriel Genellina" <gagsl-...@yahoo.com.arwrote:

En Sun, 04 May 2008 02:17:07 -0300, dave <squareswallo...@invalid.comescribió:

Hello,

I made a function that takes a word list (one word per line, text file)
and searches for all the words in the list that are 'shifts' of
eachother. *'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

*def shift(word, amt):
* *ans = ''
* *for letter in word:
* * * * * *ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
* *return ans

def fileshift(x):
* *fin = open(x)
* *d = {}
* *for line in fin:
* * * * * *d[line.strip()] = [1]
* * * * * *for i in range(1, 26):
* * * * * * * * * *ite = shift(line.strip(), i)
* * * * * * * * * *if ite in d:
* * * * * * * * * * * * * *print ite

Any tips/suggestions/critiques greatly appreciated.. I'm trying to
teach myself Python (and still beginning) and would love any helpful
info.

First, looking at the code, you're evaluating line.strip() a lot of times;I'd avoid it.
Looks like you're using a dictionary just for the keys - the set type is more adequate here (seehttp://docs.python.org/lib/types-set.html). In any case, I'd use a value like None instead of [1]
But I'd use a different algorithm. Instead of generating all posible shifts for a given word, I'd substract the newly read word from each previous words (here, "substract two words" means substract the character code for corresponding positions, modulo 26). Shifted words, when substracted, have the same number on all positions.

A faster algorithm is to create a 'key' for each word, defined as the
tuple of ord differences (modulo 26) of consecutive characters. E.g.
the key of 'acf' is (2,3); 'c' is 2 positions after 'a' and 'f' 3
positions after 'c'. Shifted words (and only these) have the same key.
Here's a straightforward implementation that generates all the lists
of equally-shifted words:

from collections import defaultdict

def iter_shifted(words):
key2shifted = defaultdict(list)
for word in words:
ords = map(ord,word)
key = tuple((ords[i]-ords[i-1]) % 26
for i in xrange(1,len(ords)))
key2shifted[key].append(word)
for shifted in key2shifted.itervalues():
if len(shifted) 1:
yield shifted

if __name__ == '__main__':
words = 'abc bef jas cba cde zab azy hkl'.split()
for shifted in iter_shifted(words):
print shifted

# *** output ***
#['bef', 'hkl']
#['abc', 'cde', 'zab']
#['cba', 'azy']

Jun 27 '08 #3

Peter Otten

dave wrote:

I made a function that takes a word list (one word per line, text file)
and searches for all the words in the list that are 'shifts' of
eachother. Â*'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

Â*def shift(word, amt):
Â*Â*Â*Â*Â*Â*Â*Â*ans = ''
Â*Â*Â*Â*Â*Â*Â*Â*for letter in word:
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*ans = ans + chr((ord(letter) - ord('a') + amt) % 26 +

ord('a'))

Â*Â*Â*Â*Â*Â*Â*Â*return ans

def fileshift(x):
Â*Â*Â*Â*Â*Â*Â*Â*fin = open(x)
Â*Â*Â*Â*Â*Â*Â*Â*d = {}
Â*Â*Â*Â*Â*Â*Â*Â*for line in fin:
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*d[line.strip()] = [1]
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*for i in range(1, 26):
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*it e = shift(line.strip(), i)
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*if ite in d:
Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â*Â* Â*Â*Â*Â*Â*Â*Â*print ite
Any tips/suggestions/critiques greatly appreciated.. I'm trying to
teach myself Python (and still beginning) and would love any helpful
info.

My ideas:

Move the open() call out of fileshift() and have it working with arbitrary
iterables.
Store a normalized version of the word in the dictionary. if every word in
the dict is guaranteed to start with an "a" you can reduce the number of
tests to one.
If you have many words you can build a fast shift() function based on
str.translate().

Invoke the following script both with and without a filename argument:

from string import ascii_lowercase as chars, maketrans

tables = [maketrans(chars, chars[i:] + chars[:i]) for i in range(26)]

def shift(word, n):
return word.translate(tables[n%26])

def normalize(word):
a = word[0]
return shift(word, ord("a") - ord(a))

def findshifted(words):
d = {}
for word in words:
d.setdefault(normalize(word), []).append(word)
return d
if __name__ == "__main__":
import sys
args = sys.argv[1:]
if args:
words = (line.strip() for line in open(args[0]))
else:
import random
sample = """\
alpha
beta
gamma
delta
""".split()

words = sample + [shift(random.choice(sample), random.randrange(26))
for _ in range(10)]

items = findshifted(words).items()
items.sort()
for k, v in items:
print k, "-->", v

Peter

Jun 27 '08 #4

Arnaud Delobelle

dave <sq*************@invalid.comwrites:

Hello,

I made a function that takes a word list (one word per line, text
file) and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans

In Python, if you want to build a string from lots of parts you can
use ''.join(parts). I think it is considered more efficient.

>
def fileshift(x):
fin = open(x)
d = {}
for line in fin:
d[line.strip()] = [1]
for i in range(1, 26):
ite = shift(line.strip(), i)
if ite in d:
print ite

You could write:

line = line.strip()

after your for line in fin: statement, rather than strip the line twice.

Let me suggest a different solution base on the str.translate() method

from string import lowercase, maketrans

shifted_lowercase = lowercase[1:] +lowercase[0]
table = string.maketrans(lowercase, shifted_lowercase)

def shift1(line):
return line.translate(table)

def fileshift(path):
lines = set()
for line in open(path):
lines.add(line)
for i in xrange(25):
line = shift1(line)
if line in lines:
print line

(untested)

--
Arnaud

Jun 27 '08 #5

bearophileHUGS

George Sakkis:

A faster algorithm is to create a 'key' for each word, defined as the
tuple of ord differences (modulo 26) of consecutive characters.

Very nice solution, it uses the same strategy used to find anagrams,
where keys are
"".join(sorted(word))
Such general strategy to look for a possible invariant key for the
subsets of the required solutions is quite useful.

Bye,
bearophile

Jun 27 '08 #6

Arnaud Delobelle

be************@lycos.com writes:

George Sakkis:
>A faster algorithm is to create a 'key' for each word, defined as the
tuple of ord differences (modulo 26) of consecutive characters.

Very nice solution, it uses the same strategy used to find anagrams,
where keys are
"".join(sorted(word))
Such general strategy to look for a possible invariant key for the
subsets of the required solutions is quite useful.

Or even:

def get_key(word):
initial = ord(word[0])
return tuple((ord(x) - initial) % 26 for x in word[1:])

--
Arnaud

Jun 27 '08 #7

Gabriel Genellina

En Sun, 04 May 2008 03:35:05 -0300, George Sakkis <ge***********@gmail.comescribió:

On May 4, 2:04*am, "Gabriel Genellina" <gagsl-...@yahoo.com.arwrote:
>En Sun, 04 May 2008 02:17:07 -0300, dave <squareswallo...@invalid.comescribió:

I made a function that takes a word list (one word per line, text file)
and searches for all the words in the list that are 'shifts' of
eachother. *'abc' shifted 1 is 'bcd'

But I'd use a different algorithm. Instead of generating all posible shifts for a given word, I'd substract the newly read word from each previous words (here, "substract two words" means substract the character code for corresponding positions, modulo 26). Shifted words, when substracted, have the same number on all positions.

A faster algorithm is to create a 'key' for each word, defined as the
tuple of ord differences (modulo 26) of consecutive characters. E.g.
the key of 'acf' is (2,3); 'c' is 2 positions after 'a' and 'f' 3
positions after 'c'. Shifted words (and only these) have the same key.

Much better! I like it.

--
Gabriel Genellina

Jun 27 '08 #8

dave

On 2008-05-04 01:10:40 -0600, Arnaud Delobelle <ar*****@googlemail.comsaid:

dave <sq*************@invalid.comwrites:

>Hello,

I made a function that takes a word list (one word per line, text
file) and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans

In Python, if you want to build a string from lots of parts you can
use ''.join(parts). I think it is considered more efficient.

what would be the best way to write a "ans = ans + chr" into a
''.join(parts) ??

Jun 27 '08 #9

George Sakkis

On May 5, 11:02 pm, dave <squareswallo...@yahoo.comwrote:

On 2008-05-04 01:10:40 -0600, Arnaud Delobelle <arno...@googlemail.comsaid:

dave <squareswallo...@invalid.comwrites:

Hello,

I made a function that takes a word list (one word per line, text
file) and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'

Please take a look and tell me if this is a viable solution.

def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans

In Python, if you want to build a string from lots of parts you can
use ''.join(parts). I think it is considered more efficient.

what would be the best way to write a "ans = ans + chr" into a
''.join(parts) ??

Well if you do it once, that would be simply "ans += chr". Arnaud was
referring to the case where you do it in a loop, like in your snippet.
A direct translation to use ''.join would be:

ans = ''.join(chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
for letter in word)

Of course it is simpler and more efficient if you factor out of the
loop the subexpressions that don't need to be recomputed:

ord_a = ord('a')
shift = ord_a - amt
ans = ''.join(chr((ord(letter) - shift) % 26 + ord_a)
for letter in word)

George

Jun 27 '08 #10

Arnaud Delobelle

George Sakkis <ge***********@gmail.comwrites:

On May 5, 11:02 pm, dave <squareswallo...@yahoo.comwrote:
>On 2008-05-04 01:10:40 -0600, Arnaud Delobelle <arno...@googlemail.comsaid:

dave <squareswallo...@invalid.comwrites:

>Hello,

>I made a function that takes a word list (one word per line, text
file) and searches for all the words in the list that are 'shifts' of
eachother. 'abc' shifted 1 is 'bcd'

>Please take a look and tell me if this is a viable solution.

>def shift(word, amt):
ans = ''
for letter in word:
ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
return ans

In Python, if you want to build a string from lots of parts you can
use ''.join(parts). I think it is considered more efficient.

what would be the best way to write a "ans = ans + chr" into a
''.join(parts) ??

Well if you do it once, that would be simply "ans += chr". Arnaud was
referring to the case where you do it in a loop, like in your snippet.
A direct translation to use ''.join would be:

ans = ''.join(chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
for letter in word)

Of course it is simpler and more efficient if you factor out of the
loop the subexpressions that don't need to be recomputed:

ord_a = ord('a')
shift = ord_a - amt
ans = ''.join(chr((ord(letter) - shift) % 26 + ord_a)
for letter in word)

George

Or keep the same layout as your original effort, but build a list of
chars instead of a string, then join them all at the end.

def shift(word, amt):
shifted = ''
ord_a = ord('a')
shift = ord_a - amt
for letter in word:
shifted.append(chr((ord(letter) - shift) % 26 + ord_a))
return ''.join(shifted)

(Which exactly the same as George's snippet, except that the generator
expression is unwound)

--
Arnaud

Jun 27 '08 #11

Similar topics

Strangeness with shifts

by: Glen Able | last post by:

The behaviour of << and >> arithmetic operators (under VC++6, x86) seems rather odd to me. Shifts at runtime seem to only use the bottom 5 bits of the shift amount (an x86 issue I suppose), but...

C / C++

Interesting Quest - Any optimized way to find if consective one's exist in a word!

by: shafique | last post by:

Hello, Can anybody have idea as how to find it there exist consective one's in a word/dword using bitwise operatiors in C language. Obviously this can be done by shifting and looping one by...

C / C++

Variadic functions calling variadic functions with the argument list, HLL bit shifts on LE processors

by: Ross A. Finlayson | last post by:

Hi, I hope you can help me understand the varargs facility. Say I am programming in ISO C including stdarg.h and I declare a function as so: void log_printf(const char* logfilename, const...

C / C++

shifts and masks

by: Dave | last post by:

Hello - Say I have a 32 bit value in data (v:4, r:4, l:8, p:16) unsigned char data; Now, for portabilities sake, I want to use shifts and masks to access these fields, instead of accessing...

C / C++

replace bitfields with shifts and masks

by: Dave | last post by:

I need to rewrite the following using shifts and masks for portability sakes, and to get rid of the BYTE_ORDER issue. /* CODE */ struct test { #if BYTE_ORDER == BIG_ENDIAN unsigned char ...

C / C++

Centered layout shifts

by: larzeb2000 | last post by:

I have constructed a css stylesheet which has a centered layout. The wrapper div is defined as 1000px. It contains a header div for the entire wrapper width, and a left sidebar div and a content...

HTML / CSS

My css layout shifts a bit to the right on certain pages

by: geraldshastri | last post by:

Hi, I am designing xhtml templates using table-less design with div's and CSS positioning. But the problem is that certain pages tend to shift to the left despite the fact that the underlying...

HTML / CSS

left shifts (rationale question)

by: Christopher Layne | last post by:

So I recently ran into a situation where I invoked UB without specifically knowing I did it. Yes, human, I know. What exactly is/was the rationale for not allowing shifts to be the same width of...

C / C++

content shifts when extending below view

by: skijor | last post by:

For a page that display's a catalogue of items in table format. If the number of rows extends below the view, in some browsers (safari, firefox) the page shifts to the left a little bit. It's...

HTML / CSS

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Basic Javascript concepts

by: aa123db | last post by:

Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...

Javascript

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing