469,950 Members | 1,963 Online

Programming challenge: wildcard exclusion in cartesian products

The python code below generates a cartesian product subject to any
logical combination of wildcard exclusions. For example, suppose I want
to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes
'*a*b*' and '*c*d*a*'. See below for details.

CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in
a CAS like maple or mathematica.

#-------------------------------------------------------------------------------
# Short algorithm description
# using function _genAll the program generates
# cartesian product without sets, which match
# some wildcarts
# Sets generation uses recursion ->
# first of all sets will be generated with dimension 1 and than
filtered through wildcarts
# then sets will be generated with dimension 2 and filtered again
# until the required set dimension is reached
# Program avoids explicit generation of some part of CP sets
# if the end of whildcart is asterics (*) and if the first part of
whildcart (without astrics)
# matches current set => then this set will be filtered out and won't
be used in
# higher dimension set generation
# example *,1,*,2,* [1,2] dim = 10
# by dimension 2 only arrays [1,1],[2,1],[2,2] are will be generated
# => array [1,2] won't be used in next recursion levels
#-------------------------------------------------------------------------------
# To obtaine result use function
# CPWithoutWC first parameter is a list of any elements
(char,int,string,class exemplar ,.... any type)
# secont param is CP dimension
# other parameters are wildcarts => lists with any values then may
include
# special value ALL - asterics equivalent
#Example of usage: command line
# >>> import cartesianProduct as cp
# >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2]):
# print i
# [1, 1, 1]
# [1, 2, 1]
# [2, 1, 1]
# [2, 1, 2]
# [2, 2, 1]
# [2, 2, 2]
# >>> for i in
cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
# print i
# ['a', 'a', 'a']
# ['a', 'b', 'a']
# ['b', 'a', 'b']
# ['b', 'b', 'b']
# >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2],[2,cp.ALL,1]):
# print i
# [1, 1, 1]
# [1, 2, 1]
# [2, 1, 2]
# [2, 2, 2]
# >>>
# >>> for i in cp.CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1]):
# print i
## execute immediately
# >>>
# if You don't want to print cp. before ALL and CPWithoutWC use import
like this:
# from cartesianProduct import ALL,CPWithoutWC
# CPWithoutWC is a python generator. Which means that it returns values

# immediately and generate next in next cycle.
# Program example
#
## from cartesianProduct import ALL,CPWithoutWC
## def main():
## for i in
cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
## ## do what You want with current value
## .........
## ## go back to for statement and generate new
## if __name__ == "__main__":
## main()
#
"""
Using logical combinations of WC:
1) It's possible to pass on to the function CPWithoutWC
any number of wildcarts after first two parameters, for example:
CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1],...)
where ... - is any other wildcart's additional function parameters.
Number of additional WC is not limited.
Function will filter out all combinations, which match any passed on
WC.
It's equal to WC1 | WC2 | .... , where | is python analog of OR
logical operations.
2) To use more complex WC combinations follow these steps
a) First of all create all needed WC
b) Then use operators |, & and braces () to create combinations
required and then pass it on to function
CPWithoutWCEx as the third parameter. Don't use "or" and "and"
python statement, otherwise program will
work improper. First two parameters of this function are the same as
of CPWithoutWC function - set of
elements and CP dimension. An example of what was described above in
command line:
from cartesianProduct import ALL,CPWithoutWC,CPWithoutWCEx,WC
a = WC([ALL,1,ALL])
b = WC([ALL,2,ALL])
c = a & b #filter out all sets which match a and b
for i in CPWithoutWCEx([1,2],3,c) : print i [1, 1, 1]
[2, 2, 2] # all sets where both 1 and 2 are present will be filtered out
d = a | b
for i in CPWithoutWCEx([1,2],3,d) : print i
# returns nothing
for i in CPWithoutWCEx([1,2,3],3,d) : print i [3, 3, 3] a = WC([2,1,ALL])
b = WC([1,2,ALL])
c = WC([ALL,2])
d = ( a | b ) & c
for i in CPWithoutWCEx([1,2],3,d) : print i [1, 1, 1]
[1, 1, 2]
[1, 2, 1]
[2, 1, 1]
[2, 2, 1]
[2, 2, 2] # filters out all combinations which start with [1,2] or [2,1] and end with 2

Number of WC, which are used to form logical combinations is not
limited.
"""
"""
13.02.2006
a)Two new function - CPWithoutWCEx_L and CPWithoutWC_L are added.
Their interface is the same as of CPWithoutWCEx and CPWithoutWC
accordingly, except that the third parameter is WC list and
they accept strictly three parameters.

As You can see these functions are very simple because
python is quite flexible => def s(x,y): return x * y
d = [3,2]
s(*d) ## == s(3,2) 6

b)Now WC can take string as parameter, and You can use string
as parameters of functions CPWithoutWC and CPWithoutWC_L
Strings transform into WC according to these rules
1)If first symbol in the string is
alphanumeric (a-z or A-Z or 0-9) or '*'
character the every character of the string will be recognized as
a distinct set element. Examples:
"*A*b3*%^('" == [cp.ALL,'A',cp.ALL.'b','3',cp.ALL,'%','(',"'"]
2)If first character is not (alphanumeric or '*')
it will be treated as a delimitator. Examples:
":a:A:1:*" == ['a','A','1',cp.ALL]
":aA1:*" == ['aA1',cp.ALL]
it's not necessary to write delimitators around the asterics
":aA1*" == ['aA1',cp.ALL]
"%aA%1*" == ['aA','1',cp.ALL]
3)If all non delimit and non asterics character in elements
are digits => they will be treated as numbers.Examples:
"123*" == [1,2,3,cp.ALL]
":12:3*" == [12,3,cp.ALL]
but
":12:a:3*" == ['12','a','3',cp.ALL]
Examples of use: for i in cp.CPWithoutWC(['a','b'],3,'a*b','b*a'): print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b'] for i in cp.CPWithoutWC_L(['a','b'],3,['a*b','b*a']): print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b']
#You can mixe strings and lists for wildcarts for i in cp.CPWithoutWC_L(['a','b'],3,['a*b',['b',cp.ALL,'a']]): print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b'] for i in cp.CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):

print i
['abc', 'abc', 'abc']
['abc', 'xyz', 'abc']
['xyz', 'abc', 'abc']
['xyz', 'abc', 'xyz']
['xyz', 'xyz', 'abc']
['xyz', 'xyz', 'xyz']
"""
#-------------------------------------------------------------------------------
class ALL(object):pass
#-------------------------------------------------------------------------------
class NO_ONE(object):pass
#-------------------------------------------------------------------------------
class BFunctor(object):
def __init__(self,func):
self.func = func
def __call__(self,*dt,**mp):
return self.func(*dt,**mp)
@classmethod
def OR(cls,x,y):
return cls(lambda *dt,**mp : x(*dt,**mp) | y(*dt,**mp))
@classmethod
def AND(cls,x,y):
return cls(lambda *dt,**mp : x(*dt,**mp) & y(*dt,**mp))

#-----------------------------------------------------------------------------
def __or__(self,x):
return BFunctor.OR(self,x)

#-----------------------------------------------------------------------------
def __and__(self,x):
return BFunctor.AND(self,x)
#-------------------------------------------------------------------------------
if len(curr) != 0 and n != 0:
for i in curr:
if n != 1 :
# needed dimension are not reached
# -> we mast tell WC that some other values
# may concatenate in the end of nhead in next recursion levels
# but if WC is ended with asterics (ALL), than dosn't matter
# so i use special walue NO_ONE to resolve this problem
# if WC with final asterics like [1,2,3,ALL] are matched nhead
=>
# they matched nhead + [NO_ONE] to
# but if WC is like [1,ALL,2,3] => they dont match
[1,2,3,NO_ONE] =>
# they don't prevent to generate [1,2,3,4] on next recursion
level
if False == x:
if n == 1 : yield nhead
else:
for i in _genAll(nhead,n - 1,WCF,curr):
yield i
elif n == 0 :
#-------------------------------------------------------------------------------
class WC(object):
def __init__(self,wc):
self.wc = wc
self.transformWC()
self.num_els = 0
self.compress()
self.comphdr = None
self.ln = len(self.wc)

#-----------------------------------------------------------------------------
def transformWC(self):
if self.wc.__class__ not in (str,unicode) : return
if len(self.wc) == 0 : return
if self.wc.isalnum() or self.wc == "*":
wc = self.wc
else:
wc = self.wc[1:].split(self.wc)
nwc = []
for i in wc:
if i == '*' : nwc.append(ALL)
elif '*' in i :
for j in i.split('*'):
if j : nwc.append(j)
nwc.append(ALL)
del nwc[-1]
else : nwc.append(i)
#check if all elements are numbers or *
allnum = True
for i in nwc:
if i is ALL : continue
try : int(i)
except :
allnum = False
break
if allnum:
for i,j in enumerate(nwc):
if j is not ALL:
nwc[i] = int(j)
self.wc = nwc

#-----------------------------------------------------------------------------
return

#-----------------------------------------------------------------------------
def compress(self):
"delete dublicated * values"
if len(self.wc) == 0 : return
wc_ = self.wc[:1]
for i in self.wc[1:]:
if i == ALL and i == wc_[-1] : continue
wc_.append(i)
self.wc = wc_

#-----------------------------------------------------------------------------
def matchExact(self,hd,pos = 0):
if pos == len(self.wc) : return len(hd) == 0
if self.wc[pos] == ALL :
if pos + 1 == len(self.wc) : return True
vl = self.wc[pos + 1]
cpos = -1
while True:
try : cpos = hd.index(vl,cpos + 1)
except : return False
if self.matchExact(hd[cpos + 1:],pos + 2) : return True
else:
if len(hd) == 0 : return False
if hd != self.wc[pos] : return False
return self.matchExact(hd[1:],pos + 1)

#-----------------------------------------------------------------------------
def __or__(self,x):
return BFunctor.OR(self,x)

#-----------------------------------------------------------------------------
def __and__(self,x):
return BFunctor.AND(self,x)

#-----------------------------------------------------------------------------
def __call__(self,hd,st):
return self.matchExact(hd)
#-------------------------------------------------------------------------------
def CPWithoutWCEx(set,n,wc):
for i in _genAll([],n,wc,set) :
yield i
#-------------------------------------------------------------------------------
def CPWithoutWC(set,n,*dt):
if len(dt) == 0 :
wc = lambda hd,st : True
else:
wc = WC(dt)
#print wc.wc
for i in dt[1:]:
wc = wc | WC(i)
for i in _genAll([],n,wc,set) :
yield i
#-------------------------------------------------------------------------------
def CPWithoutWC_L(set,n,WCs):
for i in CPWithoutWC(set,n,*WCs):
yield i
#-------------------------------------------------------------------------------
def CPWithoutWCEx_L(set,n,WCs):
for i in CPWithoutWCEx(set,n,*WCs):
yield i
#-------------------------------------------------------------------------------
def main():
for i in CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):
print i
#-------------------------------------------------------------------------------
if __name__ == "__main__" : main()
#-------------------------------------------------------------------------------

Mar 16 '06 #1
78 4323 wk*******@cox.net wrote:
The python code below generates a cartesian product subject to any
logical combination of wildcard exclusions. For example, suppose I want
to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes
'*a*b*' and '*c*d*a*'. See below for details.

CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in
a CAS like maple or mathematica.

What is your goal? You want to learn or to cause a flamewar? ;-)

Anyway, I found the problem entertaining, so here you go, here is my
Haskell code. It could be shorter if I didn't care about performance and
wrote in specification style. It's not very efficient either, because it
will generate all lists matching the given patterns.

In GHCi you can test it by:

\$ ghci
:l WildCartesian.hs
test

I apologise for the lack of comments.

----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----
module WildCartesian where

import Data.Set (Set)
import qualified Data.Set as Set
import Control.Exception (assert)
import Maybe
import List

data Pat a = All | Lit a deriving Show

generateMatching :: (Ord a) => Int -> Set a -> [Pat a] -> [[a]]
generateMatching 0 _ [] = [[]]
generateMatching 0 _ (_:_) = []
generateMatching len alphabet (Lit x : ps)
| x `Set.member` alphabet =
[ (x : xs) | xs <- generateMatching (len - 1) alphabet ps ]
| otherwise =
[ ]
generateMatching len alphabet (All : ps) =
[ (x : xs)
| x <- Set.toList alphabet
, xs <- unionSorted
(generateMatching (len - 1) alphabet ps)
(generateMatching (len - 1) alphabet (All : ps)) ]
`unionSorted`
generateMatching len alphabet ps
generateMatching _ _ [] = []

generateNotMatching :: (Ord a) => [a] -> Int -> [[Pat a]] -> [[a]]
generateNotMatching alphabet len patterns =
generateMatching len alphaSet [All]
`subtractSorted`
foldr unionSorted []
(map (generateMatching len alphaSet . simplifyPat) patterns)
where
alphaSet = Set.fromList alphabet

simplifyPat (All : All : ps) = simplifyPat (All : ps)
simplifyPat (p : ps) = p : simplifyPat ps
simplifyPat [] = []

joinSorted :: Ord a => [a] -> [a] -> [(Maybe a, Maybe a)]
joinSorted (x1:x2:_) _ | assert (x1 < x2) False = undefined
joinSorted _ (y1:y2:_) | assert (y1 < y2) False = undefined
joinSorted (x:xs) (y:ys) =
case x `compare` y of
LT -> (Just x, Nothing) : joinSorted xs (y:ys)
EQ -> (Just x, Just y) : joinSorted xs ys
GT -> (Nothing, Just y) : joinSorted (x:xs) ys
joinSorted (x:xs) [] = (Just x, Nothing) : joinSorted xs []
joinSorted [] (y:ys) = (Nothing, Just y) : joinSorted [] ys
joinSorted [] [] = []

unionSorted :: Ord a => [a] -> [a] -> [a]
unionSorted xs ys = catMaybes (map (uncurry mplus) (joinSorted xs ys))

subtractSorted :: Ord a => [a] -> [a] -> [a]
subtractSorted xs ys = catMaybes (map f (joinSorted xs ys))
where
f (Just x, Nothing) = Just x
f _ = Nothing

test = do
t [1,2] 3 [[Lit 1, All, Lit 2]]
t ['a','b'] 3 [[Lit 'a', All, Lit 'b'], [Lit 'b', All, Lit 'a']]
t [1,2] 3 [[Lit 1, All, Lit 2], [Lit 2, All, Lit 1]]
where
t a b c = do
putStrLn (concat (intersperse " " ["generateMatching", show a, show b, show c]))
mapM_ (putStrLn . (" "++) . show) (generateNotMatching a b c)
----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----

Best regards
Tomasz

--
I am searching for programmers who are good at least in
(Haskell || ML) && (Linux || FreeBSD || math)
for work in Warsaw, Poland
Mar 16 '06 #2
Tomasz Zielonka wrote:
putStrLn (concat (intersperse " " ["generateMatching", show a, show b, show c]))

Minor correction: it should be "generateNotMatching".

Best regards
Tomasz

--
I am searching for programmers who are good at least in
(Haskell || ML) && (Linux || FreeBSD || math)
for work in Warsaw, Poland
Mar 16 '06 #3
Major correction (missing case):

Tomasz Zielonka wrote:
generateMatching :: (Ord a) => Int -> Set a -> [Pat a] -> [[a]]
generateMatching 0 _ [] = [[]] generateMatching 0 alphabet (All:ps) = generateMatching 0 alphabet ps generateMatching 0 _ (_:_) = []

Best regards
Tomasz

--
I am searching for programmers who are good at least in
(Haskell || ML) && (Linux || FreeBSD || math)
for work in Warsaw, Poland
Mar 16 '06 #4
Flame war? Absolutely not. My reason is to learn. There are many sites
dedicated to reasonably objective comparisons between languages. Here
are two examples:

http://www.smallscript.org/Language%...on%20Chart.asp
http://www.jvoegele.com/software/langcomp.html

The wildcard exclusion problem is interesting enough to have many
distinct, elegant solutions in as many languages. It would be
interesting to see if they converge to roughly the same solution or if
there are essential differences. And your code is a crash course in
Haskell! Tossing aside the 'flame war' inquiry your code response is my
only goal. I hope many others find the problem and responses as
fascinating as I do.

Mar 16 '06 #5
The point is to submit elegant code that showcases the features of each
language. And the problem is, just to clarify, given a set WC of
wildcards in any logical combination, and if WC(S^n) is the set all s
in S^n that matches the wildcards, then efficiently generate the
complement S^n\WC(S^n). You are free to restate the problem in any
equivalent way.

Mar 16 '06 #6
wk*******@cox.net schreef:
There are many sites
dedicated to reasonably objective comparisons between languages. Here
are two examples:

http://www.smallscript.org/Language%...on%20Chart.asp
http://www.jvoegele.com/software/langcomp.html

http://shootout.alioth.debian.org/

--
Affijn, Ruud

"Gewoon is een tijger."
echo 014C8A26C5DB87DBE85A93DBF |perl -pe 'tr/0-9A-F/JunkshoP cartel,/'
Mar 16 '06 #7
Without much testing. Common Lisp

(defun all-lists (list length)
(unless (zerop length)
(if (= length 1) (mapcar #'list list)
(loop for elt in list
nconc
(mapcar (lambda (rest)
(cons elt rest))
(loop for rest on list
nconc (all-lists rest (1- length))))))))

(defun cp-without-wc (source-list &rest patterns)
(let* ((length (length (first patterns)))
(all-lists (all-lists source-list length)))
(dolist (pattern patterns)
(setf all-lists
(set-difference all-lists
(mapcar (lambda (insertion)
(let ((cp (copy-list pattern)))
(loop for place on cp
when (eql :any (car place)) do
(setf (car place) (pop insertion)))
cp))
(all-lists source-list (count :any pattern)))
:test #'equal)))
(remove-duplicates all-lists :test #'equal)))

CL-USER 22 > (cp-without-wc '(a b) '(a :any b) '(b :any a))
((A A A) (A B A) (B A B) (B B B))

CL-USER 23 > (cp-without-wc '(abc xyz) '(abc :any xyz))
((XYZ XYZ XYZ) (XYZ XYZ ABC) (XYZ ABC XYZ) (XYZ ABC ABC) (ABC XYZ ABC) (ABC ABC ABC))

CL-USER 24 > (cp-without-wc '(a b) '(a :any :any))
((B B B) (B B A) (B A B) (B A A))

CL-USER 25 > (cp-without-wc '(a b) '(a :any :any) '(b :any :any))
NIL

CL-USER 26 > (cp-without-wc '(a b) '(:any :any b))
((B B A) (B A A) (A B A) (A A A))

CL-USER 27 >

Mar 16 '06 #8
What I have in mind is the efficient, <enumerated> generation of the
complement S^n/WC(S^n). A good program should initialize, generate, and
terminate.

T=cartprodex(S,n,WC); //initialize
for all i in T do
what you want with i
test to see if any more
terminate if not

and it should do this without explicitly generating WC and then
complementing. For example, if the cardinality of S is m, and the WC is
just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality
(m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016
while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general
the program should directly generate EX from arbitrary WC. Of course,
in practice the WC should themselves occur in a logically consistent
manner, but let's just assume they're a given.

Mar 16 '06 #9
here is my version of the same.

REPL output:

CL-USER> (tests)
set = (1 2)
n = 3
patterns = ((1 ANY 2))
-----------------------
(1 1 1)
(1 2 1)
(2 1 1)
(2 1 2)
(2 2 1)
(2 2 2)
set = (A B)
n = 3
patterns = ((A ANY B) (B ANY A))
-----------------------
(A A A)
(A B A)
(B A B)
(B B B)
set = (1 2)
n = 3
patterns = ((1 ANY 2) (2 ANY 1))
-----------------------
(1 1 1)
(1 2 1)
(2 1 2)
(2 2 2)
NIL
CL-USER>

source:

;;;; cartesian products minus wildcard patterns per:
;;;;
;;;; >Newsgroups: comp.lang.lisp, etc...
;;;; >Subject: Programming challenge: wildcard exclusion in cartesian
products
;;;; >Date: 16 Mar 2006 03:14:23 -0800
;;;;
;;;;

(defun show-me (x) (format t "~A~%" x))

(defun set^n (fn set n &optional acc)
"call `fn' on each permutation of `set' raised to the `n' power"
(if (<= n 0)
(funcall fn (reverse acc))
(dolist (e set)
(set^n fn set (- n 1) (cons e acc)))))

;; test set^n by printing and visually inspecting the result
(defun pr-set^n (set n) (set^n #'show-me set n))

;; curry `set^n' so that `fn' is the only parameter
(defun set^n-gen (set n)
(lambda (fn) (set^n fn set n)))

(defun mk-matchl-p (pat-list)
"return a function that tests a value against the patterns in
`pat-list'"
(labels ((matchp (pat val)
(cond ((null pat) t)
((or (eq (car pat) (car val))
(eq (car pat) :any))
(matchp (cdr pat) (cdr val))))))
(lambda (val)
"predicate: return true if val matches any pattern in `pat-list'"
(dolist (p pat-list)
(if (matchp p val)
(return t))))))

(defun not-fp (f-pred)
"return the complement of predicate `f-pred'"
(lambda (x) (not (funcall f-pred x))))

;; f-gen is a generator of the form returned by set^n-gen
(defun accumulate-if (f-gen f-pred)
"accumulate values generated by f-gen that satisfy f-pred"
(let (acc)
(funcall f-gen (lambda (x) (if (funcall f-pred x) (push x acc))))
(nreverse acc)))

;; `pr-set^n-withoutWC' is the lisp equivalent (more or less) of
;; python code:
;; >>> for i in cp.CPWithoutWC(x,y,z): print i
(defun pr-set^n-withoutWC (set n pat-list)
(format t "~%~%set = ~A~%n = ~A~%patterns = ~A~%~A~%"
set n pat-list "-----------------------")
(dolist (e (accumulate-if (set^n-gen set n)
(not-fp (mk-matchl-p pat-list))))
(format t "~A~%" e)))

(defun tests ()
"generate test output per the original problem examples"
(pr-set^n-withoutWC '(1 2) 3 '((1 :any 2)))
(pr-set^n-withoutWC '(a b) 3 '((a :any b) (b :any a)))
(pr-set^n-withoutWC '(1 2) 3 '((1 :any 2) (2 :any 1))))

Mar 16 '06 #10
NOTE: I am a lisp newbie. I'm sure our resident lisp experts can
create much better (both faster, shorter and clearer) solutions than
the one above.

Even I could have created something shorter but I thought it would be
fun to apply the "utility function" approach in decomposing the
problem.

--jfc

Mar 16 '06 #11
in k:

cp:{[c;n;p]+(n#c)_vs(!_ c^n)_dvl,/{2_sv+(,/,/:\:)/(),/:@[x;&x=-1;:[;!c]]}'p}

examples:

cp[2;3;,0 -1 1]
(0 0 0
0 1 0
1 0 0
1 0 1
1 1 0
1 1 1)

cp[2;3;(0 -1 1;1 -1 0)]
(0 0 0
0 1 0
1 0 1
1 1 1)

cp[2;3;(0 -1 1;1 -1 1)]
(0 0 0
0 1 0
1 0 0
1 1 0)

arguments of cp:

c = cardinality of the input set
n = power
p = list of patterns (-1 = wildcard)

the algorithm directly computes the target set. in other words,
it does not generate the set, then filter the matches from the
target.

modifying cp to accept s instead of the cardinality of s,
patterns expressed in terms of elements of s, &c. adds nothing
of interest to the problem.

<wk*******@cox.net> wrote in message news:11**********************@v46g2000cwv.googlegr oups.com...
The python code below generates a cartesian product subject to any
logical combination of wildcard exclusions. For example, suppose I want
to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes
'*a*b*' and '*c*d*a*'. See below for details.

CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in
a CAS like maple or mathematica.

#-------------------------------------------------------------------------------
# Short algorithm description
# using function _genAll the program generates
# cartesian product without sets, which match
# some wildcarts
# Sets generation uses recursion ->
# first of all sets will be generated with dimension 1 and than
filtered through wildcarts
# then sets will be generated with dimension 2 and filtered again
# until the required set dimension is reached
# Program avoids explicit generation of some part of CP sets
# if the end of whildcart is asterics (*) and if the first part of
whildcart (without astrics)
# matches current set => then this set will be filtered out and won't
be used in
# higher dimension set generation
# example *,1,*,2,* [1,2] dim = 10
# by dimension 2 only arrays [1,1],[2,1],[2,2] are will be generated
# => array [1,2] won't be used in next recursion levels
#-------------------------------------------------------------------------------
# To obtaine result use function
# CPWithoutWC first parameter is a list of any elements
(char,int,string,class exemplar ,.... any type)
# secont param is CP dimension
# other parameters are wildcarts => lists with any values then may
include
# special value ALL - asterics equivalent
#Example of usage: command line
# >>> import cartesianProduct as cp
# >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2]):
# print i
# [1, 1, 1]
# [1, 2, 1]
# [2, 1, 1]
# [2, 1, 2]
# [2, 2, 1]
# [2, 2, 2]
# >>> for i in
cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
# print i
# ['a', 'a', 'a']
# ['a', 'b', 'a']
# ['b', 'a', 'b']
# ['b', 'b', 'b']
# >>> for i in cp.CPWithoutWC([1,2],3,[1,cp.ALL,2],[2,cp.ALL,1]):
# print i
# [1, 1, 1]
# [1, 2, 1]
# [2, 1, 2]
# [2, 2, 2]
# >>>
# >>> for i in cp.CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1]):
# print i
## execute immediately
# >>>
# if You don't want to print cp. before ALL and CPWithoutWC use import
like this:
# from cartesianProduct import ALL,CPWithoutWC
# CPWithoutWC is a python generator. Which means that it returns values

# immediately and generate next in next cycle.
# Program example
#
## from cartesianProduct import ALL,CPWithoutWC
## def main():
## for i in
cp.CPWithoutWC(['a','b'],3,['a',cp.ALL,'b'],['b',cp.ALL,'a']):
## ## do what You want with current value
## .........
## ## go back to for statement and generate new
## if __name__ == "__main__":
## main()
#
"""
Using logical combinations of WC:
1) It's possible to pass on to the function CPWithoutWC
any number of wildcarts after first two parameters, for example:
CPWithoutWC([1,2],121212,[1,cp.ALL],[2,cp.ALL,1],...)
where ... - is any other wildcart's additional function parameters.
Number of additional WC is not limited.
Function will filter out all combinations, which match any passed on
WC.
It's equal to WC1 | WC2 | .... , where | is python analog of OR
logical operations.
2) To use more complex WC combinations follow these steps
a) First of all create all needed WC
b) Then use operators |, & and braces () to create combinations
required and then pass it on to function
CPWithoutWCEx as the third parameter. Don't use "or" and "and"
python statement, otherwise program will
work improper. First two parameters of this function are the same as
of CPWithoutWC function - set of
elements and CP dimension. An example of what was described above in
command line:
>>> from cartesianProduct import ALL,CPWithoutWC,CPWithoutWCEx,WC
>>> a = WC([ALL,1,ALL])
>>> b = WC([ALL,2,ALL])
>>> c = a & b #filter out all sets which match a and b
>>> for i in CPWithoutWCEx([1,2],3,c) : print i [1, 1, 1]
[2, 2, 2] >>> # all sets where both 1 and 2 are present will be filtered out
>>> d = a | b
>>> for i in CPWithoutWCEx([1,2],3,d) : print i
>>> # returns nothing
>>> for i in CPWithoutWCEx([1,2,3],3,d) : print i [3, 3, 3] >>> a = WC([2,1,ALL])
>>> b = WC([1,2,ALL])
>>> c = WC([ALL,2])
>>> d = ( a | b ) & c
>>> for i in CPWithoutWCEx([1,2],3,d) : print i [1, 1, 1]
[1, 1, 2]
[1, 2, 1]
[2, 1, 1]
[2, 2, 1]
[2, 2, 2] >>> # filters out all combinations which start with [1,2] or [2,1] and end with 2

Number of WC, which are used to form logical combinations is not
limited.
"""
"""
13.02.2006
a)Two new function - CPWithoutWCEx_L and CPWithoutWC_L are added.
Their interface is the same as of CPWithoutWCEx and CPWithoutWC
accordingly, except that the third parameter is WC list and
they accept strictly three parameters.

As You can see these functions are very simple because
python is quite flexible => >>> def s(x,y): return x * y
>>> d = [3,2]
>>> s(*d) ## == s(3,2) 6

b)Now WC can take string as parameter, and You can use string
as parameters of functions CPWithoutWC and CPWithoutWC_L
Strings transform into WC according to these rules
1)If first symbol in the string is
alphanumeric (a-z or A-Z or 0-9) or '*'
character the every character of the string will be recognized as
a distinct set element. Examples:
"*A*b3*%^('" == [cp.ALL,'A',cp.ALL.'b','3',cp.ALL,'%','(',"'"]
2)If first character is not (alphanumeric or '*')
it will be treated as a delimitator. Examples:
":a:A:1:*" == ['a','A','1',cp.ALL]
":aA1:*" == ['aA1',cp.ALL]
it's not necessary to write delimitators around the asterics
":aA1*" == ['aA1',cp.ALL]
"%aA%1*" == ['aA','1',cp.ALL]
3)If all non delimit and non asterics character in elements
are digits => they will be treated as numbers.Examples:
"123*" == [1,2,3,cp.ALL]
":12:3*" == [12,3,cp.ALL]
but
":12:a:3*" == ['12','a','3',cp.ALL]
Examples of use: for i in cp.CPWithoutWC(['a','b'],3,'a*b','b*a'): print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b'] for i in cp.CPWithoutWC_L(['a','b'],3,['a*b','b*a']): print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b']
#You can mixe strings and lists for wildcarts for i in cp.CPWithoutWC_L(['a','b'],3,['a*b',['b',cp.ALL,'a']]): print i
['a', 'a', 'a']
['a', 'b', 'a']
['b', 'a', 'b']
['b', 'b', 'b'] for i in cp.CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):

print i
['abc', 'abc', 'abc']
['abc', 'xyz', 'abc']
['xyz', 'abc', 'abc']
['xyz', 'abc', 'xyz']
['xyz', 'xyz', 'abc']
['xyz', 'xyz', 'xyz']
"""
#-------------------------------------------------------------------------------
class ALL(object):pass
#-------------------------------------------------------------------------------
class NO_ONE(object):pass
#-------------------------------------------------------------------------------
class BFunctor(object):
def __init__(self,func):
self.func = func
def __call__(self,*dt,**mp):
return self.func(*dt,**mp)
@classmethod
def OR(cls,x,y):
return cls(lambda *dt,**mp : x(*dt,**mp) | y(*dt,**mp))
@classmethod
def AND(cls,x,y):
return cls(lambda *dt,**mp : x(*dt,**mp) & y(*dt,**mp))

#-----------------------------------------------------------------------------
def __or__(self,x):
return BFunctor.OR(self,x)

#-----------------------------------------------------------------------------
def __and__(self,x):
return BFunctor.AND(self,x)
#-------------------------------------------------------------------------------
if len(curr) != 0 and n != 0:
for i in curr:
if n != 1 :
# needed dimension are not reached
# -> we mast tell WC that some other values
# may concatenate in the end of nhead in next recursion levels
# but if WC is ended with asterics (ALL), than dosn't matter
# so i use special walue NO_ONE to resolve this problem
# if WC with final asterics like [1,2,3,ALL] are matched nhead
=>
# they matched nhead + [NO_ONE] to
# but if WC is like [1,ALL,2,3] => they dont match
[1,2,3,NO_ONE] =>
# they don't prevent to generate [1,2,3,4] on next recursion
level
if False == x:
if n == 1 : yield nhead
else:
for i in _genAll(nhead,n - 1,WCF,curr):
yield i
elif n == 0 :
#-------------------------------------------------------------------------------
class WC(object):
def __init__(self,wc):
self.wc = wc
self.transformWC()
self.num_els = 0
self.compress()
self.comphdr = None
self.ln = len(self.wc)

#-----------------------------------------------------------------------------
def transformWC(self):
if self.wc.__class__ not in (str,unicode) : return
if len(self.wc) == 0 : return
if self.wc.isalnum() or self.wc == "*":
wc = self.wc
else:
wc = self.wc[1:].split(self.wc)
nwc = []
for i in wc:
if i == '*' : nwc.append(ALL)
elif '*' in i :
for j in i.split('*'):
if j : nwc.append(j)
nwc.append(ALL)
del nwc[-1]
else : nwc.append(i)
#check if all elements are numbers or *
allnum = True
for i in nwc:
if i is ALL : continue
try : int(i)
except :
allnum = False
break
if allnum:
for i,j in enumerate(nwc):
if j is not ALL:
nwc[i] = int(j)
self.wc = nwc

#-----------------------------------------------------------------------------
return

#-----------------------------------------------------------------------------
def compress(self):
"delete dublicated * values"
if len(self.wc) == 0 : return
wc_ = self.wc[:1]
for i in self.wc[1:]:
if i == ALL and i == wc_[-1] : continue
wc_.append(i)
self.wc = wc_

#-----------------------------------------------------------------------------
def matchExact(self,hd,pos = 0):
if pos == len(self.wc) : return len(hd) == 0
if self.wc[pos] == ALL :
if pos + 1 == len(self.wc) : return True
vl = self.wc[pos + 1]
cpos = -1
while True:
try : cpos = hd.index(vl,cpos + 1)
except : return False
if self.matchExact(hd[cpos + 1:],pos + 2) : return True
else:
if len(hd) == 0 : return False
if hd != self.wc[pos] : return False
return self.matchExact(hd[1:],pos + 1)

#-----------------------------------------------------------------------------
def __or__(self,x):
return BFunctor.OR(self,x)

#-----------------------------------------------------------------------------
def __and__(self,x):
return BFunctor.AND(self,x)

#-----------------------------------------------------------------------------
def __call__(self,hd,st):
return self.matchExact(hd)
#-------------------------------------------------------------------------------
def CPWithoutWCEx(set,n,wc):
for i in _genAll([],n,wc,set) :
yield i
#-------------------------------------------------------------------------------
def CPWithoutWC(set,n,*dt):
if len(dt) == 0 :
wc = lambda hd,st : True
else:
wc = WC(dt)
#print wc.wc
for i in dt[1:]:
wc = wc | WC(i)
for i in _genAll([],n,wc,set) :
yield i
#-------------------------------------------------------------------------------
def CPWithoutWC_L(set,n,WCs):
for i in CPWithoutWC(set,n,*WCs):
yield i
#-------------------------------------------------------------------------------
def CPWithoutWCEx_L(set,n,WCs):
for i in CPWithoutWCEx(set,n,*WCs):
yield i
#-------------------------------------------------------------------------------
def main():
for i in CPWithoutWC_L(['abc','xyz'],3,[':abc*xyz']):
print i
#-------------------------------------------------------------------------------
if __name__ == "__main__" : main()
#-------------------------------------------------------------------------------

Mar 16 '06 #12
"wk*******@cox.net" <wk*******@cox.net> writes:
The python code below generates a cartesian product subject to any
logical combination of wildcard exclusions. For example, suppose I want
to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes
'*a*b*' and '*c*d*a*'. See below for details.

I'm afraid that different programs in this thread has understood the
asterisk differently: that it matches any single element, or that it
matches any sequence of elements.

--
__("< Marcin Kowalczyk
\__/ qr****@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/
Mar 16 '06 #13
The asterisk '*' matches any sequence of elements, not just one
element. The wildcard '*1*2*' would then correspond to a tuple with a 1
preceding a 2 in any positions. The wc '1*2' would correspond to a
starting 1 and an ending 2 with anything in between. The wc *12* would
correspond to a 1 adjacent to a 2 with the pair in any position.
Possibilities like '*a*a*b*' and '*a*a*a*' of any length are also
allowed. If n is the dimension, then any n-tuple wc is just a point. My
apologies for the confusion.

Mar 17 '06 #14
Is this Haskell implementation what you want? It does the wildcard
matching through a state machine and it essentially threads the
state machine through the cartesian product, switching to the
ordinary cartesian product when possible as an optimisation.
The execution of the state machine is shared by strings with the
same prefix making it reasonably efficient even though the state
machine itself isn't optimised.

If it doesn't work, I'm sure it's only a few typos away...

-- generate strings of length n from alphabet l such that
-- the state machine, with transition function t, is not on
-- a final state (determined by function f) at the
-- end of the string.
-- If the state is ever 'unmatchable' (as determined by u)
-- we just return the cartesian product as no rejection
-- can take place.
generate f u t s 0 l = if f s then [] else [[]]
generate f u t s n l | u s = sequence (replicate n l)
| otherwise =
[a:b | a <- l, let s' = t s a,
b <- generate f u t s' (n-1) l]

-- The states are lists of regular expressions
-- where [a,b,..] means match a or b or...

-- This is the transition function for our machine.
transition pat a = pat >>= d a where
-- Brzozowski derivative
d a [] = []
d a p@('*':pat) = p:d a pat
d a (p:pat) | a==p = [pat]
| otherwise = []

-- A terminal state is one that matches the null string
terminal p = or \$ map terminal' p where
terminal' "" = True
terminal' ('*':p) = terminal' p
terminal' _ = False

run n alphabet pat =
generate terminal null transition [pat] n alphabet

test = run 3 "abc" "aa*a"

Mar 17 '06 #15
wk*******@cox.net wrote:
What I have in mind is the efficient, <enumerated> generation of the
complement S^n/WC(S^n). A good program should initialize, generate, and
terminate.

T=cartprodex(S,n,WC); //initialize
for all i in T do
what you want with i
test to see if any more
terminate if not

and it should do this without explicitly generating WC and then
complementing. For example, if the cardinality of S is m, and the WC is
just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality
(m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016
while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general
the program should directly generate EX from arbitrary WC. Of course,
in practice the WC should themselves occur in a logically consistent
manner, but let's just assume they're a given.

Another attempt. I have made no special attempt to create an
exclusion language, just used an anonymous lambda predicate.

(defclass odometer ()
((base :initform 0 :accessor base)
(meter :initform nil :accessor meter)
(n-digits :initarg :n-digits :accessor n-digits)
(digit-set :initarg :digit-set :accessor digit-set)))

(defmethod initialize-instance :after ((obj odometer) &rest initargs)
(setf (base obj) (length (digit-set obj))
(meter obj) (make-array (n-digits obj) :initial-element 0)
(digit-set obj) (coerce (digit-set obj) 'vector)))

(defun inc-odometer (odometer)
(loop with carry = 1
for i from (1- (n-digits odometer)) downto 0
for digit = (incf (aref (meter odometer) i) carry)
if (= digit (base odometer)) do
(setf (aref (meter odometer) i) 0)
(setf carry 1)
else do
(setf carry 0)
while (not (zerop carry))))

(defun zero-meter-p (odometer)
(every #'zerop (meter odometer)))

(defmethod next-set ((obj odometer))
(prog1 (map 'list (lambda (digit)
(aref (digit-set obj) digit))
(meter obj))
(inc-odometer obj)))

(defclass cs-with-wc (odometer)
((exclusion :initarg :exclusion :accessor exclusion)
(at-end :initform nil :accessor at-end)))

(defmethod next-set ((obj odometer))
(tagbody
:next
(unless (at-end obj)
(let ((set (call-next-method)))
(when (zero-meter-p obj) (setf (at-end obj) t))
(if (not (funcall (exclusion obj) set))
(return-from next-set set)
(go :next))))))

(defun print-all-cs (set length exclusion)
(let ((cs-with-wc (make-instance 'cs-with-wc :n-digits length :digit-set set
:exclusion exclusion)))
(loop for set = (next-set cs-with-wc)
while set do (print set))))

CL-USER 134 > (cs-with-wc '(a b) 3 (lambda (set)
(destructuring-bind (x y z)
set
(or (and (eql x 'a) (eql z 'b))
(and (eql x 'b) (eql z 'a))))))

(A A A)
(A B A)
(B A B)
(B B B)
NIL

CL-USER 135 > (cs-with-wc '(a b) 3 (lambda (set)
(eql (second set) 'a)))

(A B A)
(A B B)
(B B A)
(B B B)
NIL

CL-USER 136 > (cs-with-wc '(abc xyz) 3 (lambda (set)
(and (eql (first set) 'abc)
(eql (third set) 'xyz))))

(ABC ABC ABC)
(ABC XYZ ABC)
(XYZ ABC ABC)
(XYZ ABC XYZ)
(XYZ XYZ ABC)
(XYZ XYZ XYZ)
NIL
Mar 17 '06 #16
Oops, problems cutting an pasting, should be,

(defclass odometer ()
((base :initform 0 :accessor base)
(meter :initform nil :accessor meter)
(n-digits :initarg :n-digits :accessor n-digits)
(digit-set :initarg :digit-set :accessor digit-set)))

(defmethod initialize-instance :after ((obj odometer) &rest initargs)
(setf (base obj) (length (digit-set obj))
(meter obj) (make-array (n-digits obj) :initial-element 0)
(digit-set obj) (coerce (digit-set obj) 'vector)))

(defun inc-odometer (odometer)
(loop with carry = 1
for i from (1- (n-digits odometer)) downto 0
for digit = (incf (aref (meter odometer) i) carry)
if (= digit (base odometer)) do
(setf (aref (meter odometer) i) 0)
(setf carry 1)
else do
(setf carry 0)
while (not (zerop carry))))

(defun zero-meter-p (odometer)
(every #'zerop (meter odometer)))

(defmethod next-set ((obj odometer))
(prog1 (map 'list (lambda (digit)
(aref (digit-set obj) digit))
(meter obj))
(inc-odometer obj)))

(defclass cs-with-wc (odometer)
((exclusion :initarg :exclusion :accessor exclusion)
(at-end :initform nil :accessor at-end)))

(defmethod next-set ((obj cs-with-wc))
(tagbody
:next
(unless (at-end obj)
(let ((set (call-next-method)))
(when (zero-meter-p obj) (setf (at-end obj) t))
(if (not (funcall (exclusion obj) set))
(return-from next-set set)
(go :next))))))

(defun print-all-cs (set length exclusion)
(let ((cs-with-wc (make-instance 'cs-with-wc :n-digits length :digit-set set
:exclusion exclusion)))
(loop for set = (next-set cs-with-wc)
while set do (print set))))

CL-USER 7 > (print-all-cs '(a b) 3 (lambda (set)
(destructuring-bind (x y z)
set
(or (and (eql x 'a) (eql z 'b))
(and (eql x 'b) (eql z 'a))))))

(A A A)
(A B A)
(B A B)
(B B B)
NIL

CL-USER 8 > (print-all-cs '(abc xyz) 3 (lambda (set)
(and (eql (first set) 'abc)
(eql (third set) 'xyz))))

(ABC ABC ABC)
(ABC XYZ ABC)
(XYZ ABC ABC)
(XYZ ABC XYZ)
(XYZ XYZ ABC)
(XYZ XYZ XYZ)
NIL

CL-USER 9 >
Mar 17 '06 #17
-- The states are lists of regular expressions
-- where [a,b,..] means match a or b or...

I haven't run or studied your program yet myself but what I had in mind
was that the list of wc's are *all* to be excluded, so the list
[wc1..wcn] is to correspond generating all tuples matching not(wc1 and
... and wcn). Maybe you're already doing that. The wc's themselves
could be logical statements among the 'primitive' wc's. That's why I
named it the 'wildcard exclusion problem'. It's a lot easier to specify
a list of simpler wc's than create a long logical expression.

Thanks to all who are making this an interesting thread.

Mar 17 '06 #18
Heh, here's a Prolog version:

================================================== ========

gen( _, 0, [] ) :- !.
gen( S, N, [H | T] ) :- member( H, S ), M is N - 1, gen( S, M, T ).

================================================== ========

Yep, that's it :)))

Here's how to test it:

================================================== ========

1 ?- gen([a, b, c], 3, X), print(X), nl, fail.
[a, a, a]
[a, a, b]
[a, a, c]
[a, b, a]
[a, b, b]
[a, b, c]
[a, c, a]
[a, c, b]
[a, c, c]
[b, a, a]
[b, a, b]
[b, a, c]
[b, b, a]
[b, b, b]
[b, b, c]
[b, c, a]
[b, c, b]
[b, c, c]
[c, a, a]
[c, a, b]
[c, a, c]
[c, b, a]
[c, b, b]
[c, b, c]
[c, c, a]
[c, c, b]
[c, c, c]

No
2 ?- gen([a, b, c], 3, X), not(member(X, [[a, _, _], [_, b, _], [_, _,
c]])), print(X), nl, fail.
[b, a, a]
[b, a, b]
[b, c, a]
[b, c, b]
[c, a, a]
[c, a, b]
[c, c, a]
[c, c, b]

No

================================================== ========

Mar 17 '06 #19
Dan Piponi wrote:
Is this Haskell implementation what you want? It does the wildcard
matching through a state machine and it essentially threads the
state machine through the cartesian product, switching to the
ordinary cartesian product when possible as an optimisation.
The execution of the state machine is shared by strings with the
same prefix making it reasonably efficient even though the state
machine itself isn't optimised.

I've implemented the same concept yesterday evening:

----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----
module WildCartesian where

import List

data Pat a = All | Lit a deriving (Show, Eq)

advancePattern :: Eq a => a -> [Pat a] -> [[Pat a]]
advancePattern y (Lit x : ps)
| x == y = [ps]
| otherwise = []
advancePattern y (All : ps) = [All : ps] ++ [ps] ++ advancePattern y ps

generateNotMatching :: Eq a => [a] -> Int -> [[Pat a]] -> [[a]]
generateNotMatching alphabet = gen []
where
gen _ n pats
| any (\ps -> all (== All) ps && (not (null ps) || n == 0)) pats = []
gen acc 0 _
= [reverse acc]
gen acc n pats
= [ w | x <- alphabet
, let pats' = [ p' | p <- pats, p' <- advancePattern x p ]
, w <- gen (x : acc) (n - 1) pats' ]

test :: IO ()
test = do
t [1,2] 3 [[Lit 1, All, Lit 2]]
t ['a','b'] 3 [[Lit 'a', All, Lit 'b'], [Lit 'b', All, Lit 'a']]
t [1,2] 3 [[Lit 1, All, Lit 2], [Lit 2, All, Lit 1]]
where
t a b c = do
putStrLn (concat (intersperse " " ["generateNotMatching", show a, show b, show c]))
mapM_ (putStrLn . (" "++) . show) (generateNotMatching a b c)
----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----

Best regards
Tomasz

--
I am searching for programmers who are good at least in
(Haskell || ML) && (Linux || FreeBSD || math)
for work in Warsaw, Poland
Mar 17 '06 #20
wk*******@cox.net wrote:
-- The states are lists of regular expressions
-- where [a,b,..] means match a or b or...

I haven't run or studied your program yet myself but what I had in mind
was that the list of wc's are *all* to be excluded, so the list
[wc1..wcn] is to correspond generating all tuples matching not(wc1 and
.. and wcn). Maybe you're already doing that. The wc's themselves
could be logical statements among the 'primitive' wc's. That's why I
named it the 'wildcard exclusion problem'. It's a lot easier to specify
a list of simpler wc's than create a long logical expression.

I missed "any logical combination" :-(

It would be quite easy to fix my first program, but I don't have the
time to do it right now.

Best regards
Tomasz

--
I am searching for programmers who are good at least in
(Haskell || ML) && (Linux || FreeBSD || math)
for work in Warsaw, Poland
Mar 17 '06 #21
It would seem that your program is just filtering the full cartesian
product, right? The solution I'm looking for generates the elements
one-by-one so that it could be used in a loop.

Mar 17 '06 #23
wk*******@cox.net said:
I haven't run or studied your program yet myself but what I had in mind
was that the list of wc's are *all* to be excluded

I think it already does what you want. You might want to change

run n alphabet pat =
generate terminal null transition [pat] n alphabet to

to

run n alphabet pat =
generate terminal null transition pat n alphabet

to allow lists of patterns where the patterns in a list are ORed
together.

Mar 17 '06 #24
wk*******@cox.net wrote:
It would seem that your program is just filtering the full cartesian
product, right? The solution I'm looking for generates the elements
one-by-one so that it could be used in a loop.

Oops...missed that part.

It took me a while to study the exchange on this topic more thoroughly,
and I now fully appreciate the fact that the problem calls for a much
more sophisticated approach.

Sorry for the hasty shot, I'll give it another shortly.

Cheers,

Dinko

Mar 17 '06 #25
sa wrote:
in k:

cp:{[c;n;p]+(n#c)_vs(!_ c^n)_dvl,/{2_sv+(,/,/:\:)/(),/:@[x;&x=-1;:[;!c]]}'p}
That one goes a long way as a proof of eg evolution theory, you know,
monkeys reproducing shakespeare with a typewriter k-board and all that :)

examples:

cp[2;3;,0 -1 1]
(0 0 0
0 1 0
1 0 0
1 0 1
1 1 0
1 1 1)

cp[2;3;(0 -1 1;1 -1 0)]
(0 0 0
0 1 0
1 0 1
1 1 1)

cp[2;3;(0 -1 1;1 -1 1)]
(0 0 0
0 1 0
1 0 0
1 1 0)

arguments of cp:

c = cardinality of the input set
n = power
p = list of patterns (-1 = wildcard)

the algorithm directly computes the target set. in other words,
it does not generate the set, then filter the matches from the
target.

modifying cp to accept s instead of the cardinality of s,
patterns expressed in terms of elements of s, &c. adds nothing
of interest to the problem.

Mar 17 '06 #26
Tomasz Zielonka said:
I've implemented the same concept yesterday evening...

It's uncanny reading such similar code coming from another person!

Mar 17 '06 #27

wk*******@cox.net wrote:
What I have in mind is the efficient, <enumerated> generation of the
complement S^n/WC(S^n). A good program should initialize, generate, and
terminate.

T=cartprodex(S,n,WC); //initialize
for all i in T do
what you want with i
test to see if any more
terminate if not

and it should do this without explicitly generating WC and then
complementing. For example, if the cardinality of S is m, and the WC is
just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality
(m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016
while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general
the program should directly generate EX from arbitrary WC. Of course,
in practice the WC should themselves occur in a logically consistent
manner, but let's just assume they're a given.

Another attempt. I have made no special attempt to create an
exclusion language, just used an anonymous lambda predicate.

FWIW, here's my Q-and-D pattern matcher (only partially tested).

(defun match(list pattern &optional (test #'eql))
"Match a list of atoms against a pattern list
using :all as a 0-to-many wildcard, :single as a
1-to-1 wildcard, a list of elements or a single
element to match a specific place. Optional
argument test for comparing elements (default eql).

Returns: T if match is made, NIL otherwise.

Examples: (match '(0 1 2 3 4 5) '(:all (2 3) 3 :single 5 :all)) => T
(match '(0 1 2 3 4 5) '(:all (2 3) 3 :single 5 :single)) =>
NIL"
(let ((current (first pattern))
(next-pattern (rest pattern))
(candidate (first list)))
(cond ((and (null pattern) (null list))
t)
((and (eq :single current) candidate)
(match (rest list) next-pattern test))
((eq :all current)
(loop for new-list on list
when (match new-list next-pattern test)
do (return-from match t))
(null next-pattern)) ; last case null remainder
((if(atom current)
(funcall test candidate current)
(member candidate current :test test))
(match (rest list) next-pattern test)))))

--
Geoff

Mar 17 '06 #28
"wk*******@cox.net" <wk*******@cox.net> writes:
What I have in mind is the efficient, <enumerated> generation of the
complement S^n/WC(S^n). A good program should initialize, generate, and
terminate.

T=cartprodex(S,n,WC); //initialize
for all i in T do
what you want with i
test to see if any more
terminate if not

and it should do this without explicitly generating WC and then
complementing. For example, if the cardinality of S is m, and the WC is
just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality
(m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016
while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general
the program should directly generate EX from arbitrary WC. Of course,
in practice the WC should themselves occur in a logically consistent
manner, but let's just assume they're a given.

The following code doesn't build a data structure. It
recurses n deep and those n stack frames contain state that
tracks progress through the product. But it still generates
and tests, taking time proportional to S^n. Is this what you

;; A macro to let you loop over the S^n possibilities
;; without building a big data structure
;; The macro is structured as syntactic sugar on
;; a higher order function
;;
(defmacro cartesian-power ((variable set exponent) &body code)
(let ((body-function (gensym)))
`(flet ((,body-function(,variable),@code))
(cartesian-power-hof ,set ,exponent '() (function ,body-function)))))

;; The standard idea of using recursion to implement a nest
;; of loops of indefinite depth
;;
(defun cartesian-power-hof (set exponent prefix f)
(if (zerop exponent)
(funcall f prefix)
(dolist (item set)
(cartesian-power-hof set
(- exponent 1)
(cons item prefix)
f))))

;; A simple recursive pattern match
;; I haven't thought through the implications
;; I guess that it is exponentially slow on some long
;; patterns
;;
(defun wild-match (pattern data)
(cond ((endp pattern) (endp data))
((endp data) (or (endp pattern)
(equal pattern '(:wild))))
((eql (car pattern) :wild)
(or (null (cdr pattern))
(wild-match (cdr pattern)
data)
(wild-match pattern
(cdr data))))
('literal-pattern
(and (eql (car pattern)
(car data))
(wild-match (cdr pattern)
(cdr data))))))

;; close over a data item to get a function
;; suitable for checking several patterns
(defun match-data (data)
(lambda(pattern)
(wild-match pattern data)))

;; Use the macro and the utilities to count how many are not excluded
(defun count-remainder (set exponent &rest exclusions)
(let ((count 0))
(cartesian-power (item set exponent)
(when (notany (match-data item) exclusions)
(incf count)))
count))

CL-USER> (loop for i from 3 to 10
do (format t "~&~4D~10D" i
(count-remainder '(a b c d e) i '(:wild a :wild b :wild))))
3 112
4 512
5 2304
6 10240
7 45056
8 196608
9 851968
10 3670016

I can see how a pattern such as (a b :wild) would knock out
an element from each of the first two sets so reducing the
task from m^n to (m-1)^2 * m^(n-2).

Also (:wild a :wild) knocks it down from m^n to (m-1)^n

However I can only see the exploitation of (:wild a :wild b
:wild) as a hairy special case. Pick one of n places for the
first a. Pick earlier elements from the set excluding a,
pick later elements from the set excluding b. Add in all the
items with a missing altogether (so b can be used freely).
I cannot see what algorithm exploits the constraints more
generally. Is there a standard technique, page nn of Knuth
or the like? Is that what you are actually wanting to see
coded?

Alan Crowe
Edinburgh
Scotland
Mar 17 '06 #29
wk*******@cox.net wrote:
The wildcard exclusion problem is interesting enough to have many
distinct, elegant solutions in as many languages.

In that case, you should have crossposted to comp.lang.python also.

Your program looks like a dog's breakfast.

Mar 18 '06 #30
wk*******@cox.net wrote:
It would seem that your program is just filtering the full cartesian
product, right? The solution I'm looking for generates the elements
one-by-one so that it could be used in a loop.

One advantage of a generator over filtering the full product is that I,
as the user of the generator, am not obligated to iterate over the
entire solution space.

Are there other _practical_ advantages of generators over mapping &
filtering complete sets?

Mar 18 '06 #31
"funkyj" <fu****@gmail.com> writes:
One advantage of a generator over filtering the full product is that I,
as the user of the generator, am not obligated to iterate over the
entire solution space.

Are there other _practical_ advantages of generators over mapping &
filtering complete sets?

Storage. You can iterate over problem spaces much too large to fit in
memory. Also generate + iterate can be faster because of reduced
memory pressure.
Mar 18 '06 #32
Yes, the program is quite a jumble: but it works. And I didn't post to
python newsgroup since I was limited to just 5 newsgroups and didn't
feel like doing multiple postings to multiple newsgroups.

Mar 18 '06 #33
Here is an nice intro to K:

http://www.kuro5hin.org/?op=displays...1/14/22741/791

"This is where K starts to set itself from apart from most of the
common programming languages in use today. You rarely write loops in K
function, returning another function, changing the ways it operates
over its arguments and what it does with it's return values."

How about an interactive loop-like version? Generating the target set
is good for baby test cases but not if the cardinality of the target is
large. Does that make the problem more intersesting?

Mar 18 '06 #34
When I run this I get through ghc I get

C:\Documents and Settings\User\My Documents\wildcard>ghc
"./wc-zielonka.hs"
compilation IS NOT required
C:/Languages/ghc/ghc-6.4.1/libHSrts.a(Main.o)(.text+0x1d):Main.c:
undefined refe
rence to `__stginit_ZCMain'
C:/Languages/ghc/ghc-6.4.1/libHSrts.a(Main.o)(.text+0x43):Main.c:
undefined refe
rence to `ZCMain_main_closure'
collect2: ld returned 1 exit status

Unless there's a command line option I'm missing?

Mar 18 '06 #35
The cardinality of excluding '*a*b*' from S^n should be
(m-1)^(n-1)*(m+n-1), where m=|S|. For m=5 this becomes 4^(n-1)*(n+4),
and your table fits this formula. As far as generating and testing, an
'ideal' solution would be to 'start from the ground up', as in
excluding length 2 wc, and then length 3, etc, until all wc's have been
excluded. The 'ideal' solution would intrinsically exclude wc's and not
test against a background generation of all of S^n. Does that make
sense?

Mar 18 '06 #36
Nice! How to put it in a loop? I'm totally a newbie to Lisp myself,
just gettng into Graham and Touretzky. Let's create a problem. Suppose
after excluding I want to know if the digits sum to 12, say, like maybe
they're part of a partition. S={0,..6}, S^5, excluding "*1*5*" and
"1*2*3*", say. How would I do that?

Mar 18 '06 #37
wk*******@cox.net schrieb:
"This is where K starts to set itself from apart from most of the
common programming languages in use today. You rarely write loops in K
function, returning another function, changing the ways it operates
over its arguments and what it does with it's return values."

Doesn't sound too different from what closures do. Or lazy parameter
passing.
<rant> I'm not sure whether the K designer actually fits that
description, but there are too many language designers around
reinventing the wheel, arguing whether it should have seven, eight or
thirteen sides... </rant>

Regards,
Jo
Mar 18 '06 #38
OK, a bad case of RTFM. I saved your file as WildCartesian.hs and then

1) command line: ghci WildCartesian.hs
3) command line: test

and it works! But how do I compile it to get a program with command
line arguments? I'm looking through Daume's tutorial right now.

Mar 19 '06 #39
Doug Quale wrote:
"funkyj" <fu****@gmail.com> writes:
One advantage of a generator over filtering the full product is that I,
as the user of the generator, am not obligated to iterate over the
entire solution space.

Are there other _practical_ advantages of generators over mapping &
filtering complete sets?

Storage. You can iterate over problem spaces much too large to fit in
memory. Also generate + iterate can be faster because of reduced
memory pressure.

Hmmm...storage is not an issue in the Prolog version. It generates a
candidate solution, then checks membership in the wildcard set, then
backtracks (backtracking is caused by "fail" in the test goal.) On
backtracking, it effectively "forgets" the last solution, so the memory
is freed up (or at least free to be reclaimed through GC.)

Cheers,

Dinko

Mar 20 '06 #40
wk*******@cox.net wrote:
It would seem that your program is just filtering the full cartesian
product, right? The solution I'm looking for generates the elements
one-by-one so that it could be used in a loop.

OK, having read some of the comments so far, I have the feeling that I
may be missing the point in more than one way, so let's set this
straight:

If I understand correctly, for an alphabet S, and a subset W of S*
specified by the wildcards, you expect the enumeration of sequences of
length n to run in Theta( n*|S^n - W| ) instead of Theta( n*|S^n| ).

First, this doesn't seem to hold for your Python program. Try, for
example, S = { a, b, c }, W = { *a*b*, *b*c*, *c*a*, *b*a*, *c*b*,
*a*c* }, with some large values of n. Theta( n*|S^n - W| ) predicts
that the enumeration time should grow linearly with n, as |S^n - W| =
3, but if you take some measurements, you'd notice that it grows faster
than that.

Second, my current bet is that such an improvement in asymptotic
complexity is not possible, if we consider *both* pre-processing of the
wildcard set and subsequent enumeration. Speculation: the time for
building-up a smart structure to speed-up enumeration, together with
the time for enumerating the set using that structure, should sum up to
roughly Theta( n*|S^n| ), even with a really smart algorithm.

Even if you're willing to pay up-front for tighter loop execution
later, and you build a suitable structure for this purpose, you would
have to consider the structure's size, so here's another speculation:
such structure would likely take up Theta( |S^n| ) space in memory, in
the worst case.

I would really appreciate it if you could pour some light into what
you're trying to do exactly, and possibly point out anything that I
might have missed so far.

Cheers,

Dinko

Mar 20 '06 #41
Hi,

You wrote into the Qilang News group with your problem.
This is a solution in 17 lines of Qi for any n-product >= 2.
It falls short of your complete requirement since it uses
generate and then test, rather than interleaving the
two.

(define challenge
Patterns N X -> (filter (/. Y (member Y Patterns)) (n-product N X)))

(define n-product
2 X -> (cartesian-product l X X)
N X -> (cartesian-product c X (n-product (- N 1) X)))

(define cartesian-product
_ [ ] _ -> [ ]
c [X | Y] Z -> (append (map (/. W [X | W]) Z) (cartesian-product c Y
Z))
l [X | Y] Z -> (append (map (/. W [X W]) Z) (cartesian-product l Y
Z)))

(define filter
_ [] -> []
F [X | Y] -> (filter F Y) where (F X)
F [X | Y] -> [X | (filter F Y)])

(define member
_ [] -> false
X [Pattern | _] -> true where (query-prolog [[= Pattern X]])
X [_ | Patterns] -> (member X Patterns))

Notes:

Pattern filtering is done by a unification test within the member
function. You
can do this most easily by calling Qi Prolog using query-prolog.
Here's a test.

(42 -) (n-product 3 [a b c])
[[a a a] [a a b] [a a c] [a b a] [a b b] [a b c] [a c a] [a c b] [a c
c]
[b a a] [b a b] [b a c] [b b a] [b b b] [b b c] [b c a] [b c b] [b c
c]
[c a a] [c a b] [c a c] [c b a] [c b b] [c b c] [c c a] [c c b] [c c
c]]

OK, remove all lists beginning [a a ....].

(43-) (challenge [[a a | X]] 3 [a b c])
[[a b a] [a b b] [a b c] [a c a] [a c b] [a c c] [b a a] [b a b] [b a
c]
[b b a] [b b b] [b b c] [b c a] [b c b] [b c c] [c a a] [c a b] [c a
c]
[c b a] [c b b] [c b c] [c c a] [c c b] [c c c]]

Remove all lists beginning with a or b.

(51-) (challenge [[a | X] [b | X]] 3 [a b c])
[[c a a] [c a b] [c a c] [c b a] [c b b] [c b c] [c c a] [c c b] [c c
c]]

Mark

Mar 20 '06 #42
I'd like to propose a coding challenge of my own. The challenge is to
reproduce the TEA (Tiny Encryption Algorith):
http://www.simonshepherd.supanet.com/tea.htm

Here's the code, just two simple functions:

void encipher(unsigned long *const v,unsigned long *const w,
const unsigned long *const k)
{
register unsigned long y=v,z=v,sum=0,delta=0x9E3779B9,
a=k,b=k,c=k,d=k,n=32;

while(n-->0)
{
sum += delta;
y += (z << 4)+a ^ z+sum ^ (z >> 5)+b;
z += (y << 4)+c ^ y+sum ^ (y >> 5)+d;
}

w=y; w=z;
}

void decipher(unsigned long *const v,unsigned long *const w,
const unsigned long *const k)
{
register unsigned long y=v,z=v,sum=0xC6EF3720,
delta=0x9E3779B9,a=k,b=k,
c=k,d=k,n=32;

/* sum = delta<<5, in general sum = delta * n */

while(n-->0)
{
z -= (y << 4)+c ^ y+sum ^ (y >> 5)+d;
y -= (z << 4)+a ^ z+sum ^ (z >> 5)+b;
sum -= delta;
}

w=y; w=z;
}

I had a crack at it in Lisp. My version doesn't work - but of greater
concern to me is that it doesn't appear nearly as compact as the C
version. Anyway, here's my Lisp code (no prizes for guessing that I'm a
noob to Lisp):

(defconstant delta 2654435769 ) ; delta= 0x9E3779B9

(defun floorn (n) (nth-value 0 (floor n)))

(defun >> (val num-bytes)
"Right-shift positive integer val by num-bytes"
(let* (t1 t2)
(setf t1 (expt 2 num-bytes))
(setf t2 (/ val t1))
(floor t2)))

(defun << (val num-bytes)
"Left-shift positive integer v by num-bytes"
(* val (expt 2 num-bytes)))

(defun <<4 (i) (<< i 4))

(defun byte-n (v n)
"Return the nth byte of a value v"
(let* ((bits-to-shift (* 8 (1- n)))
(shifted-value (>> v bits-to-shift)))
(logand shifted-value 256)))
(defun transform (v1 v2 v3 v4)
(let (t1 t2 t3)
(setf t1 (<<4 v1))
(setf t2 (expt v2 v1))
(setf t3 (expt v3 (>> v2 5)))
(+ t1 t2 t3 v4)))

(defun pack64 (b1 b2) (+ (<< b1 32) b2))

(defun encipher (v k)
(let ((sum 0)
(a (byte-n k 3)) ; a=k
(b (byte-n k 2)) ; b=k
(c (byte-n k 1)) ; c=k
(d (byte-n k 0)) ; d=k
(y (byte-n v 1)) ; y=v
(z (byte-n v 0))) ; z=v
(loop for n from 0 to 31 do ;n=32, while(n-->0)
(incf sum delta) ;sum += delta;
(incf y (transform z a sum b)) ; y += (z << 4)+a ^ z+sum ^ (z >> 5)+b
(incf z (transform y c sum d)) ;z += (y << 4)+c ^ y+sum ^ (y >> 5)+d;
)

(pack64 y z) ; w=y; w=z;
))
(defun decipher (v k)
(let ((sum 3337565984) ; 0xC6EF3720
(a (byte-n k 3)) ; a=k
(b (byte-n k 2)) ; b=k
(c (byte-n k 1)) ; c=k
(d (byte-n k 0)) ; d=k
(y (byte-n v 1)) ; y=v
(z (byte-n v 0))) ; z=v

(loop for n from 0 to 31 do ;n=32, while(n-->0)
(decf z (transform y c sum d)) ;z -= (y << 4)+c ^ y+sum ^ (y >> 5)+d;
(decf y (transform z a sum b)) ;y -= (z << 4)+a ^ z+sum ^ (z >> 5)+b;
(decf sum delta) ;sum -= delta;
)

(pack64 y z) ; w=y; w=z;
))
Mar 20 '06 #43
Interesting. But you probably need to post this as a new
message, since it is a distinctly different
problem from the one driving this thread.

Mark

Mar 20 '06 #44
Mark Tarver wrote:
Interesting.

At the risk of being labelled a troll, one thought that is occuring to
me is that in Lisp it seems that sometimes it is difficult to achieve a
simple thing in a simple way. To clarify ... recently, I had been
trying to obtain md5 hashes of the files we had on our server (a
different exercise than the one I mentioned in my OP, just in case you
thought that I didn't understand the difference between encryption and
hashing). There is an md5 package for Lisp available on the web, which I
used with CLISP. I had a file that contained a non-standard character,
causing CLISP to throw an error when it tried to print it.

Well, I suppose I could have tried to figure out a way to cajole CLISP
into printing something it didn't want to print, but I was keen to give
Corman Lisp 2.5 a try-out anyway, so I tried the package on it. EXCEPT,
for some reason when you try to read a file with an :element-type of
(unsigned-byte 8) (or something similar), Corman didn't like it.

In the end, I hacked together an md5 DLL from some sources I found on
the internet. You can get the package here, together with Corman Lisp
bindings:
http://www.markcarter.me.uk/computin...d5mc/md5mc.htm

In the past, I had also employed a similar technique in order to get

My worry is that it seems to be a recurring theme with me ... get
stumped in Lisp, realise that it is probably just plain easier in C, and
then link the whole thing together in Lisp. Which is kinda less than
expected.

Mar 20 '06 #45

"Dinko Tenev" <di*********@gmail.com> wrote in message
wk*******@cox.net wrote:
It would seem that your program is just filtering the full cartesian
product, right? The solution I'm looking for generates the elements
one-by-one so that it could be used in a loop.

Oops...missed that part.

It took me a while to study the exchange on this topic more thoroughly,
and I now fully appreciate the fact that the problem calls for a much
more sophisticated approach.

Sorry for the hasty shot, I'll give it another shortly.

I wouldn't worry about it, Prolog generated the elements one-by-one.
The loop was the print,nl,fail line. Just beefing it up a bit, I
didn't take the time to clean it up though. :-)

gen(_,0,[]).
gen(S,N,[H|T]):- N > 0, N1 is N - 1, member(H,S), gen(S,N1,T).

filter([],[]).
filter([X|T],[X|T1]):- filter(T,T1).
filter([*|T],L):- filter(T,L).
filter([*|T],[_|T1]):- filter([*|T],T1).

filter_list(L,[[and|T]|_]):- filter_and(L,T), !.
filter_list(L,[[or|T]|_]):- filter_list(L,T), !.
filter_list(L,[H|_]):- H \= [and|_], H \= [or|_], filter(H,L),!.
filter_list(L,[H|T]):- H \= [and|_], H \= [or|_], filter_list(L,T).

filter_and(_,[]) :- !.
filter_and(L,[H|T]):- filter_list(L,[H]), filter_and(L,T).

generate_member(X,S,N,[]):-gen(S,N,X).
generate_member(X,S,N,[H|T]):-gen(S,N,X),\+ filter_list(X,[H|T]).

1 ?- generate_member(X,[a,b],3,[[a,*,b],[b,*,a]]).

X = [a, a, a] ;

X = [a, b, a] ;

X = [b, a, b] ;

X = [b, b, b] ;

No
2 ?- generate_member(X,[1,2],3,[[and, [*,2], [or, [2,1,*], [1,2,*]]]]).

X = [1, 1, 1] ;

X = [1, 1, 2] ;

X = [1, 2, 1] ;

X = [2, 1, 1] ;

X = [2, 2, 1] ;

X = [2, 2, 2] ;

No

---
Geoff
Mar 20 '06 #46
Mark Carter <me@privacy.net> writes:
I'd like to propose a coding challenge of my own. The challenge is to
reproduce the TEA (Tiny Encryption Algorith):
http://www.simonshepherd.supanet.com/tea.htm

Here's the code, just two simple functions:

void encipher(unsigned long *const v,unsigned long *const w,
const unsigned long *const k)
{
register unsigned long y=v,z=v,sum=0,delta=0x9E3779B9,
a=k,b=k,c=k,d=k,n=32;

while(n-->0)
{
sum += delta;
y += (z << 4)+a ^ z+sum ^ (z >> 5)+b;
z += (y << 4)+c ^ y+sum ^ (y >> 5)+d;
}

w=y; w=z;
}

void decipher(unsigned long *const v,unsigned long *const w,
const unsigned long *const k)
{
register unsigned long y=v,z=v,sum=0xC6EF3720,
delta=0x9E3779B9,a=k,b=k,
c=k,d=k,n=32;

/* sum = delta<<5, in general sum = delta * n */

while(n-->0)
{
z -= (y << 4)+c ^ y+sum ^ (y >> 5)+d;
y -= (z << 4)+a ^ z+sum ^ (z >> 5)+b;
sum -= delta;
}

w=y; w=z;
}

I get it shorter than in C:

(defun op (x a b sum) (logxor (+ (ash x 4) a) (+ x sum) (+ (ash x -5) b)))
(declaim (inline op))
(defmacro ciploop ((v w k y z a b c d (sum init-sum) delta) &body body)
`(let ((,y (aref ,v 0)) (,z (aref ,v 1)) (,sum ,init-sum) (,delta #x9E3779B9)
(,a (aref ,k 0)) (,b (aref ,k 1)) (,c (aref ,k 2)) (,d (aref ,k 3)))
(loop repeat 32 do ,@body finally (setf (aref ,w 0) ,y (aref ,w 1) ,z))))
(defmacro c-incf (var expr) `(setf ,var (mod (+ ,var ,expr) #x100000000)))
(defmacro c-decf (var expr) `(setf ,var (mod (- ,var ,expr) #x100000000)))
(defun encipher (v w k)
(ciploop (v w k y z a b c d (sum 0) delta)
(c-incf sum delta) (c-incf y (op z a b sum)) (c-incf z (op y c d sum))))
(defun decipher (v w k)
(ciploop (v w k y z a b c d (sum #xC6EF3720) delta)
(c-decf z (op y c d sum)) (c-decf y (op z a b sum)) (c-decf sum delta)))
You can also easily modify it to implement the improved version of TEA...
Note that this Lisp programs will work equally well on a 16-bit,
32-bit or 64-bit Common Lisp implementation. The same cannot be said
of the C program above.

(defun word (a b c d)
(dpb a (byte 8 24) (dpb b (byte 8 16) (dpb c (byte 8 8) d))))

(loop
for bytes = (progn (format *query-io* "Please enter ~D bits of ~A: "
bits what)
(ext:convert-string-to-bytes
while (< (* 8 (length bytes)) bits)
finally (return
(loop for i from 0 by 4 below (truncate (+ 7 bits) 8)
collect (word (aref bytes (+ i 0))
(aref bytes (+ i 1))
(aref bytes (+ i 2))
(aref bytes (+ i 3))) into words
finally (return (coerce words 'vector))))))

(defun test ()
(loop
with code = (vector 0 0)
with decr = (vector 0 0)
for clear = (read-words 64 "clear text")
for key = (read-words 128 "key")
do (progn (encipher clear code key)
(format t "(encipher ~S ~S)~% --> ~S~%" clear key code)
(decipher code decr key)
(format t "(decipher ~S ~S)~% --> ~S~%" code key decr)
(unless (equalp clear decr) (format t "!!! ERROR !!!~%")))))
> (test)
Please enter 64 bits of clear text: Hello World!
Please enter 128 bits of key: John McCarthy invented LISP.
(encipher #(1214606444 1864390511) #(1248815214 541942595 1634890856 2032167278))
--> #(913593965 183139965)
(decipher #(913593965 183139965) #(1248815214 541942595 1634890856 2032167278))
--> #(1214606444 1864390511)
Please enter 64 bits of clear text: Big Secret: LISP!
Please enter 128 bits of key: A very very secure key.
(encipher #(1114203936 1399153522) #(1092646501 1920540790 1702000928 1936024437))
--> #(3198111104 1851109064)
(decipher #(3198111104 1851109064) #(1092646501 1920540790 1702000928 1936024437))
--> #(1114203936 1399153522)
Please enter 64 bits of clear text: ^C

--
__Pascal Bourguignon__ http://www.informatimago.com/

ATTENTION: Despite any other listing of product contents found
herein, the consumer is advised that, in actuality, this product
consists of 99.9999999999% empty space.
Mar 20 '06 #47
> I had a crack at it in Lisp. My version doesn't work - but of greater
concern to me is that it doesn't appear nearly as compact as the C
version. Anyway, here's my Lisp code (no prizes for guessing that I'm a
noob to Lisp):
Lot's of things you can write more compact.
But compact is not always the best way to
write source. For me the most important
after, say, a year absence and everything

(defconstant delta 2654435769 ) ; delta= 0x9E3779B9
(defconstant +delta+ #x9E3779B9)
(defun floorn (n) (nth-value 0 (floor n)))
is above used?
(defun >> (val num-bytes)
"Right-shift positive integer val by num-bytes"
(let* (t1 t2)
(setf t1 (expt 2 num-bytes))
(setf t2 (/ val t1))
(floor t2)))
(defun >> (val num-bytes)
"Right-shift positive integer val by num-bytes"
(floor (/ val (expt 2 num-bytes))))
(defun transform (v1 v2 v3 v4)
(let (t1 t2 t3)
(setf t1 (<<4 v1))
(setf t2 (expt v2 v1))
(setf t3 (expt v3 (>> v2 5)))
(+ t1 t2 t3 v4)))

(defun transform (v1 v2 v3 v4)
(+ (<<4 v1)
(expt v2 v1)
(expt v3 (>> v2 5))
v4))

and so on...

Mar 20 '06 #48
jo****@corporate-world.lisp.de writes:
(defun >> (val num-bytes)
"Right-shift positive integer val by num-bytes"
(floor (/ val (expt 2 num-bytes))))

or just (floor val (expt 2 num-bytes))

'as
Mar 20 '06 #49
[ note followups ]

Mark Carter <me@privacy.net> writes:
I'd like to propose a coding challenge of my own. The challenge is to
reproduce the TEA (Tiny Encryption Algorith):
http://www.simonshepherd.supanet.com/tea.htm

Here's mine, in Common Lisp.

(defmacro define-tea-pair ((encrypt decrypt) (delta n) (i1 j1 i2 j2))
`(macrolet ((32bitize (form) `(logand ,form #xffffffff))
(tea-kernel (x i j)
`(logxor (+ (ash ,x 4) (aref key ,i)) (+ ,x sum)
(+ (ash ,x -5) (aref key ,j))))
(tea-loop ((text n sum) &body body)
`(let ((result (make-array 2 :element-type '(unsigned-byte 32)))
(y (aref ,text 0))
(z (aref ,text 1)))
(do ((n ,n (- n 1)) (sum ,sum))
((<= n 0)
(prog1 result
(setf (aref result 0) y (aref result 1) z)))
,@body))))
(defun ,encrypt (plaintext key &aux (delta ,delta))
(declare (type (simple-array (unsigned-byte 32) (2)) plaintext)
(type (simple-array (unsigned-byte 32) (4)) key))
(tea-loop (plaintext ,n 0)
(setq sum (32bitize (+ sum delta))
y (32bitize (+ y (tea-kernel z ,i1 ,j1)))
z (32bitize (+ z (tea-kernel y ,i2 ,j2))))))
(defun ,decrypt (ciphertext key &aux (delta ,delta))
(declare (type (simple-array (unsigned-byte 32) (2)) ciphertext)
(type (simple-array (unsigned-byte 32) (4)) key))
(tea-loop (ciphertext ,n (32bitize (* ,n delta)))
(setq z (32bitize (- z (tea-kernel y ,i2 ,j2)))
y (32bitize (- y (tea-kernel z ,i1 ,j1)))
sum (32bitize (- sum delta)))))))

(define-tea-pair (encipher decipher) (#x9e3779b9 32) (0 1 2 3))

So far, so ordinary; only marginally shorter than the C version; I'm
certainly nowhere near Pascal's 14 lines, although my version has the
advantage over his that each constant is mentioned only once, and all
quantities derived from it are computed at compile-time; I can define
a different pair with

(define-tea-pair (eprime dprime) (#xabcdef01) (3 1 2 0))

and the new functions are inverses of each other as before.

The other thing that might be of interest is the inner loop. There
are no declarations other than the argument declarations, and all the
code that I have written is portable Common Lisp, and should work in
any conforming implemenation. In SBCL (on the PowerPC; other
platforms are similar), the inner loop for ENCIPHER is

rlwinm \$nl5,\$nl2,4,0,27
lwz \$nl6,1(\$fdefn)
xor \$cfunc,\$nl6,\$nl0
rlwinm \$nl0,\$nl2,27,5,31
mr \$nl5,\$nl0
lwz \$nl6,5(\$fdefn)
xor \$nl6,\$cfunc,\$nl6
rlwinm \$nl5,\$nl1,4,0,27
lwz \$nl6,9(\$fdefn)
xor \$cfunc,\$nl6,\$nl0
rlwinm \$nl0,\$nl1,27,5,31
mr \$nl5,\$nl0
lwz \$nl6,13(\$fdefn)
xor \$nl6,\$cfunc,\$nl6
cmpwi cr0,\$nl4,0
bt gt,l0

and while this may be opaque to some readers, the point is that it is
pretty much comparable to the C code in performance (the differences
between this disassembly and gcc -O2 lie in the fact that SBCL's
instruction scheduler is pretty much nonexistent on the PowerPC
architecture).

Christophe
Mar 20 '06 #50