435,314 Members | 2,155 Online + Ask a Question
Need help? Post your question and get tips & solutions from a community of 435,314 IT Pros & Developers. It's quick & easy.

Programming challenge: wildcard exclusion in cartesian products

78 Replies

 P: n/a wk*******@cox.net wrote: The python code below generates a cartesian product subject to any logical combination of wildcard exclusions. For example, suppose I want to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes '*a*b*' and '*c*d*a*'. See below for details. CHALLENGE: generate an equivalent in ruby, lisp, haskell, ocaml, or in a CAS like maple or mathematica. What is your goal? You want to learn or to cause a flamewar? ;-) Anyway, I found the problem entertaining, so here you go, here is my Haskell code. It could be shorter if I didn't care about performance and wrote in specification style. It's not very efficient either, because it will generate all lists matching the given patterns. In GHCi you can test it by: \$ ghci :l WildCartesian.hs test I apologise for the lack of comments. ----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<---- module WildCartesian where import Data.Set (Set) import qualified Data.Set as Set import Control.Monad import Control.Exception (assert) import Maybe import List data Pat a = All | Lit a deriving Show generateMatching :: (Ord a) => Int -> Set a -> [Pat a] -> [[a]] generateMatching 0 _ [] = [[]] generateMatching 0 _ (_:_) = [] generateMatching len alphabet (Lit x : ps) | x `Set.member` alphabet = [ (x : xs) | xs <- generateMatching (len - 1) alphabet ps ] | otherwise = [ ] generateMatching len alphabet (All : ps) = [ (x : xs) | x <- Set.toList alphabet , xs <- unionSorted (generateMatching (len - 1) alphabet ps) (generateMatching (len - 1) alphabet (All : ps)) ] `unionSorted` generateMatching len alphabet ps generateMatching _ _ [] = [] generateNotMatching :: (Ord a) => [a] -> Int -> [[Pat a]] -> [[a]] generateNotMatching alphabet len patterns = generateMatching len alphaSet [All] `subtractSorted` foldr unionSorted [] (map (generateMatching len alphaSet . simplifyPat) patterns) where alphaSet = Set.fromList alphabet simplifyPat (All : All : ps) = simplifyPat (All : ps) simplifyPat (p : ps) = p : simplifyPat ps simplifyPat [] = [] joinSorted :: Ord a => [a] -> [a] -> [(Maybe a, Maybe a)] joinSorted (x1:x2:_) _ | assert (x1 < x2) False = undefined joinSorted _ (y1:y2:_) | assert (y1 < y2) False = undefined joinSorted (x:xs) (y:ys) = case x `compare` y of LT -> (Just x, Nothing) : joinSorted xs (y:ys) EQ -> (Just x, Just y) : joinSorted xs ys GT -> (Nothing, Just y) : joinSorted (x:xs) ys joinSorted (x:xs) [] = (Just x, Nothing) : joinSorted xs [] joinSorted [] (y:ys) = (Nothing, Just y) : joinSorted [] ys joinSorted [] [] = [] unionSorted :: Ord a => [a] -> [a] -> [a] unionSorted xs ys = catMaybes (map (uncurry mplus) (joinSorted xs ys)) subtractSorted :: Ord a => [a] -> [a] -> [a] subtractSorted xs ys = catMaybes (map f (joinSorted xs ys)) where f (Just x, Nothing) = Just x f _ = Nothing test = do t [1,2] 3 [[Lit 1, All, Lit 2]] t ['a','b'] 3 [[Lit 'a', All, Lit 'b'], [Lit 'b', All, Lit 'a']] t [1,2] 3 [[Lit 1, All, Lit 2], [Lit 2, All, Lit 1]] where t a b c = do putStrLn (concat (intersperse " " ["generateMatching", show a, show b, show c])) mapM_ (putStrLn . (" "++) . show) (generateNotMatching a b c) ----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<---- Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) && (Linux || FreeBSD || math) for work in Warsaw, Poland Mar 16 '06 #2

 P: n/a Tomasz Zielonka wrote: putStrLn (concat (intersperse " " ["generateMatching", show a, show b, show c])) Minor correction: it should be "generateNotMatching". Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) && (Linux || FreeBSD || math) for work in Warsaw, Poland Mar 16 '06 #3

 P: n/a Major correction (missing case): Tomasz Zielonka wrote: generateMatching :: (Ord a) => Int -> Set a -> [Pat a] -> [[a]] generateMatching 0 _ [] = [[]] generateMatching 0 alphabet (All:ps) = generateMatching 0 alphabet ps generateMatching 0 _ (_:_) = [] Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) && (Linux || FreeBSD || math) for work in Warsaw, Poland Mar 16 '06 #4

 P: n/a Flame war? Absolutely not. My reason is to learn. There are many sites dedicated to reasonably objective comparisons between languages. Here are two examples: http://www.smallscript.org/Language%...on%20Chart.asp http://www.jvoegele.com/software/langcomp.html The wildcard exclusion problem is interesting enough to have many distinct, elegant solutions in as many languages. It would be interesting to see if they converge to roughly the same solution or if there are essential differences. And your code is a crash course in Haskell! Tossing aside the 'flame war' inquiry your code response is my only goal. I hope many others find the problem and responses as fascinating as I do. Mar 16 '06 #5

 P: n/a The point is to submit elegant code that showcases the features of each language. And the problem is, just to clarify, given a set WC of wildcards in any logical combination, and if WC(S^n) is the set all s in S^n that matches the wildcards, then efficiently generate the complement S^n\WC(S^n). You are free to restate the problem in any equivalent way. Mar 16 '06 #6

 P: n/a wk*******@cox.net schreef: There are many sites dedicated to reasonably objective comparisons between languages. Here are two examples: http://www.smallscript.org/Language%...on%20Chart.asp http://www.jvoegele.com/software/langcomp.html http://shootout.alioth.debian.org/ -- Affijn, Ruud "Gewoon is een tijger." echo 014C8A26C5DB87DBE85A93DBF |perl -pe 'tr/0-9A-F/JunkshoP cartel,/' Mar 16 '06 #7

 P: n/a Without much testing. Common Lisp Pattern exclusions are made lispy. (defun all-lists (list length) (unless (zerop length) (if (= length 1) (mapcar #'list list) (loop for elt in list nconc (mapcar (lambda (rest) (cons elt rest)) (loop for rest on list nconc (all-lists rest (1- length)))))))) (defun cp-without-wc (source-list &rest patterns) (let* ((length (length (first patterns))) (all-lists (all-lists source-list length))) (dolist (pattern patterns) (setf all-lists (set-difference all-lists (mapcar (lambda (insertion) (let ((cp (copy-list pattern))) (loop for place on cp when (eql :any (car place)) do (setf (car place) (pop insertion))) cp)) (all-lists source-list (count :any pattern))) :test #'equal))) (remove-duplicates all-lists :test #'equal))) CL-USER 22 > (cp-without-wc '(a b) '(a :any b) '(b :any a)) ((A A A) (A B A) (B A B) (B B B)) CL-USER 23 > (cp-without-wc '(abc xyz) '(abc :any xyz)) ((XYZ XYZ XYZ) (XYZ XYZ ABC) (XYZ ABC XYZ) (XYZ ABC ABC) (ABC XYZ ABC) (ABC ABC ABC)) CL-USER 24 > (cp-without-wc '(a b) '(a :any :any)) ((B B B) (B B A) (B A B) (B A A)) CL-USER 25 > (cp-without-wc '(a b) '(a :any :any) '(b :any :any)) NIL CL-USER 26 > (cp-without-wc '(a b) '(:any :any b)) ((B B A) (B A A) (A B A) (A A A)) CL-USER 27 > Wade Mar 16 '06 #8

 P: n/a What I have in mind is the efficient, generation of the complement S^n/WC(S^n). A good program should initialize, generate, and terminate. T=cartprodex(S,n,WC); //initialize for all i in T do what you want with i test to see if any more terminate if not and it should do this without explicitly generating WC and then complementing. For example, if the cardinality of S is m, and the WC is just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality (m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016 while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general the program should directly generate EX from arbitrary WC. Of course, in practice the WC should themselves occur in a logically consistent manner, but let's just assume they're a given. Mar 16 '06 #9

 P: n/a here is my version of the same. REPL output: CL-USER> (tests) set = (1 2) n = 3 patterns = ((1 ANY 2)) ----------------------- (1 1 1) (1 2 1) (2 1 1) (2 1 2) (2 2 1) (2 2 2) set = (A B) n = 3 patterns = ((A ANY B) (B ANY A)) ----------------------- (A A A) (A B A) (B A B) (B B B) set = (1 2) n = 3 patterns = ((1 ANY 2) (2 ANY 1)) ----------------------- (1 1 1) (1 2 1) (2 1 2) (2 2 2) NIL CL-USER> source: ;;;; cartesian products minus wildcard patterns per: ;;;; ;;;; >Newsgroups: comp.lang.lisp, etc... ;;;; >Subject: Programming challenge: wildcard exclusion in cartesian products ;;;; >Date: 16 Mar 2006 03:14:23 -0800 ;;;; ;;;; (defun show-me (x) (format t "~A~%" x)) (defun set^n (fn set n &optional acc) "call `fn' on each permutation of `set' raised to the `n' power" (if (<= n 0) (funcall fn (reverse acc)) (dolist (e set) (set^n fn set (- n 1) (cons e acc))))) ;; test set^n by printing and visually inspecting the result (defun pr-set^n (set n) (set^n #'show-me set n)) ;; curry `set^n' so that `fn' is the only parameter (defun set^n-gen (set n) (lambda (fn) (set^n fn set n))) (defun mk-matchl-p (pat-list) "return a function that tests a value against the patterns in `pat-list'" (labels ((matchp (pat val) (cond ((null pat) t) ((or (eq (car pat) (car val)) (eq (car pat) :any)) (matchp (cdr pat) (cdr val)))))) (lambda (val) "predicate: return true if val matches any pattern in `pat-list'" (dolist (p pat-list) (if (matchp p val) (return t)))))) (defun not-fp (f-pred) "return the complement of predicate `f-pred'" (lambda (x) (not (funcall f-pred x)))) ;; f-gen is a generator of the form returned by set^n-gen (defun accumulate-if (f-gen f-pred) "accumulate values generated by f-gen that satisfy f-pred" (let (acc) (funcall f-gen (lambda (x) (if (funcall f-pred x) (push x acc)))) (nreverse acc))) ;; `pr-set^n-withoutWC' is the lisp equivalent (more or less) of ;; python code: ;; >>> for i in cp.CPWithoutWC(x,y,z): print i (defun pr-set^n-withoutWC (set n pat-list) (format t "~%~%set = ~A~%n = ~A~%patterns = ~A~%~A~%" set n pat-list "-----------------------") (dolist (e (accumulate-if (set^n-gen set n) (not-fp (mk-matchl-p pat-list)))) (format t "~A~%" e))) (defun tests () "generate test output per the original problem examples" (pr-set^n-withoutWC '(1 2) 3 '((1 :any 2))) (pr-set^n-withoutWC '(a b) 3 '((a :any b) (b :any a))) (pr-set^n-withoutWC '(1 2) 3 '((1 :any 2) (2 :any 1)))) Mar 16 '06 #10

 P: n/a NOTE: I am a lisp newbie. I'm sure our resident lisp experts can create much better (both faster, shorter and clearer) solutions than the one above. Even I could have created something shorter but I thought it would be fun to apply the "utility function" approach in decomposing the problem. --jfc Mar 16 '06 #11

 P: n/a "wk*******@cox.net" writes: The python code below generates a cartesian product subject to any logical combination of wildcard exclusions. For example, suppose I want to generate a cartesian product S^n, n>=3, of [a,b,c,d] that excludes '*a*b*' and '*c*d*a*'. See below for details. I'm afraid that different programs in this thread has understood the asterisk differently: that it matches any single element, or that it matches any sequence of elements. -- __("< Marcin Kowalczyk \__/ qr****@knm.org.pl ^^ http://qrnik.knm.org.pl/~qrczak/ Mar 16 '06 #13

 P: n/a The asterisk '*' matches any sequence of elements, not just one element. The wildcard '*1*2*' would then correspond to a tuple with a 1 preceding a 2 in any positions. The wc '1*2' would correspond to a starting 1 and an ending 2 with anything in between. The wc *12* would correspond to a 1 adjacent to a 2 with the pair in any position. Possibilities like '*a*a*b*' and '*a*a*a*' of any length are also allowed. If n is the dimension, then any n-tuple wc is just a point. My apologies for the confusion. Mar 17 '06 #14

 P: n/a Is this Haskell implementation what you want? It does the wildcard matching through a state machine and it essentially threads the state machine through the cartesian product, switching to the ordinary cartesian product when possible as an optimisation. The execution of the state machine is shared by strings with the same prefix making it reasonably efficient even though the state machine itself isn't optimised. If it doesn't work, I'm sure it's only a few typos away... -- generate strings of length n from alphabet l such that -- the state machine, with transition function t, is not on -- a final state (determined by function f) at the -- end of the string. -- If the state is ever 'unmatchable' (as determined by u) -- we just return the cartesian product as no rejection -- can take place. generate f u t s 0 l = if f s then [] else [[]] generate f u t s n l | u s = sequence (replicate n l) | otherwise = [a:b | a <- l, let s' = t s a, b <- generate f u t s' (n-1) l] -- The states are lists of regular expressions -- where [a,b,..] means match a or b or... -- This is the transition function for our machine. transition pat a = pat >>= d a where -- Brzozowski derivative d a [] = [] d a p@('*':pat) = p:d a pat d a (p:pat) | a==p = [pat] | otherwise = [] -- A terminal state is one that matches the null string terminal p = or \$ map terminal' p where terminal' "" = True terminal' ('*':p) = terminal' p terminal' _ = False run n alphabet pat = generate terminal null transition [pat] n alphabet test = run 3 "abc" "aa*a" Mar 17 '06 #15

 P: n/a wk*******@cox.net wrote: What I have in mind is the efficient, generation of the complement S^n/WC(S^n). A good program should initialize, generate, and terminate. T=cartprodex(S,n,WC); //initialize for all i in T do what you want with i test to see if any more terminate if not and it should do this without explicitly generating WC and then complementing. For example, if the cardinality of S is m, and the WC is just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality (m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016 while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general the program should directly generate EX from arbitrary WC. Of course, in practice the WC should themselves occur in a logically consistent manner, but let's just assume they're a given. Another attempt. I have made no special attempt to create an exclusion language, just used an anonymous lambda predicate. ;; Wade Humeniuk (defclass odometer () ((base :initform 0 :accessor base) (meter :initform nil :accessor meter) (n-digits :initarg :n-digits :accessor n-digits) (digit-set :initarg :digit-set :accessor digit-set))) (defmethod initialize-instance :after ((obj odometer) &rest initargs) (setf (base obj) (length (digit-set obj)) (meter obj) (make-array (n-digits obj) :initial-element 0) (digit-set obj) (coerce (digit-set obj) 'vector))) (defun inc-odometer (odometer) (loop with carry = 1 for i from (1- (n-digits odometer)) downto 0 for digit = (incf (aref (meter odometer) i) carry) if (= digit (base odometer)) do (setf (aref (meter odometer) i) 0) (setf carry 1) else do (setf carry 0) while (not (zerop carry)))) (defun zero-meter-p (odometer) (every #'zerop (meter odometer))) (defmethod next-set ((obj odometer)) (prog1 (map 'list (lambda (digit) (aref (digit-set obj) digit)) (meter obj)) (inc-odometer obj))) (defclass cs-with-wc (odometer) ((exclusion :initarg :exclusion :accessor exclusion) (at-end :initform nil :accessor at-end))) (defmethod next-set ((obj odometer)) (tagbody :next (unless (at-end obj) (let ((set (call-next-method))) (when (zero-meter-p obj) (setf (at-end obj) t)) (if (not (funcall (exclusion obj) set)) (return-from next-set set) (go :next)))))) (defun print-all-cs (set length exclusion) (let ((cs-with-wc (make-instance 'cs-with-wc :n-digits length :digit-set set :exclusion exclusion))) (loop for set = (next-set cs-with-wc) while set do (print set)))) CL-USER 134 > (cs-with-wc '(a b) 3 (lambda (set) (destructuring-bind (x y z) set (or (and (eql x 'a) (eql z 'b)) (and (eql x 'b) (eql z 'a)))))) (A A A) (A B A) (B A B) (B B B) NIL CL-USER 135 > (cs-with-wc '(a b) 3 (lambda (set) (eql (second set) 'a))) (A B A) (A B B) (B B A) (B B B) NIL CL-USER 136 > (cs-with-wc '(abc xyz) 3 (lambda (set) (and (eql (first set) 'abc) (eql (third set) 'xyz)))) (ABC ABC ABC) (ABC XYZ ABC) (XYZ ABC ABC) (XYZ ABC XYZ) (XYZ XYZ ABC) (XYZ XYZ XYZ) NIL Mar 17 '06 #16

 P: n/a Oops, problems cutting an pasting, should be, ;; Wade Humeniuk (defclass odometer () ((base :initform 0 :accessor base) (meter :initform nil :accessor meter) (n-digits :initarg :n-digits :accessor n-digits) (digit-set :initarg :digit-set :accessor digit-set))) (defmethod initialize-instance :after ((obj odometer) &rest initargs) (setf (base obj) (length (digit-set obj)) (meter obj) (make-array (n-digits obj) :initial-element 0) (digit-set obj) (coerce (digit-set obj) 'vector))) (defun inc-odometer (odometer) (loop with carry = 1 for i from (1- (n-digits odometer)) downto 0 for digit = (incf (aref (meter odometer) i) carry) if (= digit (base odometer)) do (setf (aref (meter odometer) i) 0) (setf carry 1) else do (setf carry 0) while (not (zerop carry)))) (defun zero-meter-p (odometer) (every #'zerop (meter odometer))) (defmethod next-set ((obj odometer)) (prog1 (map 'list (lambda (digit) (aref (digit-set obj) digit)) (meter obj)) (inc-odometer obj))) (defclass cs-with-wc (odometer) ((exclusion :initarg :exclusion :accessor exclusion) (at-end :initform nil :accessor at-end))) (defmethod next-set ((obj cs-with-wc)) (tagbody :next (unless (at-end obj) (let ((set (call-next-method))) (when (zero-meter-p obj) (setf (at-end obj) t)) (if (not (funcall (exclusion obj) set)) (return-from next-set set) (go :next)))))) (defun print-all-cs (set length exclusion) (let ((cs-with-wc (make-instance 'cs-with-wc :n-digits length :digit-set set :exclusion exclusion))) (loop for set = (next-set cs-with-wc) while set do (print set)))) CL-USER 7 > (print-all-cs '(a b) 3 (lambda (set) (destructuring-bind (x y z) set (or (and (eql x 'a) (eql z 'b)) (and (eql x 'b) (eql z 'a)))))) (A A A) (A B A) (B A B) (B B B) NIL CL-USER 8 > (print-all-cs '(abc xyz) 3 (lambda (set) (and (eql (first set) 'abc) (eql (third set) 'xyz)))) (ABC ABC ABC) (ABC XYZ ABC) (XYZ ABC ABC) (XYZ ABC XYZ) (XYZ XYZ ABC) (XYZ XYZ XYZ) NIL CL-USER 9 > Mar 17 '06 #17

 P: n/a -- The states are lists of regular expressions -- where [a,b,..] means match a or b or... I haven't run or studied your program yet myself but what I had in mind was that the list of wc's are *all* to be excluded, so the list [wc1..wcn] is to correspond generating all tuples matching not(wc1 and ... and wcn). Maybe you're already doing that. The wc's themselves could be logical statements among the 'primitive' wc's. That's why I named it the 'wildcard exclusion problem'. It's a lot easier to specify a list of simpler wc's than create a long logical expression. Thanks to all who are making this an interesting thread. Mar 17 '06 #18

 P: n/a Heh, here's a Prolog version: ================================================== ======== gen( _, 0, [] ) :- !. gen( S, N, [H | T] ) :- member( H, S ), M is N - 1, gen( S, M, T ). ================================================== ======== Yep, that's it :))) Here's how to test it: ================================================== ======== 1 ?- gen([a, b, c], 3, X), print(X), nl, fail. [a, a, a] [a, a, b] [a, a, c] [a, b, a] [a, b, b] [a, b, c] [a, c, a] [a, c, b] [a, c, c] [b, a, a] [b, a, b] [b, a, c] [b, b, a] [b, b, b] [b, b, c] [b, c, a] [b, c, b] [b, c, c] [c, a, a] [c, a, b] [c, a, c] [c, b, a] [c, b, b] [c, b, c] [c, c, a] [c, c, b] [c, c, c] No 2 ?- gen([a, b, c], 3, X), not(member(X, [[a, _, _], [_, b, _], [_, _, c]])), print(X), nl, fail. [b, a, a] [b, a, b] [b, c, a] [b, c, b] [c, a, a] [c, a, b] [c, c, a] [c, c, b] No ================================================== ======== Mar 17 '06 #19

 P: n/a Dan Piponi wrote: Is this Haskell implementation what you want? It does the wildcard matching through a state machine and it essentially threads the state machine through the cartesian product, switching to the ordinary cartesian product when possible as an optimisation. The execution of the state machine is shared by strings with the same prefix making it reasonably efficient even though the state machine itself isn't optimised. I've implemented the same concept yesterday evening: ----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<---- module WildCartesian where import List data Pat a = All | Lit a deriving (Show, Eq) advancePattern :: Eq a => a -> [Pat a] -> [[Pat a]] advancePattern y (Lit x : ps) | x == y = [ps] | otherwise = [] advancePattern y (All : ps) = [All : ps] ++ [ps] ++ advancePattern y ps advancePattern _ [] = [] generateNotMatching :: Eq a => [a] -> Int -> [[Pat a]] -> [[a]] generateNotMatching alphabet = gen [] where gen _ n pats | any (\ps -> all (== All) ps && (not (null ps) || n == 0)) pats = [] gen acc 0 _ = [reverse acc] gen acc n pats = [ w | x <- alphabet , let pats' = [ p' | p <- pats, p' <- advancePattern x p ] , w <- gen (x : acc) (n - 1) pats' ] test :: IO () test = do t [1,2] 3 [[Lit 1, All, Lit 2]] t ['a','b'] 3 [[Lit 'a', All, Lit 'b'], [Lit 'b', All, Lit 'a']] t [1,2] 3 [[Lit 1, All, Lit 2], [Lit 2, All, Lit 1]] where t a b c = do putStrLn (concat (intersperse " " ["generateNotMatching", show a, show b, show c])) mapM_ (putStrLn . (" "++) . show) (generateNotMatching a b c) ----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<---- Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) && (Linux || FreeBSD || math) for work in Warsaw, Poland Mar 17 '06 #20

 P: n/a wk*******@cox.net wrote: -- The states are lists of regular expressions -- where [a,b,..] means match a or b or... I haven't run or studied your program yet myself but what I had in mind was that the list of wc's are *all* to be excluded, so the list [wc1..wcn] is to correspond generating all tuples matching not(wc1 and .. and wcn). Maybe you're already doing that. The wc's themselves could be logical statements among the 'primitive' wc's. That's why I named it the 'wildcard exclusion problem'. It's a lot easier to specify a list of simpler wc's than create a long logical expression. I missed "any logical combination" :-( It would be quite easy to fix my first program, but I don't have the time to do it right now. Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) && (Linux || FreeBSD || math) for work in Warsaw, Poland Mar 17 '06 #21

 P: n/a You may be interested in, or already know about http://www.lambdassociates.org/ http://www.lambdassociates.org/aboutqi.htm http://www.lambdassociates.org/webbook/contents.htm http://www.lambdassociates.org/prolog.htm Let me know what you think. Mar 17 '06 #22

 P: n/a It would seem that your program is just filtering the full cartesian product, right? The solution I'm looking for generates the elements one-by-one so that it could be used in a loop. Mar 17 '06 #23

 P: n/a wk*******@cox.net said: I haven't run or studied your program yet myself but what I had in mind was that the list of wc's are *all* to be excluded I think it already does what you want. You might want to change run n alphabet pat = generate terminal null transition [pat] n alphabet to to run n alphabet pat = generate terminal null transition pat n alphabet to allow lists of patterns where the patterns in a list are ORed together. Mar 17 '06 #24

 P: n/a wk*******@cox.net wrote: It would seem that your program is just filtering the full cartesian product, right? The solution I'm looking for generates the elements one-by-one so that it could be used in a loop. Oops...missed that part. It took me a while to study the exchange on this topic more thoroughly, and I now fully appreciate the fact that the problem calls for a much more sophisticated approach. Sorry for the hasty shot, I'll give it another shortly. Cheers, Dinko Mar 17 '06 #25

 P: n/a sa wrote: in k: cp:{[c;n;p]+(n#c)_vs(!_ c^n)_dvl,/{2_sv+(,/,/:\:)/(),/:@[x;&x=-1;:[;!c]]}'p} That one goes a long way as a proof of eg evolution theory, you know, monkeys reproducing shakespeare with a typewriter k-board and all that :) examples: cp[2;3;,0 -1 1] (0 0 0 0 1 0 1 0 0 1 0 1 1 1 0 1 1 1) cp[2;3;(0 -1 1;1 -1 0)] (0 0 0 0 1 0 1 0 1 1 1 1) cp[2;3;(0 -1 1;1 -1 1)] (0 0 0 0 1 0 1 0 0 1 1 0) arguments of cp: c = cardinality of the input set n = power p = list of patterns (-1 = wildcard) the algorithm directly computes the target set. in other words, it does not generate the set, then filter the matches from the target. modifying cp to accept s instead of the cardinality of s, patterns expressed in terms of elements of s, &c. adds nothing of interest to the problem. Mar 17 '06 #26

 P: n/a Tomasz Zielonka said: I've implemented the same concept yesterday evening... It's uncanny reading such similar code coming from another person! Mar 17 '06 #27

 P: n/a Wade Humeniuk wrote: wk*******@cox.net wrote: What I have in mind is the efficient, generation of the complement S^n/WC(S^n). A good program should initialize, generate, and terminate. T=cartprodex(S,n,WC); //initialize for all i in T do what you want with i test to see if any more terminate if not and it should do this without explicitly generating WC and then complementing. For example, if the cardinality of S is m, and the WC is just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality (m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016 while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general the program should directly generate EX from arbitrary WC. Of course, in practice the WC should themselves occur in a logically consistent manner, but let's just assume they're a given. Another attempt. I have made no special attempt to create an exclusion language, just used an anonymous lambda predicate. FWIW, here's my Q-and-D pattern matcher (only partially tested). (defun match(list pattern &optional (test #'eql)) "Match a list of atoms against a pattern list using :all as a 0-to-many wildcard, :single as a 1-to-1 wildcard, a list of elements or a single element to match a specific place. Optional argument test for comparing elements (default eql). Returns: T if match is made, NIL otherwise. Examples: (match '(0 1 2 3 4 5) '(:all (2 3) 3 :single 5 :all)) => T (match '(0 1 2 3 4 5) '(:all (2 3) 3 :single 5 :single)) => NIL" (let ((current (first pattern)) (next-pattern (rest pattern)) (candidate (first list))) (cond ((and (null pattern) (null list)) t) ((and (eq :single current) candidate) (match (rest list) next-pattern test)) ((eq :all current) (loop for new-list on list when (match new-list next-pattern test) do (return-from match t)) (null next-pattern)) ; last case null remainder ((if(atom current) (funcall test candidate current) (member candidate current :test test)) (match (rest list) next-pattern test))))) -- Geoff Mar 17 '06 #28

 P: n/a "wk*******@cox.net" writes: What I have in mind is the efficient, generation of the complement S^n/WC(S^n). A good program should initialize, generate, and terminate. T=cartprodex(S,n,WC); //initialize for all i in T do what you want with i test to see if any more terminate if not and it should do this without explicitly generating WC and then complementing. For example, if the cardinality of S is m, and the WC is just '*a*b*', with a != b, then EX(S^n):=S^n\WC(S^n) has cardinality (m-1)^(n-1)*(m+n-1). Specifically, if m=5 and n=10, then |EX|=3670016 while |S^10|=9765625, so that |EX|/|S^10| is about 0.3758. In general the program should directly generate EX from arbitrary WC. Of course, in practice the WC should themselves occur in a logically consistent manner, but let's just assume they're a given. The following code doesn't build a data structure. It recurses n deep and those n stack frames contain state that tracks progress through the product. But it still generates and tests, taking time proportional to S^n. Is this what you had in mind? ;; A macro to let you loop over the S^n possibilities ;; without building a big data structure ;; The macro is structured as syntactic sugar on ;; a higher order function ;; (defmacro cartesian-power ((variable set exponent) &body code) (let ((body-function (gensym))) `(flet ((,body-function(,variable),@code)) (cartesian-power-hof ,set ,exponent '() (function ,body-function))))) ;; The standard idea of using recursion to implement a nest ;; of loops of indefinite depth ;; (defun cartesian-power-hof (set exponent prefix f) (if (zerop exponent) (funcall f prefix) (dolist (item set) (cartesian-power-hof set (- exponent 1) (cons item prefix) f)))) ;; A simple recursive pattern match ;; I haven't thought through the implications ;; I guess that it is exponentially slow on some long ;; patterns ;; (defun wild-match (pattern data) (cond ((endp pattern) (endp data)) ((endp data) (or (endp pattern) (equal pattern '(:wild)))) ((eql (car pattern) :wild) (or (null (cdr pattern)) (wild-match (cdr pattern) data) (wild-match pattern (cdr data)))) ('literal-pattern (and (eql (car pattern) (car data)) (wild-match (cdr pattern) (cdr data)))))) ;; close over a data item to get a function ;; suitable for checking several patterns (defun match-data (data) (lambda(pattern) (wild-match pattern data))) ;; Use the macro and the utilities to count how many are not excluded (defun count-remainder (set exponent &rest exclusions) (let ((count 0)) (cartesian-power (item set exponent) (when (notany (match-data item) exclusions) (incf count))) count)) CL-USER> (loop for i from 3 to 10 do (format t "~&~4D~10D" i (count-remainder '(a b c d e) i '(:wild a :wild b :wild)))) 3 112 4 512 5 2304 6 10240 7 45056 8 196608 9 851968 10 3670016 I can see how a pattern such as (a b :wild) would knock out an element from each of the first two sets so reducing the task from m^n to (m-1)^2 * m^(n-2). Also (:wild a :wild) knocks it down from m^n to (m-1)^n However I can only see the exploitation of (:wild a :wild b :wild) as a hairy special case. Pick one of n places for the first a. Pick earlier elements from the set excluding a, pick later elements from the set excluding b. Add in all the items with a missing altogether (so b can be used freely). I cannot see what algorithm exploits the constraints more generally. Is there a standard technique, page nn of Knuth or the like? Is that what you are actually wanting to see coded? Alan Crowe Edinburgh Scotland Mar 17 '06 #29

 P: n/a wk*******@cox.net wrote: The wildcard exclusion problem is interesting enough to have many distinct, elegant solutions in as many languages. In that case, you should have crossposted to comp.lang.python also. Your program looks like a dog's breakfast. Mar 18 '06 #30

 P: n/a wk*******@cox.net wrote: It would seem that your program is just filtering the full cartesian product, right? The solution I'm looking for generates the elements one-by-one so that it could be used in a loop. One advantage of a generator over filtering the full product is that I, as the user of the generator, am not obligated to iterate over the entire solution space. Are there other _practical_ advantages of generators over mapping & filtering complete sets? Mar 18 '06 #31

 P: n/a "funkyj" writes: One advantage of a generator over filtering the full product is that I, as the user of the generator, am not obligated to iterate over the entire solution space. Are there other _practical_ advantages of generators over mapping & filtering complete sets? Storage. You can iterate over problem spaces much too large to fit in memory. Also generate + iterate can be faster because of reduced memory pressure. Mar 18 '06 #32

 P: n/a Yes, the program is quite a jumble: but it works. And I didn't post to python newsgroup since I was limited to just 5 newsgroups and didn't feel like doing multiple postings to multiple newsgroups. Mar 18 '06 #33

 P: n/a Here is an nice intro to K: http://www.kuro5hin.org/?op=displays...1/14/22741/791 "This is where K starts to set itself from apart from most of the common programming languages in use today. You rarely write loops in K (KDB is 100% loop-free), instead you use adverbs. An adverb modifies a function, returning another function, changing the ways it operates over its arguments and what it does with it's return values." How about an interactive loop-like version? Generating the target set is good for baby test cases but not if the cardinality of the target is large. Does that make the problem more intersesting? Mar 18 '06 #34

 P: n/a When I run this I get through ghc I get C:\Documents and Settings\User\My Documents\wildcard>ghc "./wc-zielonka.hs" compilation IS NOT required C:/Languages/ghc/ghc-6.4.1/libHSrts.a(Main.o)(.text+0x1d):Main.c: undefined refe rence to `__stginit_ZCMain' C:/Languages/ghc/ghc-6.4.1/libHSrts.a(Main.o)(.text+0x43):Main.c: undefined refe rence to `ZCMain_main_closure' collect2: ld returned 1 exit status Unless there's a command line option I'm missing? Mar 18 '06 #35

 P: n/a The cardinality of excluding '*a*b*' from S^n should be (m-1)^(n-1)*(m+n-1), where m=|S|. For m=5 this becomes 4^(n-1)*(n+4), and your table fits this formula. As far as generating and testing, an 'ideal' solution would be to 'start from the ground up', as in excluding length 2 wc, and then length 3, etc, until all wc's have been excluded. The 'ideal' solution would intrinsically exclude wc's and not test against a background generation of all of S^n. Does that make sense? Mar 18 '06 #36

 P: n/a Nice! How to put it in a loop? I'm totally a newbie to Lisp myself, just gettng into Graham and Touretzky. Let's create a problem. Suppose after excluding I want to know if the digits sum to 12, say, like maybe they're part of a partition. S={0,..6}, S^5, excluding "*1*5*" and "1*2*3*", say. How would I do that? Mar 18 '06 #37

 P: n/a wk*******@cox.net schrieb: "This is where K starts to set itself from apart from most of the common programming languages in use today. You rarely write loops in K (KDB is 100% loop-free), instead you use adverbs. An adverb modifies a function, returning another function, changing the ways it operates over its arguments and what it does with it's return values." Doesn't sound too different from what closures do. Or lazy parameter passing. I'm not sure whether the K designer actually fits that description, but there are too many language designers around reinventing the wheel, arguing whether it should have seven, eight or thirteen sides... Regards, Jo Mar 18 '06 #38

 P: n/a OK, a bad case of RTFM. I saved your file as WildCartesian.hs and then 1) command line: ghci WildCartesian.hs 2) Get some loading messages 3) command line: test and it works! But how do I compile it to get a program with command line arguments? I'm looking through Daume's tutorial right now. Mar 19 '06 #39

 P: n/a Doug Quale wrote: "funkyj" writes: One advantage of a generator over filtering the full product is that I, as the user of the generator, am not obligated to iterate over the entire solution space. Are there other _practical_ advantages of generators over mapping & filtering complete sets? Storage. You can iterate over problem spaces much too large to fit in memory. Also generate + iterate can be faster because of reduced memory pressure. Hmmm...storage is not an issue in the Prolog version. It generates a candidate solution, then checks membership in the wildcard set, then backtracks (backtracking is caused by "fail" in the test goal.) On backtracking, it effectively "forgets" the last solution, so the memory is freed up (or at least free to be reclaimed through GC.) Cheers, Dinko Mar 20 '06 #40

 P: n/a wk*******@cox.net wrote: It would seem that your program is just filtering the full cartesian product, right? The solution I'm looking for generates the elements one-by-one so that it could be used in a loop. OK, having read some of the comments so far, I have the feeling that I may be missing the point in more than one way, so let's set this straight: If I understand correctly, for an alphabet S, and a subset W of S* specified by the wildcards, you expect the enumeration of sequences of length n to run in Theta( n*|S^n - W| ) instead of Theta( n*|S^n| ). First, this doesn't seem to hold for your Python program. Try, for example, S = { a, b, c }, W = { *a*b*, *b*c*, *c*a*, *b*a*, *c*b*, *a*c* }, with some large values of n. Theta( n*|S^n - W| ) predicts that the enumeration time should grow linearly with n, as |S^n - W| = 3, but if you take some measurements, you'd notice that it grows faster than that. Second, my current bet is that such an improvement in asymptotic complexity is not possible, if we consider *both* pre-processing of the wildcard set and subsequent enumeration. Speculation: the time for building-up a smart structure to speed-up enumeration, together with the time for enumerating the set using that structure, should sum up to roughly Theta( n*|S^n| ), even with a really smart algorithm. Even if you're willing to pay up-front for tighter loop execution later, and you build a suitable structure for this purpose, you would have to consider the structure's size, so here's another speculation: such structure would likely take up Theta( |S^n| ) space in memory, in the worst case. I would really appreciate it if you could pour some light into what you're trying to do exactly, and possibly point out anything that I might have missed so far. Cheers, Dinko Mar 20 '06 #41

 P: n/a Hi, You wrote into the Qilang News group with your problem. This is a solution in 17 lines of Qi for any n-product >= 2. It falls short of your complete requirement since it uses generate and then test, rather than interleaving the two. (define challenge Patterns N X -> (filter (/. Y (member Y Patterns)) (n-product N X))) (define n-product 2 X -> (cartesian-product l X X) N X -> (cartesian-product c X (n-product (- N 1) X))) (define cartesian-product _ [ ] _ -> [ ] c [X | Y] Z -> (append (map (/. W [X | W]) Z) (cartesian-product c Y Z)) l [X | Y] Z -> (append (map (/. W [X W]) Z) (cartesian-product l Y Z))) (define filter _ [] -> [] F [X | Y] -> (filter F Y) where (F X) F [X | Y] -> [X | (filter F Y)]) (define member _ [] -> false X [Pattern | _] -> true where (query-prolog [[= Pattern X]]) X [_ | Patterns] -> (member X Patterns)) Notes: Pattern filtering is done by a unification test within the member function. You can do this most easily by calling Qi Prolog using query-prolog. Here's a test. (42 -) (n-product 3 [a b c]) [[a a a] [a a b] [a a c] [a b a] [a b b] [a b c] [a c a] [a c b] [a c c] [b a a] [b a b] [b a c] [b b a] [b b b] [b b c] [b c a] [b c b] [b c c] [c a a] [c a b] [c a c] [c b a] [c b b] [c b c] [c c a] [c c b] [c c c]] OK, remove all lists beginning [a a ....]. (43-) (challenge [[a a | X]] 3 [a b c]) [[a b a] [a b b] [a b c] [a c a] [a c b] [a c c] [b a a] [b a b] [b a c] [b b a] [b b b] [b b c] [b c a] [b c b] [b c c] [c a a] [c a b] [c a c] [c b a] [c b b] [c b c] [c c a] [c c b] [c c c]] Remove all lists beginning with a or b. (51-) (challenge [[a | X] [b | X]] 3 [a b c]) [[c a a] [c a b] [c a c] [c b a] [c b b] [c b c] [c c a] [c c b] [c c c]] Mark Mar 20 '06 #42

 P: n/a I'd like to propose a coding challenge of my own. The challenge is to reproduce the TEA (Tiny Encryption Algorith): http://www.simonshepherd.supanet.com/tea.htm in your language of choice. Here's the code, just two simple functions: void encipher(unsigned long *const v,unsigned long *const w, const unsigned long *const k) { register unsigned long y=v,z=v,sum=0,delta=0x9E3779B9, a=k,b=k,c=k,d=k,n=32; while(n-->0) { sum += delta; y += (z << 4)+a ^ z+sum ^ (z >> 5)+b; z += (y << 4)+c ^ y+sum ^ (y >> 5)+d; } w=y; w=z; } void decipher(unsigned long *const v,unsigned long *const w, const unsigned long *const k) { register unsigned long y=v,z=v,sum=0xC6EF3720, delta=0x9E3779B9,a=k,b=k, c=k,d=k,n=32; /* sum = delta<<5, in general sum = delta * n */ while(n-->0) { z -= (y << 4)+c ^ y+sum ^ (y >> 5)+d; y -= (z << 4)+a ^ z+sum ^ (z >> 5)+b; sum -= delta; } w=y; w=z; } I had a crack at it in Lisp. My version doesn't work - but of greater concern to me is that it doesn't appear nearly as compact as the C version. Anyway, here's my Lisp code (no prizes for guessing that I'm a noob to Lisp): (defconstant delta 2654435769 ) ; delta= 0x9E3779B9 (defun floorn (n) (nth-value 0 (floor n))) (defun >> (val num-bytes) "Right-shift positive integer val by num-bytes" (let* (t1 t2) (setf t1 (expt 2 num-bytes)) (setf t2 (/ val t1)) (floor t2))) (defun << (val num-bytes) "Left-shift positive integer v by num-bytes" (* val (expt 2 num-bytes))) (defun <<4 (i) (<< i 4)) (defun byte-n (v n) "Return the nth byte of a value v" (let* ((bits-to-shift (* 8 (1- n))) (shifted-value (>> v bits-to-shift))) (logand shifted-value 256))) (defun transform (v1 v2 v3 v4) (let (t1 t2 t3) (setf t1 (<<4 v1)) (setf t2 (expt v2 v1)) (setf t3 (expt v3 (>> v2 5))) (+ t1 t2 t3 v4))) (defun pack64 (b1 b2) (+ (<< b1 32) b2)) (defun encipher (v k) (let ((sum 0) (a (byte-n k 3)) ; a=k (b (byte-n k 2)) ; b=k (c (byte-n k 1)) ; c=k (d (byte-n k 0)) ; d=k (y (byte-n v 1)) ; y=v (z (byte-n v 0))) ; z=v (loop for n from 0 to 31 do ;n=32, while(n-->0) (incf sum delta) ;sum += delta; (incf y (transform z a sum b)) ; y += (z << 4)+a ^ z+sum ^ (z >> 5)+b (incf z (transform y c sum d)) ;z += (y << 4)+c ^ y+sum ^ (y >> 5)+d; ) (pack64 y z) ; w=y; w=z; )) (defun decipher (v k) (let ((sum 3337565984) ; 0xC6EF3720 (a (byte-n k 3)) ; a=k (b (byte-n k 2)) ; b=k (c (byte-n k 1)) ; c=k (d (byte-n k 0)) ; d=k (y (byte-n v 1)) ; y=v (z (byte-n v 0))) ; z=v (loop for n from 0 to 31 do ;n=32, while(n-->0) (decf z (transform y c sum d)) ;z -= (y << 4)+c ^ y+sum ^ (y >> 5)+d; (decf y (transform z a sum b)) ;y -= (z << 4)+a ^ z+sum ^ (z >> 5)+b; (decf sum delta) ;sum -= delta; ) (pack64 y z) ; w=y; w=z; )) Mar 20 '06 #43

 P: n/a Interesting. But you probably need to post this as a new message, since it is a distinctly different problem from the one driving this thread. Mark Mar 20 '06 #44

 P: n/a Mark Tarver wrote: Interesting. At the risk of being labelled a troll, one thought that is occuring to me is that in Lisp it seems that sometimes it is difficult to achieve a simple thing in a simple way. To clarify ... recently, I had been trying to obtain md5 hashes of the files we had on our server (a different exercise than the one I mentioned in my OP, just in case you thought that I didn't understand the difference between encryption and hashing). There is an md5 package for Lisp available on the web, which I used with CLISP. I had a file that contained a non-standard character, causing CLISP to throw an error when it tried to print it. Well, I suppose I could have tried to figure out a way to cajole CLISP into printing something it didn't want to print, but I was keen to give Corman Lisp 2.5 a try-out anyway, so I tried the package on it. EXCEPT, for some reason when you try to read a file with an :element-type of (unsigned-byte 8) (or something similar), Corman didn't like it. In the end, I hacked together an md5 DLL from some sources I found on the internet. You can get the package here, together with Corman Lisp bindings: http://www.markcarter.me.uk/computin...d5mc/md5mc.htm In the past, I had also employed a similar technique in order to get access to some console functions that I was interested in. My worry is that it seems to be a recurring theme with me ... get stumped in Lisp, realise that it is probably just plain easier in C, and then link the whole thing together in Lisp. Which is kinda less than expected. Mar 20 '06 #45

 P: n/a "Dinko Tenev" wrote in message news:11*********************@e56g2000cwe.googlegro ups.com... wk*******@cox.net wrote: It would seem that your program is just filtering the full cartesian product, right? The solution I'm looking for generates the elements one-by-one so that it could be used in a loop. Oops...missed that part. It took me a while to study the exchange on this topic more thoroughly, and I now fully appreciate the fact that the problem calls for a much more sophisticated approach. Sorry for the hasty shot, I'll give it another shortly. I wouldn't worry about it, Prolog generated the elements one-by-one. The loop was the print,nl,fail line. Just beefing it up a bit, I didn't take the time to clean it up though. :-) gen(_,0,[]). gen(S,N,[H|T]):- N > 0, N1 is N - 1, member(H,S), gen(S,N1,T). filter([],[]). filter([X|T],[X|T1]):- filter(T,T1). filter([*|T],L):- filter(T,L). filter([*|T],[_|T1]):- filter([*|T],T1). filter_list(L,[[and|T]|_]):- filter_and(L,T), !. filter_list(L,[[or|T]|_]):- filter_list(L,T), !. filter_list(L,[H|_]):- H \= [and|_], H \= [or|_], filter(H,L),!. filter_list(L,[H|T]):- H \= [and|_], H \= [or|_], filter_list(L,T). filter_and(_,[]) :- !. filter_and(L,[H|T]):- filter_list(L,[H]), filter_and(L,T). generate_member(X,S,N,[]):-gen(S,N,X). generate_member(X,S,N,[H|T]):-gen(S,N,X),\+ filter_list(X,[H|T]). 1 ?- generate_member(X,[a,b],3,[[a,*,b],[b,*,a]]). X = [a, a, a] ; X = [a, b, a] ; X = [b, a, b] ; X = [b, b, b] ; No 2 ?- generate_member(X,[1,2],3,[[and, [*,2], [or, [2,1,*], [1,2,*]]]]). X = [1, 1, 1] ; X = [1, 1, 2] ; X = [1, 2, 1] ; X = [2, 1, 1] ; X = [2, 2, 1] ; X = [2, 2, 2] ; No --- Geoff Mar 20 '06 #46

 P: n/a > I had a crack at it in Lisp. My version doesn't work - but of greater concern to me is that it doesn't appear nearly as compact as the C version. Anyway, here's my Lisp code (no prizes for guessing that I'm a noob to Lisp): Lot's of things you can write more compact. But compact is not always the best way to write source. For me the most important criteria is that I can return to some source after, say, a year absence and everything is clear and readable again. (defconstant delta 2654435769 ) ; delta= 0x9E3779B9 (defconstant +delta+ #x9E3779B9) (defun floorn (n) (nth-value 0 (floor n))) is above used? (defun >> (val num-bytes) "Right-shift positive integer val by num-bytes" (let* (t1 t2) (setf t1 (expt 2 num-bytes)) (setf t2 (/ val t1)) (floor t2))) (defun >> (val num-bytes) "Right-shift positive integer val by num-bytes" (floor (/ val (expt 2 num-bytes)))) (defun transform (v1 v2 v3 v4) (let (t1 t2 t3) (setf t1 (<<4 v1)) (setf t2 (expt v2 v1)) (setf t3 (expt v3 (>> v2 5))) (+ t1 t2 t3 v4))) (defun transform (v1 v2 v3 v4) (+ (<<4 v1) (expt v2 v1) (expt v3 (>> v2 5)) v4)) and so on... Mar 20 '06 #48

 P: n/a jo****@corporate-world.lisp.de writes: (defun >> (val num-bytes) "Right-shift positive integer val by num-bytes" (floor (/ val (expt 2 num-bytes)))) or just (floor val (expt 2 num-bytes)) 'as Mar 20 '06 #49 