Efficient way of generating original alphabetic strings like unix file "split"

Hi,

I'm looking to generate x alphabetic strings in a list of size x. This
is exactly the same output that the Unix command "split" generates as
default file name output when splitting large files.

Example:

Produce x original, but not random, strings from the English alphabet, all
lowercase. The length of each string and the number of possible
combinations depend on x. You don't want any repeats.

[aaa, aab, aac, aad, .... aax, ...... bbc, bbd, .... bcd]

I'm assuming there is a slick, Pythonic way of doing this, besides
writing out a beast of a looping function. I've looked around on
the ActiveState Cookbook, but have come up empty-handed. Any suggestions?

Thanks,
Conor

Jun 14 '07 #1
On Jun 14, 1:41 pm, py_genetic <conor.robin...@gmail.com> wrote:
> Hi,
>
> I'm looking to generate x alphabetic strings in a list of size x. This
> is exactly the same output that the Unix command "split" generates as
> default file name output when splitting large files.
>
> Example:
>
> Produce x original, but not random, strings from the English alphabet, all
> lowercase. The length of each string and the number of possible
> combinations depend on x. You don't want any repeats.
>
> [aaa, aab, aac, aad, .... aax, ...... bbc, bbd, .... bcd]
>
> I'm assuming there is a slick, Pythonic way of doing this, besides
> writing out a beast of a looping function. I've looked around on
> the ActiveState Cookbook, but have come up empty-handed. Any suggestions?

If you allow numbers also, you can use base 36:
>>> import gmpy
>>> int('aaa',36)
13330
>>> for n in xrange(13330,13330+1000):
...     print gmpy.digits(n,36),
...

aaa aab aac aad aae aaf aag aah aai aaj aak aal
aam aan aao aap aaq aar aas aat aau aav aaw aax
aay aaz ab0 ab1 ab2 ab3 ab4 ab5 ab6 ab7 ab8 ab9
aba abb abc abd abe abf abg abh abi abj abk abl
abm abn abo abp abq abr abs abt abu abv abw abx
aby abz ac0 ac1 ac2 ac3 ac4 ac5 ac6 ac7 ac8 ac9
aca acb acc acd ace acf acg ach aci acj ack acl
acm acn aco acp acq acr acs act acu acv acw acx
acy acz ad0 ad1 ad2 ad3 ad4 ad5 ad6 ad7 ad8 ad9
ada adb adc add ade adf adg adh adi adj adk adl
adm adn ado adp adq adr ads adt adu adv adw adx
ady adz ae0 ae1 ae2 ae3 ae4 ae5 ae6 ae7 ae8 ae9
aea aeb aec aed aee aef aeg aeh aei aej aek ael
aem aen aeo aep ...
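
For letters only, the same kind of sequence can be produced without gmpy. The following is a minimal Python 3 sketch, not code from this thread: `split_names` and its `width` parameter are hypothetical names chosen for illustration. It relies on `itertools.product` yielding its tuples in lexicographic order when the input alphabet is sorted.

```python
from itertools import islice, product
from string import ascii_lowercase


def split_names(count, width):
    """Return the first `count` lowercase strings of length `width`, in order."""
    # There are only 26**width distinct strings of this length.
    if count > 26 ** width:
        raise ValueError("width too small for the requested count")
    # product() is lazy, so only `count` tuples are ever built.
    letters = product(ascii_lowercase, repeat=width)
    return ["".join(t) for t in islice(letters, count)]


print(split_names(5, 3))  # ['aaa', 'aab', 'aac', 'aad', 'aae']
```

Because the generator is lazy, asking for the first 1000 names of width 3 does not build all 17576 candidates first.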

> Thanks,
> Conor

Jun 14 '07 #2
py_genetic <co************@gmail.com> writes:
> Hi,
>
> I'm looking to generate x alphabetic strings in a list of size x. This
> is exactly the same output that the Unix command "split" generates as
> default file name output when splitting large files.
>
> Example:
>
> Produce x original, but not random, strings from the English alphabet, all
> lowercase. The length of each string and the number of possible
> combinations depend on x. You don't want any repeats.
>
> [aaa, aab, aac, aad, .... aax, ...... bbc, bbd, .... bcd]
>
> I'm assuming there is a slick, Pythonic way of doing this, besides
> writing out a beast of a looping function. I've looked around on
> the ActiveState Cookbook, but have come up empty-handed. Any suggestions?

You didn't try hard enough. :)

http://aspn.activestate.com/ASPN/Coo.../Recipe/190465

--
HTH,
Rob
Jun 14 '07 #3
> You didn't try hard enough. :)
>
> http://aspn.activestate.com/ASPN/Coo.../Recipe/190465
>
> --
> HTH,
> Rob

Thanks Rob, "permutation" was the keyword I should have used!

Jun 14 '07 #4
On Jun 14, 3:08 pm, Rob Wolfe <r...@smsnet.pl> wrote:
> py_genetic <conor.robin...@gmail.com> writes:
> > Hi,
> > I'm looking to generate x alphabetic strings in a list of size x. This
> > is exactly the same output that the Unix command "split" generates as
> > default file name output when splitting large files.
> > Example:
> > Produce x original, but not random, strings from the English alphabet, all
> > lowercase. The length of each string and the number of possible
> > combinations depend on x. You don't want any repeats.
> > [aaa, aab, aac, aad, .... aax, ...... bbc, bbd, .... bcd]
> > I'm assuming there is a slick, Pythonic way of doing this, besides
> > writing out a beast of a looping function. I've looked around on
> > the ActiveState Cookbook, but have come up empty-handed. Any suggestions?
>
> You didn't try hard enough. :)
>
> http://aspn.activestate.com/ASPN/Coo.../Recipe/190465

Unfortunately, that's a very poor example. The terminology is
all wrong.

"xpermutations takes all elements from the sequence, order matters."
This ought to be the Cartesian Product, but it's not (no replacement).

"xcombinations takes n distinct elements from the sequence, order
matters."
If order matters, it's a PERMUTATION, period.

"xuniqueCombinations takes n distinct elements from the sequence,
order is irrelevant."
No such thing, a Combination is unique by definition.

"xselections takes n elements (not necessarily distinct) from the
sequence, order matters."
Ah, this allows a size operator, so if size = length, we get full
Cartesian Product.

The proper terminology for the Cartesian Product and
its subsets is:

Permutations with replacement
Combinations with replacement
Permutations without replacement
Combinations without replacement

And if the functions were properly labeled, you would get:

permutation without replacement - size 4

Permutations of 'love'
love loev lvoe lveo leov levo olve olev ovle ovel oelv oevl vloe vleo
vole voel velo veol elov elvo eolv eovl evlo evol

permutation without replacement - size 2

Combinations of 2 letters from 'love'
lo lv le ol ov oe vl vo ve el eo ev

combination without replacement - size 2

Unique Combinations of 2 letters from 'love'
lo lv le ov oe ve

permutation with replacement - size 2

Selections of 2 letters from 'love'
ll lo lv le ol oo ov oe vl vo vv ve el eo ev ee

full Cartesian Product, permutations with replacement - size 4

Selections of 4 letters from 'love'
llll lllo lllv llle llol lloo llov lloe llvl llvo llvv llve llel lleo
llev llee loll lolo lolv lole lool looo loov looe lovl lovo lovv love
loel loeo loev loee lvll lvlo lvlv lvle lvol lvoo lvov lvoe lvvl lvvo
lvvv lvve lvel lveo lvev lvee lell lelo lelv lele leol leoo leov leoe
levl levo levv leve leel leeo leev leee olll ollo ollv olle olol oloo
olov oloe olvl olvo olvv olve olel oleo olev olee ooll oolo oolv oole
oool oooo ooov oooe oovl oovo oovv oove ooel ooeo ooev ooee ovll ovlo
ovlv ovle ovol ovoo ovov ovoe ovvl ovvo ovvv ovve ovel oveo ovev ovee
oell oelo oelv oele oeol oeoo oeov oeoe oevl oevo oevv oeve oeel oeeo
oeev oeee vlll vllo vllv vlle vlol vloo vlov vloe vlvl vlvo vlvv vlve
vlel vleo vlev vlee voll volo volv vole vool vooo voov vooe vovl vovo
vovv vove voel voeo voev voee vvll vvlo vvlv vvle vvol vvoo vvov vvoe
vvvl vvvo vvvv vvve vvel vveo vvev vvee vell velo velv vele veol veoo
veov veoe vevl vevo vevv veve veel veeo veev veee elll ello ellv elle
elol eloo elov eloe elvl elvo elvv elve elel eleo elev elee eoll eolo
eolv eole eool eooo eoov eooe eovl eovo eovv eove eoel eoeo eoev eoee
evll evlo evlv evle evol evoo evov evoe evvl evvo evvv evve evel eveo
evev evee eell eelo eelv eele eeol eeoo eeov eeoe eevl eevo eevv eeve
eeel eeeo eeev eeee

And Combinations with replacement seems to be missing.
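
The four categories named above map directly onto functions in the standard `itertools` module of modern Python. The sketch below is an illustration in Python 3, not the cookbook recipe discussed in this post; the helper `joined` and the variable names are hypothetical. It reproduces the size-2 listings for 'love' and also shows the combinations-with-replacement case.

```python
from itertools import (
    combinations,
    combinations_with_replacement,
    permutations,
    product,
)


def joined(iterable):
    """Join each tuple of letters into a string."""
    return ["".join(t) for t in iterable]


word = "love"

# Order matters, no letter reused.
perm_no_repl = joined(permutations(word, 2))
# Order ignored, no letter reused.
comb_no_repl = joined(combinations(word, 2))
# Order matters, letters may repeat (Cartesian product).
perm_repl = joined(product(word, repeat=2))
# Order ignored, letters may repeat.
comb_repl = joined(combinations_with_replacement(word, 2))

print(len(perm_no_repl), len(comb_no_repl), len(perm_repl), len(comb_repl))
# 12 6 16 10
```

For output like that of "split", where letters repeat and order matters, `product` is the relevant function.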

> --
> HTH,
> Rob

Jun 14 '07 #5
On Jun 14, 4:39 pm, py_genetic <conor.robin...@gmail.com> wrote:
> > You didn't try hard enough. :)
> > http://aspn.activestate.com/ASPN/Coo.../Recipe/190465
> > --
> > HTH,
> > Rob
>
> Thanks Rob, "permutation" was the keyword I should have used!

See my other post to see if that is indeed what you mean.

Jun 14 '07 #6
On Jun 14, 3:02 pm, "mensana...@aol.com" <mensana...@aol.com> wrote:
> On Jun 14, 4:39 pm, py_genetic <conor.robin...@gmail.com> wrote:
> > > You didn't try hard enough. :)
> > > http://aspn.activestate.com/ASPN/Coo.../Recipe/190465
> > > --
> > > HTH,
> > > Rob
> > Thanks Rob, "permutation" was the keyword I should have used!
>
> See my other post to see if that is indeed what you mean.

Thanks, mensanator. I see what you are saying, and I appreciate your
clarification. I modified the unique version to fit my needs:
sometimes you just want the first x unique combinations, with strings
of the right "width" (A or AA or AAA...), so I reworked it a bit to be
more efficient. Isn't this a case of base^n - 1 for the number of unique
combinations? Using the alphabet, that is 26^strlen - 1. Or, to figure
out strlen from the number of combinations needed:
ln(26 * number of combinations needed) / ln(26), obviously a float, but
a pretty good idea of the strlen needed when rounded?
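
As a check on that estimate: there are exactly 26^n lowercase strings of length n, so the smallest usable strlen for x strings is the smallest n with 26^n >= x. The sketch below is a Python 3 illustration under that assumption; `min_width` is a hypothetical name, and it uses integer arithmetic so that exact powers of 26 are not affected by floating-point rounding.

```python
def min_width(count, base=26):
    """Return the smallest n (at least 1) such that base**n >= count."""
    if count < 1:
        raise ValueError("count must be positive")
    width, capacity = 1, base
    # Grow the width until the alphabet can supply enough distinct strings.
    while capacity < count:
        width += 1
        capacity *= base
    return width


for x in (1, 26, 27, 676, 677, 17576, 17577):
    print(x, min_width(x))
```

The formula in the post equals 1 + log26(x). In exact arithmetic, truncating it (rather than rounding to nearest) agrees with `min_width` except when x is an exact power of 26 such as 26 or 676, where it comes out one too large.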

Jun 19 '07 #7
