MoinMoin WikiName and python regexes


hi-

i know nada about python so please forgive me if this is way off base. i'm
trying to fix a bug in MoinMoin whereby

WordsWithTwoCapsInARowLike
                  ^^
                  ^^
                  ^^

do not become WikiNames. this is because the wikiname pattern is
basically

/([A-Z][a-z]+){2,}/

but should be (IMHO)

/([A-Z]+[a-z]+){2,}/

however, the way the patterns are constructed like

word_rule = ur'(?:(?<![%(l)s])|^)%(parent)s(?:%(subpages)s(?:[%(u)s][%(l)s]+){2,})+(?![%(u)s%(l)s]+)' % {
    'u': config.chars_upper,
    'l': config.chars_lower,
    'subpages': config.allow_subpages and (wikiutil.CHILD_PREFIX + '?') or '',
    'parent': config.allow_subpages and (ur'(?:%s)?' % re.escape(PARENT_PREFIX)) or '',
}

and i'm not that familiar with python syntax. to me this looks like a map
used to bind variables into the regex - or is it binding into a string then
compiling that string into a regex? regexes don't seem to be literal objects
in python AFAIK... i'm thinking i need something like

word_rule = ur'(?:(?<![%(l)s])|^)%(parent)s(?:%(subpages)s(?:[%(u)s]+[%(l)s]+){2,})+(?![%(u)s%(l)s]+)' % {
                                                                    ^
                                                                    ^
                                                                    ^
    'u': config.chars_upper,
    'l': config.chars_lower,
    'subpages': config.allow_subpages and (wikiutil.CHILD_PREFIX + '?') or '',
    'parent': config.allow_subpages and (ur'(?:%s)?' % re.escape(PARENT_PREFIX)) or '',
}

and this seems to work - but i'm wondering what the 's' in '%(u)s' implies?
obviously the u is the char range (unicode?)... but what's the 's'?
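
A minimal sketch of how the original and the relaxed rule could be compared under Python 3, where the `ur'...'` prefix used in the MoinMoin source is no longer valid syntax and a plain `r'...'` string does the same job. Everything named below (`chars_upper`, `chars_lower`, `allow_subpages`, `CHILD_PREFIX`, `PARENT_PREFIX`, the extra `caps` key and `build_word_rule`) is an assumed ASCII stand-in for illustration, not MoinMoin's real configuration or API.

```python
import re
import string

# Assumed stand-ins for MoinMoin's config/wikiutil values (hypothetical).
chars_upper = string.ascii_uppercase
chars_lower = string.ascii_lowercase
allow_subpages = True
CHILD_PREFIX = '/'
PARENT_PREFIX = '../'

# Same shape as word_rule; the added %(caps)s slot is '' for the original
# rule and '+' for the proposed "one or more capitals" variant.
TEMPLATE = (r'(?:(?<![%(l)s])|^)%(parent)s'
            r'(?:%(subpages)s(?:[%(u)s]%(caps)s[%(l)s]+){2,})+'
            r'(?![%(u)s%(l)s]+)')


def build_word_rule(relaxed):
    """Fill the template with % formatting and return the pattern text."""
    return TEMPLATE % {
        'u': chars_upper,
        'l': chars_lower,
        'caps': '+' if relaxed else '',
        'subpages': (CHILD_PREFIX + '?') if allow_subpages else '',
        'parent': (r'(?:%s)?' % re.escape(PARENT_PREFIX)) if allow_subpages else '',
    }


original_re = re.compile(build_word_rule(relaxed=False))
relaxed_re = re.compile(build_word_rule(relaxed=True))

text = 'see WordsWithTwoCapsInARowLike here'
original_match = original_re.search(text)
relaxed_match = relaxed_re.search(text)

print('original:', original_match.group(0) if original_match else None)
print('relaxed: ', relaxed_match.group(0) if relaxed_match else None)
```

Printing both results is deliberate: the original rule may still pick up a fragment of the word rather than nothing at all, so it is worth looking at what each pattern actually matched.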

i'm looking at

http://docs.python.org/lib/re-syntax.html
http://www.amk.ca/python/howto/regex/

and coming up dry. sorry i don't have more time to rtfm - just want to
implement this simple fix and get on to fcgi configuration! ;-)

cheers.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| My religion is very simple. My religion is kindness.
| --Tenzin Gyatso
===============================================================================

Jul 19 '05 #1
Don
Ara.T.Howard wrote:

hi-

i know nada about python so please forgive me if this is way off base.
i'm trying to fix a bug in MoinMoin whereby

WordsWithTwoCapsInARowLike
                  ^^
                  ^^
                  ^^

do not become WikiNames. this is because the wikiname pattern is
basically

[snip]

PHPWiki has the same "feature", BTW. (Sorry, couldn't get MoinMoin to work
on Sourceforge, had to use PHPWiki).

-Don

Jul 19 '05 #2
Ara.T.Howard wrote:
(...)
and i'm not that familiar with python syntax. to me this looks like a map
used to bind variables into the regex - or is it binding into a string then
compiling that string into a regex? regexes don't seem to be literal
objects in python AFAIK... i'm thinking i need something like

word_rule = ur'(?:(?<![%(l)s])|^)%(parent)s(?:%(subpages)s(?:[%(u)s]+[%(l)s]+){2,})+(?![%(u)s%(l)s]+)' % {
                                                                    ^
                                                                    ^
                                                                    ^
    'u': config.chars_upper,
    'l': config.chars_lower,
    'subpages': config.allow_subpages and (wikiutil.CHILD_PREFIX + '?') or '',
    'parent': config.allow_subpages and (ur'(?:%s)?' % re.escape(PARENT_PREFIX)) or '',
}

and this seems to work - but i'm wondering what the 's' in '%(u)s' implies?
obviously the u is the char range (unicode?)... but what's the 's'?


an example may help here:

>>> a = 123
>>> '%04d' % a
'0123'
>>> '%f' % a
'123.000000'
>>> '%s' % a
'123'

that "s" tells python to convert the number to a string. the form %(key)s
tells python to look up "key" in a dictionary and format the value it finds
as a string:

>>> d = {'key': 123}
>>> '%(key)s' % d
'123'

so in your code there are keys named 'u', 'l', 'subpages', etc., and
their values are substituted into that big RE, replacing the
corresponding %(key)s placeholders.
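
To make the two steps explicit (string formatting first, regex compilation second), here is a small sketch in Python 3. The `chars` dictionary is a hypothetical stand-in for `config.chars_upper` and `config.chars_lower`, and the pattern is a cut-down version of the rule, not the real `word_rule`.

```python
import re
import string

# The 's' conversion calls str() on whatever value is looked up.
as_text = '%(key)s' % {'key': 123}

# Hypothetical stand-ins for config.chars_upper and config.chars_lower.
chars = {'u': string.ascii_uppercase, 'l': string.ascii_lowercase}

# Step 1: ordinary string formatting; no regex is involved yet.
pattern_text = r'(?:[%(u)s]+[%(l)s]+){2,}' % chars

# Step 2: the finished string is compiled into a pattern object.
word_re = re.compile(pattern_text)

print(as_text)
print(pattern_text)
print(bool(word_re.fullmatch('AFooBar')))
```

So the dictionary is bound into a string, and only the resulting string is handed to `re.compile`; Python has no regex literal syntax.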

HTH.

--
deelan <http://www.deelan.com/>
Jul 19 '05 #3

"Ara.T.Howard" <Ar**********@noaa.gov> wrote in message
news:Pi*******************************@harp.ngdc.noaa.gov...
i'm trying to fix a bug in MoinMoin whereby

A 'bug' is a discrepancy between promise (specification) and performance
(implementation). Have you really found such -- does MoinMoin not follow
the Wiki standard -- or are you just trying to customize MoinMoin to your
different specification?

WordsWithTwoCapsInARowLike
                  ^^
do not become WikiNames.

Would your proposed change to make the above into a WikiName also make
all-cap sequences like NATO, FTP, and API into WikiNames, and do you really
want that? If WikiNum, appearing in one place, were also mistyped as WikiNUm
(from holding down the shift key too long, which I do occasionally), should
the latter become a separate WikiName? I can certainly understand why the
Wiki designers might have answered both questions 'No.'

Terry J. Reedy

Jul 19 '05 #4
On Wed, 8 Jun 2005, Terry Reedy wrote:

"Ara.T.Howard" <Ar**********@noaa.gov> wrote in message
news:Pi*******************************@harp.ngdc.noaa.gov...
i'm trying to fix a bug in MoinMoin whereby

A 'bug' is a discrepancy between promise (specification) and performance
(implementation). Have you really found such -- does MoinMoin not follow
the Wiki standard -- or are you just trying to customize MoinMoin to your
different specification?


well, according to the specification at

http://moinmoin.wikiwikiweb.de/WikiN...%28wikiname%29

ThisIsAWikiName

there seems to be general agreement here

http://wikka.jsnx.com/WikiName
http://twiki.org/cgi-bin/view/TWiki/WikiWord

though not all wikis agree.

in moinmoin others have noted the inconsistency and filed a bug as noted in

http://moinmoin.wikiwikiweb.de/MoinM...%28wikiname%29

the problem being that the specification is simply vague here and does not
specifically prohibit AWikiName.

WordsWithTwoCapsInARowLike
                  ^^
do not become WikiNames.
Would your proposed change to make the above into a WikiName also make
all-cap sequences like NATO, FTP, and API into WikiNames


it wouldn't since

NATO !~ /^([A-Z]+[a-z]+){2,}$/
FTP !~ /^([A-Z]+[a-z]+){2,}$/
API !~ /^([A-Z]+[a-z]+){2,}$/

the pattern is

word = one, or more, upper case letters followed by one, or more, lower case
letters

wikiword = at least two words together

so

FOobar is not a link

but

AFooBar is
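
The claims above are easy to check with a short Python 3 sketch. It assumes plain ASCII `[A-Z]`/`[a-z]` classes and whole-word matching via `fullmatch`, which is a simplification of the real MoinMoin rule (that one uses configurable character sets plus lookbehind and lookahead).

```python
import re

# Simplified versions of the two patterns under discussion.
strict = re.compile(r'(?:[A-Z][a-z]+){2,}')    # current behaviour
relaxed = re.compile(r'(?:[A-Z]+[a-z]+){2,}')  # proposed change

samples = ['WikiName', 'WordsWithTwoCapsInARowLike', 'AFooBar',
           'FOobar', 'NATO', 'FTP', 'API', 'WikiNUm']

# Map each word to (matches strict, matches relaxed).
results = {
    word: (strict.fullmatch(word) is not None,
           relaxed.fullmatch(word) is not None)
    for word in samples
}

for word, (is_strict, is_relaxed) in results.items():
    print('%-28s strict=%-5s relaxed=%s' % (word, is_strict, is_relaxed))
```

Under these simplified patterns the all-caps acronyms match neither rule, because each "word" still needs at least one lower-case letter. A shift-key slip such as WikiNUm, on the other hand, does match the relaxed pattern, so that part of the trade-off is real.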

If WikiNum, appearing in one place, were also mistyped as WikiNUm (from holding
down the shift key too long, which I do occasionally), should the latter
become a separate WikiName? I can certainly understand why the Wiki
designers might have answered both questions 'No.'


perhaps - it's just inconsistent the way it is now.

cheers.
-a

Jul 19 '05 #5
Ara.T.Howard wrote:
i know nada about python so please forgive me if this is way off base. i'm
trying to fix a bug in MoinMoin whereby

WordsWithTwoCapsInARowLike


I don't think there is such a thing as the perfect "hyperlink vs
just-text" convention. In MoinMoin, you can force a custom link using e.g.:

[wiki:WebsiteSecurity this is the link text to WebsiteSecurity so call
it whatever you want such as WebsiteSecurities]

This custom linking, whilst obviously not ideal, solves the problems
mentioned at http://www.c2.com/cgi/wiki?WikiName

This seems better than producing endless confusing variations on the
"standard" (be it formal, actual, or simply obviously desired).

I'm not convinced of the usefulness of MoinMoin's "subpages" idea, while
we're on the (related) subject - they seem to create more problems than
they solve:
http://moinmoin.wikiwikiweb.de/HelpOnEditing/SubPages
Jul 19 '05 #6
On Wed, 8 Jun 2005 09:49:51 -0600, "Ara.T.Howard" <Ar**********@noaa.gov> wrote:

hi-

i know nada about python so please forgive me if this is way off base. i'm
trying to fix a bug in MoinMoin whereby

WordsWithTwoCapsInARowLike
                  ^^
                  ^^
                  ^^

do not become WikiNames. this is because the wikiname pattern is
basically

/([A-Z][a-z]+){2,}/

but should be (IMHO)

/([A-Z]+[a-z]+){2,}/

That would take care of the example above, but does it change an official spec?

however, the way the patterns are constructed like

word_rule = ur'(?:(?<![%(l)s])|^)%(parent)s(?:%(subpages)s(?:[%(u)s][%(l)s]+){2,})+(?![%(u)s%(l)s]+)' % {
    'u': config.chars_upper,
    'l': config.chars_lower,
    'subpages': config.allow_subpages and (wikiutil.CHILD_PREFIX + '?') or '',
    'parent': config.allow_subpages and (ur'(?:%s)?' % re.escape(PARENT_PREFIX)) or '',
}

and i'm not that familiar with python syntax. to me this looks like a map
used to bind variables into the regex - or is it binding into a string then
compiling that string into a regex? regexes don't seem to be literal objects
in python AFAIK... i'm thinking i need something like

word_rule = ur'(?:(?<![%(l)s])|^)%(parent)s(?:%(subpages)s(?:[%(u)s]+[%(l)s]+){2,})+(?![%(u)s%(l)s]+)' % {
                                                                    ^
                                                                    ^
                                                                    ^
    'u': config.chars_upper,
    'l': config.chars_lower,
    'subpages': config.allow_subpages and (wikiutil.CHILD_PREFIX + '?') or '',
    'parent': config.allow_subpages and (ur'(?:%s)?' % re.escape(PARENT_PREFIX)) or '',
}

and this seems to work - but i'm wondering what the 's' in '%(u)s' implies?
obviously the u is the char range (unicode?)... but what's the 's'?

'u' doesn't stand for unicode here. It is the key used to look up
config.chars_upper in the dict. That value could be unicode, and probably is.
The 's' is the final part of a formatting spec, which says how to convert the
data looked up, and 's' is for string, which doesn't change string data
(unless, UIAM, a conversion to unicode is required).

All of the above makes use of the % operator of strings, as in the expression
    fmt % data
where fmt is a string containing ordinary characters and formatting specs,
which are substrings introduced by a leading '%'. The formatting specs take
two basic alternative forms: %<spec> or %(name)<spec>. If a '%' is followed by
a parenthesized name, as in '%(u)s', data must be a mapping and the value to
be formatted is retrieved by that name, i.e. data['u'] in this example.
If there is no parenthesized name, the value is retrieved from data[i], where
data must be a tuple and i is the position of the format spec within fmt.
Where there is no ambiguity and there is only one datum, data may be written
as the bare value instead of a tuple, e.g., instead of (123,) the data could
be written as (123,)[0] or plain 123.
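
The three ways of supplying data can be sketched in a few lines of Python 3. The sample strings and variable names are made up for illustration.

```python
# Positional: data is a tuple, consumed left to right.
positional = '%s has %d caps' % ('AFooBar', 3)

# Named: data is a mapping, looked up by the parenthesized key.
named = '[%(u)s][%(l)s]+' % {'u': 'A-Z', 'l': 'a-z'}

# Single datum: a bare value and a one-element tuple give the same result.
bare = '%04d' % 123
wrapped = '%04d' % (123,)

# Pitfall: a tuple that is itself the datum must be wrapped once more,
# otherwise its items are treated as separate positional arguments.
pair = ('A', 'Z')
pair_text = '%s' % (pair,)
try:
    '%s' % pair
    unwrapped_error = None
except TypeError as exc:
    unwrapped_error = exc

print(positional, named, bare, wrapped, pair_text, sep='\n')
```

The last case is the "ambiguity" mentioned above: the bare-value shortcut is only safe when the single value is not itself a tuple.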

In the word_rule above, %(u)s uses 'u' as a key to get data from the dictionary { 'u': config.chars_upper, ...}
to substitute in the [%(u)s] as a string (that's what the 's' specifies), so config.chars_upper will
presumably have had a string value such as u'ABC..Z' and that will then be inserted in place of the %(u)s to
get u'...[ABC..Z]...' (if fmt is unicode, the resulting string will be unicode, UIAM)

i'm looking at

http://docs.python.org/lib/re-syntax.html
http://www.amk.ca/python/howto/regex/

See also
http://www.python.org/doc/current/li...q-strings.html
(which IMO should be easier to find, but if you click on the index square
at the top right of any library reference page, you can see a "%formatting" link)

and coming up dry. sorry i don't have more time to rtfm - just want to
implement this simple fix and get on to fcgi configuration! ;-)

cheers.

-a

Regards,
Bengt Richter
Jul 19 '05 #7
