What YAML engine do you use?

Reinhold Birkenfeld

Hello,

I know that there are different YAML engines for Python out there (Syck,
PyYaml, more?).

Which one do you use, and why?

For those of you who don't know what YAML is: visit http://yaml.org/!
You will be amazed, and never think of XML again. Well, almost.

Reinhold

Jul 18 '05 #1

Diez B. Roggisch

> I know that there are different YAML engines for Python out there (Syck,
> PyYaml, more?).
>
> Which one do you use, and why?

I first used yaml, tried to migrate to syck. What I like about syck is that
it is faster and doesn't try to create objects but only dicts - but it
crashed if the number of yaml objects grew larger. So I still use yaml.

> For those of you who don't know what YAML is: visit http://yaml.org/!
> You will be amazed, and never think of XML again. Well, almost.

It is certainly nice.

--
Regards,

Diez B. Roggisch

Jul 18 '05 #2

Jonas Galvez

Diez B. Roggisch wrote:

I first used yaml, tried to migrate to syck. What I like about
syck is that it is faster and doesn't try to create objects but
only dicts - but it crashed if the number of yaml objects grew
larger. So I still use yaml.

Hmm.. I've never had any problems with syck. In fact, I'm using it in
a small project now where I store a helluva lot of data in yaml files...

Strange.

Jul 18 '05 #3

Istvan Albert

Reinhold Birkenfeld wrote:

You will be amazed, and never think of XML again.

XML with elementtree is what makes me never have to think about XML again.

Istvan.

Jul 18 '05 #4

Irmen de Jong

Istvan Albert wrote:

XML with elementtree is what makes me never have to think about XML again.

+1 QOTW

-Irmen

Jul 18 '05 #5

Paul Rubin

Reinhold Birkenfeld <re************************@wolke7.net> writes:

For those of you who don't know what YAML is: visit http://yaml.org/!
You will be amazed, and never think of XML again. Well, almost.

Oh please no, not another one of these. We really really don't need it.

Jul 18 '05 #6

rm

Paul Rubin wrote:

> Reinhold Birkenfeld <re************************@wolke7.net> writes:
>> For those of you who don't know what YAML is: visit http://yaml.org/!
>> You will be amazed, and never think of XML again. Well, almost.
>
> Oh please no, not another one of these. We really really don't need it.

well, I did look at it, and as a text format it is more readable than XML
is. Furthermore, XML's verbosity is incredible. This format's is not.
People are abusing the genericity of XML to put everything into it.

Parsing and working with XML are highly optimized, so there's not really
a problem in that sector. But transferring the same data in a YAML
format, rather than an XML format, is much more economical. But networks are
getting faster, right?

Nowadays, people are trying to create binary XML, XML databases,
graphics in XML (btw, I'm quite impressed by SVG), you have XSLT, you
have XSL-FO, ... .

And I think, YAML is a nice initiative.

bye,
rm

Jul 18 '05 #7

Fredrik Lundh

"rm" wrote:

well, I did look at it, and as a text format is more readable than XML is.

judging from http://yaml.org/spec/current.html (750k), the YAML designers are
clearly insane. that's the most absurd software specification I've ever seen. they
need help, not users.

</F>

Jul 18 '05 #8

A.M. Kuchling

On Fri, 21 Jan 2005 18:30:47 +0100,
rm <rm@rm.net> wrote:

Nowadays, people are trying to create binary XML, XML databases,
graphics in XML (btw, I'm quite impressed by SVG), you have XSLT, you
have XSL-FO, ... .

Which is an argument in favor of XML -- it's where the activity is, so it's
quite likely you'll encounter the need to know XML. Few projects use YAML,
so the chance of having to know its syntactic details is small.

--amk

Jul 18 '05 #9

A.M. Kuchling

On Fri, 21 Jan 2005 18:54:50 +0100,
Fredrik Lundh <fr*****@pythonware.com> wrote:

judging from http://yaml.org/spec/current.html (750k), the YAML designers are
clearly insane. that's the most absurd software specification I've ever seen. they
need help, not users.

IMHO that's a bit extreme. Specifications are written to be detailed, so
consequently they're torture to read. Seen the ReStructured Text spec
lately?

The basic idea -- a data dumping format that's human-readable -- isn't a bad
one. OTOH, I can't recall wanting such a thing -- when I want readable
output I'm happy using unreadable pickle files, unpickling the object and
calling a .dump() or .as_text() method.

But YAML seems to have started out with the goal of being human-writable,
something you would write in Emacs, and that seems to have gotten lost; the
format is now just as complicated as Restructured Text, but more cryptic
(the URI namespacing for tags, for example), not really simpler than
XML and in some ways weaker (e.g. only two encodings supported, more
complicated escaping rules).

For a pure Python application, I can't see a need for YAML; use
pickle/cPickle instead, because they're already there. Exchanging
serialized objects between Python/Perl/Ruby scripts might be a good use case
for YAML, but XML has wider software support and S-expressions are simpler,
so my inclination would be to use them instead of YAML.
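
As a rough illustration of the pickle route mentioned above, here is a minimal sketch in current Python (where cPickle has been folded into pickle). The `config` dictionary is a made-up example, not something from this thread, and unpickling data from an untrusted source is unsafe.

```python
import pickle

# Hypothetical application state; any picklable Python object would do.
config = {"name": "demo", "retries": 3, "hosts": ["alpha", "beta"]}

# Serialize to bytes, then restore an equal but distinct object.
blob = pickle.dumps(config)
restored = pickle.loads(blob)

print(restored == config)
print(restored is config)
```

The result is compact and needs no extra library, but it is not human-readable and only Python can load it, which is the trade-off under discussion.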

--amk

Jul 18 '05 #10

Reinhold Birkenfeld

A.M. Kuchling wrote:

> On Fri, 21 Jan 2005 18:54:50 +0100,
> Fredrik Lundh <fr*****@pythonware.com> wrote:
>> judging from http://yaml.org/spec/current.html (750k), the YAML designers are
>> clearly insane. that's the most absurd software specification I've ever seen. they
>> need help, not users.
>
> IMHO that's a bit extreme. Specifications are written to be detailed, so
> consequently they're torture to read. Seen the ReStructured Text spec
> lately?

Agreed. If you just want to use it, you don't need the spec anyway.

> The basic idea -- a data dumping format that's human-readable -- isn't a bad
> one. OTOH, I can't recall wanting such a thing -- when I want readable
> output I'm happy using unreadable pickle files, unpickling the object and
> calling a .dump() or .as_text() method.
>
> But YAML seems to have started out with the goal of being human-writable,
> something you would write in Emacs,

Exactly. I use it as a format for config files the user can edit
directly without much thinking (the explanation at the top of the file is 3
lines).

> and that seems to have gotten lost; the
> format is now just as complicated as Restructured Text, but more cryptic
> (the URI namespacing for tags, for example), not really simpler than
> XML and in some ways weaker (e.g. only two encodings supported, more
> complicated escaping rules).

In most cases you don't need the complicated things, and the
http://www.yaml.org/refcard.html isn't very complex either.

Reinhold

Jul 18 '05 #11

Fredrik Lundh

A.M. Kuchling wrote:

> IMHO that's a bit extreme. Specifications are written to be detailed, so
> consequently they're torture to read. Seen the ReStructured Text spec
> lately?

I've read many specs; YAML (both the spec and the format) is easily
among the worst ten-or-so specs I've ever seen.

ReST and YAML share the same deep flaw: both formats are marketed
as simple, readable formats, and at a first glance, they look simple and
readable -- but in reality, they're messy as hell, and chances are that the thing
you're looking at doesn't really mean what you think it means (unless you're
the official ReST/YAML parser implementation). experienced designers
know how to avoid that; the ReST/YAML designers don't even understand
why they should.

> But YAML seems to have started out with the goal of being human-writable,
> something you would write in Emacs, and that seems to have gotten lost; the
> format is now just as complicated as Restructured Text, but more cryptic
> (the URI namespacing for tags, for example), not really simpler than
> XML and in some ways weaker (e.g. only two encodings supported, more
> complicated escaping rules).

http://www.modelsmodelsmodels.biz/images/hmo033.jpg

</F>

Jul 18 '05 #12

Daniel Bickett

Istvan Albert wrote:

> XML with elementtree is what makes me never have to think about XML again.

I second that. I heard about yaml and I read into it, but when I tried
to use it I didn't seem to get in touch with all of the glory
surrounding it. The yaml module -- when I tried to use it -- was very
error prone, and simply didn't work. I didn't have the time to go
through and try to tweak it because I was pressed for time and needed a
quick solution. As for syck, I don't know if it was just me, but when
I downloaded it I got a whole lot of directories with obscure names
and files with .c extensions. So, discouraged, I gave up on yaml.

Elementtree, on the other hand, is wonderful :)

Irmen de Jong wrote: +1 QOTW

I second that, as well.

here's-to-appreciating-the-end-without-having-to-be-interested-in-the-means-ly
y'rs
Daniel Bickett

Jul 18 '05 #13

Bengt Richter

On Fri, 21 Jan 2005 12:04:10 -0600, "A.M. Kuchling" <am*@amk.ca> wrote:

On Fri, 21 Jan 2005 18:30:47 +0100,
rm <rm@rm.net> wrote:
Nowadays, people are trying to create binary XML, XML databases,
graphics in XML (btw, I'm quite impressed by SVG), you have XSLT, you
have XSL-FO, ... .

Which is an argument in favor of XML -- it's where the activity is, so it's
quite likely you'll encounter the need to know XML. Few projects use YAML,
so the chance of having to know its syntactic details is small.

<rant>
I thought XML was a good idea, but IMO requiring quotes around
even integer attribute values was an unfortunate decision. I don't buy
their rationale of keeping parsing simple -- as if extracting a string
with no embedded space from between an equal sign and terminating white
space were that much harder than extracting the same delimited by double quotes.
The result is cluttering SVG with needless cruft around numerical graphics parameters.
</rant>

OTOH, I think the HTML XML spec is very readable, and nicely designed.
At least the version 1.0 spec I snagged from W3C a long time ago.
... I see the third edition at http://www.w3.org/TR/REC-xml/ is differently styled,
(I guess new style sheets) but still pretty readable (glancing at it now).

Regards,
Bengt Richter

Jul 18 '05 #14

Peter Hansen

A.M. Kuchling wrote:

> On Fri, 21 Jan 2005 18:54:50 +0100,
> Fredrik Lundh <fr*****@pythonware.com> wrote:
>> judging from http://yaml.org/spec/current.html (750k), the YAML designers are
>> clearly insane. that's the most absurd software specification I've ever seen. they
>> need help, not users.
>
> IMHO that's a bit extreme. Specifications are written to be detailed, so
> consequently they're torture to read. Seen the ReStructured Text spec
> lately?
>
> [...] But YAML ... the format is now ... not really simpler than
> XML and in some ways weaker (e.g. only two encodings supported, more
> complicated escaping rules).

As I recall, one of the key original goals for XML was that the
parsers be relatively easy to write (relative to SGML).

Judging by that YAML spec, I can imagine that a YAML parser could
well be much more difficult to write than an XML parser would be.

Anyone have personal experience with this?

(Yes, I know people don't write parsers as often as they use
them, and that's probably some of the justification behind YAML,
but looking at that YAML spec, I find it hard to imagine I could
ever remember enough of it to write a YAML file by hand, and
yet I can and do write XML files by hand often.)

-Peter

Jul 18 '05 #15

Fredrik Lundh

Reinhold Birkenfeld wrote:

Agreed. If you just want to use it, you don't need the spec anyway.

but the guy who wrote the parser you're using had to read it, and understand it.
judging from the number of crash reports you see in this thread, chances are that
he didn't.

</F>

Jul 18 '05 #16

Steve Holden

Bengt Richter wrote:

> On Fri, 21 Jan 2005 12:04:10 -0600, "A.M. Kuchling" <am*@amk.ca> wrote:
>
>> On Fri, 21 Jan 2005 18:30:47 +0100,
>> rm <rm@rm.net> wrote:
>>> Nowadays, people are trying to create binary XML, XML databases,
>>> graphics in XML (btw, I'm quite impressed by SVG), you have XSLT, you
>>> have XSL-FO, ... .
>>
>> Which is an argument in favor of XML -- it's where the activity is, so it's
>> quite likely you'll encounter the need to know XML. Few projects use YAML,
>> so the chance of having to know its syntactic details is small.
>
> <rant>
> I thought XML was a good idea, but IMO requiring quotes around
> even integer attribute values was an unfortunate decision. I don't buy
> their rationale of keeping parsing simple -- as if extracting a string
> with no embedded space from between an equal sign and terminating white
> space were that much harder than extracting the same delimited by double quotes.

It isn't that much harder, but if there are two ways to do the same
thing then effectively one of them has to become a special case, thereby
complicating the code that has to handle it (in this case the parser).

"There should be one (and preferably only one) ..." should be a familiar
mantra around here :-)

> The result is cluttering SVG with needless cruft around numerical graphics parameters.
> </rant>

It seems to me the misunderstanding here is that XML was ever intended
to be generated directly by typing in a text editor. It was rather
intended (unless I'm mistaken) as a process-to-process data interchange
metalanguage that would be *human_readable*.

Tools that *create* XML are perfectly at liberty not to require quotes
around integer values.

> OTOH, I think the HTML XML spec is very readable, and nicely designed.
> At least the version 1.0 spec I snagged from W3C a long time ago.
> ... I see the third edition at http://www.w3.org/TR/REC-xml/ is differently styled,
> (I guess new style sheets) but still pretty readable (glancing at it now).
>
> Regards,
> Bengt Richter

regards
Steve
--
Steve Holden http://www.holdenweb.com/
Python Web Programming http://pydish.holdenweb.com/
Holden Web LLC +1 703 861 4237 +1 800 494 3119

Jul 18 '05 #17

Istvan Albert

Paul Rubin wrote:

I love this old rant about XML:

http://groups-beta.google.com/group/...30c508201627ee

This is my favorite:

http://weblog.burningbird.net/archiv...-the-languages

"I’m considered the savior, the ultimate solution, the final word.
Odes are written to me, flowers strewn at my feet, virgins sacrificed at
my altar. Programmers speak my name with awe. Companies insist on using
me in all their projects, though they’re not sure why. And whenever a
problem occurs, someone somewhere says, “Let’s use XML”, and miracles
occur and my very name has become a talisman against evil. And yet, all
I am is a simple little markup, from humble origins.
It’s a burden, being XML."

Jul 18 '05 #18

Istvan Albert

rm wrote:

http://www.theinquirer.net/?article=20868 :-)

There's a lot of nonsense out there propagated by people who do not
understand XML. You can't possibly blame that on XML...

For me XSLT transformations are the main reason for using XML.
If I have an XML document I can turn it into other
formats with a few lines of code. Most importantly these
are much safer to run than a program.

I think of an XML document as a "mini-database" where one
can easily and efficiently access content via XPath. So there
is a lot more to XML than just markup and that's
why YAML vs XML comparisons make very little sense.
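
To make the "mini-database" idea concrete, here is a small sketch using `xml.etree.ElementTree` from today's standard library, which supports only a limited XPath subset. The `<inventory>` document is invented for illustration, and whether the post had ElementTree or a full XPath engine in mind is not stated.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, made up for this example.
doc = ET.fromstring("""
<inventory>
  <item kind="fruit"><name>apple</name></item>
  <item kind="fruit"><name>pear</name></item>
  <item kind="tool"><name>hammer</name></item>
</inventory>
""")

# Path query: every <name> under every <item>.
names = [n.text for n in doc.findall("./item/name")]

# Attribute predicate: only the items whose kind is "fruit".
fruit = [i.findtext("name") for i in doc.findall("./item[@kind='fruit']")]

print(names)
print(fruit)
```

Queries like these select content by structure and attribute without any hand-written traversal code.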

Istvan.

Jul 18 '05 #19

Sion Arrowsmith

Paul Rubin <http://ph****@NOSPAM.invalid> wrote:

YAML looks to me to be completely insane, even compared to Python
lists. I think it would be great if the Python library exposed an
interface for parsing constant list and dict expressions, e.g.:
[1, 2, 'Joe Smith', 8237972883334L, # comment
{'Favorite fruits': ['apple', 'banana', 'pear']}, # another comment
'xyzzy', [3, 5, [3.14159, 2.71828, []]]]
[ ... ]
Note that all the values in the above have to be constant literals.
Don't suggest using eval. That would be a huge security hole.
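
For readers on later Python versions: the standard library has since gained `ast.literal_eval`, which parses exactly this kind of constant expression without running code. A minimal sketch in Python 3 follows; the `L` long-integer suffix from the quoted example is dropped because Python 3 no longer accepts it.

```python
import ast

# The literal from the post, minus the Python 2 long suffix "L".
source = """[1, 2, 'Joe Smith', 8237972883334,  # comment
 {'Favorite fruits': ['apple', 'banana', 'pear']},  # another comment
 'xyzzy', [3, 5, [3.14159, 2.71828, []]]]"""

data = ast.literal_eval(source)
print(data[2])
print(data[4]["Favorite fruits"])

# Anything that is not a plain literal is rejected with ValueError.
try:
    ast.literal_eval("[1, 2, JoeSmith]")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```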

I'm probably not thinking deviously enough here, but how are you
going to exploit an eval() which has very tightly controlled
globals and locals (e.g. eval(x, {"__builtins__": None}, {}))?

--
\S -- si***@chiark.greenend.org.uk -- http://www.chaos.org.uk/~sion/
___ | "Frankly I have no feelings towards penguins one way or the other"
\X/ | -- Arthur C. Clarke
her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump

Jul 18 '05 #20

Doug Holton

rm wrote:

Doug Holton wrote:
rm wrote:
this implementation of their idea. But I'd love to see a generic,
pythonic data format.

That's a good idea. But really Python is already close to that. A
lot of times it is easier to just write out a python dictionary than
using a DB or XML or whatever. Python is already close to YAML in
some ways.

true, it's easy enough to separate the data from the functionality in
python by putting the data in a dictionary/list/tuple, but it stays
source code.

Check out JSON, an alternative to XML for data interchange. It is
basically just python dictionaries and lists:
http://www.crockford.com/JSON/example.html

I think I would like this better than YAML or XML, and it looks like it
already parses as valid Python code, except for the /* */ multiline
comments (which boo supports).
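
A small sketch of the JSON route, using the `json` module that later Python versions ship in the standard library (it did not exist when this was posted). The record below is invented for illustration. It also shows one place where JSON is not valid Python: `true`, `false` and `null` have to be mapped to `True`, `False` and `None`.

```python
import json

# Hypothetical record, made up for this example.
text = '{"name": "Joe Smith", "fruits": ["apple", "banana", "pear"], "active": true, "note": null}'

# Parsing yields plain dicts, lists, strings, numbers, booleans and None.
record = json.loads(text)
print(record["fruits"])
print(record["active"], record["note"])

# Serializing goes the other way; sort_keys makes the output stable.
round_trip = json.dumps(record, sort_keys=True)
print(round_trip)
```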

It was mentioned in a story about JSON-RPC-Java:
http://developers.slashdot.org/artic...5/01/24/125236

Jul 18 '05 #21

Fredrik Lundh

Sion Arrowsmith wrote:

I'm probably not thinking deviously enough here, but how are you
going to exploit an eval() which has very tightly controlled
globals and locals (eg. eval(x, {"__builtins__": None}, {}) ?

try this:

eval("'*'*1000000*2*2*2*2*2*2*2*2*2")

(for more on eval and builtins, see the "Evaluating Python expressions"
section here: http://effbot.org/librarybook/builtin.htm )
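
To spell out why the restricted namespace does not help here, a sketch in Python 3 that deliberately avoids running the large expression. The name `restricted_globals` is my own, and `ast.literal_eval` is a later standard-library addition, not something available when this was posted.

```python
import ast

# The restricted namespace from the question: no builtins reachable by name.
restricted_globals = {"__builtins__": None}

# Operators on literals need no builtins, so a small version still evaluates.
small = eval("'*' * 3 * 2", restricted_globals, {})
print(small)

# Length of the string the posted expression would try to build,
# computed here instead of actually evaluating it.
size_in_chars = 1000000 * 2 ** 9
print(size_in_chars)

# ast.literal_eval refuses the expression because "*" is not a literal.
try:
    ast.literal_eval("'*'*1000000*2*2*2*2*2*2*2*2*2")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Hiding builtins limits which names an expression can reach, not how much memory or time it can consume.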

</F>

Jul 18 '05 #22

Peter Hansen

Sion Arrowsmith wrote:

Paul Rubin <http://ph****@NOSPAM.invalid> wrote:
YAML looks to me to be completely insane, even compared to Python
lists. I think it would be great if the Python library exposed an
interface for parsing constant list and dict expressions, e.g.:
[1, 2, 'Joe Smith', 8237972883334L, # comment
{'Favorite fruits': ['apple', 'banana', 'pear']}, # another comment
'xyzzy', [3, 5, [3.14159, 2.71828, []]]]
[ ... ]
Note that all the values in the above have to be constant literals.
Don't suggest using eval. That would be a huge security hole.

I'm probably not thinking deviously enough here, but how are you
going to exploit an eval() which has very tightly controlled
globals and locals (eg. eval(x, {"__builtins__": None}, {}) ?

See, for example, Alex Martelli's post in an old thread from 2001:
http://groups.google.ca/groups?selm=...s2.newsguy.com

-Peter

Jul 18 '05 #23

Michael Spencer

Fredrik Lundh wrote:

Sion Arrowsmith wrote:
I'm probably not thinking deviously enough here, but how are you
going to exploit an eval() which has very tightly controlled
globals and locals (eg. eval(x, {"__builtins__": None}, {}) ?

try this:

eval("'*'*1000000*2*2*2*2*2*2*2*2*2")

I updated the safe eval recipe I posted yesterday to add the option of reporting
unsafe source, rather than silently ignoring it. Is this completely safe? I'm
interested in feedback.

Michael

Some source to try:

>>> good_source = """[1, 2, 'Joe Smith', 8237972883334L, # comment
... {'Favorite fruits': ['apple', 'banana', 'pear']}, # another comment
... 'xyzzy', [3, 5, [3.14159, 2.71828, []]]]"""

Unquoted string literal:

>>> bad_source = """[1, 2, JoeSmith, 8237972883334L, # comment
... {'Favorite fruits': ['apple', 'banana', 'pear']}, # another comment
... 'xyzzy', [3, 5, [3.14159, 2.71828, []]]]"""

Non-constant expression:

>>> effbot = "'*'*1000000*2*2*2*2*2*2*2*2*2"

>>> safe_eval(good_source)
[1, 2, 'Joe Smith', 8237972883334L, {'Favorite fruits': ['apple', 'banana',
'pear']}, 'xyzzy', [3, 5, [3.1415899999999999, 2.71828, []]]]
>>> assert _ == eval(good_source)

>>> safe_eval(bad_source)
Traceback (most recent call last):
[...]
Unsafe_Source_Error: Line 1. Strings must be quoted: JoeSmith
>>> safe_eval(bad_source, fail_on_error = False)
[1, 2, None, 8237972883334L, {'Favorite fruits': ['apple', 'banana', 'pear']},
'xyzzy', [3, 5, [3.1415899999999999, 2.71828, []]]]

>>> safe_eval(effbot)
Traceback (most recent call last):
[...]
Unsafe_Source_Error: Line 1. Unsupported source construct: compiler.ast.Mul
>>> safe_eval(effbot, fail_on_error = False)
'*'

Source:

import compiler  # Python 2 only; this package was removed in Python 3

class Unsafe_Source_Error(Exception):
    def __init__(self,error,descr = None,node = None):
        self.error = error
        self.descr = descr
        self.node = node
        self.lineno = getattr(node,"lineno",None)

    def __repr__(self):
        return "Line %d. %s: %s" % (self.lineno, self.error, self.descr)
    __str__ = __repr__

class AbstractVisitor(object):
    def __init__(self):
        self._cache = {} # dispatch table

    def visit(self, node,**kw):
        # Look up visit<NodeClass>, falling back to default()
        cls = node.__class__
        meth = self._cache.setdefault(cls,
            getattr(self,'visit'+cls.__name__,self.default))
        return meth(node, **kw)

    def default(self, node, **kw):
        # Returns the result for the first child only (None if no children)
        for child in node.getChildNodes():
            return self.visit(child, **kw)
    visitExpression = default

class SafeEval(AbstractVisitor):

    def visitConst(self, node, **kw):
        return node.value

    def visitDict(self,node,**kw):
        return dict([(self.visit(k),self.visit(v)) for k,v in node.items])

    def visitTuple(self,node, **kw):
        return tuple(self.visit(i) for i in node.nodes)

    def visitList(self,node, **kw):
        return [self.visit(i) for i in node.nodes]

class SafeEvalWithErrors(SafeEval):

    def default(self, node, **kw):
        raise Unsafe_Source_Error("Unsupported source construct",
                                  node.__class__,node)

    def visitName(self,node, **kw):
        raise Unsafe_Source_Error("Strings must be quoted",
                                  node.name, node)

    # Add more specific errors if desired

def safe_eval(source, fail_on_error = True):
    # Pick the strict or the lenient walker
    walker = fail_on_error and SafeEvalWithErrors() or SafeEval()
    try:
        ast = compiler.parse(source,"eval")
    except SyntaxError, err:
        raise
    try:
        return walker.visit(ast)
    except Unsafe_Source_Error, err:
        raise
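
For readers on Python 3, where the `compiler` package no longer exists, the same idea can be sketched with the standard `ast` module. This is a rough port under my own assumptions, not the recipe above: the name `UnsafeSourceError` is mine, only the strict (fail on error) mode is kept, `ast.Constant` requires Python 3.8 or later, and negative numbers are rejected because `-1` parses as a unary operation.

```python
import ast

class UnsafeSourceError(ValueError):
    """Raised when the source contains anything other than plain literals."""

class SafeEval(ast.NodeVisitor):
    def visit_Expression(self, node):
        return self.visit(node.body)

    def visit_Constant(self, node):
        # Numbers, strings, bytes, True, False and None.
        return node.value

    def visit_List(self, node):
        return [self.visit(item) for item in node.elts]

    def visit_Tuple(self, node):
        return tuple(self.visit(item) for item in node.elts)

    def visit_Dict(self, node):
        # A None key marks "**mapping" unpacking, which is not a literal.
        if any(key is None for key in node.keys):
            raise UnsafeSourceError("Unsupported source construct: dict unpacking")
        return {self.visit(k): self.visit(v)
                for k, v in zip(node.keys, node.values)}

    def visit_Name(self, node):
        raise UnsafeSourceError(
            "Line %d. Strings must be quoted: %s" % (node.lineno, node.id))

    def generic_visit(self, node):
        # Every node type without an explicit handler is refused.
        raise UnsafeSourceError(
            "Line %d. Unsupported source construct: %s"
            % (getattr(node, "lineno", 0), type(node).__name__))

def safe_eval(source):
    tree = ast.parse(source, mode="eval")
    return SafeEval().visit(tree)

print(safe_eval("[1, 2, 'Joe Smith', {'Favorite fruits': ('apple', 'pear')}]"))
```

Overriding `generic_visit` to raise keeps this an allow-list: a node type is accepted only if it has its own handler.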

Jul 18 '05 #24

Sion Arrowsmith

Fredrik Lundh <fr*****@pythonware.com> wrote:

Sion Arrowsmith wrote:
I'm probably not thinking deviously enough here, but how are you
going to exploit an eval() which has very tightly controlled
globals and locals (eg. eval(x, {"__builtins__": None}, {}) ?

try this:

eval("'*'*1000000*2*2*2*2*2*2*2*2*2")

No thanks.

I guess my problem is a tendency to view security issues from the
point of view of access to data rather than access to processing.

--
\S -- si***@chiark.greenend.org.uk -- http://www.chaos.org.uk/~sion/
___ | "Frankly I have no feelings towards penguins one way or the other"
\X/ | -- Arthur C. Clarke
her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump

Jul 18 '05 #25

Aahz

In article <eE*******************@news.indigo.ie>,
Alan Kennedy <al****@hotmail.com> wrote:

> However, I'm torn on whether to use ReST for textual content. On the one
> hand, it looks pretty comprehensive and solidly implemented. But OTOH,
> I'm concerned about complexity: I don't want to commit to ReST if it's
> going to become a lot of hard work or highly-inefficient when I really
> need to use it "in anger".
>
> From what I've seen, pretty much every textual markup targeted for web
> content, e.g. wiki markup, seems to have grown/evolved organically,
> meaning that it is either underpowered or overpowered, full of special
> cases, doesn't have a meaningful object model, etc.

My perception is that reST is a lot like Python itself: it's easy to hit
the ground running, particularly if you restrict yourself to a specific
subset of features. It does give you a fair amount of power, and some
things are difficult or impossible.

Note that reST was/is *not* specifically aimed at web content. Several
people have used it for writing books; some people are using it instead
of PowerPoint.

> So, I'm hoping that the learned folks here might be able to give me some
> pointers to a markup language that has the following characteristics
>
> 1. Is straightforward for non-technical users to use, i.e. can be
> (mostly) explained in a two to three page document which is
> comprehensible to anyone who has ever used a simple word-processor or
> text-editor.
>
> 2. Allows a wide variety of content semantics to be represented, e.g.
> headings, footnotes, sub/superscript, links, etc, etc.

These two criteria seem to be in opposition. I certainly wouldn't
expect a three-page document to explain all these features, not for
non-technical users. reST fits both these criteria, but only for a
selected subset of features.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing." --Alan Perlis

Jul 18 '05 #26

Alan Kennedy

[Alan Kennedy]

> However, I'm torn on whether to use ReST for textual content. On the one
> hand, it looks pretty comprehensive and solidly implemented. But OTOH,
> I'm concerned about complexity: I don't want to commit to ReST if it's
> going to become a lot of hard work or highly-inefficient when I really
> need to use it "in anger".
>
> From what I've seen, pretty much every textual markup targeted for web
> content, e.g. wiki markup, seems to have grown/evolved organically,
> meaning that it is either underpowered or overpowered, full of special
> cases, doesn't have a meaningful object model, etc.

[Aahz]
> My perception is that reST is a lot like Python itself: it's easy to hit
> the ground running, particularly if you restrict yourself to a specific
> subset of features. It does give you a fair amount of power, and some
> things are difficult or impossible.
>
> Note that reST was/is *not* specifically aimed at web content. Several
> people have used it for writing books; some people are using it instead
> of PowerPoint.

Thanks, Aahz, that's a key point that I'll continue on below.

[Alan Kennedy]
> So, I'm hoping that the learned folks here might be able to give me some
> pointers to a markup language that has the following characteristics
>
> 1. Is straightforward for non-technical users to use, i.e. can be
> (mostly) explained in a two to three page document which is
> comprehensible to anyone who has ever used a simple word-processor or
> text-editor.
>
> 2. Allows a wide variety of content semantics to be represented, e.g.
> headings, footnotes, sub/superscript, links, etc, etc.

[Aahz]
> These two criteria seem to be in opposition. I certainly wouldn't
> expect a three-page document to explain all these features, not for
> non-technical users. reST fits both these criteria, but only for a
> selected subset of features.

The point is well made.

When I wrote my requirements, I did have a specific limited feature set
in mind: basically a print-oriented set of features with which anyone
who reads books would be familiar. I'm trying to capture scientific
abstracts, of the sort that you can see linked off this page.

http://www.paratuberculosis.org/proc7/

But I'm basically only interested in representation of the original
input text. I'll be capturing a lot of metadata as well, but most of
that will be captured outside the markup language, through a series of
form inputs which ask specific metadata questions. So, for example, the
relationships between authors and institutions, seen on the next page,
will not be recorded in the markup.

http://www.paratuberculosis.org/proc7/abst5_p2.htm

I think that is where a lot of markup languages fall down, in that they
end up trying to develop a sophisticated metadata model that can capture
that kind of information, and re-engineering the markup to support it.
This co-evolution of the markup and model can go horribly awry, if the
designers are inexperienced or don't know where they're headed.

Since ReST seems to do this stuff fairly well, I think I'll take a
closer look at it. From what I've seen of it, e.g. PEPs, python module
documentation (SQLObject, etc), it seems to be reasonably unobtrusive to
the author.

regards,

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan

Jul 18 '05 #27

Aahz

In article <UF*******************@news.indigo.ie>,
Alan Kennedy <al****@hotmail.com> wrote:

I think that is where a lot of markup languages fall down, in that they
end up trying to develop a sophisticated metadata model that can capture
that kind of information, and re-engineering the markup to support it.
This co-evolution of the markup and model can go horribly awry, if the
designers are inexperienced or don't know where they're headed.

Ayup. Fortunately, David Goodger (the primary architect of reST) has
plenty of experience in this area; reST was written as a reaction to the,
er, organic nature of some other experiments. Nobody involved with the
reST project claims it's perfect, but most of us do think we've made
good tradeoffs (just like Python ;-). Like Guido, David's also pretty
comfortable saying "no"....

One more thing: reST does make it fairly easy to write "self-documenting"
forms -- give your users boilerplate plus some simple instructions, and
they should find it very easy to write their content. (I'm particularly
referring to using bibliographic fields in reST -- that's easier to do
with boilerplate than explain how to do correctly.)

The one thing I forgot to mention in my first post is that the biggest
downside to reST for non-technical users is that it's possible to write
incorrect documents. With a GUI interface, you can write butt-ugly junk,
but it'll never generate a syntax error. However, given your
requirements, it sounds more like you're using "non-technical" to mean
"not computer experts" -- those people are probably more likely to feel
comfortable with getting error messages.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing." --Alan Perlis

Jul 18 '05 #28

Aahz

In article <ma***************************************@python.org>,
Fredrik Lundh <fr*****@pythonware.com> wrote:

I've read many specs; YAML (both the spec and the format) is easily
among the worst ten-or-so specs I've ever seen.

ReST and YAML share the same deep flaw: both formats are marketed as
simple, readable formats, and at a first glance, they look simple and
readable -- but in reality, they're messy as hell, and chances are
that the thing you're looking at doesn't really mean what you think it
means (unless you're the official ReST/YAML parser implementation).
experienced designers know how to avoid that; the ReST/YAML designers
don't even understand why they should.

While I can see how you'd get that impression of reST, it's not true:
like Python, reST is intended to be simpl*er* and readable, but not
simple. The joy of reST is that I can concentrate on writing instead of
formatting, just as I do when writing Usenet posts. ;-) Even after
using reST for a long time, I'm still constantly looking up features that
I use rarely (such as correct formatting of URLs).

But reST is great because it's relatively unobtrusive. Those of us
who've used reST to document code for a long time have gotten into the
habit of using some reST-isms even when not writing reST: have you
noticed the number of Pythonistas who use constructs like ``foo()``?
Even if you didn't know it was from reST, the meaning is obvious.

As you say, reST can/does get messy when you're doing complicated things,
but it stays more readable than XML/DocBook. For the most part, I think
I'd also have to disagree with your assertion that reST formatting
doesn't do what you think it does. In cases where your statement is
correct, it's either labeled as an explicit design decision (to prevent
other ugliness) or it's a bug.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing." --Alan Perlis

Jul 18 '05 #29

Tim Parkin

On Sun, 2005-01-23 at 13:41 +0100, Fredrik Lundh wrote:

Alan Kennedy wrote:
If I can't find such a markup language, then I might instead end up using a WYSIWYG editing
component that gives the user a GUI and generates (x)html.

htmlArea: http://www.htmlarea.com/
Editlet: http://www.editlet.com/

But I'd prefer a markup solution.

some of these are amazingly usable. have you asked your users what they
prefer? (or maybe you are your user? ;-)

Most users prefer to write documents in Word and then paste them into
textareas. Not surprisingly, this means no semantic content, little chance of
restyling, horrible encoding problems and far too long spent on the
phone trying to explain why it's not a good idea.

Giving users a wysiwyg textarea creates the problem that users start to
spend time trying to create a 'styled' document, inevitably sacrificing
semantics (many is the user that has applied a header style to make
things bold or a quote style to indent a paragraph). Using text based
layouts reinforces the perception that you aren't creating a styled
document and that the semantic structure is important.

People who have used non-wysiwyg editors have found that their initial
reticence has been quickly overtaken by their joy at not having to fight
with 'style' and the reassurance that their content is now 'redesign
proof'.

Tim Parkin
http://www.pollenation.net

Jul 18 '05 #30

richard

Aahz wrote:

> While I can see how you'd get that impression of reST, it's not true:
> like Python, reST is intended to be simpl*er* and readable, but not
> simple. The joy of reST is that I can concentrate on writing instead of
> formatting, just as I do when writing Usenet posts. ;-) Even after
> using reST for a long time, I'm still constantly looking up features that
> I use rarely (such as correct formatting of URLs).
>
> But reST is great because it's relatively unobtrusive. Those of us
> who've used reST to document code for a long time have gotten into the
> habit of using some reST-isms even when not writing reST: have you
> noticed the number of Pythonistas who use constructs like ``foo()``?
> Even if you didn't know it was from reST, the meaning is obvious.

And this is the core of it for me too (if you want simple, use Word).
Roundup's documentation__ (in particular the `Customisation Doc`__ which is
now huge) is entirely written in reST. It uses a fraction of the total pool
of reST constructs, but I believe the end result is perfectly legible. I
also tend to write in reST style when composing emails (a biggie for me is
starting examples with "::").

Anyway, some sample Roundup docs:

__ http://roundup.sourceforge.net/doc-0.8/index.html
__ http://roundup.sourceforge.net/doc-0.8/customizing.html

As you say, reST can/does get messy when you're doing complicated things,
but it stays more readable than XML/DocBook.

Indeed - I chose to use reST for Roundup's documentation for two very
important reasons:

1. lower the barrier for me to write the docs - and I am *really* happy with
how current the Roundup docs stay, because I don't feel like actually
writing them is a pain, as opposed to any sort of Markup Language format,
and
2. the first contributor of docs suggested it, and I've had several
contributors since. It's easier for contributors to write for Roundup's
documentation - even if they don't get the reST markup correct, it's
trivial to fix. This is less the case with a Markup Language.

Richard

Jul 18 '05 #31
