Tabs versus Spaces in Source Code - Page 3

Xah Lee

Tabs versus Spaces in Source Code

Xah Lee, 2006-05-13

In coding a computer program, there is often a choice between tabs and
spaces for code indentation. There is a large amount of confusion about
which is better. It has become what's known as a “religious war”,
a heated fight over trivia. In this essay, I would like to explain the
situation behind it, and which is proper.

Simply put, tabs are proper, and spaces are improper. Why? This may seem
ridiculously simple given the de facto ball of confusion: the semantics
of tabs is what indenting is about, while using spaces to align code
is a hack.

Now, tech geekers may object to this simple conclusion because they itch
to drivel about different editors and so on. The alleged problems
created by tabs, as seen by industry coders, are caused by two
things: (1) tech geekers' sloppiness and lack of critical thinking,
which lead them to not understand the semantic purposes of the tab and
space characters. (2) Due to the first reason, they have created and
propagated a massive non-understanding and misuse, to the degree that
many tools (e.g. vi) do not deal with tabs well and using spaces to
align code has become widely practiced, so that in the end spaces seem
to be actually better by popularity and seeming simplicity.

In short, this is a phenomenon of misunderstanding begetting a snowball
of misunderstanding, such that it created a cultural milieu that embraces
this malpractice and kicks out what is true or proper. Situations like this
happen a lot in unix. One non-unix example is the file name
suffix known as the “extension”, where the code for the file's type became
part of the file name (e.g. “.txt”, “.html”, “.jpg”).
Another well-known example is HTML practice in the industry, where
badly designed tags born of corporations' competitive greed, and stupid
coding and misunderstanding by coders and their tools, are so
widespread that they force the correct way to the side through the
eventual standardization caused by the sheer quantity of improper but set
practice.

Now, tech geekers may still object that using tabs requires
editors to set their positions, and plain files don't carry that
information. This is a good question, and the solution is to advance
the sciences such that your source code in some way embeds such
information. This would be progress. However, this is never thought of
because the “unix philosophies” already conditioned people to hack
and be shallow. In this case, many will simply use the character
intended to separate words for the purpose of indentation or alignment,
and spread the practice with militant drivel.

Now, given the already messed up situation of the tabs vs spaces by the
unixers and unix brain-washing of the coders in the industry... Which
should we use today? I do not have a good proposition, other than to just
use whichever works for you, but put more critical thinking into
things to prevent mishaps like this.

Tabs vs Spaces can be thought of as parameters vs hard-coded values, or
HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or
semantic vs format. In these, it is always easy to convert from the
former to the latter, but near impossible from the latter to the
former. And, that is because the former encodes information that is
lost in the latter. If we look at the issue of tabs vs spaces, indeed,
it is easy to convert tabs to spaces in source code, but more
difficult to convert from spaces to tabs, because tabs as indentation
actually contain the semantic information about indentation. With
spaces, this critical information is lost in space.
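
The asymmetry can be sketched in a few lines of Python. This is only an illustration: `tabs_to_spaces` and `spaces_to_tabs` are hypothetical helper names, and the reverse conversion has to assume an indent width that the file itself does not record.

```python
def tabs_to_spaces(line: str, width: int) -> str:
    """Expand tabs; well defined as soon as a display width is chosen."""
    return line.expandtabs(width)


def spaces_to_tabs(line: str, width: int) -> str:
    """Turn leading spaces into tabs, ASSUMING `width` spaces per level.

    The width is a guess supplied by the caller; a different guess
    yields a different number of indentation levels.
    """
    body = line.lstrip(" ")
    levels, leftover = divmod(len(line) - len(body), width)
    return "\t" * levels + " " * leftover + body


print(repr(tabs_to_spaces("\t\tx = 1", 4)))   # '        x = 1'

spaced = " " * 12 + "x = 1"
print(repr(spaces_to_tabs(spaced, 4)))        # '\t\t\tx = 1'
print(repr(spaces_to_tabs(spaced, 6)))        # '\t\tx = 1'
```

The same twelve spaces come back as three levels or two, depending on the width assumed.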

This issue is intimately related to another issue in source code:
soft-wrapped lines versus physical, hard-wrapped lines by EOL (end of
line character). This issue has far more consequences than tabs vs
spaces, and the unixers' unthinking has done far-reaching damage in
the computing industry. Unix's EOL way of thinking has
created languages based on EOL (just about ALL languages except the
Lisp family and Mathematica), tools based on EOL (cvs, diff, grep,
and basically every tool in unix), and thoughts based on EOL (software
value estimation by counting EOL, the hard-coded email quoting system by
“>” prefix, and silent line-truncations in many unix tools), such
that any progress or development towards an “algorithmic code unit”
concept or language syntax is suppressed. I have not written a full
account of this issue, but I've touched on it in this essay: “The Harm
of hard-wrapping Lines”, at
http://xahlee.org/UnixResource_dir/writ/hard-wrap.html
----
This post is archived at:
http://xahlee.org/UnixResource_dir/w...vs_spaces.html

Xah
xa*@xahlee.org
∑ http://xahlee.org/

May 15 '06

Edward Elliott

William Studenmund wrote:

The problem is that tabs take you to the next tab stop, they don't
expand to a fixed number of spaces.

Got it. You're talking about using tabs other than for initial line
indentation on a source file. Yes, then tab expansion is not perfect.
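
A small Python sketch of the tab-stop behaviour being discussed. `column_after_tab` is a hypothetical helper, and it assumes the prefix itself contains no tabs or newlines.

```python
def column_after_tab(prefix: str, tabsize: int = 8) -> int:
    """Column at which text lands when a tab follows `prefix`."""
    return len((prefix + "\t").expandtabs(tabsize))


# A tab advances to the next tab stop; it is not a fixed run of spaces.
for prefix in ["", "a", "abc", "abcdefg", "abcdefgh"]:
    print(repr(prefix), "->", column_after_tab(prefix))
```

At the start of a line the result is always the tab size, which is why the two views agree for pure leading indentation and diverge for alignment after other text.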

--
Edward Elliott
UC Berkeley School of Law (Boalt Hall)
complangpython at eddeye dot net

May 18 '06 #101

Edward Elliott

Terry Hancock wrote:

Now, of course, the data I provide is nasty, mean, poorly-formatted
data, abhorable by space-zealots and tab-libertines alike (;-)), but the
point is, unless you have set up your editor to syntax color spaces
and tabs differently, you won't see the difference in the original
editor.

Sure, mixed tabs and spaces were not part of my use case.

--
Edward Elliott
UC Berkeley School of Law (Boalt Hall)
complangpython at eddeye dot net

May 18 '06 #102

Terry Hancock

Edmond Dantes wrote:

The real issue is, of course, that ASCII is showing its age and we should
probably supplant it with something better. But I know that will never fly,
given the torrents of code, configuration files, and everything else in
ASCII. Even Unicode couldn't put a dent in it, despite the obvious growing
global development efforts. Not sure how many compilers would be able to
handle Unicode source anyway. I suspect the large majority of them would
choke big time.

I think that was the old conventional wisdom, but it's not so obvious
anymore. UTF-8 is a pretty cool standard. gVim handles it just
fine, Python source allows UTF-8 within string literals, even if it
doesn't like it in identifiers, and IIRC, Unicode is the official standard
for Java files. It continues not to be used so much, but a lot of the
capacity is there.

Also, the 'config files in ASCII' thing is simply not a problem -- ASCII
*is* a subset of UTF-8, so an ASCII config file is already a UTF-8
config file.
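
The subset relationship can be checked directly in Python. A minimal sketch; `decodes_as` is a hypothetical helper and the config text is made up for the example.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if `data` is valid in the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


config = "tabsize = 8\n"   # hypothetical ASCII-only config text

# The ASCII bytes and the UTF-8 bytes of ASCII text are identical.
print(config.encode("ascii") == config.encode("utf-8"))   # True
print(decodes_as(config.encode("ascii"), "utf-8"))        # True

# The converse does not hold: non-ASCII UTF-8 is not valid ASCII.
print(decodes_as("é".encode("utf-8"), "ascii"))           # False
```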

Personally, I don't think ASCII is nearly as entrenched as you suggest.
I wouldn't be surprised if Unicode/UTF-8 has fully supplanted it inside
of 10 years.

Cheers,
Terry

--
Terry Hancock (ha*****@AnansiSpaceworks.com)
Anansi Spaceworks http://www.AnansiSpaceworks.com

May 18 '06 #103

Dave Hansen

On 17 May 2006 16:13:54 -0700 in comp.lang.python, "achates"
<ay****@cantab.net> wrote:

> Carl J. Van Arsdall wrote:
>> The converse can also be said, "it's difficult to make sure everyone
>> uses spaces and not tabs".
>>
>> I think we've just about beat this discussion to death... nice work
>> everyone!
>
> Yeah - we've got to the repeating ourselves stage.
>
> But that's the problem with this issue: it's really hard to get the
> space-indenters to actually think about it and to address what is being
> said. Every time it comes up, there's always a few people trying to

Look in the mirror. There are none so blind...

> explain why tabs are a good idea, facing a whole raft of others

The problem is that TABs are a _bad_ idea.

> spouting stuff like:
> 'mixing spaces and tabs is bad so use spaces only'

Mixing TABs and spaces is bad because it means using TABs. ;-)

> 'tabs are x spaces and I like to use y spaces'

I've not seen that argument. One of us needs to read closer.

Although I have seen the converse used to defend TABs: x spaces is x
spaces, and I like y spaces.

> 'tabs are bad end of story'

Works for me! ;-)

> and these non-arguments are repeated over and over within the same
> thread. At times it's like talking to a child - and not a bright one at
> that.

These "non-arguments" are your own straw men. Either that, or you
need to work on reading comprehension.

> Does it matter? Perhaps not if we can use tools which enable us to
> bridge the divide, like indent auto-detection in emacs and vim. I'm
> prepared to do that in cases where I have to work with an existing
> group of coders using spaces.

It matters because not every programmer is willing to put in the time
and effort required to learn how to use a sophisticated editor like emacs
or vim well. Or at all.

It matters because in industry you get programmers with a wide range
of skills, and you can't fire everyone who can't tell when there are
spaces in front of a tab character. Often these people have unique
and hard-to-find domain knowledge.

> But unfortunately the situation is worse than that: tab indentation
> needs to be actively defended. Most of the coding 'style guides' you'll

No, it needs to be stamped out. ;-)

> find (including Python's) advocate spaces only. There are plenty of

Hallelujah!

> people who would like tabs removed from the language as an acceptable
> indentation method - look at the responses to Guido's April Fools blog
> entry last year.

I would love to see the TAB character treated as a syntax error. I
have no illusions that's going to happen, though.

FWIW, I would be equally (well, almost, anyway) happy if Python said
that the _only_ place a TAB character could appear was at the
beginning of a line, and that the number of TAB characters _always_
indicated the indentation level (e.g., spaces do _not_ change
indentation level, and all the lines in a multi-line statement had to
be at the same indentation level). This would eliminate most of my
objections to TABs. I have no illusions this will happen either.
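
The stricter rule proposed here is easy to state as code. A minimal sketch of a checker for it, not anything Python actually implements; `tabs_only_at_start` and `indent_level` are hypothetical names.

```python
def tabs_only_at_start(line: str) -> bool:
    """True if every TAB in the line belongs to the leading run of TABs."""
    return "\t" not in line.lstrip("\t")


def indent_level(line: str) -> int:
    """Indentation level under the proposed rule: the count of leading TABs.

    Spaces after the leading TABs are ignored; they only align text.
    """
    return len(line) - len(line.lstrip("\t"))


for line in ["\t\tx = 1", "\t      y)", "x\t= 1", "  \tz = 2"]:
    print(repr(line), tabs_only_at_start(line), indent_level(line))
```

Under this rule the last two lines would be rejected: one has a TAB in the middle of the line, the other has a TAB hidden behind leading spaces.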

> Unlikely perhaps. I hope so. It's a cruel irony that Python's creator
> didn't appreciate the benefits that tab indentation would bring to his
> own language - the only major language in which indentation levels
> actually have semantic significance.

The problem with TAB characters is that they look just like the
equivalent number of space characters. This is, of course, their
major feature as well. The problem, especially with Python, is that
mistakes in the placement of TAB characters within a source file can
silently change the _meaning_ of the code.
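
A short Python illustration of that invisibility. `show_indent` is a hypothetical helper that marks up leading whitespace, in the same spirit as replacing tabs with a visible token.

```python
def show_indent(line: str) -> str:
    """Make leading whitespace visible: TAB -> '<tab>', space -> '.'."""
    body = line.lstrip(" \t")
    indent = line[: len(line) - len(body)]
    return indent.replace("\t", "<tab>").replace(" ", ".") + body


with_tab = "\tx = 1"
with_spaces = " " * 8 + "x = 1"

# At tab size 8 the two lines render identically...
print(with_tab.expandtabs(8) == with_spaces)   # True
# ...but they are different strings.
print(with_tab == with_spaces)                 # False
print(show_indent(with_tab))                   # <tab>x = 1
print(show_indent(with_spaces))                # ........x = 1
```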

TAB proponents seem to list one overriding advantage of using TAB
characters for indentation: "I can use my preferred indent level, and
everyone else can use theirs." I find this argument _very_ weak. I've
seen misuse of TABs break code. I've never seen an enforced
indentation level break a programmer.

Regards,
-=Dave

--
Change is inevitable, progress is not.

May 18 '06 #104

Edward Elliott

We've finally hit the meta-discussion point. Instead of talking about tabs
and spaces, we're talking about talking about tabs and spaces. Which
frankly is a much more interesting conversation anyway.

achates wrote:

Does it matter? Perhaps not if we can use tools which enable us to
bridge the divide, like indent auto-detection in emacs and vim. I'm
prepared to do that in cases where I have to work with an existing
group of coders using spaces.

If you ask me, which of course you didn't, indentation is just one small
part of the larger issue of code formatting. Unfortunately it's the only
one that allows some semblance of flexibility. Formatting like brace/paren
placement and inter-operator spacing greatly affect readability but are
hard-coded into the source. And none of this matters a whit to the
semantics of the code.

What really should happen is that every time an editor reads in source code,
the code is reformatted for display according to the user's settings. The
editor becomes a parser, breaking the code down into tokens and emitting it
in a personally preferred format. Comments are left untouched apart from
initial indentation. On output back to a file, the code can be either
written as-is (the next guy's editor will reformat it anyway) or put in
some standard form (for the poor shlubs who code with cat/notepad).
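
For Python source, the parse-and-re-emit idea can be sketched with the standard `ast` module (`ast.unparse` needs Python 3.9 or later). This is only an approximation of the proposal: `normalize` is a hypothetical name, the output style is whatever `ast.unparse` produces rather than a per-user setting, and comments are discarded instead of being left untouched.

```python
import ast


def normalize(source: str) -> str:
    """Parse Python source and emit it again in one canonical layout."""
    return ast.unparse(ast.parse(source))


# Two layouts of the same program: tab indent and cramped operators,
# versus eight-space indent and spaced operators.
mine = "x=1+2\nif x>2 :\n\tprint( x )\n"
yours = "x = 1 + 2\nif x > 2:\n        print(x)\n"

print(normalize(mine) == normalize(yours))   # True
print(normalize(mine))
```

Because both versions parse to the same tree, the stored layout stops mattering once every reader re-emits it.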

All this becomes completely transparent to the user, who sees every file he
edits in exactly the format he's accustomed to. It's similar to the
various pushes for syntactic code storage formats like abstract syntax
trees or <shudder> xml, but works with the existing infrastructure built
around processing plain text files. Meanwhile LISP has been storing code
in paren-based ASTs since the 50s.

vim and emacs can already do this today. It might not be perfect, but if
people spent half as much time perfecting this as arguing about tabs vs
spaces, we'd all be a lot better off (yes I'm guilty too).

It's a cruel irony that Python's creator
didn't appreciate the benefits that tab indentation would bring to his
own language - the only major language in which indentation levels
actually have semantic significance.

Fate is a cruel mistress. Or maybe just a heartless bitch. Either way,
watch your back.

--
Edward Elliott
UC Berkeley School of Law (Boalt Hall)
complangpython at eddeye dot net

May 18 '06 #105

Jorge Godoy

achates wrote:

Jorge Godoy wrote
Emacs guesses what's used in the file and allows me to use tabs all the
time, doing the correct thing...

That sounds like useful behaviour.

Maybe this is an area where modern editors might be able to save us
from ourselves. I'll admit I'm suspicious of relying on editor
functionality - I'm happier if I know I can also use the old-school
methods just in case.. Sometimes adding intelligence to an interface
can be a usability disaster if it makes wrong assumptions about what
you want. But if people are hell-bent on converting tabs to spaces,
maybe it's the best way to accommodate them.

If you don't want the functionality, simply disable it. This is why
configuration files and options exist...

--
Jorge Godoy <go***@ieee.org>

"Quidquid latine dictum sit, altum sonatur."
- Qualquer coisa dita em latim soa profundo.
- Anything said in Latin sounds smart.

May 18 '06 #106

ashesh

If I work on your project, I follow the coding and style standards you
specify. Likewise, if you work on my project you follow the established
standards. Fortunately for you, I am fairly liberal on such matters.

I like to see 4 spaces for indentation. If you use tabs, that's what I
will see, and you're very likely to have your code reformatted by the
automated build process, when the standard copyright header is pasted
and missing javadoc tags are generated as warnings.

I like the open brace to start on the line of the control keyword. I
can deal with the open brace being on the next line, at the same level
of indentation as the control keyword. I don't quite understand the
motivation behind the GNU style, where the brace itself is treated as a
half-indent, but I can live with it on *your* project.

Any whitespace or other style that isn't happy to be reformatted
automatically is an error anyway.

I'd be very laissez-faire about it except for the fact that code
repositories are much easier to manage if everything is formatted
before it goes in, or as a compromise, as a step at release tags.

Ashesh..

May 18 '06 #107

PoD

On Wed, 17 May 2006 21:37:14 +0800, Andy Sy wrote:

If tabs are easily misunderstood, then they are a MISfeature
and they need to be removed.
From the Zen of Python:

"Explicit is better than implicit..."
"In the face of ambiguity, refuse the temptation to guess..."
"Special cases aren't special enough to break the rules..."

Exactly.
How many levels of indentation does 12 spaces indicate?
It could be 1,2,3,4,6 or 12. If you say it's 3 then you are _implying_
that each level is represented by 4 spaces.

How many levels of indentation is 3 tabs? 3 levels in any code that you
will find in the wild.
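
The arithmetic behind that list, as a small Python sketch. `possible_levels` is a hypothetical helper and assumes every level uses the same number of spaces.

```python
def possible_levels(width: int) -> list:
    """Level counts consistent with `width` leading spaces, assuming a
    uniform number of spaces per level (each divisor is a candidate)."""
    return sorted(width // size for size in range(1, width + 1)
                  if width % size == 0)


print(possible_levels(12))    # [1, 2, 3, 4, 6, 12]
print("\t\t\t".count("\t"))   # 3
```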

May 18 '06 #108

Christophe

Carl J. Van Arsdall wrote:

glomde wrote:

But If you work in a team it is kind of hard to make sure that
everybody use tabs and not spaces. And it is not very easy to spot
either.

The converse can also be said, "it's difficult to make sure everyone
uses spaces and not tabs".

I think we've just about beat this discussion to death... nice work
everyone!

No, it's really easy: a simple pre-commit hook which will refuse any .py
file with the \t char in it and it's done ;)
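
A minimal sketch of the check such a hook could run, written in Python. How the hook receives the list of changed files depends on the version control system and is not shown; `has_tab`, `rejected_files` and `main` are hypothetical names.

```python
from pathlib import Path


def has_tab(data: bytes) -> bool:
    """True if the file content contains a TAB byte anywhere."""
    return b"\t" in data


def rejected_files(paths):
    """Return the .py paths whose content contains a TAB."""
    return [p for p in paths
            if p.endswith(".py") and has_tab(Path(p).read_bytes())]


def main(paths) -> int:
    """Report offending files; a non-zero result means refuse the commit."""
    bad = rejected_files(paths)
    for p in bad:
        print(f"{p}: contains a tab character, commit refused")
    return 1 if bad else 0
```

A hook script would end with something like `sys.exit(main(sys.argv[1:]))`. Note that this also rejects a literal TAB inside a string, which is stricter than an indentation-only rule.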

May 18 '06 #109

Duncan Booth

PoD wrote:

> How many levels of indentation does 12 spaces indicate?
> It could be 1,2,3,4,6 or 12. If you say it's 3 then you are
> _implying_ that each level is represented by 4 spaces.

By reading the code I can see how many levels of indentation it
represents.

> How many levels of indentation is 3 tabs? 3 levels in any code that
> you will find in the wild.

No. That is precisely the problem: there is code in the wild which
contains mixed space and tab indentation, and any time that happens 3
tabs could mean any number of indentations.

Now, I just know someone is going to challenge me over my assertion that
there really could be code with mixed spaces and tabs out there, so here
are a few examples found by grepping a Plone Products folder. All the
projects below use spaces almost everywhere for indentation, but it looks
like a few tabs slipped through.

http://svn.plone.org/view/archetypes...5111&view=auto

contains tabs at the start of two lines. Fortunately these are
continuation lines so it doesn't really matter how you display them. I
think they are intended to be displayed with tab-size=8.

http://svn.plone.org/view/archetypes...4970&view=auto

One tab used for indentation. The block is only one line long so the code
doesn't break whatever tabsize you use, but visually it would appear the
intended tabsize is 0.

http://svn.plone.org/view/plone/CMFP...9836&view=auto

A tab is used for two levels of indentation. Anything other than tabsize=8
would cause a syntax error.

http://svn.plone.org/view/plone/CMFP...9836&view=auto

Lots of tabs, most but not all on continuation lines. The two which aren't
are on single line blocks with a single tab representing two indents.

CMFPlone\tests\testInterfaces.py
CMFPlone\tests\testTranslationServiceTool.py
ExternalEditor (various files)
kupu (spellcheck.py)

and finally, at the end of my Plone Products directory I found this beauty
where I've replaced the tab characters with <tab> to make them visible:

svn://svn.zope.org/repos/main/Zelenium/trunk/scripts/tinyWebServer.py

if __name__ == '__main__':
<tab>port = PORT
<tab>if len(sys.argv) > 1:
<tab>    port = int(sys.argv[1])
<tab>
        server_address = ('', port)
<tab>httpd = BaseHTTPServer.HTTPServer(server_address, HTTPHandler)

<tab>print "serving at port", port
<tab>print "To run the entire JsUnit test suite, open"
<tab>print " http://localhost:8000/jsunit/testRunner.html?testPage=http://localhost:8000/tests/JsUnitSuite.html&autoRun=true"
<tab>print "To run the acceptance test suite, open"
<tab>print " http://localhost:8000/TestRunner.html"

<tab>while not HTTPHandler.quitRequestReceived :
<tab>    httpd.handle_request()<tab>
<tab>

This is a genuine example of code in the wild which will look like
syntactically valid Python at either tab-size=4 or tab-size=8, but
if you view it at tab-size=4 you will see different block indentation
than the Python interpreter uses at tab-size=8.

At tab-size=4 it reads:

if __name__ == '__main__':
    port = PORT
    if len(sys.argv) > 1:
        port = int(sys.argv[1])

        server_address = ('', port)
    httpd = BaseHTTPServer.HTTPServer(server_address, HTTPHandler)

    print "serving at port", port
    print "To run the entire JsUnit test suite, open"
    print " http://localhost:8000/jsunit/testRunner.html?testPage=http://localhost:8000/tests/JsUnitSuite.html&autoRun=true"
    print "To run the acceptance test suite, open"
    print " http://localhost:8000/TestRunner.html"

    while not HTTPHandler.quitRequestReceived :
        httpd.handle_request()

but at tab-size=8 it reads:

if __name__ == '__main__':
        port = PORT
        if len(sys.argv) > 1:
            port = int(sys.argv[1])

        server_address = ('', port)
        httpd = BaseHTTPServer.HTTPServer(server_address, HTTPHandler)

        print "serving at port", port
        print "To run the entire JsUnit test suite, open"
        print " http://localhost:8000/jsunit/testRunner.html?testPage=http://localhost:8000/tests/JsUnitSuite.html&autoRun=true"
        print "To run the acceptance test suite, open"
        print " http://localhost:8000/TestRunner.html"

        while not HTTPHandler.quitRequestReceived :
            httpd.handle_request()

I wouldn't have a problem with tabs if Python rejected mixed indentation by
default, because then none of the code above would execute. But it doesn't.
:(
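
The effect can be reproduced by computing the indentation column of each line at both tab sizes. A Python sketch: `indent_column` is a hypothetical helper, and the exact whitespace in `lines` (a tab followed by four spaces, and a line of eight spaces) is an assumption reconstructed from the description above, not copied from the repository.

```python
def indent_column(line: str, tabsize: int) -> int:
    """Column of the first non-blank character, with tabs advancing to
    the next multiple of `tabsize`."""
    col = 0
    for ch in line:
        if ch == "\t":
            col = (col // tabsize + 1) * tabsize
        elif ch == " ":
            col += 1
        else:
            break
    return col


# Assumed whitespace, for illustration only.
lines = [
    "\tif len(sys.argv) > 1:",
    "\t    port = int(sys.argv[1])",
    "        server_address = ('', port)",
    "\thttpd = BaseHTTPServer.HTTPServer(server_address, HTTPHandler)",
]

for tabsize in (4, 8):
    print(tabsize, [indent_column(line, tabsize) for line in lines])
# 4 [4, 8, 8, 4]   server_address looks like part of the if block
# 8 [8, 12, 8, 8]  server_address is level with the if statement
```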

Anyone got a subversion checkin hook to reject mixed indentation? I think that big
repositories like Zope and Plone could benefit from it.
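
One possible shape for the check inside such a hook, as a Python sketch. It is not an existing Subversion tool: `indent_chars` and `has_mixed_indentation` are hypothetical names, and wiring it into a repository hook is left out.

```python
def indent_chars(source: str) -> set:
    """Whitespace characters used for indentation anywhere in the file.

    Whitespace-only lines are ignored.
    """
    chars = set()
    for line in source.splitlines():
        body = line.lstrip(" \t")
        if body:
            chars.update(line[: len(line) - len(body)])
    return chars


def has_mixed_indentation(source: str) -> bool:
    """True if the file indents with both tabs and spaces."""
    return indent_chars(source) == {" ", "\t"}


print(has_mixed_indentation("if x:\n\ty = 1\n        z = 2\n"))   # True
print(has_mixed_indentation("if x:\n\ty = 1\n\tz = 2\n"))         # False
```

This file-level test is deliberately blunt: it also flags tab-indented code that aligns continuation lines with spaces.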

I just ran the same grep on the Python source tree. Not a tab in sight. :)

May 18 '06 #110

Christophe

PoD wrote:

> On Wed, 17 May 2006 21:37:14 +0800, Andy Sy wrote:
>
>> If tabs are easily misunderstood, then they are a MISfeature
>> and they need to be removed.
>> From the Zen of Python:
>> "Explicit is better than implicit..."
>> "In the face of ambiguity, refuse the temptation to guess..."
>> "Special cases aren't special enough to break the rules..."
>
> Exactly.
> How many levels of indentation does 12 spaces indicate?
> It could be 1,2,3,4,6 or 12. If you say it's 3 then you are _implying_
> that each level is represented by 4 spaces.

Actually, who said you had to always use the same number of spaces to
indent? 12 = 6 + 6 = 4 + 4 + 4 but also 12 = 2 + 10 = 1 + 1 + 3 + 3 + 4 :D

> How many levels of indentation is 3 tabs? 3 levels in any code that you
> will find in the wild.

No, it could be 3 levels or 3 tabs per level or 2 tabs for the first
level and 1 tab for the second ...
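
Python's own tokenizer confirms this: nesting is tracked as a stack of indentation widths, not by dividing a width by a fixed step. A sketch using the standard `tokenize` module; `max_depth` is a hypothetical helper.

```python
import io
import tokenize


def max_depth(source: str) -> int:
    """Deepest block nesting, counted from INDENT/DEDENT tokens."""
    depth = deepest = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.INDENT:
            depth += 1
            deepest = max(deepest, depth)
        elif tok.type == tokenize.DEDENT:
            depth -= 1
    return deepest


even = "if a:\n    if b:\n        if c:\n            x = 1\n"   # 4 + 4 + 4
uneven = "if a:\n  if b:\n            x = 1\n"                  # 2 + 10
tabs = "if a:\n\t\t\tx = 1\n"                                   # 3 tabs, 1 level

print(max_depth(even), max_depth(uneven), max_depth(tabs))      # 3 2 1
```

In the first two sources `x = 1` sits behind twelve spaces, yet the nesting depth differs.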

May 18 '06 #111

achates

Edward Elliott wrote:

What really should happen is that every time an editor reads in source code,
the code is reformatted for display according to the user's settings. The
editor becomes a parser, breaking the code down into tokens and emitting it
in a personally preferred format.

I completely agree, and I guess that is what I was groping towards in
my remarks about using modern editing tools.

At the same time I would resist any move towards making source files
less human-readable. There will still be times when those tools aren't
available (e.g. for people working on embedded s/w or legacy systems),
and that's when having ASCII source with tabbed indentation would be so
useful. But it looks, sadly, like we're fighting a rearguard action on
that one.

May 18 '06 #112

Alain Picard

"Bill Pursell" <bi**********@gmail.com> writes:

In my experience, the people who complain about the use
of tabs for indentation are the people who don't know
how to use their editor, and those people tend to use
emacs.

HA HA HA HA HA HA HA HA HA HA HA HA ....

Tee, hee heee.... snif!

Phew. Better now.

That was funny! Thanks! :-)

May 18 '06 #113

achates

Duncan Booth wrote:

No. That is precisely the problem: there is code in the wild which
contains mixed space and tab indentation...
<followed by some good examples of mixed tab and space indentation>
I wouldn't have a problem with tabs if Python rejected mixed indentation by
default, because then none of the code above would execute.

I think it's great that at least we're all agreed that mixed
indentation is a bad idea in any code, and particularly in Python.

How would people feel about having the -t (or even -tt) behaviour
become the default in future Python releases? A legacy option would
obviously need to be provided for the old default behaviour.
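
For comparison, Python 3 (which postdates this thread) rejects indentation whose meaning depends on the tab size. A small sketch that shows it with the built-in `compile`; `compiles` is a hypothetical helper and the snippets are made up for the example.

```python
def compiles(source: str) -> bool:
    """True if the source compiles, False if Python raises TabError."""
    try:
        compile(source, "<example>", "exec")
    except TabError:
        return False
    return True


# One line indented with a tab, the next with eight spaces: whether the
# two lines form one block depends on the tab size, so it is refused.
mixed = "if True:\n\tx = 1\n        y = 2\n"
tabs_only = "if True:\n\tx = 1\n\ty = 2\n"

print(compiles(mixed))       # False
print(compiles(tabs_only))   # True
```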

May 18 '06 #114

Pascal Bourguignon

Edmond Dantes <ed****@le-comte-de-monte-cristo.biz> writes:

It all depends on your editor of choice. Emacs editing of Lisp (and a few
other languages, such as Python) makes the issue more or less moot. I
personally would recommend choosing one editor to use with all your
projects, and Emacs is wonderful in that it has been ported to just about
every platform imaginable.

The real issue is, of course, that ASCII is showing its age and we should
probably supplant it with something better. But I know that will never fly,
given the torrents of code, configuration files, and everything else in
ASCII. Even Unicode couldn't put a dent in it, despite the obvious growing
global development efforts. Not sure how many compilers would be able to
handle Unicode source anyway. I suspect the large majority of them would
choke big time.

All right, Unicode support is not 100% perfect yet, but my main
compilers support it perfectly well; only 1/5 don't support it, and
1/5 support it partially:

------(unicode-script.lisp)---------------------------------------------

(defun clisp (file)
  (ext:run-program "/usr/local/bin/clisp"
                   :arguments (list "-ansi" "-norc" "-on-error" "exit"
                                    "-E" "utf-8"
                                    "-i" file "-x" "(ext:quit)")
                   :input nil :output :terminal :wait t))

(defun gcl (file)
  (ext:run-program "/usr/local/bin/gcl"
                   :arguments (list "-batch"
                                    "-load" file "-eval" "(lisp:quit)")
                   :input nil :output :terminal :wait t))

(defun ecl (file)
  (ext:run-program "/usr/local/bin/ecl"
                   :arguments (list "-norc"
                                    "-load" file "-eval" "(si:quit)")
                   :input nil :output :terminal :wait t))

(defun sbcl (file)
  (ext:run-program "/usr/local/bin/sbcl"
                   :arguments (list "--userinit" "/dev/null"
                                    "--load" file "--eval" "(sb-ext:quit)")
                   :input nil :output :terminal :wait t))

(defun cmucl (file)
  (ext:run-program "/usr/local/bin/cmucl"
                   :arguments (list "-noinit"
                                    "-load" file "-eval" "(extensions:quit)")
                   :input nil :output :terminal :wait t))

(dolist (implementation '(clisp gcl ecl sbcl cmucl))
  (sleep 3)
  (terpri) (print implementation) (terpri)
  (funcall implementation "unicode-source.lisp"))

------(unicode-source.lisp)---------------------------------------------
;; -*- coding: utf-8 -*-

(eval-when (:compile-toplevel :load-toplevel :execute)
  (format t "~2%~A ~A~2%"
          (lisp-implementation-type)
          (lisp-implementation-version))
  (finish-output))

(defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
  (loop :for i :from בכוכ :to номер :by 단계 :collect i))

(defun test ()
  (format t "~%Calling ~S --> ~A~%"
          '(ιοτα :номер 10 :단계 2 :בכוכ 2)
          (ιοτα :номер 10 :단계 2 :בכוכ 2)))

(test)

------------------------------------------------------------------------

(load"unicode-script.lisp")
;; Loading file unicode-script.lisp ...

CLISP
  i i i i i i i       ooooo    o        ooooooo   ooooo   ooooo
  I I I I I I I      8     8   8           8     8     o  8    8
  I  \ `+' /  I      8         8           8     8        8    8
   \  `-+-'  /       8         8           8      ooooo   8oooo
    `-__|__-'        8         8           8           8  8
        |            8     o   8           8     o     8  8
  ------+------       ooooo    8oooooo  ooo8ooo   ooooo   8

Copyright (c) Bruno Haible, Michael Stoll 1992, 1993
Copyright (c) Bruno Haible, Marcus Daniels 1994-1997
Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998
Copyright (c) Bruno Haible, Sam Steingold 1999-2000
Copyright (c) Sam Steingold, Bruno Haible 2001-2006

;; Loading file unicode-source.lisp ...

CLISP 2.38 (2006-01-24) (built 3347193361) (memory 3347193794)
Calling (ΙΟΤΑ :НОМЕР 10 :단계 2 :בכוכ 2) --> (2 4 6 8 10)
;; Loaded file unicode-source.lisp
Bye.
GCL
GNU Common Lisp (GCL) GCL 2.6.7
Calling (ιοτα :номер 10 :단계 2 :בכוכ 2) --> (2 4 6 8
10)
ECL
;;; Loading "unicode-source.lisp"
ECL 0.9g
Calling (ιοτα :номер 10 :단계 2 :בכוכ 2) --> (2 4 6 8 10)
SBCL
This is SBCL 0.9.12, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
SBCL 0.9.12
Calling (|ιοτα| :|номер| 10 :|Ë‹Ã‚Â¨Ê³Ã‚Â„| 2 :|בכוכ| 2) --> (2 4 6 8 10)
CMUCL
; Loading #P"/local/users/pjb/src/lisp/encours/unicode-source.lisp".
CMU Common Lisp 19c (19C)
Reader error at 214 on #<Stream for file "/local/users/pjb/src/lisp/encours/unicode-source.lisp">:
Undefined read-macro character #\Î
[Condition of type READER-ERROR]

Restarts:
0: [CONTINUE] Return NIL from load of "unicode-source.lisp".
1: [ABORT ] Skip remaining initializations.

Debug (type H for help)

(LISP::%READER-ERROR
#<Stream for file "/local/users/pjb/src/lisp/encours/unicode-source.lisp">
"Undefined read-macro character ~S"
#\Î)
Source: Error finding source:
Error in function DEBUG::GET-FILE-TOP-LEVEL-FORM: Source file no longer exists:
target:code/reader.lisp.
0] abort
*
Received EOF on *standard-input*, switching to *terminal-io*.
* (extensions:quit)
;; Loaded file unicode-script.lisp
T
[4]>
--
__Pascal Bourguignon__ http://www.informatimago.com/
Grace personified,
I leap into the window.
I meant to do that.

May 18 '06 #115

Jonathon McKitrick

Pascal Bourguignon wrote:

(defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
  (loop :for i :from בכוכ :to номер :by 단계 :collect i))

How do you even *enter* these characters? My browser seems to trap all
the special character combinations, and I *know* you don't mean
selecting from a character palette.

à¿¿ hey, this is weird...

î

I've got something happening, but I can't tell what.

Yes, I'm an ignorant Western world ASCII user. :-)

May 18 '06 #116

Edward Elliott

Christophe wrote:

No, it's really easy: a simple pre-commit hook which will refuse any .py
file with the \t char in it and it's done ;)

$ echo \t
t

Why would you wan_ _o remove all _ee charac_ers? Isn'_ _ha_ a li__le
awkward?

--
Edward Elliott
UC Berkeley School of Law (Boalt Hall)
complangpython at eddeye dot net

May 18 '06 #117

Pascal Bourguignon

"Jonathon McKitrick" <j_*********@bigfoot.com> writes:

Pascal Bourguignon wrote:
(defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
  (loop :for i :from בכוכ :to номер :by 단계 :collect i))

How do you even *enter* these characters? My browser seems to trap all
the special character combinations, and I *know* you don't mean
selecting from a character palette.

Why? Of course!
Aren't you either an emacs or a Mac user?

On a Mac, you just select the input keyboard from the Input menu (the
little flag on the right of the menubar, you may activate it from the
International System Preference panel).

On emacs, it's as simple: M-x set-input-method RET

I've bound C-F9, C-F10, C-F11, and C-F12 to various input methods:

(global-set-key [C-f9] (lambda()(interactive)(set-input-method 'chinese-py-b5)))
(global-set-key [C-f10] (lambda()(interactive)(set-input-method 'cyrillic-yawerty)))
(global-set-key [C-f11] (lambda()(interactive)(set-input-method 'greek)))
(global-set-key [C-f12] (lambda()(interactive)(set-input-method 'hebrew)))

C-\ is bound to toggle-input-method, which lets you revert to the
usual input method.

For the alphabetic scripts, there's no difficulty, it's like with
roman scripts: each key is a character. For ideographic scripts, the
input methods are more sophisticated.

Then, you have to learn some of these strange languages. I learned
several (but I forgot everything but: לודג גד דג ינד, здраствуйте, я
люблю тибе, 我 è½é¾, 我 不 中国人). For the Korean, I copy-and-pasted
it from some web translation service. But keying them in is the
easiest part.

--
__Pascal Bourguignon__ http://www.informatimago.com/
Cats meow out of angst
"Thumbs! If only we had thumbs!
We could break so much!"

May 18 '06 #118

Oliver Bandel

Jonathon McKitrick wrote:

Pascal Bourguignon wrote:
(defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
  (loop :for i :from בכוכ :to номер :by 단계 :collect i))

How do you even *enter* these characters? My browser seems to trap all
the special character combinations, and I *know* you don't mean
selecting from a character palette.

Haven't you heard of those big keyboards?

12 meters x 2 meters wide I think.... you need a long
stick (maybe if you play golf, that can help).

Then you have all UTF-8 characters there, that's fine,
but typing needs some time.
But it's good, because when you're done typing your email,
it's not necessary to go to sports after work. So your boss
can insist that you stay longer at work.
Ciao,
Oliver

;-)

May 18 '06 #119

PoD

On Thu, 18 May 2006 08:30:03 +0000, Duncan Booth wrote:

PoD wrote:
How many levels of indentation does 12 spaces indicate?
It could be 1,2,3,4,6 or 12. If you say it's 3 then you are
_implying_ that each level is represented by 4 spaces.

By reading the code I can see how many levels of indentation it
represents.
How many levels of indentation is 3 tabs? 3 levels in any code that
you will find in the wild.

No. That is precisely the problem: there is code in the wild which
contains mixed space and tab indentation, and any time that happens 3
tabs could mean any number of indentations.

I think it is universally accepted that mixed tabs and spaces is indeed
**EVIL**

I should have said any code using tabs exclusively.

May 19 '06 #120
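
A minimal Python sketch of the counting argument in the exchange above. It is not code from the thread; the names `possible_levels` and `tab_levels` are made up for illustration, and it assumes every level uses the same whole number of spaces.

```python
def possible_levels(prefix_width: int) -> list:
    """Return every level count that a run of `prefix_width` spaces could mean
    if each level uses the same whole number of spaces."""
    return [levels for levels in range(1, prefix_width + 1)
            if prefix_width % levels == 0]


def tab_levels(prefix: str) -> int:
    """With tab-only indentation, one tab is one level, so just count them."""
    if prefix.strip("\t"):
        raise ValueError("prefix contains something other than tabs")
    return len(prefix)


print(possible_levels(12))   # [1, 2, 3, 4, 6, 12]
print(tab_levels("\t\t\t"))  # 3
```

The second function only gives a meaningful answer under the "tabs exclusively" condition PoD adds at the end of the post; once spaces are mixed in, it refuses rather than guesses.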

PoD

On Thu, 18 May 2006 10:33:58 +0200, Christophe wrote:

PoD wrote:
On Wed, 17 May 2006 21:37:14 +0800, Andy Sy wrote:

If tabs are easily misunderstood, then they are a MISfeature
and they need to be removed.

From the Zen of Python:

"Explicit is better than implicit..."
"In the face of ambiguity, refuse the temptation to guess..."
"Special cases aren't special enough to break the rules..."

Exactly.
How many levels of indentation does 12 spaces indicate?
It could be 1,2,3,4,6 or 12. If you say it's 3 then you are _implying_
that each level is represented by 4 spaces.

Actually, who said you had to always use the same number of spaces to
indent ? 12 = 6 + 6 = 4 + 4 + 4 but also 12 = 2 + 10 = 1 + 1 + 3 + 3 + 4 :D

Thus supporting my assertion that space indenting is implicit not
explicit. Spaces are evil.

How many levels of indentation is 3 tabs? 3 levels in any code that
you will find in the wild.

No, it could be 3 levels or 3 tabs per level or 2 tabs for the first
level and 1 tab for the second ...

Could be but wouldn't be.

Maybe what Python should do (but never will given the obsession with using
spaces) is only allow one level of indentation increase per block so that

def foo():
<TAB><TAB>return 'bar'

would return a syntax error

May 19 '06 #121
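
The rule PoD proposes could be sketched as a small checker like the one below. This is an illustration under stated assumptions, not how Python's tokenizer works: `check_single_step_indent` is a hypothetical name, the input is assumed to be tab-indented, and continuation lines and multi-line strings are ignored.

```python
def check_single_step_indent(source: str) -> list:
    """Return (line_number, message) pairs for lines that break the proposed rule.

    Assumes tab-only indentation; blank lines are skipped. This is a plain
    line scan, so continuation lines and multi-line strings are not handled.
    """
    problems = []
    previous_level = 0
    for number, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            continue
        stripped = line.lstrip("\t")
        level = len(line) - len(stripped)
        if stripped[0] == " ":
            problems.append((number, "space in indentation"))
        elif level > previous_level + 1:
            problems.append((number, f"indent jumps from {previous_level} to {level}"))
        previous_level = level
    return problems


good = "def foo():\n\treturn 'bar'\n"
bad = "def foo():\n\t\treturn 'bar'\n"

print(check_single_step_indent(good))  # []
print(check_single_step_indent(bad))   # [(2, 'indent jumps from 0 to 2')]

# Python as it stands accepts the doubled tab without complaint.
compile(bad, "<example>", "exec")
```

The final `compile` call shows the current behaviour the post is contrasting against: any deeper indentation opens the block, however many tabs it uses.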

Duncan Booth

PoD wrote:

I think it is universally accepted that mixed tabs and spaces is indeed
**EVIL**

I should have said any code using tabs exclusively.

Can you point at any significant body of publicly visible Python code
which uses tabs exclusively? All of the Python projects I've ever been
involved with use spaces only as a convention (although as I pointed out in
my previous post, some with more success than others).

The problem with conventions such as 'tabs only' or 'space only' is that
they only work if everyone sticks to the conventions, and it helps if the
same conventions are in place everywhere (otherwise people forget when they
switch from one project to another). Also, in the open source universe you
are quite likely to pull in bits of code from other projects, and you don't
want to either have to reformat it or to switch your editor settings for
some files.

My experience of programming with either spaces or tabs has taught me
that tabs are evil not for themselves, but simply because no matter how
hard you try they always end up being mixed with spaces.

Do you know of any open-source projects which actually try to enforce a
'tab only' convention for Python? I'd really like to see a similar scan
over some 'tab only' code as I did over Plone to see whether they actually
manage to remain 'pure'.

May 19 '06 #122
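
A scan of the kind Duncan Booth describes might look roughly like this. It is a sketch, not the script he ran over Plone; `classify_indentation` is a hypothetical helper that only looks at the leading whitespace of each line.

```python
import re
from collections import Counter

LEADING = re.compile(r"[ \t]*")


def classify_indentation(source: str) -> Counter:
    """Count indented lines by the kind of leading whitespace they use."""
    counts = Counter()
    for line in source.splitlines():
        prefix = LEADING.match(line).group()
        if not prefix or not line.strip():
            continue  # unindented or blank lines say nothing
        if set(prefix) == {"\t"}:
            counts["tabs"] += 1
        elif set(prefix) == {" "}:
            counts["spaces"] += 1
        else:
            counts["mixed"] += 1
    return counts


sample = "def f():\n\tif True:\n\t\tpass\n    return 1\n\t  x = 2\n"
print(classify_indentation(sample))
```

To scan a project one would read each `.py` file and sum the counters. Note that "mixed" here also catches lines that deliberately use tabs for indentation and spaces for alignment, so a non-zero count needs a human look before it is called impure.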

Christophe

PoD wrote:

Maybe what Python should do (but never will given the obsession with using
spaces) is only allow one level of indentation increase per block so that

def foo():
<TAB><TAB>return 'bar'

would return a syntax error

Which would make <TAB> mandatory for indentation. What about some
freedom of choice ?

May 19 '06 #123

Sybren Stuvel

Duncan Booth enlightened us with:

Can you point at any significant body of publicly visible Python
code which uses tabs exclusively?

Everything Python at http://www.stuvel.eu/software

Also, in the open source universe you are quite likely to pull in
bits of code from other projects, and you don't want to either have
to reformat it or to switch your editor settings for some files.

If I grab a module, I just leave the module as is. If I grab a code
snippet, I always reformat it to my own style. That's very easy using
VIM's "retab" command.

Do you know of any open-source projects which actually try to enforce a
'tab only' convention for Python?

My software does, although I'm still the only one working on it ;-)

Sybren
--
The problem with the world is stupidity. Not saying there should be a
capital punishment for stupidity, but why don't we just take the
safety labels off of everything and let the problem solve itself?
Frank Zappa

May 19 '06 #124

Dave Hansen

On 19 May 2006 07:18:03 GMT in comp.lang.python, Duncan Booth
<du**********@invalid.invalid> wrote:

[...]

My experience of programming with either spaces or tabs has taught me
that tabs are evil not for themselves, but simply because no matter how
hard you try they always end up being mixed with spaces.

That's been my experience as well. At least on projects with more
than one programmer. And more than once with single-programmer
projects where the programmer changed or updated his editor in the
middle...

Regards,
-=Dave

--
Change is inevitable, progress is not.

May 19 '06 #125

PoD

On Fri, 19 May 2006 10:04:15 +0200, Christophe wrote:

PoD wrote:
Maybe what Python should do (but never will given the obsession with using
spaces) is only allow one level of indentation increase per block so that

def foo():
<TAB><TAB>return 'bar'

would return a syntax error

Which would make <TAB> mandatory for indentation. What about some
freedom of choice ?

Hey, if people are allowed to say that tabs should be banned, then I'm
allowed to say they should be mandatory ;)

May 19 '06 #126

Peter Decker

On 19 May 2006 07:18:03 GMT, Duncan Booth <du**********@invalid.invalid>

Can you point at any significant body of publicly visible Python code
which uses tabs exclusively? All of the Python projects I've ever been
involved with use spaces only as a convention (although as I pointed out in
my previous post, some with more success than others).

Dabo. http://dabodev.com

--

# p.d.

May 19 '06 #127

Christopher Weimann

On 05/19/2006-07:18AM, Duncan Booth wrote:

My experience of programming with either spaces or tabs has taught me
that tabs are evil not for themselves, but simply because no matter how
hard you try they always end up being mixed with spaces.

Swap the word 'tabs' for the word 'spaces' and you get...

My experience of programming with either tabs or spaces has taught me
that spaces are evil not for themselves, but simply because no matter how
hard you try they always end up being mixed with tabs.

Which is just as valid as the un-swapped paragraph. Both versions
express a bias. The first is biased in favor of spaces. The second is
biased in favor of tabs. Neither has any useful content. Mixing is bad
but that fact doesn't favor spaces OR tabs.

May 19 '06 #128

Roedy Green

On Mon, 15 May 2006 02:44:54 GMT, Eli Gottlieb <el*********@gmail.com>
wrote, quoted or indirectly quoted someone who said :

Actually, spaces are better for indenting code.

Agreed. All it takes is one programmer to use a different tab
expansion convention to screw up a project. Spaces are unambiguous.

Ideally though you should run code through a beautifier before checkin
to avoid false deltas with people manually formatting code slightly
differently.
--
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

May 20 '06 #129

Andy Sy

achates wrote:

Yeah - we've got to the repeating ourselves stage.

Actually a couple of the responses on this newsgroup
have settled the question for me. I did learn something
new by engaging in this holy war.
Tabs need not be evil, but ONLY if they are used in one
particular way:

If you really must use tabs, use them *ONLY* for 'semantic'
indentation and use pure spaces when you need arbitrary
indentation (e.g. arbitrary spacing).

PHP Example
===========

function xyz() {
->print "This is a really really really long line that I'd ".
->      "like to extend to the one down below while preserving ".
->      "ease of readability by aligning the start of the string ".
->      "lines. I use tabs to reflect syntactical (e.g. semantic) ".
->      "indentation, but use spaces for arbitrary indentation.\n\n";
->while (10) {
->->print "This code will align properly no matter what tabsize setting ".
->->      "the reader uses, and is the only real benefit of using tab ".
->->      "characters instead of pure spaces.\n\n";
->}
}

I did some basic tests, and it looks like this style should also
work for Python code.
THIS IS THE *SINGLE* CORRECT WAY TO USE TABS IN CODE! ANYTHING
ELSE WILL BE A PAIN FOR PEOPLE WITH TABSIZE SETTINGS DIFFERENT
FROM YOURS!!

But that's the problem with this issue: it's really hard to get the
space-indenters to actually think about it and to address what is being
said. Every time it comes up, there's always a few people trying to
explain why tabs are a good idea, facing a whole raft of others
spouting stuff like:

Most of the tab-indenters have the same attitude. But I did get ALL
of my points against tabs addressed this time. And thank you to those
people including you for the 'less -x<tabstop>' tip.

But unfortunately the situation is worse than that: tab indentation
needs to be actively defended.

A world where everyone uses pure spaces would be far far far better
than the current situation where tab users can't even decide what
the universal tabsize should be and whose code alignment breaks
on different tabsize settings.

However, a world where EVERYONE who uses tabs strictly practices the
ONE TRUE TAB WAY would be a *slightly* better place than one where
everyone used pure spaces.

And remember, the ONE TRUE TAB WAY does not come for free and involves
a certain amount of tedium and attention. It will be impractical on
most simple text editors. For most people, I frankly don't think the
minor benefits are worth it.

Unlikely perhaps. I hope so. It's a cruel irony that Python's creator
didn't appreciate the benefits that tab indentation would bring to his
own language - the only major language in which indentation levels
actually have semantic significance.

You might want to run those benefit/s/ by me again, because the SINGLE
benefit I can still see to using tabs is so that people who have
different indentation width preferences can view code according to
the way they want. And remember, this benefit will ONLY occur
IF people stick to using the ONE TRUE TAB WAY outlined above.

Any other method of tabbing would just be worse than a pure-spaces
world.

--
It's called DOM+XHR and it's *NOT* a detergent!

May 20 '06 #130
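
The claim that tabs-for-indentation plus spaces-for-alignment survives any tab size can be checked with a short Python sketch. The helper names are invented for this example, and the two-line samples are simplified stand-ins for the PHP snippet in the post.

```python
def text_column(line: str, tab_size: int) -> int:
    """Column where the first non-whitespace character lands for a given tab size."""
    expanded = line.expandtabs(tab_size)
    return len(expanded) - len(expanded.lstrip(" "))


def offsets(lines, tab_size):
    """Offset of each line's text relative to the first line's text."""
    base = text_column(lines[0], tab_size)
    return [text_column(line, tab_size) - base for line in lines]


def alignment_is_stable(lines, tab_sizes=(2, 4, 8)) -> bool:
    """True if the relative offsets are identical for every tab size tried."""
    results = [offsets(lines, size) for size in tab_sizes]
    return all(result == results[0] for result in results)


# Tab for the indentation level, six spaces to line up under the string.
tabs_then_spaces = ['\tprint "first part " .', '\t      "second part";']
# Tabs used for the alignment as well.
tabs_only = ['\tprint "first part " .', '\t\t"second part";']

print(alignment_is_stable(tabs_then_spaces))  # True
print(alignment_is_stable(tabs_only))         # False
```

In the first sample the continuation line always sits six columns to the right of `print`, whatever the reader's tab size. In the second the gap equals the tab size itself, so it changes from reader to reader.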

Christophe Cavalaria

Christopher Weimann wrote:

On 05/19/2006-07:18AM, Duncan Booth wrote:

My experience of programming with either spaces or tabs has taught me
that tabs are evil not for themselves, but simply because no matter how
hard you try they always end up being mixed with spaces.

Swap the word 'tabs' for the word 'spaces' and you get...

My experience of programming with either tabs or spaces has taught me
that spaces are evil not for themselves, but simply because no matter
how hard you try they always end up being mixed with tabs.

Which is just as valid as the un-swapped paragraph. Both versions
express a bias. The first is biased in favor of spaces. The second is
biased in favor of tabs. Neither has any useful content. Mixing is bad
but that fact doesn't favor spaces OR tabs.

The difference is that you cannot code without spaces but you can do it
without tabs.

May 20 '06 #131

Andy Sy

Andy Sy wrote:

Actually a couple of the responses on this newsgroup
have settled the question for me. I did learn something
new by engaging in this holy war.
Tabs need not be evil, but ONLY if they are used in one
particular way:

If you really must use tabs, use them *ONLY* for 'semantic'
indentation and use pure spaces when you need arbitrary
indentation (e.g. arbitrary spacing).

PHP Example
===========

function xyz() {
->print "This is a really really really long line that I'd ".
->      "like to extend to the one down below while preserving ".
->      "ease of readability by aligning the start of the string ".
->      "lines. I use tabs to reflect syntactical (e.g. semantic) ".
->      "indentation, but use spaces for arbitrary indentation.\n\n";
->while (10) {
->->print "This code will align properly no matter what tabsize setting ".
->->      "the reader uses, and is the only real benefit of using tab ".
->->      "characters instead of pure spaces.\n\n";
->}
}

I did some basic tests, and it looks like this style should also
work for Python code.
THIS IS THE *SINGLE* CORRECT WAY TO USE TABS IN CODE! ANYTHING
ELSE WILL BE A PAIN FOR PEOPLE WITH TABSIZE SETTINGS DIFFERENT
FROM YOURS!!

Also... remember that the 'ONE TRUE WAY' essentially involves *mixing*
tabs and spaces for indentation with all the objections that that
entails... (although like mentioned above, it should work with Python,
at least in the simple cases i've tried)

Frankly, the case for tab usage is not that compelling...

--
It's called DOM+XHR and it's *NOT* a detergent!

May 22 '06 #132

Edward Elliott

Andy Sy wrote:

[snipped 50 lines of previous message]

Also... remember that the 'ONE TRUE WAY' essentially involves *mixing*
tabs and spaces for indentation with all the objections that that
entails... (although like mentioned above, it should work with Python,
at least in the simple cases i've tried)

Frankly, the case for tab usage is not that compelling...

Quoting usenet posts is like hunting buffalo: only take what you need.
http://www.xs4all.nl/~wijnands/nnq/nquote.html#Q2

--
Edward Elliott
UC Berkeley School of Law (Boalt Hall)
complangpython at eddeye dot net

May 22 '06 #133

Xah Lee

the following are 2 FAQ following this thread. Thanks.

Addendum: 2006-05-15

Q: What do you mean by embedding tab position info into the source code?
How's that gonna be done?

A: Tech geekers may not realize it, but such embedding of meta info does
exist in many technologies, by various means, because there is a need for
it. For example, Mac OS Classic's resource fork and Mac OS X's bundling system,
unix shell script's shebang (#!), emacs and Python's encoding
declaration “#-*- coding: utf-8 -*-”, Unicode's BOM, CVS's
change-log insertion, Mathematica's source code system the Notebook,
Microsoft Word's transparent meta data, as well as HTML and XML's
various declarations embedded in the file. Some of these systems are
good designs and some are hacks.

Somehow tech geekers have the sense that “source code” must be a
plain text file containing nothing else but the programming code. This
may be a defensible position, but as we can see in the above examples,
this idea is primitive and does not address the various needs. Had the
tech geekers thought these issues through, computing languages
and their source code might have developed into more powerful and flexible
integrated systems, like the standardized examples above. For instance,
many commercial development systems actually already have such
meta-data embedded with the source code (e.g. Borland Delphi,
Metrowerks's CodeWarrior, Microsoft Visual Studio, Wolfram Research's
Mathematica). Some of these not only embed development-related info
such as debug points or linking files, but also allow programmers to
highlight code for visual purposes as in a word processor, or even
display it visually as typeset mathematics.

Q: Converting spaces to tabs is actually easy. I don't see how spaces
lose info.

A: Here is an illustration of how it is not possible to convert spaces
to tabs. Suppose you are writing in a language where the indentation is
part of the semantics, not just for appearance. Now, suppose you have
these two lines (shown under a column ruler):

1234567890
  A
    B

Line A has a 2-space prefix and line B has a 4-space prefix.
Now, if you convert this to tabs, how do you know whether that's 1 and 2
tabs, or 2 and 4 tabs? In essence, there is no way to tell how many tabs n
spaces represent, where n is the smallest space prefix in the code, unless n
== 1.
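
The ambiguity can be made concrete with a small Python sketch. `spaces_to_tabs` is a hypothetical helper, and the `unit` argument is exactly the piece of information the spaces themselves do not carry.

```python
def spaces_to_tabs(line: str, unit: int) -> str:
    """Replace a leading run of spaces with tabs, assuming `unit` spaces per level."""
    stripped = line.lstrip(" ")
    width = len(line) - len(stripped)
    if width % unit:
        raise ValueError(f"{width} spaces is not a multiple of {unit}")
    return "\t" * (width // unit) + stripped


lines = ["  A", "    B"]

print([spaces_to_tabs(line, 2) for line in lines])  # ['\tA', '\t\tB']
print([spaces_to_tabs(line, 1) for line in lines])  # ['\t\tA', '\t\t\t\tB']
```

Both outputs are consistent with the input; the converter has to be told, or has to guess, which one the author meant.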

The above demonstrates the information loss in using spaces for
indentation in a theoretical way. There are also practical problems. In
practice, many languages allow string literals like myName="i love
you", and strings can easily contain a run of spaces. One cannot simply
run a blind find-and-replace operation to replace all spaces with tabs.
Also, many unix languages have a construct called
“heredoc” as a means to embed a literal block of text. For example,
here's a PHP heredoc:

$novelText = <<<arbitraryCharsHereAsDelimiter
         (__)
         (oo)
  /-------\/
 / |     ||
*  ||----||
   ~~    ~~
arbitraryCharsHereAsDelimiter;

Regardless of its design as a language construct, the purpose of
“heredoc” is to let programmers easily embed a text (a
large string) without worrying about the text containing sequences of
characters that may be meaningful to the language. If a language has
a heredoc construct, then it is basically impossible to convert from
spaces to tabs, as that will botch literal strings embedded in heredocs.
However, it is less of a problem to convert tabs to spaces, because
spaces appear in literal strings far more often than
literal tabs do.
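
A rough Python sketch of how a blind conversion damages literal text. The PHP-like snippet held in `source` and the helper `naive_spaces_to_tabs` are both made up for this example; the converter deliberately knows nothing about the language it is rewriting.

```python
source = (
    "if (x) {\n"
    "    $text = <<<END\n"
    "    (__)\n"
    "    (oo)\n"
    "END;\n"
    "}\n"
)


def naive_spaces_to_tabs(text: str, unit: int = 4) -> str:
    """Blindly turn every leading group of `unit` spaces into a tab, line by line."""
    converted = []
    for line in text.splitlines():
        stripped = line.lstrip(" ")
        width = len(line) - len(stripped)
        converted.append("\t" * (width // unit) + " " * (width % unit) + stripped)
    return "\n".join(converted) + "\n"


result = naive_spaces_to_tabs(source)
print(repr(result.splitlines()[2]))  # '\t(__)'
```

The third line is part of the heredoc body, so its four spaces were data, not indentation. After conversion the string the program sees starts with a tab instead. A converter that avoids this has to parse the language well enough to skip literals.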

Another practical issue is error recovery. Suppose one uses 4 spaces
per indentation level. It is not uncommon to see lines with an odd-sized
space prefix, such as 7 or 10 spaces, out of common sloppiness. Such errors
happen more often when spaces are used for indentation; the
essence is that tabs enforce a semantic association, so it is impossible
to make a half-indentation.
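
Such stray prefixes are at least easy to detect. The following Python sketch, with the hypothetical helper `odd_indents` and an assumed unit of 4 spaces, lists lines whose space prefix is not a whole number of levels.

```python
def odd_indents(source: str, unit: int = 4) -> list:
    """Return (line_number, width) for lines whose leading-space count
    is not a multiple of `unit`."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            continue
        width = len(line) - len(line.lstrip(" "))
        if width % unit:
            found.append((number, width))
    return found


sample = "\n".join([
    "a",
    " " * 4 + "b",
    " " * 7 + "c",
    " " * 10 + "d",
    " " * 8 + "e",
])
print(odd_indents(sample))  # [(3, 7), (4, 10)]
```

Detection is not repair, though: the scan cannot tell whether 7 spaces was meant as level 1, level 2, or a deliberately aligned continuation line.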

Q: Well, i just like spaces because they are most compatible.

A: Sure, crass simplicity is always more compatible. Suppose a unixer
says he doesn't like HTML because it is fraught with problems and
incompatibilities, and that he'd rather have plain text. Indeed, a lot of
unixers seriously think that.

---------------------------
PS in the answer to the first question, i gave the following examples
of IDEs/languages that actually embed formatting info in the source code:
Borland Delphi, Metrowerks's CodeWarrior, Microsoft Visual Studio,
Wolfram Research's Mathematica

actually, i know Mathematica does, but i'm not quite sure about the
other examples. So, my question is, does anyone know of a language or
IDE that actually allows the coder to manually highlight parts of the
code and have this highlight stick with the file upon reopening, as in a
word processor?

Xah
xa*@xahlee.org
∑ http://xahlee.org/

Xah Lee wrote:

Tabs versus Spaces in Source Code
This post is archived at:
http://xahlee.org/UnixResource_dir/w...vs_spaces.html

May 23 '06 #134

Mumia W.

Xah Lee wrote:

the following are 2 FAQ following this thread. Thanks.

Addendum: 2006-05-15

Q: What do you mean by embedding tab position info into the source code?
How's that gonna be done?

A: Tech geekers may not realize it, but such embedding of meta info does
exist in many technologies, by various means, because there is a need for
it. For example, Mac OS Classic's resource fork and Mac OS X's bundling system,
unix shell script's shebang (#!), emacs and Python's encoding
declaration “#-*- coding: utf-8 -*-”, Unicode's BOM, CVS's
change-log insertion, Mathematica's source code system the Notebook,
Microsoft Word's transparent meta data, as well as HTML and XML's
various declarations embedded in the file. Some of these systems are
good designs and some are hacks.

Vim's mode-lines do this too.

Somehow tech geekers have the sense that “source code” must be a
plain text file containing nothing else but the programming code. This
may be a defensible position, but as we can see in the above examples,
this idea is primitive and does not address the various needs. Had the
tech geekers thought these issues through, computing languages
and their source code might have developed into more powerful and flexible
integrated systems, like the standardized examples above.

The tech geekers have thought about it. Donald Knuth invented TeX, and
went on to invent the WEB literate programming system. You don't get any
geekier than that :)

For instance,
many commercial development systems actually already have such
meta-data embedded with the source code (e.g. Borland Delphi,
Metrowerks's CodeWarrior, Microsoft Visual Studio, Wolfram Research's
Mathematica). Some of these not only embed development-related info
such as debug points or linking files, but also allow programmers to
highlight code for visual purposes as in a word processor, or even
display it visually as typeset mathematics.

Q: Converting spaces to tabs is actually easy. I don't see how spaces
lose info.

A: Here is an illustration of how it is not possible to convert spaces
to tabs. Suppose you are writing in a language where the indentation is
part of the semantics, not just for appearance. Now, suppose you have
these two lines:

I'd say that such a language removes the choice of whether to use tabs
or spaces, and the discussion is over when you don't have a choice.

1234567890
  A
    B

Line A has a 2-space prefix and line B has a 4-space prefix.
Now, if you convert this to tabs, how do you know whether that's 1 and 2
tabs, or 2 and 4 tabs? In essence, there is no way to tell how many tabs n
spaces represent, where n is the smallest space prefix in the code, unless n
== 1.

vim: tabstop=4

The argument for spaces over tabs says that you have to include some
metadata in order for the document to look right on other people's
computers if you use tabs. This example, plus my example mode-line for
vim, reinforces that idea IMO.

The above demonstrates the information loss in using spaces for
indentation in a theoretical way. There are also practical problems. In
practice, many languages allow string literals like myName="i love
you", and strings can easily contain a run of spaces. One cannot simply
run a blind find-and-replace operation to replace all spaces with tabs.
Also, many unix languages have a construct called
“heredoc” as a means to embed a literal block of text. For example,
here's a PHP heredoc:

$novelText = <<<arbitraryCharsHereAsDelimiter
         (__)
         (oo)
  /-------\/
 / |     ||
*  ||----||
   ~~    ~~
arbitraryCharsHereAsDelimiter;

Yes, there are lots of situations like this where you can't just
willy-nilly convert between tabs and spaces. But even this case shows
that, if you use consistent tab widths, the text has a chance of
surviving. I converted your little doggie to and from text with tab
sizes of eight, and he survived. (I did it with tabs set to four too,
and it worked.)

Regardless of its design as a language construct, the purpose of
“heredoc” is to let programmers easily embed a text (a
large string) without worrying about the text containing sequences of
characters that may be meaningful to the language. If a language has
a heredoc construct, then it is basically impossible to convert from
spaces to tabs, as that will botch literal strings embedded in heredocs.

Yes it would. Upon printing, if the terminal tab width was set to eight,
but the text conversion was done with tabs at four, bye bye doggie.

However, it is less of a problem to convert tabs to spaces, because
spaces appear in literal strings far more often than
literal tabs do.

Another practical issue is error recovery. Suppose one uses 4 spaces
per indentation level. It is not uncommon to see lines with an odd-sized
space prefix, such as 7 or 10 spaces, out of common sloppiness. Such errors
happen more often when spaces are used for indentation; the
essence is that tabs enforce a semantic association, so it is impossible
to make a half-indentation.

What I've learned is that, if I'm going to use tabs for indentation, I
have to be consistent.

Q: Well, i just like spaces because they are most compatible.

A: Sure, crass simplicity is always more compatible. Suppose a unixer
says he doesn't like HTML because it is fraught with problems and
incompatibilities, and that he'd rather have plain text. Indeed, a lot of
unixers seriously think that.

---------------------------
PS in the answer to the first question, i gave the following examples
of IDEs/languages that actually embed formatting info in the source code:
Borland Delphi, Metrowerks's CodeWarrior, Microsoft Visual Studio,
Wolfram Research's Mathematica

Perl's POD and Java's javadoc do it too.

actually, i know Mathematica does, but i'm not quite sure about the
other examples. So, my question is, does anyone know of a language or
IDE that actually allows the coder to manually highlight parts of the
code and have this highlight stick with the file upon reopening, as in a
word processor?

Xah
xa*@xahlee.org
∑ http://xahlee.org/

Xah Lee wrote:
Tabs versus Spaces in Source Code
This post is archived at:
http://xahlee.org/UnixResource_dir/w...vs_spaces.html

I'm slowly moving into the "spaces" camp. After reading your earlier
post on tabs vs. spaces and other people's responses, I began thinking
about why I like tabs so much, and there is only one answer--backspace.

If I use tabs, when I backspace I go back to the previous tab position,
which is what I want. With spaces, I have to hit the backspace key
several times to get back. That's it--one feature is the only reason I
like tabs, so I decided to investigate vim's features to see if vim
would let me backspace to the previous tab position with one keystroke.

'Softtabstop' (sts) is the feature. I would have never thought to look
for this feature without your post. Thanks again Xah.

Your posts are on topic, informative, engaging and necessary. Keep them
coming Xah. :)

May 23 '06 #135
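
The round trip described in the post above (convert at one tab width, view at the same or a different width) can be sketched in Python as follows. `tabify` is a hypothetical helper that only touches leading spaces, and the three-line `art` list is a shortened stand-in for the heredoc drawing.

```python
def tabify(line: str, tab_size: int) -> str:
    """Turn leading spaces into tabs, keeping any remainder as spaces."""
    stripped = line.lstrip(" ")
    width = len(line) - len(stripped)
    return "\t" * (width // tab_size) + " " * (width % tab_size) + stripped


art = ["         (__)", "         (oo)", "  /-------\\/"]

for size in (4, 8):
    tabbed = [tabify(line, size) for line in art]
    # Expanding with the same tab size restores the original text.
    assert [line.expandtabs(size) for line in tabbed] == art

# Converted with tabs at four, displayed with tabs at eight.
tabbed_at_4 = [tabify(line, 4) for line in art]
print([line.expandtabs(8) for line in tabbed_at_4] == art)  # False
```

This matches both halves of the observation: the text survives when the widths agree, and is pushed out of shape when the viewer's tab width differs from the one used for conversion.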

Oliver Wong

"Jonathon McKitrick" <j_*********@bigfoot.com> wrote in message
news:11********************@j33g2000cwa.googlegroups.com...

Pascal Bourguignon wrote:
(defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
  (loop :for i :from בכוכ :to номер :by 단계 :collect i))

How do you even *enter* these characters? My browser seems to trap all
the special character combinations, and I *know* you don't mean
selecting from a character palette.

à¿¿ hey, this is weird...

Ã®

I've got something happening, but I can't tell what.

Yes, I'm an ignorant Western world ASCII user. :-)

What OS are you using? In Windows XP, you'd have to let XP know that
you're interested in input in languages other than English via "Control
Panel -> Regional Settings -> Languages -> Text Services and Input
Languages". There, you'd add input methods other than English. Each "input
method" works in its own way, so you'll just have to learn them.
For example, under English, you can use the "keyboard" input method, which
is probably what you're using now, or the "handwriting recognition" input
method, or the "speech recognition" input method to insert English text.
There are other input methods for the Asian languages (e.g. Chinese,
Japanese, etc.)

- Oliver

May 23 '06 #136