ISO Studies of underscores vs MixedCase in Ada or C++

Andy Glew

I am in search of any rigorous,
scientific, academic or industrial studies
comparing naming conventions in
C++ or similar languages such as
Ada:

Specifically, are names formed with
underscores more or less readable
than names formed with MixedCase
StudlyCaps camelCase?

...and similarly, any measurements
of programmer productivity, bug rate,
etc.; although IMHO readability matters
most.

* Religion - NOT?!

I understand that this is a religious issue
for many programmers, an issue of programming style.
I am not interested in a religious war.
I obviously have my own opinion, but I am
open to scientific evidence.

* Ada Studies?

I thought that I had seen studies like
this in some of the early design documents
for Ada, but I have not been able to find
such references on the web. Which is not
entirely surprising, since Ada was designed
prior to the web.

The Ada 83 and 95 Quality Guidelines recommend
underscores to improve readability, but provide
no source justifying this statement.

* What such studies might look like

Simple readability and recall:
- present a test subject with
a list of compound words
formed with underscores and mixed case
- remove the list, and ask test subject
to write it
- score on accuracy

Program debugging
- present programs that are otherwise identical,
differing only in their use of underscores/MixedCase
to test subject programmers (e.g. a CS class)
- program has a known bug
- ask test subjects to find bug
- score on accuracy locating bug

Cruel TA study:
- Two sections of a CS class
- Enforce programming standards,
underscores vs MixedCase
- Pose a programming problem
- Score according to success
completing assignment

Empirical:
- Given version control databases
of large programs, some written in underscore
style, others in MixedCase
- Total bug rates normalized by LOC, name count, etc.
- OR: count only bugs that can be attributed
(after inspection of checkins) to misnamed variables

For that matter, I would be interested in any surveys
folks may have done that count projects and their
coding standards, possibly weighted
- open source (e.g. sourceforge)
- industrial
- textbooks, weighted by sales
- websites of coding standards, weighted by Google score...
Although this is less convincing than a rigorous study.

* Explanation of Newsgroups Chosen

I hope it is obvious why I have chosen these
newsgroups to post this search to:

comp.software-eng, comp.programming,
- an issue of software engineering
comp.lang.c++,
- the language I am most interested in
comp.lang.ada
- because I vaguely recall historical work

Jul 19 '05 #1

Attila Feher

Andy Glew wrote:

I am in search of any rigourous,
scientific, academic or industrial studies
comparing naming conventions in
C++ or similar languages such as

[SNIP]

The underscore convention also works in case-insensitive languages.

The InnerCaps convention fails to solve the issue of all-caps words like
SMTPTCPIPConnection. The usual solution is to write them incorrectly, as
SmtpTcpIpConnection.
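
To make the acronym problem concrete, here is a minimal C++17 sketch of a word splitter for InnerCaps names. The function name split_inner_caps and its boundary rule are assumptions made up for illustration; they do not come from any particular tool or coding standard.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: split an InnerCaps identifier into its words.
// A new word is assumed to start at
//   - a capital that follows a non-capital ("aB"), or
//   - the last capital of a run when a lower-case letter follows ("ABc").
std::vector<std::string> split_inner_caps(std::string_view name) {
    std::vector<std::string> words;
    std::string current;
    for (std::size_t i = 0; i < name.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(name[i]);
        bool boundary = false;
        if (i > 0 && std::isupper(c)) {
            const unsigned char prev = static_cast<unsigned char>(name[i - 1]);
            const bool next_is_lower =
                i + 1 < name.size() &&
                std::islower(static_cast<unsigned char>(name[i + 1]));
            boundary = !std::isupper(prev) || next_is_lower;
        }
        if (boundary && !current.empty()) {
            words.push_back(current);
            current.clear();
        }
        current.push_back(name[i]);
    }
    if (!current.empty()) {
        words.push_back(current);
    }
    return words;
}
```

Under this rule SmtpTcpIpConnection splits into Smtp, Tcp, Ip and Connection, while SMTPTCPIPConnection yields only SMTPTCPIP and Connection: nothing in the spelling marks where one acronym ends and the next begins.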

The underscore convention tends to make lines longer, which can have a bad
effect on readability.

IMO it is a personal preference issue, and also an issue of what fonts and
development environment are in use.

IMO if one has to select *one* convention for a whole company using many
languages then only the underscore one stands. With InnerCaps there is a
possibility to create hard-to-find name collisions, especially in languages
where the type of variables can change at runtime by a simple assignment.

--
Attila aka WW

Jul 19 '05 #2

Jakob Bieling

"Andy Glew" <an*******@amd.com> wrote in message
news:2c**************************@posting.google.com...

[snip]

Specifically, are names formed with
underscores more or less readable
than names formed with MixedCase
StudlyCaps camelCase?
Write a large text (several lines) with mixed-case and the same again
with underscores. Then give it to people to read and ask them what they find
easier to read. I would not be surprised if the majority favours the text
with underscores.

[snip]
The Ada 83 and 95 Quality Guidelines recommend
underscores to improve readability, but provide
no source justifying this statement.

The underscore can easily be viewed as a space which separates the words,
whereas mixed-case does not provide a separation like that, but rather a
'large' here-comes-a-new-word-mark (i.e. the capital letter). The problem I
see with this: non-capital letters can be 'large' as well. Just have a look
at the 't', 'h' etc., which, IMO, does not make reading a mixed-case text
easier.

Personally, I prefer underscore for the reason above.

Just my .02c
--
jb

(replace y with x if you want to reply by e-mail)

Jul 19 '05 #3

Matt Gregory

Jakob Bieling wrote:

The underscore can easily be viewed as a space which separates the words,
whereas mixed-case does not provide a separation like that, but rather a
'large' here-comes-a-new-word-mark (i.e. the capital letter). The problem I
see with this: non-capital letters can be 'large' as well. Just have a look
at the 't', 'h' etc., which, IMO, does not make reading a mixed-case text
easier.

I think we just need a programming font that has half-sized underscores
in front of all the capital letters. That would solve all these problems.
I personally don't like typing underscores, but I agree they are more
readable. Emacs does have a view-camel-cased-identifiers-as-underscored
mode, so that's a step in the right direction.

Jul 19 '05 #4

Ludovic Brenta

Personally I prefer underscores, too, and for that reason I really
like Emacs' glasses-mode. So, use whatever you want, *I* will always
see underscores :)

--
Ludovic Brenta.

Jul 19 '05 #5

Steve

I think a more relevant test would be to give two versions of the same code,
one with underscores, one with mixed casing, to different groups of
programmers to analyze. Include a quiz asking questions about the code.
See which version results in more correct answers, and which version
achieves the answers more quickly.

Steve
(The Duck)

"Jakob Bieling" <ne*****@gmy.net> wrote in message
news:bl*************@news.t-online.com...
[snip]

Write a large text (several lines) with mixed-case and the same again
with underscores. Then give it to people to read and ask them what they find
easier to read. I would not be surprised if the majority favours the text
with underscores.

Jul 19 '05 #6

Frank J. Lhota

Underscores are basically a way to provide spaces in an identifier. Since
identifiers are generally phrases (noun phrases for objects, verb phrases
for procedures) and phrases often consist of more than one word, I find the
use of underscores to be quite natural.

The opposing argument is that underscores are too large, and that a case
change is a more readable way to indicate how an identifier divides
into words. To me, the upper / lower case method of delineating the words in
an identifier has always looked like the transcript of a very fast talker.
Yes, you can make out the words, but just barely. Moreover, the use of
letter case to delineate words prohibits any other use of letter case. It
rules out using all caps for a certain category of identifiers, for example.

There is an easy way to test which convention is more readable. Here is one
of Shakespeare's sonnets rendered in the mixed case format:

FromFairestCreaturesWeDesireIncrease,
ThatTherebyBeautysRoseMightNeverDie,
ButAsTheRiperShouldByTimeDecease,
HisTenderHeirMightBearHisMemory:
ButThouContractedToThineOwnBrightEyes,
FeedstThyLightsFlameWithSelfSubstantialFuel,
MakingAFamineWhereAbundanceLies,
ThySelfThyFoeToThySweetSelfTooCruel:
ThouThatArtNowTheWorldsFreshOrnament,
AndOnlyHeraldToTheGaudySpring,
WithinThineOwnBudBuriestThyContent,
AndTenderChurlMakstWasteInNiggarding:
PityTheWorldOrElseThisGluttonBe,
ToEatTheWorldsDueByTheGraveAndThee

It may be a matter of taste, but I certainly found the original sonnet to be
more readable and more beautiful.

Jul 19 '05 #7

Randy King

<snip> op <snip>

This is a somewhat off-topic post, but the OP did ask the question about
readability.

Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer
inwaht orredr the ltteers in a wrod are, the olny iprmoetnt tihng is
taht the frist and lsat ltteer be at the rghit pclae. The rset can be a
total mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae
the huamn mnid deos not raed ervey lteter by istlef, butthe wrod as a
wlohe. Aolbsulty amzanig huh?

Jul 19 '05 #8

Hyman Rosen

Randy King wrote:

Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer
inwaht orredr the ltteers in a wrod are, the olny iprmoetnt tihng is
taht the frist and lsat ltteer be at the rghit pclae. The rset can be a
total mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae
the huamn mnid deos not raed ervey lteter by istlef, butthe wrod as a
wlohe. Aolbsulty amzanig huh?

"Anidroccg to crad cniyrrag lcitsiugnis planoissefors at an uemannd,
utisreviny in Bsitirh Cibmuloa, and crartnoy to the duoibus cmials
of the ueticnd rcraeseh, a slpmie, macinahcel ioisrevnn of ianretnl
cretcarahs araepps sneiciffut to csufnoe the eadyrevy oekoolnr."

Jul 19 '05 #9

Matt Gregory

I wrote:

I think we just need a programming font that has half-sized underscores
in front of all the capital letters. That would solve all these problems.

Nevermind, that was a terrible idea. It was almost good though.

Jul 19 '05 #10

Jack Klein

On 25 Sep 2003 21:32:40 -0700, an*******@amd.com (Andy Glew) wrote in
comp.lang.c++:

I am in search of any rigourous,
scientific, academic or industrial studies
comparing naming conventions in
C++ or similar languages such as
Ada:

Specifically, are names formed with
underscores more or less readable
than names formed with MixedCase
StudlyCaps camelCase?

My team is currently working under this guideline as a compromise:

Function names must be CamelMode, but optionally underscores are
allowed, e.g. Camel_Mode.

...or should I say "compromised" guidelines?

Interestingly I see a lot of programmers who prefer CamelMode for
function names, yet prefer under_scores in variable names. In every
single case where I have checked, the programmer has done at least
some coding for Windows and its Pascal, BASIC, etc., API. And in
every single case they claim that is not where their style came from.
Go figure.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq

Jul 19 '05 #11

Programmer Dude

Jack Klein wrote:

Interestingly I see a lot of programmers who prefer CamelMode for
function names, yet prefer under_scores in variable names. In every
single case where I have checked, the programmer has done at least
some coding for Windows and its Pascal, BASIC, etc., API. And in
every single case they claim that is not where their style came from.

I've tried just about every combination over the years. At one
point it was underscores in function names, not in data names.
OOP added enough other basic types of things it got hard to have
a style for each. Currently, I use lower_case_with_underscores
for local names and CamelCaseMode for functions/methods and
for global data.

I'm considering switching to Mixed_Case_With_Underscores for
global data. In fact, with the fairly recent addition of
several new languages to my tool kit, it's probably time to
once again re-think my whole naming convention thing.

--
|_ CJSonnack <Ch***@Sonnack.com> _____________| How's my programming? |
|_ http://www.Sonnack.com/ ___________________| Call: 1-800-DEV-NULL |
|_____________________________________________|_______________________|

Jul 19 '05 #12

Mike Smith

Hyman Rosen wrote:

Randy King wrote:
Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer
inwaht orredr the ltteers in a wrod are, the olny iprmoetnt tihng is
taht the frist and lsat ltteer be at the rghit pclae. The rset can be a
total mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae
the huamn mnid deos not raed ervey lteter by istlef, butthe wrod as a
wlohe. Aolbsulty amzanig huh?

"Anidroccg to crad cniyrrag lcitsiugnis planoissefors at an uemannd,
utisreviny in Bsitirh Cibmuloa, and crartnoy to the duoibus cmials
of the ueticnd rcraeseh, a slpmie, macinahcel ioisrevnn of ianretnl
cretcarahs araepps sneiciffut to csufnoe the eadyrevy oekoolnr."

Yes, it's possible to take it *too* far. But I *was* able to read the
quoted text at maybe half the speed at which I could have read it if it
were spelled correctly. And the text in Randy King's post is even more
readable than that - I can read it at almost full speed.

--
Mike Smith

Jul 19 '05 #13

tmoran

> > I think we just need a programming font that has half-sized underscores
If you want to get into fonts etc, look at "Human Factors and Typography
for More Readable Programs", (c) 1990 ACM Press, ISBN 0-201-10745-7
(It doesn't appear to address naming questions, however.)

Jul 19 '05 #14

Michael Feathers

"Matt Gregory" <bl****************@earthlink.net> wrote in message
news:Ar*****************@newsread2.news.atl.earthlink.net...

I wrote:
I think we just need a programming font that has half-sized underscores
in front of all the capital letters. That would solve all these
problems.
Nevermind, that was a terrible idea. It was almost good though.

Let's see, what if an IDE had a toggle which converted identifier names back
and forth on demand, flagging any clashes. ;-)
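
For what it is worth, the "flagging any clashes" half of such a toggle is easy to sketch. The C++17 fragment below is a hedged illustration only: to_underscores and find_clashes are hypothetical names, the word-boundary rule is an assumption, and no real IDE is claimed to work this way. It converts mixed-case names to lower-case underscore form and reports every converted name that more than one distinct identifier maps to.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical conversion: mixed case -> lower_case_with_underscores.
// An underscore is inserted before a capital that follows a lower-case
// letter or digit, or that ends a run of capitals ("HTTPServer").
std::string to_underscores(std::string_view name) {
    std::string out;
    for (std::size_t i = 0; i < name.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(name[i]);
        if (i > 0 && std::isupper(c)) {
            const unsigned char prev = static_cast<unsigned char>(name[i - 1]);
            const bool next_is_lower =
                i + 1 < name.size() &&
                std::islower(static_cast<unsigned char>(name[i + 1]));
            if (std::islower(prev) || std::isdigit(prev) ||
                (std::isupper(prev) && next_is_lower)) {
                out.push_back('_');
            }
        }
        out.push_back(static_cast<char>(std::tolower(c)));
    }
    return out;
}

// Group identifiers by their converted form and keep only the groups in
// which two or more distinct originals collide.
std::map<std::string, std::set<std::string>> find_clashes(
    const std::vector<std::string>& identifiers) {
    std::map<std::string, std::set<std::string>> groups;
    for (const auto& id : identifiers) {
        groups[to_underscores(id)].insert(id);
    }
    for (auto it = groups.begin(); it != groups.end();) {
        if (it->second.size() < 2) {
            it = groups.erase(it);
        } else {
            ++it;
        }
    }
    return groups;
}
```

Given fooBar, foo_bar, parseURL, parseUrl and count, this reports two clashes: foo_bar (from fooBar and foo_bar) and parse_url (from parseURL and parseUrl). The reverse direction is lossy, since foo_bar cannot tell you whether the original was fooBar or FooBar, so a real toggle would need to remember the original spelling.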

Jul 19 '05 #15

Hyman Rosen

Mike Smith wrote:

Yes, it's possible to take it *too* far. But I *was* able to read the
quoted text at maybe half the speed at which I could have read it if it
were spelled correctly. And the text in Randy King's post is even more
readable than that - I can read it at almost full speed.

Which clearly means that the first/last letter thing isn't the
only factor in comprehension.

Jul 19 '05 #16

Default User

Mike Smith wrote:

Hyman Rosen wrote:
Randy King wrote:
Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer
inwaht orredr the ltteers in a wrod are, the olny iprmoetnt tihng is
taht the frist and lsat ltteer be at the rghit pclae. The rset can be a
total mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae
the huamn mnid deos not raed ervey lteter by istlef, butthe wrod as a
wlohe. Aolbsulty amzanig huh?

"Anidroccg to crad cniyrrag lcitsiugnis planoissefors at an uemannd,
utisreviny in Bsitirh Cibmuloa, and crartnoy to the duoibus cmials
of the ueticnd rcraeseh, a slpmie, macinahcel ioisrevnn of ianretnl
cretcarahs araepps sneiciffut to csufnoe the eadyrevy oekoolnr."

Yes, it's possible to take it *too* far. But I *was* able to read the
quoted text at maybe half the speed at which I could have read it if it
were spelled correctly. And the text in Randy King's post is even more
readable than that - I can read it at almost full speed.

That's because it's not well scrambled at all. Examine the larger words,
they almost all have large unchanged or barely changed segments. Most of
the time double letter combos are kept together, very little reversal of
segments. I think the given example (I've received it many times) does
not provide much evidence for the contention at all.

Brian Rodenborn

Jul 19 '05 #17

Default User

Jack Klein wrote:

Function names must be CamelMode, but optionally underscores are
allowed, e.g. Camel_Mode.

We are allowed underscores when acronyms appear in the name.

InitiateFMS_Executive();

Brian Rodenborn

Jul 19 '05 #18

Arthur J. O'Dwyer

On Fri, 26 Sep 2003, Default User wrote:

Mike Smith wrote:
Hyman Rosen wrote:

"Anidroccg to crad cniyrrag lcitsiugnis planoissefors at an uemannd,
utisreviny in Bsitirh Cibmuloa, and crartnoy to the duoibus cmials
of the ueticnd rcraeseh, a slpmie, macinahcel ioisrevnn of ianretnl
cretcarahs araepps sneiciffut to csufnoe the eadyrevy oekoolnr."

Yes, it's possible to take it *too* far. But I *was* able to read the
quoted text at maybe half the speed at which I could have read it if it
were spelled correctly. And the text in Randy King's post is even more
readable than that - I can read it at almost full speed.

That's because it's not well scrambled at all. Examine the larger words,
they almost all have large unchanged or barely changed segments. Most of
the time double letter combos are kept together, very little reversal of
segments. I think the given example (I've received it many times) does
not provide much evidence for the contention at all.

On the other hand, the thing which turned out to be confusing me the
most in Hyman's scrambled text was the typo (the comma after "unnamed").
Once I learned to ignore that, and take the rest of the grammar with a
grain of salt (the phrase including the word "uncited" also gave me
problems), it was fairly straight sailing.
At least, it was straight sailing until about half-way through, at
which point my brain kicked in and I rezilaed waht mohted was bnieg
uesd to otacsufbe the iaudividnl wdros -- at taht pniot I jsut setratd
rnidaeg tehm bdrawkcas.
Perhaps an interesting experiment would be to compare the relative
effects of ioisrevnn, aaabehiilopttzn, roandm sirnlcmabg, and radonm
dpraigh scamrbnlig. But that's not really topical here, (wherever
"here" is).

-Arthur

Jul 19 '05 #19

Dennis Lee Bieber

Matt Gregory fed this fish to the penguins on Friday 26 September 2003
12:11 am:

I think we just need a programming font that has half-sized
underscores
in front of all the capital letters. That would solve all these
problems. I personally don't like typing underscores, but I agree they
are more
readable. Emacs does have a
view-camel-cased-identifiers-as-underscored mode, so that's a step in
the right direction.

Well, we could all revert to a language with a parser like classical
FORTRAN -- where whitespace in identifiers was ignored...
-- ================================================== ============ <
wl*****@ix.netcom.com | Wulfraed Dennis Lee Bieber KD6MOG <
wu******@dm.net | Bestiaria Support Staff <
================================================== ============ <
Bestiaria Home Page: http://www.beastie.dm.net/ <
Home Page: http://www.dm.net/~wulfraed/ <

Jul 19 '05 #20

Mad Hamish

On Fri, 26 Sep 2003 15:40:00 GMT, "Frank J. Lhota"
<NO******************@verizon.net> wrote:

Underscores are basically a way to provide spaces in an identifier. Since
identifiers are generally phrases (noun phrases for objects, verb phrases
for procedures) and phrases often consist of more than one word, I find the
use of underscores to be quite natural.

The opposing argument is that underscores are too large, and that a case
change is a more readable way to indicate how to divide the decomposition
into words. To me, the upper / lower case method of delineating the words in
an identifier has always looked like the transcript of a very fast talker.
Yes, you can make out the words, but just barely. Moreover, the use of
letter case to delineate words prohibits any other use of letter case. It
rules out using all caps for a certain category of identifiers, for example.

There is an easy way to test which convention is more readable. Here is one
of Shakespeare's sonnets rendered in the mixed case format:

FromFairestCreaturesWeDesireIncrease,
ThatTherebyBeautysRoseMightNeverDie,
ButAsTheRiperShouldByTimeDecease,
HisTenderHeirMightBearHisMemory:
ButThouContractedToThineOwnBrightEyes,
FeedstThyLightsFlameWithSelfSubstantialFuel,
MakingAFamineWhereAbundanceLies,
ThySelfThyFoeToThySweetSelfTooCruel:
ThouThatArtNowTheWorldsFreshOrnament,
AndOnlyHeraldToTheGaudySpring,
WithinThineOwnBudBuriestThyContent,
AndTenderChurlMakstWasteInNiggarding:
PityTheWorldOrElseThisGluttonBe,
ToEatTheWorldsDueByTheGraveAndThee

It may be a matter of taste, but I certainly found the original sonnet to be
more readable and more beautiful.

But produces more compilation errors.
Hence the mixed case format must be better for programming.
--
"Hope is replaced by fear and dreams by survival, most of us get by."
Stuart Adamson 1958-2001

Mad Hamish
Hamish Laws
h_****@aardvark.net.au

Jul 19 '05 #21

Gerry Quinn

In article <3F***************@Sonnack.com>, Programmer Dude <Ch***@Sonnack.com> wrote:

Jack Klein wrote:
Interestingly I see a lot of programmers who prefer CamelMode for
function names, yet prefer under_scores in variable names. In every
single case where I have checked, the programmer has done at least
some coding for Windows and its Pascal, BASIC, etc., API. And in
every single case they claim that is not where their style came from.

I've tried just about every combination over the years. At one
point it was underscores in function names, not in data names.
OOP added enough other basic types of things it got hard to have
a style for each. Currently, I use lower_case_with_underscores
for local names and CamelCaseMode for functions/methods and
for global data.

I use:

ClassName // need not start with C
FunctionName()
m_MemberVariable // misc. variable
m_pPointerVariable // common typed variable
localVariable
pLocalPointer
SOME_CONSTANT

I guess I could use underscore more if I wanted. Don't like typing it
much, though.

I think what I dislike about underscores is related to what some people
like about them: they look like spaces. That interferes with my ability
to break up a statement into individual identifiers.

When someone posts code with lots of underscores, I find it hard to
read.

Gerry Quinn
--
http://bindweed.com
Kaleidoscopic Screensavers and Games for Windows
Download free trial versions
New screensaver: "Hypercurve"

Jul 19 '05 #22

Richard Heathfield

[Uncomfortable with crosspost, but not sure which groups to trim]

Jack Klein wrote:

Interestingly I see a lot of programmers who prefer CamelMode for
function names, yet prefer under_scores in variable names. In every
single case where I have checked, the programmer has done at least
some coding for Windows and its Pascal, BASIC, etc., API. And in
every single case they claim that is not where their style came from.
Go figure.

Add another one to your tally. I have written a fair few Windows programs.
But /before/ that, I had already invented MixedCase for myself. I was quite
pleased, actually, to discover that the Windows API people had copied my
style. :-)

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Jul 19 '05 #23

Robert I. Eachus

Richard Heathfield wrote:

[Uncomfortable with crosspost, but not sure which groups to trim]

I trimmed comp.lang.ada. I don't know about the other languages, but in
Ada names like Ada.Text_IO.Integer_IO are a part of the standard.
Mixing that with any other style looks uglier than any unmixed style.
So it is sort of forced on anyone who cares, and most are happy with it.

--
Robert I. Eachus

"Quality is the Buddha. Quality is scientific reality. Quality is the
goal of Art. It remains to work these concepts into a practical,
down-to-earth context, and for this there is nothing more practical or
down-to-earth than what I have been talking about all along...the repair
of an old motorcycle." -- from Zen and the Art of Motorcycle
Maintenance by Robert Pirsig

Jul 19 '05 #24

Ian Woods

Richard Heathfield <do******@address.co.uk.invalid> wrote in
news:bl**********@hercules.btinternet.com:

[Uncomfortable with crosspost, but not sure which groups to trim]

Jack Klein wrote:

Interestingly I see a lot of programmers who prefer CamelMode for
function names, yet prefer under_scores in variable names. In every
single case where I have checked, the programmer has done at least
some coding for Windows and its Pascal, BASIC, etc., API. And in
every single case they claim that is not where their style came from.
Go figure.

Add another one to your tally. I have written a fair few Windows
programs. But /before/ that, I had already invented MixedCase for
myself. I was quite pleased, actually, to discover that the Windows
API people had copied my style. :-)

Indeed! It's not exactly a huge leap of imagination to go from

somename

to realise that

someName

or

SomeName

is generally easier to spot.

I'm just wondering when someone will pull out a patent on such an obvious
thing.

Ian Woods

Jul 19 '05 #25

Martin Dowie

"Mad Hamish" <h_****@aardvark.net.au> wrote in message
news:l3********************************@4ax.com...

It may be a matter of taste, but I certainly found the original sonnet to be more readable and more beautiful.
But produces more compilation errors.
Hence the mixed case format must be better for programming.

Are you arguing that more compilation errors are a good thing or a bad
thing?...

"Hope is replaced by fear and dreams by survival, most of us get by."
Stuart Adamson 1958-2001

"Nice quote" says Dunfermline resident.

Jul 19 '05 #26

James Dow Allen

> On 25 Sep 2003 21:32:40 -0700, an*******@amd.com (Andy Glew) wrote in

comp.lang.c++:

Specifically, are names formed with
underscores more or less readable
than names formed with MixedCase
StudlyCaps camelCase?

In the discussion I haven't yet seen the *correct* answer. :-)

CamelMode, camel_mode, etc. are all quite *readable*; when using long
names the important thing is to make them *writable*, i.e.
easy to remember.

Consistency is therefore the important thing. If you abbreviate words,
abbreviate them as the first 4 (or whatever) letters, consistently.

(I usually rewind a file with "lseek(fd, 0L, 0)" because I can't
remember if 0 is SEEKSET or SEEK_SET.)

James

Jul 19 '05 #27

Georg Bauhaus

>>>>> "Frank" == Frank J Lhota <NO******************@verizon.net> writes:

: Since identifiers are generally phrases (noun phrases
: for objects, verb phrases for procedures) and phrases often consist
: of more than one word, I find the use of underscores to be quite
: natural.

But we should, I think, consider non-phrases or almost-non-phrases
being used as identifiers, and "juxtapositions" of identifiers. The
isolated identifiers might be shorter and thus more easily broken
into parts during the "reading process".

theFools(42);

the_fools (42);

the_Fools(42);

The_Fools (42);

...

y := doYouMind.ifI();

y := do_you_mind.if_i ();

y := do_You_Mind.if_I();

y := Do_You_Mind.If_I ();

takeAction(doYouMind.ifI(openTheWindow));

take_action (do_you_mind.if_i (open_the_window));

take_Action (do_You_Mind.if_I(open_The_Window));

Take_Action (Do_You_Mind.If_I (Open_The_Window));

So in context, your "Shakespearean" argument might still apply,
even if short identifiers are readable in dense mixed case?

: There is an easy way to test which convention is more readable. Here
: is one of Shakespeare's sonnets rendered in the mixed case format:

: FromFairestCreaturesWeDesireIncrease,

Also, looking closely at letters, fonts certainly do matter.
In a string such as "glubf()" it might or might not be easy
to distinguish the two characters 'f' and '('. It depends on
how ink would be spread, or on how pixels would appear on some
display screen. You can see this comparing foo(a) and oof(a),
using different fonts for the letters and symbols.

Georg

Jul 19 '05 #28

Mike Bandor

I was once told by a TRW employee that on one particular project they had a
coding standard that used underscores in lieu of running the names together.
One of their "measures" of readability was to take a copy of the code, remove
the underscores, and run it through a spell checker. If it made it through
the spell checker, it was deemed "readable".
--
Mike Bandor, Software Engineer, BS-CS/SE
Ada83, Ada95, C++, Delphi, JavaScript, WinHelp, PL/SQL, SQL, JOVIAL, MASM,
Java, HTML
Creator of MEGATERMS, Military Terms & Acronyms
http://home.satx.rr.com/bandor/megaterm/megaterm.htm

"Georg Bauhaus" <ge***@strudel.futureapps.de> wrote in message
news:86************@strudel.futureapps.de...

>> "Frank" == Frank J Lhota <NO******************@verizon.net> writes:

: Since identifiers are generally phrases (noun phrases
: for objects, verb phrases for procedures) and phrases often consist
: of more than one word, I find the use of underscores to be quite
: natural.

But we should, I think, consider non-phrases or almost-non-phrases
being used as identifiers, and "juxtapositions" of identifiers. The
isolated identifiers might be shorter and thus more easily broken
into parts during the "reading process".

theFools(42);

the_fools (42);

the_Fools(42);

The_Fools (42);

...

y := doYouMind.ifI();

y := do_you_mind.if_i ();

y := do_You_Mind.if_I();

y := Do_You_Mind.If_I ();

takeAction(doYouMind.ifI(openTheWindow));

take_action (do_you_mind.if_i (open_the_window));

take_Action (do_You_Mind.if_I(open_The_Window));

Take_Action (Do_You_Mind.If_I (Open_The_Window));

So in context, your "Shakespearean" argument might still apply,
even if short identifiers are readable in dense mixed case?

: There is an easy way to test which convention is more readable. Here
: is one of Shakespeare's sonnets rendered in the mixed case format:

: FromFairestCreaturesWeDesireIncrease,

Also, looking closely at letters, fonts certainly do matter.
In a string such as "glubf()" it might or might not be easy
to distinguish the two characters 'f' and '('. It depends on
how ink would be spread, or on how pixels would appear on some
display screen. You can see this comparing foo(a) and oof(a),
using different fonts for the letters and symbols.

Georg

Jul 19 '05 #29

Peter Ammon

Andy Glew wrote:

I am in search of any rigourous,
scientific, academic or industrial studies
comparing naming conventions in
C++ or similar languages such as
Ada:

Specifically, are names formed with
underscores more or less readable
than names formed with MixedCase
StudlyCaps camelCase?

[...]

Since camelCase and MixedCase seem to be getting routed by underscore
proponents, here's one example of where something in mixed case is
significantly more readable. It's an excerpt from a bison grammar file
I'm working on.

classmethod :
access_specifier method_type_specifier method_return_type_specifier
method_declaration method_body

In the body, I reference things like $4, which (for those who don't
know) refers to the fourth symbol in that space delimited list above.
Can you quickly count which is the fourth? I can't, since spaces look
similar to underscores.

Compare to

classmethod :
accessSpecifier methodTypeSpecifier methodReturnTypeSpecifier
methodDeclaration methodBody

The second is much more readable IMO. The effect is even more dramatic
without Usenet's line wrapping.

-Peter

Jul 19 '05 #30

Programmer Dude

Peter Ammon wrote:

classmethod :
access_specifier method_type_specifier method_return_type_specifier
method_declaration method_body

Can you quickly count which is the fourth?
Compare to

classmethod :
accessSpecifier methodTypeSpecifier methodReturnTypeSpecifier
methodDeclaration methodBody

Compare to

classmethod :
access_specifier
method_type_specifier
method_return_type_specifier
method_declaration
method_body

Or my preference if the tool allows

classmethod :
access-specifier
method-type-specifier
method-return-type-specifier
method-declaration
method-body

(In proportional fonts, hyphens are usually skinnier than
underscores and (to my eye) make the text more readable.
It's not as noticeable with monospace fonts, but I think the
lower example looks better (read: more readable :-).)

--
|_ CJSonnack <Ch***@Sonnack.com> _____________| How's my programming? |
|_ http://www.Sonnack.com/ ___________________| Call: 1-800-DEV-NULL |
|_____________________________________________|_______________________|

Jul 19 '05 #31

Leif Roar Moldskred

"Mike Bandor" <mb*****@satx.rr.com> writes:

I was once told by a TRW employee that on one particular project they had a
coding standard that used underscores in lieu of running the names together.
One of their "measures" of readability was to take a copy of the code, remove
the underscores, and run it through a spell checker. If it made it through
the spell checker, it was deemed "readable".

This touches on one of my pet annoyances with development tools today:
no way to easily spell-check your code. In my opinion, a development
environment should at the very _least_ let you easily spell-check all
the text in comments, and preferably the individual words in variable
and function names (whether the words are separated by mixed case,
hyphens or underscores.)

Unfortunately, nobody else seems to mind. *sighs* Oh well,
spell-checkers are overrated anyway.

--
Leif Roar Moldskred

Jul 19 '05 #32

William

"Leif Roar Moldskred" <rm******@online.no> wrote in message
news:86***************@huldreheim.huldreskog.no...

This touches on one of my pet annoyances with development tools today:
no way to easily spell-check your code. In my opinion, a development
environment should at the very _least_ let you easily spell-check all
the text in comments, and preferably the individual words in variable
and function names (whether the words are separated by mixed case,
hyphens or underscores.)

I've used a few things that did have spell checking. (One had a spell
check button on certain text fields in its forms, kinda neat.) My favorite
text and source editor, UltraEdit, has a pretty good spell checker and it
can be expanded to handle reserved words. I don't think it handles
mixed case (or case at all) though. I've never used it except to check
comments or display text. -Wm

Jul 19 '05 #33

William

"William" <Re***@NewsGroup.Please> wrote in message
news:s9********************@giganews.com...

I've used a few things that did have spell checking. (One had a spell
check button on certain text fields in its forms, kinda neat.) My favorite
text and source editor, UltraEdit, has a pretty good spell checker and it
can be expanded to handle reserved words. I don't think it handles
mixed case (or case at all) though. I've never used it except to check
comments or display text. -Wm

Talking to myself here... It occurred to me that its syntax highlighting
makes spell checking reserved words less necessary - and the syntax
highlighting can deal with case. -Wm

Jul 19 '05 #34

Leif Roar Moldskred

"William" <Re***@NewsGroup.Please> writes:

Talking to myself here... It occurred to me that its syntax highlighting
makes spell checking reserved words less necessary - and the syntax
highlighting can deal with case. -Wm

What I want though, is a spell-checker that, for instance for java,
will spot the errors such as this

// Number of misspelled words fuond so far
int noErorsInTetx = 0;

I want to spell-check this such that I get notified both on "fuond"
for "found", "Erors" for "Errors" and "Tetx" for "Text". They are all,
after all, words in natural language, and it should be possible to
spell-check them automatically.

--
Leif Roar Moldskred
demanding developer

Jul 19 '05 #35
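
The kind of checker asked for above could start out roughly like this. It is a minimal sketch in Python, not an existing tool: `KNOWN_WORDS` is a tiny stand-in for a real dictionary, `RESERVED` is an assumed keyword list, and the function names are invented for the example. Identifiers are split on underscores, hyphens and case boundaries, and each piece is looked up as an ordinary word.

```python
import re

# Tiny stand-in word list; a real checker would load a full dictionary.
KNOWN_WORDS = {"number", "of", "misspelled", "words", "found", "so", "far",
               "no", "errors", "in", "text"}
RESERVED = {"int"}  # language keywords are left to the compiler

def split_identifier(name):
    """Split an identifier on underscores, hyphens and case boundaries."""
    words = []
    for part in re.split(r"[_\-]+", name):
        # Runs of capitals (acronyms), then capitalised or lower-case words, then digits.
        words += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part)
    return words

def misspelled(text):
    """Return every word from comments and identifiers that is not in KNOWN_WORDS."""
    unknown = []
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_\-]*", text):
        if token in RESERVED:
            continue
        for word in split_identifier(token):
            if word.lower() not in KNOWN_WORDS:
                unknown.append(word)
    return unknown

source = """// Number of misspelled words fuond so far
int noErorsInTetx = 0;"""
print(misspelled(source))  # ['fuond', 'Erors', 'Tetx']
```

The same splitter handles `no_erors_in_tetx` and `no-erors-in-tetx`, so the naming convention does not matter to the checker.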

Jim Rogers

"William" <Re***@NewsGroup.Please> wrote in message news:<ud********************@giganews.com>...

Talking to myself here... It occurred to me that its syntax highlighting
makes spell checking reserved words less necessary - and the syntax
highlighting can deal with case. -Wm

Even more to the point -- any compiler should be
able to properly recognize reserved words.
Why use another tool to check what the compiler will also check?

Jim Rogers

Jul 19 '05 #36

Kevin Morenski

// Number of misspelled words fuond so far
int noErorsInTetx = 0;

I want to spell-check this such that I get notified both on "fuond"
for "found", "Erors" for "Errors" and "Tetx" for "Text". They are all,
after all, words in natural language, and it should be possible to
spell-check them automatically.

Let's say you had a variable named "tHTa," for example. With respect to
your concept, this would be a misspelling of the word "that." Now, a lot of
programmers--myself included--use letters to represent certain things in
variable names. tHTa could mean "type HTa" or anything else a programmer
could think of. How could a program possibly differentiate between
conventions in the naming of variables?

It's much simpler to check the spelling of comments...programmers have
developed so many conventions for making their lives easier; a spell checker
on variable names just adds one more problem to overcome.

kevin

Jul 19 '05 #37

Mike Wahler

"Leif Roar Moldskred" <rm******@online.no> wrote in message
news:86************@huldreheim.huldreskog.no...

"William" <Re***@NewsGroup.Please> writes:
Talking to myself here... It occurred to me that its syntax highlighting
makes spell checking reserved words less necessary - and the syntax
highlighting can deal with case. -Wm

What I want though, is a spell-checker that, for instance for java,
will spot the errors such as this

// Number of misspelled words fuond so far
int noErorsInTetx = 0;

I want to spell-check this such that I get notified both on "fuond"
for "found",

Agree so far.

"Erors" for "Errors" and "Tetx" for "Text". They are all,
after all, words in natural language, and it should be possible to
spell-check them automatically.

But Java (or C++ or whatever) is *not* a "natural language",
afaik they all allow any spelling whatever (of course with
some necessary exceptions and limitations) of identifiers.

I often create "abbreviated" identifiers which
save typing while still retaining enough meaning,
e.g.

struct Emp
{
string FName;
string LName;
/* etc */
};

I wouldn't want a spell checker to flag those
identifiers, and I certainly don't want to be
bothered with needing to always add such invented
words to a checker's dictionary.

-Mike

Jul 19 '05 #38

Peter Ammon

Programmer Dude wrote:

Peter Ammon wrote:

classmethod :
access_specifier method_type_specifier method_return_type_specifier
method_declaration method_body

Can you quickly count which is the fourth?
Compare to

classmethod :
accessSpecifier methodTypeSpecifier methodReturnTypeSpecifier
methodDeclaration methodBody

Compare to

classmethod :
access_specifier
method_type_specifier
method_return_type_specifier
method_declaration
method_body

You've piqued my interest, since I'm the first to admit that my grammar
specifications are hard to read.

Where do you put the action in the above code?

classmethod :
access_specifier
method_type_specifier
method_return_type_specifier
method_declaration
method_body
{ doSomething(); }

What if there's more than one reduction possible?

classmethod :
access_specifier
method_type_specifier
method_return_type_specifier
method_declaration
method_body
{ doSomething(); }
| something_else
another_thing
even_more
blah_blah
{ doSomethingElse(); }

This looks like it's getting hard to read.

Or my preference if the tool allows

classmethod :
access-specifier
method-type-specifier
method-return-type-specifier
method-declaration
method-body

(In proportional fonts, hyphens are usually skinnier than
underscores and (to my eye) make the text more readable.
It's not as noticeable with monospace fonts, but I think the
lower example looks better (read: more readable :-).)

Agreed! I wish that more languages allowed hyphen use in identifiers.
Dylan is the only one I can think of off the top of my head.

-Peter

Jul 19 '05 #39

Wes Groleau

Peter Ammon wrote:

Agreed! I wish that more languages allowed hyphen use in identifiers.
Dylan is the only one I can think of off the top of my head.

Does Dylan prevent having variables named Max,
Max-Iterations, & Iterations in the same scope?

--
Wes Groleau
"Lewis's case for the existence of God contains fallacies."
"You mean like circular reasoning?"
"He believes in God. Isn't that illogical enough?"

Jul 19 '05 #40

Peter Ammon

Wes Groleau wrote:

Peter Ammon wrote:
Agreed! I wish that more languages allowed hyphen use in identifiers.
Dylan is the only one I can think of off the top of my head.

Does Dylan prevent having variables named Max,
Max-Iterations, & Iterations in the same scope?

No. Whitespace is more important in Dylan than in a language like C.

Max-Iterations <-- variable name
Max - Iterations <-- Max minus Iterations

Other strange characters can appear in Dylan variable names. This
allows for some nice naming conventions without the nastiness of
something like Hungarian Notation. See
http://www.gwydiondylan.org/gdref/tu...nventions.html

-Peter

Jul 19 '05 #41
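
The whitespace rule described above can be illustrated with a toy tokenizer. This is a sketch in Python under a simplifying assumption, not Dylan's actual lexical grammar: a name starts with a letter and may continue with letters, digits or hyphens, and any other non-space character is a token of its own. The names `TOKEN` and `tokenize` are invented for the example.

```python
import re

# Hypothetical token pattern: a name may contain hyphens after its first
# letter; any other non-space character is a one-character token.
TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9\-]*|\S")

def tokenize(expression):
    """Split an expression into names and single-character operators."""
    return TOKEN.findall(expression)

print(tokenize("Max-Iterations"))    # ['Max-Iterations']
print(tokenize("Max - Iterations"))  # ['Max', '-', 'Iterations']
```

Because the hyphen is absorbed into the name whenever it directly follows a name character, subtraction has to be written with spaces around the operator.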

Steve

If you happen to use GNAT (GNU Ada), the compiler does do some degree of
spell checking.

gcc -c dointxor.adb
dointxor.adb:30:28: "b_valu" is undefined
dointxor.adb:30:28: possible misspelling of "b_value"
gnatmake: "dointxor.adb" compilation error

If you use the GPS for programming Ada, you'll get a little wrench icon next
to the error in the output window. If you click on the wrench it corrects
the spelling error.

If you're really interested in having comments spell checked, the folks at
ACT (Ada Core Technologies) would probably add the feature for a fee.

Steve
(The Duck)
"Leif Roar Moldskred" <rm******@online.no> wrote in message
news:86***************@huldreheim.huldreskog.no...
[snip]

This touches on one of my pet annoyances with development tools today:
no way to easily spell-check your code. In my opinion, a development
environment should at the very _least_ let you easily spell-check all
the text in comments, and preferably the individual words in variable
and function names (whether the words are separated by mixed case,
hyphens or underscores.)

Unfortunately, nobody else seems to mind. *sighs* Oh well,
spell-checkers are overrated anyway.

--
Leif Roar Moldskred

Jul 19 '05 #42
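
A "possible misspelling" hint like the one in the GNAT output above can be approximated by comparing the undefined name with the names that are declared. The sketch below is in Python and is not how GNAT itself finds its suggestion; it uses the standard library's `difflib.get_close_matches` as a stand-in, and the list of declared names and the 0.8 cutoff are assumptions made for the example.

```python
import difflib

def suggest(undefined_name, declared_names):
    """Return at most one declared name that closely resembles an undefined one."""
    return difflib.get_close_matches(undefined_name, declared_names, n=1, cutoff=0.8)

declared = ["a_value", "b_value", "result"]
for candidate in suggest("b_valu", declared):
    print(f'"b_valu" is undefined; possible misspelling of "{candidate}"')
```

This only matches names against other names in the program. It never asks whether "value" is an English word, so it is a different check from a dictionary-based spell-checker.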

Matt Gregory

James Dow Allen wrote:

(I usually rewind a file with "lseek(fd, 0L, 0)" because I can't
remember if 0 is SEEKSET or SEEK_SET.)

The Vim editor is cool for things like this because you can add your
own words to the syntax highlighting. I write Windows programs and
I have over a hundred typedef's and constants in my word list.
Actually, Vim's C syntax file comes with the standard C constants
and typedef's highlighted.

Jul 19 '05 #43
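
For the quoted `lseek` call, the constant is spelled `SEEK_SET`, and on common systems its value is 0, which is why the literal works. A small sketch in Python, whose `os` module exposes the same call and constant; the temporary file is just scaffolding for the example.

```python
import os
import tempfile

# os.SEEK_SET is the named form of the literal 0 used in the quoted call.
with tempfile.TemporaryFile() as f:
    f.write(b"hello")
    f.flush()                          # make sure the bytes reach the descriptor
    fd = f.fileno()
    os.lseek(fd, 0, os.SEEK_SET)       # rewind to the start of the file
    first = os.read(fd, 5)
    os.lseek(fd, 0, os.SEEK_SET)       # rewind again
    second = os.read(fd, 5)

print(os.SEEK_SET, first, second)
```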

Matt Gregory

Peter Ammon wrote:

Programmer Dude wrote:
Or my preference if the tool allows

classmethod :
access-specifier
method-type-specifier
method-return-type-specifier
method-declaration
method-body

(In proportional fonts, hyphens are usually skinnier than
underscores and (to my eye) make the text more readable.
It's not as noticeable with monospace fonts, but I think the
lower example looks better (read: more readable :-).)

Agreed! I wish that more languages allowed hyphen use in identifiers.
Dylan is the only one I can think of off the top of my head.

Lisp and Scheme.

Jul 19 '05 #44

Leif Roar Moldskred

"Steve" <no*************@comcast.net> writes:

If you happen to use GNAT (GNU Ada), the compiler does do some degree of
spell checking.

gcc -c dointxor.adb
dointxor.adb:30:28: "b_valu" is undefined
dointxor.adb:30:28: possible misspelling of "b_value"
gnatmake: "dointxor.adb" compilation error

That's not really spell-checking though - it doesn't check "b_value" to see
if "value" is a proper word in English.

--
Leif Roar Moldskred

Jul 19 '05 #45

Leif Roar Moldskred

"Kevin Morenski" <km@nospam.geekcenter.net> writes:

Let's say you had a variable named "tHTa," for example. With respect to
your concept, this would be a misspelling of the word "that." Now, a lot of
programmers--myself included--use letters to represent certain things in
variable names. tHTa could mean "type HTa" or anything else a programmer
could think of. How could a program possibly differentiate between
conventions in the naming of variables?

In the same way that spell-checkers for ordinary text today handle names and
other words that are correct, but not in the dictionary: When detecting the
unknown word the first time, ask the user what to do with it - whether to
correct it, accept this instance, accept all instances in this document or
add it to your private dictionary. (For spell-checking source code we'd
probably also want the option "accept all instances with this case.")

This really isn't any different from the same problem in regular text,
except that your programming convention might cause a lot of unknown
words to appear. If that's a major headache, just don't spell-check.

--
Leif Roar Moldskred

Jul 19 '05 #46
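
The "private dictionary" and "accept all instances with this case" options described above could be modelled without any interactive prompts. This is a sketch in Python under stated assumptions: `ENGLISH` stands in for a real word list, `PRIVATE` holds abbreviations a project has chosen to accept, and `EXACT_CASE` holds identifiers accepted only in one spelling and case. All three sets and both function names are hypothetical.

```python
import re

ENGLISH = {"first", "last", "name", "type"}  # stand-in for a real word list
PRIVATE = {"emp", "f", "l"}                  # abbreviations added by the project
EXACT_CASE = {"tHTa"}                        # accepted only with exactly this case

def words_in(identifier):
    """Split an identifier on underscores, hyphens and case boundaries."""
    parts = re.split(r"[_\-]+", identifier)
    return [w for p in parts
            for w in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", p)]

def unknown_words(identifier):
    """Return the words in an identifier that no dictionary accepts."""
    if identifier in EXACT_CASE:
        return []
    return [w for w in words_in(identifier)
            if w.lower() not in ENGLISH and w.lower() not in PRIVATE]

for name in ["tHTa", "THTA", "Emp", "FName", "LNmae"]:
    print(name, unknown_words(name))
```

With these sets, `tHTa`, `Emp` and `FName` pass, while `THTA` and the misspelled `LNmae` are reported. The accepted words live in data that can be shared by a whole team rather than being confirmed one dialog at a time.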

Martin Dowie

"Matt Gregory" <bl****************@earthlink.net> wrote in message
news:BPvfb.20445

Agreed! I wish that more languages allowed hyphen use in identifiers.
Dylan is the only one I can think of off the top of my head.

Lisp and Scheme.

COBOL

Jul 19 '05 #47

Jakob Bieling

"Leif Roar Moldskred" <rm******@online.no> wrote in message
news:86************@huldreheim.huldreskog.no...

"Steve" <no*************@comcast.net> writes:
If you happen to use GNAT (GNU Ada), the compiler does do some degree of spell checking.

gcc -c dointxor.adb
dointxor.adb:30:28: "b_valu" is undefined
dointxor.adb:30:28: possible misspelling of "b_value"
gnatmake: "dointxor.adb" compilation error
That's not really spell-checking though - it doesn't check "b_value" to see
if "value" is a proper word in English.

But it is that kind of word-matching I would personally like to see in
more compilers (specifically C++ compilers).

I do agree with Kevin Morenski (news:3f********@nntp2.nac.net) that a
real spell-checker for source code is not practicable. You said that the
spell-checker would just have to ask you whether to ignore it or how else to
proceed. Have you thought about how annoying 100s or even 1000s of those
message boxes, asking how to proceed, will be when compiling already
existing source with this spell-checker?

regards
--
jb

(replace y with x if you want to reply by e-mail)

Jul 19 '05 #48

Corey Murtagh

Martin Dowie wrote:

"Matt Gregory" <bl****************@earthlink.net> wrote in message
news:BPvfb.20445
Agreed! I wish that more languages allowed hyphen use in identifiers.
Dylan is the only one I can think of off the top of my head.

Lisp and Scheme.

COBOL

Isn't there a variation of Godwin's Law covering COBOL? :>

--
Corey Murtagh
The Electric Monk
"Quidquid latine dictum sit, altum viditur!"

Jul 19 '05 #49

CBFalconer

Matt Gregory wrote:

Peter Ammon wrote:
Programmer Dude wrote:
.... snip ...

(In proportional fonts, hyphens are usually skinnier than
underscores and (to my eye) make the text more readable.
It's not as noticeable with monospace fonts, but I think the
lower example looks better (read: more readable :-).)

Agreed! I wish that more languages allowed hyphen use in identifiers.
Dylan is the only one I can think of off the top of my head.

Lisp and Scheme.

Cobol

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Jul 19 '05 #50
