Critique my assignment please

warint

My lecturer gave us an assignment. He has a very "mature" way of
teaching in that he doesn't care whether people show up, whether they
do the assignments, or whether they copy other people's work.
Furthermore, he doesn't even mark the assignments, but rather gives
tips and so forth when going over students' work. To test students'
capabilities for the purpose of state exams and qualifications though,
he actually sits down with us at a computer and watches us write code.

We were issued with an assignment a yesterday. I have attempted the
assignment, and have re-produced the code below. I'd appreciate if
you'd give me advice, suggestions, etc. on the code. Again, I must
stress that you won't simply be "doing my homework for me".

Here's the assignment the lecturer gave us:

[begin quote]
Write a program to check the validity of a serial number found on a
Euro bank note. The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms. The program should be well-
commented and easily understood where possible. Debug-mode facilities
such as "assert" should be exploited no matter how much redundancy
they add -- however the efficiency of the release-mode executable
should not be burdened by debug-mode features. The student must find
out for themselves how to check the validity of a Euro bank note
serial number. The program should not malfunction on account of bad
user input. Best of luck.
[end quote]

Before I begin, here's an excerpt from Wikipedia which discusses how
to check the validity of the serial number:

[begin excerpt]
Replace the initial letter by its position in the alphabet (that is L
is 12, M is 13,..., Z is 26).
Add up this number and every digit of the serial number. For example:
U08217383936 is 21 + 0 + 8 + 2 + 1 + 7 + 3 + 8 + 3 + 9 + 3 + 6 = 71
Add up all the digits of this new number, redo as many times as
necessary until you obtain a one-digit number. Alternatively computer
programmers may find the familiar MOD 9 function easier to implement.
This gives the same result.
The resulting number must be 8 - in the example above, 7 + 1 = 8 or 71
MOD 9 = 8, so it's correct.
[end excerpt]

Now here's my code. You're probably gonna kill me for this, but I
don't have a compiler to hand, so, even though I've checked over the
code, I can't say with 100% certainty that it will compile.

#include <assert.h/* For assert */
#include <ctype.h/* For stuff like isupper */
#include <stdlib.h/* For EXIT_FAILURE */
#include <stdio.h/* For puts and gets */
int const serial_len = 12; /* Serial number = one letter followed
by eleven digits */
/* Function: DigitFromChar

Converts '0' to 0, '5' to 5, '3' to 3, etc..

Exploits C89 feature that '0' through '9' must be consecutive.

Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid

*/

unsigned DigitFromChar(char const x)
{
assert( x >= '0' && x <= '9' );

return x - '0';
}
/* Function: NumberFromUpperLetter

Converts 'A' to 1, 'B' to 2, 'C' to 3... and so on.

Uses a simple switch statement.

Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid

*/

unsigned NumberFromUpperLetter(char const x)
{
assert(isupper(x));

switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;
}
}
/* Function: IsValidEuroSerial

Returns false if serial is invalid, otherwise true.

Loops through the characters, summing with each iteration.

Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid
*/

int IsValidEuroSerial(char const *p)
{
int const assert_dummy = assert(p);

char const *const pend = p + serial_len;

unsigned sum = NumberFromUpperLetter(*p++);

do
{
assert(isnum(*p));
sum += DigitFromChar(*p++);
} while (pend != p);

return 8 == sum%9;
}
int main(void)
{
char input[serial_len + 1],
*p = input + 1,
const *const pnull = input + (sizeof input / sizeof *input -
1);

puts("Enter Euro banknote serial number: ");

gets(input);

if (!isupper(*input)) goto Bad; /* Check first char is upper
letter */

do if (!isnum(*p++)) goto Bad; /* Check each char is a digit */
while (pnull != p);

puts(IsValidEuroSerial(input) ? "\n\nValid\n" : "\n\nINVALID\n");
return 0;

Bad:
puts("\n\nInvalid Input. Input must consist of an uppercase
letter followed by eleven digits only.\n");
return EXIT_FAILURE;
}

The shortcomings I can see so far are:

1) The char array is looped through twice, once in main and once in
"IsValidEuroSerial". I should consider writing an
"IsValidEuroSerial_Safe" function so that it can be reduced to one
loop.
2) I'm sure you will all jump down my throat about "gets", but please
just give suitable alternatives.

Thanks a lot for your time,

Martin

Aug 23 '07 #1

Subscribe Post Reply

3468

user923005

On Aug 23, 4:12 pm, war...@eircom.net wrote:
[snip]

int const assert_dummy = assert(p);

If this compiles, then your compiler is broken.

Aug 24 '07 #2

user923005

On Aug 23, 4:12 pm, war...@eircom.net wrote:
[snip]

do if (!isnum(*p++)) goto Bad; /* Check each char is a digit */

[snip]

I guess you mean isdigit() instead of isnum()

P.S.
Of course we're going to roast you over gets().
In this context:
fgets(input, sizeof input, stdin);
would be perfectly fine and not have dire consequences.

Aug 24 '07 #3

cpedant

Your code fails to compile on my GCC with -ansi -pedantic.

On Aug 23, 6:12 pm, war...@eircom.net wrote:
[snip]

unsigned NumberFromUpperLetter(char const x)
{
assert(isupper(x));

switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;
}

}

A possibly minor point: there is no return statement executed in the
case that x is not one of 'A'-'Z'.

[snip]

int IsValidEuroSerial(char const *p)
{
int const assert_dummy = assert(p);

This is a syntax error since assert expands to a void expression.

>
char const *const pend = p + serial_len;

unsigned sum = NumberFromUpperLetter(*p++);

do
{
assert(isnum(*p));

isnum is not a function in standard C library. I think you're looking
for isdigit.

sum += DigitFromChar(*p++);
} while (pend != p);

return 8 == sum%9;

}

int main(void)
{
char input[serial_len + 1],

Variable-length arrays are not permitted in C89. Although you've
declared serial_len with the const qualifier, this does not make
"serial_len" a constant expression.

*p = input + 1,
const *const pnull = input + (sizeof input / sizeof *input -
1);

This is a syntax error. The leading "const" must be grouped directly
with "char". Here you have it following a comma in a list of
declarators, which is illegal.

>
puts("Enter Euro banknote serial number: ");

gets(input);

It is widely accepted that using fgets is much better than using gets.

>
if (!isupper(*input)) goto Bad; /* Check first char is upper
letter */

do if (!isnum(*p++)) goto Bad; /* Check each char is a digit */

Again, you probably want isdigit here.

while (pnull != p);

puts(IsValidEuroSerial(input) ? "\n\nValid\n" : "\n\nINVALID\n");
return 0;

Bad:
puts("\n\nInvalid Input. Input must consist of an uppercase
letter followed by eleven digits only.\n");
return EXIT_FAILURE;

}

[snip]

Aug 24 '07 #4

Eric Sosman

wa****@eircom.net wrote:

[...]
assert(isupper(x));

switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;

Gyuuggh ...

A naive programmer who hasn't yet learned that all the
world isn't ASCII might write

if ('A' <= x && x <= 'Z')
return x - 'A' + 1;
else
vomit();

A slightly more sophisticated programmer might use a
look-up table. But a programmer who knows what's in his
library -- and assiduous use of the library was, I believe,
part of the assignment -- would write something more like

static const char alphabet[]
= "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const char *px = strchr(alphabet, x);
if (px != NULL)
return px - alphabet + 1;
else
vomit();

--
Eric Sosman
es*****@ieee-dot-org.invalid

Aug 24 '07 #5

user923005

On Aug 23, 6:42 pm, Eric Sosman <esos...@ieee-dot-org.invalidwrote:

war...@eircom.net wrote:
[...]
assert(isupper(x));

switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;

Gyuuggh ...

A naive programmer who hasn't yet learned that all the
world isn't ASCII might write

if ('A' <= x && x <= 'Z')
return x - 'A' + 1;
else
vomit();

A slightly more sophisticated programmer might use a
look-up table. But a programmer who knows what's in his
library -- and assiduous use of the library was, I believe,
part of the assignment -- would write something more like

static const char alphabet[]
= "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const char *px = strchr(alphabet, x);
if (px != NULL)
return px - alphabet + 1;
else
vomit();

/* Since I wanted to do some math directly from the switch, I used a
switch anyway like this: */

#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define VALID_SERIAL_NUMBER (0)
#define INVALID_LENGTH (-1)
#define INVALID_FORMAT (-2)
#define INVALID_CHECKSUM (-3)

static unsigned letter_to_sum(int letter, int *err)
{
int sum = 0;
assert(isalpha(letter));
switch (letter) {
case 'A':
sum = 2;
break;
case 'B':
sum = 3;
break;
case 'C':
sum = 4;
break;
case 'D':
sum = 5;
break;
case 'E':
sum = 6;
break;
case 'F':
sum = 7;
break;
case 'G':
sum = 8;
break;
case 'H':
sum = 9;
break;
case 'I':
sum = 1;
break;
case 'J':
sum = 2;
break;
case 'K':
sum = 3;
break;
case 'L':
sum = 4;
break;
case 'M':
sum = 5;
break;
case 'N':
sum = 6;
break;
case 'O':
sum = 7;
break;
case 'P':
sum = 8;
break;
case 'Q':
sum = 9;
break;
case 'R':
sum = 1;
break;
case 'S':
sum = 2;
break;
case 'T':
sum = 3;
break;
case 'U':
sum = 4;
break;
case 'V':
sum = 5;
break;
case 'W':
sum = 6;
break;
case 'X':
sum = 7;
break;
case 'Y':
sum = 8;
break;
case 'Z':
sum = 9;
break;
default:
*err = 1;
break;
}
return sum;
}

static unsigned digit_to_sum(int digit, int *err)
{
int sum = 0;
assert(isdigit(digit));
switch (digit) {
case '0':
sum = 0;
break;
case '1':
sum = 1;
break;
case '2':
sum = 2;
break;
case '3':
sum = 3;
break;
case '4':
sum = 4;
break;
case '5':
sum = 5;
break;
case '6':
sum = 6;
break;
case '7':
sum = 7;
break;
case '8':
sum = 8;
break;
case '9':
sum = 9;
break;
default:
*err = 1;
break;
}
return sum;
}
static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[9], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);
if (err != 0)
return 1;
return sum % 9;
}

int validate_euro_serial(const char *const serial)
{
if (strlen(serial) != 12)
return INVALID_LENGTH;
if (!isalpha(serial[0]))
return INVALID_FORMAT;
if (!isdigit(serial[1]))
return INVALID_FORMAT;
if (!isdigit(serial[2]))
return INVALID_FORMAT;
if (!isdigit(serial[3]))
return INVALID_FORMAT;
if (!isdigit(serial[4]))
return INVALID_FORMAT;
if (!isdigit(serial[5]))
return INVALID_FORMAT;
if (!isdigit(serial[6]))
return INVALID_FORMAT;
if (!isdigit(serial[7]))
return INVALID_FORMAT;
if (!isdigit(serial[8]))
return INVALID_FORMAT;
if (!isdigit(serial[9]))
return INVALID_FORMAT;
if (!isdigit(serial[10]))
return INVALID_FORMAT;
if (!isdigit(serial[11]))
return INVALID_FORMAT;
if (compute_checksum(serial) == 0)
return VALID_SERIAL_NUMBER;
return INVALID_CHECKSUM;
}

#ifdef UNIT_TEST
#include <stdio.h>
#include <string.h>
const char *const valid_serials[] = {
"X01587275921",
"N67010337633",
"L09914094383",
"V10427872648",
NULL
};

const char *const invalid_serials[] = {
"A12345678901",
"D12345678901",
"E12345678901",
"Z12345678901",
NULL
};

const char *const stupid_serials[] = {
"Abz980769671",
"D178901",
"1aE12345678901",
"!@#$%^&*1",
NULL
};

/* Due to Jack Klein: */
char *getsafe(char *buffer, int count)
{
char *result = buffer,
*np;
assert(buffer);
if ((buffer == NULL) || (count < 1))
result = NULL;
else if (count == 1)
*result = '\0';
else if ((result = fgets(buffer, count, stdin)) != NULL)
if (np = strchr(buffer, '\n'))
*np = '\0';
return result;
}

void explain(int result, const char *string)
{
assert(string);
switch (result) {
case VALID_SERIAL_NUMBER:
printf("%s is a valid serial number\n", string);
break;
case INVALID_LENGTH:
printf("%s has an invalid length\n", string);
break;
case INVALID_FORMAT:
printf("%s has an invalid format\n", string);
break;
case INVALID_CHECKSUM:
printf("%s has an invalid checksum\n", string);
break;
default:
printf("I am an idiot, because I never imagined: %s\n",
string);
break;

}
}
char string[32767];
int main(void)
{
char *p;
size_t index;
int result;
for (index = 0; valid_serials[index]; index++) {
result = validate_euro_serial(valid_serials[index]);
explain(result, valid_serials[index]);
}
for (index = 0; invalid_serials[index]; index++) {
result = validate_euro_serial(invalid_serials[index]);
explain(result, invalid_serials[index]);
}
for (index = 0; stupid_serials[index]; index++) {
result = validate_euro_serial(stupid_serials[index]);
explain(result, stupid_serials[index]);
}

puts("Enter a serial number from a Euro note:");
p = getsafe(string, sizeof string);
if (p) {
result = validate_euro_serial(string);
explain(result, string);
}
return 0;
}
#endif

Aug 24 '07 #6

Richard Heathfield

wa****@eircom.net said:

<snip>

[begin quote]
Write a program to check the validity of a serial number found on a
Euro bank note. The program must be fully portable and compliant with
the C89 Standard.

Okay, sounds like a job for comp.lang.c - although he says "fully
portable", he is unlikely to be quite as picky as we are, so you should
be okay.

<more "write it properly" stuff>

All reasonable.

The program should not malfunction on account of bad
user input. Best of luck.

Mark this well.

[end quote]

Before I begin, here's an excerpt from Wikipedia which discusses how
to check the validity of the serial number:

[begin excerpt]
Replace the initial letter by its position in the alphabet (that is L
is 12, M is 13,..., Z is 26).
Add up this number and every digit of the serial number. For example:
U08217383936 is 21 + 0 + 8 + 2 + 1 + 7 + 3 + 8 + 3 + 9 + 3 + 6 = 71

Okay, 21 % 9 is 3, + 0 + 8 is 11, % 9 is 2, + 2 + 1 is 5, + 7 is 12, % 9
is 3, + 3 is 6, + 8 is 14, % 9 is 5, + 3 is 8, + 9 % 9 is 8, + (3 + 6)
% 9 is still 8.

Now here's my code. You're probably gonna kill me for this, but I
don't have a compiler to hand, so, even though I've checked over the
code, I can't say with 100% certainty that it will compile.

Consider yourself dead, then. :-)

After fixing your line wrap, but making no other changes, I get the
following output from a C89-conforming implementation:

foo.c:24: warning: no previous prototype for `DigitFromChar'
foo.c:44: warning: no previous prototype for `NumberFromUpperLetter'
foo.c: In function `NumberFromUpperLetter':
foo.c:59: warning: control reaches end of non-void function
foo.c: At top level:
foo.c:74: warning: no previous prototype for `IsValidEuroSerial'
foo.c: In function `IsValidEuroSerial':
foo.c:75: void value not ignored as it ought to be
foo.c:83: warning: implicit declaration of function `isnum'
foo.c:75: warning: unused variable `assert_dummy'
foo.c: In function `main':
foo.c:93: warning: ANSI C forbids variable-size array `input'
foo.c:93: size of array `input' is too large
foo.c:95: parse error before `const'
foo.c:106: `pnull' undeclared (first use in this function)
foo.c:106: (Each undeclared identifier is reported only once
foo.c:106: for each function it appears in.)
foo.c:93: warning: unused variable `input'
make: *** [foo.o] Error 1

The shortcomings I can see so far are:

....irrelevant until it compiles.

2) I'm sure you will all jump down my throat about "gets", but please
just give suitable alternatives.

fgets

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Aug 24 '07 #7

Keith Thompson

Eric Sosman <es*****@ieee-dot-org.invalidwrites:

wa****@eircom.net wrote:
>[...]
assert(isupper(x));
switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;

Gyuuggh ...

A naive programmer who hasn't yet learned that all the
world isn't ASCII might write

if ('A' <= x && x <= 'Z')
return x - 'A' + 1;
else
vomit();

A slightly more sophisticated programmer might use a
look-up table. But a programmer who knows what's in his
library -- and assiduous use of the library was, I believe,
part of the assignment -- would write something more like

static const char alphabet[]
= "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const char *px = strchr(alphabet, x);
if (px != NULL)
return px - alphabet + 1;
else
vomit();

Well, strchr does a linear search over the 'alphabet' string, whereas
the switch statement most likely uses a jump table for O(1)
performance. It's not a huge deal for only 26 letters, and worrying
about it almost seems like microoptimization, but it's something to
consider.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Aug 24 '07 #8

Old Wolf

On Aug 24, 2:03 pm, Richard Heathfield <r...@see.sig.invalidwrote:

After fixing your line wrap, but making no other changes, I get the
following output from a C89-conforming implementation:

foo.c:24: warning: no previous prototype for `DigitFromChar'
[...]

No suffusion of yellow this time?

Aug 24 '07 #9

Richard Heathfield

Old Wolf said:

On Aug 24, 2:03 pm, Richard Heathfield <r...@see.sig.invalidwrote:
>After fixing your line wrap, but making no other changes, I get the
following output from a C89-conforming implementation:

foo.c:24: warning: no previous prototype for `DigitFromChar'
[...]

No suffusion of yellow this time?

No, because it doesn't compile yet. Suffusions may follow in due course,
but right now they would be premature.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Aug 24 '07 #10

Eric Sosman

Keith Thompson wrote:

Eric Sosman <es*****@ieee-dot-org.invalidwrites:
>wa****@eircom.net wrote:
>>[...]
assert(isupper(x));
switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;
Gyuuggh ...

A naive programmer who hasn't yet learned that all the
world isn't ASCII might write

if ('A' <= x && x <= 'Z')
return x - 'A' + 1;
else
vomit();

A slightly more sophisticated programmer might use a
look-up table. But a programmer who knows what's in his
library -- and assiduous use of the library was, I believe,
part of the assignment -- would write something more like

static const char alphabet[]
= "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const char *px = strchr(alphabet, x);
if (px != NULL)
return px - alphabet + 1;
else
vomit();

Well, strchr does a linear search over the 'alphabet' string, whereas
the switch statement most likely uses a jump table for O(1)
performance.

Although the Standard says nothing about the algorithm
used by strchr() nor about its time complexity, I am confident
that most implementations exhibit O(1) time in the case shown.

Hint: Can you think of a simpler way to write O(27)?

It's not a huge deal for only 26 letters, and worrying
about it almost seems like microoptimization, but it's something to
consider.

At what time should you consider it?

Jackson's Laws Of Program Optimization:

First Law: Don't Do It.

Second Law (for experts only): Don't Do It Yet.

Which of the laws applies to you?

--
Eric Sosman
es*****@ieee-dot-org.invalid

Aug 24 '07 #11

Thad Smith

wa****@eircom.net wrote:

Here's the assignment the lecturer gave us:

[begin quote]
Write a program to check the validity of a serial number found on a
Euro bank note. The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms. The program should be well-
commented and easily understood where possible. Debug-mode facilities
such as "assert" should be exploited no matter how much redundancy
they add -- however the efficiency of the release-mode executable
should not be burdened by debug-mode features. The student must find
out for themselves how to check the validity of a Euro bank note
serial number. The program should not malfunction on account of bad
user input. Best of luck.
[end quote]

First, my critique of the assignment (since you asked): I think it is a
well-defined assignment. I especially like the part that /you/ need to
determine how to check validity.

int const serial_len = 12; /* Serial number = one letter followed
by eleven digits */

This should be a #define constant in C. Your C++ is showing. ;-)

/* Function: DigitFromChar

Converts '0' to 0, '5' to 5, '3' to 3, etc..

Exploits C89 feature that '0' through '9' must be consecutive.

Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid

*/

First, I like your function header. I personally like to specify the
domain of the inputs, so I would say that the input is a digit '0'
though '9', but you imply that.

I would quibble with your designation of an assertion exit as safe,
while returning an undefined value when called with an input outside the
specified domain as unsafe, but that's minor.

/* Function: NumberFromUpperLetter

Converts 'A' to 1, 'B' to 2, 'C' to 3... and so on.

Uses a simple switch statement.

Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid

*/

unsigned NumberFromUpperLetter(char const x)
{
assert(isupper(x));

switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;
}
}

Others has mentioned that no value is returned for a domain error. For
this function, I would probably return 0 for a non-uppercase character.

/* Function: IsValidEuroSerial

Returns false if serial is invalid, otherwise true.

Loops through the characters, summing with each iteration.

Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid
*/

int IsValidEuroSerial(char const *p)
{
int const assert_dummy = assert(p);

char const *const pend = p + serial_len;

unsigned sum = NumberFromUpperLetter(*p++);

do
{
assert(isnum(*p));
sum += DigitFromChar(*p++);
} while (pend != p);

return 8 == sum%9;
}

My most important suggestion is with this function. According to the
description, it determines whether a specified string is a valid Euro
(bank note) serial number. It fails to determine whether the string is
in the proper format. I see that you test that elsewhere, but the
description of this function does not match the functionality. I think
your program would be much better organized and capable of being used
elsewhere if IsValidEuroSerial() did a full validity check.

puts(IsValidEuroSerial(input) ? "\n\nValid\n" : "\n\nINVALID\n");
return 0;

Bad:
puts("\n\nInvalid Input. Input must consist of an uppercase
letter followed by eleven digits only.\n");
return EXIT_FAILURE;

Here you have too types of responses for invalid serial numbers. One
says only INVALID and returns a value of 0, which the other returns with
a longer message and returns EXIT_FAILURE. Why are the two error
responses so different?

Your code also does not check for an input which is too long.

--
Thad

Aug 24 '07 #12

cpedant

On Aug 23, 9:53 pm, Eric Sosman <esos...@ieee-dot-org.invalidwrote:

Keith Thompson wrote:
Eric Sosman <esos...@ieee-dot-org.invalidwrites:
war...@eircom.net wrote:
[...]
assert(isupper(x));
switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;

[snip]

static const char alphabet[]
= "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const char *px = strchr(alphabet, x);
if (px != NULL)
return px - alphabet + 1;
else
vomit();

Well, strchr does a linear search over the 'alphabet' string, whereas
the switch statement most likely uses a jump table for O(1)
performance.

Although the Standard says nothing about the algorithm
used by strchr() nor about its time complexity, I am confident
that most implementations exhibit O(1) time in the case shown.

Hint: Can you think of a simpler way to write O(27)?

Sure. Let A = O(27). A. :)

The string search is a nice solution in this case. However, I think an
important observation to make is that the niceness of it is dependent
on
the fact that we are mapping 'A'-'Z' onto {1, 2, ..., 26} (or some
other
nice set). In a hypothetical general case where we may need to map
'A'-'Z' onto nastier sets, the switch does the job easily while the
string search breaks down.

[snip]

Aug 24 '07 #13

Army1987

On Thu, 23 Aug 2007 22:53:24 -0400, Eric Sosman wrote:

[snip]

>Well, strchr does a linear search over the 'alphabet' string, whereas
the switch statement most likely uses a jump table for O(1)
performance.

Although the Standard says nothing about the algorithm
used by strchr() nor about its time complexity, I am confident
that most implementations exhibit O(1) time in the case shown.

Hint: Can you think of a simpler way to write O(27)?

Yes. O(1). O(27) means bounded time. Any algorithm which has O(27)
runtime has also O(1) runtime, but with a constant 27 times
larger, and viceversa. You were thinking about O(n)?
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 24 '07 #14

Army1987

On Thu, 23 Aug 2007 18:49:31 -0700, user923005 wrote:

On Aug 23, 6:42 pm, Eric Sosman <esos...@ieee-dot-org.invalidwrote:
>war...@eircom.net wrote:
[...]
assert(isupper(x));

switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;

Gyuuggh ...

A naive programmer who hasn't yet learned that all the
world isn't ASCII might write

if ('A' <= x && x <= 'Z')
return x - 'A' + 1;
else
vomit();

A slightly more sophisticated programmer might use a
look-up table. But a programmer who knows what's in his
library -- and assiduous use of the library was, I believe,
part of the assignment -- would write something more like

static const char alphabet[]
= "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const char *px = strchr(alphabet, x);
if (px != NULL)
return px - alphabet + 1;
else
vomit();

/* Since I wanted to do some math directly from the switch, I used a
switch anyway like this: */

#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define VALID_SERIAL_NUMBER (0)
#define INVALID_LENGTH (-1)
#define INVALID_FORMAT (-2)
#define INVALID_CHECKSUM (-3)

static unsigned letter_to_sum(int letter, int *err)
{
int sum = 0;
assert(isalpha(letter));
switch (letter) {
case 'A':
sum = 2;
break;
case 'B':
sum = 3;
break;

[snip]

case 'H':
sum = 9;

What would be wrong with sum = 0 here, then?

break;

[snip]

default:
*err = 1;
break;
}
return sum;
}

static unsigned digit_to_sum(int digit, int *err)
{
int sum = 0;
assert(isdigit(digit));
switch (digit) {

Whoa. How is it better than sum = digit - '0'?

case '0':
sum = 0;
break;
case '1':
sum = 1;
break;
case '2':
sum = 2;
break;
case '3':
sum = 3;
break;
case '4':
sum = 4;
break;
case '5':
sum = 5;
break;
case '6':
sum = 6;
break;
case '7':
sum = 7;
break;
case '8':
sum = 8;
break;
case '9':
sum = 9;
break;
default:
*err = 1;
break;
}
return sum;
}
static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[9], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);

I don't think the OP's teacher means *this* by "the code must be
efficient. How many nanoseconds it is going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?

if (err != 0)
return 1;
return sum % 9;
}

[snip]

char string[32767];

"The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms." IOW, 32 KB are much, much
more than needed.
Also, identifiers starting with str followed by a lowercase letter
are reserved.
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 24 '07 #15

Army1987

On Thu, 23 Aug 2007 16:12:36 -0700, warint wrote:

>
My lecturer gave us an assignment. He has a very "mature" way of
teaching in that he doesn't care whether people show up, whether they
do the assignments, or whether they copy other people's work.
Furthermore, he doesn't even mark the assignments, but rather gives
tips and so forth when going over students' work. To test students'
capabilities for the purpose of state exams and qualifications though,
he actually sits down with us at a computer and watches us write code.

We were issued with an assignment a yesterday. I have attempted the
assignment, and have re-produced the code below. I'd appreciate if
you'd give me advice, suggestions, etc. on the code. Again, I must
stress that you won't simply be "doing my homework for me".

Here's the assignment the lecturer gave us:

[begin quote]
Write a program to check the validity of a serial number found on a
Euro bank note. The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms. The program should be well-
commented and easily understood where possible. Debug-mode facilities
such as "assert" should be exploited no matter how much redundancy
they add -- however the efficiency of the release-mode executable
should not be burdened by debug-mode features. The student must find
out for themselves how to check the validity of a Euro bank note
serial number. The program should not malfunction on account of bad
user input. Best of luck.
[end quote]

Before I begin, here's an excerpt from Wikipedia which discusses how
to check the validity of the serial number:

[begin excerpt]
Replace the initial letter by its position in the alphabet (that is L
is 12, M is 13,..., Z is 26).
Add up this number and every digit of the serial number. For example:
U08217383936 is 21 + 0 + 8 + 2 + 1 + 7 + 3 + 8 + 3 + 9 + 3 + 6 = 71
Add up all the digits of this new number, redo as many times as
necessary until you obtain a one-digit number. Alternatively computer
programmers may find the familiar MOD 9 function easier to implement.
This gives the same result.
The resulting number must be 8 - in the example above, 7 + 1 = 8 or 71
MOD 9 = 8, so it's correct.
[end excerpt]

Now here's my code. You're probably gonna kill me for this, but I
don't have a compiler to hand, so, even though I've checked over the
code, I can't say with 100% certainty that it will compile.

[snip]

The shortcomings I can see so far are:

1) The char array is looped through twice, once in main and once in
"IsValidEuroSerial". I should consider writing an
"IsValidEuroSerial_Safe" function so that it can be reduced to one
loop.

Write the function which returns false even if the format or the
length are wrong.
BTW, identifiers starting with "is" followed by a lowercase letter
are reserved, and identifiers with external linkage needn't be
case-sensitive or have more than 6 significant characters, so
IsValidEuroSerial is equivalent to isvali which is a reserved
identifier. Either declare the function as static, so that the
identifier will have internal linkage and will be required to be
case sensitive, or call it Is_ValidEuroSerial (or, as my religion
suggests, is_valid_euro_serial).

2) I'm sure you will all jump down my throat about "gets", but please
just give suitable alternatives.

Well, the assignment doesn't require to read from stdin, so you
could pass the serial as a command line argument. (Note that
implementations where command line is case insensitive will
pass it in lowercase, so ignore the case of the first letter.)
#include "stuff.h"
int main(int argc, char *argv[])
{
if (argv < 2) {
fprintf(stderr, "Usage:\t%s serial_no\n", argv[0]);
return EXIT_FAILURE;
} else {
argv[1][0] = toupper((unsigned char)argv[1][0]);
printf("%s %s a valid serial number for a euro "
"banknote.\n", argv[1],
is_valid_euro_serial(argv[1]) ? "is" : "is not");
}
return 0;
}
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 24 '07 #16

warint

Thanks everyone for your replies. I'll give it a second try:

#include <assert.h/* For assert */
#include <ctype.h/* For stuff like isupper */
#include <stdio.h/* For puts and gets */
#include <string.h/* For strchr */
#define SERIAL_LEN 12 /* Serial number = one letter followed by
eleven digits */
/* Function: DigitFromChar

Converts '0' to 0, '5' to 5, '3' to 3, etc..

Exploits C89 feature that '0' through '9' must be consecutive.

Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid
*/
unsigned DigitFromChar(char const x)
{
assert( x >= '0' && x <= '9' );

return x - '0';
}
/* Function: NumberFromUpperLetter

Converts 'A' to 1, 'B' to 2, 'C' to 3... and so on.

Uses "ABCDEF..." and strchr.

Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid
*/
unsigned NumberFromUpperLetter(char const x)
{
static char const letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

assert(isupper(x));

return strchr(letters, x) - letters + 1;
}
/* Function: IsValidEuroSerial

Returns 1 if valid, 0 if invalid, or -1 if syntax error.

Loops through the characters, summing with each iteration.

Release Mode: UNSAFE because behaviour is undefined if pointer is
invalid
Debug Mode: SAFE because assertion fails if pointer is invalid
*/
int IsValidEuroSerial(char const *p)
{
int const dummy = ( assert(p), 0 );

char const *const pend = p + SERIAL_LEN;

unsigned sum;
if(!isupper(*p)) return -1;

sum = NumberFromUpperLetter(*p++);

do
{
if(!isdigit(*p)) return -1;
sum += DigitFromChar(*p++);
} while (pend != p);

if (*pend) return -1;

if (8 == sum%9) return 1;
else return 0;
}
int main(void)
{
char input[SERIAL_LEN + 1];

char const *output_string;
puts("Enter Euro banknote serial number: ");

gets(input);
switch (Is_ValidEuroSerial(input))
{
case -1: output_string = "\n\nInvalid Input. Input must consist "
"of an uppercase letter followed by "
"eleven digits only.\n";
break;

case 0: output_string = "\n\nINVALID\n";

break;

case 1: output_string = "\n\nValid\n";

break;
}

puts(output_string);

return 0;
}

Aug 24 '07 #17

Richard Heathfield

wa****@eircom.net said:

Thanks everyone for your replies. I'll give it a second try:

Getting better, but not quite there yet.

foo.c:24: warning: no previous prototype for `DigitFromChar'
foo.c:44: warning: no previous prototype for `NumberFromUpperLetter'
foo.c:66: warning: no previous prototype for `IsValidEuroSerial'
foo.c: In function `IsValidEuroSerial':
foo.c:67: warning: unused variable `dummy'
foo.c: In function `main':
foo.c:103: warning: implicit declaration of function
`Is_ValidEuroSerial'
foo.c:95: warning: `output_string' might be used uninitialized in this
function
foo.o: In function `main':
foo.c:100: the `gets' function is dangerous and should not be used.
foo.c:103: undefined reference to `Is_ValidEuroSerial'
collect2: ld returned 1 exit status
make: *** [foo] Error 1

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Aug 24 '07 #18

Eric Sosman

wa****@eircom.net wrote:

Thanks everyone for your replies. I'll give it a second try:
[...]

unsigned NumberFromUpperLetter(char const x)
{
static char const letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
assert(isupper(x));
return strchr(letters, x) - letters + 1;
}

Shaky. First, assert() is not a good mechanism for
checking the validity of input: not only can it be turned
off, but if it detects something there's no recovery and
little likelihood of a diagnostic message that has meaning
for anyone but the programmer himself.

Second, there well be upper-case letters that are not
among the twenty-six you have listed: Ã‚, Ã‘, Ä¢, Ã˜, Ã‡ and so
on. In theory, all C programs begin execution in the "C"
locale where only the listed twenty-six are upper-case, and
you're safe. But in practice, as a "convenience," some C
implementations start in some other, non-standard locale
that agrees better with local customs than with the Standard.
You're safe in theory, but in practice you may find that
isupper(x) is no guarantee that strchr(letters, x) will
return a non-NULL result.

Recommendations: (1) Use something solider than assert()
for input validation. (2) Don't try to predict whether the
strchr() will succeed or fail, but inspect its returned value
to discover what happened. (Thought experiment: What if it
turned out that the letter O was forbidden at the start of
a serial number, because of its resemblance to 0? Would you
still use isupper() to check for validity? What if I and L
and Y were also forbidden because they look like 1, Â£, and Â¥?
In short: Why make two tests when one will do?)

--
Eric Sosman
es*****@ieee-dot-org.invalid

Aug 24 '07 #19

warint

foo.c:24: warning: no previous prototype for `DigitFromChar'
foo.c:44: warning: no previous prototype for `NumberFromUpperLetter'
foo.c:66: warning: no previous prototype for `IsValidEuroSerial'

I realise your compiler is set to warn about this, but I'm happy with
their omission in this program.

foo.c: In function `IsValidEuroSerial':
foo.c:67: warning: unused variable `dummy'

Again I'm happy with this.

foo.c: In function `main':
foo.c:103: warning: implicit declaration of function
`Is_ValidEuroSerial'

Wups a daisy. I'll rewrite the code and post below.

foo.c:95: warning: `output_string' might be used uninitialized in this
function

The surrounding functions prevent that. I'm happy with the setup.

foo.o: In function `main':
foo.c:100: the `gets' function is dangerous and should not be used.

fgets instead, I presume?

foo.c:103: undefined reference to `Is_ValidEuroSerial'
collect2: ld returned 1 exit status
make: *** [foo] Error 1

Here we go:

#include <assert.h/* For assert */
#include <ctype.h/* For stuff like isupper */
#include <stdio.h/* For puts and gets */
#include <string.h/* For strchr */
#define SERIAL_LEN 12 /* Serial number = one letter followed by
eleven digits */
/* Function: DigitFromChar
Converts '0' to 0, '5' to 5, '3' to 3, etc..
Exploits C89 feature that '0' through '9' must be consecutive.
Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid
*/
unsigned DigitFromChar(char const x)
{
assert( x >= '0' && x <= '9' );
return x - '0';

}
/* Function: NumberFromUpperLetter

Converts 'A' to 1, 'B' to 2, 'C' to 3... and so on.
Uses "ABCDEF..." and strchr.
Release Mode: UNSAFE because behaviour is undefined if input is
invalid
Debug Mode: SAFE because assertion fails if input is invalid
*/
unsigned NumberFromUpperLetter(char const x)
{
static char const letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
assert(isupper(x));
return strchr(letters, x) - letters + 1;

}
/* Function: Is_ValidEuroSerial

Returns 1 if valid, 0 if invalid, or -1 if syntax error.
Loops through the characters, summing with each iteration.
Release Mode: UNSAFE because behaviour is undefined if pointer is
invalid
Debug Mode: SAFE because assertion fails if pointer is invalid
*/
int Is_ValidEuroSerial(char const *p)
{
int const dummy = ( assert(p), 0 );
char const *const pend = p + SERIAL_LEN;
unsigned sum;
if(!isupper(*p)) return -1;
sum = NumberFromUpperLetter(*p++);
do
{
if(!isdigit(*p)) return -1;
sum += DigitFromChar(*p++);
} while (pend != p);
if (*pend) return -1;
if (8 == sum%9) return 1;
else return 0;

}
int main(void)
{
char input[SERIAL_LEN + 1];

char const *output_string;
puts("Enter Euro banknote serial number: ");
fgets(input,sizeof input / sizeof *input, stdin);
switch (Is_ValidEuroSerial(input))
{
case -1: output_string = "\n\nInvalid Input. Input must consist "
"of an uppercase letter followed by "
"eleven digits only.\n";
break;
case 0: output_string = "\n\nINVALID\n";
break;
case 1: output_string = "\n\nValid\n";
break;
}
puts(output_string);
return 0;

}

Aug 24 '07 #20

Richard Heathfield

wa****@eircom.net said:

<snip>

unsigned DigitFromChar(char const x)
{
assert( x >= '0' && x <= '9' );

Better: assert(isdigit(x));

<snip>

int Is_ValidEuroSerial(char const *p)
{
int const dummy = ( assert(p), 0 );

Poor style. You don't need this value. If the assertion fails, the
program terminates without using the value. If the assertion is absent
or doesn't fire, dummy gets the value 0, always, and then you don't use
it. What's the point?

char const *const pend = p + SERIAL_LEN;
unsigned sum;
if(!isupper(*p)) return -1;
sum = NumberFromUpperLetter(*p++);
do
{
if(!isdigit(*p)) return -1;
sum += DigitFromChar(*p++);
} while (pend != p);

I'm not saying you should, necessarily, but were you aware that you can,
if you wish, cast out 9s as you go?

sum += DigitFromChar(*p++);
sum %= 9;

Anyway - yes, that looks reasonable. You might want to check the return
value of fgets, though - always a good idea to ensure that a requested
resource was in fact made available, rather than just to assume it.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Aug 24 '07 #21

user923005

On Aug 24, 3:00 am, Army1987 <army1...@NOSPAM.itwrote:

On Thu, 23 Aug 2007 18:49:31 -0700, user923005 wrote:
On Aug 23, 6:42 pm, Eric Sosman <esos...@ieee-dot-org.invalidwrote:
war...@eircom.net wrote:
[...]
assert(isupper(x));

switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;

Gyuuggh ...

A naive programmer who hasn't yet learned that all the
world isn't ASCII might write

if ('A' <= x && x <= 'Z')
return x - 'A' + 1;
else
vomit();

A slightly more sophisticated programmer might use a
look-up table. But a programmer who knows what's in his
library -- and assiduous use of the library was, I believe,
part of the assignment -- would write something more like

static const char alphabet[]
= "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const char *px = strchr(alphabet, x);
if (px != NULL)
return px - alphabet + 1;
else
vomit();

/* Since I wanted to do some math directly from the switch, I used a
switch anyway like this: */

#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define VALID_SERIAL_NUMBER (0)
#define INVALID_LENGTH (-1)
#define INVALID_FORMAT (-2)
#define INVALID_CHECKSUM (-3)

static unsigned letter_to_sum(int letter, int *err)
{
int sum = 0;
assert(isalpha(letter));
switch (letter) {
case 'A':
sum = 2;
break;
case 'B':
sum = 3;
break;
[snip]
case 'H':
sum = 9;

What would be wrong with sum = 0 here, then?

It is neither better nor worse.

break;
[snip]
default:
*err = 1;
break;
}
return sum;
}

static unsigned digit_to_sum(int digit, int *err)
{
int sum = 0;
assert(isdigit(digit));
switch (digit) {

Whoa. How is it better than sum = digit - '0'?

Neither better nor worse.

case '0':
sum = 0;
break;
case '1':
sum = 1;
break;
case '2':
sum = 2;
break;
case '3':
sum = 3;
break;
case '4':
sum = 4;
break;
case '5':
sum = 5;
break;
case '6':
sum = 6;
break;
case '7':
sum = 7;
break;
case '8':
sum = 8;
break;
case '9':
sum = 9;
break;
default:
*err = 1;
break;
}
return sum;
}

static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[9], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);

I don't think the OP's teacher means *this* by "the code must be
efficient. How many nanoseconds it is going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?

It's O(1). A loop of 12 iteratsions is also O(1). There is no
difference.

if (err != 0)
return 1;
return sum % 9;
}
[snip]
char string[32767];

"The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms." IOW, 32 KB are much, much
more than needed.

512 MB of RAM is $42:
http://www.archmemory.com/index.asp?...Category=42478
so 32K is $ 0.002625, but point taken.

Also, identifiers starting with str followed by a lowercase letter
are reserved.

I realize that function names of that format are reserved:
7.26.10 General utilities <stdlib.h>
1 Function names that begin with str and a lowercase letter may be
added to the declarations in the <stdlib.hheader.
7.26.11 String handling <string.h>
1 Function names that begin with str, mem, or wcs and a lowercase
letter may be added to the declarations in the <string.hheader.

Can you give me a citation that proves variable names are likewise
reserved to the implementation?

Aug 24 '07 #22

Army1987

On Fri, 24 Aug 2007 08:46:32 -0400, Eric Sosman wrote:

Army1987 wrote:
>On Thu, 23 Aug 2007 22:53:24 -0400, Eric Sosman wrote:
>> Hint: Can you think of a simpler way to write O(27)?
Yes. O(1). O(27) means bounded time. Any algorithm which has O(27)
runtime has also O(1) runtime, but with a constant 27 times
larger, and viceversa. You were thinking about O(n)?

<topicality level="dubious">

O(27) means bounded time (if we're writing about time,
which in this case we are). So do O(1) and O(n) and O(n*n)
and O(exp(n)) and O(f(n)) for arbitrary f.

Well, I was using 'bounded' in a stricter sense.
Yes, O(exp(exp(n))) and O(busy_beaver(n)) both imply that the
algorithm will eventually terminate, no matter how large n is.

But O(1) and O(27) mean that the limsup as n approaches infinity
of runtime(n) / 1 (or runtime(n) / 27) is finite. IOW there is a
upper bound M such as for any n, the runtime is less than M.

(We both agree that O(27) and O(1) are synonymous... So what was
the statement about strchr being O(27) about?)

--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 24 '07 #23

Army1987

On Fri, 24 Aug 2007 11:48:55 -0700, user923005 wrote:

On Aug 24, 3:00 am, Army1987 <army1...@NOSPAM.itwrote:
>On Thu, 23 Aug 2007 18:49:31 -0700, user923005 wrote:
On Aug 23, 6:42 pm, Eric Sosman <esos...@ieee-dot-org.invalidwrote:
war...@eircom.net wrote:

static unsigned digit_to_sum(int digit, int *err)
{
int sum = 0;
assert(isdigit(digit));
switch (digit) {

Whoa. How is it better than sum = digit - '0'?

Neither better nor worse.

For me, it isn't. I won't take more time to copy and paste that
than to write sum = digit - '0'. For you I'd expect a difference,
but if you don't mind bothering to type that, that's your
business. Also, why waste more than one screenful when one line
suffices?

case '0':
sum = 0;
break;

[snip]

case '9':
sum = 9;
break;
default:
*err = 1;
break;
}
return sum;
}

static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[9], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);

I don't think the OP's teacher means *this* by "the code must be
efficient. How many nanoseconds it is going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?

It's O(1). A loop of 12 iteratsions is also O(1). There is no
difference.

So why did you bother typing *that*?

if (err != 0)
return 1;
return sum % 9;
}
[snip]
char string[32767];

"The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms." IOW, 32 KB are much, much
more than needed.

512 MB of RAM is $42:
http://www.archmemory.com/index.asp?...Category=42478
so 32K is $ 0.002625, but point taken.

C89 does not require an implementation to allow more than 32 KB
of auto variables. And anyway you just need 13 bytes for that...
(If the twelfth isn't '\n' the string is too long.)

>Also, identifiers starting with str followed by a lowercase letter
are reserved.

I realize that function names of that format are reserved:
7.26.10 General utilities <stdlib.h>
1 Function names that begin with str and a lowercase letter may be
added to the declarations in the <stdlib.hheader.
7.26.11 String handling <string.h>
1 Function names that begin with str, mem, or wcs and a lowercase
letter may be added to the declarations in the <string.hheader.

Can you give me a citation that proves variable names are likewise
reserved to the implementation?

My wrong. They are only if they have external linkage. 7.1.3.
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 24 '07 #24

Eric Sosman

Army1987 wrote On 08/24/07 15:45,:

On Fri, 24 Aug 2007 08:46:32 -0400, Eric Sosman wrote:

>>Army1987 wrote:

>>>On Thu, 23 Aug 2007 22:53:24 -0400, Eric Sosman wrote:

Hint: Can you think of a simpler way to write O(27)?

Yes. O(1). O(27) means bounded time. Any algorithm which has O(27)
runtime has also O(1) runtime, but with a constant 27 times
larger, and viceversa. You were thinking about O(n)?

<topicality level="dubious">

O(27) means bounded time (if we're writing about time,
which in this case we are). So do O(1) and O(n) and O(n*n)
and O(exp(n)) and O(f(n)) for arbitrary f.

Well, I was using 'bounded' in a stricter sense.
Yes, O(exp(exp(n))) and O(busy_beaver(n)) both imply that the
algorithm will eventually terminate, no matter how large n is.

But O(1) and O(27) mean that the limsup as n approaches infinity
of runtime(n) / 1 (or runtime(n) / 27) is finite. IOW there is a
upper bound M such as for any n, the runtime is less than M.

(We both agree that O(27) and O(1) are synonymous... So what was
the statement about strchr being O(27) about?)

A deliberate abuse to draw attention to the fact that
the first argument to strchr() is an unvarying, constant
string. The time strchr() requires for a search is bounded
by the worst-case time to search that constant string, and
this worst-case time is a constant[*]. To put it another
way, there is "no n" that increases and drives some value
towards a limiting case. Suggesting a `switch' because it
is "most likely [...] O(1)" is an abuse of big-O[**].

... to which I responded with a deliberate abuse of
a different sort, intending to draw attention to Keith's
misstatement. (I just got through reading "Plato and a
Platypus Walk Into a Bar," and it's made me too subtle
for my own good ...)
[*] Well, no, but the analyses we can carry out in
C-land are not able to deal with pipeline stalls, cache
effects, TLB misses, and all the rest.

[**] Please note that I do not claim `switch' is
slower than strchr() or vice versa, but something rather
different: It's the reason given for using `switch' that
is wrong, not necessarily the conclusion itself.

--
Er*********@sun.com

Aug 24 '07 #25

user923005

On Aug 24, 12:57 pm, Army1987 <army1...@NOSPAM.itwrote:
[snip]

static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[9], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);

I don't think the OP's teacher means *this* by "the code must be
efficient. How many nanoseconds it is going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?

It's O(1). A loop of 12 iteratsions is also O(1). There is no
difference.

So why did you bother typing *that*?

The first iteration is different, and the others were all cut and
paste.
It was not any more difficult than typing the loop.

if (err != 0)
return 1;
return sum % 9;
}
[snip]
char string[32767];

"The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms." IOW, 32 KB are much, much
more than needed.

512 MB of RAM is $42:
http://www.archmemory.com/index.asp?...Category=42478
so 32K is $ 0.002625, but point taken.

C89 does not require an implementation to allow more than 32 KB
of auto variables.

It is not an auto variable.

And anyway you just need 13 bytes for that...
(If the twelfth isn't '\n' the string is too long.)

Making it the exact length is a mistake. When someone inputs the
string, they should be able to type something too long and correct it.
Of course, 32767 is utter overkill, but it is a habit I have for
simple demo programs.

[snip]

Aug 24 '07 #26

cpedant

On Aug 24, 4:31 pm, Eric Sosman <Eric.Sos...@sun.comwrote:

Army1987 wrote On 08/24/07 15:45,:

On Fri, 24 Aug 2007 08:46:32 -0400, Eric Sosman wrote:

>Army1987 wrote:

>>On Thu, 23 Aug 2007 22:53:24 -0400, Eric Sosman wrote:

>> Hint: Can you think of a simpler way to write O(27)?

>>Yes. O(1). O(27) means bounded time. Any algorithm which has O(27)
runtime has also O(1) runtime, but with a constant 27 times
larger, and viceversa. You were thinking about O(n)?

><topicality level="dubious">

O(27) means bounded time (if we're writing about time,
which in this case we are). So do O(1) and O(n) and O(n*n)
and O(exp(n)) and O(f(n)) for arbitrary f.

No, this is an abuse of the word "bounded". Plain "bounded" is
conventionally used to mean "bounded by a constant". "bounded by f(n)"
is used to indicate boundedness by a specific function. If plain
"bounded" were used in the way that you have used it, then it would be
a meaningless tautology because every function f(n) would then
trivially be "bounded": f(n) < f(n) + 1.

>
Well, I was using 'bounded' in a stricter sense.
Yes, O(exp(exp(n))) and O(busy_beaver(n)) both imply that the
algorithm will eventually terminate, no matter how large n is.

But O(1) and O(27) mean that the limsup as n approaches infinity
of runtime(n) / 1 (or runtime(n) / 27) is finite. IOW there is a
upper bound M such as for any n, the runtime is less than M.

(We both agree that O(27) and O(1) are synonymous... So what was
the statement about strchr being O(27) about?)

A deliberate abuse to draw attention to the fact that
the first argument to strchr() is an unvarying, constant
string. The time strchr() requires for a search is bounded
by the worst-case time to search that constant string, and
this worst-case time is a constant[*]. To put it another
way, there is "no n" that increases and drives some value
towards a limiting case. Suggesting a `switch' because it
is "most likely [...] O(1)" is an abuse of big-O[**].

... to which I responded with a deliberate abuse of
a different sort, intending to draw attention to Keith's
misstatement. (I just got through reading "Plato and a
Platypus Walk Into a Bar," and it's made me too subtle
for my own good ...)
[*] Well, no, but the analyses we can carry out in
C-land are not able to deal with pipeline stalls, cache
effects, TLB misses, and all the rest.

[**] Please note that I do not claim `switch' is
slower than strchr() or vice versa, but something rather
different: It's the reason given for using `switch' that
is wrong, not necessarily the conclusion itself.

Yes, the point is that the "runtime" of the function in question is
bounded by a constant regardless of whether or not strchr is used
(assuming strchr is not written in some absurd way).

Keith's suggestion to use a switch because it is "most likely [...]
O(1)" is correct. The misnomer is his implication that our function is
not "O(1)" if strchr is used. strchr itself is likely "O(n)", but that
does not make our function "O(n)" for the reasons you have stated.

I find such informal (ab)use of big-O notation often leads to nothing
but confusion when its semantics start being discussed, and downright
misleading for people who are not very familiar with big-O. Formally
speaking, the "runtime" of our function actually lies in each of O(1),
O(n), O(n^2), O(2^n), O(n^n), and (infinitely) many more distinct
classes of functions. Informal usage of "O(f(n))" is almost always
meant to correspond to the lesser-known Theta(f(n)) function class.

Aug 24 '07 #27

Keith Thompson

user923005 <dc*****@connx.comwrites:

On Aug 24, 12:57 pm, Army1987 <army1...@NOSPAM.itwrote:
[snip]

static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);

>I don't think the OP's teacher means *this* by "the code must be
efficient. How many nanoseconds it is going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?

It's O(1). A loop of 12 iteratsions is also O(1). There is no
difference.

So why did you bother typing *that*?

The first iteration is different, and the others were all cut and
paste.
It was not any more difficult than typing the loop.

[snip]

But it's much harder to read, and any chunk of code will be read many
more times than it's written. The run time of your code is not the
only consideration.

What if there's a typo on one of the lines? It's likely that you'll
never notice it, whereas a single digit_to_sum() call in a loop is
easier to maintain.

I changed one of the quoted lines of code. Did you notice? Didn't
think so.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Aug 24 '07 #28

user923005

On Aug 24, 4:00 pm, Keith Thompson <ks...@mib.orgwrote:

user923005 <dcor...@connx.comwrites:
On Aug 24, 12:57 pm, Army1987 <army1...@NOSPAM.itwrote:
[snip]
static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);

I don't think the OP's teacher means *this* by "the code must be
efficient. How many nanoseconds it is going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?

It's O(1). A loop of 12 iteratsions is also O(1). There is no
difference.

So why did you bother typing *that*?

The first iteration is different, and the others were all cut and
paste.
It was not any more difficult than typing the loop.

[snip]

But it's much harder to read, and any chunk of code will be read many
more times than it's written. The run time of your code is not the
only consideration.

What if there's a typo on one of the lines? It's likely that you'll
never notice it, whereas a single digit_to_sum() call in a loop is
easier to maintain.

Are you saying that seeing a one character typo in the above is harder
than seeing a one character typo in a loop?

I changed one of the quoted lines of code. Did you notice? Didn't
think so.

It's quite easy to spot. Certainly no more difficult than spotting a
typo in a loop. Loops are also prone to off-by-one fencepost errors,
where the inline version is less likely to suffer that defect.
At any rate, I do not consider a loop in any way superior. At some
point (e.g. 25 lines or so) a loop is probably better just because it
is a bit less tedious. I have a tendency to unroll loops that comes
from programming way, way, back in the day when compilers did a poor
job of it. So maybe it is a poor habit on my part, since others seem
to have an objection to it. But I firmly do not believe that it will
cause a greater defect rate or difficulty in maintenance.

Aug 25 '07 #29

Keith Thompson

user923005 <dc*****@connx.comwrites:

On Aug 24, 4:00 pm, Keith Thompson <ks...@mib.orgwrote:
>user923005 <dcor...@connx.comwrites:
On Aug 24, 12:57 pm, Army1987 <army1...@NOSPAM.itwrote:
[snip]
static int compute_checksum(const char *const serial)
{
unsigned sum = 0;
int err = 0;
sum += letter_to_sum(serial[0], &err);
sum += digit_to_sum(serial[1], &err);
sum += digit_to_sum(serial[2], &err);
sum += digit_to_sum(serial[3], &err);
sum += digit_to_sum(serial[4], &err);
sum += digit_to_sum(serial[5], &err);
sum += digit_to_sum(serial[6], &err);
sum += digit_to_sum(serial[7], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[8], &err);
sum += digit_to_sum(serial[10], &err);
sum += digit_to_sum(serial[11], &err);

>I don't think the OP's teacher means *this* by "the code must be
efficient. How many nanoseconds it is going to save compared to a
loop if all the optimizations are turned off? How many compilers
won't turn a loop into this if they knew it'd be faster and the
optimizations are turned to a decent level?

It's O(1). A loop of 12 iteratsions is also O(1). There is no
difference.

>So why did you bother typing *that*?

The first iteration is different, and the others were all cut and
paste.
It was not any more difficult than typing the loop.

[snip]

But it's much harder to read, and any chunk of code will be read many
more times than it's written. The run time of your code is not the
only consideration.

What if there's a typo on one of the lines? It's likely that you'll
never notice it, whereas a single digit_to_sum() call in a loop is
easier to maintain.

Are you saying that seeing a one character typo in the above is harder
than seeing a one character typo in a loop?

Yes, but that's not my main point. It's easier to *make* a one
character typo, simply because the unnecessary multiple lines
introduce more opportunites for error. And it's harder to maintain,
since there are N places that have to be updated consistently.

>I changed one of the quoted lines of code. Did you notice? Didn't
think so.

It's quite easy to spot. Certainly no more difficult than spotting a
typo in a loop. Loops are also prone to off-by-one fencepost errors,
where the inline version is less likely to suffer that defect.
At any rate, I do not consider a loop in any way superior. At some
point (e.g. 25 lines or so) a loop is probably better just because it
is a bit less tedious. I have a tendency to unroll loops that comes
from programming way, way, back in the day when compilers did a poor
job of it. So maybe it is a poor habit on my part, since others seem
to have an objection to it. But I firmly do not believe that it will
cause a greater defect rate or difficulty in maintenance.

Yes, I think it's a bad habit. You're doing manual
micro-optimization; the compiler is likely to be able to do a better
job of it than you are (particularly if it knows it can get faster
code *without* unrolling the loop).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Aug 25 '07 #30

Army1987

On Fri, 24 Aug 2007 15:21:33 -0700, user923005 wrote:

char string[32767];

>"The program must be fully portable and compliant with
the C89 Standard. The program should exploit the standard library
where possible. The program should also be expected to perform
efficiently, both in terms of resource consumption and execution
speed, on a wide variety of platforms." IOW, 32 KB are much, much
more than needed.

512 MB of RAM is $42:
http://www.archmemory.com/index.asp?...Category=42478
so 32K is $ 0.002625, but point taken.

C89 does not require an implementation to allow more than 32 KB
of auto variables.

It is not an auto variable.

Whoops... snipped too much. Sorry for that. (I'd better check in
the *original* code before making such comments, next time. I was
mis-remembering having seen it declared in main.)

>And anyway you just need 13 bytes for that...
(If the twelfth isn't '\n' the string is too long.)

Making it the exact length is a mistake. When someone inputs the
string, they should be able to type something too long and correct it.

A serial number on a euro banknote is 12 characters. If you can
read 13 characters from stdin without hitting the end-of-file, and
the thirteen character isn't a newline (or whitespace), you know
that the serial number input is not valid because it is too long.

Of course, 32767 is utter overkill, but it is a habit I have for
simple demo programs.

All right, I understand that (though I tend to use less
ridiculously large numbers for those).

--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 25 '07 #31

CBFalconer

cp*****@gmail.com wrote:

On Aug 23, 6:12 pm, war...@eircom.net wrote:
[snip]
>unsigned NumberFromUpperLetter(char const x) {
assert(isupper(x));

switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;
}
}

A possibly minor point: there is no return statement executed in
the case that x is not one of 'A'-'Z'.

Foully overlong and tricky to write code, IMO. Try:

unsigned NumberFromUpperLetter(char const x) {
static const char UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";
int ch;
char *p;

ch = toupper((unsigned char)x);
if (p = strchr(UC, ch)) return p - &UC[0];
else return 0;
} /* untested */

Note the absence of any foul assert calls. All non-letters return
zero. By changing the function name and the string UC you can
search and convert any collection of chars. Works regardless of
the char set in effect.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>
--
Posted via a free Usenet account from http://www.teranews.com

Aug 25 '07 #32

warint

CBFalconer:

unsigned int NumberFromUpperLetter(char const x) {
static const char UC[] = " ABCDEFGHIJKLMNOPQRSTUVW";
const char *p;

if (p = strchr((unsigned char)x, UC)) return p - &UC[0];
else return 0;
} /* untested */

Before I go to write a function, or any code really, I decide how
"safe" it should be. I decide whether all the input to a function
should be non-erronous, or whether it's the function's duty to check
for errors. I do this for two reasons:

1: Efficiency (i.e. fast execution time)
2: Erradicate redundant code before it's even written

Notwithstanding this though, debug-mode code such as "assert" can be
put anywhere so long as it doesn't hinder the release-mode executable.
(That's *my* religion in anyway).

In the case of converting an uppercase letter to a number, I decided
that it was the CALLING function's duty to make sure there was no
erronous data, leaving me with a function something like:

unsigned UpperLetterToNumber(char const x)
{
static char const letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

assert(x); /* Redundant, but OK because it only works in debug
mode */

return strchr(letters,x) - letters + 1;
}

Martin

Aug 25 '07 #33

Martin Wells

assert(x); /* Redundant, but OK because it only works in debug

assert(isupper(x));

Aug 25 '07 #34

pete

CBFalconer wrote:

>
cp*****@gmail.com wrote:
On Aug 23, 6:12 pm, war...@eircom.net wrote:
[snip]
unsigned NumberFromUpperLetter(char const x) {
assert(isupper(x));

switch (x)
{
case 'A': return 1; case 'B': return 2; case 'C': return 3;
case 'D': return 4; case 'E': return 5; case 'F': return 6;
case 'G': return 7; case 'H': return 8; case 'I': return 9;
case 'J': return 10; case 'K': return 11; case 'L': return 12;
case 'M': return 13; case 'N': return 14; case 'O': return 15;
case 'P': return 16; case 'Q': return 17; case 'R': return 18;
case 'S': return 19; case 'T': return 20; case 'U': return 21;
case 'V': return 22; case 'W': return 23; case 'X': return 24;
case 'Y': return 25; case 'Z': return 26;
}
}
A possibly minor point: there is no return statement executed in
the case that x is not one of 'A'-'Z'.

Foully overlong and tricky to write code, IMO. Try:

unsigned NumberFromUpperLetter(char const x) {
static const char UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";

static const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";

int ch;
char *p;

ch = toupper((unsigned char)x);
if (p = strchr(UC, ch)) return p - &UC[0];
else return 0;
} /* untested */

Note the absence of any foul assert calls. All non-letters return
zero. By changing the function name and the string UC you can
search and convert any collection of chars. Works regardless of
the char set in effect.

--
pete

Aug 25 '07 #35

Martin Wells

pete:

(In the context of a local variable within a function)

static const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";

In the original example, there was a little bit of merit in declaring
the array as static:

static char const letters[] = ...

The "little bit of merit" to which I allude is that, if we'd instead
chosen not to define it as static, then the final executable might
store the string on a stack -- which is what we don't want in terms of
efficiency.

However, pete, I don't see why you'd bother making "UC" static.

Using "static" as an optimisation for local variables is usually only
done for arrays and such because of their size.

Of course, it could be argued that we shouldn't be using "static" at
all for this purpose because it's the *compiler's* job to make things
work well, but nonetheless I don't see why you'd go to the bother of
making a simple pointer static.

Martin

Aug 26 '07 #36

Ian Collins

Martin Wells wrote:

pete:

(In the context of a local variable within a function)
> static const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";

In the original example, there was a little bit of merit in declaring
the array as static:

static char const letters[] = ...

The "little bit of merit" to which I allude is that, if we'd instead
chosen not to define it as static, then the final executable might
store the string on a stack -- which is what we don't want in terms of
efficiency.

However, pete, I don't see why you'd bother making "UC" static.

Using "static" as an optimisation for local variables is usually only
done for arrays and such because of their size.

It probably makes no difference in the case of a string literal.

--
Ian Collins.

Aug 26 '07 #37

pete

Martin Wells wrote:

>
pete:

(In the context of a local variable within a function)
static const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";

In the original example, there was a little bit of merit in declaring
the array as static:

static char const letters[] = ...

The "little bit of merit" to which I allude is that, if we'd instead
chosen not to define it as static, then the final executable might
store the string on a stack -- which is what we don't want in terms of
efficiency.

However, pete, I don't see why you'd bother making "UC" static.

You're right, I didn't pay proper attention to the code.
What I did was to make UC into a pointer.
CBFalconer made a typo and declared UC as a (static const char).

Maybe a little better might be:

const char *const UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";

--
pete

Aug 26 '07 #38

pete

Martin Wells wrote:

>
pete:

(In the context of a local variable within a function)
static const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";

In the original example, there was a little bit of merit in declaring
the array as static:

static char const letters[] = ...

The "little bit of merit" to which I allude is that, if we'd instead
chosen not to define it as static, then the final executable might
store the string on a stack -- which is what we don't want in terms of
efficiency.

I don't consider things like "the stack" and "the heap"
on this newsgroup.
The pointer declaration, defines two objects:
1 the pointer,
2 and the object refered to by the string literal.
The object refered to by the string literal, has static duration.

However, pete, I don't see why you'd bother making "UC" static.

Using "static" as an optimisation for local variables is usually only
done for arrays and such because of their size.

Of course, it could be argued that we shouldn't be using "static" at
all for this purpose because it's the *compiler's* job to make things
work well, but nonetheless I don't see why you'd go to the bother of
making a simple pointer static.

If we consider the choice as being between
static const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";
and
const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";
then the merit of the first,
is that the initialization occurs only once,
and the merit of the second is that it doesn't
define an extra object with static duration.
It doesn't really make much difference to me.

I like
static char const letters[] = ...
just fine.

A definition like
static char const letters[] = {'\0'};
defines only one object.

I've debated whether or not a definition like
static char const letters[] = "";
defines two objects. I think it doesn't.

The standard says that two definitions like
the ones shown above here for the letters array, are identical;
The question is: Just exactly how identical are they?

--
pete

Aug 26 '07 #39

Martin Wells

pete:

I don't consider things like "the stack" and "the heap"
on this newsgroup.

Indeed the Standard doesn't mention them. If we're writing code which
we expect to perform efficiently though, then I think we have to pay a
little attention to how 99% (if not 100%) of implementations actually
get things done. Isn't that what led to the popularity of "*p++"? Even
with that said though, and I think you've implied, there's no
guarantee that static local data won't use a stack, or that automatic
local data will. I wouldn't be surprised at all if a compiler made
identical machine code for:

static char const letters[] = ...

and:

char const letters[] = ...

If we consider the choice as being between
static const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";
and
const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";
then the merit of the first,
is that the initialization occurs only once,
and the merit of the second is that it doesn't
define an extra object with static duration.

Actually, rather than any initialisation taking place at all, I
wouldn't expect either of them to result in the formation of a
variable in which to store the address of the string literal -- but
rather for it to work something like a #define.

I've debated whether or not a definition like
static char const letters[] = "";
defines two objects. I think it doesn't.

>From what I can see, that should create an array of char's of length

1, and the single element should be a null character. Quite identical
to:

static char const letters[1] = "";

and:

static char const letters[1] = {'\0'};

Martin

Aug 26 '07 #40

Martin Wells

I've debated whether or not a definition like
static char const letters[] = "";
defines two objects. I think it doesn't.
From what I can see, that should create an array of char's of length

1, and the single element should be a null character. Quite identical
to:

static char const letters[1] = "";

and:

static char const letters[1] = {'\0'};

I messed up my editing there. That shuda read:

>From what I can see, that should create an array of char's of length

1, and the single element should be a null character. Quite identical
to:

static char const letters[1] = "";
and:
static char const letters[1] = {'\0'};

Aug 26 '07 #41

pete

Martin Wells wrote:

>
pete:

I don't consider things like "the stack" and "the heap"
on this newsgroup.

Indeed the Standard doesn't mention them. If we're writing code which
we expect to perform efficiently though, then I think we have to pay a
little attention to how 99% (if not 100%) of implementations actually
get things done.

What I mostly discuss on this newgroup,
is the meaning of the C code.
I'm more interested what the code can do on the DS9K
than I am in what it does on a Windows box.

Isn't that what led to the popularity of "*p++"?

I read about it in K&R.
It says that the notational convenience is considerable
and that the idiom should be mastered,
if for no other reason
than that it is frequently seen in C programs.

Even
with that said though, and I think you've implied, there's no
guarantee that static local data won't use a stack, or that automatic
local data will. I wouldn't be surprised at all if a compiler made
identical machine code for:

static char const letters[] = ...

and:

char const letters[] = ...

If we consider the choice as being between
static const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";
and
const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";
then the merit of the first,
is that the initialization occurs only once,
and the merit of the second is that it doesn't
define an extra object with static duration.

Actually, rather than any initialisation taking place at all, I
wouldn't expect either of them to result in the formation of a
variable in which to store the address of the string literal -- but
rather for it to work something like a #define.

Like I said, I'm more interested in the meaning of the code.
I can microoptimise along with the best of them;
I don't need to discuss that here.

The most obvious difference in the meaning of those
two declarations, is that the address of UC
can only be returned from a function if UC is declared static.
Likewise for the address of letters in your above two examples.

>
I've debated whether or not a definition like
static char const letters[] = "";
defines two objects. I think it doesn't.

From what I can see, that should create an array of char's of length
1, and the single element should be a null character. Quite identical
to:

static char const letters[1] = "";

and:

static char const letters[1] = {'\0'};

Whether or not the "" string literal refers to an object
in the array initialisation is entirely academic,
because if it does,
then it refers to an anonymous object whose address
can't be known by the program.

--
pete

Aug 26 '07 #42

CBFalconer

pete wrote:

>

.... snip ...

>
You're right, I didn't pay proper attention to the code.
What I did was to make UC into a pointer.
CBFalconer made a typo and declared UC as a (static const char).

No typo. See the discussion I just posted.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 26 '07 #43

CBFalconer

Ian Collins wrote:

Martin Wells wrote:
pete:

(In the context of a local variable within a function)
>> static const char *UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";

In the original example, there was a little bit of merit in
declaring the array as static:

static char const letters[] = ...

The "little bit of merit" to which I allude is that, if we'd
instead chosen not to define it as static, then the final
executable might store the string on a stack -- which is what
we don't want in terms of efficiency.

However, pete, I don't see why you'd bother making "UC" static.

Using "static" as an optimisation for local variables is usually
only done for arrays and such because of their size.

It probably makes no difference in the case of a string literal.

I originally created the code using the array (not a pointer). The
idea was that a static array is stored fully initialized in the
code file, and is simply read into place. The const prevents
changing that array after initialization. This operation does not
have the heavy overhead that a non-static array would induce.
These facts are not universal, but often apply.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 26 '07 #44

Richard Heathfield

CBFalconer said:

pete wrote:
>>
... snip ...
>>
You're right, I didn't pay proper attention to the code.
What I did was to make UC into a pointer.
CBFalconer made a typo and declared UC as a (static const char).

No typo. See the discussion I just posted.

Sure looked like a typo to me. You may have meant to create an array,
but what you actually created was a char, and then you initialised it
illegally.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Aug 26 '07 #45

CBFalconer

Richard Heathfield wrote:

CBFalconer said:
>pete wrote:
>>>
... snip ...
>>>
You're right, I didn't pay proper attention to the code.
What I did was to make UC into a pointer.
CBFalconer made a typo and declared UC as a (static const char).

No typo. See the discussion I just posted.

Sure looked like a typo to me. You may have meant to create an
array, but what you actually created was a char, and then you
initialised it illegally.

I disagree. Here is a copy and paste from my original message:

static const char UC = " ABCDEFGHIJKLMNOPQRSTUVWXYZ";

Woops - now I see the missing []. Abject apologies. At least I
marked the code as untested.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 26 '07 #46

Army1987

On Sat, 25 Aug 2007 11:43:14 +0200, Army1987 wrote:

On Fri, 24 Aug 2007 15:21:33 -0700, user923005 wrote:

>char string[32767];

>>C89 does not require an implementation to allow more than 32 KB
of auto variables.

>It is not an auto variable.

Whoops... snipped too much. Sorry for that. (I'd better check in
the *original* code before making such comments, next time. I was
mis-remembering having seen it declared in main.)

<pedantry level="somewhat high">
If it is not declared as static and it has file scope, it has
external linkage. You aren't allowed to use the identifier
'string ' with external linkage. See 7.1.3.
</pedantry>
(fx:blush) Ok, I'll stop it...

--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 26 '07 #47

pete

Army1987 wrote:

>
On Sat, 25 Aug 2007 11:43:14 +0200, Army1987 wrote:

On Fri, 24 Aug 2007 15:21:33 -0700, user923005 wrote:

char string[32767];

>C89 does not require an implementation to allow more than 32 KB
of auto variables.

This thread, as you have quoted it here,
makes it seem as though the person who wrote that sentence,
thinks that 32767 excedes some translation limit in C89,
which it doesn't.

It is not an auto variable.

And it also makes it seem as though the translation
limit applies only to objects with automatic duration,
which as far as I can tell from the standard, it doesn't.

ISO/IEC 9899: 1990
5.2.4.1 Translation limits

— 32767 bytes in an object (in a hosted environment only)

--
pete

Aug 26 '07 #48

Army1987

On Fri, 24 Aug 2007 15:24:35 -0700, cpedant wrote:

I find such informal (ab)use of big-O notation often leads to nothing
but confusion when its semantics start being discussed, and downright
misleading for people who are not very familiar with big-O. Formally
speaking, the "runtime" of our function actually lies in each of O(1),
O(n), O(n^2), O(2^n), O(n^n), and (infinitely) many more distinct
classes of functions. Informal usage of "O(f(n))" is almost always
meant to correspond to the lesser-known Theta(f(n)) function class.

Not really. One would usually say that randomly shuffling an array
of N elements needs O(N log N) bits of entropy, but lg(N!) is not
Theta(N log N). Of course one usually tries to avoid ridiculous
overestimates when tighter ones are available, one doesn't claim
linear search to require O(busy_beaver(ackermann(n, n))) time when
we could say O(n), but in other cases one just finds a reasonable
upper bound.

(And virtually all the use of the big-O notation for runtime of
algorithms are abuse of notation. Does anybody care how long does
it take to sort an array of one googolplex members? If not, one
should a fortiori not care about the asymptotic behavior, which is
unrelated to the behavior of a function for the first few values
of N, even if by 'few' we mean busy_beaver(Graham's number).)

--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 26 '07 #49

pete

Martin Wells wrote:

>
pete:

I don't consider things like "the stack" and "the heap"
on this newsgroup.

Indeed the Standard doesn't mention them. If we're writing code which
we expect to perform efficiently though, then I think we have to pay a
little attention to how 99% (if not 100%) of implementations actually
get things done.

The most important skill that you can learn and develop
from following this newsgroup, is how to write C code
that means exactly what you think it means.

--
pete

Aug 27 '07 #50

Critique my assignment please

Similar topics