Reformulating a macro to use argument just once

Francois Grieu

Consider this macro

// check if x, assumed of type unsigned char, is in range [0x20..0x7E]
#define ISVALID(x) ((x)>=0x20 && (x)<=0x7E)

Of course, this can't be safely used as in
if (ISVALID(*p++)) foo();
where p is a pointer to unsigned char.
Unless I err, this issue can be fixed (and often, performance
improved) using

#define ISVALID(x) ((unsigned char)((x)-0x20)<=0x7E-0x20)

But can we do something similar about this one?

// check if x, assumed of type unsigned char,
// is in range [0x20..0x7E] or greater than 0xC0
#define ISVALID(x) ((x)>=0x20 && (x)<=0x7E || (x)>=0xC0)

Francois Grieu

Jan 16 '07 #1


Peter Nilsson

Francois Grieu wrote:

Consider this macro

// check if x, assumed of type unsigned char, is in range [0x20..0x7E]
#define ISVALID(x) ((x)>=0x20 && (x)<=0x7E)

Of course, this can't be safely used as in
if (ISVALID(*p++)) foo();
where p is a pointer to unsigned char.

So don't use it that way.

Note that ISVALID is in a class of reserved identifiers.

Unless I err, this issue can be fixed (and often, performance
improved) using

#define ISVALID(x) ((unsigned char)((x)-0x20)<=0x7E-0x20)

Why bother? The same issue exists for the ISXXXX macros in
<ctype.h>, but it generally _isn't_ a problem.

But can we do something similar about this one?

// check if x, assumed of type unsigned char,
// is in range [0x20..0x7E] or greater than 0xC0
#define ISVALID(x) ((x)>=0x20 && (x)<=0x7E || (x)>=0xC0)

If you assume x is in the range 0..0xFF, you can do...

#define VALID(x) (28245449 % (((x)/32+1)*6+5) == 0)
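
The constant in this macro factors as 17 * 23 * 29 * 47 * 53, and ((x)/32+1)*6+5 maps the eight 32-value buckets of 0..0xFF to 11, 17, 23, 29, 35, 41, 47, 53. The sketch below tabulates which buckets pass, assuming x lies in 0..0xFF. VALID_MODPRIME and bucket_mask_modprime are names made up here. Because the test only looks at (x)/32, a whole bucket passes or fails together, so 0x7F gets the same answer as 0x7E.

```c
#include <assert.h>

/* Hypothetical name for the macro above; the L suffix only makes the
   long type of the constant explicit. */
#define VALID_MODPRIME(x) (28245449L % (((x)/32+1)*6+5) == 0)

/* Bit k of the result is set when bucket k (values 32*k .. 32*k+31) passes. */
unsigned bucket_mask_modprime(void)
{
    unsigned mask = 0;

    /* The constant is the product of the divisors of the accepted buckets. */
    assert(17L * 23 * 29 * 47 * 53 == 28245449L);

    for (int k = 0; k < 8; k++) {
        if (VALID_MODPRIME(32 * k))
            mask |= 1u << k;
    }
    return mask;
}
```

Buckets 1, 2, 3, 6 and 7 pass, which corresponds to the ranges 0x20..0x7F and 0xC0..0xFF.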

--
Peter

Jan 17 '07 #2

Francois Grieu

In article <11**********************@51g2000cwl.googlegroups.com>,
"Peter Nilsson" <ai***@acay.com.au> wrote:

Francois Grieu wrote: (name and comment fixed)
// check if x, assumed of type unsigned char,
// is in range [0x20..0x7E] or at least 0xC0
#define VALID(x) ((x)>=0x20 && (x)<=0x7E || (x)>=0xC0)

If you assume x is in the range 0..0xFF, you can do...

#define VALID(x) (28245449 % (((x)/32+1)*6+5) == 0)

That won't work. After (x)/32, 0x7E and 0x7F are indistinguishable.
Francois Grieu

Jan 17 '07 #3

Keith Thompson

"Peter Nilsson" <ai***@acay.com.au> writes:

Francois Grieu wrote:
Consider this macro

// check if x, assumed of type unsigned char, is in range [0x20..0x7E]
#define ISVALID(x) ((x)>=0x20 && (x)<=0x7E)

Of course, this can't be safely used as in
if (ISVALID(*p++)) foo();
where p is a pointer to unsigned char.

So don't use it that way.

Note that ISVALID is in a class of reserved identifiers.

No, it ISn't.

Unless I err, this issue can be fixed (and often, performance
improved) using

#define ISVALID(x) ((unsigned char)((x)-0x20)<=0x7E-0x20)

Why bother? The same issue exists for the ISXXXX macros in
<ctype.h>, but it generally _isn't_ a problem.

There are no ISXXXX macros in <ctype.h>. There are a number of isXXXX
functions declared in <ctype.h> (which may also be implemented as
macros with the same names).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Jan 17 '07 #4

Ben Pfaff

Keith Thompson <ks***@mib.org> writes:

"Peter Nilsson" <ai***@acay.com.au> writes:
Francois Grieu wrote:
Consider this macro

// check if x, assumed of type unsigned char, is in range [0x20..0x7E]
#define ISVALID(x) ((x)>=0x20 && (x)<=0x7E)

Of course, this can't be safely used as in
if (ISVALID(*p++)) foo();
where p is a pointer to unsigned char.

So don't use it that way.

Note that ISVALID is in a class of reserved identifiers.

No, it ISn't.

In C89, the linker isn't required to be case-sensitive, so it's
risky to make that assertion.
--
"...deficient support can be a virtue.
It keeps the amateurs off."
--Bjarne Stroustrup

Jan 17 '07 #5

Keith Thompson

Ben Pfaff <bl*@cs.stanford.edu> writes:

Keith Thompson <ks***@mib.org> writes:
"Peter Nilsson" <ai***@acay.com.au> writes:
Francois Grieu wrote:
Consider this macro

// check if x, assumed of type unsigned char, is in range [0x20..0x7E]
#define ISVALID(x) ((x)>=0x20 && (x)<=0x7E)

Of course, this can't be safely used as in
if (ISVALID(*p++)) foo();
where p is a pointer to unsigned char.

So don't use it that way.

Note that ISVALID is in a class of reserved identifiers.
No, it ISn't.

In C89, the linker isn't required to be case-sensitive, so it's
risky to make that assertion.

Good point (but I wonder *how* risky it is).

I'm really glad that the E* identifiers in <errno.h> are macros;
otherwise any identifier starting with 'e' and a digit or letter could
be dangerous.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Jan 17 '07 #6

Peter Nilsson

Keith Thompson wrote:

"Peter Nilsson" <ai***@acay.com.au> writes:
Note that ISVALID is in a class of reserved identifiers.

No, it ISn't.

...The same issue exists for the ISXXXX macros in
<ctype.h>, but it generally _isn't_ a problem.

There are no ISXXXX macros in <ctype.h>.

Well, at least I was right about it not being a problem!

Thanks for correcting my bilge.

--
Peter

Jan 17 '07 #7

Peter Nilsson

Francois Grieu wrote:

"Peter Nilsson" <ai***@acay.com.au> wrote:
Francois Grieu wrote: (name and comment fixed)

Obviously I read (and wrote) hastily.

I was wrong about ISVALID being reserved, and I missed the 7E/7F
subtlety.
Apologies

// check if x, assumed of type unsigned char,
// is in range [0x20..0x7E] or at least 0xC0
#define VALID(x) ((x)>=0x20 && (x)<=0x7E || (x)>=0xC0)
If you assume x is in the range 0..0xFF, you can do...

#define VALID(x) (28245449 % (((x)/32+1)*6+5) == 0)

That won't work.

Good. It's not the sort of thing you should be doing anyway. ;-)

--
Peter

Jan 17 '07 #8

Francois Grieu

For the record, the thread is about reformulating the
following macro as another macro such that, like a function,
it evaluates its argument only once.

// Check if x, assumed to be of type unsigned char,
// lies in either [0x20..0x7E] or [0xC0..UCHAR_MAX]
#define VALID(x) ((x)>=0x20 && (x)<=0x7E || (x)>=0xC0)

Using a function (including inline), or table, no matter
how practical, is disregarded hereafter. To me this is now
an intellectual and mathematical challenge, although it came up
in practice (validating data according to European Regulation
3821/1985, Annex 1B)
"Peter Nilsson" <ai***@acay.com.au> proposed:

#define VALID(x) (28245449 % (((x)/32+1)*6+5) == 0)

Then retracted

It's not the sort of thing you should be doing anyway. ;-)

But that gave me an idea! Assuming UCHAR_MAX is 0xFF,
a solution optimized towards using the least characters is

#define VALID(x) (((x)+128)%160<95)

The use of % makes it a dog, performancewise, on many
architectures (embedded 8 bit). I'm looking for a more
efficient solution. Making no assumption on UCHAR_MAX
would be a nice plus.
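
A brute-force sketch of why the modulus form works, assuming UCHAR_MAX is 0xFF. Adding 128 moves the two accepted ranges to 160..254 and 320..383; reducing modulo 160 folds them onto 0..94 and 0..63, while every rejected value lands in 95..159. VALID_REF, VALID_MOD and the helper names are chosen for this illustration only.

```c
#include <assert.h>

/* Reference test; the argument may be evaluated up to three times. */
#define VALID_REF(x) (((x)>=0x20 && (x)<=0x7E) || (x)>=0xC0)
/* Hypothetical name for the single-evaluation form, parentheses balanced. */
#define VALID_MOD(x) (((x)+128)%160<95)

/* Number of values 0..0xFF on which the two macros disagree. */
int mod_mismatches(void)
{
    int n = 0;
    for (int v = 0; v <= 0xFF; v++) {
        unsigned char c = (unsigned char)v;
        if (VALID_REF(c) != VALID_MOD(c))
            n++;
    }
    return n;
}

/* Spot checks at each edge of the two accepted ranges. */
void check_mod_boundaries(void)
{
    assert(!VALID_MOD(0x1F) && VALID_MOD(0x20));
    assert(VALID_MOD(0x7E) && !VALID_MOD(0x7F));
    assert(!VALID_MOD(0xBF) && VALID_MOD(0xC0) && VALID_MOD(0xFF));
}
```

The argument is promoted to int before the addition, so (x)+128 cannot wrap for an 8-bit unsigned char.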
Francois Grieu

Jan 17 '07 #9

Arthur J. O'Dwyer

On Wed, 17 Jan 2007, Francois Grieu wrote:

For the record, the thread is about reformulating the
following macro as another macro such that, like a function,
it evaluates its argument only once.

// Check if x, assumed to be of type unsigned char,
// lies in either [0x20..0x7E] or [0xC0..UCHAR_MAX]
#define VALID(x) ((x)>=0x20 && (x)<=0x7E || (x)>=0xC0)

Using a function (including inline), or table, no matter
how practical, is disregarded hereafter. To me this is now
an intellectual and mathematical challenge, although it came
in practice (validating data according to European Regulation
3821/1985, Annex 1B)

You might have better luck in rec.puzzles or comp.programming,
since the question you're asking doesn't have anything to do with C
per se (except the coincidence that the macro is written in C).
The large number of C gurus here is probably irrelevant, because
a C guru wouldn't try to do what you're trying in the first place.

Crossposted and followups set to comp.programming and c.l.c.

But that gave me an idea! Assuming UCHAR_MAX is 0xFF,
a solution optimized towards using the least characters is

#define VALID(x) (((x)+128)%160<95)

The use of % makes it a dog, performancewise, on many
architectures (embedded 8 bit). I'm looking for a more
efficient solution. Making no assumption on UCHAR_MAX
would be a nice plus.

-Arthur

Jan 17 '07 #10

CBFalconer

Ben Pfaff wrote:

Keith Thompson <ks***@mib.org> writes:
"Peter Nilsson" <ai***@acay.com.au> writes:

.... snip ...

Note that ISVALID is in a class of reserved identifiers.

No, it ISn't.

In C89, the linker isn't required to be case-sensitive, so it's
risky to make that assertion.

Er, when did a macro last escape to be linked? :-)

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Jan 17 '07 #11

Francois Grieu

"Arthur J. O'Dwyer" <aj*******@andrew.cmu.edu> wrote in article
<Pi***********************************@unix33.andrew.cmu.edu>

On Wed, 17 Jan 2007, Francois Grieu wrote:

For the record, the thread is about reformulating the
following macro as another macro such that, like a function,
it evaluates its argument only once.

// Check if x, assumed to be of type unsigned char,
// lies in either [0x20..0x7E] or [0xC0..UCHAR_MAX]
#define VALID(x) ((x)>=0x20 && (x)<=0x7E || (x)>=0xC0)

Using a function (including inline), or table, no matter
how practical, is disregarded hereafter. To me this is now
an intellectual and mathematical challenge, although it came
in practice (validating data according to European Regulation
3821/1985, Annex 1B)

You might have better luck in rec.puzzles or comp.programming,
since the question you're asking doesn't have anything to do with C
per se (except the coincidence that the macro is written in C).

The subject, at least as stated, is too dependent on the definition
of C to be fit for rec.puzzles or even comp.programming.

Readers in these groups are not supposed to know about UCHAR_MAX,
unsigned char, why it matters that x appears only once in the whole
expression; much less the signedness of constants in C, which played
a role in some earlier parts of the thread.

The large number of C gurus here is probably irrelevant, because
a C guru wouldn't try to do what you're trying in the first place.

The problem of efficiently testing if a variable belongs to one of
a few intervals is very common, and I guess some C gurus would do

#include <limits.h>

/* define CharIsAlpha(x) testing if char x is a letter */
#if UCHAR_MAX==255 && 'A'==65 && 'Z'==90 && 'a'==97 && 'z'==122
/* unsigned char is 8 bit and the character set seems ASCII */
/* clear bit 5 to fold lower case onto upper case, then test 'A'..'Z' */
#define CharIsAlpha(x) ((unsigned char)(((unsigned char)(x)&223)-65)<26)
#else
#include <ctype.h>
#define CharIsAlpha(x) isalpha((unsigned char)(x))
#endif

The optimization shown has merits: it speeds things up by an order of
magnitude (no function call, no branch or cache miss), gives compact
code, and depends less on the runtime library. I can't think of a
platform where it fails unless the compiler is buggy (say, in what it
assumes the target character set to be).
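
One way to write such a fold-and-compare letter test, sketched under the assumption of 8-bit unsigned char and an ASCII execution character set. The mask 0xDF and offset 0x41 are this sketch's choice: clearing bit 5 maps 'a'..'z' onto 'A'..'Z', and subtracting 'A' leaves 0..25 exactly for letters. ASCII_IS_ALPHA and the helper names are hypothetical, and the comparison is made against explicit ASCII ranges rather than isalpha() to keep locales out of it.

```c
#include <assert.h>

/* Hypothetical single-evaluation ASCII letter test. */
#define ASCII_IS_ALPHA(x) \
    ((unsigned char)(((unsigned char)(x) & 0xDF) - 0x41) < 26)

/* Number of values 0..0xFF where the macro disagrees with the
   explicit ranges 'A'..'Z' (0x41..0x5A) and 'a'..'z' (0x61..0x7A). */
int alpha_mismatches(void)
{
    int n = 0;
    for (int v = 0; v <= 0xFF; v++) {
        int expected = (v >= 0x41 && v <= 0x5A) || (v >= 0x61 && v <= 0x7A);
        if (ASCII_IS_ALPHA(v) != expected)
            n++;
    }
    return n;
}

/* Spot checks on the characters just inside and outside the ranges. */
void check_alpha_samples(void)
{
    assert(ASCII_IS_ALPHA(0x41) && ASCII_IS_ALPHA(0x7A));   /* 'A', 'z' */
    assert(!ASCII_IS_ALPHA(0x40) && !ASCII_IS_ALPHA(0x7B)); /* '@', '{' */
}
```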
Francois Grieu

Jan 17 '07 #12

Eric Sosman

Ben Pfaff wrote:

Keith Thompson <ks***@mib.org> writes:

"Peter Nilsson" <ai***@acay.com.au> writes:
Francois Grieu wrote:
Consider this macro

// check if x, assumed of type unsigned char, is in range [0x20..0x7E]
#define ISVALID(x) ((x)>=0x20 && (x)<=0x7E)
[...]

Note that ISVALID is in a class of reserved identifiers.
No, it ISn't.

In C89, the linker isn't required to be case-sensitive, so it's
risky to make that assertion.

... but he's using ISVALID as the identifier for a macro,
and such identifiers have no linkage.

--
Eric Sosman
es*****@acm-dot-org.invalid

Jan 17 '07 #13

hagman

Arthur J. O'Dwyer schrieb:

On Wed, 17 Jan 2007, Francois Grieu wrote:

For the record, the thread is about reformulating the
following macro as another macro such that, like a function,
it evaluates its argument only once.

// Check if x, assumed to be of type unsigned char,
// lies in either [0x20..0x7E] or [0xC0..UCHAR_MAX]
#define VALID(x) ((x)>=0x20 && (x)<=0x7E || (x)>=0xC0)

Using a function (including inline), or table, no matter
how practical, is disregarded hereafter. To me this is now
an intellectual and mathematical challenge, although it came
in practice (validating data according to European Regulation
3821/1985, Annex 1B)

You might have better luck in rec.puzzles or comp.programming,
since the question you're asking doesn't have anything to do with C
per se (except the coincidence that the macro is written in C).
The large number of C gurus here is probably irrelevant, because
a C guru wouldn't try to do what you're trying in the first place.

Crossposted and followups set to comp.programming and c.l.c.

But that gave me an idea! Assuming UCHAR_MAX is 0xFF,
a solution optimized towards using the least characters is

#define VALID(x) (((x)+128)%160<95)

The use of % makes it a dog, performancewise, on many
architectures (embedded 8 bit). I'm looking for a more
efficient solution. Making no assumption on UCHAR_MAX
would be a nice plus.

-Arthur

#define VALID(x) (((( ((x)>>5) ^4) *5)&~21)!=0)

After
(((x)>>5) ^4)
we have VALID <=> result not in {0,1,4}
Multiplication by 5 causes no overflow (as we already divided by 32),
hence the only multiples of 5 with all bits contained in 21 are
indeed 0,5,20.
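
The reasoning above can be traced bucket by bucket. This sketch assumes x in 0..0xFF; VALID_SHIFT and bucket_mask_shift are names invented here. For each of the eight values of (x)>>5 it computes the intermediate ((k^4)*5), checks it against the mask ~21, and records which buckets are accepted. As with any test that starts by discarding the low five bits, 0x7F falls in the same bucket as 0x7E.

```c
#include <assert.h>

/* Hypothetical name for the shift-xor-multiply macro above. */
#define VALID_SHIFT(x) (((( ((x)>>5) ^4) *5)&~21)!=0)

/* Bit k of the result is set when bucket k (values 32*k .. 32*k+31) passes. */
unsigned bucket_mask_shift(void)
{
    unsigned mask = 0;
    for (int k = 0; k < 8; k++) {
        int t = (k ^ 4) * 5;          /* 0, 5 or 20 for the rejected buckets */
        int pass = (t & ~21) != 0;    /* any bit outside 10101 binary? */
        assert(pass == VALID_SHIFT(32 * k));
        if (pass)
            mask |= 1u << k;
    }
    return mask;
}
```

Buckets 1, 2, 3, 6 and 7 are accepted, that is 0x20..0x7F and 0xC0..0xFF.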

Jan 18 '07 #14

Francois Grieu

In article <11*********************@a75g2000cwd.googlegroups.com>,
"hagman" <go****@von-eitzen.de> wrote:

On Wed, 17 Jan 2007, Francois Grieu wrote:
Reformulate the following macro as another macro such that,
like a function, it evaluates its argument only once.

// Check if x, assumed to be of type unsigned char,
// lies in either [0x20..0x7E] or [0xC0..UCHAR_MAX]
#define VALID(x) ((x)>=0x20 && (x)<=0x7E || (x)>=0xC0)

Using a function (including inline), or table, no matter
how practical, is disregarded hereafter. (..This) came
in practice, validating data according to European Regulation
3821/1985, Annex 1B, Appendix 4, section 4, page 108 in

http://eur-lex.europa.eu/LexUriServ/...0060501:EN:PDF

(..) Assuming UCHAR_MAX is 0xFF,
a solution optimized towards using the least characters is

#define VALID(x) (((x)+128)%160<95)

The use of % makes it a dog, performancewise, on many
architectures (embedded 8 bit). I'm looking for a more
efficient solution. Making no assumption on UCHAR_MAX
would be a nice plus.

#define VALID(x) (((( ((x)>>5) ^4) *5)&~21)!=0)

After (((x)>>5) ^4) we have VALID <=> result not in {0,1,4}
Multiplication by 5 causes no overflow (as we already divided by 32),
hence the only multiples of 5 with all bits contained in 21 are
indeed 0,5,20.

That works, except when x is 0x7F. Nice thing is that there is
no dependency on UCHAR_MAX, as noted.
Francois Grieu

Jan 18 '07 #15

Francois Grieu

I'm asking[*]

Reformulate the following macro as another macro such that,
like a function, it evaluates its argument only once.

// Check if x, assumed to be of type unsigned char,
// lies in either [0x20..0x7E] or [0xC0..UCHAR_MAX]
#define VALID(x) ((x)>=0x20 && (x)<=0x7E || (x)>=0xC0)

Using a function (including inline), or table, no matter
how practical, is disregarded hereafter.

(..) Assuming UCHAR_MAX is 0xFF,
a solution optimized towards using the least characters is

#define VALID(x) (((x)+128)%160<95)

The use of % makes it a dog, performancewise, on many
architectures (embedded 8 bit). I'm looking for a more
efficient solution. Making no assumption on UCHAR_MAX
would be a nice plus.

Another try: I can replace the explicit % operator with multiplication
and truncation; this is more efficient when there is no hardware support for %.

I think the following would work fine on many platforms
that have UCHAR_MAX equal to 0xFF:

#include <limits.h>
#if UCHAR_MAX==0xFFu
#if USHRT_MAX==0xFFFFu
/* good on most "regular" platforms */
#define VALID(x) ((unsigned short)(((unsigned char)(x)+0x81u)*0x199u)<0x9900u)
#else
/* same idea, independent of the range for unsigned short */
#define VALID(x) (((((unsigned char)(x)+0x81u)*0x199u)&0xFFFFu)<0x9900u)
#endif
#else
#error "no working solution so far"
#endif
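
A brute-force sketch for the multiply-and-truncate form, assuming UCHAR_MAX is 0xFF. The multiplier 0x199 is 409, close to 65536/160, so keeping the low 16 bits of the product plays the role that % 160 played before, with the threshold scaled to match. VALID_REF, VALID_MUL and the helper names are made up for this check; VALID_MUL uses the masked variant so that the width of unsigned short does not matter.

```c
#include <assert.h>

/* Reference test; the argument may be evaluated up to three times. */
#define VALID_REF(x) (((x)>=0x20 && (x)<=0x7E) || (x)>=0xC0)
/* Hypothetical name for the masked multiply-and-truncate form. */
#define VALID_MUL(x) \
    (((((unsigned char)(x)+0x81u)*0x199u)&0xFFFFu)<0x9900u)

/* Number of values 0..0xFF on which the two macros disagree. */
int mul_mismatches(void)
{
    int n = 0;
    for (int v = 0; v <= 0xFF; v++) {
        unsigned char c = (unsigned char)v;
        if (VALID_REF(c) != VALID_MUL(c))
            n++;
    }
    return n;
}

/* Spot checks at each edge of the two accepted ranges. */
void check_mul_boundaries(void)
{
    assert(!VALID_MUL(0x1F) && VALID_MUL(0x20));
    assert(VALID_MUL(0x7E) && !VALID_MUL(0x7F));
    assert(!VALID_MUL(0xBF) && VALID_MUL(0xC0) && VALID_MUL(0xFF));
}
```

The arithmetic is done in unsigned int, so a wrap on a 16-bit platform is well defined and coincides with the explicit mask.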
Francois Grieu

[*] This came in practice, validating data on a platform with UCHAR_MAX=0xFF
according to European Regulation 3821/1985, Annex 1B, Appendix 4, section 4,
page 108 in this pdf:
http://eur-lex.europa.eu/LexUriServ/...0060501:EN:PDF

Jan 18 '07 #16

hagman

Francois Grieu schrieb:

In article <11*********************@a75g2000cwd.googlegroups.com>,
"hagman" <go****@von-eitzen.de> wrote:

On Wed, 17 Jan 2007, Francois Grieu wrote:
Reformulate the following macro as another macro such that,
like a function, it evaluates its argument only once.

// Check if x, assumed to be of type unsigned char,
// lies in either [0x20..0x7E] or [0xC0..UCHAR_MAX]
#define VALID(x) ((x)>=0x20 && (x)<=0x7E || (x)>=0xC0)

Using a function (including inline), or table, no matter
how practical, is disregarded hereafter. (..This) came
in practice, validating data according to European Regulation
3821/1985, Annex 1B, Appendix 4, section 4, page 108 in

http://eur-lex.europa.eu/LexUriServ/...0060501:EN:PDF

(..) Assuming UCHAR_MAX is 0xFF,
a solution optimized towards using the least characters is

#define VALID(x) (((x)+128)%160<95)

The use of % makes it a dog, performancewise, on many
architectures (embedded 8 bit). I'm looking for a more
efficient solution. Making no assumption on UCHAR_MAX
would be a nice plus.

#define VALID(x) (((( ((x)>>5) ^4) *5)&~21)!=0)

After (((x)>>5) ^4) we have VALID <=> result not in {0,1,4}
Multiplication by 5 causes no overflow (as we already divided by 32),
hence the only multiples of 5 with all bits contained in 21 are
indeed 0,5,20.

That works, except when x is 0x7F. Nice thing is that there is
no dependency on UCHAR_MAX, as noted.

Oops, I misread the limit to be "<=0x7F", sorry.
I'm not sure if I can get my trick working in that case, unless by
plugging in some multiplication at the beginning:

#define OLDVALID(x) (((( ((x)>>5) ^4) *5)&~21)!=0)
#define PREMUL(x) ((((unsigned int)(x))*0x81 +1u)>>7)
#define VALID(x) OLDVALID(PREMUL(x))

Note that PREMUL(x) = x for x<=0x7E and PREMUL(x) >=x+1 >=0x80 for
x>=0x7F.
However, this may fail under certain circumstances unless unsigned int
is more than 7 bits larger than unsigned char.
E.g., if unsigned int has less than 15 bits then PREMUL(0x80)==0 =>
0x80 wrongly invalid.
And in a world with 32bit ints and sufficiently big values of
UCHAR_MAX,
we have e.g. PREMUL(0x01FC07F1)==0
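
A brute-force comparison of the composed macro against the original three-comparison test, sketched for 8-bit unsigned char and an unsigned int of at least 16 bits. VALID_REF, VALID_PRE and premul_mismatches are names invented here, and PREMUL is written fully parenthesized. Since PREMUL(x) equals x + ((x+1)>>7), values from 0x7F to 0xFE are shifted up by one. In this sketch the comparison reports a single disagreement, at 0xBF, which PREMUL maps to 0xC0.

```c
#include <assert.h>

/* Reference test; the argument may be evaluated up to three times. */
#define VALID_REF(x) (((x)>=0x20 && (x)<=0x7E) || (x)>=0xC0)
#define OLDVALID(x) (((( ((x)>>5) ^4) *5)&~21)!=0)
#define PREMUL(x) ((((unsigned int)(x))*0x81u + 1u) >> 7)
/* Hypothetical name for the composition under test. */
#define VALID_PRE(x) OLDVALID(PREMUL(x))

/* Returns the number of values 0..0xFF on which VALID_PRE and VALID_REF
   disagree; the first such value, if any, is stored through first. */
int premul_mismatches(int *first)
{
    int n = 0;
    for (int v = 0; v <= 0xFF; v++) {
        unsigned char c = (unsigned char)v;
        /* PREMUL(c) is c, c + 1 or c + 2 depending on (c + 1) >> 7. */
        assert(PREMUL(c) == c + ((c + 1u) >> 7));
        if (VALID_REF(c) != VALID_PRE(c)) {
            if (n == 0 && first)
                *first = v;
            n++;
        }
    }
    return n;
}
```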

Jan 18 '07 #17

ais523

Francois Grieu wrote:

The problem of efficiently testing if a variable belongs to one of
a few intervals is very common, and I guess some C gurus would do

#include <limits.h>

/* define CharIsAlpha(x) testing if char x is a letter */
#if UCHAR_MAX==255 && 'A'==65 && 'Z'==90 && 'a'==97 && 'z'==122
/* unsigned char is 8 bit and the character set seems ASCII */
#define CharIsAlpha(x) ((unsigned char)(((unsigned char)(x)&223)-65)<26)
#else
#include <ctype.h>
#define CharIsAlpha(x) isalpha((unsigned char)(x))
#endif

I doubt a real C guru would do this; they would allow for (as the
Standard allows for) a character set that was almost but not quite the
same as ASCII. In particular, I wouldn't be surprised if there were
locales and systems where isalpha() ought to return true for characters
like 'è' (which equals (char)232 on at least one system I use) and in
which the condition UCHAR_MAX==255 && 'A'==65 && 'Z'==90 && 'a'==97 &&
'z'==122 holds. This is still incorrect if you don't care about
locales, but less likely to trip you up in practice; still, the
theoretical incorrectness is enough reason to avoid it in my opinion.
--
ais523

Jan 19 '07 #18
