
# portable ascii-hex conversion

All,

I have a series of characters which I need to convert to integer values.
Each character is read in turn from a function 'nextch', and hex-digits are
identified by the isxdigit function - so I'm looking at '0' - '9', 'A' - 'F'
and 'a' - 'f'.

Here is what I've got:

int num = 0;
int ch = nextch(); /* nextch obtains the next character value */

while (isxdigit(ch))
{
    if (isdigit(ch))
        ch = ch - '0'; /* this is portable I believe */
    else
        ch = (ch & ~0x20) - 'A' + 10; /* not sure if this is ok */

    num = num * 0x10 + ch;
    ch = nextch();
}

If you look at the if-else statement inside the while() loop, you will see
how I attempt to convert 'ch' from a character-value to a numeric value in
the range 0-15 inclusive. But I have doubts about the ((ch & ~0x20) - 'A' +
10) expression:

It assumes that 'A' - 'F' are consecutive values
It assumes that 'a' - 'f' are consecutive, and are always 0x20 above their
'uppercase' counterparts.

Are these assumptions correct? I'm guessing the code is non-portable, so
does anyone have a neat(er) suggestion?

p.s. I derived this code from the lcc compiler sourcecode...

James
Nov 8 '06 #1
In article <lt********************@pipex.net>,
James Brown <no*@home.net> wrote:
>I have a series of characters which I need to convert to integer values.
>Each character is read in turn from a function 'nextch', and hex-digits are
>identified by the isxdigit function - so I'm looking at '0' - '9', 'A' - 'F'
>and 'a' - 'f'.
>Here is what I've got:
>int num = 0;
>int ch = nextch(); /* nextch obtains the next character value */
>while(isxdigit(ch))

What if it was EOF ?

>{
>if(isdigit(ch))
>ch = ch - '0'; /* this is portable I believe */

Yes.

>else
>ch = (ch & ~0x20) - 'A' + 10; /* not sure if this is ok */

The & ~0x20 is a hidden toupper() and not portable to non-ASCII.

And as you had thought, 'A' through 'F' are not guaranteed to be
consecutive or even in increasing order.

>num = num * 0x10 + ch;

What if you overflow your int ?

>ch = nextch();
>}
>Are these assumptions correct? I'm guessing the code is non-portable, so
>does anyone have a neat(er) suggestion?

fetch more ch as long as isxdigit(ch) and you haven't gotten
more chars than you can handle, and store them into a buffer.
Then strtoul() specifying base 16.

If you have particular reasons for handling the characters yourself,
then create a translation table of size UCHAR_MAX,
and initialize it, tr['0'+i] = i for i from 0 to 9, and
tr['A'] = 10, tr['B'] = 11, etc., tr['a'] = 10, tr['b'] = 11, etc.,
then to do the conversion, just determine isxdigit(ch) and if so
then the converted value is tr[ch]. Yes, this has the potential
to waste UCHAR_MAX - 26 slots, but it is also a portable single-step
conversion with no math (other than normal array indexing)
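
The translation-table idea might be sketched roughly as follows. This is only an illustration: the names hex_table, hex_table_init and hex_value are hypothetical, the table is sized UCHAR_MAX + 1 here so that every unsigned char value is a valid index, and -1 is used as an assumed "not a hex digit" marker so that the lookup also does the validity check.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
static signed char hex_table[UCHAR_MAX + 1];

/* Fill the table once; -1 marks "not a hex digit". */
static void hex_table_init(void)
{
    static const char digits[] = "0123456789";
    static const char upper[] = "ABCDEF";
    static const char lower[] = "abcdef";
    size_t i;

    for (i = 0; i < sizeof hex_table; i++)
        hex_table[i] = -1;
    for (i = 0; i < 10; i++)
        hex_table[(unsigned char)digits[i]] = (signed char)i;
    for (i = 0; i < 6; i++) {
        /* Each letter is listed explicitly, so no ordering is assumed. */
        hex_table[(unsigned char)upper[i]] = (signed char)(10 + i);
        hex_table[(unsigned char)lower[i]] = (signed char)(10 + i);
    }
}

/* Return 0..15 for a hex digit, -1 for anything else (EOF included). */
static int hex_value(int ch)
{
    if (ch < 0 || ch > UCHAR_MAX)
        return -1;
    return hex_table[ch];
}
```

After one call to hex_table_init(), the loop body reduces to a single lookup, and no arithmetic is done on character codes.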
--
Programming is what happens while you're busy making other plans.
Nov 8 '06 #2
"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:ei**********@canopus.cc.umanitoba.ca...
> In article <lt********************@pipex.net>,
> James Brown <no*@home.net> wrote:
>>I have a series of characters which I need to convert to integer values.
>>Each character is read in turn from a function 'nextch', and hex-digits are
>>identified by the isxdigit function - so I'm looking at '0' - '9', 'A' - 'F'
>>and 'a' - 'f'.
>>Here is what I've got:
>>int num = 0;
>>int ch = nextch(); /* nextch obtains the next character value */
>>while(isxdigit(ch))
>
> What if it was EOF ?

I thought it would be ok? ch would be EOF, which would cause isxdigit to
return(0), and the loop would break out. Is this not what would happen?

>>{
>>if(isdigit(ch))
>>ch = ch - '0'; /* this is portable I believe */
>
> Yes.
>
>>else
>>ch = (ch & ~0x20) - 'A' + 10; /* not sure if this is ok */
>
> The & ~0x20 is a hidden toupper() and not portable to non-ASCII.
>
> And as you had thought, 'A' through 'F' are not guaranteed to be
> consecutive or even in increasing order.
>
>>num = num * 0x10 + ch;
>
> What if you overflow your int ?

yes, I hadn't gotten as far as checking for overflow, that's my next task.

>>ch = nextch();
>>}
>>Are these assumptions correct? I'm guessing the code is non-portable, so
>>does anyone have a neat(er) suggestion?
>
> fetch more ch as long as isxdigit(ch) and you haven't gotten
> more chars than you can handle, and store them into a buffer.
> Then strtoul() specifying base 16.

Definitely a nice solution, but I think it will be hard to detect overflows?
My compiler documentation for strtoul says that it returns ULONG_MAX on
overflow, but how do I distinguish this from the case when I encounter the
actual ULONG_MAX value? This is why I am hand-coding this thing, so that I
can emit appropriate warning messages when such things happen.

> If you have particular reasons for handling the characters yourself,
> then create a translation table of size UCHAR_MAX,
> and initialize it, tr['0'+i] = i for i from 0 to 9, and
> tr['A'] = 10, tr['B'] = 11, etc., tr['a'] = 10, tr['b'] = 11, etc.,
> then to do the conversion, just determine isxdigit(ch) and if so
> then the converted value is tr[ch]. Yes, this has the potential
> to waste UCHAR_MAX - 26 slots, but it is also a portable single-step
> conversion with no math (other than normal array indexing)
> --
> Programming is what happens while you're busy making other plans.

I'll definitely consider this as a solution - I was hoping for a 1/2 liner
(calling a c-runtime func would be ideal), but it looks like a lookup table
may be the most appropriate way forward. I'm not too concerned with
performance though - I would prefer a simple loop above all else.
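
For a hand-coded loop, the overflow check mentioned above could be done before each multiply. The following is a minimal sketch, not code from the thread: append_hex_digit is a hypothetical helper, it assumes the digit has already been converted to a value in 0..15, and it accumulates into an unsigned long rather than the int used in the original loop.

```c
#include <limits.h>

/* Hypothetical helper: fold one already-converted digit (0..15) into *num.
   Returns 1 on success, 0 if num * 16 + digit would not fit in
   unsigned long; *num is left unchanged in that case. */
static int append_hex_digit(unsigned long *num, unsigned digit)
{
    /* num * 16 + digit <= ULONG_MAX  <=>  num <= (ULONG_MAX - digit) / 16 */
    if (*num > (ULONG_MAX - digit) / 16)
        return 0;
    *num = *num * 16 + digit;
    return 1;
}
```

The test is done by division, so the overflowing multiplication is never actually performed. A caller can emit its warning as soon as the helper returns 0 and then keep consuming digits until isxdigit fails.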

thanks,
James

Nov 8 '06 #3
>>> num = num * 0x10 + ch;
>>
>> What if you overflow your int ?
>
> yes, I hadn't gotten as far as checking for overflow, that's my next task.
>
>>> ch = nextch();
>>> }
>>> Are these assumptions correct? I'm guessing the code is non-portable, so
>>> does anyone have a neat(er) suggestion?
>>
>> fetch more ch as long as isxdigit(ch) and you haven't gotten
>> more chars than you can handle, and store them into a buffer.
>> Then strtoul() specifying base 16.
>
> Definitely a nice solution, but I think it will be hard to detect
> overflows? My compiler documentation for strtoul says that it returns
> ULONG_MAX on overflow, but how do I distinguish this from the case when I
> encounter the actual ULONG_MAX value? This is why I am hand-coding this
> thing, so that I can emit appropriate warning messages when such things
> happen.

ok, so I read the rest of the strtoul docs and it says 'errno' is set for
overflow/underflow. Looks like this is my preferred solution, thanks for the
help.

James
Nov 8 '06 #4
Walter Roberson wrote:
> In article <lt********************@pipex.net>,
> James Brown <no*@home.net> wrote:
>> I have a series of characters which I need to convert to integer values.
>> Each character is read in turn from a function 'nextch', and hex-digits are
>> identified by the isxdigit function - so I'm looking at '0' - '9', 'A' - 'F'
>> and 'a' - 'f'.
>> Here is what I've got:
>> int num = 0;
>> int ch = nextch(); /* nextch obtains the next character value */
>> while(isxdigit(ch))
>
> What if it was EOF ?

The loop exists. What of it?

>> {
>> if(isdigit(ch))
>> ch = ch - '0'; /* this is portable I believe */
>
> Yes.
>
>> else
>> ch = (ch & ~0x20) - 'A' + 10; /* not sure if this is ok */
>
> The & ~0x20 is a hidden toupper() and not portable to non-ASCII.
>
> And as you had thought, 'A' through 'F' are not guaranteed to be
> consecutive or even in increasing order.
>
>> num = num * 0x10 + ch;
>
> What if you overflow your int ?
>
>> ch = nextch();
>> }
>> Are these assumptions correct? I'm guessing the code is non-portable, so
>> does anyone have a neat(er) suggestion?
>
> fetch more ch as long as isxdigit(ch) and you haven't gotten
> more chars than you can handle, and store them into a buffer.
> Then strtoul() specifying base 16.
>
> If you have particular reasons for handling the characters yourself,
> then create a translation table of size UCHAR_MAX,

Assuming UCHAR_MAX is reasonably small.

> and initialize it, tr['0'+i] = i for i from 0 to 9, and
> tr['A'] = 10, tr['B'] = 11, etc., tr['a'] = 10, tr['b'] = 11, etc.,
> then to do the conversion, just determine isxdigit(ch) and if so
> then the converted value is tr[ch]. Yes, this has the potential
> to waste UCHAR_MAX - 26 slots, but it is also a portable single-step
> conversion with no math (other than normal array indexing)

A simple switch() will do the job too...
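
One way the switch() might look is sketched below. The function name hex_digit_switch and the -1 return for non-digits are assumptions made for illustration. Because every character constant is listed explicitly, nothing is assumed about the character set's ordering or about how upper and lower case relate.

```c
/* Hypothetical function name. Returns 0..15 for a hex digit,
   -1 for any other value (EOF included). */
static int hex_digit_switch(int ch)
{
    switch (ch) {
    case '0': return 0;
    case '1': return 1;
    case '2': return 2;
    case '3': return 3;
    case '4': return 4;
    case '5': return 5;
    case '6': return 6;
    case '7': return 7;
    case '8': return 8;
    case '9': return 9;
    case 'a': case 'A': return 10;
    case 'b': case 'B': return 11;
    case 'c': case 'C': return 12;
    case 'd': case 'D': return 13;
    case 'e': case 'E': return 14;
    case 'f': case 'F': return 15;
    default:  return -1;
    }
}
```

Unlike the table, this needs no initialisation step and no storage proportional to UCHAR_MAX.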

--
Peter

Nov 8 '06 #5
"James Brown" <no*@home.net> writes:
> "Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
> news:ei**********@canopus.cc.umanitoba.ca...
[...]
>> fetch more ch as long as isxdigit(ch) and you haven't gotten
>> more chars than you can handle, and store them into a buffer.
>> Then strtoul() specifying base 16.
>
> Definitely a nice solution, but I think it will be hard to detect overflows?
> My compiler documentation for strtoul says that it returns ULONG_MAX on
> overflow, but how do I distinguish this from the case when I encounter the
> actual ULONG_MAX value? This is why I am hand-coding this thing, so that I
> can emit appropriate warning messages when such things happen.

This is explained in the documentation for strtoul(). On overflow, it
returns ULONG_MAX and sets errno to ERANGE. (You have to set errno to
0 before calling it.)

errno = 0;
result = strtoul(blah, blah, blah);
if (result == ULONG_MAX && errno == ERANGE) {
    /* overflow */
}
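
A fuller version of the same check might look like the sketch below. The wrapper name parse_hex and its return codes are hypothetical. It assumes the caller has already collected the digit characters into a NUL-terminated buffer, as suggested earlier in the thread, and it also uses the end pointer to detect the case where no digits were converted.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical wrapper: parse the hex text in buf into *out.
   Returns 0 on success, -1 if no digits were converted, -2 on overflow. */
static int parse_hex(const char *buf, unsigned long *out)
{
    char *end;
    unsigned long result;

    errno = 0;                          /* must be cleared before the call */
    result = strtoul(buf, &end, 16);
    if (end == buf)
        return -1;                      /* nothing was converted */
    if (result == ULONG_MAX && errno == ERANGE)
        return -2;                      /* value does not fit */
    *out = result;
    return 0;
}
```

Note that strtoul() itself accepts leading white space, a sign and an optional 0x prefix, so filling the buffer only with characters that passed isxdigit() keeps the accepted input as strict as the hand-written loop.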

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 8 '06 #6
"Peter Nilsson" <ai***@acay.com.au> writes:
> Walter Roberson wrote:
>> In article <lt********************@pipex.net>,
>> James Brown <no*@home.net> wrote:
[...]
>>> while(isxdigit(ch))
>>
>> What if it was EOF ?
>
> The loop exists. What of it?

I think you mean the loop exits.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 8 '06 #7
James Brown wrote:
>
> .... snip ...
>
> p.s. I derived this code from the lcc compiler sourcecode...

If you mean lcc-win32, that explains the non-portability.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>
Nov 9 '06 #8
On Wed, 2006-11-08 at 18:59 +0000, James Brown wrote:
> It assumes that 'A' - 'F' are consecutive values
> It assumes that 'a' - 'f' are consecutive, and are always 0x20 above their
> 'uppercase' counterparts.
>
> Are these assumptions correct? I'm guessing the code is non-portable, so
> does anyone have a neat(er) suggestion?

Here's a trick from /C Unleashed/, in a chapter (I believe) Richard
Heathfield wrote:

char *hex = "0123456789ABCDEF";

Then you have a number-to-hex converter right there:
hex[n] = n_16, 0 <= n <= 15.
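
The same string can plausibly be used in the other direction, which is what the original question asks for: the position of a character within the string is its value. A minimal sketch, assuming a hypothetical function name hex_digit_lookup and a -1 return for non-digits:

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical function name. ch must be EOF or a value representable
   as unsigned char, as for the <ctype.h> functions.
   Returns 0..15 for a hex digit, -1 otherwise. */
static int hex_digit_lookup(int ch)
{
    static const char hex[] = "0123456789ABCDEF";
    const char *p;

    if (!isxdigit(ch))          /* also rejects '\0' and EOF */
        return -1;
    p = strchr(hex, toupper(ch));
    return p ? (int)(p - hex) : -1;
}
```

The isxdigit() test comes first so that a '\0' character never matches the string's terminator in strchr(). The value comes from the position in the string, not from arithmetic on character codes.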

--
Andrew Poelstra <http://www.wpsoftware.net>
For email, use 'apoelstra' at the above site.
"You're only smart on the outside." -anon.

Nov 9 '06 #9
Andrew Poelstra said:
> Here's a trick from /C Unleashed/, in a chapter (I believe) Richard
> Heathfield wrote:
>
> char *hex = "0123456789ABCDEF";

How I wish I'd written const char *. Oh well.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: normal service will be restored as soon as possible. Please do not
adjust your email clients.
Nov 9 '06 #10
