portable ascii-hex conversion

All,

I have a series of characters which I need to convert to integer values.
Each character is read in turn from a function 'nextch', and hex-digits are
identified by the isxdigit function - so I'm looking at '0' - '9', 'A' - 'F'
and 'a' - 'f'.

Here is what I've got:

int num = 0;
int ch = nextch(); /* nextch obtains the next character value */

while (isxdigit(ch))
{
    if (isdigit(ch))
        ch = ch - '0'; /* this is portable I believe */
    else
        ch = (ch & ~0x20) - 'A' + 10; /* not sure if this is ok */

    num = num * 0x10 + ch;
    ch = nextch();
}

If you look at the if-else statement inside the while() loop, you will see
how I attempt to convert 'ch' from a character-value to a numeric value in
the range 0-15 inclusive. But I have doubts about the ((ch & ~0x20) - 'A' +
10) expression:

It assumes that 'A' - 'F' are consecutive values
It assumes that 'a' - 'f' are consecutive, and are always 0x20 above their
'uppercase' counterparts.

Are these assumptions correct? I'm guessing the code is non-portable, so
does anyone have a neat(er) suggestion?

p.s. I derived this code from the lcc compiler sourcecode...

James
Nov 8 '06 #1
In article <lt********************@pipex.net>,
James Brown <no*@home.net> wrote:
> I have a series of characters which I need to convert to integer values.
> Each character is read in turn from a function 'nextch', and hex-digits are
> identified by the isxdigit function - so I'm looking at '0' - '9', 'A' - 'F'
> and 'a' - 'f'.

> Here is what I've got:

> int num = 0;
> int ch = nextch(); /* nextch obtains the next character value */

> while(isxdigit(ch))

What if it was EOF ?

> {
> if(isdigit(ch))
> ch = ch - '0'; /* this is portable I believe */

Yes.

> else
> ch = (ch & ~0x20) - 'A' + 10; /* not sure if this is ok */

The & ~0x20 is a hidden toupper() and not portable to non-ASCII.

And as you had thought, 'A' through 'F' are not guaranteed to be
consecutive or even in increasing order.

> num = num * 0x10 + ch;

What if you overflow your int ?

> ch = nextch();
> }

> Are these assumptions correct? I'm guessing the code is non-portable, so
> does anyone have a neat(er) suggestion?

Fetch more ch as long as isxdigit(ch) and you haven't gotten
more chars than you can handle, and store them into a buffer.
Then call strtoul() specifying base 16.

If you have particular reasons for handling the characters yourself,
then create a translation table of size UCHAR_MAX + 1,
and initialize it, tr['0'+i] = i for i from 0 to 9, and
tr['A'] = 10, tr['B'] = 11, etc., tr['a'] = 10, tr['b'] = 11, etc.,
then to do the conversion, just determine isxdigit(ch) and if so
then the converted value is tr[ch]. Yes, this has the potential
to waste UCHAR_MAX - 21 slots (only 22 characters are hex digits), but it
is also a portable single-step conversion with no math (other than normal
array indexing).
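
A minimal sketch of the translation table described above. The names hex_table, init_hex_table and hex_value are made up for illustration, and the table has UCHAR_MAX + 1 entries so that every unsigned char value has a slot. Marking non-hex characters with -1 is an addition to the scheme as posted, so a separate isxdigit() call becomes optional:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical table: one slot per possible unsigned char value. */
static signed char hex_table[UCHAR_MAX + 1];

/* Fill the table once before use; non-hex characters map to -1. */
static void init_hex_table(void)
{
    /* Each character is spelled out, so nothing is assumed about
       where the letters sit in the execution character set. */
    static const char digits[] = "0123456789";
    static const char upper[]  = "ABCDEF";
    static const char lower[]  = "abcdef";
    size_t i;

    for (i = 0; i <= UCHAR_MAX; i++)
        hex_table[i] = -1;
    for (i = 0; i < 10; i++)
        hex_table[(unsigned char)digits[i]] = (signed char)i;
    for (i = 0; i < 6; i++) {
        hex_table[(unsigned char)upper[i]] = (signed char)(10 + i);
        hex_table[(unsigned char)lower[i]] = (signed char)(10 + i);
    }
}

/* Returns 0..15 for a hex digit, -1 for anything else
   (including negative values such as EOF). */
static int hex_value(int ch)
{
    if (ch < 0 || ch > UCHAR_MAX)
        return -1;
    return hex_table[ch];
}
```

Call init_hex_table() once at startup; after that, hex_value(ch) can stand in for both the isdigit() branch and the bit-masking expression.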
--
Programming is what happens while you're busy making other plans.
Nov 8 '06 #2
"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:ei**********@canopus.cc.umanitoba.ca...
> In article <lt********************@pipex.net>,
> James Brown <no*@home.net> wrote:
>> I have a series of characters which I need to convert to integer values.
>> Each character is read in turn from a function 'nextch', and hex-digits
>> are identified by the isxdigit function - so I'm looking at '0' - '9',
>> 'A' - 'F' and 'a' - 'f'.
>> Here is what I've got:
>> int num = 0;
>> int ch = nextch(); /* nextch obtains the next character value */
>> while(isxdigit(ch))
>
> What if it was EOF ?

I thought it would be ok? ch would be EOF, which would cause isxdigit to
return 0, and the loop would break out. Is this not what would happen?

>> {
>> if(isdigit(ch))
>> ch = ch - '0'; /* this is portable I believe */
>
> Yes.
>
>> else
>> ch = (ch & ~0x20) - 'A' + 10; /* not sure if this is ok */
>
> The & ~0x20 is a hidden toupper() and not portable to non-ASCII.
>
> And as you had thought, 'A' through 'F' are not guaranteed to be
> consecutive or even in increasing order.
>
>> num = num * 0x10 + ch;
>
> What if you overflow your int ?

Yes, I hadn't gotten as far as checking for overflow, that's my next task.

>> ch = nextch();
>> }
>> Are these assumptions correct? I'm guessing the code is non-portable, so
>> does anyone have a neat(er) suggestion?
>
> fetch more ch as long as isxdigit(ch) and you haven't gotten
> more chars than you can handle, and store them into a buffer.
> Then strtoul() specifying base 16.

Definitely a nice solution, but I think it will be hard to detect overflows?
My compiler documentation for strtoul says that it returns ULONG_MAX on
overflow, but how do I distinguish this from the case when I encounter the
actual ULONG_MAX value? This is why I am hand-coding this thing, so that I
can emit appropriate warning messages when such things happen.

> If you have particular reasons for handling the characters yourself,
> then create a translation table of size UCHAR_MAX + 1,
> and initialize it, tr['0'+i] = i for i from 0 to 9, and
> tr['A'] = 10, tr['B'] = 11, etc., tr['a'] = 10, tr['b'] = 11, etc.,
> then to do the conversion, just determine isxdigit(ch) and if so
> then the converted value is tr[ch]. Yes, this has the potential
> to waste UCHAR_MAX - 21 slots, but it is also a portable single-step
> conversion with no math (other than normal array indexing)
> --
> Programming is what happens while you're busy making other plans.

I'll definitely consider this as a solution - I was hoping for a 1/2 liner
(calling a c-runtime func would be ideal), but it looks like a lookup table
may be the most appropriate way forward. I'm not too concerned with
performance though - I would prefer a simple loop above all else.
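
For the overflow check mentioned above, one possible building block for such a simple loop. This is only a sketch: it assumes the accumulator is changed from int to unsigned int (signed overflow is undefined behaviour in C, while unsigned arithmetic is well defined), and the name append_hex_digit is hypothetical. The range test happens before the multiplication, so the arithmetic never wraps:

```c
#include <limits.h>

/* Appends one hex digit value (0..15) to *num.
   Returns 1 on success. Returns 0 if the result would not fit in an
   unsigned int, in which case *num is left unchanged. */
static int append_hex_digit(unsigned int *num, unsigned int digit)
{
    /* num * 16 + digit <= UINT_MAX  is equivalent to
       num <= (UINT_MAX - digit) / 16  in integer arithmetic. */
    if (*num > (UINT_MAX - digit) / 16u)
        return 0;
    *num = *num * 16u + digit;
    return 1;
}
```

Inside the while loop, once ch has been converted to a value in 0..15, the line num = num * 0x10 + ch would become a call such as append_hex_digit(&num, ch), with a warning emitted when it returns 0.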

thanks,
James


Nov 8 '06 #3
>>> num = num * 0x10 + ch;
>>
>> What if you overflow your int ?
>
> yes, I hadn't gotten as far as checking for overflow, that's my next task.
>
>>> ch = nextch();
>>> }
>>> Are these assumptions correct? I'm guessing the code is non-portable, so
>>> does anyone have a neat(er) suggestion?
>>
>> fetch more ch as long as isxdigit(ch) and you haven't gotten
>> more chars than you can handle, and store them into a buffer.
>> Then strtoul() specifying base 16.
>
> Definitely a nice solution, but I think it will be hard to detect
> overflows? My compiler documentation for strtoul says that it returns
> ULONG_MAX on overflow, but how do I distinguish this from the case when I
> encounter the actual ULONG_MAX value? This is why I am hand-coding this
> thing, so that I can emit appropriate warning messages when such things
> happen.

OK, so I read the rest of the strtoul docs and it says 'errno' is set to
ERANGE on overflow. Looks like this is my preferred solution, thanks for the
help.

James
Nov 8 '06 #4
Walter Roberson wrote:
> In article <lt********************@pipex.net>,
> James Brown <no*@home.net> wrote:
>> I have a series of characters which I need to convert to integer values.
>> Each character is read in turn from a function 'nextch', and hex-digits are
>> identified by the isxdigit function - so I'm looking at '0' - '9', 'A' - 'F'
>> and 'a' - 'f'.
>> Here is what I've got:
>> int num = 0;
>> int ch = nextch(); /* nextch obtains the next character value */
>> while(isxdigit(ch))
>
> What if it was EOF ?

The loop exists. What of it?

>> {
>> if(isdigit(ch))
>> ch = ch - '0'; /* this is portable I believe */
>
> Yes.
>
>> else
>> ch = (ch & ~0x20) - 'A' + 10; /* not sure if this is ok */
>
> The & ~0x20 is a hidden toupper() and not portable to non-ASCII.
>
> And as you had thought, 'A' through 'F' are not guaranteed to be
> consecutive or even in increasing order.
>
>> num = num * 0x10 + ch;
>
> What if you overflow your int ?
>
>> ch = nextch();
>> }
>> Are these assumptions correct? I'm guessing the code is non-portable, so
>> does anyone have a neat(er) suggestion?
>
> fetch more ch as long as isxdigit(ch) and you haven't gotten
> more chars than you can handle, and store them into a buffer.
> Then strtoul() specifying base 16.
>
> If you have particular reasons for handling the characters yourself,
> then create a translation table of size UCHAR_MAX + 1,

Assuming UCHAR_MAX is reasonably small.

> and initialize it, tr['0'+i] = i for i from 0 to 9, and
> tr['A'] = 10, tr['B'] = 11, etc., tr['a'] = 10, tr['b'] = 11, etc.,
> then to do the conversion, just determine isxdigit(ch) and if so
> then the converted value is tr[ch]. Yes, this has the potential
> to waste UCHAR_MAX - 21 slots, but it is also a portable single-step
> conversion with no math (other than normal array indexing)

A simple switch() will do the job too...
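
A sketch of what that switch might look like. The function name hex_digit_value is made up for illustration, and returning -1 for non-hex input is an assumption about how the caller wants errors reported. Every character is listed explicitly, so nothing depends on the layout of the character set:

```c
/* Returns 0..15 for a hex digit character, -1 for anything else. */
static int hex_digit_value(int ch)
{
    switch (ch) {
    case '0': return 0;
    case '1': return 1;
    case '2': return 2;
    case '3': return 3;
    case '4': return 4;
    case '5': return 5;
    case '6': return 6;
    case '7': return 7;
    case '8': return 8;
    case '9': return 9;
    case 'a': case 'A': return 10;
    case 'b': case 'B': return 11;
    case 'c': case 'C': return 12;
    case 'd': case 'D': return 13;
    case 'e': case 'E': return 14;
    case 'f': case 'F': return 15;
    default:  return -1;   /* not a hex digit (covers EOF too) */
    }
}
```

Compared with the table, this needs no initialisation step and no storage proportional to UCHAR_MAX.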

--
Peter

Nov 8 '06 #5
"James Brown" <no*@home.net> writes:
> "Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
> news:ei**********@canopus.cc.umanitoba.ca...
[...]
>> fetch more ch as long as isxdigit(ch) and you haven't gotten
>> more chars than you can handle, and store them into a buffer.
>> Then strtoul() specifying base 16.
>
> Definitely a nice solution, but I think it will be hard to detect overflows?
> My compiler documentation for strtoul says that it returns ULONG_MAX on
> overflow, but how do I distinguish this from the case when I encounter the
> actual ULONG_MAX value? This is why I am hand-coding this thing, so that I
> can emit appropriate warning messages when such things happen.

This is explained in the documentation for strtoul(). On overflow, it
returns ULONG_MAX and sets errno to ERANGE. (You have to set errno to
0 before calling it.)

    errno = 0;
    result = strtoul(blah, blah, blah);
    if (result == ULONG_MAX && errno == ERANGE) {
        /* overflow */
    }
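
Filling in the placeholders, a fuller sketch of the same check might look like this. The function name parse_hex and its return convention are assumptions made for illustration; it expects the buffer of already collected hex digits suggested earlier in the thread:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Converts a NUL-terminated string of hex digits.
   Returns 1 and stores the value in *out on success.
   Returns 0 if no digits were found, if unconverted characters
   remain, or if the value does not fit in an unsigned long. */
static int parse_hex(const char *text, unsigned long *out)
{
    char *end;
    unsigned long value;

    errno = 0;                       /* must be cleared before the call */
    value = strtoul(text, &end, 16);
    if (end == text || *end != '\0')
        return 0;                    /* nothing converted, or trailing junk */
    if (value == ULONG_MAX && errno == ERANGE)
        return 0;                    /* overflow */
    *out = value;
    return 1;
}
```

Note that strtoul() itself also accepts leading whitespace, a sign and an optional 0x prefix, so the buffer should contain only characters that passed isxdigit() if those forms are to be rejected.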

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 8 '06 #6
"Peter Nilsson" <ai***@acay.com.au> writes:
> Walter Roberson wrote:
>> In article <lt********************@pipex.net>,
>> James Brown <no*@home.net> wrote:
[...]
>>> while(isxdigit(ch))
>>
>> What if it was EOF ?
>
> The loop exists. What of it?

I think you mean the loop exits.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 8 '06 #7
James Brown wrote:
>
> .... snip ...
>
> p.s. I derived this code from the lcc compiler sourcecode...

If you mean lcc-win32, that explains the non-portability.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>
Nov 9 '06 #8
On Wed, 2006-11-08 at 18:59 +0000, James Brown wrote:
> It assumes that 'A' - 'F' are consecutive values
> It assumes that 'a' - 'f' are consecutive, and are always 0x20 above their
> 'uppercase' counterparts.
>
> Are these assumptions correct? I'm guessing the code is non-portable, so
> does anyone have a neat(er) suggestion?

Here's a trick from /C Unleashed/, in a chapter (I believe) Richard
Heathfield wrote:

    char *hex = "0123456789ABCDEF";

Then you have a number-to-hex converter right there:
hex[n] = n_16, 0 <= n <= 15.
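
The same string can also be searched in the other direction, which is what the original question needs (digit character to value). A sketch, assuming ch is either EOF or a value that fits in an unsigned char, as getc-style functions return; hex_value_of is a hypothetical name, and the string is declared const here:

```c
#include <ctype.h>
#include <string.h>

/* Returns 0..15 for a hex digit character, -1 otherwise.
   The digit's value is simply its position in the string. */
static int hex_value_of(int ch)
{
    static const char hex[] = "0123456789ABCDEF";
    const char *p;

    if (!isxdigit(ch))               /* also rejects EOF and '\0' */
        return -1;
    p = strchr(hex, toupper(ch));    /* fold 'a'..'f' onto 'A'..'F' */
    return p ? (int)(p - hex) : -1;
}
```

The isxdigit() test comes first because strchr() would otherwise match the terminating '\0' of the string when given a zero character.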

--
Andrew Poelstra <http://www.wpsoftware.net>
For email, use 'apoelstra' at the above site.
"You're only smart on the outside." -anon.

Nov 9 '06 #9
Andrew Poelstra said:
> Here's a trick from /C Unleashed/, in a chapter (I believe) Richard
> Heathfield wrote:
>
> char *hex = "0123456789ABCDEF";

How I wish I'd written const char *. Oh well.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: normal service will be restored as soon as possible. Please do not
adjust your email clients.
Nov 9 '06 #10
