
Advice required on my ascii to hex conversion C++

  #1  
Old January 25th, 2006, 10:45 AM
andrewfaseuk@hotmail.com
Guest
 
Posts: n/a
Hi,

I should say up front that I have only a very basic knowledge of C++, so
please be kind!

I need to convert an input string (i.e. from argv[]) of the form
'FFEEDD....', containing hex digits, into the raw bytes those digits encode.

For example, the input string:

AABBCCDDEEFF

needs to be converted to the same bytes as the definition below:

unsigned char data[] = "\xAA\xBB\xCC\xDD\xEE\xFF";

I've managed to do this, but it's probably not the best solution. My
code is:

char byte[2];
byte[2] = '\0';
unsigned char *data2 = static_cast <unsigned char*>
(malloc((strlen(argv[2]))/2+1));
int i;
int j=0;
long l;

for (i=0; i < strlen(argv[2])-1;i=i+2) {
byte[0] = argv[2][i];
byte[1] = argv[2][i+1];
l = strtol(byte,NULL,16);
data2[j++] = (char)l;
}
data2[j] = '\0';

argv[2] contains the info I want to convert and data2 is the output
char array. Is there an easier way than this? Am I being very
inefficient?

Answers appreciated

  #2  
Old January 25th, 2006, 12:45 PM
Maxim Yegorushkin
Guest
 
Posts: n/a

re: Advice required on my ascii to hex conversion C++



andrewfaseuk@hotmail.com wrote:
> Hi,
>
> I should say up front that I have only a very basic knowledge of C++, so
> please be kind!
>
> I need to convert an input string (i.e. from argv[]) of the form
> 'FFEEDD....', containing hex digits, into the raw bytes those digits encode.
>
> For example, the input string:
>
> AABBCCDDEEFF
>
> needs to be converted to the same bytes as the definition below:
>
> unsigned char data[] = "\xAA\xBB\xCC\xDD\xEE\xFF";
>
> I've managed to do this, but it's probably not the best solution. My
> code is:
>
> char byte[2];
> byte[2] = '\0';

You've overflowed your buffer here. It only has elements with indexes 0
and 1 and you write to element 2 which is out of bounds.
> unsigned char *data2 = static_cast <unsigned char*>
> (malloc((strlen(argv[2]))/2+1));
> int i;
> int j=0;
> long l;
>
> for (i=0; i < strlen(argv[2])-1;i=i+2) {

Here you call strlen() on each iteration. A better idea would be to
call it once before the loop.
> byte[0] = argv[2][i];
> byte[1] = argv[2][i+1];
> l = strtol(byte,NULL,16);

Here strtol may read past the end of the buffer, since byte has only two
elements and is not zero-terminated.
> data2[j++] = (char)l;
> }
> data2[j] = '\0';
>
> argv[2] contains the info I want to convert and data2 is the output
> char array. Is there an easier way than this? Am I being very
> inefficient?

You also don't check for invalid input here.

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

// Nonzero if c is one of [0-9A-Fa-f] (assumes ASCII).
inline int is_hex(char c)
{
    return (c >= '0' && c <= '9')
        || ((c | 0x20) >= 'a' && (c | 0x20) <= 'f')
        ;
}

// Combine two hex digits (high, low) into one byte. Branch-free.
inline unsigned char hex2bin(unsigned char h, unsigned char l)
{
    h |= 0x20; // to lower
    h -= 0x30; // '0'..'9' -> 0..9, 'a'..'f' -> 0x31..0x36
    h -= -(h > 9) & 0x27; // letters only: subtract 0x27 to get 10..15
    l |= 0x20;
    l -= 0x30;
    l -= -(l > 9) & 0x27;
    return h << 4 | l;
}

int main(int ac, char** av)
{
    if(ac < 2)
        return EXIT_FAILURE; // wrong number of arguments

    size_t nibbles = strlen(av[1]);
    if(nibbles % 2) // odd number of nibbles
        return EXIT_FAILURE;
    nibbles /= 2; // from here on: number of output bytes

    unsigned char* buf = (unsigned char*)malloc(nibbles);
    if(!buf) // out of memory
        return EXIT_FAILURE;

    for(size_t n = 0; n != nibbles; ++n)
    {
        if(!is_hex(av[1][2 * n]) || !is_hex(av[1][2 * n + 1]))
        {
            free(buf);
            return EXIT_FAILURE; // bad input hex string
        }
        buf[n] = hex2bin(av[1][2 * n], av[1][2 * n + 1]);
    }

    for(size_t n = 0; n != nibbles; ++n)
        printf("%02hhx ", buf[n]);
    printf("\n");

    free(buf);
    return EXIT_SUCCESS;
}

  #3  
Old January 25th, 2006, 07:15 PM
andrewfaseuk@hotmail.com
Guest
 
Posts: n/a

re: Advice required on my ascii to hex conversion C++


> > char byte[2];
> > byte[2] = '\0';
>
> You've overflowed your buffer here. It only has elements with indexes 0
> and 1 and you write to element 2 which is out of bounds.

Yes, a basic mistake. It was byte[3] and then I changed it to byte[2], as I
couldn't remember whether 2 meant that 0, 1 and 2 were available ...

> > unsigned char *data2 = static_cast <unsigned char*>
> > (malloc((strlen(argv[2]))/2+1));
> > int i;
> > int j=0;
> > long l;
> >
> > for (i=0; i < strlen(argv[2])-1;i=i+2) {
>
> Here you call strlen() on each iteration. A better idea would be to
> call it once before the loop.

good idea ...
> > byte[0] = argv[2][i];
> > byte[1] = argv[2][i+1];
> > l = strtol(byte,NULL,16);
>
> Here strtol may read past the end of the buffer, since byte has only two
> elements and is not zero-terminated.

Well, if I use byte[3] it will be zero-terminated, won't it? I zero-terminate
the array before the loop.
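
For reference, the answer is yes: with `char byte[3]`, index 2 is in bounds, and because the loop only ever writes elements 0 and 1, a terminator stored once before the loop stays in place. The following is a minimal sketch of the original strtol approach with that fix applied and strlen() called once. The function name `hex_to_bytes` and its signature are invented here for illustration, and the sketch assumes the input has already been checked to be an even-length string of hex digits.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not from the thread): converts an even-length hex
// string such as "AABBCC" into raw bytes. Returns a malloc'ed buffer holding
// *out_len bytes plus a trailing '\0'; the caller must free() it.
// Returns NULL if the allocation fails.
// Assumption: the caller has already validated the input.
unsigned char* hex_to_bytes(const char* hex, std::size_t* out_len)
{
    assert(hex != NULL && out_len != NULL);

    const std::size_t len = std::strlen(hex);   // computed once, not per iteration
    assert(len % 2 == 0);

    unsigned char* out = static_cast<unsigned char*>(std::malloc(len / 2 + 1));
    if (!out)
        return NULL;

    char byte[3];        // two hex digits plus the terminator
    byte[2] = '\0';      // in bounds now; the loop never overwrites it

    std::size_t j = 0;
    for (std::size_t i = 0; i < len; i += 2) {
        byte[0] = hex[i];
        byte[1] = hex[i + 1];
        out[j++] = static_cast<unsigned char>(std::strtol(byte, NULL, 16));
    }
    out[j] = '\0';       // optional terminator, as in the original post
    *out_len = j;
    return out;
}
```

The length is returned separately because the converted data may itself contain zero bytes (for example from the input "00"), so the trailing '\0' cannot be relied on to mark the end.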
> > data2[j++] = (char)l;
> > }
> > data2[j] = '\0';
> >
> > argv[2] contains the info I want to convert and data2 is the output
> > char array. Is there an easier way than this? Am I being very
> > inefficient?
>
> You also don't check for invalid input here.

I'm checking for valid input before this code snippet, I just didn't post
it! But thanks for pointing it out!
> #include <stdlib.h>
> #include <string.h>
> #include <stdio.h>
>
> inline int is_hex(char c)
> {
>     return (c >= '0' && c <= '9')
>         || ((c | 0x20) >= 'a' && (c | 0x20) <= 'f')
>         ;
> }
>
> inline unsigned char hex2bin(unsigned char h, unsigned char l)
> {
>     h |= 0x20; // to lower
>     h -= 0x30;
>     h -= -(h > 9) & 0x27;
>     l |= 0x20;
>     l -= 0x30;
>     l -= -(l > 9) & 0x27;
>     return h << 4 | l;
> }

Any chance you could explain how this works? I don't know what "|="
means, nor "return h << 4 | l;". To be honest, the whole thing seems a bit
confusing to me anyway.

Cheers

Andy

  #4  
Old January 27th, 2006, 12:35 PM
Maxim Yegorushkin
Guest
 
Posts: n/a

re: Advice required on my ascii to hex conversion C++



andrewfaseuk@hotmail.com wrote:

[]
> > #include <stdlib.h>
> > #include <string.h>
> > #include <stdio.h>
> >
> > inline int is_hex(char c)
> > {
> >     return (c >= '0' && c <= '9')
> >         || ((c | 0x20) >= 'a' && (c | 0x20) <= 'f')
> >         ;
> > }
> >
> > inline unsigned char hex2bin(unsigned char h, unsigned char l)
> > {
> >     h |= 0x20; // to lower
> >     h -= 0x30;
> >     h -= -(h > 9) & 0x27;
> >     l |= 0x20;
> >     l -= 0x30;
> >     l -= -(l > 9) & 0x27;
> >     return h << 4 | l;
> > }
>
> Any chance you could explain how this works? I don't know what "|="
> means, nor "return h << 4 | l;". To be honest, the whole thing seems a bit

Check out your C/C++ textbook for operators.
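
In short, `x |= y` is shorthand for `x = x | y` (bitwise OR), and `h << 4 | l` shifts `h` left by four bits and then ORs in `l`, because `<<` binds more tightly than `|`. A small sketch of both operators follows; the helper names `set_bit5` and `join_nibbles` are invented for illustration and are not part of the posted code.

```cpp
#include <cassert>

// Illustrative helpers only; the names are not from the thread.

// c |= 0x20 sets bit 5. In ASCII that turns 'A'..'Z' into 'a'..'z' and
// leaves '0'..'9' and 'a'..'z' unchanged, since their bit 5 is already set.
unsigned char set_bit5(unsigned char c)
{
    c |= 0x20;           // same as: c = c | 0x20;
    return c;
}

// Combines two 4-bit values into one byte: the high nibble lands in
// bits 7..4 and the low nibble in bits 3..0. Parsed as (high << 4) | low.
unsigned char join_nibbles(unsigned char high, unsigned char low)
{
    assert(high < 16 && low < 16);
    return static_cast<unsigned char>(high << 4 | low);
}
```

For example, `join_nibbles(0xA, 0xB)` gives `0xAB`, which is how the two converted digits of "AB" end up in a single byte.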
> confusing to me anyway.

The hex2bin function was cut from optimized parsing code. It is a bit
tricky to understand because it was deliberately written to avoid
branching (conditional jumps) in the generated code.

Basically, the algorithm for converting a hex digit into a nibble
(4 binary digits) looks like this:

char xdigit; // hex digit to convert [0-9A-Fa-f]
xdigit = tolower(xdigit); // make it lowercase [0-9a-f]
xdigit -= '0'; // if it was a [0-9] digit, it's the value now
if(xdigit > 9) // if it was an [a-f] digit, compensate for that
    xdigit = xdigit + '0' - 'a' + 10; // 'a' must map to 10, not 0

The original code is just an optimization of the algorithm.
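
To see why the two forms agree: `(h > 9)` evaluates to 0 or 1, so `-(h > 9)` is 0 or -1 (all bits set), and ANDing that with 0x27 yields either 0 or 0x27. The constant 0x27 is 39, which in ASCII equals 'a' - '0' - 10, the same adjustment the if-statement makes. The sketch below puts the two versions side by side and checks them against each other; it assumes ASCII and valid hex digits as input, and the function names are invented for illustration.

```cpp
#include <cassert>
#include <cctype>

// Straightforward version with a branch. Assumes c is a valid hex digit.
unsigned char nibble_branch(char c)
{
    assert(std::isxdigit(static_cast<unsigned char>(c)));
    int v = std::tolower(static_cast<unsigned char>(c)) - '0';
    if (v > 9)                 // it was a letter: 'a' must map to 10
        v = v + '0' - 'a' + 10;
    return static_cast<unsigned char>(v);
}

// Branch-free version, the same steps as in hex2bin. Assumes ASCII.
unsigned char nibble_mask(unsigned char h)
{
    h |= 0x20;                 // fold 'A'..'F' onto 'a'..'f'
    h -= 0x30;                 // '0'..'9' -> 0..9, 'a'..'f' -> 0x31..0x36
    h -= -(h > 9) & 0x27;      // subtracts 0x27 only when h > 9
    return h;
}

// Returns true when both versions agree on every hex digit.
bool versions_agree()
{
    const char digits[] = "0123456789abcdefABCDEF";
    for (const char* p = digits; *p; ++p)
        if (nibble_branch(*p) != nibble_mask(static_cast<unsigned char>(*p)))
            return false;
    return true;
}
```

Whether the branch-free form is actually faster depends on the compiler and the CPU, so it is worth measuring before preferring it over the more readable version.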
