insert a char in a char array

I have a bit of a problem.
I'm parsing a record string using strtok, but I'm running into back-to-back
tabs (\t\t) caused by empty fields in my database export.
strtok treats a run of consecutive \t characters as one delimiter and reads
up to the LAST one. This of course skews my fields, offsetting them.
Annoying.... yes.
However, I may have 2 solutions:

1----If I can replace all instances of "\t\t" with "\t \t" (or anything
other than a space), I'd be happy.

OR

2----Replace all instances of "\t\t" with, say, "*", add "*" to the
list of delimiters for strtok and, for each token extraction, somehow
find a way to FIND OUT which delimiter it used. This way I could then use IF
statements on each extraction, based on a variable updated from the
PREVIOUS extraction.
In other words: if the previous extraction was delimited by "*", then
don't even DO this one; move on to the next token, after
maybe manually storing a null in the current field variable.

But how do I DO this?
How can I check which delimiter was used?
And if I can't, back to the 1st solution: how do I INSERT a space, thereby
moving the following characters to the right, which will of course mean
that the array has to be 1 larger? HOW?
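
For the first option, here is a minimal sketch of inserting a character into a char array. It assumes the buffer was allocated with spare room and that the caller knows its total size; insert_char, pad_empty_fields and the cap parameter are hypothetical names for illustration, not anything from the posts. The tail of the string is shifted right with memmove, which is safe for overlapping regions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Insert character c at index pos of the NUL-terminated string in buf.
 * cap is the total size of buf in bytes (assumed to be known by the caller).
 * Returns 1 on success, 0 if pos is past the end of the string or there
 * is no room for one more character. */
int insert_char(char *buf, size_t cap, size_t pos, char c)
{
    size_t len;

    assert(buf != NULL);
    len = strlen(buf);
    if (pos > len || len + 2 > cap)   /* need len + 1 characters plus '\0' */
        return 0;

    /* Shift the tail (including the '\0') one place to the right.
     * memmove is used because source and destination overlap. */
    memmove(buf + pos + 1, buf + pos, len - pos + 1);
    buf[pos] = c;
    return 1;
}

/* Rewrite every "\t\t" as "\t \t" in place, so that strtok() sees a
 * one-space token for each empty field. Returns 0 if buf is too small. */
int pad_empty_fields(char *buf, size_t cap)
{
    char *p = buf;

    assert(buf != NULL);
    while ((p = strstr(p, "\t\t")) != NULL) {
        if (!insert_char(buf, cap, (size_t)(p - buf) + 1, ' '))
            return 0;
        p += 2;   /* now at the second tab, so runs like "\t\t\t" also work */
    }
    return 1;
}
```

In the worst case the string nearly doubles in length, so the buffer has to be sized for that up front. An empty field at the very start or end of the record is not padded by this sketch, and strtok would still skip it.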
Jul 19 '05 #1
"Sean Bartholomew" <ad****@rivaband.com> wrote...
[...]

OR

3----quit using strtok and simply look for the next \t and extract
what is between them in your string.
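
A minimal sketch of that third option, assuming the record sits in a writable buffer and a single delimiter character is enough. split_fields and its parameters are hypothetical names chosen for illustration. Each delimiter found by strchr is overwritten with '\0', and two delimiters in a row simply yield an empty string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split line in place on delim, keeping empty fields.
 * Pointers to at most max_fields fields are stored in fields[];
 * the return value is the number of pointers stored. Anything
 * beyond max_fields fields is dropped. */
size_t split_fields(char *line, char delim, char *fields[], size_t max_fields)
{
    size_t n = 0;
    char *start = line;

    assert(line != NULL && fields != NULL);
    while (n < max_fields) {
        char *end = strchr(start, delim);   /* next delimiter, or NULL */

        fields[n++] = start;
        if (end == NULL)
            break;                          /* last field on the line */
        *end = '\0';                        /* terminate this field */
        start = end + 1;                    /* may itself be a delimiter */
    }
    return n;
}
```

Called on "a\t\tb\t" with '\t' as the delimiter, this stores four fields: "a", an empty string, "b", and a trailing empty string. Unlike strtok, it keeps no hidden static state, so it can be used on several records at once.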

Victor
Jul 19 '05 #2
Might be simpler and more efficient to just re-make strtok() to _not_ skip
over multiple delimiters. The other options you mentioned would be: a)
unreliable (if you replace all the '\t\t' with '*', what happens if there's
already an '*' in the data?), and b) would require a lot of fun processing
(expanding '\t\t' into '\t \t' would require extra space in a buffer for
that ' ', not to mention an extra buffer).

Just find your implementation of LIBC's strtok.c file. If your compiler
didn't come with your LIBC source (most of the high-grade compilers do: gcc,
MSVC++, and CodeWarrior, to name a few), you can likely Google for it, or
ask here to see if anyone has a compatible source. Then create a new
strtok2() function in your own project's source (unless you want to create
& distribute your own LIBC variant), copying it mostly. Somewhere in it,
you should see a while-loop that does nothing but check to see if a
certain character is a delimiter, and if so, goes to the next character.
All you have to do is make this while-loop into an if-statement, and you're
good to go. Noting that the change is such a minor one (changing a single
'while' into an 'if'), I'm curious as to why LIBC doesn't already have a
standardized function for this one -- anybody know?

Hope this helps
- Alex

Here's the code based off my local MSVC++7 LIBC implementation (sans
original comments & multi-threaded support) to illustrate the entire change,
which isn't recommended for use with other compilers (which can and will have
issues with this):

--------------------------

char * __cdecl strtok (
        char * string,
        const char * control
        )
{
        unsigned char *str;
        const unsigned char *ctrl = control;

        unsigned char map[32];
        int count;

        static char *nextoken;

        for (count = 0; count < 32; count++)
                map[count] = 0;

        do {
                map[*ctrl >> 3] |= (1 << (*ctrl & 7));
        } while (*ctrl++);

        if (string)
                str = string;
        else
                str = nextoken;

        /* To make strtok2() _not_ skip over multiple delimiters, just
           change this while() into an if()! */
        // while ( (map[*str >> 3] & (1 << (*str & 7))) && *str )
        if ( (map[*str >> 3] & (1 << (*str & 7))) && *str )
                str++;

        string = str;

        for ( ; *str ; str++ )
                if ( map[*str >> 3] & (1 << (*str & 7)) ) {
                        *str++ = '\0';
                        break;
                }

        nextoken = str;

        if ( string == str )
                return NULL;
        else
                return string;
}

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.536 / Virus Database: 331 - Release Date: 11/3/2003

Jul 19 '05 #3
"Alex Lyman" <alex lyman @ earthlink.net> wrote in message news:<GL*****************@newsread1.news.pas.earthlink.net>...
[...]

I tried dropping it in, but I get a wad of errors starting with a
missing ";" on the 1st line.
Jul 19 '05 #4

"Sean Bartholomew" <ad****@rivaband.com> wrote in message
news:b8**************************@posting.google.com...
I tried dropping it in, but I get a wad of errors starting with a
missing ";" on the 1st line.


This was not unexpected. You'll note my statement: "isn't recommended for
use with other compilers (which can and will have issues with this)"
Jul 19 '05 #6
DOH!

Sometime in the future, maybe I'll learn to actually look at a new algorithm
implementation before submitting it.
My earlier recommendation and code do not give the desired behavior.
Instead, the aforementioned 'while' ('if') should be removed
entirely. The code as posted will still skip up to 2 consecutive delimiters
(being an idiot, I forgot that that while-statement actually starts at the
character after the last \0-d token delimiter). Removing it would get rid
of this entirely, letting the for-loop just below that while-statement take
care of where the delimiters are.

My bad.

- Alex
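
For reference, a sketch of what the corrected function might look like with the skip removed entirely, as described above. This is not the MSVC source. It assumes 8-bit chars, drops `__cdecl` (an MSVC calling-convention keyword, which is probably what other compilers reported as a missing ";" on the first line), adds the casts that stricter compilers want, and uses the name strtok2 so it does not clash with the library's strtok.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Like strtok(), but consecutive delimiters produce empty tokens
 * instead of being skipped. Uses static state, so it is not reentrant. */
char *strtok2(char *string, const char *control)
{
    static char *nextoken;            /* where the next call resumes */
    unsigned char map[32] = {0};      /* one bit per possible byte value */
    const unsigned char *ctrl = (const unsigned char *)control;
    unsigned char *str;
    char *token;

    assert(control != NULL);

    /* Build the delimiter bitmap. */
    for (; *ctrl; ctrl++)
        map[*ctrl >> 3] |= (unsigned char)(1 << (*ctrl & 7));

    str = (unsigned char *)(string ? string : nextoken);
    if (str == NULL)
        return NULL;                  /* no string was ever supplied */

    /* No leading-delimiter skip here: that is the whole change. */
    token = (char *)str;

    /* Find the end of this token and terminate it. */
    for (; *str; str++) {
        if (map[*str >> 3] & (1 << (*str & 7))) {
            *str++ = '\0';
            break;
        }
    }
    nextoken = (char *)str;

    /* No progress means the end of the string was reached. */
    return (token == (char *)str) ? NULL : token;
}
```

On "a\t\tb" with "\t" as the delimiter set, successive calls return "a", an empty string, "b", and then NULL. One limitation of this approach: an empty field at the very end of the record looks the same as end-of-string, so it is not reported.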
Jul 19 '05 #7
