insert a char in a char array

I have a bit of a problem.
I'm parsing a record string using strtok, but I'm encountering back-to-back
tabs (\t\t) caused by empty fields in my database export.
strtok treats a run of consecutive \t characters as one delimiter, so it
reads up to the LAST \t in the run. This of course shifts every later field
out of position.
Annoying... yes.
However, I may have 2 solutions:

1----If I can replace all instances of "\t\t" with "\t \t" (a space, or any
other filler character, between the tabs), I'd be happy.

OR

2----Replace all instances of "\t\t" with, say, "*", add "*" to the
list of strtok delimiters and, for each token extraction, somehow
find out which delimiter it stopped on. That way I could use if
statements on each extraction, based on a variable updated from the
PREVIOUS extraction.
In other words: if the previous extraction was delimited by "*", then
don't even DO this one. Manually store a null in the current field
variable and move on to the next token.

But how do I DO this?
How can I check which delimiter was used?
And if I can't, back to the 1st solution: how do I INSERT a space, thereby
moving the following characters to the right, which of course means
that the array has to be 1 larger? HOW?
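
For the first option, one way to get the extra room is to build the padded record in a new, larger buffer instead of shifting characters in place. A minimal sketch in C follows; `pad_empty_fields` is a made-up name, and it assumes the fields are separated by single tab characters:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly malloc'd copy of `src` in which a
   space is placed between every pair of adjacent tabs.
   The caller must free() the result. Returns NULL if malloc fails. */
char *pad_empty_fields(const char *src)
{
    size_t len, i;
    char *out, *dst;

    assert(src != NULL);
    len = strlen(src);

    /* Worst case: every character is a tab, which adds len - 1 spaces,
       so 2 * len + 1 bytes is always enough (including the '\0'). */
    out = malloc(2 * len + 1);
    if (out == NULL)
        return NULL;

    dst = out;
    for (i = 0; i < len; i++) {
        *dst++ = src[i];
        /* src[i + 1] is at worst the terminating '\0', so this read is safe */
        if (src[i] == '\t' && src[i + 1] == '\t')
            *dst++ = ' ';
    }
    *dst = '\0';
    return out;
}
```

Running strtok with "\t" over the padded copy then yields " " for each empty field in the middle of the record. A tab at the very start or very end of the record is still swallowed by strtok, so leading and trailing empty fields are not recovered this way.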
Jul 19 '05 #1


"Sean Bartholomew" <ad****@rivaband.com> wrote...
I have a bit of a problem.
I'm parsing a record string using strtok, but I'm encountering back-to-back
tabs (\t\t) caused by empty fields in my database export.
strtok treats a run of consecutive \t characters as one delimiter, so it
reads up to the LAST \t in the run. This of course shifts every later field
out of position.
Annoying... yes.
However, I may have 2 solutions:

1----If I can replace all instances of "\t\t" with "\t \t" (a space, or any
other filler character, between the tabs), I'd be happy.

OR

2----Replace all instances of "\t\t" with, say, "*", add "*" to the
list of strtok delimiters and, for each token extraction, somehow
find out which delimiter it stopped on. That way I could use if
statements on each extraction, based on a variable updated from the
PREVIOUS extraction.
In other words: if the previous extraction was delimited by "*", then
don't even DO this one. Manually store a null in the current field
variable and move on to the next token.


OR

3----Quit using strtok: simply look for the next \t yourself and extract
whatever lies between one \t and the next in your string.
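
A rough sketch of what option 3 could look like in C, using strchr to find each delimiter. `split_fields` and its parameters are hypothetical names, and the sketch assumes the record may be modified in place:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `line` in place on `delim` and stores up to
   `max_fields` pointers in `fields`. Empty fields come back as "".
   Returns the number of fields stored; anything past `max_fields` is dropped. */
size_t split_fields(char *line, char delim, char **fields, size_t max_fields)
{
    size_t n = 0;
    char *start = line;

    assert(line != NULL && fields != NULL && delim != '\0');

    while (n < max_fields) {
        char *end = strchr(start, delim);
        fields[n++] = start;
        if (end == NULL)
            break;          /* last field: it runs to the end of the string */
        *end = '\0';        /* terminate this field where the delimiter was */
        start = end + 1;    /* the next field begins right after it */
    }
    return n;
}
```

Because every delimiter ends exactly one field, "a\t\tb" comes back as three fields ("a", "" and "b"), and a leading or trailing tab produces an empty first or last field.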

Victor
Jul 19 '05 #2

Might be simpler and more efficient to just re-make strtok() to _not_ skip
over multiple delimiters. The other options you mentioned would be: a)
unreliable (if you replace all the '\t\t' with '*', what happens if there's
already an '*' in the data?), and b) would require a lot of fun processing
(expanding '\t\t' into '\t \t' would require extra space in a buffer for
that ' ', not to mention an extra buffer).

Just find your implementation's strtok.c file in the LIBC source. Most of the
high-grade compilers come with their LIBC source (gcc, MSVC++, and
CodeWarrior, to name a few); if yours didn't, you can likely Google for it,
or ask here to see if anyone has a compatible source. Then create a new
strtok2() function in your own project's source (unless you want to create &
distribute your own LIBC variant), copying the original mostly. Somewhere
in it, you should see a while-loop that does nothing but check whether the
current character is a delimiter and, if so, go to the next character.
All you have to do is make this while-loop into an if-statement, and you're
good to go. Noting that the change is such a minor one (changing a single
'while' into an 'if'), I'm curious as to why LIBC doesn't already have a
standardized function for this one -- anybody know?

Hope this helps
- Alex

Here's the code based off my local MSVC++7 LIBC implementation (sans
original comments & multi-threaded support) to illustrate the entire change.
It isn't recommended for use with other compilers (which can and will have
issues with this):

--------------------------

char * __cdecl strtok (
    char * string,
    const char * control
    )
{
    unsigned char *str;
    const unsigned char *ctrl = control;

    unsigned char map[32];
    int count;

    static char *nextoken;

    /* Clear the 256-bit delimiter map */
    for (count = 0; count < 32; count++)
        map[count] = 0;

    /* Set one bit per delimiter character */
    do {
        map[*ctrl >> 3] |= (1 << (*ctrl & 7));
    } while (*ctrl++);

    /* Start at the new string, or continue where the last call stopped */
    if (string)
        str = string;
    else
        str = nextoken;

    /* To make strtok2() _not_ skip over multiple delimiters, just
       change this while() into an if()! */
    // while ( (map[*str >> 3] & (1 << (*str & 7))) && *str )
    if ( (map[*str >> 3] & (1 << (*str & 7))) && *str )
        str++;

    string = str;

    /* Find the end of the token and overwrite the delimiter with '\0' */
    for ( ; *str ; str++ )
        if ( map[*str >> 3] & (1 << (*str & 7)) ) {
            *str++ = '\0';
            break;
        }

    nextoken = str;

    if ( string == str )
        return NULL;
    else
        return string;
}
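
The `map[32]` array in the listing is a 256-bit set with one bit per possible `unsigned char` value: `c >> 3` picks the byte and `c & 7` picks the bit inside it. Pulled out into separate helpers, the idea looks roughly like this. The names `char_set`, `char_set_init` and `char_set_contains` are invented for this sketch, and 8-bit chars are assumed:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type: one flag bit for each of the 256 unsigned char values. */
typedef struct {
    unsigned char bits[32];   /* 32 bytes * 8 bits = 256 flags */
} char_set;

/* Mark every character of `chars` (excluding the terminating '\0') as a member. */
void char_set_init(char_set *set, const char *chars)
{
    const unsigned char *p = (const unsigned char *)chars;

    assert(set != NULL && chars != NULL);
    memset(set->bits, 0, sizeof set->bits);
    for (; *p != '\0'; p++)
        set->bits[*p >> 3] |= (unsigned char)(1u << (*p & 7));
}

/* Returns 1 if `c` is in the set, 0 otherwise. */
int char_set_contains(const char_set *set, char c)
{
    unsigned char uc = (unsigned char)c;

    assert(set != NULL);
    return (set->bits[uc >> 3] >> (uc & 7)) & 1;
}
```

One difference from the listing: its do/while loop also sets the bit for the terminating '\0', which is why its delimiter tests are combined with a check of `*str`. The sketch leaves that bit clear.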

Jul 19 '05 #3

I tried dropping it in, but I get a wad of errors starting with a
missing ";" on the 1st line.
Jul 19 '05 #4

"Sean Bartholomew" <ad****@rivaband.com> wrote in message
news:b8**************************@posting.google.c om...
i tried dropping it in but i get a wad of errors starting with a
missing ";" on the 1st line.


This was not unexpected. You'll note my statement: "isn't recommended for
use with other compilers (which can and will have issues with this)"
Jul 19 '05 #6

DOH!

Sometime in the future, maybe I'll learn to actually look at a new algorithm
implementation before submitting it.
My earlier recommendation and code do not produce the desired behavior.
Instead, the aforementioned 'while' (now an 'if') should be removed
entirely. The code as-was will still skip up to 2 consecutive delimiters
(being an idiot, I forgot that the while-statement actually starts at the
character after the last token delimiter, the one that was overwritten with
\0). Removing it gets rid of the skipping entirely and lets the for-loop
just below it take care of finding the delimiters.
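
The fix described above, dropping the skip step so that every delimiter ends exactly one token, can also be sketched without any compiler-specific source. The version below uses only standard `strcspn`; `tok_keep_empty` is a hypothetical name, and the explicit `save` pointer replaces the static `nextoken` as a design choice of this sketch, not something taken from the MSVC code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical function. On the first call pass the string in `str`; on
   later calls pass NULL. `*save` holds the position between calls and is
   set to NULL once the input has been used up. Empty fields are returned
   as "" instead of being skipped. */
char *tok_keep_empty(char *str, const char *delims, char **save)
{
    char *token;
    size_t len;

    assert(delims != NULL && save != NULL);

    token = (str != NULL) ? str : *save;
    if (token == NULL)
        return NULL;                 /* nothing left to scan */

    len = strcspn(token, delims);    /* characters before the first delimiter */
    if (token[len] != '\0') {
        token[len] = '\0';           /* cut the token at the delimiter */
        *save = token + len + 1;     /* resume just past it, with no skipping */
    } else {
        *save = NULL;                /* that was the final field */
    }
    return token;
}
```

Unlike the MSVC-derived code with the skip removed, which returns NULL as soon as it reaches the end of the string, this sketch also reports an empty field after a trailing delimiter. Some C libraries (the BSDs, glibc) provide `strsep`, which works along similar lines; it is not part of ISO C.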

My bad.

- Alex
Jul 19 '05 #7
