home grown strtok() function for review

Hello All.
I was looking around for a function like strtok() which would tokenize on
the complete list of delimiters, rather than tokenize on *any* of the
delimiters in the group. I ended up just rolling a function. Thought I
would post it here for discussion.

Thanks.

---------------------------------------------------------
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
//
// pass a buffer, a substring, and a current
// position to start looking within the buffer
//
char *mstrtok (char *buf, char *delim, int curpos)
{
char *srcp; // src pointer to start searching from
char *ptr; // return val from strstr()
char *freeret; // malloc() this if something to return
int len = 0; // holds length of word found between delims
int malsize = 0; // size of buffer space to malloc (len + 1)

// get a starting point by adding the src addr and current search
position
srcp = buf;
srcp += curpos;

// make sure src ptr is inside the buffer space
if (srcp <= (buf + strlen (buf)))
{
// find the next delim occurrence in the srcp
ptr = strstr (srcp, delim);

// was delim found ?
if (ptr)
{
// adjust by subtracting the source address from the ptr
address
len = (ptr - srcp);
}
else
{
// if not there, then len is the end of string minus the
current src address
printf (" debug.no strstr()\n");
len = (buf + strlen (buf)) - srcp;
}

// setup malloc buffer size and make room for NULL 0 at end of
string
malsize = len + 1;
freeret = malloc (malsize);

// did malloc fail?
if (freeret)
{
memset (freeret, 0x0, malsize);
strncpy (freeret, srcp, len);
printf (" len|%s %d\n", freeret, len);

}
else
{
// error - malloc failed. should we exit(1) here?
printf ("**error mstrtok(): unable to malloc %d bytes\n",
len);
}
}
else
{
// did not find another substring
printf (" debug.nosub\n") ;
freeret = NULL;
}
return (freeret);
}

//
// pass a string to the mstrtok function and print out results
//
int main (void)
{
char buf[] = "foo:.:bar:.:b: az:.:STARToiow: :.ii:::eeerrEND ";
char delim[] = ":.:";
int cur = 0;
char *mptr;

// can loop forever, look for NULL mptr from mstrtok()
do
{
// call mstrtok with buffer, delimiter and current position
// for looking in the string
mptr = mstrtok (buf, delim, cur);

// was a pointer returned?
if (mptr)
{
// keep track of current pos for next call.
cur += strlen (mptr) + strlen (delim);

// do something with newly malloc'd mptr and then free it
printf ("debug.mptr|%s \n", mptr);
free (mptr);
}
else
{
// a NULL pointer was returned. no other substrings to find
printf ("debug.mptr|null\n");
}

// debug
printf ("debug.cur|%d\n", cur);
}
while (mptr != NULL);
return (0);
}

Sep 19 '06 #1
default wrote:
> Hello All.
> I was looking around for a function like strtok() which would tokenize on
> the complete list of delimiters, rather than tokenize on *any* of the
> delimiters in the group. I ended up just rolling a function. Thought I
> would post it here for discussion.
>
> Thanks.
>
> ---------------------------------------------------------
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
> //
> // pass a buffer, a substring, and a current
> // position to start looking within the buffer
> //

Apart from these comments, your code is C90 or C99.
With these comments, your code is only C99 -- which is rather
unusual and unnecessary.
Even if you intended C99, // comments are dangerous
in Usenet messages because line breaks may change the source's
meaning.

To the beef:
Your comment mostly states the obvious but does not say what
the function does or what it returns.

> char *mstrtok (char *buf, char *delim, int curpos)

Neither buf nor delim is changed, so make them const char *.
The curpos _after_ searching for the delim string may be
of some interest.
By allowing
    int *pCurrpos
or
    size_t *pCurrpos
to be adjusted by mstrtok(), you do not have to calculate the
information twice.

> {
> char *srcp; // src pointer to start searching from
> char *ptr; // return val from strstr()
> char *freeret; // malloc() this if something to return
> int len = 0; // holds length of word found between delims
> int malsize = 0; // size of buffer space to malloc (len + 1)

If malsize only ever holds len+1, then it is rather unnecessary.

>
> // get a starting point by adding the src addr and current search
> position

Here the line broke and your code became uncompilable.

> srcp = buf;
> srcp += curpos;

Nice, but why not write
    srcp = &buf[curpos];

> // make sure src ptr is inside the buffer space
> if (srcp <= (buf + strlen (buf)))

Why not use
    if (curpos <= strlen(buf))
instead?

> {
> // find the next delim occurrence in the srcp

Useless comment.

> ptr = strstr (srcp, delim);
>
> // was delim found ?
> if (ptr)
> {
> // adjust by subtracting the source address from the ptr
> address
> len = (ptr - srcp);
> }
> else
> {
> // if not there, then len is the end of string minus the
> current src address

Useless comment.

> printf (" debug.no strstr()\n");
> len = (buf + strlen (buf)) - srcp;

Why not use
    len = strlen(buf) - curpos;
instead?

> }
>
> // setup malloc buffer size and make room for NULL 0 at end of
> string
> malsize = len + 1;
> freeret = malloc (malsize);
>
> // did malloc fail?
> if (freeret)
> {
> memset (freeret, 0x0, malsize);

You could calloc() instead or even manually add the string terminator.

> strncpy (freeret, srcp, len);
> printf (" len|%s %d\n", freeret, len);
>
> }
> else
> {
> // error - malloc failed. should we exit(1) here?
> printf ("**error mstrtok(): unable to malloc %d bytes\n",
> len);

Error output ought to go to stderr.

> }
> }
> else
> {
> // did not find another substring
> printf (" debug.nosub\n") ;
> freeret = NULL;

It is IMO clearer to initialise freeret to NULL at the
beginning.

> }
> return (freeret);
> }

> //
> // pass a string to the mstrtok function and print out results
> //
> int main (void)
> {
> char buf[] = "foo:.:bar:.:b: az:.:STARToiow: :.ii:::eeerrEND ";
> char delim[] = ":.:";

If you do not want to change buf or delim, use const char instead.

> int cur = 0;
> char *mptr;
>
> // can loop forever, look for NULL mptr from mstrtok()
> do
> {
> // call mstrtok with buffer, delimiter and current position
> // for looking in the string
> mptr = mstrtok (buf, delim, cur);
>
> // was a pointer returned?
> if (mptr)
> {
> // keep track of current pos for next call.
> cur += strlen (mptr) + strlen (delim);
>
> // do something with newly malloc'd mptr and then free it
> printf ("debug.mptr|%s \n", mptr);
> free (mptr);

Note: With free(mptr), the bit pattern in mptr (its representation)
may become a trap representation. Using mptr from here on may have
bad consequences.

> }
> else
> {
> // a NULL pointer was returned. no other substrings to find
> printf ("debug.mptr|null\n");
> }
>
> // debug
> printf ("debug.cur|%d\n", cur);
> }
> while (mptr != NULL);

Such as here.

>
> return (0);

return is not a function, so return 0 suffices.

> }

The same programme, using size_t instead of int and containing
the above corrections:

,---
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
/*
 * Extract string between &buf[curpos] and first occurrence of
 * delim or end of string; return extracted string in separately
 * malloc()ed storage; return NULL on malloc() failure or if
 * curpos > strlen(buf)
 */
static char *mstrtok (const char *buf,
                      const char *delim,
                      size_t curpos)
{
    char *freeret = NULL; /* malloc() this if something to return */

    size_t buf_strlen = strlen (buf);

    /* make sure src ptr is inside the buffer space */
    if (curpos <= buf_strlen)
    {
        const char *srcp = &buf[curpos];
        /* find the next delim occurrence in the srcp */
        char *ptr = strstr (srcp, delim);
        size_t len = 0; /* holds length of word found between delims */

        /* was delim found? */
        if (NULL != ptr)
        {
            len = (ptr - srcp);
        }
        else
        {
            fprintf (stderr, " debug.no strstr()\n");
            len = buf_strlen - curpos;
        }

        freeret = malloc(len + 1);

        if (NULL != freeret)
        {
            strncpy (freeret, srcp, len);
            freeret[len] = '\0';
            printf (" len|%s %lu\n", freeret, (unsigned long) len);
        }
        else
        {
            /* error - malloc failed. should we exit(1) here? */
            fprintf (stderr, "**error mstrtok():"
                     " unable to malloc %lu bytes\n",
                     (unsigned long) (len + 1));
        }
    }
    else
    {
        /* did not find another substring */
        fprintf (stderr, " debug.nosub\n") ;
    }

    return (freeret);
}

/* */
/* pass a string to the mstrtok function and print out results */
/* */
int main (void)
{
    const char buf[] = "foo:.:bar:.:b: az:.:STARToiow: :.ii:::eeerrEND ";
    const char delim[] = ":.:";
    size_t delim_strlen = strlen(delim);
    size_t cur = 0;
    char *mptr = NULL;

    /* can loop forever, look for NULL mptr from mstrtok() */
    do
    {
        /* release the token from the previous iteration (no-op on first pass) */
        free (mptr);
        mptr = mstrtok (buf, delim, cur);

        /* was a pointer returned? */
        if (NULL != mptr)
        {
            /* keep track of current pos for next call. */
            cur += strlen (mptr) + delim_strlen;

            /* do something with newly malloc'd mptr; it is freed at the */
            /* top of the next iteration */
            printf ("debug.mptr|%s \n", mptr);
        }
        else
        {
            /* a NULL pointer was returned. no other substrings to find */
            printf ("debug.mptr|null\n");
        }

        /* debug */
        printf ("debug.cur|%lu \n", (unsigned long)cur);
    }
    while (mptr != NULL);
    return (0);
}
`---

Note: free(NULL); is well defined (and has no effect).
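
To illustrate the two notes on free() (freeing NULL is harmless, and a pointer must not be used after it has been freed), here is a minimal sketch of a common idiom. The helper name release_token is hypothetical and not part of the posted program; it simply frees the storage and resets the caller's pointer so that a later test against NULL stays well defined.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the posted program): free the
 * storage *pp points to and reset *pp to NULL, so the caller never
 * holds an indeterminate pointer value afterwards.
 */
static void release_token (char **pp)
{
    assert (pp != NULL);
    free (*pp);      /* free(NULL) is a no-op, so *pp may already be NULL */
    *pp = NULL;
}
```

Calling release_token (&mptr) twice in a row is harmless, because the second call only hands NULL to free(). Whether this reads better than moving free() to the top of the loop, as done above, is a matter of taste.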

Note that if we changed that to
    static char *mstrtok (const char *buf,
                          const char *delim,
                          size_t *pCurrpos);
then we could change the do--while loop to
    do
    {
        free (mptr);
        mptr = mstrtok (buf, delim, &cur);

        /* was a pointer returned? */
        if (NULL != mptr)
        {
            /* do something with newly malloc'd mptr */
            printf ("debug.mptr|%s \n", mptr);
        }
        ....

Tests for corner cases you omitted:
    ":.:"
    ":.:.:"
    "foo:.::.:bar"
These are important and should be considered.
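
As a rough sketch of the pointer-updating variant suggested above, the function below takes size_t *pCurrpos and advances it itself. The name mstrtok_pos, the debug-free body, and the decision to reject an empty delimiter are my assumptions, not something taken from the posted code; the token and position logic otherwise follows the program above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical variant (mstrtok_pos is not from the thread): extract the
 * text between &buf[*pCurrpos] and the next occurrence of delim (or the
 * end of the string) into malloc()ed storage, then advance *pCurrpos
 * past the token and the delimiter.  Returns NULL when *pCurrpos is
 * already beyond the end of buf, or when malloc() fails.
 */
static char *mstrtok_pos (const char *buf, const char *delim, size_t *pCurrpos)
{
    size_t buf_len, len;
    const char *srcp, *ptr;
    char *tok;

    assert (buf != NULL && delim != NULL && pCurrpos != NULL);
    assert (delim[0] != '\0');  /* assumption: empty delimiter unsupported */

    buf_len = strlen (buf);
    if (*pCurrpos > buf_len)
        return NULL;            /* no further tokens */

    srcp = &buf[*pCurrpos];
    ptr = strstr (srcp, delim);
    len = (ptr != NULL) ? (size_t) (ptr - srcp) : buf_len - *pCurrpos;

    tok = malloc (len + 1);
    if (tok == NULL)
        return NULL;            /* caller cannot tell this from "no token" */
    memcpy (tok, srcp, len);
    tok[len] = '\0';

    /* After the last token this moves past the end of buf, */
    /* so the next call returns NULL.                       */
    *pCurrpos += len + strlen (delim);
    return tok;
}
```

Stepping through the three corner cases with this sketch: ":.:" yields two empty tokens, ":.:.:" yields an empty token followed by ".:", and "foo:.::.:bar" yields "foo", an empty token, and "bar". Whether empty tokens are wanted is a design decision the caller has to make; strtok() itself never returns them.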
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Sep 19 '06 #2
Michael Mair wrote:
> default wrote:
>> char *mstrtok (char *buf, char *delim, int curpos)
>> {
>> char *srcp; // src pointer to start searching from
>> char *ptr; // return val from strstr()
>> char *freeret; // malloc() this if something to return
>> int len = 0; // holds length of word found between delims
>> int malsize = 0; // size of buffer space to malloc (len + 1)
>>
>> srcp = buf;
>> srcp += curpos;
>
> Nice, but why not write
> srcp = &buf[curpos];

srcp = buf + curpos; /*?*/

--
imalone
Sep 20 '06 #3
Ian Malone wrote:
> Michael Mair wrote:
>> default wrote:
> <snip>
>>> char *srcp; // src pointer to start searching from
> <snip>
>>> srcp = buf;
>>> srcp += curpos;
>>
>> Nice, but why not write
>> srcp = &buf[curpos];
>
> srcp = buf + curpos; /*?*/

This is, of course, equivalent.
The OP did use quite much unnecessary pointer arithmetic
operations, so I'd rather stress that they are not strictly
necessary. In addition, I like the "address of array element"
way better :-)
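
For anyone who wants to convince themselves of the equivalence, here is a tiny sketch. The helper names are hypothetical and exist only for this illustration; buf[curpos] is defined as *(buf + curpos), so taking its address gives buf + curpos again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, named here only for illustration. */
static const char *by_index (const char *buf, size_t curpos)
{
    return &buf[curpos];      /* address of array element */
}

static const char *by_arithmetic (const char *buf, size_t curpos)
{
    return buf + curpos;      /* explicit pointer arithmetic */
}

/* Returns 1 when both spellings yield the same address. */
static int same_address (const char *buf, size_t curpos)
{
    assert (buf != NULL);
    return by_index (buf, curpos) == by_arithmetic (buf, curpos);
}
```

Both forms are only valid while curpos stays within the string (including the position of the terminating '\0'), which is what the range check in mstrtok() guards.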

Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Sep 20 '06 #4
Michael Mair wrote:
> <snip>
> The OP did use quite much unnecessary pointer arithmetic
> operations, so I'd rather stress that they are not strictly
> necessary. In addition, I like the "address of array element"
> way better :-)

Hi Michael.
Thank you for the input! All very good points. I like the idea of
using the const keyword, as it tells a lot about the parameter list
at a glance. I always thought it was a compiler optimization of
some kind. I guess it might be that too, but it sure makes things
clearer. I also like your approach of indexing the string with
square brackets rather than pointer arithmetic; it does simplify
things a lot.

Sep 21 '06 #5
