
strtok stops parsing after encountering odd chars like - " /

P: 2
Hi,

I have a text file as input which has odd chars like: - // "
Below is a sample:
Jul 6 00:00:13 proxyhost httpd[7266]: [ID 477494 local3.info] 172.26.74.212 - [06/Jul/2007:00:00:13 -0400] "GET http://www.google.com/gwt/i?i=089E3394C_C235D1FF_FA6C3045 HTTP/1.1" 200 200 383 78881e-1 N COMPLETE 158 REQ:220:0:130:4:100:0:10:8:30:0:60:0:5:0:200:0 EXE:80:0:125:0:15:0:230:0 MOD:240:0:210:0:90:0:110:0:20:0 RES:40:0:70:30:120:0:260:0 PD:126:0 113 - 1

I read the file line by line and split it with strtok, but once it reaches the odd chars strtok is not doing its job. Below is the C code I use:

/* Counter */
int i=0;

/* Read the file until EOF */
while ( fgets ( line, sizeof line, input_file_pointer ) != NULL ) /* read a line */
{
    pch = strtok (line, " ");
    arra[i]= pch;
    while (pch != NULL)
    {
        pch = strtok (NULL, " ");
        arra[i]= pch;
        i++;
    }
}
printf("(%s) (%s) (%s) (%s) (%s) (%s) (%s) (%s) (%s) (%s) (%s) (%s) (%s) (%s) (%s)\n", arra[0],arra[1],arra[2],arra[3],arra[4],arra[5],arra[6],arra[7],arra[8],arra[9],arra[10],arra[11],arra[12],arra[13],arra[14]);

Because of the "-" character, strtok stops processing after it. If I add "-" to the delimiters, I get further, but it breaks again at the " character and again at the "/" in http://www.google.com.

The output I get with only a space as the delimiter:
(6) (00:00:18) (to5magproxy1) (httpd[7141]:) ([ID) (477494) (local3.info]) (172.30.128.215) () () () () (T) (1) (TE)

The problems are:
- The 1st field, stored in arra[0] (should contain Jul), is missing.
- Once strtok hits the - character, it stops parsing.

Is there a way to overcome this strtok behaviour and force it to parse regardless of the character types?
Jul 9 '07 #1
3 Replies


gpraghuram
Expert 100+
P: 1,275
Hi,
First of all, there is a problem in your code.
Instead of copying the string, you are assigning the pointer to the array. That pointer refers into line, so the token is lost as soon as the next fgets() overwrites line.
arra[i]= pch; // this is wrong
strcpy(arra[i],pch); // use this
You can pass multiple delimiter characters to strtok, like strtok(str, " -").
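To make the delimiter argument concrete, here is a minimal sketch. The helper name split_tokens and its parameters are made up for illustration and are not part of the original code. Note that with only " " as the delimiter, strtok treats -, " and / as ordinary characters and keeps them inside the tokens; it does not stop at them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): splits `line` in place
 * on any character found in `delims` and stores up to `max` token pointers
 * in `out`. Returns the number of tokens stored. The pointers refer into
 * `line`, so they are only valid until `line` is modified again. */
size_t split_tokens(char *line, const char *delims, char *out[], size_t max)
{
    size_t n = 0;
    char *tok;

    assert(line != NULL && delims != NULL && out != NULL);

    /* Test the token for NULL BEFORE storing it, and count every token. */
    for (tok = strtok(line, delims); tok != NULL && n < max;
         tok = strtok(NULL, delims)) {
        out[n++] = tok;
    }
    return n;
}
```

Calling split_tokens(buf, " ", tok, 8) splits on spaces only, while split_tokens(buf, " -\"", tok, 8) also drops dashes and double quotes from the result.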
But if you don't want to tokenize on those other characters, write your own function to remove the special characters before calling strtok.
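One possible shape for such a function, as a rough sketch only. The name remove_chars is invented here, and it assumes you really do want those characters deleted rather than kept in the tokens.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the original post): deletes every character
 * listed in `unwanted` from `s`, in place, and returns `s`. */
char *remove_chars(char *s, const char *unwanted)
{
    char *src, *dst;

    assert(s != NULL && unwanted != NULL);

    /* Copy each kept character down over the removed ones. */
    for (src = dst = s; *src != '\0'; src++) {
        if (strchr(unwanted, *src) == NULL) {
            *dst++ = *src;
        }
    }
    *dst = '\0';
    return s;
}
```

You would call it on each line right after fgets(), for example remove_chars(line, "-\"/"), and then tokenize the cleaned line with strtok(line, " ").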

Raghuram
Jul 10 '07 #2

P: 2
Hi,

As advised, I put strcpy(arra[i],pch); instead of arra[i]= pch; but now I get "Segmentation Fault - core dumped" when the program runs this statement.
It compiles with no warnings or errors. Here is the new code:
...
    /* --- Initialize the variables --- */
    int i=0;

    /* variable to read in the words from the file */
    char line[3000]; /* no line above 3000 characters */
    char *pch;

    /* Array to store the input line */
    char *arra[3000];
...
    /* Read the file until EOF */
    while ( fgets ( line, sizeof line, input_file_pointer ) != NULL ) /* read a line */
    {
        pch = strtok (line, " ");
        //arra[i]= pch;
        strcpy(arra[i],pch);
        return 0;
        printf("(%s)\n",arra[0]);
        return 0;
        while (pch != NULL)
        {
            pch = strtok (NULL, " ");
            //arra[i]= pch;
            strcpy(arra[i],pch);
            i++;
        }

Thx
Jul 10 '07 #3

weaknessforcats
Expert Mod 5K+
P: 9,197
As advised, I put strcpy(arra[i],pch); instead of arra[i]= pch; but now I get "Segmentation Fault - core dumped" when the program runs this statement.
arra[i] is an uninitialized pointer. You need to allocate memory for it before you strcpy() into it.

arra[i] = malloc(strlen(pch) +1);  // extra byte for the null terminator
and then remember to free() this memory when you are finished with it.
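Putting the allocate, copy and free steps together, a sketch for one line of input might look like the following. The helper names copy_tokens and free_tokens are invented for illustration, and adding "\n" to the delimiters is an assumption made so the newline left by fgets() does not end up in the last token. The loop also tests pch for NULL before using it, since passing NULL to strlen() or strcpy() would crash as well.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: releases the first `n` copies stored in `out`. */
void free_tokens(char *out[], int n)
{
    int i;

    for (i = 0; i < n; i++) {
        free(out[i]);
        out[i] = NULL;
    }
}

/* Hypothetical helper: tokenizes `line` on spaces and newlines and stores a
 * heap-allocated copy of each token in `out` (at most `max` entries).
 * Returns the number of tokens stored, or -1 if malloc() fails. */
int copy_tokens(char *line, char *out[], int max)
{
    int n = 0;
    char *pch;

    assert(line != NULL && out != NULL);

    for (pch = strtok(line, " \n"); pch != NULL && n < max;
         pch = strtok(NULL, " \n")) {
        out[n] = malloc(strlen(pch) + 1);  /* extra byte for the null terminator */
        if (out[n] == NULL) {
            free_tokens(out, n);           /* undo the copies made so far */
            return -1;
        }
        strcpy(out[n], pch);
        n++;
    }
    return n;
}
```

Inside the fgets() loop you would call n = copy_tokens(line, arra, 3000), print only arra[0] through arra[n-1], and then call free_tokens(arra, n) before reading the next line.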
Jul 10 '07 #4
