regex.h

based on http://souptonuts.sourceforge.net/code/regexp.c.html

#include <stdio.h>
#include <string.h>
#include <regex.h>

int split(char * output[100], char * input){
  regex_t preg;
  regmatch_t pmatch[100];
  size_t i;
  char inputsub[sizeof(input)];

  if(regcomp (&preg, "select", REG_EXTENDED)) output[0]="Invalid expression";
  else {
    if(regexec (&preg, input, 100, pmatch, 0)) output[0]="No match";
    else {
      for (i = 0; i <= preg.re_nsub; i++){
        strncpy (inputsub, input + pmatch[i].rm_so, pmatch[i].rm_eo - pmatch[i].rm_so);
        //inputsub[pmatch[i].rm_eo - pmatch[i].rm_so] = '\0'; // <-------????????
        output[i]=inputsub;
      }
    }
  }
  regfree (&preg);
  printf (""); //<----------????????????????
  return 0;
}

int xml(char * input){
  char * output[100];
  split(output,input);
  printf("Content-Type: text/xml; charset=utf-8"
         "\r\n"
         "\r\n"
         "<xml>%s</xml>"
         "\n",output[0]);

  return 0;
}

is this line really necessary ?

inputsub[pmatch[i].rm_eo - pmatch[i].rm_so] = '\0';

and why do i have to do printf ("") ? if i dont do printf("") the
second printf doesnt have a output[0] value ?

printf("Content-Type: text/xml; charset=utf-8"
"\r\n"
"\r\n"
"<xml>%s</xml>"
"\n",output[0]);

Feb 18 '07 #1
gert <ge**********@gmail.com> wrote:
> based on http://souptonuts.sourceforge.net/code/regexp.c.html
> #include <stdio.h>
> #include <string.h>
> #include <regex.h>
> int split(char * output[100], char * input){
> regex_t preg;
> regmatch_t pmatch[100];
> size_t i;
> char inputsub[sizeof(input)];

This looks very wrong. 'input' is a char pointer, so sizeof(input)
is the size of the pointer itself (typically 4 or 8 bytes). The
resulting array is rather likely to be much too short for holding
what you are going to copy to it. I guess you somehow want to make
the array as long as the amount of memory 'input' points to. But you
can't do that since there's no information about how much that is - a
pointer does not contain any information about the amount of memory
it points to. Perhaps you meant

char inputsub[ strlen( input ) + 1 ];

But keep in mind that this is only possible in C99. In C89 there
were no arrays whose length could be set at runtime (VLAs), so you
would have to use malloc() to obtain as much memory as you need
here.
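
A minimal sketch of the difference, assuming two hypothetical helpers (pointer_size and alloc_like are illustrative names, not part of the original program). The first reports what sizeof sees for a pointer parameter; the second obtains a buffer sized from the text itself with malloc(), which also works in C89:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: sizeof applied to a pointer parameter yields the
   size of the pointer, no matter how long the text it points to is. */
size_t pointer_size(const char *input)
{
    return sizeof(input);
}

/* Hypothetical helper: allocate a buffer large enough to hold any
   substring of input plus the terminating '\0'. Returns NULL if the
   allocation fails; the caller must free() the result. */
char *alloc_like(const char *input)
{
    char *buf = malloc(strlen(input) + 1);
    if (buf != NULL)
        buf[0] = '\0';   /* start out as an empty string */
    return buf;
}
```

With this, split() could call alloc_like(input) in place of declaring inputsub as an array, at the cost of having to free the buffer later.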

> if(regcomp (&preg, "select", REG_EXTENDED)) output[0]="Invalid expression";
> else {
> if(regexec (&preg, input, 100, pmatch, 0)) output[0]="No match"; else {
> for (i = 0; i <= preg.re_nsub; i++){
> strncpy (inputsub, input + pmatch[i].rm_so, pmatch[i].rm_eo - pmatch[i].rm_so);

Here you copy only a certain number of characters, and if this is
shorter than the source string the destination string isn't ended
with a '\0' (strncpy() does *not* add one by itself), so the
destination string isn't a string at all.

> //inputsub[pmatch[i].rm_eo - pmatch[i].rm_so] = '\0'; // <-------????????

And that line, if uncommented, adds a '\0' to the destination
string, thus making it a real string.
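
A small sketch of that behaviour, assuming a hypothetical helper copy_span (the name and its parameters are illustrative only). It copies len characters starting at offset start and then writes the terminator that strncpy() leaves out when the source is at least len characters long:

```c
#include <string.h>

/* Hypothetical helper: copy len characters of src, beginning at offset
   start, into dst and terminate the result. dst must have room for
   len + 1 characters. */
void copy_span(char *dst, const char *src, size_t start, size_t len)
{
    /* strncpy() writes no '\0' here if src + start holds len or more
       characters, which is the usual case for a regex match. */
    strncpy(dst, src + start, len);
    dst[len] = '\0';   /* so the terminator is added by hand */
}
```

Called as copy_span(buf, "say select now", 4, 6), this leaves the string "select" in buf.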

> output[i]=inputsub;

And here things go badly wrong. 'inputsub' is a local array whose
lifetime ends the moment you leave this function, so whatever
the elements of 'output' are pointing to then is memory you can't
use anymore. Moreover, you assign the same address to all the
elements of 'output' - copying a pointer isn't the same as copying a
string (and for that you would need additional memory).
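
One way to give every element its own memory is sketched below, assuming a hypothetical helper dup_span (not part of the original code). Each call returns a separate heap copy that stays valid after the function returns, so no two elements of 'output' share an address:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated, '\0'-terminated copy
   of the first len characters of src, or NULL if malloc() fails.
   The caller owns the copy and must free() it. */
char *dup_span(const char *src, size_t len)
{
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, src, len);
    copy[len] = '\0';
    return copy;
}
```

In the loop this would replace the assignment with something like output[i] = dup_span(input + pmatch[i].rm_so, pmatch[i].rm_eo - pmatch[i].rm_so), with the caller freeing each element when done.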

> }
> }
> }
> regfree (&preg);
> printf (""); //<----------????????????????

This is actually completely useless.

> return 0;
> }
> int xml(char * input){
> char * output[100];
> split(output,input);
> printf("Content-Type: text/xml; charset=utf-8"
> "\r\n"
> "\r\n"
> "<xml>%s</xml>"
> "\n",output[0]);
> return 0;
> }
> and why do i have to do printf ("") ? if i dont do printf("") the
> second printf doesnt have a output[0] value ?

Probably because you have some bad mistakes in your program and
for some strange reason putting in the printf("") makes it look
as if they weren't there. If you invoke undefined behaviour, e.g.
by writing past the end of an array or by using pointers to
memory whose lifetime has already ended, basically anything
can happen - it may even look as if it works flawlessly,
or look as if it works only once you add some otherwise completely
useless code.
Regards, Jens
--
\ Jens Thoms Toerring ___ jt@toerring.de
\__________________________ http://toerring.de
Feb 18 '07 #2
"gert" <ge**********@gmail.com> writes:
> [...]
> int split(char * output[100], char * input){
> [...]

I'll just make one point about this particular line; I might come back
with more comments later.

Parameter declarations that look like array declarations are tricky.
In this case:

char *output[100]

the "100" will be silently ignored by the compiler. What looks like
an array parameter declaration is really a pointer parameter
declaration (it's not converted as most array expressions are, it's
actually translated at compile time). It would be clearer to write
it as:

int split(char *output[], char *input)

or even what I'd prefer:

int split(char **output, char *input);

See section 6 of the comp.lang.c FAQ, <http://www.c-faq.com/>.
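
A short sketch of what this adjustment means in practice, using the hypothetical names first_len and first_len_ptr (both invented for illustration). Because the parameter type really is char **, the function can be stored in a pointer to a function taking char **, and it accepts an array of any length:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: declared with an array-style parameter, but the
   compiler adjusts the parameter type to char **; the 100 is ignored. */
size_t first_len(char *output[100])
{
    return strlen(output[0]);
}

/* The initialization needs no cast because the two function types are
   identical after the adjustment. */
size_t (*first_len_ptr)(char **) = first_len;
```

Calling first_len_ptr with a one-element array such as char *one[1] = { "select" }; is therefore valid and returns 6.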

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Feb 18 '07 #3
Second attempt

int split(char ** output, char * input){
  regex_t reg;
  regmatch_t match[100];
  size_t i;

  if(regcomp (&reg, "select", REG_EXTENDED)) output[0]="Invalid expression";
  else {
    if(regexec (&reg, input, 100, match, 0)) output[0]="No match";
    else {
      for (i = 0; i <= reg.re_nsub; i++){
        strncpy (output[i], input + match[i].rm_so, match[i].rm_eo - match[i].rm_so);
        output[i][match[i].rm_eo - match[i].rm_so] = '\0';
      }
    }
  }
  regfree (&reg);
  return 0;
}

int xml(char * input){
  char output[100][strlen( input ) + 1];
  split(output,input);
  printf("Content-Type: text/xml; charset=utf-8"
         "\r\n"
         "\r\n"
         "<xml>%s</xml>"
         "\n",output[0]);
  return 0;
}

gert@gert:~$ Desktop/svn/db/c/make.sh
In file included from /home/gert/Desktop/svn/db/c/post.c:4:
/home/gert/Desktop/svn/db/c/xml.c: In function 'xml':
/home/gert/Desktop/svn/db/c/xml.c:8: warning: passing argument 1 of 'split' from incompatible pointer type
Content-Type: text/xml; charset=utf-8

<xml>o?</xml>
gert@gert:~$

I did something wrong again

Feb 18 '07 #4
gert wrote:
> Second attempt
>
> int split(char ** output, char * input){
> int xml(char * input){
> char output[100][strlen( input ) + 1];
> split(output,input);
> /home/gert/Desktop/svn/db/c/xml.c:8: warning: passing argument 1 of 'split' from incompatible pointer type
> Content-Type: text/xml; charset=utf-8
>
> <xml>o?</xml>
> gert@gert:~$
>
> I did something wrong again

<http://c-faq.com/aryptr/pass2dary.html>

Brian
Feb 18 '07 #5
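
The linked FAQ entry concerns passing a two-dimensional array to a function: a char [100][N] array decays to a pointer to an array of N chars, not to a char **, which is why the compiler warns. A minimal C99 sketch of one way to declare such a parameter follows; set_row is a hypothetical helper and cols is an assumed extra parameter carrying the row width:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store text in row `row` of a two-dimensional char
   array whose rows are cols characters wide. The parameter has type
   char (*)[cols], matching what a char [rows][cols] array decays to.
   Text longer than cols - 1 characters is truncated. */
void set_row(size_t cols, char output[][cols], size_t row, const char *text)
{
    strncpy(output[row], text, cols - 1);
    output[row][cols - 1] = '\0';
}
```

For the xml() function in the second attempt this would mean passing strlen(input) + 1 as cols alongside the array, assuming the compiler supports variable length arrays.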
#include <stdio.h>
#include <string.h>
#include <regex.h>

int split(char output[100][100], char * input){
  char * temp;
  regex_t reg;
  regmatch_t match[100];
  size_t i,tlen;

  if(regcomp (&reg, "select", REG_EXTENDED)){
    temp="Invalid expression";
    tlen=strlen(temp)+1;            /* length including the '\0' */
    strncpy (output[0],temp,tlen);  /* error text goes in slot 0 */
    return 0;                       /* nothing was compiled, so no regfree */
  }
  if(regexec (&reg, input, 100, match, 0)){
    temp="No match";
    tlen=strlen(temp)+1;            /* length including the '\0' */
    strncpy (output[0],temp,tlen);  /* error text goes in slot 0 */
  }
  else {
    for (i = 0; i <= reg.re_nsub; i++){
      /* assumes every match is shorter than 100 characters */
      strncpy (output[i], input + match[i].rm_so, match[i].rm_eo - match[i].rm_so);
      output[i][match[i].rm_eo - match[i].rm_so] = '\0';
    }
  }
  regfree (&reg);
  return 0;
}

int xml(char * input){
  char output[100][100];
  split(output,input);
  printf("Content-Type: text/xml; charset=utf-8"
         "\r\n"
         "\r\n"
         "<xml>%s</xml>"
         "\n",output[0]);
  return 0;
}

This works, but I want to make char output[100][100] dynamic so I can
have as many matches as needed. How can I do that?
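
One possible direction, sketched under assumptions rather than offered as the definitive answer: size everything from re_nsub, which regcomp() fills in, and allocate each string with malloc(). split_dynamic and free_matches are hypothetical names, and passing the pattern as a parameter is an assumption; the thread hard-codes "select". The sketch returns the whole match followed by one string per parenthesised subexpression. Finding repeated occurrences in the input would additionally need a loop that calls regexec() on the remainder of the string.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: release an array returned by split_dynamic(). */
void free_matches(char **matches, size_t count)
{
    if (matches == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(matches[i]);
    free(matches);
}

/* Hypothetical helper: match pattern against input and return a malloc'd
   array of *count strings (whole match first, then each subexpression).
   Returns NULL with *count == 0 if the pattern does not compile, nothing
   matches, or memory runs out. */
char **split_dynamic(const char *pattern, const char *input, size_t *count)
{
    regex_t reg;
    char **matches = NULL;

    *count = 0;
    if (regcomp(&reg, pattern, REG_EXTENDED) != 0)
        return NULL;

    size_t n = reg.re_nsub + 1;   /* whole match plus subexpressions */
    regmatch_t *match = malloc(n * sizeof *match);
    if (match != NULL && regexec(&reg, input, n, match, 0) == 0)
        matches = calloc(n, sizeof *matches);

    for (size_t i = 0; matches != NULL && i < n; i++) {
        /* rm_so is -1 for a subexpression that took no part in the match */
        size_t len = match[i].rm_so < 0
                   ? 0 : (size_t)(match[i].rm_eo - match[i].rm_so);
        matches[i] = malloc(len + 1);
        if (matches[i] == NULL) {
            free_matches(matches, n);
            matches = NULL;
            break;
        }
        memcpy(matches[i], input + (len ? match[i].rm_so : 0), len);
        matches[i][len] = '\0';
    }
    if (matches != NULL)
        *count = n;
    free(match);
    regfree(&reg);
    return matches;
}
```

The caller would print matches[0] as before and call free_matches(matches, count) afterwards. The NULL return replaces the "Invalid expression" and "No match" strings, which is a design choice of this sketch.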

Feb 19 '07 #6
