preg_match and delimited strings

siromega@gmail.com
#1: May 30 '06
I have a string I'd like to have broken into parts using preg_match. I
used a regular expression from the Perl FAQ (http://perlfaq.cpan.org/):


push(@new, $+) while $text =~ m{
    "([^\"\\]*(?:\\.[^\"\\]*)*)",?  # groups the phrase inside the quotes
  | ([^,]+),?
  | ,
}gx;

The idea is that it splits delimited strings respecting the quotes, for
example...

foo, bar, "foo, bar", bar
would end up as
-foo
-bar
-"foo, bar"
-bar

So obviously an explode won't work.

I can't figure out how to convert that piece of Perl above into
preg_match. I've copied the string and escaped all the appropriate
characters, however it still won't divide the string shown above
properly...

$str = "foo, bar, \"foo, bar\", bar";
$re = "\"([^\\\"\\\\]*(?:\\\\.[^\\\"\\\\]*)*)\",?| ([^,]+),?| ,";
if (preg_match($re, $str, $res)) {
print_r($res);
}
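
One likely reason the attempt above misbehaves: the pattern has no delimiters, so PHP takes the leading " as the delimiter, and without the x modifier the spaces copied from the Perl layout have to match literally. preg_match() also stops after the first match, whereas the Perl loop relies on the g flag. A minimal sketch of a closer translation follows, assuming PHP 7 or later. The function name split_delimited() and the added \s* (to tolerate the space after each comma in the sample string) are assumptions for illustration, not part of the Perl FAQ pattern.

```php
<?php
// Hypothetical helper: split a comma-delimited string, respecting double quotes.
function split_delimited(string $text): array
{
    // A nowdoc keeps backslashes literal, so the pattern reads like the Perl original.
    // The x modifier allows the layout and comments; \s* is an added assumption.
    $re = <<<'RE'
/
    \s* "([^"\\]*(?:\\.[^"\\]*)*)" \s* ,?   # quoted field, text captured in group 1
  | \s* ([^,]+) ,?                          # unquoted field, captured in group 2
  | \s* ,                                   # empty field
/x
RE;

    $fields = [];
    // preg_match_all() plays the role of the g flag in the Perl loop.
    preg_match_all($re, $text, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        // Emulate Perl's $+ (the last group that took part in the match).
        $fields[] = $m[2] ?? $m[1] ?? '';
    }
    return $fields;
}

print_r(split_delimited('foo, bar, "foo, bar", bar'));
```

On the sample string this should give four fields, with "foo, bar" kept together and its quotes stripped. Unquoted fields keep any trailing whitespace, so a trim() may still be wanted.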

I have other strings that need parsing too (including ones with a
doubled encapsulator inside the string, e.g. 'don''t touch that!'). But
I'm saving those for later.


Rik
#2: May 31 '06


siromega@gmail.com wrote:
> push(@new, $+) while $text =~ m{
> "([^\"\\]*(?:\\.[^\"\\]*)*)",? # groups the phrase inside the quotes
> | ([^,]+),?
> | ,
> }gx;
>
> I can't figure out how to convert that piece of Perl above into
> preg_match. I've copied the string and escaped all the appropriate
> characters, however it still won't divide the string shown above
> properly...
>
> $str = "foo, bar, \"foo, bar\", bar";
> $re = "\"([^\\\"\\\\]*(?:\\\\.[^\\\"\\\\]*)*)\",?| ([^,]+),?| ,";
> if (preg_match($re, $str, $res)) {
> print_r($res);
> }

It's a bitch for sure. Isn't the wonderful function fgetcsv() perhaps
applicable in this case?

Grtz,
--
Rik Wasmus


siromega@gmail.com
#3: May 31 '06


Rik wrote:
> It's a bitch for sure. Isn't the wonderful function fgetcsv() perhaps
> applicable in this case?

Rik,

I'd have to write that part of the SQL query out to a file and then
read it back in with fgetcsv(). However, I do need to thank you: when
I read through the fgetcsv() manual page and its user-contributed
notes, I found a regex and a function that accomplished what I
needed...

function csv_string_to_array($str){
    $expr="/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/";
    $results=preg_split($expr,trim($str));
    return preg_replace("/^\"(.*)\"$/","$1",$results);
}

I just had to replace the \" with ' to get it to parse my strings. I
tested it with both '' and , in the encapsulated string and it works!
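
On newer PHP versions (5.3 and later) str_getcsv() parses a string directly, so no temporary file is needed. A minimal sketch, assuming a single quote as the encapsulator and a doubled '' as the embedded form, as in the strings described above. The sample data is made up for illustration.

```php
<?php
// Sample input is hypothetical: single-quote encapsulator, '' as an embedded quote.
$str = "foo,bar,'foo, bar','don''t touch that!'";

// Arguments: input, delimiter, enclosure, escape character.
$fields = str_getcsv($str, ',', "'", '\\');

print_r($fields);
```

Unquoted fields may keep surrounding whitespace, so applying trim() to each element is a reasonable precaution when the data has spaces after the commas.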

Rik
#4: May 31 '06


siromega@gmail.com wrote:
> I found a regex and a function that accomplished what I
> needed...
>
> function csv_string_to_array($str){
> $expr="/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/";
> $results=preg_split($expr,trim($str));
> return preg_replace("/^\"(.*)\"$/","$1",$results);
> }
> I just had to replace the \" with ' to get it to parse my strings. I
> tested it with both '' and , in the encapsulated string and it works!


I hate to be the bearer of bad tidings, but it will fail on a single
escaped \":

$str = '"baz,\"bax", bay, foz, "a, b", foo';
$expr="/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/";
$results=preg_split($expr,trim($str));
$res = preg_replace("/^\"(.*)\"$/","$1",$results);
print_r($res);

Array
(
[0] => "baz
[1] => \"bax"
[2] => bay
[3] => foz
[4] => "a, b"
[5] => foo
)

Properly nested, it can handle it:
$str = '"baz,\"bax\"", bay, foz, "a, b", foo';

Array
(
[0] => baz,\"bax\"
[1] => bay
[2] => foz
[3] => "a, b"
[4] => foo
)

Also a nice one is:
$str = 'bay, foz, "a, b", "fo\"o"';


Array
(
[0] => bay, foz, "a
[1] => b", "fo\"o"
)


Not a very robust regex, then.
But maybe it's OK for your data; it all depends on how much you know
for sure about it.
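
The failures above follow from how the lookahead works: it accepts a comma only when an even number of double quotes follows it, and it counts an escaped \" like any other quote. A small sketch that reports that count for each comma in the last example (the helper name quotes_after_commas() is made up for illustration):

```php
<?php
// Hypothetical helper: map each comma's offset to the number of " characters after it.
function quotes_after_commas(string $str): array
{
    $report = [];
    $offset = 0;
    while (($pos = strpos($str, ',', $offset)) !== false) {
        // Count every double quote to the right of this comma, escaped or not.
        $report[$pos] = substr_count($str, '"', $pos + 1);
        $offset = $pos + 1;
    }
    return $report;
}

$str = 'bay, foz, "a, b", "fo\"o"';
foreach (quotes_after_commas($str) as $pos => $count) {
    // The lookahead in the regex only succeeds when the count is even.
    $verdict = ($count % 2 === 0) ? 'split' : 'no split';
    printf("comma at offset %2d: %d quote(s) follow, %s\n", $pos, $count, $verdict);
}
```

For that string the comma inside "a, b" is followed by four quotes, so it is the only comma the regex splits on.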

Grtz,
--
Rik Wasmus


siromega@gmail.com
#5: May 31 '06


Rik wrote:
> I hate to be the bearer of bad tidings, but it will fail on a single
> escaped \":
> [...]
> Not a very robust one here.
> But maybe it's OK for your data, it all depends how much you know for sure
> about that.

Indeed, that is correct. I'm going to have to check my data and see
if it'll work. Do you happen to know if fgetcsv() properly handles a
single " ?
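
One way to check this against real data without writing a file is to feed the string to fgetcsv() through an in-memory stream. A minimal sketch, assuming the default delimiter, enclosure and backslash escape character. The sample line is made up and has no spaces after the commas.

```php
<?php
// Hypothetical sample line with a backslash-escaped quote inside a quoted field.
$line = 'bay,foz,"a, b","fo\"o"';

// php://memory behaves like a file handle but lives in RAM.
$handle = fopen('php://memory', 'r+');
fwrite($handle, $line . "\n");
rewind($handle);

// Arguments: handle, length (0 = no limit), delimiter, enclosure, escape character.
$row = fgetcsv($handle, 0, ',', '"', '\\');
fclose($handle);

print_r($row);
```

Swapping in a line from the actual data shows directly how the stray or escaped quotes are treated.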
