preg_match and delimited strings
I have a string I'd like to have broken into parts using preg_match. I
used a regular expression from the Perl FAQ (http://perlfaq.cpan.org/):
push(@new, $+) while $text =~ m{
"([^\"\\]*(?:\\.[^\"\\]*)*)",? # groups the phrase inside the quotes
| ([^,]+),?
| ,
}gx;
The idea is that it splits delimited strings respecting the quotes, for
example...
foo, bar, "foo, bar", bar
would end up as
-foo
-bar
-"foo, bar"
-bar
So obviously an explode() won't work.
I can't figure out how to convert that piece of Perl above into
preg_match. I've copied the pattern and escaped all the appropriate
characters, but it still won't divide the string shown above
properly...
$str = "foo, bar, \"foo, bar\", bar";
$re = "\"([^\\\"\\\\]*(?:\\\\.[^\\\"\\\\]*)*)\",?| ([^,]+),?| ,";
if (preg_match($re, $str, $res)) {
print_r($res);
}
I have other strings that need parsing too (including ones with a
double encapsulator inside the string - eg. 'don''t touch that!'). But
I'm saving those for later.

re: preg_match and delimited strings
siromega@gmail.com wrote:
> push(@new, $+) while $text =~ m{
> "([^\"\\]*(?:\\.[^\"\\]*)*)",? # groups the phrase inside the quotes
> | ([^,]+),?
> | ,
> }gx;
>
> I can't figure out how to convert that piece of Perl above into
> preg_match. I've copied the pattern and escaped all the appropriate
> characters, but it still won't divide the string shown above
> properly...
>
> $str = "foo, bar, \"foo, bar\", bar";
> $re = "\"([^\\\"\\\\]*(?:\\\\.[^\\\"\\\\]*)*)\",?| ([^,]+),?| ,";
> if (preg_match($re, $str, $res)) {
> print_r($res);
> }
It's a bitch for sure. Isn't the wonderful function fgetcsv() maybe
applicable in this case?
Grtz,
--
Rik Wasmus

re: preg_match and delimited strings
Rik wrote:
> It's a bitch for sure. Isn't the wonderful function fgetcsv() maybe
> applicable in this case?
Rik,
I'd have to write that part of the SQL query out to a file and then
read it back in with fgetcsv(). However, I do need to thank you: when
I read through the fgetcsv() manual page and its user-contributed
notes, I found a regex and a function that accomplished what I
needed...
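function csv_string_to_array($str){
    // Split on commas that are followed by an even number of double quotes,
    // i.e. commas that are not inside a quoted field.
    $expr="/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/";
    $results=preg_split($expr,trim($str));
    // Strip the surrounding quotes from fields that start and end with one.
    return preg_replace("/^\"(.*)\"$/","$1",$results);
}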
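For strings like the 'don''t touch that!' example from the first post, here is a minimal sketch of the same lookahead idea with ' as the encapsulator. The name split_single_quoted(), the per-field trim() and the collapsing of '' into ' are assumptions added for illustration; they are not part of the function from the manual notes.

```php
<?php
// Hypothetical variant of csv_string_to_array() for single-quoted fields.
function split_single_quoted($str)
{
    // Split on commas that are followed by an even number of single quotes,
    // i.e. commas that sit outside a quoted field.
    $expr = "/,(?=(?:[^']*'[^']*')*(?![^']*'))/";
    $fields = array();
    foreach (preg_split($expr, $str) as $field) {
        // Assumption: whitespace around a field is not significant.
        $field = trim($field);
        if (preg_match("/^'(.*)'$/s", $field, $m)) {
            // Assumption: a doubled '' inside a quoted field means one '.
            $field = str_replace("''", "'", $m[1]);
        }
        $fields[] = $field;
    }
    return $fields;
}

print_r(split_single_quoted("'don''t touch that!', foo, 'a, b'"));
// Fields: don't touch that! / foo / a, b
```

Because the pattern only counts quotes, a string with an unbalanced ' will still be split in the wrong places.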
I just had to replace the \" with ' to get it to parse my strings. I
tested it with both '' and , in the encapsulated string and it works!

re: preg_match and delimited strings
siromega@gmail.com wrote:
> I found a regex and a function that accomplished what I
> needed...
>
> function csv_string_to_array($str){
> $expr="/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/";
> $results=preg_split($expr,trim($str));
> return preg_replace("/^\"(.*)\"$/","$1",$results);
> }
> I just had to replace the \" with ' to get it to parse my strings. I
> tested it with both '' and , in the encapsulated string and it works!
I hate to be the bearer of bad tidings, but it will fail on a single
escaped \":
$str = '"baz,\"bax", bay, foz, "a, b", foo';
$expr="/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/";
$results=preg_split($expr,trim($str));
$res = preg_replace("/^\"(.*)\"$/","$1",$results);
print_r($res);
Array
(
[0] => "baz
[1] => \"bax"
[2] => bay
[3] => foz
[4] => "a, b"
[5] => foo
)
Properly nested, it can handle it:
$str = '"baz,\"bax\"", bay, foz, "a, b", foo';
Array
(
[0] => baz,\"bax\"
[1] => bay
[2] => foz
[3] => "a, b"
[4] => foo
)
Also a nice one is:
$str = 'bay, foz, "a, b", "fo\"o"';
Array
(
[0] => bay, foz, "a
[1] => b", "fo\"o"
)
Not a very robust one, then.
But maybe it's OK for your data; it all depends on how much you know
for sure about it.
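If backslash-escaped quotes can show up in the data, the Perl FAQ pattern from the first post may be a better starting point. Below is a rough sketch of how it might be carried over to PHP. It uses preg_match_all() rather than preg_match(), since preg_match() stops at the first match, and it adds the delimiters and the x modifier that the Perl version gets from m{...}gx. The function name split_quoted_list() and the extra \s* (to tolerate a space after each comma, as in the sample strings) are my assumptions and are not in the FAQ pattern.

```php
<?php
// Hypothetical port of the Perl FAQ loop: push(@new, $+) while $text =~ m{...}gx
function split_quoted_list($text)
{
    // Nowdoc keeps the backslashes exactly as PCRE should see them.
    $re = <<<'RE'
/
    \s*"([^"\\]*(?:\\.[^"\\]*)*)"\s*,?   # quoted field, \" allowed inside
  | \s*([^,]+),?                         # unquoted field
  | ,                                    # empty field
/x
RE;
    preg_match_all($re, $text, $matches, PREG_SET_ORDER);

    $fields = array();
    foreach ($matches as $m) {
        // Perl's $+ is the last group that matched. PHP drops trailing
        // unmatched groups, so the last element plays that role here;
        // a lone comma matches no group and stands for an empty field.
        $fields[] = count($m) > 1 ? end($m) : '';
    }
    return $fields;
}

print_r(split_quoted_list('bay, foz, "a, b", "fo\"o"'));
// Fields: bay / foz / a, b / fo\"o
```

The surrounding quotes are dropped and the backslash before an escaped quote is kept as-is, so a stripslashes()-style pass may still be needed afterwards, depending on the data.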
Grtz,
--
Rik Wasmus

re: preg_match and delimited strings
Rik wrote:
> Not a very robust one here.
> But maybe it's OK for your data, it all depends how much you know for sure
> about that.
Indeed, that is correct. I'm going to have to check my data and see
if it'll work. Do you happen to know if fgetcsv() properly handles the
single "?
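One way to find out without writing anything to disk might be to run fgetcsv() over an in-memory stream and look at what comes back for the strings from this thread. This is only a sketch: the helper name csv_line_via_fgetcsv() is made up, and no particular result is claimed for the escaped-quote samples or for fields with a space before the opening quote, since that is exactly what the test is meant to show on your PHP version. On PHP 5.3 and later, str_getcsv() takes the string directly.

```php
<?php
// Hypothetical helper: parse one CSV line held in a string by reading it
// back through fgetcsv() from php://memory instead of a file on disk.
function csv_line_via_fgetcsv($line)
{
    $handle = fopen('php://memory', 'r+');
    fwrite($handle, $line);
    rewind($handle);
    // Separator, enclosure and escape character are passed explicitly so the
    // call does not depend on default values.
    $fields = fgetcsv($handle, 0, ',', '"', '\\');
    fclose($handle);
    return $fields;
}

// The first sample is plain; the other two are the strings from the thread.
$samples = array(
    'foo,bar,"foo, bar",bar',
    '"baz,\"bax", bay, foz, "a, b", foo',
    'bay, foz, "a, b", "fo\"o"',
);
foreach ($samples as $sample) {
    print_r(csv_line_via_fgetcsv($sample));
}
```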