Complex regular expression?

Hi All,

I couldn't find a regular expressions group to ask this in, so I
thought I'd ask here, as I'm a little familiar with PHP's regular
expression syntax.

I have a comma-delimited text file that I need to convert to
tab-delimited.

My problem is that commas also appear in the values of one of my
columns. I'm trying to think of a graceful way of changing the other
commas (i.e. those that delimit fields, rather than those that appear
within the value of a field) to tabs without affecting the commas
inside that column.

An example of the contents of the file would be:

1,"1","20040301","08-08","BOOK, RETAIL",20.00,23.56
2,"1","20040301","03-09","BOOK, WHOLESALE, DISTRIBUTOR",15.99,22.00

So, I'm trying to create a regular expression that will change all the
commas to tabs, except where the comma(s) appear within quotes.

I've tried several different approaches, including a three-step
process: change the commas that appear within quotes to a known
'escape' value, change all the remaining commas to tabs, then change
the 'escape' values back to commas. But I can't seem to create a
regular expression that takes into account the possibility of several
commas appearing between quotes.

I'm wondering if anyone can help me understand this better?

Many thanks in advance,

Murray
Jul 17 '05 #1
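A not-so-elegant way:

// Convert the commas in the unquoted chunk ($matches[1]) to tabs;
// the quoted chunk ($matches[2]) is passed through untouched.
function to_tab($matches) {
    return strtr($matches[1], ",", "\t") . $matches[2];
}

$r = preg_replace_callback('/([^"]*)("?[^"]*"?)/', 'to_tab', $s);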
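
To see how the pattern splits a line, here is a minimal sketch that applies the same regular expression to one of the sample rows. The wrapper name csv_line_to_tsv_callback is made up for illustration, and the sketch assumes quoted fields never contain an embedded double quote.

```php
<?php
// Hypothetical wrapper around the pattern from the post above.
function csv_line_to_tsv_callback(string $line): string
{
    // Group 1: text up to the next double quote (unquoted, so its commas are delimiters).
    // Group 2: an optional quoted field, copied through unchanged.
    return preg_replace_callback(
        '/([^"]*)("?[^"]*"?)/',
        function (array $m): string {
            return strtr($m[1], ',', "\t") . $m[2];
        },
        $line
    );
}

$row = '1,"1","20040301","08-08","BOOK, RETAIL",20.00,23.56';
// The comma inside "BOOK, RETAIL" survives; all other commas become tabs.
echo csv_line_to_tsv_callback($row), "\n";
```

Each match pairs one unquoted chunk with the quoted field that follows it, so only the unquoted chunk is ever passed to strtr().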
Jul 17 '05 #2
Another way to solve this problem is to replace only those commas that
are NOT followed by whitespace. This works only if you can be sure that
commas inside quotes are always followed by a space.

$new_string = preg_replace('/,(\S)/', "\t$1", $string);
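
A small sketch of the same idea written with a lookahead, so the character after the comma does not need to be captured and re-emitted. The function name and the second input line are invented here to show where the assumption breaks down.

```php
<?php
// Replace a comma with a tab only when the next character is not whitespace.
function commas_before_nonspace_to_tabs(string $line): string
{
    return preg_replace('/,(?=\S)/', "\t", $line);
}

// Commas inside the quotes are followed by a space, so they are left alone.
$ok = '2,"1","BOOK, WHOLESALE, DISTRIBUTOR",15.99';
echo commas_before_nonspace_to_tabs($ok), "\n";

// Assumed counter-example: no space after the quoted comma,
// so that comma is (wrongly) converted to a tab as well.
$bad = '3,"BOOK,RETAIL",20.00';
echo commas_before_nonspace_to_tabs($bad), "\n";
```

Note that a comma at the very end of a line (an empty last field) is not followed by a non-whitespace character either, so it would stay a comma.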

*Hannes*

Jul 17 '05 #3
http://www.php.net/manual/en/function.fgetcsv.php

fgetcsv() handles commas inside quoted fields for you. You can also
specify which delimiter to use (e.g. tab or comma).
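
As a rough sketch of how that might look, the function below reads CSV text with fgetcsv() and joins each row's fields with tabs. The function name is hypothetical, and an in-memory stream (php://memory) stands in for the real file, which you would open with fopen() on its path instead. The parser strips the enclosing double quotes, so the output fields are unquoted.

```php
<?php
// Convert comma-delimited text to tab-delimited text using fgetcsv().
function csv_text_to_tsv(string $csv): string
{
    // Stand-in for fopen('yourfile.csv', 'r'): load the text into a memory stream.
    $in = fopen('php://memory', 'r+');
    fwrite($in, $csv);
    rewind($in);

    $rows = [];
    // Separator, enclosure and escape are passed explicitly (the usual defaults).
    while (($fields = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        $rows[] = implode("\t", $fields);
    }
    fclose($in);

    return implode("\n", $rows);
}

$csv = '1,"1","20040301","08-08","BOOK, RETAIL",20.00,23.56' . "\n"
     . '2,"1","20040301","03-09","BOOK, WHOLESALE, DISTRIBUTOR",15.99,22.00';
echo csv_text_to_tsv($csv), "\n";
```

If a field could itself contain a tab, joining with implode() is no longer safe and the fields would need to be written back with a CSV-aware writer instead.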

Chris

--
Chris Hope
The Electric Toolbox Ltd
http://www.electrictoolbox.com/
Jul 17 '05 #4

Had a bit of a tinker and came up with this:

<?php
$x='1,2,3,"some text in quotes",4,5,"some more, this time with a comma"';
preg_match_all('/(".*?")/',$x,$r);
?>

$r[0] now looks like this:
Array
(
    [0] => "some text in quotes"
    [1] => "some more, this time with a comma"
)

As you can see, the non-greediness of the regexp is the key. Run your
original line through str_replace() to get these strings replaced
with tokens and resume where you left off.
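
Put together, the three-step token approach could be sketched as follows. This is only one possible reading of the suggestion: it swaps the quoted strings for tokens inside preg_replace_callback() rather than in a separate pass, the function name is invented, and it assumes the data never contains a NUL byte (used here to build the tokens).

```php
<?php
// Three steps: mask quoted strings, convert the remaining commas, unmask.
function csv_line_to_tsv_tokens(string $line): string
{
    $quoted = [];

    // Step 1: swap each quoted string for a unique placeholder token.
    $masked = preg_replace_callback(
        '/".*?"/',
        function (array $m) use (&$quoted): string {
            $token = "\x00" . count($quoted) . "\x00";
            $quoted[$token] = $m[0];
            return $token;
        },
        $line
    );

    // Step 2: every comma that is left is a field delimiter.
    $masked = str_replace(',', "\t", $masked);

    // Step 3: put the original quoted strings back.
    return strtr($masked, $quoted);
}

$x = '1,2,3,"some text in quotes",4,5,"some more, this time with a comma"';
echo csv_line_to_tsv_tokens($x), "\n";
```

Because the non-greedy pattern matches each quoted string as a whole, any number of commas between a pair of quotes is hidden by a single token.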

HTH
Garp

Jul 17 '05 #5
