Complex regular expression?

Hi All,

I couldn't find a regular expressions group to ask this in, so I
thought I'd ask here, as I'm a little familiar with PHP's regular
expression syntax.

I have a comma-delimited text file that I need to convert to
tab-delimited.

My problem is that commas appear in the values of one of my columns.
I'm trying to think of a graceful way of changing the other commas
(i.e. those that delimit fields, rather than those that appear within
the value of a field) to tabs without affecting the commas that appear
in the column in question.

An example of the contents of the file would be:

1,"1","20040301","08-08","BOOK, RETAIL",20.00,23.56
2,"1","20040301","03-09","BOOK, WHOLESALE, DISTRIBUTOR",15.99,22.00

So, I'm trying to create a regular expression that will change all the
commas to tabs, except where the comma(s) appear within quotes.

I've tried several different approaches, including a three-step
process: change the commas that appear within quotes to a known
'escape' value, change all the remaining commas to tabs, then change
the 'escape' values back to commas. But I can't seem to create a
regular expression that takes into account the possibility of several
commas appearing between quotes.

I'm wondering if anyone can help me understand this better?

Many thanks in advance,

Murray
Jul 17 '05 #1
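A not so elegant way:

// Callback: $matches[1] is the text outside quotes, $matches[2] the quoted field (if any).
function to_tab($matches) {
    // Turn commas into tabs only in the unquoted part.
    return strtr($matches[1], ",", "\t") . $matches[2];
}

// $s holds the comma-delimited input.
$r = preg_replace_callback('/([^"]*)("?[^"]*"?)/', 'to_tab', $s);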
Jul 17 '05 #2
Another way to solve this problem is to replace all commas which are
NOT followed by a space, if you can be sure that commas inside quotes
always have a space after them:

$new_string = preg_replace('/,(\S)/', "\t$1", $string);
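
One caveat with a pattern that consumes the character after the comma: in a row with an empty field, such as `1,,2`, the first match swallows the second comma, so that comma is never replaced. A possible variant, sketched here with a negative lookahead, checks the next character without consuming it. The function name `delimiter_commas_to_tabs` is made up for illustration, and the sketch still relies on the same assumption that every comma inside quotes is followed by a space.

```php
<?php
// Hypothetical helper (not from the thread): replace every comma that is
// NOT followed by a space. The lookahead (?! ) inspects the next character
// without consuming it, so consecutive commas are all replaced.
function delimiter_commas_to_tabs(string $line): string
{
    return preg_replace('/,(?! )/', "\t", $line);
}

// Example: an empty second field, plus a quoted value containing ", ".
echo delimiter_commas_to_tabs('1,,"BOOK, RETAIL",20.00'), "\n";
```

A comma at the very end of a line is also replaced by this variant, since nothing follows it.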

*Hannes*

Jul 17 '05 #3
M Wells wrote:
I have a comma delimited text file that I need to change to being tab
delimited.

My problem is that commas appear in the values of one of my columns,
and I'm trying to think of a graceful way of changing the other commas
(ie those that do indicate the delimitation of a field, rather than
which appear within the value of a field) in the file to tabs without
affecting the commas that appear in the column in question.


http://www.php.net/manual/en/function.fgetcsv.php

If you have commas inside the quoted fields this function takes care of it
for you. You can specify what sort of delimiter as well (eg tab, comma etc)
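
As a rough sketch of how that could look for this conversion: read each row with fgetcsv() and write the fields back out joined by tabs. The helper name `csv_to_tsv` and the use of stream handles for both input and output are assumptions for illustration. The delimiter, enclosure and escape arguments are passed explicitly rather than relying on defaults.

```php
<?php
// Hypothetical helper (not from the thread): copy comma-delimited rows
// from the stream $in to the stream $out as tab-delimited rows.
// Returns the number of rows written.
function csv_to_tsv($in, $out): int
{
    $rows = 0;
    // Arguments: handle, max line length (0 = unlimited), delimiter,
    // enclosure, escape character.
    while (($fields = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        if ($fields === [null]) {
            continue; // fgetcsv() reports a blank line as array(null)
        }
        // fgetcsv() has already removed the surrounding quotes.
        fwrite($out, implode("\t", $fields) . "\n");
        $rows++;
    }
    return $rows;
}

// Usage with in-memory streams; real code would fopen() the actual files.
$in = fopen('php://memory', 'r+');
fwrite($in, "1,\"1\",\"20040301\",\"08-08\",\"BOOK, RETAIL\",20.00,23.56\n");
rewind($in);
$out = fopen('php://memory', 'r+');
csv_to_tsv($in, $out);
rewind($out);
echo stream_get_contents($out);
```

Joining with implode() drops the quotes and assumes no field contains a tab or newline; if that cannot be guaranteed, fputcsv() with "\t" as the separator is probably the safer way to write the rows.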

Chris

--
Chris Hope
The Electric Toolbox Ltd
http://www.electrictoolbox.com/
Jul 17 '05 #4

Had a bit of a tinker, came up with this:

<?php
$x = '1,2,3,"some text in quotes",4,5,"some more, this time with a comma"';
preg_match_all('/(".*?")/', $x, $r);

$r[0] now looks like this:
Array
(
[0] => "some text in quotes"
[1] => "some more, this time with a comma"
)

As you can see, the non-greediness of the regexp is the key. Run
your original line through substr_replace() to get these strings replaced
with tokens and resume where you left off.
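
One way the token idea might be put together end to end is sketched below. It swaps in preg_replace_callback() and str_replace() rather than substr_replace(), so that each quoted run is handled in a single pass. The function name `commas_to_tabs` and the choice of a NUL byte as the placeholder are assumptions for illustration; the placeholder must not occur in the data.

```php
<?php
// Hypothetical helper (not from the thread) for the three-step approach:
// protect commas inside quotes, convert the rest, then restore.
function commas_to_tabs(string $line, string $token = "\x00"): string
{
    // Step 1: within each non-greedy "..." match, swap commas for the token.
    $protected = preg_replace_callback(
        '/".*?"/',
        function (array $m) use ($token) {
            return str_replace(',', $token, $m[0]);
        },
        $line
    );

    // Step 2: every comma still present is a field delimiter.
    $tabbed = str_replace(',', "\t", $protected);

    // Step 3: put the protected commas back.
    return str_replace($token, ',', $tabbed);
}

echo commas_to_tabs('2,"1","20040301","03-09","BOOK, WHOLESALE, DISTRIBUTOR",15.99,22.00'), "\n";
```

The quotes themselves are left in place, and the sketch assumes each quoted value sits on a single line.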

HTH
Garp

Jul 17 '05 #5
