Bytes IT Community

Replace string except inside quotes?

The code

for text in open("file.txt","r"):
    print text.replace("foo","bar")[:-1]

replaces 'foo' with 'bar' in a file, but how do I avoid changing text
inside single or double quotes? For making changes to Python code, I
would also like to avoid changing text in comments, either the '#' or
'""" ... """' kind.
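Concretely, the pitfall is that a naive replace touches quoted text too (a quick illustration):

```python
# Naive replacement changes every occurrence, including the one
# inside the string literal:
line = 'foo = "a foo inside quotes"'
print(line.replace("foo", "bar"))  # -> bar = "a bar inside quotes"
```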
Jul 18 '05 #1
4 Replies


In article <30**************************@posting.google.com>,
be*******@aol.com wrote:
The code

for text in open("file.txt","r"):
    print text.replace("foo","bar")[:-1]

replaces 'foo' with 'bar' in a file, but how do I avoid changing text
inside single or double quotes? For making changes to Python code, I
would also like to avoid changing text in comments, either the '#' or
'""" ... """' kind.


The first part of what you describe isn't too bad; here's some code that
seems to do what you want:

import re

def replace_unquoted(text, src, dst, quote='"'):
    r = re.compile(r'%s([^\\%s]|\\[\\%s])*%s' %
                   (quote, quote, quote, quote))

    out = ''
    last_pos = 0
    for m in r.finditer(text):
        out += text[last_pos:m.start()].replace(src, dst)
        out += m.group()
        last_pos = m.end()

    return out + text[last_pos:].replace(src, dst)

Example usage:
print replace_unquoted(file('foo.txt', 'r').read(),
                       "foo", "bar")

It's not the most elegant solution in the world. This code does NOT
deal with the problem of commented text. I think it will handle triple
quotes, though I haven't tested it on that case.

At any rate, I hope it may help you get started.
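For modern Python, the same gap-based idea can be sketched as follows, handling both quote styles in one pass (the name `replace_unquoted3` is just illustrative, not from the original post):

```python
import re

# Matches a double- or single-quoted region, honoring backslash escapes.
_QUOTED = re.compile(r'''("(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')''')

def replace_unquoted3(text, src, dst):
    """Replace src with dst everywhere except inside quoted regions."""
    out, last = [], 0
    for m in _QUOTED.finditer(text):
        out.append(text[last:m.start()].replace(src, dst))  # unquoted gap
        out.append(m.group())                               # quoted region, untouched
        last = m.end()
    out.append(text[last:].replace(src, dst))               # trailing gap
    return ''.join(out)

print(replace_unquoted3('foo = "foo bar"', 'foo', 'baz'))  # -> baz = "foo bar"
```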

Cheers,
-M

--
Michael J. Fromberger | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/ | Dartmouth College, Hanover, NH, USA
Jul 18 '05 #2

Michael J. Fromberger wrote:
It's not the most elegant solution in the world. This code does NOT
deal with the problem of commented text. I think it will handle triple
quotes, though I haven't tested it on that case.


I believe that it will probably work for triple quotes that begin and
end on the same line. Of course, the primary usage of triple-quotes is
for multiline strings, but given that the file is being examined one
line at a time, you'd need some method of maintaining state in order to
handle multiline strings properly. (Note that this problem is true
regardless of whether the strings are true triple-quoted multiline
strings, or single-quoted single-line strings broken across two lines of
source code using '\'.)

If the entire file is read in and processed as a single chunk, instead
of line-by-line, then *some* of the problems go away (at the cost of
potentially very large memory consumption and poor performance, if the
file is large). The fact that triple-quoted strings work out (mostly)
correctly when viewed as three pairs of quotes will help. But if a
triple-quoted string *contains* a normally quoted string (e.g., """My
"foo" object"""), then things break down again.

In order to handle this sort of nested structure with anything
resembling true reliability, it's necessary to step up to a true
lexing/parsing procedure, instead of mere string matching and regular
expressions.

Jeff Shannon
Technician/Programmer
Credit International

Jul 18 '05 #3

<be*******@aol.com> wrote:
The code

for text in open("file.txt","r"):
    print text.replace("foo","bar")[:-1]

replaces 'foo' with 'bar' in a file, but how do I avoid changing text
inside single or double quotes? For making changes to Python code, I
would also like to avoid changing text in comments, either the '#' or
'""" ... """' kind.


The source for the tokenize module covers all these bases.
Raymond Hettinger
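
A minimal Python 3 sketch of this suggestion (the function name is illustrative): walk the token stream, swap only NAME tokens, and rebuild the source. Note that a same-length replacement keeps `untokenize`'s column bookkeeping exact; longer names would need the two-tuple compatibility mode instead.

```python
import io
import token
import tokenize

def replace_names(source, old, new):
    """Replace identifier `old` with `new`, skipping strings and comments."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == token.NAME and tok.string == old:
            tok = tok._replace(string=new)  # only real identifiers change
        result.append(tok)
    return tokenize.untokenize(result)

src = 'foo = "foo"  # foo stays here\n'
print(replace_names(src, 'foo', 'bar'))  # -> bar = "foo"  # foo stays here
```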
Jul 18 '05 #4

"Raymond Hettinger" <vz******@verizon.net> wrote in message
The source for the tokenize module covers all these bases.


# tokenize text replace

import keyword, os, sys, traceback
import string, cStringIO
import token, tokenize

######################################################################

class Parser:
    """python source code tokenizing text replacer
    """
    def __init__(self, raw, out=sys.stdout):
        ''' Store the source text & set some flags.
        '''
        self.raw = string.strip(string.expandtabs(raw))
        self.out = out

    def format(self, search='', replace='',
               replacetokentype=token.NAME):
        ''' Parse and send text.
        '''
        # Store line offsets in self.lines
        self.lines = [0, 0]
        pos = 0
        self.temp = cStringIO.StringIO()
        self.searchtext = search
        self.replacetext = replace
        self.replacetokentype = replacetokentype

        # Gather lines
        while 1:
            pos = string.find(self.raw, '\n', pos) + 1
            if not pos: break
            self.lines.append(pos)
        self.lines.append(len(self.raw))

        # Wrap text in a filelike object
        self.pos = 0
        text = cStringIO.StringIO(self.raw)

        # Parse the source.
        ## Tokenize calls the __call__
        ## function for each token till done.
        try:
            tokenize.tokenize(text.readline, self)
        except tokenize.TokenError, ex:
            traceback.print_exc()

    def __call__(self, toktype, toktext,
                 (srow, scol), (erow, ecol), line):
        ''' Token handler.
        '''
        # calculate new positions
        oldpos = self.pos
        newpos = self.lines[srow] + scol
        self.pos = newpos + len(toktext)

        # handle newlines
        if toktype in [token.NEWLINE, tokenize.NL]:
            self.out.write('\n')
            return

        # send the original whitespace, if needed
        if newpos > oldpos:
            self.out.write(self.raw[oldpos:newpos])

        # skip indenting tokens
        if toktype in [token.INDENT, token.DEDENT]:
            self.pos = newpos
            return

        # search for matches to our searchtext
        # customize this for your exact needs
        if (toktype == self.replacetokentype and
                toktext == self.searchtext):
            toktext = self.replacetext

        # write it out
        self.out.write(toktext)
        return

######################################################################
# just an example
def Main():
    import sys
    if sys.argv[0]:
        filein = open(sys.argv[0]).read()
        Parser(filein, out=sys.stdout).format('tokenize', 'MyNewName')

######################################################################

if __name__ == '__main__':
    Main()

# end of code
This is an example of how to use tokenize to replace names that match a
search string. If you want to replace only strings, not names, change
replacetokentype to token.STRING instead of token.NAME, etc.
HTH,
M.E.Farmer
Jul 18 '05 #5

This discussion thread is closed. Replies have been disabled for this discussion.