Efficient Find and Replace

Murali

Given: L = list of integers. X and Y are integers.
Problem: find every occurence of X and replace with Y

Solution1:
def check(s):
if s==X:
return Y
else return s

newL = [ check(s) for s in L]

Now I dont want to create another list but just modify it in place.

SolutionA:

for x in range(len(L)):
if L[x] == X:
L[x:x] = Y

SolutionB:

p = L.index(X)
while p >= 0:
L[p:p] = Y
p = L.index(X)

Problem with both solutions is the efficiency. Both methods require
time O(N^2) in the worst case, where N is the length of the list.
Because L.index() and L[x:x] both take O(N) time in the worst case. But
clearly one should be able to do it in time O(N). Atleast there is a C
solution which does it in O(N) time.

p = head(L)
while (p) {
if (p->data == X) p->data = Y;
}

Is there a python equivalent of this? using iterators or something
which will allow me efficient serial access to the list elements.

- Murali

Jan 28 '06 #1

Subscribe Post Reply

1957

Fredrik Lundh

Murali wrote:

Now I dont want to create another list but just modify it in place.
Why does that matter? List copies are cheap.
SolutionA:

for x in range(len(L)):
if L[x] == X:
L[x:x] = Y
Did you run this code ?
SolutionB:

p = L.index(X)
while p >= 0:
L[p:p] = Y
p = L.index(X)
Did you run this code ?
Problem with both solutions is the efficiency. Both methods require
time O(N^2) in the worst case, where N is the length of the list.
Because L.index() and L[x:x] both take O(N) time in the worst case.

Assigning a single item to L[x:x] doesn't work.

Assigning M items to a slice of length M is an O(M) operation, so if you
do L[x:x+1] = [Y], you get your O(1) operation.

But that's just a silly way to write L[x] = Y, which I assume was what
you meant. L[x] = Y is also an O(1) operation, of course.

</F>

Jan 28 '06 #2

bearophileHUGS

>Because L.index() and L[x:x] both take O(N) time in the worst case.

Why do you think L[x:x] can be O(N)?

This looks O-linear enough to me:

from random import choice
L = [choice("ab") for i in xrange(10)]
L ['b', 'b', 'b', 'a', 'b', 'a', 'b', 'a', 'a', 'a'] for x in xrange(len(L)): .... if L[x] == "a": L[x] = "c" L

['b', 'b', 'b', 'c', 'b', 'c', 'b', 'c', 'c', 'c']

Bye,
bearophile

Jan 28 '06 #3

Murali

I did not actually run the code, so there may be syntax errors and so
forth. But how is L[x] = Y an O(1) operation. Given x finding L[x]
would require to traverse x nodes in the list. So finding L[x] requires
O(x) time. Once you find L[x] setting it to Y is O(1) I agree.

In Solution B: By L.index(X), I mean search for X and then replace it
with Y. Here every time the search starts from the beginning of the
list. Hence the inefficiency.

- Murali

Jan 28 '06 #4

bearophileHUGS

>But how is L[x] = Y an O(1) operation. Given x finding L[x] would require to traverse x nodes in the list.

Python list is a deceptive name, because they are 1D arrays of
pointers. Maybe they are called lists because array(...) is shorter
than list(...).

Bye,
bearophile

Jan 28 '06 #5

David Hirschfield

You aren't getting too many helpful responses. Hope this one helps:

The closest python equivalent to:

p = head(L)
while (p) {
if (p->data == X) p->data = Y;
}

would be:

for i,v in enumerate(L):
if v == X:
L[i] = Y

modifies the list in place.

There's nothing wrong with just doing your solution A, the amount of
time wasted by creating the new list isn't very significant.
-Dave

Murali wrote:

Given: L = list of integers. X and Y are integers.
Problem: find every occurence of X and replace with Y

Solution1:
def check(s):
if s==X:
return Y
else return s

newL = [ check(s) for s in L]

Now I dont want to create another list but just modify it in place.

SolutionA:

for x in range(len(L)):
if L[x] == X:
L[x:x] = Y

SolutionB:

p = L.index(X)
while p >= 0:
L[p:p] = Y
p = L.index(X)

Problem with both solutions is the efficiency. Both methods require
time O(N^2) in the worst case, where N is the length of the list.
Because L.index() and L[x:x] both take O(N) time in the worst case. But
clearly one should be able to do it in time O(N). Atleast there is a C
solution which does it in O(N) time.

p = head(L)
while (p) {
if (p->data == X) p->data = Y;
}

Is there a python equivalent of this? using iterators or something
which will allow me efficient serial access to the list elements.

- Murali

--
Presenting:
mediocre nebula.

Jan 28 '06 #6

Fredrik Lundh

Murali wrote:

I did not actually run the code, so there may be syntax errors and so
forth. But how is L[x] = Y an O(1) operation. Given x finding L[x]
would require to traverse x nodes in the list. So finding L[x] requires
O(x) time.

no, L[x] is an O(1) operation in both Python and C.

the list type uses an internal array, using over-allocation to make
L.append(x) amortized O(1).

</F>

Jan 28 '06 #7

Steven D'Aprano

On Fri, 27 Jan 2006 16:34:53 -0800, Murali wrote:

I did not actually run the code, so there may be syntax errors and so
forth. But how is L[x] = Y an O(1) operation. Given x finding L[x]
would require to traverse x nodes in the list. So finding L[x] requires
O(x) time. Once you find L[x] setting it to Y is O(1) I agree.

You are assuming that Python lists are linked lists. They are not. They
are arrays. Accessing the entry at position x doesn't require traversing
the list at all.

In Solution B: By L.index(X), I mean search for X and then replace it
with Y. Here every time the search starts from the beginning of the
list. Hence the inefficiency.

Yes, but the inefficient search code is done in C, which is so fast that
it really doesn't matter unless your list is HUGE.

--
Steven.

Jan 28 '06 #8

Steven D'Aprano

On Fri, 27 Jan 2006 16:49:24 -0800, David Hirschfield wrote:

You aren't getting too many helpful responses.

Surely pointing out the poster's incorrect assumptions is helpful?

If I said, "I want to add two integers together, but Python's + only does
string concatenation, so I wrote this function to add two ints using bit
manipulation. How do I make it go faster?" would it be better to optimize
the bit manipulation code, or to just tell me that + also does integer
addition?
--
Steven.

Jan 28 '06 #9

Raymond Hettinger

[David Hirschfield]

for i,v in enumerate(L):
if v == X:
L[i] = Y

Here's an alternate solution using a replacement dictionary:

M = {X:Y}
for i, v in enumerate(L):
L[i] = M.get(v, v)

The replacement dictionary directly supports generalization to multiple
substitution pairs without needing additional passes over the input.
Also, if you feel the need for speed, the method lookup can be bound
outside of the loop:

# Find/Replace multiple pairs in a single pass using a bound method
M = {X1:Y1, X2:Y2, X3:Y3}
Mget = M.get
for i, v in enumerate(L):
L[i] = Mget(v, v)
Raymond

Jan 28 '06 #10

Fredrik Lundh

Raymond Hettinger wrote:

for i,v in enumerate(L):
if v == X:
L[i] = Y

Here's an alternate solution using a replacement dictionary:

M = {X:Y}
for i, v in enumerate(L):
L[i] = M.get(v, v)

but that's 2-3 times slower than the OP's corrected code for his use
case, so I'm not sure it qualifies as more "efficient"...

</F>

Jan 28 '06 #11

Raymond Hettinger

> > > for i,v in enumerate(L):

if v == X:
L[i] = Y
Here's an alternate solution using a replacement dictionary:

M = {X:Y}
for i, v in enumerate(L):
L[i] = M.get(v, v)

[Fredrik Lundh] but that's 2-3 times slower than the OP's corrected code for his use
case, so I'm not sure it qualifies as more "efficient"...

The alternate wasn't presented for efficiency. Its virtue is that with
no additional effort, it generalizes to substituting multiple
find/replace pairs in a single pass.
Raymond

Jan 28 '06 #12

Scott David Daniels

Murali wrote:

Given: L = list of integers. X and Y are integers.
Problem: find every occurrence of X and replace with Y Problem with both solutions is the efficiency.

As everyone else says, you are hallucinating efficiency problems
probably brought on by an overdose of Lisp or ML. Here is another
way to get to what you want that will go "even faster" than the
not-a-problem speed you have from simple compare and replace.

lst = range(50) * 10
try:
position = lst.index(X)
while True:
lst[position] = Y
position = lst.index(X, position + 1)
except ValueError:
pass # finally could not find X

I mention this only because people always seem to forget than index
allows you to specify a start (and/or stop) position w/in the list.

--Scott David Daniels
sc***********@acm.org

Jan 28 '06 #13

Murali

Thanks for the replies. I always thought that Python lists were
actually lists under the hood. If they are implemented as arrays of
pointers things should be a lot more efficient. In particular what I
thought was a Linear-time operation is actually an O(1) operation.

Since python allows you to replace single items with lists e.g.
L[x:x+1]= ["a","b","c"], It has to be a little more clever. But with
good data structure design I beleive that this overhead can be
amortized to O(1).

The optional argument to lst.index also makes that an linear time code.
Thanks for all the help.

- Murali

PS: Slowly python is becoming my more favourite language than even C
(except in cases you just cannot use anything but C, e.g. writing a
boot loader)

Jan 28 '06 #14

by: Will | last post by:

I have the following code: var rx = /{{(.+?)}}/i; var expr = 'each {{word}} wrapped in {{curly}} {{braces}} in this {{string}} needs to be {{replaced}} with a different {{value}}.'; var values...

Javascript

Suggest tools/methods for efficient C++ development

by: Sachin Garg | last post by:

Hi, Can you please suggest some tools (except an IDE like Visual Studio/Borland C Builder etc...) which can help me refactor/reengineer/redesign my programs/classes easily and efficiently? ...

C / C++

What is the fastest/efficient way to replace the root element?

by: Jiho Han | last post by:

Suppose I have an xml fragment like: <mother> <child name="Bob" sex="M"/> <child name="Jane" sex="F"/> ... </mother> If I wanted to replace the <mother> element to <father> element, what is...

.NET Framework

Efficient site maintenance with VS.NET

by: Beren | last post by:

Hello, Can anyone give some tips to efficiently update a remote project ? I prefer to keep my projects locally, compile as release and then copy everything it to the remote server. What is...

ASP.NET

This code is efficient

by: fran | last post by:

Hello, Is this code efficient? public static string HTML_FASE1_OTROS_GENERAL = " <table id='tFase1' cellspacing='0' cellpadding='0' width='800' >" + " <tr>" + " <td width='15'></td>" +...

C# / C Sharp

Efficient URL-decoding.

by: Peter Jansson | last post by:

Hello group, The following code is an attempt to perform URL-decoding of URL-encoded string. Note that std::istringstream is used within the switch, within the loop. Three main issues have been...

C / C++

converting strings to most their efficient types '1' --> 1, 'A' ---> 'A', '1.2'---> 1.2

by: py_genetic | last post by:

Hello, I'm importing large text files of data using csv. I would like to add some more auto sensing abilities. I'm considing sampling the data file and doing some fuzzy logic scoring on the...

Python

How can I make this more efficient? (combining DataSet results with the results of a DB lookup.)

by: Ken Fine | last post by:

This is a question that someone familiar with ASP.NET and ADO.NET DataSets and DataTables should be able to answer fairly easily. The basic question is how I can efficiently match data from one...

ASP.NET

fwrite output is not efficient (fast) ?

by: Abubakar | last post by:

Hi, recently some C programmer told me that using fwrite/fopen functions are not efficient because the output that they do to the file is actually buffered and gets late in writing. Is that...

C / C++

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

Career Advice

Efficient Find and Replace

Similar topics