
Faster search algorithm for lists

I have a file containing 10,000 words, which I read in as strings and store in a list. I have another list of strings, and I need to delete the words that appear in both lists. When I run the program it takes a long time. Please let me know if there is a faster way to do it.

For now, I am using:

for item in list_1[:]:        # iterate over a copy so removal is safe
    if item in list_2:        # linear scan of list_2 for every item
        list_1.remove(item)   # do the deletion
Jun 15 '18 #1
4 Replies


husoski
It's much faster to search a set than a list. Since that list doesn't seem to change during the program run, simply use
common_words = set(list_2)       # build the set once, after reading list_2 from the file

for word in list_1[:]:           # iterate over a copy so removal is safe
    if word in common_words:     # set membership is an O(1) hash lookup on average
        list_1.remove(word)      # do the deletion
If you are just making a reduced list, then you can do that without any explicit looping:

uncommon = [w for w in list_1 if w not in common_words]
That's a "list comprehension". It builds a list of every element of list_1 that is not an element of the common_words set. (Note that it filters list_1; filtering list_2 against a set built from list_2 itself would always produce an empty list.)
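
For a rough sense of the difference, here is a small, self-contained timing sketch; the vocabulary is randomly generated purely for illustration:

import random
import string
import timeit

# build an illustrative 10,000-word vocabulary
words = ["".join(random.choices(string.ascii_lowercase, k=8))
         for _ in range(10000)]
as_list = list(words)
as_set = set(words)

probe = words[-1]   # a word near the end, close to the list's worst case

print("list:", timeit.timeit(lambda: probe in as_list, number=1000))
print("set: ", timeit.timeit(lambda: probe in as_set, number=1000))

On a typical machine the set lookup comes out orders of magnitude faster, and the gap grows with the size of the list.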
Jun 15 '18 #2

Thanks a lot. I'll definitely try it out.
Jun 18 '18 #3

I would use the big list to build sub-lists, one sub-list for each starting letter, reducing the average search time by a factor of roughly 26. I don't know whether Python's built-in search does something like that or just loops through every element.
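
A minimal sketch of that idea, assuming non-empty words (the function names here are made up for illustration):

from collections import defaultdict

def build_buckets(words):
    # group words by first letter so a search only scans one bucket
    buckets = defaultdict(list)
    for w in words:
        buckets[w[0].lower()].append(w)
    return buckets

def bucket_contains(buckets, word):
    # scan only the bucket for this word's first letter
    return word in buckets.get(word[0].lower(), [])

Each lookup still does a linear scan within its bucket, so this is a constant-factor improvement at best; the set suggested above replaces the scan entirely.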
Jun 21 '18 #4

husoski
Searching a set is much faster than searching lists, even with the sublists you describe. A set lookup is a hash-table search, and even if the hashing algorithm is a horrible match for the actual key distribution (say nearly all of the words happen to hash to the same number, modulo the table size), that worst case degrades to a sequential search, which is what you would be doing anyway. Heads, you win every time; tails thirty times in a row, you lose a little.
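
That worst case is easy to force artificially. A small sketch, using a made-up str subclass whose instances all hash to the same value:

import timeit

class BadHash(str):
    # every instance lands in the same hash bucket: the set's worst case
    def __hash__(self):
        return 0

words = ["word%d" % i for i in range(2000)]
good_set = set(words)
bad_set = set(BadHash(w) for w in words)

probe_good = "word1999"
probe_bad = BadHash("word1999")

print("normal hash:", timeit.timeit(lambda: probe_good in good_set, number=1000))
print("worst case: ", timeit.timeit(lambda: probe_bad in bad_set, number=1000))

The degenerate set behaves like the sequential list search described above, while the normal one stays fast regardless of size.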
Jul 1 '18 #5
