
A question about searching with multiple strings


Hi there.

I have defined a class called Item with several (about 30 I think)
different attributes (is that the right word in this context?). An
abbreviated example of the code for this is:

class Item(object):

    def __init__(self, height, length, function):
        params = locals()
        del params['self']
        self.__dict__.update(params)

    def __repr__(self):
        all_items = self.__dict__.items()
        return '%s,%s,%s' % (self.height, self.length, self.function)

I have a csv file that I use to store and retrieve all the info about
each Item, one item per line.

I have written a little piece of Python that lets me search through all
Items (after reading them into a variable called all_items) and will
return matching results:

for item in all_items:

    strItem = str(item)

    m = re.search(p[i], strItem, flags = re.I)
    if m:
        height = getattr(item, "height")
        length = getattr(item, "length")
        function = getattr(item, "function")
        print "height is %s, length is %s and function is %s" % \
            (height, length, function)

This has the limitation of only working over a single search item. I
want to be able to search over an uncontrollable number of search
strings because I will have people wanting to search over 2, 3 or even
(maybe) as many as 5 different things.

I was thinking that I would try to write a function that created a
sublist of Items if it matched and then run subsequent searches over
the subsequent search strings using this sublist.

I am not entirely sure how to store this subset of Items in such a way
that I can make searches over it. I guess I have to initialize a
variable of type Item, which I can use to add matching Items to, but
I have no idea how to do that. (If it was just a list I could say
"sublist = []"; what do I use for self-defined classes?) I am also
unsure how to go about creating a function that will accept any number
of parameters.

Any assistance with these two questions will be greatly appreciated!

Thanks!

googleboy

Oct 21 '05 #1
"googleboy" <my******@yahoo.com> writes:
> for item in all_items:
>
>     strItem = str(item)
>
>     m = re.search(p[i], strItem, flags = re.I)
>     if m:
>         height = getattr(item, "height")
>         length = getattr(item, "length")
>         function = getattr(item, "function")
>         print "height is %s, length is %s and function is %s" % \
>             (height, length, function)
>
> This has the limitation of only working over a single search item. I
> want to be able to search over an uncontrollable number of search
> strings because I will have people wanting to search over 2, 3 or even
> (maybe) as many as 5 different things.
>
> I was thinking that I would try to write a function that created a
> sublist of Items if it matched and then run subsequent searches over
> the subsequent search strings using this sublist.
>
> I am not entirely sure how to store this subset of Items in such a way
> that I can make searches over it. I guess I have to initialize a
> variable of type Item, which I can use to add matching Items to, but
> I have no idea how to do that. (If it was just a list I could say
> "sublist = []"; what do I use for self-defined classes?) I am also
> unsure how to go about creating a function that will accept any number
> of parameters.
>
> Any assistance with these two questions will be greatly appreciated!


Don't use a real list, use an iterator. In particular,
itertools.ifilter takes an arbitrary sequence and returns an
iterator over the items for which a function returns true.

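import re
from itertools import ifilter

# x is the candidate item; p[i] is still the poster's search pattern
for item in ifilter(lambda x: re.search(p[i], str(x), flags=re.I),
                    all_items):
    print "height is %s, length is %s and function is %s" % \
        (item.height, item.length, item.function)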

The trick is that ifilter returns an iterator, so you can nest the calls:

for item in ifilter(filter1, ifilter(filter2, ifilter(filter3, all_items))):
    ...

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Oct 21 '05 #2
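
The nesting idea above extends to any number of search strings if the filters are stacked in a loop. What follows is only a minimal sketch, assuming Python 3, where the built-in filter is already lazy and itertools.ifilter no longer exists. The Item class, the matching_items helper and the sample data are hypothetical stand-ins, not the poster's real code.

```python
import re


class Item:
    """Hypothetical stand-in for the poster's Item class."""

    def __init__(self, height, length, function):
        self.height = height
        self.length = length
        self.function = function

    def __repr__(self):
        return "%s,%s,%s" % (self.height, self.length, self.function)


def matching_items(items, patterns):
    """Lazily yield the items whose string form matches every pattern."""
    result = iter(items)
    for pattern in patterns:
        regex = re.compile(pattern, re.I)
        # Bind regex as a default argument so each filter keeps its own pattern.
        result = filter(lambda item, rx=regex: rx.search(str(item)), result)
    return result


# Made-up sample data for illustration only.
all_items = [
    Item(10, 20, "bracket"),
    Item(10, 35, "hinge"),
    Item(12, 20, "Bracket"),
]

matches = list(matching_items(all_items, ["bracket", "10"]))
for item in matches:
    print("height is %s, length is %s and function is %s"
          % (item.height, item.length, item.function))
```

Each pass wraps the previous iterator in another filter, so no intermediate sublist is built; an item is only produced once it has passed every pattern.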
On Fri, 21 Oct 2005 13:39:17 -0700, googleboy wrote:

> Hi there.
>
> I have defined a class called Item with several (about 30 I think)
> different attributes (is that the right word in this context?). An
> abbreviated example of the code for this is:
>
> class Item(object):
>
>     def __init__(self, height, length, function):
>         params = locals()
>         del params['self']
>         self.__dict__.update(params)

I get very worried when I see code like that. It makes me stop and think
about what it does, and why you would want to do it. I worry about hidden
side effects. Instead of just grokking the code instantly, I've got to stop
and think. You're taking a copy of the locals, deleting self from it, and
then updating self's dictionary with them... why? What do you hope to
achieve?

If I were project manager, and one of my coders wrote something like this,
I would expect him or her to have a really good reason for it. I'd be
thinking not only of hidden bugs ("what if there is something in locals
you don't expect?"), but every time a developer has to work on this class,
they have to stop and think about it.

Joel (of Joel On Software fame) talks about code looking wrong and
smelling dirty. This code might work. It might be perfectly safe. But
there's a whiff to this code.

http://www.joelonsoftware.com/articles/Wrong.html

>     def __repr__(self):
>         all_items = self.__dict__.items()
>         return '%s,%s,%s' % (self.height, self.length, self.function)

You aren't using all_items. Why waste a lookup fetching it?

> I have a csv file that I use to store and retrieve all the info about
> each Item, one item per line.

Would you like to give us a couple of examples of items from the CSV file?

> I have written a little piece of Python that lets me search through all
> Items (after reading them into a variable called all_items) and will
> return matching results:
>
> for item in all_items:
>
>     strItem = str(item)
>
>     m = re.search(p[i], strItem, flags = re.I)
>     if m:
>         height = getattr(item, "height")
>         length = getattr(item, "length")
>         function = getattr(item, "function")
>         print "height is %s, length is %s and function is %s" % \
>             (height, length, function)

And here we see why global variables are Bad: without knowing what p is, how
are we supposed to understand this code?

> This has the limitation of only working over a single search item.

So you are searching items for items... I think you need to use a better
name for your class. What does class Item actually represent?

> I
> want to be able to search over an uncontrollable number of search
> strings because I will have people wanting to search over 2, 3 or even
> (maybe) as many as 5 different things.
>
> I was thinking that I would try to write a function that created a
> sublist of Items if it matched and then run subsequent searches over
> the subsequent search strings using this sublist.

That might work.

> I am not entirely sure how to store this subset of Items in such a way
> that I can make searches over it.

How about in a list?

> I guess I have to initialize a
> variable of type Item, which I can use to add matching Items to, but
> I have no idea how to do that. (If it was just a list I could say
> "sublist = []"; what do I use for self-defined classes?)

See my next post (to follow).

> I am also
> unsure how to go about creating a function that will accept any number
> of parameters.


def func1(*args):
    for arg in args:
        print arg

def func2(mandatory, *args):
    print "Mandatory", mandatory
    for arg in args:
        print arg

Does that help?

--
Steven.

Oct 22 '05 #3
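
The two answers above ("how about in a list?" and *args) can be combined into one function. This is a rough sketch, assuming Python 3; search_items and the sample records are made up for illustration, and plain strings stand in for Item objects since the function only relies on str().

```python
import re


def search_items(items, *patterns):
    """Return a plain list of the items whose str() matches every pattern."""
    regexes = [re.compile(p, re.I) for p in patterns]
    sublist = []  # an ordinary list can hold instances of any class
    for item in items:
        text = str(item)
        if all(rx.search(text) for rx in regexes):
            sublist.append(item)
    return sublist


# Made-up records; anything with a useful str() would work here.
records = ["10,20,bracket", "10,35,hinge", "12,20,Bracket"]

print(search_items(records, "bracket"))
print(search_items(records, "bracket", "^10,"))
print(search_items(records))
```

Called with no patterns at all, the function returns every item, because all() of an empty sequence is true. Whether that is the desired behaviour for an empty search is a design choice.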
On Fri, 21 Oct 2005 13:39:17 -0700, googleboy wrote:
> Hi there.
>
> I have defined a class called Item with several (about 30 I think)
> different attributes (is that the right word in this context?).


Generally speaking, attributes shouldn't be used for storing arbitrary
items in an object. That's what mapping objects like dicts are for. I
would change your class so that it no longer mucked about with its
internal __dict__:

class Item(object):
    def __init__(self, height, length, function, **kwargs):
        # assumes that ALL items will have height, length, function
        # plus an arbitrary number (may be zero) of keyword args
        self.height = height
        self.length = length
        self.function = function
        self.data = kwargs  # store custom data in an instance attribute,
                            # NOT in the object __dict__

You would use it something like this:

def create_items():
    all_items = []
    # WARNING WARNING WARNING
    # pseudo-code -- this doesn't work because I don't
    # know what your input file looks like
    open input file
    for record in input file:
        h = read height
        l = read length
        f = read function
        D = {}
        for any more items in record:
            D[item key] = item value
        # unpack D so the extra values arrive as keyword arguments
        newitem = Item(h, l, f, **D)
        all_items.append(newitem)
    close input file
    return all_items
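
One way the pseudo-code above might look in practice is sketched below. It assumes Python 3 and, importantly, a CSV file with a header row naming the columns, which the original post never confirms. The colour column and the sample rows are invented for illustration.

```python
import csv
import io


class Item(object):
    def __init__(self, height, length, function, **kwargs):
        self.height = height
        self.length = length
        self.function = function
        self.data = kwargs  # any extra columns end up here


def create_items(csv_file):
    """Build one Item per row; assumes a header row naming each column."""
    all_items = []
    for row in csv.DictReader(csv_file):
        height = row.pop("height")
        length = row.pop("length")
        function = row.pop("function")
        # Whatever columns remain are passed through as keyword arguments.
        all_items.append(Item(height, length, function, **row))
    return all_items


# Made-up sample data; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "height,length,function,colour\n"
    "10,20,bracket,red\n"
    "12,20,hinge,blue\n"
)
items = create_items(sample)
print(items[0].height, items[0].data)
```

Note that the csv module returns every value as a string, so numeric fields such as height would need converting before any numeric comparison.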

Now you have processed your input file and have a list of Items. So let's
search for some!

Firstly, create a function that searches a single Item:

def SearchOneOr(source, height=None, length=None,
                function=None, **kwargs):
    """Performs a short-circuit OR search for one or more search terms."""
    if height is not None:
        if source.height == height: return True
    if length is not None:
        if source.length == length: return True
    if function is not None:
        if source.function == function: return True
    for key, value in kwargs.items():
        if key in source.data and source.data[key] == value:
            return True
    return False

def SearchOneAnd(source, height=None, length=None,
                 function=None, **kwargs):
    """Performs a short-circuit AND search for one or more search terms."""
    if height is not None:
        if source.height != height: return False
    if length is not None:
        if source.length != length: return False
    if function is not None:
        if source.function != function: return False
    for key, value in kwargs.items():
        # a missing key counts as a failed match
        if key not in source.data or source.data[key] != value:
            return False
    return True

Now create a function that searches all items:

def SearchAll(source_list, flag, height=None, length=None,
              function=None, **kwargs):
    # flag true means OR search, flag false means AND search
    found = []
    if flag:
        search = SearchOneOr
    else:
        search = SearchOneAnd
    for source in source_list:
        if search(source, height, length, function, **kwargs):
            found.append(source)
    return found

Now pass all_items to SearchAll as the first argument, and it will search
through them all and return a list of all the items which match your
search terms.

Hope this helps.
--
Steven.

Oct 22 '05 #4
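
For comparison, the AND/OR search described in the post above can be condensed with the built-ins any() and all(). This is a sketch under assumptions, not a drop-in replacement: it targets Python 3, and field, search_all, match_any and the sample stock list are hypothetical names introduced here.

```python
class Item(object):
    def __init__(self, height, length, function, **kwargs):
        self.height = height
        self.length = length
        self.function = function
        self.data = kwargs


def field(item, name):
    """Look a search field up as a fixed attribute first, then in item.data."""
    if name in ("height", "length", "function"):
        return getattr(item, name)
    return item.data.get(name)


def search_all(items, match_any=False, **criteria):
    """Return items matching any (OR) or all (AND) of the keyword criteria."""
    combine = any if match_any else all
    return [
        item for item in items
        if combine(field(item, name) == wanted
                   for name, wanted in criteria.items())
    ]


# Made-up sample data for illustration only.
stock = [
    Item(10, 20, "bracket", colour="red"),
    Item(10, 35, "hinge", colour="blue"),
    Item(12, 20, "bracket", colour="red"),
]

and_hits = search_all(stock, height=10, colour="red")
or_hits = search_all(stock, match_any=True, height=12, function="hinge")
print(len(and_hits), len(or_hits))
```

With no criteria, the AND form returns every item and the OR form returns none, which mirrors how the two single-item functions in the post behave when no search terms are given.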
