Parsing a search string

Happy new year! Since I have run out of alcohol, I'll ask a question that I
haven't really worked out an answer for yet. Is there an elegant way to turn
something like:
moo cow "farmer john" -zug


into:

['moo', 'cow', 'farmer john'], ['zug']

I'm trying to parse a search string so I can use it for SQL WHERE constraints,
preferably without horrifying regular expressions. Uhh yeah.

From 2005,
Freddie

Jul 18 '05 #1
How ,
I just posted on something similar earlier ;)
OK, first of all you might want to try shlex; it is in the standard
library.
If you don't know what cStringIO is, don't worry about it: it is just there
to give shlex a file-like object.
If you have a file, just pass it in opened.
example: a = shlex.shlex(open('mytxt.txt', 'r'))

py> import cStringIO
py> import shlex
py> d = cStringIO.StringIO()
py> d.write('moo cow "farmer john" -zug')
py> d.seek(0)
py> a = shlex.shlex(d)
py> a.get_token()
'moo'
py> a.get_token()
'cow'
py> a.get_token()
'"farmer john"'
py> a.get_token()
'-'
py> a.get_token()
'zug'
py> a.get_token()
''
# OK, we try again; this time we add '-' to the word characters so that
# '-zug' comes back as a single token.
py> d.seek(0)
py> a = shlex.shlex(d)
py> a.wordchars += '-' # add the hyphen
py> a.get_token()
'moo'
py> a.get_token()
'cow'
py> a.get_token()
'"farmer john"'
py> a.get_token()
'-zug'
py> a.get_token()
''
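
For readers on Python 3, where cStringIO no longer exists, the session above can be approximated with io.StringIO. This is only a sketch under that assumption; the variable names are illustrative, and it relies on shlex objects being iterable so the tokens can be collected in one go.

```python
import io
import shlex

# io.StringIO stands in for cStringIO.StringIO as the file-like object.
stream = io.StringIO('moo cow "farmer john" -zug')

lexer = shlex.shlex(stream)
lexer.wordchars += '-'   # treat the hyphen as part of a word

# Iterating a shlex object yields tokens until end of input.
tokens = list(lexer)
print(tokens)
```

As in the session above, the quoted phrase still carries its quotes, so they have to be stripped in a later step.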

Hth,
M.E.Farmer

Jul 18 '05 #2
That's not bad going considering you've only run out of alcohol at 6 in
the morning and *then* ask python questions.

Anyway - you could write a character-by-character parser function that
would do that in a few minutes...
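
A rough sketch of what such a character-by-character parser might look like in Python 3. The function name parse_search and its edge-case behaviour (an unterminated quote swallows the rest of the string, a lone hyphen is ignored) are assumptions made for illustration, not something specified in this thread.

```python
def parse_search(text):
    """Split a search string into (wanted, unwanted) lists of terms."""
    wanted, unwanted = [], []
    i, n = 0, len(text)
    while i < n:
        if text[i].isspace():
            i += 1
            continue
        # A leading hyphen marks a term to exclude.
        negate = text[i] == '-'
        if negate:
            i += 1
        if i < n and text[i] == '"':
            # Quoted phrase: read up to the closing quote.
            end = text.find('"', i + 1)
            if end == -1:
                end = n          # assumption: unterminated quote takes the rest
            term = text[i + 1:end]
            i = end + 1
        else:
            # Bare word: read up to the next whitespace.
            start = i
            while i < n and not text[i].isspace():
                i += 1
            term = text[start:i]
        if term:
            (unwanted if negate else wanted).append(term)
    return wanted, unwanted


print(parse_search('moo cow "farmer john" -zug'))
```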

My 'listquote' module has one - but it splits on commas, not whitespace.
Sounds like you're looking for a one-liner though... regular
expressions *could* do it...
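
For comparison, a sketch of the regular-expression route in Python 3. The pattern and the name parse_with_regex are illustrative assumptions; the pattern only handles double quotes and an optional leading hyphen, and it does nothing clever with unbalanced quotes.

```python
import re

# Optional hyphen, then either a double-quoted phrase or a run of non-space characters.
TERM = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')


def parse_with_regex(text):
    wanted, unwanted = [], []
    for minus, quoted, bare in TERM.findall(text):
        term = quoted or bare
        if term:
            (unwanted if minus else wanted).append(term)
    return wanted, unwanted


print(parse_with_regex('moo cow "farmer john" -zug'))
```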

Regards,

Fuzzy
http://www.voidspace.org.uk/atlantib...tml#llistquote

Jul 18 '05 #3
Freddie wrote:
> Happy new year! Since I have run out of alcohol, I'll ask a question that I
> haven't really worked out an answer for yet. Is there an elegant way to turn
> something like:
>
> moo cow "farmer john" -zug
>
> into:
>
> ['moo', 'cow', 'farmer john'], ['zug']
>
> I'm trying to parse a search string so I can use it for SQL WHERE constraints,
> preferably without horrifying regular expressions. Uhh yeah.


The shlex approach, finished:

import shlex

searchstring = 'moo cow "farmer john" -zug'
lexer = shlex.shlex(searchstring)
lexer.wordchars += '-'
poslist, neglist = [], []
while True:
    token = lexer.get_token()
    # token is '' on EOF
    if not token:
        break
    # remove the surrounding quotes
    if token[0] in '"\'':
        token = token[1:-1]
    # select which list to put it in
    if token.startswith('-'):
        neglist.append(token[1:])
    else:
        poslist.append(token)
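
Since the original question was about SQL WHERE constraints, here is one hedged way the two lists could be turned into a parameterised clause in Python 3. The function build_where, the column name body, the use of LIKE, and the `?` placeholder style (as used by sqlite3) are all assumptions for illustration; other database drivers use different placeholders.

```python
def build_where(column, wanted, unwanted):
    """Return (where_clause, params) for use with a parameterised query.

    `column` must be a trusted identifier, never user input.
    """
    clauses, params = [], []
    for term in wanted:
        clauses.append("%s LIKE ?" % column)
        params.append("%" + term + "%")
    for term in unwanted:
        clauses.append("%s NOT LIKE ?" % column)
        params.append("%" + term + "%")
    return " AND ".join(clauses), params


where, params = build_where("body", ['moo', 'cow', 'farmer john'], ['zug'])
print(where)
print(params)
```

Keeping the search terms in the parameter list rather than pasting them into the SQL text leaves the quoting to the database driver. With two empty lists the clause comes back as an empty string, so the caller has to decide whether to omit WHERE entirely.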

regards,
Reinhold
Jul 18 '05 #4
As I noted before, shlex requires a file-like object or an open file.
py> import shlex
py> a = shlex.shlex('fg fgfg dgfgfdgfdg')
py> a.get_token()
Traceback (most recent call last):
  File "<input>", line 1, in ?
  File ".\shlex.py", line 74, in get_token
    raw = self.read_token()
  File ".\shlex.py", line 100, in read_token
    nextchar = self.instream.read(1)
AttributeError: 'str' object has no attribute 'read'

M.E.Farmer

Jul 18 '05 #5
M.E.Farmer wrote:
> As I noted before shlex requires a file like object or a open file .
> py> import shlex
> py> a = shlex.shlex('fg fgfg dgfgfdgfdg')
> py> a.get_token()
> Traceback (most recent call last):
>   File "<input>", line 1, in ?
>   File ".\shlex.py", line 74, in get_token
>     raw = self.read_token()
>   File ".\shlex.py", line 100, in read_token
>     nextchar = self.instream.read(1)
> AttributeError: 'str' object has no attribute 'read'


Which Python version are you using?

The docs say that since Py2.3 strings are accepted.
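
On a current Python 3 the string can be handed over directly, and shlex.split goes one step further by stripping the quotes as well. A small sketch under those assumptions; the helper name split_search is made up here.

```python
import shlex


def split_search(query):
    """Partition shlex.split() output into (wanted, unwanted) terms."""
    wanted, unwanted = [], []
    for token in shlex.split(query):
        # A token that starts with '-' (and is more than just '-') is excluded.
        if token.startswith('-') and len(token) > 1:
            unwanted.append(token[1:])
        else:
            wanted.append(token)
    return wanted, unwanted


print(split_search('moo cow "farmer john" -zug'))
```

One thing to keep in mind is that shlex.split raises ValueError on unbalanced quotes, so user input probably wants a try/except around the call.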

regards,
Reinhold
Jul 18 '05 #6
I am right in the middle of doing text parsing so I used your example as a
mental exercise. :-)

Here's a NDFA for your text:

     b    0    1-9  a-Z  ,    .    +    -    '    "    \n
S0:  S0   E    E    S1   E    E    E    S3   E    S2   E
S1:  T1   E    E    S1   E    E    E    E    E    E    T1
S2:  S2   E    E    S2   E    E    E    E    E    T2   E
S3:  T3   E    E    S3   E    E    E    E    E    E    T3
S3: T3 E E S3 E E E E E E T3

and the end-states are:

E: error in text
T1: You have the words: moo, cow
T2: You get "farmer john" (w quotes)
T3: You get zug

Can't guarantee that I did it right - I did it really quickly - and it's
*specific* to your text string.

Now just need to hire a programmer to write some clean Python parsing code.
:-)
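
A minimal sketch of how the transition table above could be driven from Python 3. The names char_class, TRANSITIONS and run_table are made up for illustration. The sketch assumes that every terminal state hands over its token and goes back to S0, that the input ends with a newline as the table expects, and that the delimiters (the quotes and the leading hyphen) are dropped, which differs slightly from T2 as described above.

```python
import string


def char_class(ch):
    """Map a character to its column in the table, or None if it has no column."""
    if ch == " ":
        return "b"
    if ch in string.ascii_letters:
        return "a-Z"
    if ch == "0":
        return "0"
    if ch in "123456789":
        return "1-9"
    if ch in ",.+-'\"\n":
        return ch
    return None


# Only the non-error cells are listed; everything else counts as E.
TRANSITIONS = {
    "S0": {"b": "S0", "a-Z": "S1", "-": "S3", '"': "S2"},
    "S1": {"b": "T1", "a-Z": "S1", "\n": "T1"},
    "S2": {"b": "S2", "a-Z": "S2", '"': "T2"},
    "S3": {"b": "T3", "a-Z": "S3", "\n": "T3"},
}


def run_table(text):
    positive, negative = [], []
    state, buf = "S0", []
    for ch in text:
        nxt = TRANSITIONS[state].get(char_class(ch), "E")
        if nxt == "E":
            raise ValueError("unexpected %r in state %s" % (ch, state))
        if nxt in ("T1", "T2"):      # plain word or quoted phrase finished
            positive.append("".join(buf))
            state, buf = "S0", []
        elif nxt == "T3":            # excluded word finished
            negative.append("".join(buf))
            state, buf = "S0", []
        else:
            # Collect the character unless it is a delimiter or leading blank.
            if nxt != "S0" and ch not in '-"':
                buf.append(ch)
            state = nxt
    return positive, negative


print(run_table('moo cow "farmer john" -zug\n'))
```

Because digits and punctuation map to E in every row, anything outside letters, blanks, double quotes and a leading hyphen raises ValueError, which matches the remark that the table is specific to this one string.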

--
It's me

"Freddie" <li**********@zebra-madcowdisease.giraffe-org> wrote in message
news:kX****************@text.usenetserver.com...
> Happy new year! Since I have run out of alcohol, I'll ask a question that I
> haven't really worked out an answer for yet. Is there an elegant way to turn
> something like:
>
> moo cow "farmer john" -zug
>
> into:
>
> ['moo', 'cow', 'farmer john'], ['zug']
>
> I'm trying to parse a search string so I can use it for SQL WHERE constraints,
> preferably without horrifying regular expressions. Uhh yeah.
>
> From 2005,
> Freddie


Jul 18 '05 #7
Ah! that is what the __future__ brings I guess.........
Damn that progress making me outdated ;)
Python 2.2.3 ( a lot of extensions I use are stuck there , so I still
use it)
M.E.Farmer

Jul 18 '05 #8
M.E.Farmer wrote:
> Ah! that is what the __future__ brings I guess.........
> Damn that progress making me outdated ;)
> Python 2.2.3 ( a lot of extensions I use are stuck there , so I still
> use it)


I'm also pleasantly surprised at how many cute little additions there are
in every new Python version. Many thanks to the great devs!

Reinhold
Jul 18 '05 #9
"It's me" wrote:
> Here's a NDFA for your text:
>
>      b    0    1-9  a-Z  ,    .    +    -    '    "    \n
> S0:  S0   E    E    S1   E    E    E    S3   E    S2   E
> S1:  T1   E    E    S1   E    E    E    E    E    E    T1
> S2:  S2   E    E    S2   E    E    E    E    E    T2   E
> S3:  T3   E    E    S3   E    E    E    E    E    E    T3


Now if I only had an NDFA for parsing that syntax...

:)
Andrew
da***@dalkescientific.com

Jul 18 '05 #10
