How to search multiple text files?

Hello,

I want to search multiple text files (Python source files) for a specific
word.
I can find all the files, open them and do a search,
but I guess that will be rather slow.

I couldn't find any relevant information through Google.

Does anyone know of a search library that performs this task fast?

If it indeed only concerns .py files,
is there another way of searching for words?
(I could imagine that such a "py-only search" would have benefits,
because you could set a flag to include or exclude words in comments.)
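
One way the "py-only search" with a comment flag could look is sketched below, using the standard-library `tokenize` module to tell comments apart from code. The helper name `find_word` and its `include_comments` parameter are hypothetical, chosen only for illustration, and the sketch assumes the source text tokenizes cleanly (invalid Python raises an error).

```python
import io
import re
import tokenize


def find_word(source, word, include_comments=True):
    """Return sorted line numbers where `word` occurs as a whole word.

    Hypothetical helper: with include_comments=False, COMMENT tokens are
    skipped, so hits that occur only inside comments are ignored.
    A hit inside a multi-line string is reported at the line where the
    string token starts.
    """
    pattern = re.compile(r"\b%s\b" % re.escape(word))
    hits = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and not include_comments:
            continue
        if pattern.search(tok.string):
            hits.add(tok.start[0])
    return sorted(hits)


sample = "spam = 1  # eggs\neggs = spam + 1\n"
print(find_word(sample, "eggs"))                          # code and comments
print(find_word(sample, "eggs", include_comments=False))  # code only
```

Tokenizing is slower than a plain text scan, so this is probably only worth it when the comment/code distinction matters.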

thanks,
Stef Mientki
The Radboud University Nijmegen Medical Centre is listed in the Commercial Register of the Chamber of Commerce under file number 41055629.
Sep 26 '08 #1
On Windows I use the free version of Bare Grep: http://www.baremetalsoft.com/baregrep/

No, it's not a Python solution, but it works for my needs. You should
try using Python to search your script files and see if it really is
too slow though.
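
A minimal sketch of the "just try it in Python and time it" approach follows. It assumes Python 3, UTF-8 source files and a whole-word match; the function name `search_files` and the temporary demo files are made up for the example.

```python
import re
import tempfile
import time
from pathlib import Path


def search_files(root, word, glob="*.py"):
    """Yield (path, line_number, line) for whole-word matches under root."""
    pattern = re.compile(r"\b%s\b" % re.escape(word))
    for path in Path(root).rglob(glob):
        if not path.is_file():
            continue
        try:
            with path.open(encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if pattern.search(line):
                        yield path, number, line.rstrip("\n")
        except OSError:
            continue  # unreadable file: skip it


# Demo on a throwaway directory; point `root` at a real source tree to
# see how long a search actually takes there.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.py").write_text("import os\nimportant = 1\n", encoding="utf-8")
    Path(tmp, "b.txt").write_text("import sys\n", encoding="utf-8")
    started = time.perf_counter()
    matches = list(search_files(tmp, "import"))
    elapsed = time.perf_counter() - started
    for path, number, line in matches:
        print("%s:%d: %s" % (path.name, number, line))
    print("%d match(es) in %.4f s" % (len(matches), elapsed))
```

Reading line by line keeps memory use flat; whether it is fast enough depends on the size of the tree, which is exactly what the timing is meant to show.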

Mike
Sep 26 '08 #2
If you're on a *nix platform, you can use:

$ find -name "*py" | xargs egrep "\bword\b"

HTH,
George
Sep 26 '08 #3
Hi!

On Windows, you can use the (standard) command findstr

Example:
findstr /n /s /I strsearched *.py

@-salutations
--
Michel Claveau

Sep 26 '08 #4
Stef Mientki <s.*******@ru.nl> writes:
> Does anyone know of a search library that performs this task fast ?

You mean you want a Python search engine (with inverted indexes and all that)?
Try: nucular.sf.net
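
For readers unfamiliar with the term, here is a toy inverted index in plain Python. It is only a sketch of the general idea and does not use or imitate nucular's API; the names `build_index` and `WORD` are invented for the example, and the "documents" are in-memory strings standing in for file contents.

```python
import re
from collections import defaultdict

# Identifier-like words; an assumption that suits Python source files.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_index(documents):
    """Map each word to the set of (name, line_number) places it occurs.

    `documents` is a dict of {name: text}. In practice the texts would be
    read from the .py files once and the index reused for every query.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for number, line in enumerate(text.splitlines(), start=1):
            for word in WORD.findall(line):
                index[word].add((name, number))
    return index


docs = {
    "a.py": "import os\nprint(os.getcwd())\n",
    "b.py": "def getcwd():\n    return '.'\n",
}
index = build_index(docs)
print(sorted(index["getcwd"]))
print(sorted(index.get("missing", ())))  # .get avoids adding empty entries
```

Building the index costs one full pass over the files, so it pays off only when the same files are searched many times.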
Sep 26 '08 #5
I use 'fgrep', i.e. `fgrep -r "toFind" /source`

~Sean
Sep 26 '08 #6
Hi !

Thanks for the feedback.

Some background: I found long ago that it is often faster to call a
Windows command than to develop the same thing in a high-level language
(or even a low-level one).

FINDSTR is fast, but the shell's internal commands are faster still.
Example: DIR (with all its options).
It is also faster to read the result through a pipe.
So I frequently use this sort of function:
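import os

def cmdone(repstart, commande, moderetour="LIST"):
    # Run `commande` from the directory `repstart` and read its output
    # through a pipe.
    os.chdir(repstart)
    sret = ''.join(os.popen(commande))
    # "STR" returns the raw output as one string; anything else returns
    # a list of lines.
    if moderetour.upper() == "STR":
        return sret
    else:
        return sret.split('\n')

# print(...) with a single argument works in both Python 2 and Python 3.
print(cmdone('D:\\dev\\python', 'findstr /N /I ponx *.py', 'STR'))
print('')
print(cmdone('D:\\dev\\python', 'dir *.jpg /B'))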
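
The same idea can also be sketched with `subprocess.run`, which avoids changing the working directory of the whole process. This assumes Python 3.7 or later; `run_in_dir` is a hypothetical name, and `echo hello` is only a placeholder command that exists on both Windows and Unix shells.

```python
import subprocess


def run_in_dir(start_dir, command, mode="LIST"):
    """Run a shell command in start_dir and return its standard output.

    mode="STR" returns one string; anything else returns a list of lines.
    """
    result = subprocess.run(
        command,
        shell=True,              # needed for shell built-ins such as DIR
        cwd=start_dir,           # no process-wide os.chdir()
        stdout=subprocess.PIPE,  # read the output through a pipe
        text=True,               # decode bytes to str
    )
    if mode.upper() == "STR":
        return result.stdout
    return result.stdout.splitlines()


print(run_in_dir(".", "echo hello"))
```

Unlike `split('\n')`, `splitlines()` does not leave an empty string at the end of the list. With `shell=True`, the command string should not be built from untrusted input.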


Sorry for my bad English, and have a good day...
--
Michel Claveau
Sep 27 '08 #7
In message
<bf**********************************@k30g2000hse.googlegroups.com>, George
Sakkis wrote:
> $ find -name "*py" | xargs egrep "\bword\b"

Better:

find -name '*.py' -exec grep -E "\bword\b" {} \;

Sep 29 '08 #8
On Sep 29, 5:16 am, Lawrence D'Oliveiro <l...@geek-central.gen.new_zealand> wrote:
> In message
> <bf1664da-1e2e-48c8-a108-66b0fb457...@k30g2000hse.googlegroups.com>, George
> Sakkis wrote:
>> $ find -name "*py" | xargs egrep "\bword\b"
>
> Better:
>
>     find -name '*.py' -exec grep -E "\bword\b" {} \;

In what way is this better? I don't dispute it, I'm just curious.
Sep 29 '08 #9
In message <ma**************************************@python.org>, Stef
Mientki wrote:
> Lawrence D'Oliveiro wrote:
>> In message <ma**************************************@python.org>, Stef
>> Mientki wrote:
>>> I'm really amazed by the speed of Python !!
>>> It can only be beaten by findstr, which is only available on windows.
>>
>> Did you try find -exec grep -F?
>
> well my windows version doesn't understand that :

I assumed when you said "It can only be beaten by findstr, which is only
available on windows", that meant you had tried some non-Windows options,
before concluding that Windows "findstr" was the fastest.
Oct 1 '08 #10
