how to search multiple textfiles ?

hello,

I want to search multiple textfiles (python source files) for a specific
word.
I can find all files, open them and do a search,
but I guess that will be rather slow.

I couldn't find any relevant information through google.

Does anyone know of a search library that performs this task fast ?

If it indeed only concerns py-files,
is there another way of searching words ?
( I could imagine that such a "py-only-search" would have benefits,
because you could set a flag to see the words in comment yes or no )

thanks,
Stef Mientki
Het UMC St Radboud staat geregistreerd bij de Kamer van Koophandel in het handelsregister onder nummer 41055629.
The Radboud University Nijmegen Medical Centre is listed in the Commercial Register of the Chamber of Commerce under file number 41055629.
Sep 26 '08 #1
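The "py-only-search" idea with a comment flag can be sketched with the standard library's tokenize module, which separates identifiers, comments, and string literals. This is only an illustration under stated assumptions: the function name `find_word` and the `include_comments` flag are made up for this example, and the source must tokenize cleanly (a file with unbalanced brackets or bad indentation can raise an error).

```python
import io
import re
import tokenize

def find_word(source, word, include_comments=False):
    """Return sorted line numbers where `word` occurs as an identifier.

    With include_comments=True, comment text is searched as well.
    String literals are ignored in both modes.
    (Hypothetical helper, not part of any library.)
    """
    whole_word = re.compile(r"\b" + re.escape(word) + r"\b")
    hits = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == word:
            hits.add(tok.start[0])  # start is a (row, column) pair
        elif include_comments and tok.type == tokenize.COMMENT:
            if whole_word.search(tok.string):
                hits.add(tok.start[0])
    return sorted(hits)
```

For a file on disk, read its text first and pass it as `source`. Whether this is fast enough depends on the size of the source tree, since tokenizing costs more than a plain substring test.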
On Windows I use the free version of Bare Grep: http://www.baremetalsoft.com/baregrep/

No, it's not a Python solution, but it works for my needs. You should
try using Python to search your script files and see if it really is
too slow though.

Mike
Sep 26 '08 #2
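To try the suggestion above and measure whether plain Python really is too slow, a rough sketch could look like this. The names `files_containing` and `timed_search` are invented for the example, and the UTF-8 decoding with ignored errors is an assumption about the files being searched.

```python
import os
import time

def files_containing(root, word, suffix=".py"):
    """Return paths under `root` whose text contains `word` (plain substring)."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # errors="ignore" drops undecodable bytes instead of raising.
            with open(path, encoding="utf-8", errors="ignore") as handle:
                if word in handle.read():
                    matches.append(path)
    return matches

def timed_search(root, word):
    """Run the search once and return (matches, elapsed_seconds)."""
    start = time.perf_counter()
    matches = files_containing(root, word)
    return matches, time.perf_counter() - start
```

A second run over the same tree is usually faster because the operating system has cached the files, so timing more than one run gives a fairer picture.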
If you're on *nix platform, you can use:

$ find -name "*py" | xargs egrep "\bword\b"

HTH,
George
Sep 26 '08 #3
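For anyone without find and egrep, roughly the same whole-word search can be written with pathlib and re. This is a sketch, not a drop-in replacement: `grep_word` is a made-up name, and it matches `*.py` rather than the `*py` pattern in the command above, which would also pick up any other name that happens to end in "py".

```python
import re
from pathlib import Path

def grep_word(root, word):
    """Yield (path, line_number, line) for whole-word matches in *.py files."""
    # \b matches at word boundaries, as in the egrep pattern above.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    for path in sorted(Path(root).rglob("*.py")):
        if not path.is_file():
            continue  # skip directories whose names end in .py
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                yield path, number, line
```

Because it is a generator, results can be printed as they are found, for example with `for path, number, line in grep_word(".", "word"): print(path, number, line)`.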
Hi!

On Windows, you can use the (standard) command findstr

Example:
findstr /n /s /I strsearched *.py

@-salutations
--
Michel Claveau

Sep 26 '08 #4
Stef Mientki <s.*******@ru.nl> writes:
> Does anyone know of a search library that performs this task fast ?
You mean you want a Python search engine (with inverted indexes and all that)?
Try: nucular.sf.net
Sep 26 '08 #5
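For readers unfamiliar with the term, an inverted index maps each word to the documents that contain it, so a lookup no longer has to rescan every file. The toy version below only illustrates the idea; it is not the nucular API, and `build_index` and `lookup` are names invented for this sketch.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each lowercase word to the set of document names containing it.

    `documents` is a dict of {name: text}.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index

def lookup(index, word):
    """Return the sorted document names that contain `word`."""
    return sorted(index.get(word.lower(), ()))
```

Building the index costs one full pass over the files, so it pays off only when the same tree is searched many times.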
I use 'fgrep', i.e. `fgrep -r "toFind" /source`

~Sean
Sep 26 '08 #6
Hi!

Thanks for the feedback.

Some background: I found long ago that it is often faster to call a
Windows command than to write the equivalent in a high-level language
(or even a low-level one).

FINDSTR is fast, but internal commands such as DIR (with all its
options) are faster still.
It is also faster to read the result through a pipe.
So I frequently use this sort of function:
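import os

def cmdone(repstart, commande, moderetour="LIST"):
    # Run a shell command from directory repstart and read its output via a pipe.
    os.chdir(repstart)
    sret = ''.join(os.popen(commande))
    # "STR" returns the raw output; any other mode returns a list of lines.
    if moderetour.upper() == "STR":
        return sret
    else:
        return sret.split('\n')

print(cmdone('D:\\dev\\python', 'findstr /N /I ponx *.py', 'STR'))
print('')
print(cmdone('D:\\dev\\python', 'dir *.jpg /B'))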


Sorry for my bad english, and have a good day...
--
Michel Claveau
Sep 27 '08 #7
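On current Python versions the same pipe-reading idea can be written with the subprocess module, which takes the working directory as an argument instead of changing it for the whole process. This is a sketch under assumptions: `run_in_dir` is a hypothetical name, and it needs Python 3.7 or later for `capture_output`.

```python
import subprocess

def run_in_dir(directory, command):
    """Run `command` (a list of arguments) in `directory`; return its output lines."""
    result = subprocess.run(
        command,
        cwd=directory,        # no os.chdir, so the caller's working directory is untouched
        capture_output=True,  # collect stdout and stderr through pipes
        text=True,            # decode the output to str
        check=False,          # a non-zero exit status does not raise
    )
    return result.stdout.splitlines()
```

On Windows a call might look like `run_in_dir('D:\\dev\\python', ['findstr', '/N', '/I', 'ponx', '*.py'])`. Internal commands such as DIR are built into cmd.exe, so they would have to be started through it, for example `['cmd', '/c', 'dir', '*.jpg', '/B']`.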
In message <bf**********************************@k30g2000hse.googlegroups.com>, George Sakkis wrote:
> $ find -name "*py" | xargs egrep "\bword\b"
Better:

find -name '*.py' -exec grep -E "\bword\b" {} \;

Sep 29 '08 #8
On Sep 29, 5:16 am, Lawrence D'Oliveiro <l...@geek-central.gen.new_zealand> wrote:
> In message <bf1664da-1e2e-48c8-a108-66b0fb457...@k30g2000hse.googlegroups.com>, George Sakkis wrote:
>> $ find -name "*py" | xargs egrep "\bword\b"
>
> Better:
>
>     find -name '*.py' -exec grep -E "\bword\b" {} \;

In what way is this better ? I don't dispute it, I'm just curious.
Sep 29 '08 #9
In message <ma**************************************@python.org>, Stef Mientki wrote:
> Lawrence D'Oliveiro wrote:
>> In message <ma**************************************@python.org>, Stef Mientki wrote:
>>> I'm really amazed by the speed of Python !!
>>> It can only be beaten by findstr, which is only available on windows.
>>
>> Did you try find -exec grep -F?
>
> well my windows version doesn't understand that :

I assumed when you said "It can only be beaten by findstr, which is only
available on windows", that meant you had tried some non-Windows options,
before concluding that Windows "findstr" was the fastest.
Oct 1 '08 #10
