File Search

Framework version : 1.1

For a given directory (which may have subdirectories), I need to identify
the number of text files (*.txt). I have tried a recursive method to
search for the files. It works fine for small directories, but it
takes 2 or 3 minutes to search a directory of 8GB (say "C:"). Is there
a quicker way to find out whether files of a given type (txt) are
present in the given directory, avoiding the sequential search of the
recursive method?
Feb 14 '06 #1
I guess one question would be: how long does it take Windows to perform the
same search? If it takes about the same time you're probably not doing too
much wrong.

Under 2.0 the GetFiles method can accept
System.IO.SearchOption.AllDirectories, which recurses on your behalf, but to
be honest if all you want to do is count them I'm not even sure that this is
the best option, as this will end up returning a relatively big array; I
don't know (without trying) how optimised this is; it *might* still be your
best option to iterate through the directories calling GetFiles.

The following takes about 8 seconds to search my c: drive:

static void Main(string[] args)
{
    Console.WriteLine(CheckDir(new DirectoryInfo("c:\\")));
}

static int counter = 0;

static int CheckDir(DirectoryInfo di)
{
    int count = 0;
    try
    { // watch out for permission denied ;-p
        count += di.GetFiles("*.txt").Length;
        foreach (DirectoryInfo subDi in di.GetDirectories())
        {
            count += CheckDir(subDi);
        }
    }
    catch {} // lazy
    if (++counter % 100 == 0) Console.WriteLine(counter); // just to see it is working
    return count;
}
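
For comparison, the 2.0 SearchOption.AllDirectories route mentioned above can be sketched roughly as below. This is only a sketch assuming .NET 2.0 or later; the class and method names (AllDirectoriesCount, CountFiles) and the -1 return value are made up for illustration. Since one unreadable sub-folder aborts the whole call, there is no partial count to fall back on.

```csharp
using System;
using System.IO;

static class AllDirectoriesCount
{
    // Hypothetical helper: counts files matching 'pattern' under 'root'
    // with a single framework call. Returns -1 if the search was aborted
    // because some folder could not be read.
    public static int CountFiles(string root, string pattern)
    {
        try
        {
            // The framework recurses on our behalf, but it builds the
            // complete string[] of paths before returning.
            return Directory.GetFiles(root, pattern, SearchOption.AllDirectories).Length;
        }
        catch (UnauthorizedAccessException)
        {
            // A single permission-denied folder aborts the whole search.
            return -1;
        }
    }
}
```

Calling AllDirectoriesCount.CountFiles("c:\\", "*.txt") would be the equivalent of the CheckDir call above, minus the per-directory error handling.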
Feb 14 '06 #2

"Dhans" <Dh***@discussions.microsoft.com> wrote in message
news:AB**********************************@microsoft.com...
Framework version : 1.1

For a given directory (which may have subdirectories) , I need to identify
the number of text files (*.txt). For this I have tried recursive method to
search files, it works fine for the directory which has smaller size, but it
takes 2 or 3 minutes to search the directory of size 8GB (say "C:"). Is there
any other quicker method to identify whether the given file type (txt) is
available in the given directory, to avoid unnecessary sequential search in
recursive method.


It MIGHT be worth eliminating the recursion.

Start with a list of directories (probably just 1) and an (empty) list for
the txt files.
While the directory list is not empty
{
    take first directory off the list and examine its content
    append directories to directory list
    append txt files to file list
}
Feb 14 '06 #3
The same occurred to me; the natural choice here would be a
Queue<DirectoryInfo>, which obviously doesn't exist in 1.1 (as per OP)...
however, timings indicate no appreciable difference in performance between
recursive functions and queueing (some variance both up and down on repeated
tests, but within the same range indicating HDD is the cause). Clearly the
file-system is being the slow dog. Recursion isn't necessarily a sensible
option for horrendous trees, so might be worth refactoring as per Nick's
suggestion.

I ran the tests outside of the debugger, which doubles the performance to
roughly 4.1s to scan my disk (over any implementation). My comparison also
highlighted that SearchOption.AllDirectories is not really a very good
option, as it breaks too easily with any permission denial (unless you are
sa, but of course we don't ever run as admin ;-p).

Code for 2.0 follows:

Queue<DirectoryInfo> queue = new Queue<DirectoryInfo>();
queue.Enqueue(di); // root of search
int files = 0;
while (queue.Count > 0) {
    DirectoryInfo current = queue.Dequeue();
    try { // watch out for permission denied ;-p
        // or put into a List<FileInfo> or something
        files += current.GetFiles(pattern).Length;
        foreach (DirectoryInfo subDir in current.GetDirectories()) {
            queue.Enqueue(subDir);
        }
    } catch { } // lazy
}
return files;

Marc
Feb 14 '06 #4
"Marc Gravell" wrote:
I guess one question would be: how long does it take Windows to perform the
same search? If it takes about the same time you're probably not doing too
much wrong.

My search takes more or less the same time as the Windows search does.

Under 2.0 the GetFiles method can accept
System.IO.SearchOption.AllDirectories which recurses on your behalf, but to
be honest if all you want to do is count them I'm not even sure that this is
the best option, as this will end up returning a relatively big array;

No, I want the file names (full path) of the files that match the search
criteria.
Feb 14 '06 #5
Ahh; you misled me by saying "number of"... but never mind:

Try this; for me under 1.1 this takes 4 seconds to return the 6000+ dll
files on my c: drive (not including UI time to display them) - about 1/4 of
the Windows search time (use different command-line params to select the
root and pattern); what timings do you get with this? How many txt files /
folders are we talking? If the numbers are *very* high, then resizing the
array might be sucking some cycles, in which case eventing or custom
iterators might help...

using System;
using System.IO;
using System.Collections;

namespace ConsoleApplication3
{
    /// <summary>
    /// Summary description for Class1.
    /// </summary>
    class Program
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static int Main(string[] args)
        {
            try
            {
                DateTime start = DateTime.Now;
                FileInfo[] files = GetFiles(args[0], args[1]);
                DateTime stop = DateTime.Now; // stop now as have results
                foreach (FileInfo file in files)
                    Console.WriteLine(file.FullName);
                Console.WriteLine(files.Length);
                Console.WriteLine(stop.Subtract(start).TotalMilliseconds);
                return 0;
            }
            catch (Exception e)
            {
                Console.WriteLine(e);
                return -1;
            }
        }

        static FileInfo[] GetFiles(string path, string pattern)
        {
            ArrayList queue = new ArrayList(), files = new ArrayList();
            queue.Add(new DirectoryInfo(path));
            while (queue.Count > 0)
            {
                DirectoryInfo dir = (DirectoryInfo) queue[0];
                queue.RemoveAt(0);

                try // watch out for permission denied ;-p
                {
                    files.AddRange(dir.GetFiles(pattern));
                    queue.AddRange(dir.GetDirectories());
                }
                catch {} // lazy
            }
            return (FileInfo[]) files.ToArray(typeof(FileInfo));
        }
    }
}
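
The "custom iterators" idea mentioned above could look something like the sketch below. It assumes .NET 2.0 or later (1.1 has neither generics nor yield), and the names LazyFileSearch, Find and Exists are hypothetical, picked just for this example. Matches are handed back one at a time instead of being collected into one big array, so a caller that only needs to know whether any match exists can stop at the first hit.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LazyFileSearch
{
    // Hypothetical helper: walks the tree breadth-first and yields each
    // matching file as soon as its directory has been read.
    public static IEnumerable<FileInfo> Find(string root, string pattern)
    {
        Queue<DirectoryInfo> queue = new Queue<DirectoryInfo>();
        queue.Enqueue(new DirectoryInfo(root));
        while (queue.Count > 0)
        {
            DirectoryInfo dir = queue.Dequeue();
            FileInfo[] files = new FileInfo[0];
            try
            {
                files = dir.GetFiles(pattern);
                foreach (DirectoryInfo sub in dir.GetDirectories())
                    queue.Enqueue(sub);
            }
            catch (UnauthorizedAccessException)
            {
                // Permission denied: skip whatever could not be read.
            }
            // yield return is not allowed inside a try block that has a
            // catch clause, so the results are handed out down here.
            foreach (FileInfo file in files)
                yield return file;
        }
    }

    // Returns true as soon as the first match turns up, without
    // scanning the rest of the tree.
    public static bool Exists(string root, string pattern)
    {
        foreach (FileInfo file in Find(root, pattern))
            return true;
        return false;
    }
}
```

Usage would be along the lines of foreach (FileInfo f in LazyFileSearch.Find(args[0], args[1])) Console.WriteLine(f.FullName); the disk is still the bottleneck, so the gain is in memory and in being able to stop early, not in raw scan time.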
Feb 14 '06 #6
