Problem with ifilters, regex and large text files

Hello,

I am currently designing a console app that will search our network for any files that contain 16-digit numbers. I have to use IFilters to properly index each file, and my implementation is based, gratefully, on the great article I found on CodeProject.

The problem is that the app seems to hang after a while, and some debugging shows that it always happens on the same file. That file is a ~40MB text file and is the second or third one processed. The code snippet that does the IFilter processing is below.

using (TextReader reader = new FilterReader(f))
{
    // Read the filtered text in chunks of 262,144 characters
    char[] buffer = new char[512 * 512];
    int charsRead = 0;

    while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
    {
        string contentToProcess = new string(buffer, 0, charsRead);

        // Check the chunk against the provided regular expression
        if (regexp.IsMatch(contentToProcess))
        {
            // Update the global counter
            matchCounter++;

            // Write the filename out to a log
            writeFileName(f);

            // Skip the remaining data chunks as we have found a match
            break;
        }
    }
}
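
One thing the chunked loop can miss is a 16-digit number that straddles two chunks, because each chunk is matched on its own. The following is only a minimal sketch of one way to handle that, not a diagnosis of the hang: it carries the tail of each chunk over into the next read. The names ChunkedSearch and ContainsMatch, the pattern \d{16}, and the overlap of 15 characters are all assumptions for illustration, since the original regexp is not shown; a pattern with boundary checks or lookarounds would need a larger overlap. The Regex constructor that takes a match timeout exists only in .NET 4.5 and later; with it, IsMatch throws RegexMatchTimeoutException instead of running indefinitely if the pattern backtracks badly on a large chunk.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class ChunkedSearch
{
    // Hypothetical helper: returns true as soon as any chunk, together with the
    // characters carried over from the previous chunk, matches the regex.
    public static bool ContainsMatch(TextReader reader, Regex regex, int chunkSize, int overlap)
    {
        char[] buffer = new char[overlap + chunkSize];
        int carried = 0; // characters kept from the end of the previous chunk
        int charsRead;

        while ((charsRead = reader.Read(buffer, carried, chunkSize)) > 0)
        {
            int length = carried + charsRead;
            if (regex.IsMatch(new string(buffer, 0, length)))
            {
                return true;
            }

            // Move the last `overlap` characters to the front of the buffer so a
            // match spanning the chunk boundary is still seen on the next pass.
            carried = Math.Min(overlap, length);
            Array.Copy(buffer, length - carried, buffer, 0, carried);
        }
        return false;
    }
}

static class Program
{
    static void Main()
    {
        // Assumed pattern; the 5-second timeout requires .NET 4.5 or later.
        var regex = new Regex(@"\d{16}", RegexOptions.None, TimeSpan.FromSeconds(5));

        // With a chunk size of 16, the number starts in the first chunk and
        // ends in the second one.
        string text = new string('x', 10) + "1234567890123456";
        using (TextReader reader = new StringReader(text))
        {
            bool found = ChunkedSearch.ContainsMatch(reader, regex, 16, 15);
            Console.WriteLine(found); // prints True
        }
    }
}
```

In the original loop, the FilterReader would take the place of the StringReader, and the chunk size could stay at 512 * 512. If the hang turns out to be inside reader.Read rather than in the regex, this sketch would not help, so timing the two calls separately is probably the first step.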

I do not doubt that I am doing something fairly silly, as I'm kinda self-taught in C#, but any constructive help would be greatly appreciated. Any abuse for being rubbish I can get for free from my wife when I get home! :)
Mar 3 '09 #1