In the program below, when control returns to Main, a considerable
amount of memory is still held by the process. Is this normal? Is it
due to the repeated creation of MatchCollection objects?
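One way to check whether the memory is really still in use, rather than just waiting for the garbage collector, might look like the sketch below. `MemoryCheck` and `RetainedBytes` are hypothetical names made up for illustration; `GC.GetTotalMemory(true)` is the standard call that asks the runtime to collect before reporting the managed heap size.

```csharp
using System;

static class MemoryCheck
{
    // Hypothetical helper: runs `action` the given number of times and
    // returns the change in managed heap size in bytes. Passing true to
    // GC.GetTotalMemory asks the runtime to collect garbage before measuring.
    public static long RetainedBytes(Action action, int iterations)
    {
        long before = GC.GetTotalMemory(true);
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        long after = GC.GetTotalMemory(true);
        return after - before;
    }
}
```

Called from inside MainClass as `MemoryCheck.RetainedBytes(FileConvert, 10000)`, a small result would suggest the strings were garbage awaiting collection rather than memory that stays reachable.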
This program is a sample of a larger problem I am facing with a
document retrieval application. The algorithm is as follows:
1. Recursively traverse a directory
2. Obtain the text of every file in a string
3. Extract "terms" in the string using the expression in Regex
4. Loop through MatchCollection to obtain number of occurrences of each
"term"
Using a memory profiling tool, I noticed a very large number of string
objects, which I think come from MatchCollection. How do I ensure that
the memory occupied by MatchCollection is reclaimed promptly?
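For reference, the four steps listed earlier could be sketched roughly as follows. This is only an illustration under assumptions: `TermCounter`, `CountTerms` and `CountTermsInDirectory` are hypothetical names, and the sketch uses generic collections and `Directory.EnumerateFiles`, which need a newer framework than the program below. It builds the Regex once and keeps only the per-term counts, so each MatchCollection becomes unreachable as soon as its file has been processed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

static class TermCounter
{
    // Same pattern as in the program in this post, constructed once and reused.
    static readonly Regex TermRegex = new Regex(
        @"[a-zA-Z0-9]*[.\-_]*[.a-zA-Z\-_][.a-zA-Z0-9\-_]*[a-zA-Z0-9]");

    // Steps 3 and 4: extract the terms of one document and add them to the tally.
    public static void CountTerms(string text, Dictionary<string, int> counts)
    {
        foreach (Match m in TermRegex.Matches(text))
        {
            int n;
            counts.TryGetValue(m.Value, out n); // n stays 0 for a new term
            counts[m.Value] = n + 1;
        }
    }

    // Steps 1 and 2: walk the directory tree and read each file into a string.
    public static Dictionary<string, int> CountTermsInDirectory(string root)
    {
        var counts = new Dictionary<string, int>();
        foreach (string path in Directory.EnumerateFiles(
                     root, "*", SearchOption.AllDirectories))
        {
            CountTerms(File.ReadAllText(path), counts);
        }
        return counts;
    }
}
```

With this shape, the only strings that stay reachable after a file is processed are the distinct terms used as dictionary keys.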
Thanks, Kini
*** Code ***
using System;
using System.Collections;
using System.Text.RegularExpressions;

namespace DemoRegex
{
    class MainClass
    {
        [STAThread]
        static void Main(string[] args)
        {
            // Repeat the extraction many times to make the memory growth visible.
            for (int i = 0; i < 10000; i++)
            {
                FileConvert();
            }
            Console.WriteLine("Press Any Key To End");
            Console.ReadLine();
        }

        static void FileConvert()
        {
            // Stand-in for the text of one file.
            string testString = "Fractions can be expressed as a "
                + "numerator over a denominator; however, storing them as a floating-point "
                + "value might be necessary. Storing fractions as floating-point values "
                + "introduces rounding errors that make it difficult to perform "
                + "comparisons. Expressing the value as a fraction (e.g., 1/6) allows the "
                + "maximum precision. Expressing the value as a floating-point value "
                + "(e.g., 0.16667) can limit the precision of the value. In this case, the "
                + "precision depends on the number of digits that the developer decides to "
                + "use to the right of the decimal point.";

            // Expression used to extract "terms" from the text.
            Regex regEx = new Regex(
                @"[a-zA-Z0-9]*[.\-_]*[.a-zA-Z\-_][.a-zA-Z0-9\-_]*[a-zA-Z0-9]");
            MatchCollection matchesInDoc = regEx.Matches(testString);
        }
    }
}