MatchCollection Memory Issue

In the program below, when the control comes back to the Main program,
a considerable amount of memory is still associated with the program.
Is this normal? Is it due to repeated generation of MatchCollection
objects?

This program is a sample of a larger problem I am facing with a
document retrieval application. The algorithm is as follows:
1. Recursively traverse a directory
2. Obtain the text of every file in a string
3. Extract "terms" in the string using the expression in Regex
4. Loop through MatchCollection to obtain the number of occurrences of each
"term"

Through a memory profiling tool, I noticed considerable profusion of
strings, which I think is due to MatchCollection. How do I ensure that
memory occupied by MatchCollection is reclaimed often?

Thanks, Kini

*** Code ***

using System;
using System.Collections;
using System.Text.RegularExpressions;

namespace DemoRegex
{
    class MainClass
    {
        [STAThread]
        static void Main(string[] args)
        {
            for (int i = 0; i < 10000; i++)
            {
                FileConvert();
            }
            Console.WriteLine("Press Any Key To End");
            Console.ReadLine();
        }

        static void FileConvert()
        {
            // Concatenated across lines: a regular C# string literal
            // cannot contain a raw line break.
            string testString = "Fractions can be expressed as a "
                + "numerator over a denominator; however, storing them as a floating-point "
                + "value might be necessary. Storing fractions as floating-point values "
                + "introduces rounding errors that make it difficult to perform "
                + "comparisons. Expressing the value as a fraction (e.g., 1/6) allows the "
                + "maximum precision. Expressing the value as a floating-point value "
                + "(e.g., 0.16667) can limit the precision of the value. In this case, the "
                + "precision depends on the number of digits that the developer decides to "
                + "use to the right of the decimal point.";

            // A new Regex and a new MatchCollection are created on every call.
            Regex regEx = new Regex(
                @"[a-zA-Z0-9]*[.\-_]*[.a-zA-Z\-_][.a-zA-Z0-9\-_]*[a-zA-Z0-9]");
            MatchCollection matchesInDoc = regEx.Matches(testString);
        }
    }
}
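
For reference, the four steps listed in the question can be sketched roughly as below. This is only an illustration, not Kini's actual application: `TermCounter`, `CountTerms` and `CountTermsInDirectory` are hypothetical names, and the sketch assumes .NET 2.0 or later for `Dictionary<,>`, `File.ReadAllText` and `SearchOption`. The regular expression is the one from the post, built once and reused. Each `Match.Value` read yields a string, which is one plausible source of the many strings a profiler reports.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

static class TermCounter
{
    // Hypothetical helper: one Regex instance shared by every file,
    // built once instead of once per document.
    static readonly Regex TermPattern = new Regex(
        @"[a-zA-Z0-9]*[.\-_]*[.a-zA-Z\-_][.a-zA-Z0-9\-_]*[a-zA-Z0-9]");

    // Steps 3 and 4: extract the terms of one document and tally them.
    public static void CountTerms(string text, Dictionary<string, int> counts)
    {
        foreach (Match m in TermPattern.Matches(text))
        {
            int n;
            counts.TryGetValue(m.Value, out n);   // n stays 0 for a new term
            counts[m.Value] = n + 1;
        }
    }

    // Steps 1 and 2: walk the directory tree and read each file as a string.
    public static Dictionary<string, int> CountTermsInDirectory(string root)
    {
        var counts = new Dictionary<string, int>();
        foreach (string path in Directory.GetFiles(root, "*", SearchOption.AllDirectories))
        {
            CountTerms(File.ReadAllText(path), counts);
        }
        return counts;
    }
}
```

Calling `TermCounter.CountTermsInDirectory(someFolder)` returns the term counts for every file under that folder; the MatchCollection for each file becomes garbage as soon as its loop ends.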

Nov 17 '05 #1
Hi Kini,
usually the garbage collector runs when it decides it is time to reclaim
unused memory. Are you seeing some kind of degradation in the program as a
result of large amounts of memory being consumed? How much is being used
compared to the amount of RAM you have?

You can explicitly force the garbage collector to run at any point in your
code by calling GC.Collect();

Mark.
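
As a diagnostic only, a forced collection can show how much of the managed heap is actually still reachable. The sketch below is an assumption about how one might check this, not something posted in the thread; `HeapProbe` and `Measure` are hypothetical names, and the non-generic `Action` delegate assumes .NET 3.5 or later.

```csharp
using System;

static class HeapProbe
{
    // Hypothetical helper: runs the workload, then reports the managed heap
    // size before and after a forced full collection.
    public static long[] Measure(Action workload)
    {
        workload();

        // Bytes currently thought to be allocated; no collection is forced here.
        long before = GC.GetTotalMemory(false);

        // Diagnostic use only: collect, let finalizers run, collect again.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        long after = GC.GetTotalMemory(false);
        return new long[] { before, after };
    }
}
```

If the second number is much smaller than the first, the memory was garbage that simply had not been collected yet, rather than something the program is still holding on to.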

Nov 17 '05 #2
Hello Mark,

I have 500MB of RAM on my PC.

For a particular example, using 59 files of net size 43MB, my
application's memory consumption peaks to 111MB. At the place where I
provide a breakpoint, I know that the only objects in scope are not
more than 10MB in size. I do not understand why the system does not let
go of the remaining memory :-? :(

Thanks, Kini

Nov 17 '05 #3
Hi Kini,
garbage collection is non-deterministic from the application's point of
view: you do not know when it will run. The GC has an algorithm behind it
that decides when it is time to clean up the managed heap.

If you have 500MB of RAM, are using 111MB, and don't have other apps
competing for RAM, then the GC may decide that there is no need to run. You
don't want the garbage collector running too often, because that may hurt
performance more than using extra RAM does. While the garbage collector is
running, no other managed threads are allowed to run: it removes items from
the managed heap and moves objects around in memory, and when it finishes it
has to update all the pointers to those objects to their new memory locations
before you are allowed to continue using them.

In general the garbage collector is probably doing a good job, and I would
not worry about the amount of memory you are using. It might be interesting
to open a lot of applications on your computer so that a lot of RAM is in
use, and see whether your application still grows to 111MB or whether the GC
tidies up before that because of the limited resources.

Mark.

Nov 17 '05 #4
I'm not sure what you mean by a considerable amount of memory. When I run
your code, its working set starts at ~5MB and grows to ~6MB. The GC ran ~30
times to collect Gen0 and once to collect Gen1. That doesn't look like a
considerable amount.
Note that you have to be more explicit when talking about memory. What kind
of memory do you mean? There is managed heap memory and there is native heap
memory. The different memory types have their own performance counters, and
you can watch them using perfmon.

Willy.
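
The per-generation collection counts mentioned above can also be read from inside the process, without perfmon. A minimal sketch, assuming .NET 2.0 or later for `GC.CollectionCount` and .NET 3.5 or later for `Action`; `GcCounters` and `CollectionsDuring` are hypothetical names:

```csharp
using System;

static class GcCounters
{
    // Hypothetical helper: returns how many Gen0, Gen1 and Gen2 collections
    // happened while the workload ran.
    public static int[] CollectionsDuring(Action workload)
    {
        int gen0 = GC.CollectionCount(0);
        int gen1 = GC.CollectionCount(1);
        int gen2 = GC.CollectionCount(2);

        workload();

        return new int[]
        {
            GC.CollectionCount(0) - gen0,
            GC.CollectionCount(1) - gen1,
            GC.CollectionCount(2) - gen2
        };
    }
}
```

Wrapping the 10,000-iteration loop from the original post in `CollectionsDuring` gives the same kind of figures quoted above; the exact numbers will vary by machine and runtime version.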

Nov 17 '05 #5
Hello Mark and Willy,

Thank you for taking time to look into this issue and offering
suggestions. After introducing Dispose methods for some of my classes
(which include explicit calls to the garbage collector), the memory
consumption (I guess managed heap) appears to be modest. This is with
respect to my application's code.

Thanks, Kini

Nov 17 '05 #6
Well, this is exactly what you should NOT do. There are very few reasons to
call GC.Collect() from user code. If your classes don't own unmanaged
resources, there is no reason to implement the Dispose pattern either.

Willy.
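
To illustrate the distinction, here is a small sketch (not code from the thread; `FileText` and `ReadAll` are hypothetical names). A `StreamReader` wraps an operating-system file handle, so it is worth releasing deterministically with `using`. The string it returns is ordinary managed memory and needs neither `Dispose` nor `GC.Collect()`.

```csharp
using System;
using System.IO;

static class FileText
{
    // Hypothetical helper: reads a whole file into a string.
    public static string ReadAll(string path)
    {
        // The reader owns a file handle, so it is closed as soon as the
        // using block exits, even if ReadToEnd throws.
        using (StreamReader reader = new StreamReader(path))
        {
            // The result is a plain managed string; the GC reclaims it
            // on its own schedule once nothing references it.
            return reader.ReadToEnd();
        }
    }
}
```

A class that only holds strings, collections or Regex objects falls into the second category and can be left entirely to the garbage collector.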
Nov 17 '05 #7
