Long strings and memory errors.

I'd like to think that my code* is pretty simple, but I'm running into
memory errors when loading larger documents.

The document at the URL in the first line of the Page_Load function below
is about 3 MB. Its text then gets parsed with several regular expressions.
The code often runs out of memory. I'm guessing that each time the
document is parsed, a new 3 MB string is created, so the several
regular expressions I use consume memory rapidly.

One option I have explored is reading the document line by line. The
problem is that the regular expressions match across multiple lines, and
I can't guarantee where the lines I need will occur.

I'm a bit frustrated at this point, as the code works fine on smaller
documents. I'd sure appreciate any help.

-- Brent

*==============================================================
// regExPattern1..regExPattern4 are pattern strings defined elsewhere (not shown).
public void Page_Load(Object sender, EventArgs e) {

    string strResponse =
        getText("http://www.sec.gov/Archives/edgar/data/1085158/0001085158-99-000008.txt");
    string report_date = getRegExGroupValue(strResponse, @regExPattern1, "G2");
    string report_header = getRegExGroupValue(strResponse, @regExPattern1, "G2");
    string report_companyname = getRegExGroupValue(strResponse, @regExPattern3, "G2");
    // Second date value; the name report_date2 is assumed, since both were called report_date.
    string report_date2 = getRegExGroupValue(strResponse, @regExPattern4, "G2");

}

public string getText(string strURL)
{
    HttpWebRequest oRequest = (HttpWebRequest)WebRequest.Create(strURL);
    oRequest.Timeout = 10*60000; // 10 minutes, for long files (10000 = 10 seconds)
    oRequest.UserAgent = "Web Client";

    // The using blocks close the response and the reader even if ReadToEnd throws.
    using (HttpWebResponse oResponse = (HttpWebResponse)oRequest.GetResponse())
    using (StreamReader sr = new StreamReader(oResponse.GetResponseStream()))
    {
        return sr.ReadToEnd(); // the whole document (about 3 MB) as one string
    }
}

public string getRegExGroupValue(string strText, string strPat, string strGroup)
{
    // Scan the text once per pattern; Success reports whether anything matched.
    Match strMatch = Regex.Match(strText, strPat,
        RegexOptions.Multiline | RegexOptions.IgnoreCase);

    if (strMatch.Success)
    {
        return strMatch.Groups[strGroup].Value.Trim();
    }
    else
    {
        return "0";
    }
}
==========================================================
Nov 19 '05 #1
I'm guessing that the content doesn't change very often. Perhaps creating a
class to parse the content, and caching an instance of that class in
Application or Session State, would do the trick.
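
A minimal sketch of the caching idea, under several assumptions: a static dictionary stands in for Application state so the example runs outside ASP.NET, the names ParsedReport and ReportCache are hypothetical, and the two patterns are placeholders because the real ones are not shown in this thread. The download step is passed in as a delegate so that the large string exists only while a document is being parsed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical container for the values pulled out of one document.
public sealed class ParsedReport
{
    public string CompanyName = "0";
    public string ReportDate = "0";
}

public static class ReportCache
{
    // Stand-in for Application state: one shared dictionary guarded by a lock.
    private static readonly Dictionary<string, ParsedReport> cache =
        new Dictionary<string, ParsedReport>();
    private static readonly object gate = new object();

    // Placeholder patterns (assumptions); built once and reused for every document.
    private static readonly Regex companyPattern = new Regex(
        @"^COMPANY:\s*(?<G2>.+)$", RegexOptions.Multiline | RegexOptions.IgnoreCase);
    private static readonly Regex datePattern = new Regex(
        @"^DATE:\s*(?<G2>.+)$", RegexOptions.Multiline | RegexOptions.IgnoreCase);

    // Returns the cached result for url; downloads and parses only on a cache miss.
    public static ParsedReport Get(string url, Func<string, string> download)
    {
        lock (gate)
        {
            ParsedReport report;
            if (cache.TryGetValue(url, out report))
                return report;

            string text = download(url); // the large string is only referenced in this scope
            report = new ParsedReport();
            report.CompanyName = GroupOrDefault(companyPattern, text);
            report.ReportDate = GroupOrDefault(datePattern, text);
            cache[url] = report;         // only the small parsed values are kept
            return report;
        }
    }

    // Mirrors getRegExGroupValue: the trimmed "G2" group, or "0" when nothing matches.
    private static string GroupOrDefault(Regex pattern, string text)
    {
        Match m = pattern.Match(text);
        return m.Success ? m.Groups["G2"].Value.Trim() : "0";
    }
}
```

In a page this might be called as ReportCache.Get(url, getText). Holding the lock during the download keeps the sketch short but serializes all requests; a real site would probably want something finer-grained.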

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
The sun never sets on
the Kingdom of Heaven

Nov 19 '05 #2
Thanks, Kevin. I ended up grabbing the first 150 lines of text, where
the header info occurs, then grabbing the whole file again line by line.
It works pretty quickly, but it's probably not elegant...so be it!
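
A rough sketch of that approach, with some assumptions: the class and method names are hypothetical, and the body is read by continuing on the same reader rather than fetching the file a second time, which is a variation on what is described above. The multi-line header patterns run against a small string built from the first lines, and everything after that is handled one line at a time, so the full 3 MB never has to sit in a single string.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class HeaderFirstReader
{
    // Reads at most maxLines lines and returns them joined with '\n',
    // so multi-line patterns can still run against the header block.
    public static string ReadHead(TextReader reader, int maxLines)
    {
        StringBuilder head = new StringBuilder();
        for (int i = 0; i < maxLines; i++)
        {
            string line = reader.ReadLine();
            if (line == null) break; // document is shorter than maxLines
            head.Append(line).Append('\n');
        }
        return head.ToString();
    }

    // Walks the remaining lines one at a time and counts those matching
    // linePattern; only one line is held in memory at any moment.
    public static int CountMatchingLines(TextReader reader, Regex linePattern)
    {
        int count = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (linePattern.IsMatch(line)) count++;
        }
        return count;
    }
}
```

With a StreamReader over the response stream, ReadHead(reader, 150) would give the header text for the existing regular expressions, and the same reader can then be passed on for the per-line work.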

--Brent


Nov 19 '05 #4
re: "It works pretty quickly, but it's probably not elegant...so be it!"

Hey, it's a lot more "elegant" than what didn't work!

;-)

Juan T. Llibre
ASP.NET MVP
http://asp.net.do/foros/
ASP.NET forums in Spanish
Come, and let's talk about ASP.NET...

Nov 19 '05 #5
