Hello,
I have a file with CR/LF separated text.
string.Trim() and string.Split() came in very handy to process the content, but
because strings are immutable, memory is managed badly.
StringBuilder is a good alternative for string processing, but it lacks at
least the two methods above.
Not to mention that I would also need EndsWith(), IndexOf(), LastIndexOf(),
and so on.
Does anyone know a workaround, other than writing the code myself to do the
job?
In my opinion, the same methods should be there for StringBuilder too.
Thanks.
"Dan Aldean" <da*******@yahoo.com> wrote: I have a file with CR/LF separated text. string.Trim() and string.Split() came very handy to process the content, but
What about StreamReader?
-- Barry
How big is your text file, and how often do you manipulate its elements?
Take into account that the performance gap is significant if you're iterating
over 25,000 or more strings in a loop.
Usually, using "string" is an appropriate solution.
--
WBR,
Michael Nemtsev :: blog: http://spaces.msn.com/laflour
"At times one remains faithful to a cause only because its opponents do not
cease to be insipid." (c) Friedrich Nietzsche
Hi,
What do you want to do?
The operations you mention, Split and Trim, create new strings and may have
some impact on performance.
EndsWith, IndexOf, etc. only inspect the string without creating a new one,
so they have no such impact on performance.
--
Ignacio Machin,
ignacio.machin AT dot.state.fl.us
Florida Department Of Transportation
Thanks for the reply. Basically the file is big and a stream is not a
solution, as I manipulate the content a lot.
Thanks for the reply Barry.
I use StreamReader to read but I need to process the content. For example I
need to find trailing spaces before a '>' character and remove them.
Also within the string I should look for a character splitter ':' and remove
the spaces before and after it.
Split() and then Trim() would have helped, but I cannot afford to use
strings as the file can be big and I have to process every line I read.
Thanks Ignacio.
I have a class that handles this file, which is very big.
I have a method that reads and processes the content of each line.
I use StreamReader to read the lines: myFile.ReadLine()
For example I need to find trailing spaces before a '>' character and remove
them.
Also within the string I should look for a character splitter ':' and remove
the spaces before and after it. I need to get the content between two
separators and save it.
Split() and then Trim() would have helped a lot, but with a file this big
strings are not recommended.
"Dan Aldean" <da*******@yahoo.com> wrote: For example I need to find trailing spaces before a '>' character and remove them. Also within the string I should look for a character splitter ':' and remove the spaces before and after it.
If you have seriously long lines, I recommend that you use the
techniques of lexical analysis. Basically:
* Read your strings as System.String from StreamReader.ReadLine().
* Tokenize the strings using manual integer indexing and classify them
according to how you want to modify them.
* Write a loop which sucks in from your tokenizer and builds up a
resulting StringBuilder according to your modification rules.
This change would at least make the algorithm linear with respect to
input line length.
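As a rough sketch of the steps above, applied to the task described earlier in the thread (dropping spaces before '>' and on both sides of ':'): the class and method names here are made up for illustration, and treating tabs as whitespace is an assumption. Each character is visited once and the result is built in a StringBuilder, so the work stays linear in the line length.

```csharp
using System.Text;

static class LineNormalizer
{
    // Hypothetical helper (not from the thread): removes runs of spaces/tabs
    // that sit directly before '>' or ':' or directly after ':'.
    public static string Normalize(string line)
    {
        var result = new StringBuilder(line.Length);
        int i = 0;
        while (i < line.Length)
        {
            char c = line[i];
            if (c == ' ' || c == '\t')
            {
                // Tokenize: find the end of this whitespace run by index.
                int start = i;
                while (i < line.Length && (line[i] == ' ' || line[i] == '\t'))
                {
                    i++;
                }

                // Classify the run by its neighbours.
                bool beforeDelimiter = i < line.Length && (line[i] == '>' || line[i] == ':');
                bool afterColon = result.Length > 0 && result[result.Length - 1] == ':';

                // Keep the run only when it is ordinary whitespace.
                if (!beforeDelimiter && !afterColon)
                {
                    result.Append(line, start, i - start);
                }
            }
            else
            {
                result.Append(c);
                i++;
            }
        }
        return result.ToString();
    }
}
```

For example, LineNormalizer.Normalize("key  :  value here") gives "key:value here", and LineNormalizer.Normalize("<tag   >") gives "<tag>".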
If the lines are very long (i.e. something that's going to really fall
out of the CPU cache), you might consider working with some kind of
pooled char arrays, using array operations to copy ranges, and thus
reduce memory management overhead. That will really help if your strings
are bigger than about 85,000 bytes (i.e. roughly 42,500 chars), since in
that case they'll be allocated on the large object heap and won't get
collected until generation 2 GCs.
To get the benefit from char arrays would mean using
TextReader.ReadBlock() instead of ReadLine(), and breaking into lines in
the tokenizer yourself.
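A minimal sketch of that ReadBlock() approach, under some assumptions: a single reused char[] stands in for "pooled" arrays, the LineHandler delegate and the class name are hypothetical, and the buffer simply doubles when a line does not fit. The handler receives a range of the shared buffer and must not hold on to it after returning.

```csharp
using System;
using System.IO;

static class BlockLineReader
{
    // Hypothetical callback: a line is buffer[start .. start + length).
    public delegate void LineHandler(char[] buffer, int start, int length);

    public static void ReadLines(TextReader reader, LineHandler onLine, int bufferSize = 4096)
    {
        char[] buffer = new char[bufferSize];
        int filled = 0;    // number of valid chars in the buffer
        int scanFrom = 0;  // chars before this index are known to contain no '\n'

        while (true)
        {
            int read = reader.ReadBlock(buffer, filled, buffer.Length - filled);
            filled += read;

            int lineStart = 0;
            for (int i = scanFrom; i < filled; i++)
            {
                if (buffer[i] != '\n') continue;
                int length = i - lineStart;
                if (length > 0 && buffer[i - 1] == '\r') length--; // drop the CR of CR/LF
                onLine(buffer, lineStart, length);
                lineStart = i + 1;
            }

            int rest = filled - lineStart; // incomplete line at the end of the buffer
            if (read == 0)
            {
                // End of input: hand over a final line that has no terminator.
                if (rest > 0) onLine(buffer, lineStart, rest);
                return;
            }

            if (rest == buffer.Length)
            {
                // One line fills the whole buffer: grow it and keep reading.
                Array.Resize(ref buffer, buffer.Length * 2);
            }
            else
            {
                // Move the incomplete line to the front (Array.Copy handles overlap).
                Array.Copy(buffer, lineStart, buffer, 0, rest);
            }
            filled = rest;
            scanFrom = rest;
        }
    }
}
```

Whether this beats plain ReadLine() depends on the data, so it is worth measuring both before committing to the extra complexity.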
-- Barry
Thanks Barry, I think this will help me a great deal.
Dan Aldean <da*******@yahoo.com> wrote: Split() and then Trim() would have helped a lot, but with a file this big strings are not recommended.
Are you sure you're not misinterpreting advice for a different
situation? It's not advisable to read a file by doing:
    string result = "";
    using (StreamReader reader = ...)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            result += line;
        }
    }
but that's because the strings involved become large, so copying them
for each iteration becomes a problem.
It's not nearly so bad to keep a StringBuilder to collect any content
(if indeed you need to) and use normal string operations on any one
particular line.
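A minimal sketch of that simpler approach, assuming ':' is the separator and that the cleaned lines are collected into one result (the class and method names are invented for the example). Split() and Trim() only ever allocate short-lived strings proportional to a single line, while the StringBuilder does the accumulating.

```csharp
using System.IO;
using System.Text;

static class SimpleProcessor
{
    // Hypothetical helper: trims the fields around ':' on every line and
    // collects the cleaned lines, separated by '\n', in one StringBuilder.
    public static string ProcessAll(TextReader reader)
    {
        var output = new StringBuilder();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Ordinary string operations on one line at a time.
            string[] fields = line.Split(':');
            for (int i = 0; i < fields.Length; i++)
            {
                fields[i] = fields[i].Trim();
            }

            // Only the accumulated output lives in the StringBuilder.
            output.Append(string.Join(":", fields)).Append('\n');
        }
        return output.ToString();
    }
}
```

Calling SimpleProcessor.ProcessAll(new StringReader("a : b \r\n c:d")) returns "a:b\nc:d\n". Profiling this version first shows whether anything more elaborate is needed.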
Have you tried the simplest solution (using strings) and found it too
slow? Have you profiled it?
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Thanks Jon.
I used
private StringBuilder line = ......
line.Append(sourceFile.ReadLine());
Then I iterated through the "line" (line[i]) to identify the tokens, trim
whitespace, and build the identifiers.
I only used StringBuilder, no strings. Even though strings are more
flexible (IndexOf, Split, Trim), using them excessively comes at a
price, and I do not know how big the input file to process will be.
So the answer is no, I did not use strings, I don't know how slow it would
be, the immutability was what scared me.
Dan Aldean <da*******@yahoo.com> wrote: So the answer is no, I did not use strings, I don't know how slow it would be, the immutability was what scared me.
Well, are you able to process a line at a time? If so, read the line,
process it, and *then* append it.
That's *definitely* worth trying before you start anything more
complicated (and thus error-prone).
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
I can process one line at a time. I also need to determine if the next line
continues the current one.
Probably I need Peek for that.
Dan Aldean <da*******@yahoo.com> wrote: I can process one line at a time. I also need to determine if the next line continues the current one. Probably I need Peek for that
How often are there continuations? I would suggest keeping a "current
line", and when you read a line, if it's a continuation of the current
line, add it and keep going. If it's not a continuation, process the
"current line", then set the current line to the one you've just read.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
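The "current line" idea from Jon's reply above could look roughly like this. How a continuation is recognised is not stated in the thread, so the isContinuation predicate is an assumption supplied by the caller, and the class and method names are hypothetical. No Peek() is needed, because a line is only emitted once the line after it has been read.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ContinuationReader
{
    // Joins physical lines into logical lines. 'isContinuation' is a
    // caller-supplied rule (an assumption; the real rule depends on the file format).
    public static IEnumerable<string> ReadLogicalLines(TextReader reader, Func<string, bool> isContinuation)
    {
        StringBuilder current = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (current != null && isContinuation(line))
            {
                // Continuation: add it to the current line and keep going.
                current.Append(line);
                continue;
            }

            // Not a continuation: the current line is complete, so emit it.
            if (current != null)
            {
                yield return current.ToString();
            }
            current = new StringBuilder(line);
        }

        // Emit whatever is left at the end of the input.
        if (current != null)
        {
            yield return current.ToString();
        }
    }
}
```

With the illustrative rule l => l.StartsWith(" "), the input "a\n b\nc\n" yields the logical lines "a b" and "c".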
There are quite often continuations, but I don't know what the next line
I read is until I find tokens, which might be anywhere in the string. I might
use a second StringBuilder object for the next line until I determine what
type it is.