Regular Expression Help...Line Break

I'm trying to parse a rather large string from a .txt file that breaks
into 2 lines (the string is fixed-length, 6694 characters) into a
dataset using VB.NET. The issue I am having is with the line breaks: I
am able to parse the string into the dataset up until the line break.
My pattern is as follows:
"^(?<n0>.{1})(?<n1>.{8})(?<n2>.{2})..... (?<n404>.{2})$"
My code is as follows:

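' Requires: Imports System.Text.RegularExpressions
Dim re As New Regex(pattern, RegexOptions.Multiline) ' options go in the constructor
Dim ma As Match = re.Match(sr.ReadLine())
Dim i As Integer = 0
Do While ma.Success AndAlso i <= 404
    ' Read capture groups n0..n404; Value is already a String
    Dim fieldValue As String = ma.Groups("n" & i).Value
    i += 1
Loop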

My issue arises when I hit a line break in the txt file. The string
breaks because it is so long that it reaches the maximum line width of
the txt file and has to continue on the next line. A new string then
begins on the line after that.
Example:
I33445 89 u000x00000 900000 x00000 ....... (on and on until the
max length of a txt file).......0000 00000000 (End of Txt file)
TOWN88989 000000000000 909 (end of String)
NEW Line Starts Here.

I have tried different options with RegexOptions but to no avail.

Thank you for your time.

Apr 18 '06 #1
js*****@gmail.com wrote:
> I'm trying to parse a rather large string from a .txt file that breaks
> into 2 lines ( the string is a fixed-length string reaching 6694
> characters) into a dataset using VB.net. The issues that I am having
> is with the line breaks.


How about removing all the line breaks first?
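
For what it's worth, a minimal sketch of that idea, assuming the whole file fits in memory and that the only line-break characters are CR and LF. The path "records.txt" and the names StripLineBreaks and JoinWrappedRecord are made up for illustration:

```vb
Imports System
Imports System.IO

Module JoinWrappedRecord
    ' Removes carriage returns and line feeds so that a record wrapped
    ' across two physical lines becomes one continuous string.
    Function StripLineBreaks(text As String) As String
        Return text.Replace(vbCr, "").Replace(vbLf, "")
    End Function

    Sub Main()
        ' "records.txt" is a placeholder path, not the real file name.
        Dim raw As String = File.ReadAllText("records.txt")
        Dim joined As String = StripLineBreaks(raw)
        Console.WriteLine(joined.Length)
    End Sub
End Module
```

If the file holds several records, the joined text could then be cut into pieces of 6694 characters each.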

Andrew
Apr 18 '06 #2
I have tried that - it's not an option. The string is so large that it
extends the whole length of the txt file and then some. I didn't think
there was an end or a line break in a txt file. I thought you would be
able to extend as far right as you would like. Not so.

Thanks for your idea.

Jim

Apr 18 '06 #3
js*****@gmail.com wrote:
> I have tried that - It's not an option. The string is so large that it
> extends the whole length of the txt file and then some.

It cannot. Past the end of the file there is nothing. Or do you mean there
is an EOF character in there?

> I didn't think
> there was an end or a line break in a txt file.

Any byte can be in a text file; "text file" means only that it is meant to be
human-readable with minimal processing (e.g. tab characters).

Look at the supposedly text file in a hex editor. Do not trust Notepad.

> I thought you would be
> able to extend as far right as you would like. Not so.

That must be a limitation of the program displaying/writing the text file.

> ( the string is a fixed-length string reaching 6694 characters)

That's a tiny string compared to the limit of something like 2^31-1 chars.
When you say "reaching 6694 characters", do you mean it is /always/ 6694
chars, and, if so, does that mean that each field is of a fixed width too?
In that case, you could just extract each substring with
System.String.Substring(Int32, Int32).
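
A rough sketch of that approach, assuming the record has already been joined into a single string. SplitFixedWidth and the sample record are hypothetical; only the first three widths (1, 8, 2) come from the pattern in the original post:

```vb
Imports System
Imports System.Collections.Generic

Module FixedWidthSketch
    ' Splits one fixed-length record into fields using a list of field widths.
    Function SplitFixedWidth(record As String, widths As Integer()) As List(Of String)
        Dim fields As New List(Of String)
        Dim offset As Integer = 0
        For Each width As Integer In widths
            ' Substring(startIndex, length) throws ArgumentOutOfRangeException
            ' if the record is shorter than the widths require, which is a
            ' useful signal that a line break has split the record.
            fields.Add(record.Substring(offset, width))
            offset += width
        Next
        Return fields
    End Function

    Sub Main()
        ' Illustrative widths only: these match n0, n1 and n2 in the pattern
        ' (1, 8, 2); the real list would have one entry per field.
        Dim widths As Integer() = {1, 8, 2}
        ' Made-up 11-character sample record.
        Dim fields As List(Of String) = SplitFixedWidth("I3344589 u0", widths)
        For Each field As String In fields
            Console.WriteLine("[" & field & "]")
        Next
    End Sub
End Module
```

Keeping the widths in one array means the 405 field lengths live in a single place instead of inside a very long regular expression.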

Andrew
Apr 18 '06 #4
The answer to both of your questions is yes. It is /always/ 6694 chars,
and each field is a fixed length, although each field's length is
different. I have been reading the file through Note Tab, which is
probably why the break is at the same spot.
I'm going to look into System.String.Substring. That may work. If
a string can stretch to 2^31-1 chars in a txt file, then how come I'm
having issues with my short string breaking? I have no control over how
the string is extracted and dumped into the txt file. The file I
have received is a .txt file.
Thanks again for your help.

-Jim

Apr 18 '06 #5
Which object would you suggest using: StreamReader, StringReader, or
TextReader? I want to use Substring and write the string values out to
a DataSet. I was using a StreamReader.

Thanks

Apr 18 '06 #6
js*****@gmail.com wrote:
> Which object would you suggest using: StreamReader, StringReader, or
> TextReader?


See the help for StreamReader and for StringReader to determine the
appropriate choice.

Andrew
Apr 19 '06 #7
js*****@gmail.com wrote:
> I'm going to look into System.String.Substring. That may work. If
> a string can stretch to 2^31-1 chars in a txt file, then how come I'm
> having issues with my short string breaking? I have no control over how
> the string is extracted and dumped into the txt file. The file I
> have received is a .txt file.


Have you looked at the file in a hex editor to see if there are any
unexpected characters in it? The fact it has an extension of .txt is
irrelevant.

In the case of what I think your file looks like, unexpected characters
would be anything with a decimal value in the range 0-31. Possibly you would
find 13,10 (CRLF) at the end.
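
If no hex editor is to hand, the same check can be done by scanning the raw bytes. A small sketch, with "records.txt" standing in for the real path and ControlCharScan being a made-up module name:

```vb
Imports System
Imports System.IO

Module ControlCharScan
    Sub Main()
        ' "records.txt" is a placeholder path, not the real file name.
        Dim bytes As Byte() = File.ReadAllBytes("records.txt")
        For position As Integer = 0 To bytes.Length - 1
            ' Values 0-31 are ASCII control characters; 13 is CR and 10 is LF.
            If bytes(position) < 32 Then
                Console.WriteLine("offset {0}: decimal {1}, hex {2:X2}", _
                                  position, bytes(position), bytes(position))
            End If
        Next
    End Sub
End Module
```

The offsets it prints show exactly where in the 6694-character record the break falls.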

Note that files and strings are not connected. The maximum possible size of
a file (disregarding hardware) is determined by the OS.

Andrew
Apr 19 '06 #8
The hexadecimal value for the line break in the txt file is 30 and the
decimal value is 0. Are the 13, 10 values you mentioned decimal or
hexadecimal?

Thanks

Apr 19 '06 #9
js*****@gmail.com wrote:
> The hexadecimal value for the line break in the txt file is 30 and the
> decimal value is 0.

Three-zero in hex is the /character/ 0, i.e. zero, in ASCII.
30(hex), sometimes written 0x30, or &H30 in VB, has a decimal value of 48.

I really don't think 0 is being used as a line break.

> Are the 13, 10 values you mentioned decimal or
> hexadecimal values?

Decimal.
http://www.cppreference.com/ascii.html
http://www.jimprice.com/jim-asc.htm
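
These conversions can also be checked directly in VB. A tiny sketch using the built-in Chr and Hex functions (the module name is made up):

```vb
Imports System

Module HexDecimalSketch
    Sub Main()
        Console.WriteLine(&H30)       ' prints 48: hex 30 as a decimal number
        Console.WriteLine(Chr(&H30))  ' prints 0: the character with code 48
        Console.WriteLine(Hex(13))    ' prints D: carriage return is hex 0D
        Console.WriteLine(Hex(10))    ' prints A: line feed is hex 0A
    End Sub
End Module
```

So in a hex editor, a CRLF line break would show up as the byte pair 0D 0A, not as 30.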

Andrew
Apr 19 '06 #10
