Regex, TextReader...?

I have attached a block of text similar to the type that I am working
with.

I have been learning a lot about Regex - it is quite impressive. I can
easily capture bits of info, but I keep having trouble with line breaks.

I want to identify the start and end of blocks of text. Are there some
tips someone can share?

E.g., in my text, I can grab a collection of everyone's phone numbers with:
^"M:"\t"(?<PhoneNumber>[^"]*)"

But what if I want to grab many lines, up to the point where a certain
pattern matches? I use [^"] to say "not a quote", but can I say "not 14
hyphens"?

The way I currently split this type of data is inefficient. I match all
occurrences of:
^-{14}
Then I do a lot of index arithmetic on the match positions to split the
file. I am sure Regex must have some way to express a complex "not" that
marks the end of my match?
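
One way to say "not 14 hyphens" in .NET regex is a negative lookahead, (?!...), checked before each character that is consumed. The sketch below is only an illustration under assumptions: the sample string is a hypothetical, shortened version of the attached data, the fields are assumed to be tab-separated (as the phone pattern in the post implies), and it is written with C# top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical, shortened sample: two records, each introduced by a line of
// 14 hyphens, with a tab between the quoted fields (an assumption).
string text =
    "--------------\n" +
    "\"M:\"\t\"3242310532 \"\n" +
    "\"Subscriber Name:\"\t\"MR Regex\"\n" +
    "--------------\n" +
    "\"M:\"\t\"9042437121 \"\n" +
    "\"Subscriber Name:\"\t\"Fred 1\"\n";

// Multiline: ^ matches at the start of every line.
// Singleline: . also matches newline characters.
// (?:(?!^-{14}).)* consumes one character at a time, but only while a new
// separator line does NOT start at the current position.
var record = new Regex(
    @"^-{14}\r?\n(?<Body>(?:(?!^-{14}).)*)",
    RegexOptions.Multiline | RegexOptions.Singleline);

// The phone pattern from the post, applied to one record body at a time.
var phone = new Regex(@"^""M:""\t""(?<PhoneNumber>[^""]*)""", RegexOptions.Multiline);

var bodies = record.Matches(text).Cast<Match>()
    .Select(m => m.Groups["Body"].Value)
    .ToList();

var phones = bodies
    .Select(b => phone.Match(b).Groups["PhoneNumber"].Value.Trim())
    .ToList();

Console.WriteLine(bodies.Count);               // 2
Console.WriteLine(string.Join(", ", phones));  // 3242310532, 9042437121
```

Each match is one whole record, so no index arithmetic is needed. On very large files the character-by-character lookahead may be slower than splitting on the separator line, so this is a readability trade-off rather than a performance recommendation.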

Thank you.


--------------
"M:" "3242310532 "
"Subscriber Name:" "MR Regex"
"Additional line user name:" ""
"Sublevel:" " "
"Sublevel:" ""
"Reference 1:" ""
"Reference 2:" ""

"CURRENT CHARGES"
"Monthly Service Plan" $40.00
"Additional Local Airtime" $0.00
"Long Distance Charges" $0.00
"Roaming Charges" $0.00
"Network and Licensing Charges" $7.20
"Total Taxes:" $7.09
"Total Current Charges:" $47.20

"MONTHLY SERVICE PLAN" 11-Oct-03 to 10-Nov-03
"Service Plan Name" "Total"
"Mike Dispatch 40 (11-Oct-03 to 10-Nov-03)" $40.00
"Total Monthly Service Plan Charges" $40.00

"ADDITIONAL LOCAL AIRTIME"
"Service" "Total Mins. Used" "Free Mins. Used" "Included Mins.
Used" "Chargeable Mins. Used" "Total"
"Direct Connect Private (minutes)" 28:04 28:04 0:00 0:00 $0.00
"Total Additional Local Airtime Charges" $0.00

"LONG DISTANCE, ROAMING AND OTHER CALL CHARGES"
"Service" "Incl. LD Minutes" "Chargeable LD Minutes" "Total"
"Total Long Distance Charges" $0.00

"ROAMING"
"Service" "Roaming Minutes" "Roaming Charges" "Roaming LD Minutes"
"Roaming LD Charges" "Roaming Surcharge" "Total"
"Total Roaming Charges" $0.00

"WIRELESS WEB - PREMIUM SERVICE"
"Service" "Total Events" "Event Type" "Total"
"Total Wireless Web Premium Services Charges" $0.00

"PHONE - PREMIUM SERVICE"
"Service" "Total Events" "Event Type" "Total"
"Total Phone Premium Services Charges" $0.00

"PAGER SERVICES"
"Service" "Total Messages" "Included Messages" "Chargeable
Messages" "Total"
"Total Pager Charges" $0.00

"VALUE-ADDED SERVICES" 11-Oct-03 to 10-Nov-03
"Service" "Total"
"Wireless Web - Surf Sampler (11-Oct-03 to 10-Nov-03)" $0.00
"Total Value Added Service Charges" $0.00

"OTHER CHARGES AND CREDIT"
"Charge or Credit" "Total"
"Total Other Charges and Credits" $0.00

"NETWORK and LICENSING CHARGES"
"Service" "Total"
"911 Emergency Access Charge (11-Oct-03 to 10-Nov-03)" $0.25
"System Licensing Charge (11-Oct-03 to 10-Nov-03)" $6.95
"Total Network Licensing Charges" $7.20

"TAXES"
"" "Total"
"Total Taxes" $7.09

--------------
"M:" "9042437121 "
"Subscriber Name:" "Fred 1"
"Additional line user name:" ""
"Sublevel:" " "
"Sublevel:" ""
"Reference 1:" ""
"Reference 2:" ""

"CURRENT CHARGES"
Nov 22 '05 #1
Yes, you can do it in regex. The trick is to allow your pattern to match
more than one time. For example, if I had something like:

1234
34123
11313
113133
xxxxx

I could write something like:

(?<Numbers>^\d+$)+xxxxx

Which means that I need to look at the Captures collection of the Numbers
group (rather than just the group's value), IIRC.

Note that in most uses of this technique, what you really need to write is
something like:

((?<Numbers> match numbers) match stuff between numbers)+xxxxx

so that the match can continue. You may also need to play around with the
singleline and multiline options.

--
Eric Gunnerson

Visit the C# product team at http://www.csharp.net
Eric's blog is at http://weblogs.asp.net/ericgu/

This posting is provided "AS IS" with no warranties, and confers no rights.
"Masahiro Ito" <ma**@pleasespamgoaway.it> wrote in message
news:Xn************************@216.196.105.130...
[post #1 quoted in full]

Nov 22 '05 #2
Thank you, Eric. I was already using a capture group: in my first example,
I used (?<PhoneNumber>[^"]*) to capture everything up to the next " into
my PhoneNumber collection.

In this simpler example, where I want to capture the Field1 and Field5
values, I cannot reliably match the 'everything between numbers' part.

My attempt (doesn't work):
Field1:\s(<F1>[0-9]*)[^Field5:]*Field5:\s(?<F5>[0-9.$]*)
                      ^trouble^

Field1: 1234
Field2: 34123
Field3: 1313
Field4: 13133
Field5: $xxxx.00
Field6: 2342df
Field1: 2342
Field2: 33241
Field3: 2142
Field4: 543523
Field5: $342.00
Field6: 43254
Field1: 3415
Field2: 234235
Field3: 341
Field4: 13212533
Field5: $5234.00
Field6: 32415

Of course, I can run two separate captures, but...
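
A note on why the attempt above stalls, offered as a sketch rather than a definitive diagnosis: [^Field5:] is a character class, so it means "one character that is not F, i, e, l, d, 5 or :", not "anything that is not the string Field5:". (The first group is also written (<F1> without the ?, so it matches the literal text <F1>.) A negative lookahead expresses the intended "not this string". The sample text below is a hypothetical, shortened version of the data in this post, with numeric Field5 values.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical, shortened sample with numeric Field5 values.
string text =
    "Field1: 2342\nField2: 33241\nField5: $342.00\nField6: 43254\n" +
    "Field1: 3415\nField2: 234235\nField5: $5234.00\nField6: 32415\n";

// Character-class version: the class stops at the F of "Field2",
// so the pattern never gets as far as "Field5:".
bool classVersionMatches =
    Regex.IsMatch(text, @"Field1:\s[0-9]*[^Field5:]*Field5:");

// (?:(?!Field5:).)* means: any characters (including newlines, because of
// Singleline), as long as the string "Field5:" does not start here.
var pair = new Regex(
    @"Field1:\s(?<F1>[0-9]+)(?:(?!Field5:).)*Field5:\s(?<F5>[0-9.$]+)",
    RegexOptions.Singleline);

// One match per record, each carrying its own F1 and F5 value.
var pairs = pair.Matches(text).Cast<Match>()
    .Select(m => m.Groups["F1"].Value + " -> " + m.Groups["F5"].Value)
    .ToList();

Console.WriteLine(classVersionMatches);   // False
foreach (var p in pairs)
    Console.WriteLine(p);                 // 2342 -> $342.00, then 3415 -> $5234.00
```

Using Matches here yields one match per record, which keeps each Field1 value paired with its own Field5 value.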

You gave the example technique: ((?<Numbers> match numbers) match stuff
between numbers)+xxxxx

Does this +xxxxx match everything until the xxxxx is found? In my regex
apps (I use Expresso and Regex Workshop as .NET tools) there are no
matches.

Thanks,

Masa

"Eric Gunnerson [MS]" <er****@online.microsoft.com> wrote in
news:#O**************@TK2MSFTNGP12.phx.gbl:
[post #2 quoted in full]


Nov 22 '05 #3
I'm a little confused about what you're trying to do. Given the example text
below, what is the expected output that you want?

If I assume that you didn't mean to write xxxx.00 for the Field5 value
below, the following regex may do what you want:

new Regex(@"
(
(?<S2>.*?)
Field1:\s(?<F1>[0-9]*)
(?<S1>.+?)
Field5:\s(?<F5>[0-9.\$]+)
)+",
RegexOptions.IgnorePatternWhitespace);

All the F1 values will be in the Captures collection of the F1 group, and
all the F5 values in that of the F5 group. I named the S1 and S2 groups so
you could see what they're matching.

I'd suggest using my Regex Workbench at
http://www.gotdotnet.com/Community/U...1-4EE2729D7322 -
it makes playing around with Regex much easier.
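
A sketch of how the pattern above might be run and read back, with one change that is an assumption on my part rather than something stated in the post: RegexOptions.Singleline is added, because without it . does not match a newline and (?<S1>.+?) cannot get from the Field1 line to the Field5 line. The sample text is a hypothetical, shortened version of the data, with numeric Field5 values.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical, shortened sample with numeric Field5 values.
string text =
    "Field1: 2342\nField2: 33241\nField5: $342.00\nField6: 43254\n" +
    "Field1: 3415\nField2: 234235\nField5: $5234.00\nField6: 32415\n";

string pattern = @"
    (
      (?<S2>.*?)
      Field1:\s(?<F1>[0-9]*)
      (?<S1>.+?)
      Field5:\s(?<F5>[0-9.\$]+)
    )+";

// As posted (IgnorePatternWhitespace only): . stops at every newline.
bool withoutSingleline =
    Regex.IsMatch(text, pattern, RegexOptions.IgnorePatternWhitespace);

// With Singleline added, . also matches newlines, so one match spans all records.
Match match = Regex.Match(text, pattern,
    RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);

// A group that repeats keeps every value it captured in its Captures collection.
var f1 = match.Groups["F1"].Captures.Cast<Capture>().Select(c => c.Value).ToList();
var f5 = match.Groups["F5"].Captures.Cast<Capture>().Select(c => c.Value).ToList();

Console.WriteLine(withoutSingleline);      // False
Console.WriteLine(string.Join(", ", f1));  // 2342, 3415
Console.WriteLine(string.Join(", ", f5));  // $342.00, $5234.00
```

Note that match.Groups["F1"].Value only returns the last capture ("3415" here); the earlier values are reachable only through Captures.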

--
Eric Gunnerson

"Masahiro Ito" <ma**@pleasespamgoaway.it> wrote in message
news:Xn***********************@207.46.248.16...
[post #3 quoted in full]

Nov 22 '05 #4
"Eric Gunnerson [MS]" <er****@online.microsoft.com> wrote in
news:OQ**************@TK2MSFTNGP09.phx.gbl:
I'm a little confused about what you're trying to do. Given the
example text below, what is the expected output that you want?

If I assume that you didn't mean to write xxxx.00 for the Field5 value
below, the following regex may do what you want:

new Regex(@"
(
(?<S2>.*?)
Field1:\s(?<F1>[0-9]*)
(?<S1>.+?)
Field5:\s(?<F5>[0-9.\$]+)
)+",
RegexOptions.IgnorePatternWhitespace);

All the F1 values will be in the Captures collection of the F1 group,
and all the F5 values in that of the F5 group. I named the S1 and S2
groups so you could see what they're matching.

I'd suggest using my Regex Workbench at
http://www.gotdotnet.com/Community/U...px?SampleGuid=C712F2DF-B026-4D58-8961-4EE2729D7322
- it makes playing around with Regex much easier.

Thanks Eric. Actually, I was using your Regex Workbench already - it is
great! Thank you for sharing it.

Something is not clicking with me and these regex expressions. Even when I
paste your regex, I don't believe I am getting the responses you intended.
In the sample I posted, I am trying to capture the field 1 and field 5
values. I can capture them separately, but can't seem to grasp the 'skip
everything until a specific pattern is matched'.

I am trying to break down your sample piece by piece. Does the @ at the
start do something?
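
On the @ question: it is part of C# string syntax, not of the regex. @"..." is a verbatim string literal, in which a backslash is kept as typed and a double quote is written as "". The small sketch below only illustrates that equivalence; the variable names are made up for the example.

```csharp
using System;

// Regular literal: every regex backslash must be doubled for the C# compiler.
string regular = "Field1:\\s(?<F1>[0-9]*)";

// Verbatim literal: the text between the quotes is taken as typed.
string verbatim = @"Field1:\s(?<F1>[0-9]*)";

// Both literals produce exactly the same pattern text.
bool same = regular == verbatim;

// Inside a verbatim literal, a double quote is written as two double quotes.
string quoted = @"^""M:""";

Console.WriteLine(same);    // True
Console.WriteLine(quoted);  // ^"M:"
```

So when a pattern is pasted into a regex testing tool, presumably only the text between the quotes belongs in the pattern box, not the new Regex(@" wrapper or the options argument.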

Also, using Regex Workbench, using your sample in your first reply, I am
not getting any matches.
String:
1234
34123
11313
113133
xxxxx

Regex:
(?<Numbers>^\d+$)+xxxxx

I have tried every permutation I can think of with Multiline/Singleline,
etc. I feel like I am going crazy.
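
A likely explanation, sketched under the assumption that the test string uses plain \n line endings: $ is zero-width, so it matches just before the newline but does not consume it. After the first number the engine is still sitting in front of the line break, where neither another ^ nor xxxxx can match. Consuming the line break inside the repeated part lets the repetition continue, which is the "match stuff between numbers" step from the earlier reply.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The test string from the post, assuming \n line endings.
string text = "1234\n34123\n11313\n113133\nxxxxx\n";

// As posted: $ asserts end-of-line but leaves the newline unconsumed.
var asPosted = new Regex(@"(?<Numbers>^\d+$)+xxxxx", RegexOptions.Multiline);

// Variant that consumes the line break (\r?\n) after each number.
var consuming = new Regex(@"(?:^(?<Numbers>\d+)\r?\n)+xxxxx", RegexOptions.Multiline);

bool postedMatches = asPosted.IsMatch(text);

Match m = consuming.Match(text);
var numbers = m.Groups["Numbers"].Captures.Cast<Capture>()
    .Select(c => c.Value)
    .ToList();

Console.WriteLine(postedMatches);                // False
Console.WriteLine(string.Join(", ", numbers));   // 1234, 34123, 11313, 113133
```

If the input uses \r\n line endings there is a second trap: in .NET, $ matches before \n only, so \d+$ fails on the \r. The \r?\n in the variant above is there to cover both cases.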

Thank you.

Masa

Nov 22 '05 #5
