Regex repeating capture

Howdy,

I'm trying to break an input string into multiple pieces using a
series of delimiters that start with an asterisk. Following the
asterisk is a multiple-character identifier, immediately followed by a
data string of variable length. The input string may contain more than
one identifier anywhere in the string.

Here is an example:
*CZ1 2.3 4-56 *fuuuS24364 08 23 72

I'd like to break this into
CZ
1 2.3 4-56
fuuu
S24364 08 23 72

I have tried the pattern (?:\*(CZ|fuuu)(.*)), which produces the
following output:
CZ
1 2.3 4-56 *fuuuS24364 08 23 72

How can I force it to repeat the capturing?

Thanks,
Jay

Jan 30 '07 #1
Use:

    public string[] Split(params char[] separator)

to split your string on the asterisk as a first step.

Now you can enumerate over the string array, splitting out your identifiers
and data strings. You could use a StringBuilder to build whatever you want
to output.

Now you can use:

    public bool StartsWith(string value)

and

    public string Substring(int startIndex)

e.g.

    // "input" stands for the string to parse. Split on the asterisk and
    // skip the empty entry that precedes the leading '*'.
    string[] strArray = input.Split(new char[] { '*' }, StringSplitOptions.RemoveEmptyEntries);
    StringBuilder sb = new StringBuilder();
    foreach (string s in strArray)
    {
        if (s.StartsWith("CZ"))
        {
            sb.Append("CZ");
            sb.Append(s.Substring(2));
        }
        else if (s.StartsWith("fuuu"))
        {
            sb.Append("fuuu");
            sb.Append(s.Substring(4));
        }
    }

    return sb.ToString();

I'm sure there's an easier way using a Regex, but I can't be bothered to
puzzle it out.

HTH
Peter

Jan 30 '07 #2
Sorry, this was a simple example. In all, there are 50+ identifiers,
and * is allowed in the data as long as it isn't immediately followed by
an identifier; otherwise it is treated as the start of another identifier.
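
One way to honour that rule is to split only where an asterisk is immediately followed by a known identifier. The following is a minimal sketch, not a tested solution from this thread: the class and method names are made up for illustration, and only CZ and fuuu are listed where the real pattern would carry all 50+ identifiers. It relies on a zero-width lookahead, so nothing is consumed by the split and each marker stays attached to its segment.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class MarkerSplitDemo
{
    // Splits the input just before each "*identifier" marker.
    // The lookahead (?=...) matches a position, not characters, so an
    // asterisk that is not followed by a listed identifier is left alone.
    public static string[] SplitAtMarkers(string input)
    {
        // Assumption: only CZ and fuuu are valid identifiers here.
        string[] segments = Regex.Split(input, @"(?=\*(?:CZ|fuuu))");

        // Drop any empty entry produced by a match at the very start.
        return Array.FindAll(segments, s => s.Length > 0);
    }

    public static void Main()
    {
        string input = "*CZ1 2.3 4*A56 *fuuuS24364 08 23 72";
        foreach (string segment in SplitAtMarkers(input))
        {
            Console.WriteLine("[" + segment + "]");
        }
    }
}
```

Each segment still begins with its asterisk and identifier, so a second step (for example StartsWith and Substring) is needed to separate the identifier from the data. Any text that appears before the first marker would come back as a segment of its own.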

Jan 30 '07 #3


So, to split based on an * using a regular expression:

    string pattern = @"\*(?<Text>[^\*]+)";
    string input = "*CZ1 2.3 4-56 *fuuuS24364 08 23 72";
    Match match = Regex.Match(input, pattern);

    while (match.Success) {
        Console.WriteLine(match.Groups["Text"].Value);
        match = match.NextMatch();
    }

HTH,
Mythran


Jan 30 '07 #4

Ahh, I didn't know you wanted to break it out into identifier, text,
identifier, text... so the previous post should be obliterated :P. Do you
know if the identifier is always 4 characters? If so, the following
example shows how to achieve this:

    string pattern = @"\*(?<Identifier>.{4})(?<Value>[^\*]+)";
    string input = "*CZ1 2.3 4-56 *fuuuS24364 08 23 72";
    Match match = Regex.Match(input, pattern);

    while (match.Success) {
        Console.WriteLine(
            "Identifier: {0} - Value: {1}",
            match.Groups["Identifier"].Value,
            match.Groups["Value"].Value
        );
        match = match.NextMatch();
    }

HTH,
Mythran
Jan 30 '07 #5
Jay
The identifier is at least 2 characters, but has no upper limit.
Thanks,
Jay

Jan 30 '07 #6
Jay
I know what the identifiers are, so I'm okay with replacing the .{4}
with (Identifier1|Identifier2|...|IdentifierN) at run time. However, I
cannot blindly end the data capture on an asterisk. "*CZ1 2.3 4*A56
*fuuuS24364 08 23 72" is also valid, provided *A6 is not a valid
identifier. The data capture can only end when it encounters another
valid identifier.
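
A possible way to express that requirement is to build the alternation at run time and let a lookahead decide where the data ends. This is a sketch under stated assumptions rather than a confirmed answer from the thread: the class and method names are hypothetical, the identifier list is cut down to CZ and fuuu, and the group names Identifier and Value are borrowed from the earlier example. The Value group is lazy and stops only where another valid "*identifier" follows or the input ends.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class IdentifierParser
{
    // Builds the pattern from the list of valid identifiers at run time.
    public static Regex BuildRegex(IEnumerable<string> identifiers)
    {
        // Longest first, so a short identifier cannot shadow a longer one
        // that begins with the same characters. Regex.Escape protects any
        // identifier that contains regex metacharacters.
        string alternation = string.Join("|",
            identifiers.OrderByDescending(id => id.Length)
                       .Select(id => Regex.Escape(id))
                       .ToArray());

        // \*            the marker
        // Identifier    one of the known identifiers
        // Value         as little as possible (lazy) ...
        // (?=...)       ... up to the next valid marker or the end of input
        string pattern = @"\*(?<Identifier>" + alternation + @")(?<Value>.*?)"
                       + @"(?=\*(?:" + alternation + @")|\z)";

        // Singleline lets '.' match newlines inside the data as well.
        return new Regex(pattern, RegexOptions.Singleline);
    }

    // Returns identifier/value pairs in the order they appear in the input.
    public static List<KeyValuePair<string, string>> Parse(
        string input, IEnumerable<string> identifiers)
    {
        var result = new List<KeyValuePair<string, string>>();
        foreach (Match m in BuildRegex(identifiers).Matches(input))
        {
            result.Add(new KeyValuePair<string, string>(
                m.Groups["Identifier"].Value, m.Groups["Value"].Value));
        }
        return result;
    }

    public static void Main()
    {
        string input = "*CZ1 2.3 4*A56 *fuuuS24364 08 23 72";
        foreach (var pair in Parse(input, new[] { "CZ", "fuuu" }))
        {
            Console.WriteLine("Identifier: {0} - Value: [{1}]", pair.Key, pair.Value);
        }
    }
}
```

With this input the "*A56" stays inside the first value because no listed identifier follows that asterisk. The captured values keep their surrounding whitespace, so call Trim() on them if the trailing space before the next marker is unwanted.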

Jan 30 '07 #7

How many identifiers are there? If it is a small list (say, fewer than
10 or so), then you can use the regex OR character '|' in the pattern to
separate the list of valid identifiers instead of matching on the asterisk
itself.

HTH,
Mythran

Jan 31 '07 #8
