
Regex Matches

Any takers?

I have a string of DNA as an input sequence, GGATGGATG, and apply the
simple regex "GGATG" to it, as in:

Regex r = new Regex("GGATG", RegexOptions.Compiled);

MatchCollection matches = r.Matches("GGATGGATG");

Now I would expect to get two matches right? One at index 0 in the
string and the second at index 4? Or am I being really dumb or
something (EricGu, where art thou?).

Thanks 4 help.

Kofi.

Nov 24 '05 #1
It would match two times if you put an extra G at index 4 in the matches
string:
GGATGGGATG

Gabriel Lozano-Morán
Nov 24 '05 #2

Gabriel Lozano-Morán wrote:
It would match two times if you put an extra G at index 4 in the matches
string:
GGATGGGATG

Gabriel Lozano-Morán


Well, yes, but I think what the OP wanted to know is why Regex
doesn't re-scan after a match. That is, in the string GGATGGATG, the
Regex will match the initial GGATG. After that, where does the Regex
processor start looking for the next match? Does it start just after
the first matched character, so that it matches against the substring
GATGGATG, in which case it would find a second match at the fifth
character of the original string (the fourth character of the
substring)? Or does it start after the last character of the first
match, so that it matches against GATG, which yields no second match?

Regex appears to display the latter behaviour, according to the OP.

I checked the RegexOptions enumeration, and don't see any flag for
Rescan. I have seen this option for other Regex pattern matchers, but
it doesn't appear to be in the .NET one.
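
The behaviour described above can be checked with a short sketch. The class and method names (NonOverlappingDemo, CountMatches) are made up for illustration; only Regex and MatchCollection come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

static class NonOverlappingDemo
{
    // Hypothetical helper: counts the matches that Regex.Matches reports.
    public static int CountMatches(string input, string pattern)
    {
        return new Regex(pattern).Matches(input).Count;
    }

    static void Main()
    {
        // Each search resumes after the end of the previous match,
        // so the overlapping occurrence at index 4 is not reported.
        Console.WriteLine(CountMatches("GGATGGATG", "GGATG"));   // 1

        // With the extra G the two occurrences no longer overlap.
        Console.WriteLine(CountMatches("GGATGGGATG", "GGATG"));  // 2
    }
}
```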

One thing the OP could do is use Match instead of Matches:

string dna = "GGATGGATG";
int matchIndex = 0;
Regex r = new Regex("GGATG");
Match sequence = r.Match(dna, matchIndex);
while (sequence != Match.Empty)
{
    matchIndex = sequence.Index;
    Console.WriteLine("Sequence matched at index {0}", matchIndex);
    // Restart the search one character past the start of this match.
    matchIndex++;
    sequence = r.Match(dna, matchIndex);
}

Or something like that. Then he could determine where Regex should
start searching again after it finds a match.
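
As a rough sketch, the same idea can be wrapped in a reusable method that collects every start index, overlapping or not. The names OverlappingScan and FindAllIndexes are hypothetical, and the sketch tests Match.Success rather than comparing against Match.Empty, which is a swapped-in but equivalent check.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class OverlappingScan
{
    // Hypothetical helper: returns the start index of every occurrence,
    // including occurrences that overlap an earlier match.
    public static List<int> FindAllIndexes(string input, string pattern)
    {
        var indexes = new List<int>();
        var regex = new Regex(pattern);
        Match match = regex.Match(input, 0);
        while (match.Success)
        {
            indexes.Add(match.Index);
            // Stop if the next start position would run past the end of the input.
            if (match.Index + 1 > input.Length) break;
            // Resume one character after the START of this match, not after its end.
            match = regex.Match(input, match.Index + 1);
        }
        return indexes;
    }

    static void Main()
    {
        // Prints "0, 4" for the sequence from the original question.
        Console.WriteLine(string.Join(", ", FindAllIndexes("GGATGGATG", "GGATG")));
    }
}
```

Restarting at match.Index + 1 is what lets the caller decide where scanning resumes; restarting at match.Index + match.Length would give the same non-overlapping result as Matches.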

Nov 25 '05 #3
Barry,

Thanks for your helpful reply - spot on.

Kofi.

Nov 25 '05 #4
It is logical that you get only one result.
If you want all the indexes that match, you can use this trick:
use GGAT(?=G) instead of GGATG.
This matches every GGAT sequence that is followed by a G. Of course, each
match result will contain GGAT rather than GGATG, but that doesn't matter
because you know you are looking for GGATG.
So with

Regex r = new Regex("GGAT(?=G)", RegexOptions.Compiled);
MatchCollection matches = r.Matches("GGATGGATG");

you will get 2 matches, the first at position 0 and the second at position 4.
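
A variation on the lookahead trick, sketched here as an assumption rather than as part of the suggestion above: putting the whole pattern inside the lookahead, with a capturing group, makes every match zero-width while still exposing the full GGATG text through group 1. The names LookaheadCapture and FindStarts are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LookaheadCapture
{
    // Hypothetical helper: start indexes of all (possibly overlapping)
    // occurrences of GGATG, found with a zero-width lookahead.
    public static List<int> FindStarts(string input)
    {
        var starts = new List<int>();
        var regex = new Regex("(?=(GGATG))");
        foreach (Match m in regex.Matches(input))
        {
            // The match itself is empty; group 1 holds the text seen by the lookahead.
            if (m.Groups[1].Value == "GGATG")
            {
                starts.Add(m.Index);
            }
        }
        return starts;
    }

    static void Main()
    {
        // Prints "0, 4".
        Console.WriteLine(string.Join(", ", FindStarts("GGATGGATG")));
    }
}
```

Because the lookahead consumes no characters, the engine moves forward one position at a time, so overlapping occurrences are all visited.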

Hope it helps,

Ludovic SOEUR.
"Kofi" <ko**@nimoh.com> wrote in message
news:11**********************@o13g2000cwo.googlegroups.com...
Barry,

Thanks for your helpful reply - spot on.

Kofi.

Nov 25 '05 #5

Bruce Wood wrote:
string dna = "GGATGGATG";
int matchIndex = 0;
Regex r = new Regex("GGATG");
Match sequence = r.Match(dna, matchIndex);
while (sequence != Match.Empty)
{
    matchIndex = sequence.Index;
    Console.WriteLine("Sequence matched at index {0}", matchIndex);
    matchIndex++;
    sequence = r.Match(dna, matchIndex);
}


I should point out that there's a bug in my code. The loop test should
read:

while (sequence != Match.Empty && matchIndex < dna.Length) ...

The bug will show up only when matching a one-character Regex pattern
that matches on the last character of the string.

Nov 25 '05 #6
