
string/regex: extracting the context of a string match around the found search term?

I'm analyzing large strings and finding matches using the Regex class. I
want to find the context those matches are found in and to display excerpts
of that context, just as a search engine might. In terms of code, what's the
easiest way to make that happen? The code below works fine for identifying
the matches, but it doesn't try to extract the surrounding context or
display it:

currPageText = [method which grabs text from my source]
numberOfMatches = Regex.Matches(currPageText, pattern, RegexOptions.IgnoreCase).Count;
Response.Write("found " + numberOfMatches + "<br>");

Thank you,
-KF

Jul 4 '07 #1
3 Replies


On Tue, 03 Jul 2007 23:41:47 -0700, <ke*****@nospam.nospam> wrote:
> I'm analyzing large strings and finding matches using the Regex class. I
> want to find the context those matches are found in and to display
> excerpts of that context, just as a search engine might. In terms of code,
> what's the easiest way to make that happen?

The Regex.Matches() method returns a Matches instance, which is a
collection of Match instances. That's why you can look at the Count
property to see how many matches there were.

The Match class has Index and Length properties (inherited from the
Capture class), which tell you where in the original string the text was
found. You can easily use that information to look at the larger region
of text containing the matching text.

So, just enumerate the MatchCollection returned by Matches(), and for each
Match instance take the substring you get by widening the range from
Match.Index to Match.Index + Match.Length by as many characters on each
side as you think is appropriate, without running past either end of the
string.
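
A minimal sketch of that approach, assuming a fixed number of context characters on each side is good enough. The names ExcerptFinder, GetExcerpts and radius are made up for illustration; only Regex.Matches, Match.Index and Match.Length come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ExcerptFinder
{
    // Returns one excerpt per match: the matched text plus up to
    // `radius` characters on each side, clamped to the string bounds.
    // (Hypothetical helper, not part of the .NET Regex API.)
    public static List<string> GetExcerpts(string text, string pattern, int radius)
    {
        var excerpts = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern, RegexOptions.IgnoreCase))
        {
            // Widen the match range, but never past either end of the text.
            int start = Math.Max(0, m.Index - radius);
            int end = Math.Min(text.Length, m.Index + m.Length + radius);
            excerpts.Add(text.Substring(start, end - start));
        }
        return excerpts;
    }
}
```

For example, ExcerptFinder.GetExcerpts("The quick brown fox jumps over the lazy dog", "fox", 6) yields a single excerpt, "brown fox jumps". Because the matching itself is unchanged, the number of excerpts equals the Count you already display; excerpts of nearby matches may overlap.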

Pete
Jul 4 '07 #2

On Wed, 04 Jul 2007 00:48:21 -0700, Peter Duniho
<Np*********@nnowslpianmk.com> wrote:
> The Regex.Matches() method returns a Matches instance [...]

Obviously, that should read "returns a MatchCollection instance".
Hopefully the later part of my post made that clear. Forry sor any
foncusion. :)
Jul 4 '07 #3

You could write a Regex to give you some text around the match by
doing something like

Regex MatchWithContext = new Regex(
    "(?<ContextBeforeMatch>.{10})(?<ActualMatch>StringYouWantTofind)(?<ContextAfterMatch>.{10})");

This would allow you to capture three groups. An "ActualMatch" group, a
"ContextBeforeMatch" group of 10 characters, and a "ContextAfterMatch"
group of 10 characters. I am almost certain this would work the way you want
it in the middle of the string, but it would not match any occurrence that
lies fewer than 10 characters from the beginning or end of the string. I
think that you could fix that as follows:

Regex MatchWithContext = new Regex(
    "(?<ContextBeforeMatch>.{0,10})(?<ActualMatch>StringYouWantTofind)(?<ContextAfterMatch>.{0,10})");

to allow the context groups to be as small as zero if needed, but I would
definitely test this one before using it...
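
A rough sketch of how the three named groups could be read back, assuming the search term is a literal string. ContextRegexDemo and FindWithContext are hypothetical names; Regex.Escape and RegexOptions.Singleline (so that "." also matches line breaks) are additions of this sketch, not part of the pattern above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ContextRegexDemo
{
    // Returns { before, match, after } for the first occurrence of `term`,
    // or an empty array when nothing matches. (Hypothetical helper.)
    public static string[] FindWithContext(string text, string term)
    {
        // Regex.Escape makes the term match literally, even if it
        // contains characters such as '.' or '('.
        var re = new Regex(
            "(?<ContextBeforeMatch>.{0,10})(?<ActualMatch>" + Regex.Escape(term) +
            ")(?<ContextAfterMatch>.{0,10})",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        Match m = re.Match(text);
        if (!m.Success)
        {
            return Array.Empty<string>();
        }
        return new[]
        {
            m.Groups["ContextBeforeMatch"].Value,
            m.Groups["ActualMatch"].Value,
            m.Groups["ContextAfterMatch"].Value
        };
    }
}
```

One thing to watch for when testing: the context groups consume characters, so a second occurrence that lies within 10 characters of the first can be swallowed into a context group and not be reported as its own match. With Regex.Matches() the count may then come out lower than with the plain pattern.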
Ethan
Ethan Strauss Ph.D.
Bioinformatics Scientist
Promega Corporation
2800 Woods Hollow Rd.
Madison, WI 53711
608-274-4330
800-356-9526
et***********@promega.com

Jul 5 '07 #4
