string/regex: extracting the context of a string match around the found search term?

I'm analyzing large strings and finding matches using the Regex class. I
want to find the context those matches are found in and to display excerpts
of that context, just as a search engine might. In terms of code, what's the
easiest way to make that happen? The code below works fine for identifying
the matches, but it doesn't try to extract the surrounding context or
display it:

currPageText = [method which grabs text from my source]
numberOfMatches = Regex.Matches(currPageText, pattern, RegexOptions.IgnoreCase).Count;
Response.Write("found " + numberOfMatches + "<br>");

Thank you,
-KF

Jul 4 '07 #1
On Tue, 03 Jul 2007 23:41:47 -0700, <ke*****@nospam.nospam> wrote:
> I'm analyzing large strings and finding matches using the Regex class. I
> want to find the context those matches are found in and to display
> excerpts of that context, just as a search engine might. In terms of code,
> what's the easiest way to make that happen?

The Regex.Matches() method returns a Matches instance, which is a
collection of Match instances. That's why you can look at the Count
property to see how many matches there were.

The Match class has the Index and Length properties (inherited from the
Capture class), which tell you where in the original string the text was
found. You can easily use that information to look at the larger region
of text containing the matching text.

So, just enumerate the MatchCollection returned by Matches(), and for each
Match instance look at the substring defined by expanding the Match.Index
to Match.Index+Match.Length range to be as large as you think is
appropriate.
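
A minimal sketch of that approach, assuming a hypothetical helper named GetExcerpts and a caller-chosen radius (neither is from the original post). It widens each match's Index..Index+Length range by radius characters on each side and clamps the result to the bounds of the string:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ExcerptHelper
{
    // Hypothetical helper (not part of the Regex API): returns one excerpt
    // per match, with up to `radius` characters of context on each side.
    public static List<string> GetExcerpts(string text, string pattern, int radius)
    {
        List<string> excerpts = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern, RegexOptions.IgnoreCase))
        {
            // Expand the matched range, clamping to the start and end of the text
            // so that matches near either edge do not throw.
            int start = Math.Max(0, m.Index - radius);
            int end = Math.Min(text.Length, m.Index + m.Length + radius);
            excerpts.Add(text.Substring(start, end - start));
        }
        return excerpts;
    }
}
```

For example, GetExcerpts("The quick brown fox jumps over the lazy dog", "fox", 6) should yield the single excerpt "brown fox jumps". Because each excerpt is cut from the original string, two matches that sit close together simply produce overlapping excerpts.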

Pete
Jul 4 '07 #2
On Wed, 04 Jul 2007 00:48:21 -0700, Peter Duniho
<Np*********@nnowslpianmk.com> wrote:
> The Regex.Matches() method returns a Matches instance [...]

Obviously, that should read "returns a MatchCollection instance".
Hopefully the later part of my post made that clear. Forry sor any
foncusion. :)
Jul 4 '07 #3
You could write a Regex to give you some text around the match by
doing something like

Regex MatchWithContext = new Regex
(@"(?<ContextBeforeMatch>.{10})(?<ActualMatch>StringYouWantTofind)(?<ContextAfterMatch>.{10})");

This would allow you to capture three groups. An "ActualMatch" group, a
"ContextBeforeMatch" group of 10 characters, and a "ContextAfterMatch"
group of 10 characters. I am almost certain this would work the way you want
it in the middle of the string, but you would not match anything fewer than
10 characters from the beginning or end of the string. I think that you
could fix that as follows

Regex MatchWithContext = new Regex
(@"(?<ContextBeforeMatch>.{0,10})(?<ActualMatch>StringYouWantTofind)(?<ContextAfterMatch>.{0,10})");

to allow the context groups to be as small as zero if needed, but I would
definitely test this one before using it...
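
Here is a rough sketch of how the named groups could be read back, assuming hypothetical helpers BuildContextRegex and Describe (my names, not anything from the framework). It also makes two assumptions beyond the pattern above: the search term is passed through Regex.Escape so it is treated literally, and RegexOptions.Singleline is set so that "." can cross line breaks.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ContextRegexDemo
{
    // Hypothetical helper: builds the three-group pattern around a literal term.
    public static Regex BuildContextRegex(string term, int radius)
    {
        string pattern =
            "(?<ContextBeforeMatch>.{0," + radius + "})" +
            "(?<ActualMatch>" + Regex.Escape(term) + ")" +
            "(?<ContextAfterMatch>.{0," + radius + "})";
        // Singleline lets '.' match newline characters, so context can span lines.
        return new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }

    // Hypothetical helper: shows the three captured groups of one match.
    public static string Describe(Match m)
    {
        return "[" + m.Groups["ContextBeforeMatch"].Value + "]"
             + m.Groups["ActualMatch"].Value
             + "[" + m.Groups["ContextAfterMatch"].Value + "]";
    }
}
```

With a radius of 6, matching "fox" against "The quick brown fox jumps over the lazy dog" should describe as "[brown ]fox[ jumps]". One thing to watch for: the context groups consume the characters they capture, so a second occurrence of the term that falls inside the trailing context of the first will not be reported as a match of its own.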
Ethan
Ethan Strauss Ph.D.
Bioinformatics Scientist
Promega Corporation
2800 Woods Hollow Rd.
Madison, WI 53711
608-274-4330
800-356-9526
et***********@promega.com
Jul 5 '07 #4
