stuck on a REGEX (\S[^\s/>]*)

I'm trying to find the opening < and the text of a tag (without the
attributes or closing tags)

This is what I'm using:

(\S[^\s/>]*)

Which, I think, reads as:

(any number of non-whitespace characters [up to a space, /, or >])

Is that correct? I can't get it to work.

If my text is:

<tag

then it returns "<tag" which is what I want.

However, if I have:

<tag/ or <tag>

it instead matches "/" or ">" respectively.

Why?


Nov 18 '05 #1
darrel wrote:
> I'm trying to find the opening < and the text of a tag (without the
> attributes or closing tags)
>
> This is what I'm using:
>
> (\S[^\s/>]*)
>
> Which, I think, reads as:
>
> (any number of non-whitespace characters [up to a space, /, or >])
>
> Is that correct? I can't get it to work.
>
> If my text is:
>
> <tag
>
> then it returns "<tag" which is what I want.
>
> However, if I have:
>
> <tag/ or <tag>
>
> it instead matches "/" or ">" respectively.
>
> Why?


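In my brief testing, when run against "<tag/" it first matches "<tag" -
then the next match is "/". The second match succeeds because the search
resumes at "/", and the leading \S accepts any non-whitespace character,
"/" included. The [^\s/>]* that follows then matches zero characters.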
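
The behaviour described above can be reproduced with a short sketch. Python's re module is used here purely for illustration (the thread itself is about .NET, but this pattern uses only constructs that behave the same way in both); the helper name all_matches is hypothetical.

```python
import re

# The pattern from the original post, unchanged.
PATTERN = re.compile(r"(\S[^\s/>]*)")

def all_matches(text):
    """Return every non-overlapping match of PATTERN, left to right."""
    return [m.group(1) for m in PATTERN.finditer(text)]

# The first character of each match is consumed by \S, which only
# excludes whitespace. The "/" and ">" exclusions apply to the
# characters AFTER the first one.
print(all_matches("<tag"))    # ['<tag']
print(all_matches("<tag/"))   # ['<tag', '/']
print(all_matches("<tag>"))   # ['<tag', '>']
```

So the negated class does stop the first match at "/" or ">", but nothing prevents a new match from starting on that very character.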

Post some examples of how you want the regex to behave, and maybe
someone can help put one together.

--
mikeb
Nov 18 '05 #2
> In my brief testing, when run against "<tag/" it first matches "<tag" -
> then the next match is "/". The second match matches "/" because it
> matches the \S character class.

But shouldn't this: [^/] stop it from doing that?

Here's how I want the regex to behave:

I want to find the first 'word' in the string. This would be any number of
characters in a row up to (but not including) a space, a new line, a /, or
a >.

so in this:

"hello there, how are you"

it should match 'hello'

in this:

"<blockquote>hello there, how are you"

it should match '<blockquote'
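
One possible way to get exactly that behaviour is sketched below, in Python for illustration only. The pattern and the helper name first_word are assumptions, not something posted in the thread: the pattern drops the leading \S and requires one or more characters that are not whitespace, "/" or ">", matched only at the start of the string.

```python
import re

# Hypothetical pattern: one or more characters, none of which is
# whitespace, "/" or ">".
FIRST_WORD = re.compile(r"[^\s/>]+")

def first_word(text):
    """Return the leading 'word' of text, or None if there is none."""
    # match() only tries at position 0, so later words are never returned.
    m = FIRST_WORD.match(text)
    return m.group(0) if m else None

print(first_word("hello there, how are you"))              # hello
print(first_word("<blockquote>hello there, how are you"))  # <blockquote
```

With this sketch a string that starts with whitespace, "/" or ">" yields None rather than a one-character match.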

Thanks!

-Darrel
Nov 18 '05 #3
> But shouldn't this: [^/] stop it from doing that?

Aha. Mike, you are correct!

Here's what's happening. If this is my text:

<blockquote>monkey</blockquote>

and this is my Regex:

\S[^>]*

It returns these matches:

<blockquote
monkey</blockquote


So, it's returning the last match, I suppose. This is where I get lost. How
do I get it to ONLY return the first match?


Nov 18 '05 #4
Got it!

The problem was the very next group I was using.

I had this:

(\S[^\s/>]*)

but had to add another group:

((\s|\n[^\S>]*)|(>))

which checks for whitespace/new lines OR the closing > of a tag.

-Darrel
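
A rough sketch of the two groups used together, in Python for illustration only. The second group is written here with balanced parentheses, ((\s|\n[^\S>]*)|(>)), which is an assumption about what was meant; the helper name tag_start is hypothetical.

```python
import re

# Group 1: the tag start. Group 2 (assumed form): what must follow it,
# either whitespace or a closing ">".
COMBINED = re.compile(r"(\S[^\s/>]*)((\s|\n[^\S>]*)|(>))")

def tag_start(text):
    """Return group 1 of the first match, or None if nothing matches."""
    m = COMBINED.search(text)
    return m.group(1) if m else None

print(tag_start('<tag attr="x">'))                    # <tag
print(tag_start("<tag>"))                             # <tag
print(tag_start("<blockquote>monkey</blockquote>"))   # <blockquote
```

Because the second group consumes the character after the tag name, the next search no longer starts on that character.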
Nov 18 '05 #5
Use the Match method of the regular expression object, which returns a
Match object:
Dim m As Match = yourRegEx.Match(inputText)
m will hold the first match only; m.Value is the matched text.
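
For comparison, the same first-match-versus-all-matches distinction sketched in Python (illustration only; in .NET the counterparts are Regex.Match and Regex.Matches). Note that the full match list below includes the leading ">" characters, which the listing in the quoted post does not show.

```python
import re

pattern = re.compile(r"\S[^>]*")
text = "<blockquote>monkey</blockquote>"

# search() stops at the first match, like Regex.Match in .NET.
first = pattern.search(text)
print(first.group(0))          # <blockquote

# findall() keeps going and returns every non-overlapping match.
print(pattern.findall(text))   # ['<blockquote', '>monkey</blockquote', '>']
```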

"darrel" wrote:
> > But shouldn't this: [^/] stop it from doing that?
>
> Aha. Mike, you are correct!
>
> Here's what's happening. If this is my text:
>
> <blockquote>monkey</blockquote>
>
> and this is my Regex:
>
> \S[^>]*
>
> It returns these matches:
>
> <blockquote
> monkey</blockquote
>
> So, it's returning the last match, I suppose. This is where I get lost. How
> do I get it to ONLY return the first match?


Nov 18 '05 #6
