#1
February 28th, 2007, 10:05 AM
Kris
regexp question

I have a text in which I want to match any of several words, like this:
"(controller|data|IT)?". At least one of the words in the group should
be mandatory, but I don't want to match a word that is part of another
word, such as "datamanager". It should only match the standalone word
"data" or any other word in the group.

Any suggestions?

#2
February 28th, 2007, 10:45 AM
Heiko Richler
Re: regexp question

Kris wrote:
> I have a text and I want to match several words like this
> "(controller|data|IT)?" at least one of the words in the group should
> be mandatory, but I dont want to match a word that is part of another
> word like this "datamanager" it should only match the word "data" or
> any other word in the group.
>
> Any suggestions?

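You know there are classes for characters, like \w? Well, there are
also zero-width assertions that match a position instead of a character:
^ Start of line or data
$ End of line or data
\b Word boundary, the position between a \w and a \W character
   (or between a \w character and the start or end of the data)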

- "(\b(controller|data|IT)\b)?"
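
The thread does not say which regex flavor is in use, so the following is only a sketch assuming Python's re module. It shows the \b approach on the word list from the question. Note that the trailing "?" makes the whole group optional, so the pattern as written can also succeed with an empty match; the sketch drops it so that one of the words must be present. The name find_words and the sample sentences are made up for illustration.

```python
import re

# Word list taken from the thread. The trailing "?" is dropped on purpose
# so that one of the words must actually be present.
WORD = re.compile(r"\b(?:controller|data|IT)\b")

def find_words(text):
    """Return every standalone occurrence of the listed words."""
    return WORD.findall(text)

print(find_words("the datamanager reads data for IT"))  # ['data', 'IT']
print(find_words("datamanager only"))                   # []

# With the "?" kept, the group may match nothing at all, so the search
# succeeds at position 0 with an empty string.
optional = re.search(r"(\b(controller|data|IT)\b)?", "datamanager")
print(repr(optional.group(0)))                          # ''
```

The match is case-sensitive as written, so "IT" is found but "it" is not.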

But some letters, like German umlauts, may be treated as non-word
characters, depending on your encoding.
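
How this plays out depends on the engine and its settings. As one hedged illustration, again assuming Python 3 (not necessarily the flavor used in the thread): string patterns are Unicode-aware by default, and the re.ASCII flag restricts \w to [a-zA-Z0-9_], which reproduces the effect described above. The sample word is invented.

```python
import re

text = "dataübung"  # hypothetical sample: "data" followed directly by an umlaut

# Default str patterns in Python 3 are Unicode-aware: "ü" is a word
# character, so there is no boundary after "data" and nothing matches.
unicode_hit = re.search(r"\bdata\b", text)
print(unicode_hit)            # None

# With re.ASCII only [a-zA-Z0-9_] count as word characters, so "ü" acts
# like punctuation and "data" is matched inside the longer word.
ascii_hit = re.search(r"\bdata\b", text, re.ASCII)
print(ascii_hit.group(0))     # data
```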

In that case you may try searching for your words between whitespace
instead:
- "((\s|^)(controller|data|IT)(\s|$))?"

What about "data-base"? There \b still sees a boundary at the hyphen,
so "data" would match.
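
A small comparison may make the trade-offs concrete. This is a sketch under the same assumption (Python's re module). The third pattern uses lookarounds, which is a substitute technique and not something proposed in this post: it requires whitespace or the data edge on both sides without consuming it. The variable names are hypothetical.

```python
import re

WORDS = r"(?:controller|data|IT)"

# \b variant: a hyphen is a non-word character, so it counts as a boundary.
boundary = re.compile(r"\b" + WORDS + r"\b")
# Whitespace variant from the post: the surrounding whitespace is consumed.
consuming = re.compile(r"(?:\s|^)(" + WORDS + r")(?:\s|$)")
# Lookaround variant (an alternative, not from the post): nothing is consumed.
lookaround = re.compile(r"(?<!\S)" + WORDS + r"(?!\S)")

print(boundary.findall("a data-base"))     # ['data']
print(lookaround.findall("a data-base"))   # []

# Two listed words separated by one space: the consuming pattern uses up
# the space after "data", so "IT" has no leading whitespace left to match.
print(consuming.findall("data IT"))        # ['data']
print(lookaround.findall("data IT"))       # ['data', 'IT']
```

Which behavior is right for "data-base" depends on what should count as a word in the text being searched.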

Heiko
--
http://portal.richler.de/ Namensportal zu Richler
http://www.richler.de/ Heiko Richler: Computer - Know How!
http://www.richler.info/ private Homepage

#3
February 28th, 2007, 11:25 AM
Kris
Re: regexp question

Hi Heiko,

thanks for the reply.

I know about boundaries and such, I just couldn't get it to work. Now I
see why: I had tested \b(controller...)\b, which didn't work as intended.

Kris

 
