regular expression help

Hi, I'm not sure that this is the right forum for this, but I've been having
a very tough time completing this expression, and I was hoping someone might
have some suggestions for me.
I am trying to read measurements out of a text description, and I have a
working expression, but it captures a pile of empty matches. I obviously am
not interested in them, but I screw up my functionality when I try to get
rid of them.

My expression is:
(?:(?:(?<Feet>[0-9]*)\'){0,1}(?:(?:(?<WholeInches>[0-9]*(?![/\w])){0,1}(?:[ ,\-]){0,1}(?<Fraction>[0-9]*\/[0-9]*){0,1}(?<Decimal>\d*\.\d*){0,1}\")){0,1})

Some test strings are:
1/4" x 2" Flat 44W x 20'
1 1/4" x 2" Flat 44W x 20'
1/4" x 2.5" Flat 44W x 20'
1/4" x 2" Flat 44W x 20' 3"
1/4" x 2" Flat 44W x 20' 3.5"
1/4" x 2" Flat 44W x 20' 1/2"
1/8" x 4" C-1018 flat x 14' 5-1/4"

I really could use some help on this. I've been working on this on and off
for several months now, and just can't seem to get it right.
Dec 10 '05 #1
Sorry, it's been a hectic day... I didn't finish my post, but somehow
managed to send it anyway....

In the strings, there are always random numbers, and I want them ignored.
I only want matches on the measurements which can be written about a million
different ways. This is for pulling data out of a legacy inventory
application.

Any thoughts or suggestions would be very, very much appreciated. Right
now, my app uses this expression, and removes matches to the empty groups,
but this is just not how it should work.

Thanks,
Trevor_B
Dec 10 '05 #2
shoot me an email and i'll work with you on these. there's no need to
flood a C# newsgroup with a bunch of back and forth messages about
regular expressions, when they're just between you and me.

send me a long list of the test strings and i'll see what i can do for
you. i've never written a regular expression this complicated and i
would love to give it a try.

jeremiah
Dec 11 '05 #3
I disagree... regular expressions are fun.

-Marc N.

Dec 11 '05 #4
Hey trevor,

It may be easier to write multiple regex strings than one large regex
string capable of handling all situations. There is always going to be
a legacy string that will fail your regex. So, instead have a set of
regex strings that you will loop through and try to match. If no match
is found, then you know you need to create a new regex.

It's like a bunch of security checkpoints. If it fails one, then it
goes through another checkpoint. Having one large centralized
checkpoint can cause a lot of complications.

Give it a whirl because sometimes it's easier to have a bunch of little
tasks than one large complicated task.
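
Here is a rough sketch of the checkpoint idea in C#. The patterns, the class name Checkpoints and the method TryParse are made up for illustration, and the sketch assumes each measurement has already been cut out of the description as its own token. Treat it as a starting point, not a finished parser.

```csharp
using System;
using System.Text.RegularExpressions;

static class Checkpoints
{
    // Hypothetical pattern set: each entry handles one way of writing a
    // measurement. ^...$ forces a pattern to match the whole token.
    public static readonly Regex[] Patterns =
    {
        new Regex(@"^(?<Feet>\d+)'$"),
        new Regex(@"^(?<Decimal>\d+\.\d+)""$"),
        new Regex(@"^((?<WholeInches>\d+)[\s-])?(?<Fraction>\d+/\d+)""$"),
        new Regex(@"^(?<WholeInches>\d+)""$"),
    };

    // Returns the first successful match, or null when every checkpoint fails.
    public static Match TryParse(string token)
    {
        foreach (Regex pattern in Patterns)
        {
            Match m = pattern.Match(token);
            if (m.Success)
                return m;
        }
        return null;
    }

    static void Main()
    {
        foreach (string token in new[] { "20'", "2.5\"", "5-1/4\"", "3\"", "44W" })
        {
            Match m = TryParse(token);
            Console.WriteLine(m != null
                ? token + " matched"
                : token + " needs a new pattern");
        }
    }
}
```

A token that falls through every pattern, like 44W here, is either noise or a format you haven't covered yet, and that's your cue to add another small regex.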

josh

Dec 11 '05 #5
In article <Om*************@TK2MSFTNGP12.phx.gbl>,
Trevor Braun <tb***********@codetrue.com> wrote:

: Hi, I'm not sure that this is the right forum for this, but I've been
: having a very tough time completing this expression, and I was hoping
: someone might have some suggestions for me.
: I am trying to read measurements out of a text description, and I have
: a working expression, but it captures a pile of empty matches. I
: obviously am not interested in them, but I screw up my functionality
: when I try to get rid of them.
:
: My expression is:
: [snipped]
:
: Some test strings are:
: 1/4" x 2" Flat 44W x 20'
: 1 1/4" x 2" Flat 44W x 20'
: 1/4" x 2.5" Flat 44W x 20'
: 1/4" x 2" Flat 44W x 20' 3"
: 1/4" x 2" Flat 44W x 20' 3.5"
: 1/4" x 2" Flat 44W x 20' 1/2"
: 1/8" x 4" C-1018 flat x 14' 5-1/4"
:
: I really could use some help on this. I've been working on this on and
: off for several months now, and just can't seem to get it right.

One easy suggestion is that you can write "{0,1}" more succinctly as
"?", e.g., "a{0,1}" and "a?" are equivalent.

If you want to insist that one of the groups matches, then say what
you mean. Remember that the ? and * quantifiers *always* succeed
because they can match nothing.
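
To make the two points above concrete, here is a small sketch. The patterns and the class name EmptyMatchDemo are only for illustration and are much simpler than the original expression. It checks that a{0,1} and a? agree, then compares a pattern in which everything is optional with one that requires a digit and the unit.

```csharp
using System;
using System.Text.RegularExpressions;

static class EmptyMatchDemo
{
    // Everything is optional, so this can succeed without consuming anything.
    public static readonly Regex Optional = new Regex(@"\d*'?");
    // At least one digit and the foot mark are required.
    public static readonly Regex Required = new Regex(@"\d+'");

    static void Main()
    {
        // a{0,1} and a? accept exactly the same strings.
        foreach (string s in new[] { "", "a", "aa" })
            Console.WriteLine(
                Regex.IsMatch(s, "^a{0,1}$") == Regex.IsMatch(s, "^a?$"));

        string input = "x 20'";

        // Lists every match, including the empty ones found at
        // positions where there is no measurement at all.
        foreach (Match m in Optional.Matches(input))
            Console.WriteLine("optional: [{0}] at {1}", m.Value, m.Index);

        // Only the real measurement is reported here.
        foreach (Match m in Required.Matches(input))
            Console.WriteLine("required: [{0}] at {1}", m.Value, m.Index);
    }
}
```

The optional pattern reports empty matches alongside 20', which is the same symptom as the pile of empty matches in the original post.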

For complex patterns, I like to use RegexOptions.IgnorePatternWhitespace,
which lets you spread a pattern over several lines and indent it.

Your subpatterns are inconsistent, e.g., some included the unit and
some didn't, and even with your followup, I may not be clear on what
you're trying to capture.

Take a look at the code below. Note how the pattern requires one of
the alternatives to match non-empty strings.

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        // Each alternative must consume at least one digit, so the
        // pattern as a whole can never produce an empty match.
        // WholeInches does not require the inch mark, so it also
        // matches bare numbers such as the 1018 in "C-1018".
        Regex measurements = new Regex(
            @"
            (?<Fraction>    (\d+\s+)?\d+/\d+""  ) |
            (?<Decimal>     \d+\.\d+""          ) |
            (?<Feet>        \d+'                ) |
            (?<WholeInches> \d+(?![/\w])        )
            ",
            RegexOptions.IgnorePatternWhitespace |
            RegexOptions.ExplicitCapture);

        string[] inputs = {
            "1/4\" x 2\" Flat 44W x 20'",
            "1 1/4\" x 2\" Flat 44W x 20'",
            "1/4\" x 2.5\" Flat 44W x 20'",
            "1/4\" x 2\" Flat 44W x 20' 3\"",
            "1/4\" x 2\" Flat 44W x 20' 3.5\"",
            "1/4\" x 2\" Flat 44W x 20' 1/2\"",
            "1/8\" x 4\" C-1018 flat x 14' 5-1/4\"",
        };

        string[] groups = {
            "Feet", "WholeInches", "Fraction", "Decimal",
        };

        foreach (string input in inputs)
        {
            Console.WriteLine("[" + input + "]:");

            // Print every match with the value of each named group;
            // groups that did not take part in the match print as [].
            int count = 1;
            foreach (Match m in measurements.Matches(input))
            {
                Console.WriteLine(" - {0}:", count++);

                foreach (string group in groups)
                    Console.WriteLine("   - {0}: [{1}]",
                        group, m.Groups[group].Value);
            }
        }
    }
}

Is it at least a start in the right direction? Should an input such
as [20' 3"] produce one match or two (one for the feet component and
one for the inches component)? What else needs fixing?
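
If the answer turns out to be one match, a sketch along these lines might do. The pattern and the names CombinedLength, Length and Inches are hypothetical, and it only handles whole or decimal inches after the feet, not fractions.

```csharp
using System;
using System.Text.RegularExpressions;

static class CombinedLength
{
    // Hypothetical pattern: feet followed by an optional inches part,
    // reported together as a single match.
    public static readonly Regex Length = new Regex(
        @"
        (?<Feet> \d+ ) '
        ( \s+ (?<Inches> \d+ (\.\d+)? ) "" )?
        ",
        RegexOptions.IgnorePatternWhitespace |
        RegexOptions.ExplicitCapture);

    static void Main()
    {
        foreach (string input in new[] { "20'", "20' 3\"", "20' 3.5\"" })
        {
            Match m = Length.Match(input);
            // Inches prints as [] when the input has no inches part.
            Console.WriteLine("[{0}] feet=[{1}] inches=[{2}]",
                input, m.Groups["Feet"].Value, m.Groups["Inches"].Value);
        }
    }
}
```

The feet part is still required here, so the pattern cannot match the empty string even though the inches part is optional.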

I agree with Mark Noon: regular expressions are fun, so I look forward
to hearing back from you.

Hope this helps,
Greg
--
"Those who deliberately sign their names to deception will be punished,"
[President Bush] said, leaving out that this is precisely what happens
every time he signs a budget or a law, or Congress votes.
-- Lew Rockwell
Dec 14 '05 #6
