regular expression help

Hi, I'm not sure that this is the right forum for this, but I've been having
a very tough time completing this expression, and I was hoping someone might
have some suggestions for me.
I am trying to read measurements out of a text description, and I have a
working expression, but it also returns a pile of empty matches. I'm
obviously not interested in those, but whenever I try to get rid of them I
break the matching that already works.

My expression is:
(?:(?:(?<Feet>[0-9]*)\'){0,1}(?:(?:(?<WholeInches>[0-9]*(?![/\w])){0,1}(?:[ ,\-]){0,1}(?<Fraction>[0-9]*\/[0-9]*){0,1}(?<Decimal>\d*\.\d*){0,1}\")){0,1})

Some test strings are:
1/4" x 2" Flat 44W x 20'
1 1/4" x 2" Flat 44W x 20'
1/4" x 2.5" Flat 44W x 20'
1/4" x 2" Flat 44W x 20' 3"
1/4" x 2" Flat 44W x 20' 3.5"
1/4" x 2" Flat 44W x 20' 1/2"
1/8" x 4" C-1018 flat x 14' 5-1/4"

I really could use some help on this. I've been working on this on and off
for several months now, and just can't seem to get it right.
Dec 10 '05 #1
Sorry, it's been a hectic day... I didn't finish my post, but somehow
managed to send it anyway....

In the strings, there are always random numbers, and I want them ignored.
I only want matches on the measurements, which can be written about a million
different ways. This is for pulling data out of a legacy inventory
application.

Any thoughts or suggestions would be very, very much appreciated. Right
now, my app uses this expression, and removes matches to the empty groups,
but this is just not how it should work.

Thanks,
Trevor_B
Dec 10 '05 #2
shoot me an email and i'll work with you on these. there's no need to
flood a C# newsgroup with a bunch of back and forth messages about
regular expressions, when they're just between you and me.

send me a long list of the test strings and i'll see what i can do for
you. i've never written a regular expression this complicated and i
would love to give it a try.

jeremiah
Dec 11 '05 #3
I disagree... regular expressions are fun.

-Marc N.
Dec 11 '05 #4
Hey trevor,

It may be easier to write multiple regex strings than one large regex
string capable of handling all situations. There is always going to be
a legacy string that will fail your regex. So, instead have a set of
regex strings that you will loop through and try to match. If no match
is found, then you know you need to create a new regex.

It's like a bunch of security check points. If it fails one, then it
goes through another checkpoint. Having one large centralized
checkpoint can cause a lot of complications.

Give it a whirl because sometimes it's easier to have a bunch of little
tasks than one large complicated task.
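
A minimal sketch of the checkpoint idea in C#, for illustration only. The class name, the labels and the five patterns are assumptions made up for this example; they are not taken from Trevor's application, and they assume the description has already been split into candidate tokens.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CheckpointDemo
{
    // Hypothetical patterns, most specific first. Each one handles a single
    // way of writing a measurement; add an entry when a string falls through.
    static readonly (string Label, Regex Pattern)[] Checkpoints =
    {
        ("feet and inches", new Regex(@"^\d+'\s*\d+""$")),
        ("fraction inches", new Regex(@"^(\d+[\s\-]+)?\d+/\d+""$")),
        ("decimal inches",  new Regex(@"^\d+\.\d+""$")),
        ("whole inches",    new Regex(@"^\d+""$")),
        ("feet",            new Regex(@"^\d+'$")),
    };

    // Returns the label of the first pattern that matches, or null if none does.
    public static string Classify(string token)
    {
        foreach (var (label, pattern) in Checkpoints)
            if (pattern.IsMatch(token))
                return label;
        return null;
    }

    public static void Main()
    {
        foreach (string token in new[] { "20' 3\"", "5-1/4\"", "2.5\"", "44W" })
            Console.WriteLine("{0} -> {1}", token,
                Classify(token) ?? "no match, needs a new pattern");
    }
}
```

A token that returns null (such as "44W" here) is either noise or a format that needs its own checkpoint, which is easier to diagnose than a single large pattern that silently fails.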

josh

Dec 11 '05 #5
In article <Om*************@TK2MSFTNGP12.phx.gbl>,
Trevor Braun <tb***********@codetrue.com> wrote:

: Hi, I'm not sure that this is the right forum for this, but I've been
: having a very tough time completing this expression, and I was hoping
: someone might have some suggestions for me.
: I am trying to read measurements out of a text description, and I have
: a working expression, but it captures a pile of empty matches. I
: obviously am not interested in them, but I screw up my functionality
: when I try to get rid of them.
:
: My expression is:
: [snipped]
:
: Some test strings are:
: 1/4" x 2" Flat 44W x 20'
: 1 1/4" x 2" Flat 44W x 20'
: 1/4" x 2.5" Flat 44W x 20'
: 1/4" x 2" Flat 44W x 20' 3"
: 1/4" x 2" Flat 44W x 20' 3.5"
: 1/4" x 2" Flat 44W x 20' 1/2"
: 1/8" x 4" C-1018 flat x 14' 5-1/4"
:
: I really could use some help on this. I've been working on this on and
: off for several months now, and just can't seem to get it right.

One easy suggestion is that you can write "{0,1}" more succinctly as
"?", e.g., "a{0,1}" and "a?" are equivalent.

If you want to insist that one of the groups matches, then say what
you mean. Remember that the ? and * quantifiers *always* succeed
because they can match nothing.
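
Both points can be seen in a small sketch. The helper name CountEmptyMatches and the two feet-only patterns are assumptions chosen for illustration, not part of the original expression.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Counts how many of the matches found in the input are empty strings.
    public static int CountEmptyMatches(string pattern, string input)
    {
        int empty = 0;
        foreach (Match m in Regex.Matches(input, pattern))
            if (m.Length == 0)
                empty++;
        return empty;
    }

    public static void Main()
    {
        string input = "Flat 44W x 20'";

        // "{0,1}" and "?" are the same quantifier, so these two calls agree.
        Console.WriteLine(Regex.IsMatch("20", @"^\d+'{0,1}$"));
        Console.WriteLine(Regex.IsMatch("20", @"^\d+'?$"));

        // Everything is optional: the pattern also succeeds at positions
        // where it consumes nothing, which produces empty matches.
        Console.WriteLine(CountEmptyMatches(@"(\d*')?", input));

        // One digit and the unit are required: an empty match is impossible.
        Console.WriteLine(CountEmptyMatches(@"\d+'", input));
    }
}
```

The second pattern is what "say what you mean" looks like in practice: the requirement that something must be present is stated with + rather than filtered out after the fact.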

For complex patterns, I like to use RegexOptions.IgnorePatternWhitespace,
which lets you lay the pattern out over several lines.
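
As a rough illustration of that option, here is a sketch with a deliberately simple feet-and-inches pattern. The class name, the field name and the pattern itself are assumptions for this example. With the option set, unescaped whitespace in the pattern is ignored and # starts a comment that runs to the end of the line.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceOptionDemo
{
    // The pattern is spread over several lines and commented in place.
    // In a verbatim string, "" stands for a single double quote.
    public static readonly Regex FeetAndInches = new Regex(@"
        (?<Feet>   \d+ ) '      # whole feet, then the foot mark
        \s*                     # optional gap
        (?<Inches> \d+ ) ""     # whole inches, then the inch mark
        ", RegexOptions.IgnorePatternWhitespace);

    public static void Main()
    {
        Match m = FeetAndInches.Match("Flat 44W x 20' 3\"");
        if (m.Success)
            Console.WriteLine("{0} ft {1} in",
                m.Groups["Feet"].Value, m.Groups["Inches"].Value);
    }
}
```

Because whitespace is ignored in this mode, a literal space in the input has to be written as \s or an escaped space.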

Your subpatterns are inconsistent, e.g., some included the unit and
some didn't, and even with your followup, I may not be clear on what
you're trying to capture.

Take a look at the code below. Note how the pattern requires one of
the alternatives to match non-empty strings.

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        // Every alternative must consume at least one digit,
        // so a match can never be empty.
        Regex measurements = new Regex(
            @"
            (?<Fraction>    (\d+\s+)?\d+/\d+"" ) |
            (?<Decimal>     \d+\.\d+""         ) |
            (?<Feet>        \d+'               ) |
            (?<WholeInches> \d+(?![/\w])       )
            ",
            RegexOptions.IgnorePatternWhitespace |
            RegexOptions.ExplicitCapture);

        string[] inputs = {
            "1/4\" x 2\" Flat 44W x 20'",
            "1 1/4\" x 2\" Flat 44W x 20'",
            "1/4\" x 2.5\" Flat 44W x 20'",
            "1/4\" x 2\" Flat 44W x 20' 3\"",
            "1/4\" x 2\" Flat 44W x 20' 3.5\"",
            "1/4\" x 2\" Flat 44W x 20' 1/2\"",
            "1/8\" x 4\" C-1018 flat x 14' 5-1/4\"",
        };

        string[] groups = {
            "Feet", "WholeInches", "Fraction", "Decimal",
        };

        foreach (string input in inputs)
        {
            Console.WriteLine("[" + input + "]:");

            // Number the matches, then print all four groups for each one;
            // the groups that did not take part print as empty brackets.
            int count = 1;
            foreach (Match m in measurements.Matches(input))
            {
                Console.WriteLine(" - {0}:", count++);

                foreach (string group in groups)
                    Console.WriteLine(" - {0}: [{1}]",
                        group, m.Groups[group].Value);
            }
        }
    }
}

Is it at least a start in the right direction? Should an input such
as [20' 3"] produce one match or two (one for the feet component and
one for the inches component)? What else needs fixing?

I agree with Mark Noon: regular expressions are fun, so I look forward
to hearing back from you.

Hope this helps,
Greg
Dec 14 '05 #6
