473,382 Members | 1,437 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,382 software developers and data experts.

RegEx how do I do unique?

D
My first attempt at this and I'm searching formulas like so

RIGHT(TEXT(A15,'yy'),1)*1000+A15-CONCATENATE(1,'-','jan','-',TEXT(A15,'yy'))+1

I want to extract the row / col coordinates (A15 in above)

so I'm using this [a-z]+\d+\d*

however I want unique ones and not 3 copies of A15 as the above returns.

How do I do that? I was reading that (?!) is used for uniqueness but I
cannot seem to get it to work here

Thanks
Nov 16 '05 #1
2 6825
"D" <Da**@nothing.net> wrote in
news:#Q**************@TK2MSFTNGP09.phx.gbl:
My first attempt at this and I'm searching formulas like so

RIGHT(TEXT(A15,'yy'),1)*1000+A15-CONCATENATE(1,'-','jan','-',TEXT
(A15,'yy'))+1

I want to extract the row / col coordinates (A15 in above)

so I'm using this [a-z]+\d+\d*

however I want unique ones and not 3 copies of A15 as the above
returns.

How do I do that? I was reading that (?!) is used for uniqueness
but I cannot seem to get it to work here


Dave,

(?!) is the negative lookahead construct. I don't think it would be
applicable here.

What might work best is to get all of the matches, then put them into
a hash table to get just the unique ones.

The regex "[a-z]+\d+\d*" probably matches too much (also the last \d*
isn't needed). If this is for an Excel spreadsheet, note that Excel
has a limit of 256 columns (A thru IV) and 65,536 rows per worksheet.
So a regular expression to match Excel cell references should take
this into account so it doesn't match things that can't be valid cell
references.

For example:

string regex = @"
#
# This regex matches strings like A15, IV3491, C9 or DP138.
#

\b # Match a word boundary.
(?:[a-z]|[a-i][a-z]?) # Match cell IDs A thru IV.
\d{1,5} # Match between one and five numbers.
\b # Match a word boundary.
";

string s =
"RIGHT(TEXT(A15,'yy'),1)*1000+A15-CONCATENATE(1,'-','jan'," +
"'-',TEXT(A15,'yy'))+1";

// Get all of the matches.
MatchCollection mc = Regex.Matches(s, regex,
RegexOptions.Singleline |
RegexOptions.IgnorePatternWhitespace |
RegexOptions.IgnoreCase);

Console.WriteLine("All cell IDs = {0}", mc.Count);

// Put all of the non-unique matches into the keys of
// a hash table.
Hashtable uniqueCells = new Hashtable(mc.Count);
foreach (Match m in mc)
uniqueCells[m.Value.ToUpper()] = null;

Console.WriteLine("Unique cell IDs = {0}", uniqueCells.Count);

// Print the unique cell IDs.
foreach (DictionaryEntry de in uniqueCells)
Console.WriteLine(de.Key.ToString());

--
Hope this helps.

Chris.
-------------
C.R. Timmons Consulting, Inc.
http://www.crtimmonsinc.com/
Nov 16 '05 #2
D

----- Original Message -----
From: "Chris R. Timmons" <crtimmons@X_NOSPAM_Xcrtimmonsinc.com>
Newsgroups: microsoft.public.dotnet.languages.csharp
Sent: Wednesday, February 09, 2005 1:30 AM
Subject: Re: RegEx how do I do unique?

"D" <Da**@nothing.net> wrote in
news:#Q**************@TK2MSFTNGP09.phx.gbl:
My first attempt at this and I'm searching formulas like so

RIGHT(TEXT(A15,'yy'),1)*1000+A15-CONCATENATE(1,'-','jan','-',TEXT
(A15,'yy'))+1

I want to extract the row / col coordinates (A15 in above)

so I'm using this [a-z]+\d+\d*

however I want unique ones and not 3 copies of A15 as the above
returns.

How do I do that? I was reading that (?!) is used for uniqueness
but I cannot seem to get it to work here


Dave,

(?!) is the negative lookahead construct. I don't think it would be
applicable here.

What might work best is to get all of the matches, then put them into
a hash table to get just the unique ones.

The regex "[a-z]+\d+\d*" probably matches too much (also the last \d*
isn't needed). If this is for an Excel spreadsheet, note that Excel
has a limit of 256 columns (A thru IV) and 65,536 rows per worksheet.
So a regular expression to match Excel cell references should take
this into account so it doesn't match things that can't be valid cell
references.

For example:

string regex = @"
#
# This regex matches strings like A15, IV3491, C9 or DP138.
#

\b # Match a word boundary.
(?:[a-z]|[a-i][a-z]?) # Match cell IDs A thru IV.
\d{1,5} # Match between one and five numbers.
\b # Match a word boundary.
";

string s =
"RIGHT(TEXT(A15,'yy'),1)*1000+A15-CONCATENATE(1,'-','jan'," +
"'-',TEXT(A15,'yy'))+1";

// Get all of the matches.
MatchCollection mc = Regex.Matches(s, regex,
RegexOptions.Singleline |
RegexOptions.IgnorePatternWhitespace |
RegexOptions.IgnoreCase);

Console.WriteLine("All cell IDs = {0}", mc.Count);

// Put all of the non-unique matches into the keys of
// a hash table.
Hashtable uniqueCells = new Hashtable(mc.Count);
foreach (Match m in mc)
uniqueCells[m.Value.ToUpper()] = null;

Console.WriteLine("Unique cell IDs = {0}", uniqueCells.Count);

// Print the unique cell IDs.
foreach (DictionaryEntry de in uniqueCells)
Console.WriteLine(de.Key.ToString());

--
Hope this helps.

Chris.
-------------
C.R. Timmons Consulting, Inc.
http://www.crtimmonsinc.com/

Thanks.

I found the (?!) operator on this page
http://www.thescarms.com/dotNet/RegExUnique.asp where he was doing this

Dim pattern As String = "(?\b\w+\b)(?!.+\b\k\b)"

to find the unique occurrences of words in a string. Prior to that way he
also suggest the hash table which I have been using for testing while I
figured out the regex method.

I'll definately use your suggestions about the bounds of excel, that's a
good point I didn't think of.

Thank you.

Regards,
Dave

Nov 16 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

5
by: lawrence | last post by:
When users enter urls or other long strings it can destroy the formatting of a page. A long url, posted in a comment, can cause page distortions that make the page unreadable, till the website...
75
by: Xah Lee | last post by:
http://python.org/doc/2.4.1/lib/module-re.html http://python.org/doc/2.4.1/lib/node114.html --------- QUOTE The module defines several functions, constants, and an exception. Some of the...
5
by: Bill Cohagan | last post by:
I'm looking for help with a regular expression question, so my first question is which newsgroup is the best one to post to? Just in case *this* is the best choice, here's the problem: I'm...
1
by: Hardy Wang | last post by:
Hi all: I have a big string, format like below: A 8972057019577575 37893 7937 92 702 059032868 -86-248 6408 6048-8608175398 295830 8308 02 B A 3098453-8 3-86-386-860458602862086029...
17
by: clintonG | last post by:
I'm using an .aspx tool I found at but as nice as the interface is I think I need to consider using others. Some can generate C# I understand. Your preferences please... <%= Clinton Gallagher ...
4
by: Allen | last post by:
For my web-based php regex find/replace do-hickey, I need to match individual back references and wrap a tag around them so they'll be unique to the rest of the match for individual color markup. ...
24
by: cassetti | last post by:
Here's the issue: I have roughly 20 MS excel spreadsheets, each row contains a record. These records were hand entered by people in call centers. The problem is, there can and are duplicate...
15
by: Kay Schluehr | last post by:
I have a list of strings ls = and want to create a regular expression sx from it, such that sx.match(s) yields a SRE_Match object when s starts with an s_i for one i in . There might be...
2
by: Good Man | last post by:
Hi there I have a series of HTML tables (well-formed, with elements ID'd quite nicely) and I need to extract the contents from certain TDs. For example, I'd like to get "Hi Mom!" from the...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.